Posts by Aurum
1) Message boards : Number crunching : SSE2 with app_info? (Message 2815)
Posted 5 hours ago by Aurum
My point is energy efficiency, not run time.
2) Message boards : Number crunching : OUT of tasks (Message 2812)
Posted 1 day ago by Aurum
I suppose it can be a pain in the butt and not ideal since you'd expect the full 5 days to complete and report the work unit.

I personally only set 4 days of work for my BOINC clients on this project. I rarely get a cancelled unit, and if I do, it's one I haven't started yet.


I never set a 5 day cache. Where'd you get that idea?
I always use either 0.5/0.01 days or 1.0/0.01 days. I prefer RZM (Resource Zero Mode), but lately there's been so much demand for WUs that it lets me run dry. Maybe the races are over and I should go back to RZM.
I'm not the one that waited until near the deadline. I'm one of the ones that got the replacement WUs. Then, within a couple of hours of the deadline, the original WUs get submitted at the last minute and the replacement WUs get Server Aborted. If the replacement WUs were not sent out until after the deadline, this should never happen.
3) Message boards : Number crunching : OUT of tasks (Message 2803)
Posted 5 days ago by Aurum
Does the Work Generator need a little adjustment? I've been getting a lot of Server Aborts. It appears that shortly before the Deadline another pair of WUs is sent out to new computers. Then, when the original WUs actually get submitted before the Deadline, the duplicates get Server Aborted. E.g.,
http://gene.disi.unitn.it/test/workunit.php?wuid=36782082
http://gene.disi.unitn.it/test/workunit.php?wuid=36765727
http://gene.disi.unitn.it/test/workunit.php?wuid=36669384
Waiting until after the actual Deadline to send out replacements might be more efficient.
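The suggestion can be written down as a simple rule. This is only a sketch of the idea, not how the BOINC server actually decides when to issue replacements. The function name, the grace parameter and the example dates are all hypothetical.

```python
from datetime import datetime, timedelta


def should_send_replacement(now, deadline, result_reported, grace=timedelta(0)):
    """Return True only once the original task is overdue and still unreported.

    Hypothetical helper: 'grace' is an assumed extra wait after the deadline.
    """
    if result_reported:
        return False  # the original came back, so no duplicate is needed
    return now > deadline + grace


# Arbitrary example deadline, chosen only for illustration.
deadline = datetime(2022, 7, 20, 12, 0)

# Two hours before the deadline: hold the replacement back.
print(should_send_replacement(datetime(2022, 7, 20, 10, 0), deadline, False))

# One hour after the deadline with nothing reported: send it.
print(should_send_replacement(datetime(2022, 7, 20, 13, 0), deadline, False))
```

With a rule like this, a WU submitted at the last minute would never collide with a duplicate, at the cost of the batch finishing a little later when a host really has gone missing.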
4) Message boards : Number crunching : SSE2 with app_info? (Message 2792)
Posted 8 days ago by Aurum
We may need to give up the Louisiana Purchase.

Well there goes the farm belt :-)

But does anyone agree with the authors that SSE uses less CPU Wattage than other instruction sets?
5) Message boards : Number crunching : SSE2 with app_info? (Message 2779)
Posted 16 days ago by Aurum
With the Northern Hemisphere on fire I'm surprised no one has taken any interest in this paper I posted about reducing CPU power consumption.

I switched all computers to SSE2.
6) Message boards : Number crunching : SSE2 with app_info? (Message 2763)
Posted 8 Jul 2022 by Aurum
Still curious about which instruction set is the most energy efficient, or at least runs the CPU at the coolest temperature under full workload.
Found this interesting looking paper:
Thermal design power and vectorized instructions behavior, Amina Guermouche & Anne-Cécile Orgerie, CONCURRENCY & COMPUTATION: PRACTICE & EXPERIENCE, Feb 2021.
https://hal.archives-ouvertes.fr/hal-03185821/document

I haven't read the whole thing yet but it seems to imply SSE has the lowest power ratio for both CPU operation and DRAM. They also test AVX512.
"AVX2 extension adds fused multiply add instructions (FMA)."
Does this mean what we label as FMA is also AVX2?
They use the term memory-bound which I confess I don't understand. E.g., https://www.intel.com/content/www/us/en/develop/documentation/vtune-help/top/reference/cpu-metrics-reference/memory-bound.html
Are TN-Grid WUs memory-bound?
Do TN-Grid WUs ever trigger Intel CPU Turbo Boost?
7) Message boards : Number crunching : SSE2 with app_info? (Message 2745)
Posted 27 Jun 2022 by Aurum
I'm not sure I can tell the difference. I have an older Ensupra Energy Monitor. The newer ones have a graph. I thought it had an averaging feature but I can't find the instructions. The range I saw has a big overlap and may be confounded by GPU or RAM fluctuations.
8) Message boards : Number crunching : SSE2 with app_info? (Message 2744)
Posted 24 Jun 2022 by Aurum
5 Watts on a 16c/32t CPU is well worth the effort. Think I'll get out the watt meter and run a test. Thanks for the tip.
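For a rough sense of scale, here is the arithmetic for a constant 5 W saving on a machine that crunches around the clock. The 24/7 duty cycle is an assumption, and the function name is made up for this sketch.

```python
def annual_kwh_saved(watts_saved, hours_per_day=24.0, days_per_year=365):
    """Convert a constant power saving in watts to kWh per year."""
    return watts_saved * hours_per_day * days_per_year / 1000.0


# 5 W x 24 h x 365 days = 43,800 Wh
print(annual_kwh_saved(5))  # 43.8
```

That is per computer, so it scales with the number of hosts running.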
9) Message boards : Number crunching : SSE2 with app_info? (Message 2739)
Posted 24 Jun 2022 by Aurum
... also do a "chmod a+x"

Thanks. That is what I was missing. I had done the other stuff.

I'm lazy, I make a copy of my app_config and rename it, comes with permissions preloaded :-)
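For anyone scripting the setup, the same effect as "chmod a+x" can be had from Python. This is a minimal sketch: the helper name is hypothetical, and the demo uses a throwaway temp file instead of any real BOINC file.

```python
import os
import stat
import tempfile


def make_executable(path):
    """Add execute permission for user, group and others (like chmod a+x)."""
    mode = os.stat(path).st_mode
    os.chmod(path, mode | stat.S_IXUSR | stat.S_IXGRP | stat.S_IXOTH)


# Demo on a temporary file so nothing real is touched.
fd, demo_path = tempfile.mkstemp()
os.close(fd)
make_executable(demo_path)
print(oct(os.stat(demo_path).st_mode & 0o777))
os.remove(demo_path)
```

The existing permission bits are kept and only the execute bits are added, which is what "a+x" does.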
10) Message boards : Number crunching : SSE2 with app_info? (Message 2738)
Posted 24 Jun 2022 by Aurum
I would like to run the SSE2 version rather than the fma version on a Linux (Ubuntu 20.04.4) machine for reduced heat production.

We just started another heat wave here. Do you have a feel for the percent Wattage reduction?

Warning: For an app_info.xml to take effect you must restart BOINC. When you do it will delete every TN-Grid WU you have and DL only SSE2 WUs and start fresh.
11) Message boards : Number crunching : Curious (Message 2714)
Posted 8 Jun 2022 by Aurum
Nice work Valter! Mine have all ULed and back to normal :-)
12) Message boards : Number crunching : Curious (Message 2711)
Posted 7 Jun 2022 by Aurum
I have two finished tasks that I haven't been able to get uploaded to the server all day for some reason.

Tried all the tricks that I know, but nothing has worked. Just keep getting retried. Counts are 11 and 14 attempts so far.

I also have two tasks from different computers that refuse to upload. Is there a way to rectify this, or should they just be aborted?

Don't abort them. They can be aborted from the server if that becomes necessary. They do not interfere with new WUs DLing so we can keep on crunching.
13) Message boards : Number crunching : Curious (Message 2701)
Posted 5 Jun 2022 by Aurum
On the server status page there is a higher than usual number of "tasks in progress". Will check tomorrow, while back in the office, if there is something strange on the server.
That does seem high. Does it include the Ready To Start WUs as well?
Every few days my Ready To Start WUs accumulate to almost 300 and I switch preferences to Resource Zero Mode (RZM). TN-Grid works really well in RZM and never seems to give me more than one extra WU waiting in the wings. But at Resource 100% it does not seem to honor the BOINC preference for how much work to buffer. All my computers are set to either 0.5 or 1.0 days but you send more than that. I believe some projects limit the maximum number of WUs to twice the number of CPU threads, and GPUgrid limits it to twice the number of GPUs.
It's been a good while since I've noticed the server running out of available WUs. Nice work tuning it up.
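A back-of-envelope check on what a 0.5 or 1.0 day buffer should hold. This is not how the BOINC client actually computes its work requests. The 4 hours per WU and 32 threads are assumed figures for illustration only, and the function name is made up.

```python
def tasks_for_buffer(buffer_days, hours_per_task, threads):
    """Rough number of tasks needed to keep every thread busy for buffer_days."""
    return round(buffer_days * 24.0 / hours_per_task * threads)


# Assumed: 4 hours per WU on a 32-thread machine.
print(tasks_for_buffer(0.5, 4.0, 32))  # 96
print(tasks_for_buffer(1.0, 4.0, 32))  # 192
```

Under these assumptions, either setting comes out well short of 300 WUs on a single host.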
14) Message boards : Number crunching : Curious (Message 2700)
Posted 5 Jun 2022 by Aurum
I haven't had any in a while but occasionally I will receive a task that runs for under 1 hour (Ryzen 9 3900X). I am guessing these are just shorter tasks. Has anybody else noticed the shorter tasks?

The last workunit of every gene expansion batch is shorter than the others, it is the one containing "wu-294" in its name

Bingo! I perused my list of running WUs and the one that's running will finish in under an hour.
15) Message boards : Number crunching : Curious (Message 2697)
Posted 5 Jun 2022 by Aurum
Sure I've seen a number that ran around an hour. Right now I have plenty that are running around 2.5 hours.
I have 31 completed WUs that won't upload. They're all files under 10 KB and Retrying never sends them up. At some point some of them do seem to have gone up, but the list has been growing longer each day for the last week.
My Validation Pending has grown to over 5000 when normally it's well under 2000 which seems odd since they finish faster.
Just curious.
16) Message boards : Number crunching : Curious (Message 2694)
Posted 4 Jun 2022 by Aurum
Recently the "Last 10 Days" value (not sure what the units are, maybe genes/day) has dropped from 88 to 66. Server status indicates there are still a lot of computers working here. Time to complete WUs is still around 4 hours. I've long thought it'd be nice to have a long-term chart to understand the change in genes/day.
http://gene.disi.unitn.it/test/gene_science.php
17) Message boards : Science : Miscellaneous gene-related news (Message 2693)
Posted 4 Jun 2022 by Aurum
Evidence mounts for alternate origins of Alzheimer’s plaques

https://scienceblog.com/531082/evidence-mounts-for-alternate-origins-of-alzheimers-plaques/


Reminds me of mucopolysaccharidosis. MPS is caused by a defective enzyme that is supposed to break down mucopolysaccharides for disposal so they accumulate in the lysosomes. The low pH in lysosomes is to unfold and open up proteins so they can be digested by enzymes. The article implies they've yet to figure out what reduces the acidity of neuronal lysosomes. This is a topic well worth keeping an eye on.
18) Message boards : Number crunching : Bad hosts topic (Message 2676)
Posted 18 May 2022 by Aurum
Looks like it's fixed. Installed psensor and watched the CPU temps:
sudo apt-get install psensor psensor-server -y

The E5-2699 v3 has a max CPU temp of 76.4 C before it downclocks the cores to protect them, and the cores were all running hot, some up to 88 C. https://www.cpu-world.com/CPUs/Xeon/Intel-Xeon%20E5-2699%20v3.html

It had a Cooler Master CPU cooler and they have the lowest quality fans. Replaced the fan. Also checked the thermal paste and it was hard, so I replaced it with my Thermalright TF-8. http://www.thermalright.com/product/tf8-thermal-paste-2g/
With TF-8 it stays soft and pliable for months whereas others dry out and harden forming voids with low thermal conductivity.

CPU: Topology: 18-Core model: Intel Xeon E5-2699 v3 bits: 64 type: MT MCP
L2 cache: 45.0 MiB
Speed: 2733 MHz min/max: 1200/3600 MHz Core speeds (MHz): 1: 2608 2: 2609 3: 2610
4: 2295 5: 2610 6: 2611 7: 2611 8: 2611 9: 2611 10: 2611 11: 2611 12: 2612 13: 2609
14: 2612 15: 2612 16: 2612 17: 2612 18: 2612 19: 2613 20: 2613 21: 2613 22: 2613
23: 2299 24: 2609 25: 2614 26: 2614 27: 2614 28: 2614 29: 2614 30: 2614 31: 2609
32: 2609 33: 2609 34: 2610 35: 2610 36: 2610
19) Message boards : Number crunching : Bad hosts topic (Message 2675)
Posted 18 May 2022 by Aurum
Think I see what's happening, the CPU clocks are slowed down:
sudo inxi -C
CPU: Topology: 18-Core model: Intel Xeon E5-2699 v3 bits: 64 type: MT MCP
L2 cache: 45.0 MiB
Speed: 241 MHz min/max: 1200/3600 MHz Core speeds (MHz): 1: 250 2: 240 3: 235 4: 237
5: 248 6: 236 7: 236 8: 235 9: 234 10: 240 11: 223 12: 222 13: 255 14: 221 15: 212
16: 219 17: 230 18: 250 19: 307 20: 226 21: 221 22: 246 23: 238 24: 236 25: 228
26: 263 27: 249 28: 219 29: 221 30: 291 31: 236 32: 234 33: 245 34: 231 35: 237
36: 238

I had these two lines in my cc_config:
<process_priority>3</process_priority> <process_priority_special>2</process_priority_special>
I never noticed they caused a problem before so I just left them alone. I changed them to:
<process_priority>0</process_priority> <process_priority_special>0</process_priority_special>
and the clocks sped up:
CPU: Topology: 18-Core model: Intel Xeon E5-2699 v3 bits: 64 type: MT MCP
L2 cache: 45.0 MiB
Speed: 1247 MHz min/max: 1200/3600 MHz Core speeds (MHz): 1: 1205 2: 1295 3: 1199
4: 1343 5: 1277 6: 1210 7: 2525 8: 2310 9: 2051 10: 1199 11: 1226 12: 1325 13: 1282
14: 1300 15: 1387 16: 1199 17: 1247 18: 1199 19: 2298 20: 1199 21: 1199 22: 1199
23: 1291 24: 1254 25: 1199 26: 1199 27: 1877 28: 1980 29: 2001 30: 1199 31: 1199
32: 1287 33: 1199 34: 2179 35: 1209 36: 1850
Well that was with CPU utilization at 12% (4t). When I switched it back to 95% (34t) all the clocks went back to 250ish.
I guess it's better to just let the CPU regulate itself and not force it into a different state. Maybe it's just something about the E5-2699 v3.
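To spot this kind of throttling without eyeballing 36 numbers, the inxi text can be parsed and each core compared against the reported minimum clock. This is a sketch that assumes the output layout shown above; other inxi versions may format it differently. The function names are made up.

```python
import re


def parse_inxi_speeds(text):
    """Extract (min_mhz, max_mhz, {core: mhz}) from 'inxi -C' style text."""
    limits = re.search(r"min/max:\s*(\d+)/(\d+)", text)
    min_mhz, max_mhz = int(limits.group(1)), int(limits.group(2))
    # Only look at what follows the 'Core speeds' label so the
    # summary 'Speed:' field is not mistaken for a core entry.
    cores_part = text.split("Core speeds (MHz):", 1)[1]
    cores = {int(n): int(mhz) for n, mhz in re.findall(r"(\d+):\s+(\d+)", cores_part)}
    return min_mhz, max_mhz, cores


def cores_below_min(text):
    """Return the core numbers running below the advertised minimum clock."""
    min_mhz, _, cores = parse_inxi_speeds(text)
    return sorted(n for n, mhz in cores.items() if mhz < min_mhz)


# First eight cores from the throttled output above.
sample = """Speed: 241 MHz min/max: 1200/3600 MHz Core speeds (MHz): 1: 250 2: 240 3: 235 4: 237
5: 248 6: 236 7: 236 8: 235"""
print(cores_below_min(sample))  # [1, 2, 3, 4, 5, 6, 7, 8]
```

Any core listed here is running below the 1200 MHz floor the CPU reports, which is a quick sign that something is holding the clocks down.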
20) Message boards : Number crunching : Bad hosts topic (Message 2672)
Posted 17 May 2022 by Aurum
I suspended all but 5 and went away. They're running much faster now. The CPU is 18c/36t so I'm going to work my way up to 18 WUs. Feels like the problem when WUs load too much into the L3 cache and choke the CPU traffic cop.

What feels so strange is that TN-Grid is the only project I'm running and yet this only affected a single computer. I sorted BoincTasks by WU name and other WUs with the same prefix on other computers are running at normal speed.

Does anyone know of a utility that monitors CPU cache utilization?




Copyright © 2022 CNR-TN & UniTN