Message boards : Number crunching : Bad hosts topic
Author | Message |
---|---|
Found a bad host: http://gene.disi.unitn.it/test/show_host_detail.php?hostid=25653 | |
ID: 2287 | |
Found a bad host: http://gene.disi.unitn.it/test/show_host_detail.php?hostid=25653 Good spot, that’s as bad as they come. | |
ID: 2288 | |
This probably belongs to someone who doesn't even know what is happening here... Anyway, I just blacklisted it. | |
ID: 2289 | |
I have a slug of WUs that are going to take over 3 days to run, e.g. http://gene.disi.unitn.it/test/workunit.php?wuid=35037615 | |
ID: 2670 | |
I have a slug of WUs that are going to take over 3 days to run, e.g. http://gene.disi.unitn.it/test/workunit.php?wuid=35037615 Probably not your PC. I also have the occasional WU that takes double the normal time or more, where a wingman running similar kit takes the normal time. Why? I don’t know, I just accept it and carry on. Latest example: my R9/3900 took 20,000 seconds, wingman’s R7/5800x took the normal 10,000 seconds. | |
ID: 2671 | |
I suspended all but 5 and went away. They're running much faster now. The CPU is 18c/36t so I'm going to work my way up to 18 WUs. Feels like the problem when WUs load too much into the L3 cache and choke the CPU traffic cop. | |
ID: 2672 | |
I suspended all but 5 and went away. They're running much faster now. The CPU is 18c/36t so I'm going to work my way up to 18 WUs. Feels like the problem when WUs load too much into the L3 cache and choke the CPU traffic cop. That makes sense, the Ryzen 5 series has twice the L3 cache of the 3 series. | |
ID: 2674 | |
Think I see what's happening, the CPU clocks are slowed down: <process_priority>3</process_priority>
<process_priority_special>2</process_priority_special>
I never noticed they caused a problem before so I just left them alone. I changed them to: <process_priority>0</process_priority>
<process_priority_special>0</process_priority_special>
and the clocks sped up: CPU: Topology: 18-Core model: Intel Xeon E5-2699 v3 bits: 64 type: MT MCP L2 cache: 45.0 MiB Speed: 1247 MHz min/max: 1200/3600 MHz Core speeds (MHz): 1: 1205 2: 1295 3: 1199 4: 1343 5: 1277 6: 1210 7: 2525 8: 2310 9: 2051 10: 1199 11: 1226 12: 1325 13: 1282 14: 1300 15: 1387 16: 1199 17: 1247 18: 1199 19: 2298 20: 1199 21: 1199 22: 1199 23: 1291 24: 1254 25: 1199 26: 1199 27: 1877 28: 1980 29: 2001 30: 1199 31: 1199 32: 1287 33: 1199 34: 2179 35: 1209 36: 1850 Well, that was with CPU utilization at 12% (4t). When I switched it back to 95% (34t) all the clocks went back to 250ish. I guess it's better to just let the CPU regulate itself and not force it into a different state. Maybe it's just something about the E5-2699 v3. | |
ID: 2675 | |
Looks like it's fixed. Installed psensor and watched the CPU temps: sudo apt-get install psensor psensor-server -y The E5-2699 v3 has a max CPU temp of 76.4 C before it downclocks the cores to protect them, and the cores were all running hot, some up to 88 C. https://www.cpu-world.com/CPUs/Xeon/Intel-Xeon%20E5-2699%20v3.html It had a Cooler Master CPU cooler and they have the lowest quality fans. Replaced the fan. Also checked the thermal paste and it was hard, so I replaced it with my ThermalRight TF-8. http://www.thermalright.com/product/tf8-thermal-paste-2g/ TF-8 stays soft and pliable for months, whereas others dry out and harden, forming voids with low thermal conductivity. CPU: Topology: 18-Core model: Intel Xeon E5-2699 v3 bits: 64 type: MT MCP L2 cache: 45.0 MiB Speed: 2733 MHz min/max: 1200/3600 MHz Core speeds (MHz): 1: 2608 2: 2609 3: 2610 4: 2295 5: 2610 6: 2611 7: 2611 8: 2611 9: 2611 10: 2611 11: 2611 12: 2612 13: 2609 14: 2612 15: 2612 16: 2612 17: 2612 18: 2612 19: 2613 20: 2613 21: 2613 22: 2613 23: 2299 24: 2609 25: 2614 26: 2614 27: 2614 28: 2614 29: 2614 30: 2614 31: 2609 32: 2609 33: 2609 34: 2610 35: 2610 36: 2610 | |
ID: 2676 | |
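
For anyone who wants to compare clock dumps like the two posted in this thread without eyeballing 36 numbers, here is a rough Python sketch. It only parses text in the "Core speeds (MHz): 1: 1205 2: 1295 ..." layout shown in the posts. The function names `parse_core_speeds` and `summarize`, and the 150 MHz margin used to decide that a core is "stuck near the floor", are my own assumptions for illustration, not anything from inxi or BOINC.

```python
import re
from statistics import mean


def parse_core_speeds(inxi_text):
    """Return {core_number: MHz} from the 'Core speeds (MHz):' part of the text.

    Returns an empty dict if that label is not present.
    """
    _, label, tail = inxi_text.partition("Core speeds (MHz):")
    if not label:
        return {}
    # Each entry looks like "7: 2525" (core index, colon, speed in MHz).
    return {int(core): int(mhz) for core, mhz in re.findall(r"(\d+):\s*(\d+)", tail)}


def summarize(speeds, min_mhz, margin_mhz=150):
    """Summarize a speed dump.

    margin_mhz is an arbitrary, assumed threshold: a core counts as
    'near the floor' if it runs within margin_mhz of the CPU's minimum clock.
    """
    near_floor = [c for c, mhz in speeds.items() if mhz <= min_mhz + margin_mhz]
    return {
        "cores": len(speeds),
        "mean_mhz": round(mean(speeds.values())),
        "near_floor": len(near_floor),
    }


if __name__ == "__main__":
    # First six cores from the two dumps in the thread (min clock 1200 MHz).
    before = "Core speeds (MHz): 1: 1205 2: 1295 3: 1199 4: 1343 5: 1277 6: 1210"
    after = "Core speeds (MHz): 1: 2608 2: 2609 3: 2610 4: 2295 5: 2610 6: 2611"
    for name, text in (("before", before), ("after", after)):
        print(name, summarize(parse_core_speeds(text), min_mhz=1200))
```

A high `near_floor` count under full load is only a hint that the CPU is being held back (thermal limits, priority settings, power management); it does not say which one, so it is worth checking temperatures alongside it as was done above.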