Message boards : Number crunching : Wu stuck? (TCGA workunits)
Author | Message |
---|---|
Hi, | |
ID: 1385 · Reply Quote | |
The input file is made up of 'computational chunks'. Usually all chunks run for more or less the same time, so it was easy to decide how many chunks to put in a workunit and to forecast the overall computation time. | |
ID: 1386 · Reply Quote | |
Thanks for the explanation and quick feedback Valterc. | |
ID: 1387 · Reply Quote | |
Just had one that was stuck on 16+ hours and still at 17% on a relatively fast and previously reliable machine. Aborted it before I came over here to check... | |
ID: 1388 · Reply Quote | |
>> Have another one that's getting close to 75% and 7.6 hours on a very fast machine. That one's still running. | |
ID: 1389 · Reply Quote | |
I also have to abort my task unfortunately. | |
ID: 1390 · Reply Quote | |
Every workunit is built from a certain number of small computational pieces (chunks); in the TCGA experiments the number is 1000 or 200. A checkpoint is written at the end of every chunk. Obviously, if a chunk is one of the 'abnormal' ones that runs for hours, there will unfortunately be no checkpoint for a long time. | |
ID: 1391 · Reply Quote | |
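The per-chunk checkpointing described in the post above can be illustrated with a minimal sketch. This is not the project's actual application code: `run_workunit`, `process_chunk`, and the JSON checkpoint layout are hypothetical names chosen for illustration. It only shows why a single slow chunk means a long stretch with no checkpoint, since progress is saved only when a chunk finishes.

```python
import json
import os


def run_workunit(chunks, process_chunk, checkpoint_path):
    """Process chunks in order, saving a checkpoint after each finished chunk.

    Hypothetical sketch: the real application's checkpoint format is not
    described in this thread.
    """
    start = 0
    results = []
    # Resume from the last completed chunk if a checkpoint exists.
    if os.path.exists(checkpoint_path):
        with open(checkpoint_path) as f:
            state = json.load(f)
        start = state["next_chunk"]
        results = state["results"]

    for i in range(start, len(chunks)):
        # No progress is saved while this call runs, however long it takes.
        results.append(process_chunk(chunks[i]))
        # Write to a temporary file and rename it, so an interruption
        # during the write cannot corrupt the previous checkpoint.
        tmp_path = checkpoint_path + ".tmp"
        with open(tmp_path, "w") as f:
            json.dump({"next_chunk": i + 1, "results": results}, f)
        os.replace(tmp_path, checkpoint_path)
    return results
```

If the task is interrupted in the middle of a chunk, a restart repeats only that chunk; with an 'abnormal' chunk that runs for many hours, all of those hours are lost.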
My hyper-threaded i7-6700K Windows 10 system has 1 validated TCGA task (148365_Hs_TCGA-AR_wu-124_1543429375131_2, 561.1 credits for 11:59:57 runtime), 1 pending validation (148368_Hs_TCGA-KLF6_wu-102_1543433840285_2, 9:10:54 runtime), 4 running and 5 in its work queue.
| |
ID: 1393 · Reply Quote | |
The TCGA workunits are problematic. If you happen to have one and find it apparently 'frozen', i.e. showing no progress after a long time, feel free to abort it. | |
ID: 1395 · Reply Quote | |
3 of those tasks have now completed:
| |
ID: 1396 · Reply Quote | |
The other task mentioned in my previous message was pre-empted for 9 hours and is now running again: Now completed with 30:52:42 runtime and 16 checkpoints made, validation pending. ____________ "The ultimate test of a moral society is the kind of world that it leaves to its children." - Dietrich Bonhoeffer | |
ID: 1406 · Reply Quote | |
The output of the 'problematic' TCGA workunits was, in some cases, different depending on whether it was calculated on Windows or Linux. This caused some validation errors, so I wrote some code to grant credit even for the invalid TCGA results. See this workunit as an example: http://gene.disi.unitn.it/test/workunit.php?wuid=17810593 | |
ID: 1408 · Reply Quote | |
"The output of the 'problematic' TCGA workunits was, in some cases, different depending on whether it was calculated on Windows or Linux. This caused some validation errors, so I wrote some code to grant credit even for the invalid TCGA results. See this workunit as an example: http://gene.disi.unitn.it/test/workunit.php?wuid=17810593" Very nice and much appreciated. | |
ID: 1421 · Reply Quote | |
Crunched on my 1950X and waiting for validation: | |
ID: 1435 · Reply Quote | |
The last one is probably one of the longest I have ever seen (4,704.77 credits...). This one https://gene.disi.unitn.it/test/workunit.php?wuid=17810524 is probably the record so far. | |
ID: 1436 · Reply Quote | |
GPUGRID has two queues - short one and long one. Can you maybe do the same here? | |
ID: 1437 · Reply Quote | |
"GPUGRID has two queues - short one and long one. Can you maybe do the same here?" The behavior of the TCGA batch was unexpected. Workunits like those, with no checkpoints for a very long time and a somewhat unpredictable running time, are not suitable for BOINC. We have no plans to distribute very long workunits in the future, and certainly not workunits like the TCGA ones. I just wanted to point out that the TCGAz workunits behave correctly. | |
ID: 1438 · Reply Quote | |
My i7 has just completed the _8 task from workunit 148368_Hs_TCGA-KLF6_wu-154_1543433935389 with 57:17:44 runtime, 41:07:29 of it being between the 15th and 16th checkpoints. | |
ID: 1444 · Reply Quote | |