Help with invalid tasks and computation errors?

Message boards : Number crunching : Help with invalid tasks and computation errors?

autouzi
Joined: 14 Jan 20
Posts: 3
Credit: 0
RAC: 0
United States
Message 1665 - Posted: 14 Jan 2020, 3:39:48 UTC

Would anybody who knows the terminology mind taking a look at my invalid tasks and errors to see why they are happening? I use GRC Pool, so you will need to use the link to my PC at the bottom of this post.

Primarily, I am interested in the computation errors and why they are happening. I have not been able to reproduce any instability in other tests. I am also confused about the invalid tasks and how a task can be invalid without being a computation error.

Any help is appreciated!
Link to computer 56280
http://gene.disi.unitn.it/test/results.php?hostid=56280&offset=0&show_names=0&state=0&appid=

valterc
Project administrator
Project tester
Joined: 30 Oct 13
Posts: 454
Credit: 22,637,230
RAC: 9,630
Italy
Message 1667 - Posted: 14 Jan 2020, 10:05:49 UTC - in response to Message 1665.

Sometimes, for many different reasons, the computation ends 'correctly' but the results are not correct. This is why we implement the 'redundancy' feature: a result is marked as correct if it is bit-wise identical to another one.
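To make the bit-wise check concrete, here is a minimal sketch in Python. It is not the project's actual validator: the function name `results_match` and the sample payloads are made up for illustration, and it only assumes that two results for the same workunit are compared byte for byte.

```python
def results_match(result_a: bytes, result_b: bytes) -> bool:
    """Return True only if the two result payloads are byte-for-byte identical.

    Hypothetical helper, not part of BOINC or the gene@home code.
    """
    return result_a == result_b


# Two hosts returned the same bytes: the results agree.
host_1 = b"0.125 0.250\n"
host_2 = b"0.125 0.250\n"
print(results_match(host_1, host_2))  # True

# One host finished without any error but a single character differs:
# the task ended 'correctly', yet the comparison fails.
host_3 = b"0.125 0.251\n"
print(results_match(host_1, host_3))  # False
```

This is how a task can finish without a computation error and still end up invalid: nothing crashed, but the output does not match the other copy.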

We also know that we have a small bug in our code (a very infrequent one that we have not been able to catch). In some cases, when the computation of a task is stopped at the very beginning, before the first checkpoint, the output file becomes 'inconsistent', and the computation then produces an 'invalid' result (this may have happened if you see 'Start from checkpoint: 1' in the log). Keeping a small workunit queue (thus avoiding BOINC going into 'rush' mode) will mitigate this problem.

Error 194 is sometimes an effect of the computer being unresponsive because of too much load; see http://wuprop.boinc-af.org/forum_thread.php?id=402

autouzi
Joined: 14 Jan 20
Posts: 3
Credit: 0
RAC: 0
United States
Message 1668 - Posted: 15 Jan 2020, 1:22:31 UTC - in response to Message 1667.

So fairly normal. Thank you for your response and all you do for this project! This is my favorite project available on GRC Pool because of the potential to help us better understand the complex subject of genetics.

Timber
Joined: 20 Jan 20
Posts: 5
Credit: 540,906
RAC: 0
Canada
Message 1672 - Posted: 21 Jan 2020, 17:57:16 UTC

3 failed (and errored) tasks so far on a Ryzen 7 1800x, running Windows 10.
An example of one of the errored tasks:
https://gene.disi.unitn.it/test/result.php?resultid=46767050
the machine this is happening on:
https://gene.disi.unitn.it/test/show_host_detail.php?hostid=56399
At least it's not a day of work lost. None of the tasks have shown as invalid, yet.

Jim1348
Joined: 29 Dec 16
Posts: 41
Credit: 6,353,353
RAC: 10,216
United States
Message 1673 - Posted: 21 Jan 2020, 20:36:37 UTC - in response to Message 1672.

3 failed (and errored) tasks so far on a Ryzen 7 1800x, running Windows 10.

It could be the segfault error. I see them on my Ryzen 1700 occasionally, but not on my Ryzen 2700. (And my Ryzen 1700 is one of the "fixed" versions, produced after they introduced the fix.)

M0CZY
Joined: 8 Nov 19
Posts: 1
Credit: 535
RAC: 11
United Kingdom
Message 1917 - Posted: 9 Aug 2020, 19:45:55 UTC

The application gene@home PC-IM v1.10 (sse2) doesn't work on my 32-bit Linux computer (Computer ID 54726).
It runs for about 15 seconds, then ends in Computation error (Exit status 193 (0xc1) EXIT_SIGNAL).
My /proc/cpuinfo file contains:

processor : 0
vendor_id : GenuineIntel
cpu family : 6
model : 13
model name : Intel(R) Pentium(R) M processor 2.26GHz
stepping : 8
microcode : 0x20
cpu MHz : 2267.000
cache size : 2048 KB
physical id : 0
siblings : 1
core id : 0
cpu cores : 1
apicid : 0
initial apicid : 0
fdiv_bug : no
f00f_bug : no
coma_bug : no
fpu : yes
fpu_exception : yes
cpuid level : 2
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov clflush dts acpi mmx fxsr sse sse2 ss tm pbe nx bts cpuid est tm2 pti
bugs : cpu_meltdown spectre_v1 spectre_v2 spec_store_bypass l1tf mds swapgs itlb_multihit
bogomips : 4522.40
clflush size : 64
cache_alignment : 64
address sizes : 32 bits physical, 32 bits virtual
power management:

so it does support sse2.

A typical Stderr output looks like:
<core_client_version>7.16.6</core_client_version>
<![CDATA[
<message>
process exited with code 193 (0xc1, -63)</message>
<stderr_txt>
Start @ Sun Aug 9 20:24:17 2020
SIGILL: illegal instruction
Stack trace (7 frames):
../../projects/gene.disi.unitn.it_test/gene_pcim_v1.10_linux32__sse2[0x8072b8a]
linux-gate.so.1(__kernel_sigreturn+0x0)[0xb7eecd14]
../../projects/gene.disi.unitn.it_test/gene_pcim_v1.10_linux32__sse2[0x804c974]
../../projects/gene.disi.unitn.it_test/gene_pcim_v1.10_linux32__sse2[0x80528ac]
../../projects/gene.disi.unitn.it_test/gene_pcim_v1.10_linux32__sse2[0x804b239]
/lib/i386-linux-gnu/libc.so.6(__libc_start_main+0xf1)[0xb7be6e91]
../../projects/gene.disi.unitn.it_test/gene_pcim_v1.10_linux32__sse2[0x804b4a3]
Exiting...
</stderr_txt>
]]>

I am unable to tell what I have done wrong.
The non-sse2 app seems to work, but is probably a lot slower than the sse2 version.

manalog
Send message
Joined: 5 Oct 15
Posts: 24
Credit: 336,677
RAC: 283
Italy
Message 1918 - Posted: 10 Aug 2020, 9:08:55 UTC - in response to Message 1917.

You are doing nothing wrong: the Linux 32-bit app is compiled with a 'core2' processor target, so it is not a 'pure' SSE2, Pentium 4 compatible app; rather, it probably incorporates some newer extensions. I have compiled an SSE2 version that runs on a P4 and should also run on yours if it supports SSE2. If not, I have also compiled an SSE version that should run even on a Pentium III. I am travelling now, so I will post them on this forum for you before Friday ;)
Or you can compile it yourself: just check my post 'sse3 optimization and android binary' on this forum and figure out how to do it.
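For anyone who wants to check their own machine, here is a rough sketch in Python. It assumes the build was made with something like GCC's `-march=core2`, which enables MMX, SSE, SSE2, SSE3 and SSSE3; the thread does not confirm the exact compiler options, so treat the `CORE2_FLAGS` set as an assumption. The helper names are made up for this example. Note that /proc/cpuinfo reports SSE3 as `pni`.

```python
# Extensions a 'core2' target may rely on, spelled as in /proc/cpuinfo.
# Assumption: GCC -march=core2 (MMX, SSE, SSE2, SSE3 = "pni", SSSE3).
CORE2_FLAGS = {"mmx", "sse", "sse2", "pni", "ssse3"}


def parse_flags(cpuinfo_text):
    """Return the set of CPU flags from the first 'flags' line found."""
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "flags":
            return set(value.split())
    return set()


def missing_flags(cpuinfo_text, required=CORE2_FLAGS):
    """Return the required flags that the CPU does not report, sorted."""
    return sorted(set(required) - parse_flags(cpuinfo_text))


# The flags line posted above for the Pentium M.
SAMPLE = (
    "flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov "
    "clflush dts acpi mmx fxsr sse sse2 ss tm pbe nx bts cpuid est tm2 pti"
)

print(missing_flags(SAMPLE))  # ['pni', 'ssse3']
```

On a real machine you would pass the contents of /proc/cpuinfo instead of `SAMPLE`. A non-empty list means a binary built for that target may hit an illegal instruction (SIGILL) on that CPU, even though `sse2` itself is present.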



Copyright © 2020 CNR-TN & UniTN