Message boards : Number crunching : Curious

Author Message
Joined: 16 Mar 17
Posts: 2
Credit: 583,368
RAC: 2,965
Message 2356 - Posted: 8 Aug 2021, 17:09:36 UTC

If your work unit is marked invalid, is there any information that could show for sure why it was invalid?

I checked the task details, but for every invalid task I had, the details looked the same as everyone else's.

The task details don't show any helpful data. For example, all 3 replications ended with the same number (though in the past I've seen that number differ on a few invalids). It would be nice to know whether it's some instability in my cluster, etc.

Start @ Sat Jul 31 13:08:11 2021
Finish @ Sun Aug 1 12:33:07 2021

Just curious how that part works :)

Profile valterc
Project administrator
Project tester
Joined: 30 Oct 13
Posts: 522
Credit: 27,657,605
RAC: 7,956
Message 2357 - Posted: 9 Aug 2021, 11:22:12 UTC - in response to Message 2356.
Last modified: 18 Aug 2021, 10:21:31 UTC

We use "redundancy" (two results from different computers must be exactly the same) to check whether a workunit is "valid" (a successful computation). If the two results differ, a third copy of the workunit is sent to another computer. At the end, if two results are identical they are declared "OK" and all the others are marked "invalid".
So an "invalid" is, briefly, a computation that reached its normal end but with something wrong in it, usually incorrect numeric calculations. There are many possible reasons for getting an "invalid", such as overheating or a faulty hardware component. In this case I suggest running a stress test, like Prime95, on your computer and checking the results.
There is also a known bug in our code that may flag a result as invalid. This may happen if you stop your calculation at the very beginning, before the first checkpoint.
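
As a rough illustration of the redundancy check described above, here is a minimal Python sketch. It is not the project's actual validator code: the function name `validate_workunit`, the host names, and the result strings are all hypothetical, and it only shows the idea of requiring two identical results before anything is declared valid or invalid.

```python
from collections import Counter


def validate_workunit(results, quorum=2):
    """Compare results returned by different hosts for one workunit.

    results: dict mapping a (hypothetical) host name to its output string.
    quorum:  how many identical results are needed to accept an output.

    Returns (canonical, verdicts). canonical is the agreed output, or None
    if no output has reached the quorum yet (another copy must be sent out).
    """
    counts = Counter(results.values())

    # Look for an output that at least `quorum` hosts agree on.
    canonical = None
    for output, n in counts.most_common():
        if n >= quorum:
            canonical = output
            break

    if canonical is None:
        # No agreement yet: nothing can be called valid or invalid.
        return None, {host: "pending" for host in results}

    # Results matching the agreed output are valid, all others invalid.
    verdicts = {
        host: ("valid" if out == canonical else "invalid")
        for host, out in results.items()
    }
    return canonical, verdicts


# Two hosts disagree, so a third copy is needed.
first_round = {"host_a": "42.0", "host_b": "41.9"}
print(validate_workunit(first_round))

# The third result matches host_a, so host_b is marked invalid.
second_round = dict(first_round, host_c="42.0")
print(validate_workunit(second_round))
```

Note that a check like this only says which result disagreed with the others, not why, which is why a stress test on the machine is the practical next step.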

Copyright © 2022 CNR-TN & UniTN