Posts by valterc
1) Message boards : Number crunching : Weird hosts (Message 1483)
Posted 7 days ago by valterc
Ok, I have blacklisted it by setting its max_results_day field to -1; let's see what happens.

The problem here is that I do not have the owner's e-mail address (because of the gridcoin pool mechanism).
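
For readers wondering what the blacklisting step amounts to, here is a minimal sketch in Python. Only the max_results_day field and the value -1 come from the post; the table name host, the id column, the starting value of 100 and the second host are assumptions for illustration, and an in-memory SQLite database stands in for the project's real database.

```python
import sqlite3


def blacklist_host(conn, host_id):
    """Set max_results_day to -1 for one host; return the number of rows changed."""
    cur = conn.execute(
        "UPDATE host SET max_results_day = -1 WHERE id = ?", (host_id,)
    )
    conn.commit()
    return cur.rowcount


# In-memory stand-in for the project database (table layout is assumed).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE host (id INTEGER PRIMARY KEY, max_results_day INTEGER)")
# Host 1 is a made-up second host, present only to show it is left untouched.
conn.executemany("INSERT INTO host VALUES (?, ?)", [(33492, 100), (1, 100)])

changed = blacklist_host(conn, 33492)
rows = dict(conn.execute("SELECT id, max_results_day FROM host"))
```

Using a parameterised query keeps the host id out of the SQL string, which is a safe habit even for a one-off administrative update.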
2) Message boards : Number crunching : New TCGA workunits (TCGAz) (Message 1482)
Posted 7 days ago by valterc
293 hours because it's prime.
Edit: just looked, think it was between 281 and 282 hours. 281 is prime too...
You received > 7,000 credits. Cool. Wonder why the first guy who finished it didn't get credits?

It didn't validate correctly; nevertheless, I just assigned credits.
3) Message boards : Number crunching : Weird hosts (Message 1472)
Posted 11 days ago by valterc
Dear all, while monitoring the overall situation of the server I just noticed this host: http://gene.disi.unitn.it/test/show_host_detail.php?hostid=33492.
It shows suspiciously fast execution times and no valid results; the format of its output files is also strange.

The host belongs to a gridcoin pool; if anyone is able to contact the owner, please do so.

If you happen to find other 'strange' hosts, please tell me. I'm also open to suggestions about how to handle this situation.
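
As one possible starting point for spotting such hosts, here is a rough sketch in Python. It is not what the server does: the function name, the input layout (host id mapped to mean run time and number of valid results), the 10% threshold and the example figures are all assumptions. It simply flags hosts that have no valid results and whose mean run time is far below the median of all hosts.

```python
from statistics import median


def suspicious_hosts(stats, speed_ratio=0.1):
    """Return ids of hosts with zero valid results and an implausibly short run time.

    stats maps host id -> (mean run time in seconds, number of valid results).
    speed_ratio is an assumed cut-off: a host is 'too fast' when its mean
    run time is below speed_ratio times the median over all hosts.
    """
    typical = median(run_time for run_time, _ in stats.values())
    return sorted(
        host_id
        for host_id, (run_time, valid) in stats.items()
        if valid == 0 and run_time < speed_ratio * typical
    )


# Made-up figures, for illustration only.
example = {
    101: (36000.0, 70),
    102: (41000.0, 55),
    103: (120.0, 0),    # very fast and never valid: flagged
    104: (39000.0, 0),  # never valid but normal speed: not flagged
}
flagged = suspicious_hosts(example)
```

Requiring both conditions avoids flagging hosts that are merely fast or merely unlucky; the output format check mentioned above would have to be a separate test.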
4) Message boards : Number crunching : New TCGA workunits (TCGAz) (Message 1471)
Posted 11 days ago by valterc
Yep, thanks. I also cancelled those workunits server side. I will check whether it worked.
5) Message boards : Web site : Other projects. (Message 1468)
Posted 12 days ago by valterc
On my home page, at the bottom, is a list of other projects I am or have crunched. I noticed that Einstein was missing from that list.

If I remember correctly, Einstein now requires the user's explicit permission before exporting the user's data and statistics.
6) Message boards : Number crunching : New TCGA workunits (TCGAz) (Message 1467)
Posted 12 days ago by valterc
Sounds fair to me. Thanks!

Ok, I did the 'credit granting' for all the results that were "aborted by user" or "can't validate" belonging to the "Too many total results" workunits, like this one http://gene.disi.unitn.it/test/workunit.php?wuid=17810097.
The results still without credits are those whose hosts have no recent statistics (so I cannot work out the values).
7) Message boards : Number crunching : New TCGA workunits (TCGAz) (Message 1464)
Posted 13 days ago by valterc
I will give credits for every 'aborted by user' result inside a "Too many total results" workunit. I will calculate the average credit per hour of the PC involved and grant credits in proportion to the run time, plus a 20% bonus. For result 37630664 inside http://gene.disi.unitn.it/test/workunit.php?wuid=17810256, a run time of 2,230,059.45 sec will result in about 10k credits:

37630664 run time: 2230059.45166 hostid: 3526 (n: 74) average credit : 143.67223166234 average runtime: 36048.057297297 factor: 0.0039855748807055 credit: 8888.068933016 credit+bonus: 10665.682719619


Let me know if you agree with this (you need to abort all your still-running TCGA WUs).
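
The figures in the breakdown above can be reproduced with a few lines of Python. This is only a sketch of the arithmetic, not the server-side code; the function and parameter names are made up. Note that the "factor" in the breakdown is the host's average credit divided by its average run time in seconds, i.e. credits per second of run time.

```python
def aborted_result_credit(run_time, avg_credit, avg_runtime, bonus=0.20):
    """Credit for an aborted result, proportional to its run time.

    run_time and avg_runtime are in seconds; avg_credit is the host's
    average credit per result. Returns (factor, credit, credit with bonus).
    """
    factor = avg_credit / avg_runtime  # credits per second on this host
    credit = run_time * factor         # proportional credit
    return factor, credit, credit * (1 + bonus)


# Values taken from the breakdown for result 37630664 (host 3526).
factor, credit, credit_with_bonus = aborted_result_credit(
    run_time=2230059.45166,
    avg_credit=143.67223166234,
    avg_runtime=36048.057297297,
)
print(f"factor: {factor:.10f} credit: {credit:.2f} credit+bonus: {credit_with_bonus:.2f}")
```

The result matches the posted numbers: about 8,888 credits before the bonus and about 10,666 with it.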
8) Message boards : Number crunching : New TCGA workunits (TCGAz) (Message 1460)
Posted 18 days ago by valterc
There are 9 workunits belonging to the 'bad' TCGA batch still around. Some of them were cancelled automatically by the server because they hit the "errors: Too many total results" limit. At this point there is no reason to keep the workunits running.
I'm thinking about giving some credits for those results (even if aborted), but I have to write some code first. I will do that next week, once I am back at work.
9) Message boards : News : Server stop for maintenance (2018-12-27, 9:30 CEST) (Message 1457)
Posted 19 days ago by valterc
It seems that the system is up and running with a brand new o.s. Just tell me if you notice anything strange.
(I had some problems with the new MySQL policy about null values; this prevented posting to the forum, and possibly other things.)
10) Message boards : News : Server stop for maintenance (2018-12-27, 9:30 CEST) (Message 1455)
Posted 26 days ago by valterc
On 27 December the server will be down for some planned maintenance tasks, starting at around 9:30 CEST (GMT+1). The most important one is upgrading the o.s. to the latest stable release. Hopefully, the server will be back online in the afternoon of the same day.
11) Message boards : Number crunching : New TCGA workunits (TCGAz) (Message 1453)
Posted 27 days ago by valterc
Do not worry: if you need to abort them, do so. There are about 18 workunits of the TCGA batch still around (out of a total of 1240); some of them are probably the most problematic ones.
12) Message boards : Number crunching : No stats sent to BoincStats today?! (Message 1451)
Posted 28 days ago by valterc
Ok, my fault (a stupid typo inside a crontab line). It should be fixed now, thanks for the alert.
13) Message boards : Number crunching : New TCGA workunits (TCGAz) (Message 1442)
Posted 16 Dec 2018 by valterc
There are about 30 TCGA workunits still around, and for sure those are the very long ones. Theoretically they could run forever. The reason is that for a certain, very rare, type of input the algorithm's complexity is exponential. We usually manage to adjust the input dataset in order to avoid this, but in the current case there was an issue that we were able to fix only after the workunits had been distributed. The results are of scientific value, of course, but without a checkpoint inside the critical section of the algorithm this is not the kind of computation to do inside the BOINC framework.

Anyway, I will wait for them for another couple of days, then I will abort them server side. I will figure out a way to give credits even in this case.
14) Message boards : Number crunching : Wu stuck? (TCGA workunits) (Message 1438)
Posted 14 Dec 2018 by valterc
GPUGRID has two queues - short one and long one. Can you maybe do the same here?

The TCGA batch behavior was unexpected. Workunits like those, which go without checkpoints for a very long time and have somewhat unpredictable running times, are not suited to BOINC. We don't have any plan to distribute very long workunits in the future and, for sure, no workunits like the TCGA ones.

Just wanted to point out that the TCGAz workunits behave correctly.
15) Message boards : Number crunching : Wu stuck? (TCGA workunits) (Message 1436)
Posted 14 Dec 2018 by valterc
The last one is probably one of the longest I have ever seen (4,704.77 credits...). This one https://gene.disi.unitn.it/test/workunit.php?wuid=17810524 is probably the record so far.

Anyway, there are only a few (TCGA) workunits still around, probably the longest...
16) Message boards : Number crunching : sse2 vs avx (Message 1434)
Posted 13 Dec 2018 by valterc
had to abort sse2 after 10 hours but with 442 days remaining. there were no other cpu tasks running other then this project. Looking HERE sse2 and avx had same problem but fma and an anon succeeded. wonder what the "anon" was.

"aborted by user" is not 'technically' an error, it's a user's choice. I agree that if I saw a workunit stuck at 5% with an estimated time for completion of days (even if the estimate were completely wrong) I would also be tempted to abort it. The 'problematic' behavior of the TCGA workunits doesn't depend on the version of the application.

The sse2 problem on some Ryzen CPUs with the current application is a 'real' problem: the app crashes with an 'illegal instruction' error.

I just sent an e-mail to Daniel (the user who actually wrote the sse2 code) asking for hints.
17) Message boards : Number crunching : sse2 vs avx (Message 1433)
Posted 13 Dec 2018 by valterc
Is fma faster than avx on the Ryzen 1700?

It seems to be just slightly, though they are so close that it would take longer-term testing to be sure. I think it would be easier for the project to find the best extension for a given processor type, and just use it.

I'll drink to that. None of my machines ever get the fma version so I don't even have the latest executable to test. Looking at the current applications it doesn't even show an fma version for Windows. Maybe v1.10 was the last fma version? I wish we could just pick the app(s) we want to run through project preferences.

The latest fma version for Windows (v1.10) was available only as a 'test' app, mainly because of problems (crashes) with the earliest BIOS versions for Ryzen CPUs. The differences in speed between sse, avx and fma are not so huge, so we decided not to build the v1.11 fma version for Windows.
18) Message boards : Number crunching : sse2 vs avx (Message 1424)
Posted 10 Dec 2018 by valterc
Just wanted to add that if you want/need to use the anonymous platform mechanism (app_info.xml) the safest way to proceed is:
. wait until your workunits queue is empty
. stop boinc
. copy/edit the app_info.xml file inside the project directory
. start boinc
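
The copy step of the procedure above can be guarded with a small helper. This is only a sketch in Python: the function name, its parameters and the placeholder file content are hypothetical, the number of queued tasks must be supplied by the caller, and stopping and restarting BOINC is still done by hand before and after the copy.

```python
import shutil
import tempfile
from pathlib import Path


def install_app_info(source, project_dir, queued_tasks):
    """Copy an app_info.xml into the project directory, but only if the queue is empty."""
    if queued_tasks > 0:
        raise RuntimeError(
            f"{queued_tasks} workunit(s) still queued; wait until the queue is empty"
        )
    target = Path(project_dir) / "app_info.xml"
    shutil.copy2(source, target)
    return target


# Demonstration in a throwaway directory; the XML content is a placeholder,
# not a working anonymous-platform configuration.
with tempfile.TemporaryDirectory() as tmp:
    src = Path(tmp) / "my_app_info.xml"
    src.write_text("<app_info></app_info>\n")
    project_dir = Path(tmp) / "project"
    project_dir.mkdir()

    installed = install_app_info(src, project_dir, queued_tasks=0)
    copied_ok = installed.read_text() == src.read_text()

    try:
        install_app_info(src, project_dir, queued_tasks=3)
        refused = False
    except RuntimeError:
        refused = True
```

Refusing to copy while tasks are queued mirrors the first step of the list: changing the application definition under tasks that are already downloaded is the situation the procedure is meant to avoid.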
19) Message boards : Number crunching : New TCGA workunits (TCGAz) (Message 1417)
Posted 7 Dec 2018 by valterc
Well, I got this https://gene.disi.unitn.it/test/result.php?resultid=37659353 that spent almost 24 hours without any (visible) progress and eventually got validated. This one https://gene.disi.unitn.it/test/result.php?resultid=37623246 ran for about three days and still needs to be validated. The longest one so far is this one https://gene.disi.unitn.it/test/workunit.php?wuid=17810415 that got more than 2300 credits.
However, there is no 'theoretical' computational limit in the algorithm. A task may run forever.
It would be useful to let the tasks run, even for many days, taking care not to stop BOINC in between (because of the lack of checkpointing in the 'critical' period).

I'm constantly monitoring the situation, adding credits for 'invalid' and 'too late to validate' workunits.
20) Message boards : Number crunching : New TCGA workunits (TCGAz) (Message 1414)
Posted 5 Dec 2018 by valterc
OK, just to summarize: the TCGA workunits are the problematic ones, really long and with long periods of time (hours) without any checkpoint. Nevertheless we would be very happy if you let them run until completion.

The new TCGAz should behave in the usual way, i.e. you shouldn't worry if you got some.




Copyright © 2019 CNR-TN & UniTN