valterc (Project administrator, Project tester)
Joined: 30 Oct 13 Posts: 624 Credit: 34,677,535 RAC: 0
We have to think of a way to give users a 'reasonable' (neither too high nor too low) number of credits for the work done. My idea is to grant something similar to the TRP Sieve subproject of PrimeGrid. I have some statistics for different types of CPUs. Just to give an idea, a single validated task on an Intel Core i7 CPU 930 @ 2.80GHz (6 threads reserved for BOINC out of 8) will get 120.14 credits for about 45 minutes of computation.
|
|
|
We were thinking that one option could be to give each client a fixed number of points for each tile computed. This will give more points to clients that have more processes running in parallel (i.e. a CPU that runs 4 gene@home projects on 4 cores will earn 4 * the-number-of-tiles-executed).
Of course we can discuss this on Friday (22 Nov) during the client meeting.
|
|
|
Hi Francesco,
The server side supports a daemon for trickle messages. It lets clients communicate with the server during the execution of a workunit and report a summary of the computational state at the client. This way we could consider granting some credits to a client without waiting for the completion of a workunit. These trickle-up messages should be sent from client to server, but I have no idea how to send them. Can you please check it?
PS: I think you can find more information at http://boinc.berkeley.edu/trac/wiki/TrickleApi
|
|
valterc
The trickle message mechanism is designed to grant credits incrementally when jobs are very long (more than one week). For example, when there is an attached hardware sensor which collects data from time to time and sends it to the server (acting as a "Non Computationally Intensive" workunit).
But the question is: if someone aborts their job after some time (or the task aborts for some reason), do we really have any use for the partial, and potentially incorrect, result received? Credits are granted only if results are useful.
If we use simple redundant validation there will always be a couple of good results returned. If we were able to avoid redundancy we would theoretically double the power of the grid.
|
|
valterc
We were thinking that one option could be to give each client a fixed number of points for each tile computed. This will give more points to clients that have more processes running in parallel (i.e. a CPU that runs 4 gene@home projects on 4 cores will earn 4 * the-number-of-tiles-executed).
This is automatic. The more workunits you are able to run in parallel the more credit you receive.
|
|
|
valterc
Remember that credits are granted by the validator. If the validator catches an invalid result (for any reason) it will simply mark the result as invalid and send out another one, granting zero credit for the invalid one.
If we find someone cheating we can think about some other measures, like zeroing all his/her credits and/or banning the user.
We also really don't need two output files (I have some ideas to discuss with the 'client' group, we can change the output format, but this is not top priority right now). This could solve the potential problem of one user returning a correct result but trying to cheat on the 'score'.
We also have to decide how many credits we want to grant (starting from the so-called 'score'). Once this has been decided we don't have any use for the 'score', so there is no need to store it in the database.
The credit should be granted according to *our* rules (not strictly the BOINC ones...). This could be tricky to achieve. The method to use is the 'credit_from_wu' one. Better to try to contact other BOINC administrators and ask them what they did.... (I will try to contact the guys at PrimeGrid)
|
|
|
valterc
I will post here a conversation I had with one of PrimeGrid's administrators:
Hi Michael. I'm coordinating the efforts of a group of students (Computer Science) trying to set up a BOINC server for a new project. Everything is a work in progress for now but you can look here http://gene.disi.unitn.it/test/index.php for more information (please keep it secret for now....)
Briefly, we are thinking about how to grant credit to the users. We don't want to use the 'credit-new' scheme but would like to adopt a method like the one you implemented on PrimeGrid. We have workunits that will process 'tiles'. We don't know in advance how many 'tiles' will be processed but, eventually, we will know this number and would like to grant credits according to it. I think that the amount of credits should be more or less the same as your TRP Sieve subproject (well, I like it... I'm the second one in rank worldwide....).
Do you have any hints about how to proceed, starting from validator.cpp? (we have version 7.2.4)
Thank you in advance for any suggestions
This was his answer:
It depends on whether the run-time of each task is predictable. If you can reliably predict that workunit X will take Y seconds on your benchmark computer, then it's simply a matter of building that logic into the validator. You figure out how many seconds the task should take, then multiply that by a scaling factor, and that's your credit.
Figuring out the scaling factor isn't that complicated:
BOINC credit is supposed to be "X credits per unit of computation", but the "unit of computation" part isn't completely unambiguous. You want to avoid having credit that's noticeably out of line with other projects. A lot of participants are in it only for the credit, so if you're too low they won't run your project, and if it's too high other projects will be forced to raise their credit. So getting it "just right" is your goal.
When benchmarking, ALWAYS run just one task on one core, with the rest of the machine idle. If the other cores are running, it could modify the speed of whatever you're testing.
For example, let's say your reference computer is your desktop machine. Assume this machine runs one of our TRP sieve tasks in 109 minutes, and each of those tasks is worth 109 credits. Your reference machine is therefore capable of doing 1 credit per minute, per core. The scaling factor (assuming you're calculating run times in seconds) would be 1/60, or about 0.0167. If you can then predict that a task in workunit X will take Y seconds to run, the credit C you should grant is C=Y*0.0167.
Sometimes it's difficult to predict the run time. We used the standard BOINC credit system for LLR until recently because it was hard to predict LLR's run times. It's complicated, but we've figured that out and have switched LLR to our own credit system. It wasn't easy, however.
If you really can't predict the run times then the best option is to use BOINC's credit system.
Mike
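The arithmetic in Mike's example can be written down in a few lines. This is only a sketch: the function names `scaling_factor` and `credit_for_runtime` are made up for illustration and are not part of BOINC, and the reference values (109 credits for a 109-minute task) are the ones from the example above.

```cpp
#include <cassert>

// Credits per second on the reference machine, derived from one known task:
// reference_credits are granted for reference_minutes of single-core run time.
double scaling_factor(double reference_credits, double reference_minutes) {
    assert(reference_minutes > 0.0);
    return reference_credits / (reference_minutes * 60.0);
}

// Credit C for a task predicted to take predicted_seconds on the
// reference machine: C = Y * factor.
double credit_for_runtime(double predicted_seconds, double factor) {
    return predicted_seconds * factor;
}
```

With the 109/109 reference the factor is 1/60 (about 0.0167), so a task predicted to run 45 minutes (2700 seconds) would be worth about 45 credits.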
and my further reply
Thank you Mike for your answer. We obviously want to give credits in line with other projects (that's why I chose TRP sieve as a reference). We cannot exactly predict the run times. We get the number of computed 'objects' only when we get the result back.
We will try to start some more benchmarks on a variety of computers. If the execution time is more or less the same we could opt for fixed credit. If the runtimes are too different (i.e. the standard deviation is too high) I guess that we have no other option than to use the standard BOINC credit system.
Thank you again
valter
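One way to make the "is the standard deviation too high?" check concrete is to look at the relative spread (standard deviation divided by mean) of the benchmark run times. The sketch below assumes the run times have already been collected in seconds. The threshold `max_cv` is an arbitrary value the project would have to choose, not a BOINC setting, and all function names are hypothetical.

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// Mean of the measured run times (seconds).
double mean(const std::vector<double>& xs) {
    assert(!xs.empty());
    double sum = 0.0;
    for (double x : xs) sum += x;
    return sum / static_cast<double>(xs.size());
}

// Sample standard deviation (n - 1 in the denominator).
double sample_stddev(const std::vector<double>& xs) {
    assert(xs.size() >= 2);
    const double m = mean(xs);
    double sq = 0.0;
    for (double x : xs) sq += (x - m) * (x - m);
    return std::sqrt(sq / static_cast<double>(xs.size() - 1));
}

// True when the relative spread (stddev / mean) is at most max_cv,
// i.e. when a fixed credit per workunit looks defensible.
bool fixed_credit_is_reasonable(const std::vector<double>& runtimes, double max_cv) {
    return sample_stddev(runtimes) / mean(runtimes) <= max_cv;
}
```

The run times should come from the same machine (or be normalised per machine) before they are compared, otherwise the spread mostly measures hardware differences rather than workunit differences.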
|
|
|
valterc
I also contacted Jon "Slicker" Sonntag (admin of Collatz), who gave me some very detailed info:
If you do not plan to use CreditNew (and who can blame you given that they recently figured out that it was not working properly and awarded much less credit than it should), you may want to consider the original cobblestone credit system. It assumes that a machine that is rated at 1 GFLOP would earn 200 credits per day. It will take more work and time to get it right, but once you get the calculations figured out, you shouldn't ever need to adjust it.
See: http://en.wikipedia.org/wiki/BOINC_Credit_System
and: http://www.unitedboinc.com/en/boinc-info/56-info/185-computation-credit
In your BOINC client, from the Advanced menu, choose "Run CPU Benchmarks" and then review the event log. You will see something like:
12/18/2013 8:02:21 AM | | Running CPU benchmarks
12/18/2013 8:02:52 AM | | Benchmark results:
12/18/2013 8:02:52 AM | | Number of CPUs: 8
12/18/2013 8:02:52 AM | | 3046 floating point MIPS (Whetstone) per CPU
12/18/2013 8:02:52 AM | | 10202 integer MIPS (Dhrystone) per CPU
You have to decide whether your project uses more floating point or more integer operations per second. Or, you can use an average ((Whetstone + Dhrystone) / 2).
GigaFLOPs = RAC/200
BOINC reports megaflops, not gigaflops, so you need to divide by 1000. The numbers are also per CPU, so you have to multiply by the number of CPUs or cores.
Using the above numbers, that means:
RAC = 200 * ((3046 Whetstone + 10202 Dhrystone) / 2) / 1000 megaflops per gigaflop * 8 CPUs
or
RAC = 10598.4
or
The computer should earn 10,598.4 credits per day or 0.1226666667 credits per second or 0.015333333 credits per second per cpu.
To calculate how many credits to award, start by running a single workunit in standalone mode (outside of BOINC) and time it. Take the number of seconds needed to complete the workunit and multiply it by the number of credits/second/cpu.
For example, if a mini_collatz workunit takes 4 hours and 20 minutes to run on the above CPU, it would be calculated as: 260 minutes * 60 seconds per minute * 0.015333333 credits per cpu per second = 239.20 credits.
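The cobblestone calculation above can be checked with a short sketch. The constant of 200 credits per day per GFLOPS and the benchmark figures come from the text. The function names are hypothetical, and averaging the two benchmarks is just the option mentioned above, not a BOINC rule.

```cpp
#include <cassert>

// 200 credits per day for a machine rated at 1 GFLOPS (figure quoted above).
constexpr double kCreditsPerGflopsDay = 200.0;
constexpr double kSecondsPerDay = 86400.0;

// Expected credits per day for the whole host, averaging the two benchmarks.
// BOINC reports MIPS/MFLOPS per CPU, hence the division by 1000.
double credits_per_day(double whetstone_mips, double dhrystone_mips, int ncpus) {
    assert(ncpus > 0);
    const double gflops_per_cpu = (whetstone_mips + dhrystone_mips) / 2.0 / 1000.0;
    return kCreditsPerGflopsDay * gflops_per_cpu * ncpus;
}

// Credits per second for a single CPU.
double credits_per_cpu_second(double whetstone_mips, double dhrystone_mips) {
    return credits_per_day(whetstone_mips, dhrystone_mips, 1) / kSecondsPerDay;
}

// Credit for one workunit that took run_seconds on one CPU.
double workunit_credit(double run_seconds, double rate_per_cpu_second) {
    return run_seconds * rate_per_cpu_second;
}
```

With 3046 Whetstone and 10202 Dhrystone MIPS on 8 CPUs, `credits_per_day` gives 10598.4, and a workunit of 260 minutes comes out at roughly 239.2 credits.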
If the workunits vary in size, then you will need an average in order to arrive at the correct credits and then come up with a way to adjust for the slightly shorter or longer run times of each workunit. For Collatz, I run about 20 workunits, figure out the average time, and use that time to calculate the credits. I then take the average total steps from the results and use that to figure out the credits per step. I don't need to calculate anything differently for the GPUs as they earn the same credit per total steps as the CPUs do.
Dr. Anderson doesn't really like my method because my GPU applications are anywhere from 30 to 100 times faster than the CPU applications, depending upon the GPU, whereas the SETI GPU applications are less than 10 times faster (the last time I checked anyway), so he feels that Collatz awards too much credit for the GPUs. CreditNew is more of a socialist approach to awarding credit where everyone gets the same credit with the same hardware regardless of the project. The "old" credit system was more capitalistic in that it awarded credit for the work that was actually done. The way I see it, why should the SETI application which only utilizes 50% of the GPU award the same credit as the Collatz application which utilizes 99% of the GPU? That would be like giving the same amount of prize money to every competitor in a race regardless of what place they finish. CreditNew doesn't take into account whether people overclock or underclock their hardware, which can have drastic effects on the processing time.
II Thessalonians 3:10 "He that does not work, neither shall he eat."
In that spirit, rather than award the same credit to all similar hardware like CreditNew does, I choose to award credits for the work actually done and not the work their computer __could__ do.
Also, it is my opinion that the credits should also be adjusted if the application uses less than 99% of the CPU as that means there are I/O or network constraints and the credits should not be calculated strictly on CPU time when there are other computer resources being used as well.
Likewise, if a new CPU has some new feature that allows it to finish the work faster, shouldn't it get more credit per second than the older computer that doesn't have that feature? Or, if one project makes use of all those additional features which speeds up the processing whereas another project has a poorly optimized application written by a "hello world" programmer and the new CPU takes the same time to complete the work unit as an older CPU, should the same computer earn the same credits on both projects? I think not.
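The "credits per step" calibration described earlier (time about 20 workunits, then divide the credit for the average run time by the average step count) could look roughly like this. It is a sketch under assumptions: `BenchmarkRun` and both function names are invented for illustration, and for gene@home the "steps" would presumably be the number of tiles.

```cpp
#include <cassert>
#include <vector>

// One benchmark run: seconds on one CPU and the amount of work reported
// ("steps" for Collatz; tiles would be the analogue for gene@home).
struct BenchmarkRun {
    double seconds;
    double steps;
};

// Credits per step = (average run time * credits per CPU-second) / average steps.
double credits_per_step(const std::vector<BenchmarkRun>& runs, double rate_per_cpu_second) {
    assert(!runs.empty());
    double total_seconds = 0.0;
    double total_steps = 0.0;
    for (const BenchmarkRun& r : runs) {
        total_seconds += r.seconds;
        total_steps += r.steps;
    }
    assert(total_steps > 0.0);
    // The ratio of the averages equals the ratio of the totals,
    // so the number of runs cancels out.
    return total_seconds * rate_per_cpu_second / total_steps;
}

// Credit for a returned result that reports a given number of steps.
double credit_for_steps(double steps, double per_step) {
    return steps * per_step;
}
```

Because the credit depends only on the reported step count, a faster host (or a GPU) earns the same credit per result and simply returns more results per day.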
As far as bypassing CreditNew, you need to:
1. create a "compute_granted_credit" method which calculates the credit using your chosen method
2. right before the validator calls "grant_credit" which inserts the credit into the database, comment out the current method and insert your own call to "compute_granted_credit" (google it to see how it used to be done).
It is declared like:
extern double compute_granted_credit(WORKUNIT&, vector<RESULT>& results);
And used like:
if (!no_credit) {
    //JMS - bypass credit new
    //result.granted_credit = credit;
    result.granted_credit = compute_granted_credit(wu,result);
    //JMS - end bypass credit new
    grant_credit(host, result.sent_time, result.granted_credit, result.hostid);
}
There are at least two places in the validator.cpp code where that will need to be done.
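To give an idea of what such a function could contain for a tile-based project, here is a standalone sketch. It does not use the real BOINC headers: `DemoWorkunit` and `DemoResult` are hypothetical stand-ins for `WORKUNIT` and `RESULT`, the `tiles_done` field is an assumption (the real `RESULT` struct has no such member, so the count would have to be parsed from the output file), and taking the smallest count among valid results is just one possible policy.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical stand-ins for BOINC's WORKUNIT and RESULT records.
struct DemoWorkunit { int id; };
struct DemoResult {
    bool valid;        // assumed to be set by the validator's comparison step
    double tiles_done; // assumed to be parsed from the result's output file
};

// Same shape as the declaration quoted above, but on the stand-in types.
// Credit is based on the smallest tile count among the valid results, so a
// single inflated count cannot raise the grant.
double demo_compute_granted_credit(DemoWorkunit& wu, std::vector<DemoResult>& results,
                                   double credits_per_tile) {
    (void)wu; // not needed in this sketch
    assert(credits_per_tile >= 0.0);
    bool found = false;
    double tiles = 0.0;
    for (const DemoResult& r : results) {
        if (!r.valid) continue;
        tiles = found ? std::min(tiles, r.tiles_done) : r.tiles_done;
        found = true;
    }
    return found ? tiles * credits_per_tile : 0.0;
}
```

In a real validator the function would take the BOINC types and its return value would be assigned to `result.granted_credit` as in the snippet above.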
And also a useful hint
One other thing. When you figure out how many credits to grant per WU or per whatever you choose as your measurement, you may want to have the number of credits read from a file rather than hard-coded in the validator. That way, you don't have to re-build and re-deploy every time you want to make a change.
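A minimal sketch of that hint, assuming the file contains a single number (for example the credits per tile). The function names, the file layout and the fallback behaviour are all assumptions made for illustration. Nothing here is a BOINC API.

```cpp
#include <fstream>
#include <istream>
#include <sstream>
#include <string>

// Reads one floating-point value from a stream. Returns fallback when the
// stream does not start with a parsable, non-negative number.
double read_credit_rate(std::istream& in, double fallback) {
    double value = 0.0;
    if (in >> value && value >= 0.0) return value;
    return fallback;
}

// Opens path and reads the rate from it. A missing or unreadable file
// yields the fallback, so the validator keeps working with a default.
double load_credit_rate(const std::string& path, double fallback) {
    std::ifstream file(path);
    if (!file) return fallback;
    return read_credit_rate(file, fallback);
}
```

The validator could call `load_credit_rate` once at startup (or once per validation pass) with a path of the project's choosing, so changing the rate only means editing the file.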
|
|
valterc
Ok, that's all. I started giving away invitation codes to test the application. This is the most important thing right now. We may postpone everything related to credits for a little while and use the default BOINC system, just for now.
Please think carefully about the credit system (we may also find a way to have a meeting focused on this).
|
|
valterc
The credits granted so far are surely too few. We want to grant more or less the same amount as Asteroids or PrimeGrid's TRP Sieve subproject.
When we start using the new credit system we will also multiply each user's total credits by a (correction) factor.
|
|