Report Deadlines

Message boards : Number crunching : Report Deadlines

additude
Joined: 3 Feb 19
Posts: 10
Credit: 710,675
RAC: 0
United States
Message 1492 - Posted: 9 Feb 2019, 16:57:49 UTC
Last modified: 9 Feb 2019, 17:17:28 UTC

A few days ago (6 days) I turned 4 of my ARM machines over to crunch outside my team and I am running Ubuntu headless. Clean fresh install, updated, installed BOINC and attached to TN-GRID project. I also attached to Universe Project.

I have downloaded WU's from both, but my machines will only run TN-GRID WU's and I believe the reason has something to do with report deadlines.


I'm monitoring today and every TN-GRID WU I'm processing has a due date of today, Feb. 9th. If due dates are in local time, I finished one WU less than 30 minutes before its due time. If the time is in GMT, I'm 4 hrs and 30 minutes late.
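
To make the local-versus-GMT arithmetic concrete, here is a minimal sketch. The timestamps are hypothetical, and the UTC-5 offset is an assumption inferred from the five-hour gap between "30 minutes early" and "4 hrs and 30 minutes late"; neither comes from the actual task list.

```python
from datetime import datetime, timedelta, timezone

# Assumption: the local zone is UTC-5 (inferred from the 5-hour gap in the post).
LOCAL = timezone(timedelta(hours=-5))

def margin_minutes(finished, deadline):
    """Minutes of slack: positive means on time, negative means late."""
    return (deadline - finished).total_seconds() / 60

# Hypothetical values: task finished at 11:30 local, deadline displayed as 12:00.
finished = datetime(2019, 2, 9, 11, 30, tzinfo=LOCAL)
deadline_if_local = datetime(2019, 2, 9, 12, 0, tzinfo=LOCAL)
deadline_if_utc = datetime(2019, 2, 9, 12, 0, tzinfo=timezone.utc)

print(margin_minutes(finished, deadline_if_local))  # 30.0   (30 minutes early)
print(margin_minutes(finished, deadline_if_utc))    # -270.0 (4 h 30 min late)
```

The same displayed clock time gives either result depending only on which zone it is read in.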


So far I've done 200 WU's in 6 days... so there's no slack time on my part.....

I haven't run a single Universe WU yet because of the report deadline precedence of TN-GRID WU's....

I think....

I'm just wondering if there is any sensible explanation...

Thanks.


This is a link to WU's I received on 4th...due on the 9th....

http://gene.disi.unitn.it/test/results.php?userid=2295&offset=260&show_names=0&state=0&appid= Due in 5 days....


That with other WU's I have 229 in progress......

Jim1348
Joined: 29 Dec 16
Posts: 87
Credit: 21,013,002
RAC: 0
United States
Message 1493 - Posted: 9 Feb 2019, 22:53:21 UTC - in response to Message 1492.

A five day deadline is normal. It appears that you downloaded too many initially, and that is why the last ones are having trouble finishing on time. It is probably due to inaccurate time estimates of the initial group of work units. That should be corrected over time, and after the first batch, it should be better (i.e., you download less each time you request them). That is the way the BOINC scheduler is supposed to work, and sometimes it does.

But I am on Linux and Windows, and assume it will work out OK on ARM also.

additude
Joined: 3 Feb 19
Posts: 10
Credit: 710,675
RAC: 0
United States
Message 1494 - Posted: 10 Feb 2019, 11:35:27 UTC - in response to Message 1493.

Cool....I'm hoping it will iron itself out...
But a 5 day WU is a pretty tight schedule, I think especially on machines that are not dedicated.


Thanks.

Profile Buro87 [Lombardia]
Joined: 23 Nov 16
Posts: 100
Credit: 4,000,541
RAC: 0
Italy
Message 1495 - Posted: 11 Feb 2019, 8:56:15 UTC
Last modified: 11 Feb 2019, 8:58:57 UTC

It happens on all projects. If you have a lot of WUs with tight deadlines, BOINC tries to finish them first. For example, yesterday on WCG with the subprojects FAH (1 day deadline) and MIP (10 days), BOINC completed all the FAH WUs first because I have set a work request of 3 days.

If your internet connection is always on, try setting the work request to 1 day or less. In that case BOINC will switch between projects because it sees that the WUs are not in such a hurry.

additude
Joined: 3 Feb 19
Posts: 10
Credit: 710,675
RAC: 0
United States
Message 1499 - Posted: 14 Feb 2019, 14:20:39 UTC - in response to Message 1495.

Yea, Thanks.

I ended up doing a nomorework and then I had to suspend TN for 24 hours...


I put my TN settings to 10% and 2 days work.


Once I get a balance between projects I'll reinstate everything...

Thanks again.

-Wes

Profile valterc
Project administrator
Project tester
Joined: 30 Oct 13
Posts: 623
Credit: 34,677,535
RAC: 1
Italy
Message 1500 - Posted: 15 Feb 2019, 10:24:34 UTC - in response to Message 1499.

For a long time TN-Grid had workunits with a 4-day deadline. A couple of months ago I added another day. I could obviously add another day to the deadline, maybe reducing the number of workunits a computer can cache. It's just a matter of balancing the overall system. The main reason for having a short deadline is that a lot of workunits are returned far too late (even months later).

additude
Joined: 3 Feb 19
Posts: 10
Credit: 710,675
RAC: 0
United States
Message 1501 - Posted: 15 Feb 2019, 12:47:22 UTC - in response to Message 1500.

Hi,
I've been doing this for quite a while and although I haven't done TN-Grid until recently, I do run multiple projects on one machine, and when running more than one project there has to be a harmonious balance between them. Most other projects have report deadlines of 10 to 14 days.

Case 1. I see some WU's can take 20 hours to complete, while some are 5 or 6 hours. Well, 5 hours is 1 hour a day of processing over 5 days. 20 is 4 hours a day. A percentage of folks don't run dedicated machines like I do and they don't leave their machines powered on 24/7. These folks are being excluded from the project because they can't meet the imposed report deadlines. I have some completed WU's that are "Completed, too late to validate" and that means I spent 15 hours of my time for nothing. BOINC refers to this as "Waste". If that's going to be the case then I am forced to move on to a different project because I am forfeiting any incentive to process these WU's.
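
The per-day figures in Case 1 can be reproduced with a short sketch. It assumes a single task running on a single core and uses the 5-hour and 20-hour runtimes quoted above; the 10-day column is only the alternative deadline under discussion, not a project setting.

```python
def hours_per_day(task_hours, deadline_days):
    """Average daily processing time needed to finish one task by its deadline."""
    return task_hours / deadline_days

# Runtimes from the post (5 h and 20 h) against a 5-day and a 10-day deadline.
for task_hours in (5, 20):
    for deadline_days in (5, 10):
        needed = hours_per_day(task_hours, deadline_days)
        print(f"{task_hours:>2} h task, {deadline_days:>2} day deadline: {needed} h/day")
```

Doubling the deadline halves the daily uptime a part-time machine needs for the same task.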

Case 2. BOINC Manager has its own priority algorithm (EDF) that prioritizes WU's by due date. If I'm processing two projects at any processing percentage between them, 50/50, 1/99, etc., and one project has a 14 day due date and the other a 5 day due date, then you can see how BOINC will tend to favor the 5 day WU's and override any settings I have for splitting processing time between projects. Even if I have it tilted strongly in favor of the other project, BOINC will override that for a due date priority.
It's not just about setting the cache size to 1. That defeats the purpose of having a WU cache at all, which is to compensate for connection problems, etc. It has been shown by BOINC that a cache value of 1 increases "Waste" dramatically.
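
The effect described in Case 2 can be illustrated with a minimal earliest-deadline-first sketch. This is a simplified single-core model, not BOINC's actual scheduler code (which, as far as I understand it, also weighs resource share and only leans on deadline order when it projects a miss); the `Task` class and the task values are hypothetical.

```python
from dataclasses import dataclass

@dataclass
class Task:
    project: str
    remaining_hours: float  # estimated CPU time still needed
    deadline_hours: float   # hours from now until the report deadline

def edf_order(tasks):
    """Earliest deadline first: the task due soonest runs first."""
    return sorted(tasks, key=lambda t: t.deadline_hours)

def missed_deadlines(tasks):
    """Run tasks back to back in EDF order on one core; return the late ones."""
    clock = 0.0
    late = []
    for task in edf_order(tasks):
        clock += task.remaining_hours
        if clock > task.deadline_hours:
            late.append(task)
    return late

# Hypothetical queue: 14-day Universe tasks mixed with 5-day TN-Grid tasks.
tasks = [
    Task("Universe", 6, 14 * 24),
    Task("TN-Grid", 20, 5 * 24),
    Task("Universe", 6, 14 * 24),
    Task("TN-Grid", 20, 5 * 24),
]

print([t.project for t in edf_order(tasks)])
# ['TN-Grid', 'TN-Grid', 'Universe', 'Universe']
print(missed_deadlines(tasks))  # []
```

Note that the resource share never enters the ordering: under pure deadline order the 5-day tasks always go first, whatever the split between projects.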

TN-Grid supplies WU's that the community processes and returns for an incentive. That incentive, as simple as it is, is a numerical value, status and a badge signifying accomplishment.
TN-Grid needs results as soon as it can get them all returned. It would be great if every BOINCer had TN-Grid project on their machines, but it is not that way. A 5 day due date is a further restriction that limits the number of WU's that can be processed by everyone attached to the project. Assigned tasks that do not get processed need to be re-issued.

You may think that a 5 day due date provides improved project performance and quicker turnaround, but that in itself is limiting. It is a restriction that may be delaying results by cutting the number of users to those who have the "means" to process a task in 5 days or less...and by losing those who don't participate because they don't even want to deal with it.

Universe, Asteroids, Seti - all 10-14 days. There is valid logic behind that number.

In reality, a minimal report deadline is directly comparable to a minimal cache size. The smaller they are, the higher the "Waste".

I recommend setting report deadline dates to 10 days and monitoring this to see what scheduling and results performance does. It might take 4 weeks or a bit longer to balance and get into a rhythm. But you should continually see improvements in efficiency.

Just think about it.

Jim1348
Joined: 29 Dec 16
Posts: 87
Credit: 21,013,002
RAC: 0
United States
Message 1504 - Posted: 15 Feb 2019, 14:11:39 UTC - in response to Message 1501.

I always keep the default buffer of 0.1 + 0.5 days, and like short turnarounds. You can always argue for special cases, but projects are designed to maximize the science (or should be designed that way).

additude
Joined: 3 Feb 19
Posts: 10
Credit: 710,675
RAC: 0
United States
Message 1505 - Posted: 15 Feb 2019, 14:26:11 UTC - in response to Message 1501.
Last modified: 15 Feb 2019, 14:33:57 UTC

I just configured TN-Grid to allow-more-work and it downloaded ~50 Tasks for each of 4 machines, due in 5 days.....which means that for each machine I have 960 / 10 = 96 hours of available processing time for TN over the course of 5 days. I have TN-Grid set to 10% of two projects, 3day buffer +2days.
So I feel that this is over-extended so I would need to adjust the buffer to 1day +0 days...But that will affect my other project because this is a BOINC global setting.
I also have the global_prefs_override.xml file set to 5 days.
So I will have to disable global_prefs_override.xml...

I think this is going to trigger the (EDF) priority in Boinc and I will lose all processing of my second project even though it's set to 90%

This is just an example of the problem I am experiencing because of the 5 day report deadline.
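
A rough sketch of the budget arithmetic above. The 8-core figure is an assumption (it is what makes 5 days come out to 960 core-hours), and the 5-hour and 20-hour task runtimes are the ones mentioned earlier in the thread, not measured values for these 50 tasks.

```python
HOURS_PER_DAY = 24

def project_budget(cores, days, share_percent):
    """Core-hours available to one project over the deadline window."""
    return cores * HOURS_PER_DAY * days * share_percent / 100

def max_tasks(budget_hours, hours_per_task):
    """How many whole tasks fit into the budget."""
    return int(budget_hours // hours_per_task)

# Assumption: 8 cores per machine, so 8 * 24 * 5 = 960 core-hours in 5 days.
budget = project_budget(cores=8, days=5, share_percent=10)
print(budget)                 # 96.0

# 50 downloaded tasks against that budget, at 5 h and 20 h per task.
print(50 * 5, 50 * 20)        # 250 1000
print(max_tasks(budget, 5))   # 19
print(max_tasks(budget, 20))  # 4
```

Under these assumptions even the shortest tasks need about 250 core-hours against a 96-hour budget, so the 10% share cannot be honored without missing deadlines.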

additude
Joined: 3 Feb 19
Posts: 10
Credit: 710,675
RAC: 0
United States
Message 1506 - Posted: 15 Feb 2019, 14:45:22 UTC - in response to Message 1504.
Last modified: 15 Feb 2019, 15:12:43 UTC

Agreed Jim1348 if you're a single project researcher, but there are more projects than just one; and Boinc and the account managers are all designed and equipped to run multiple projects simultaneously.

As far as "short turnarounds" go in relation to due dates, that's technically an irrelevant consideration because the task gets completed when the task gets completed.

If the "due date" is at 10 days and the task is completed in 1 day, then so be it.

Move on to the next task, then the next... The due date has no relevancy in that respect.

Jim1348
Joined: 29 Dec 16
Posts: 87
Credit: 21,013,002
RAC: 0
United States
Message 1507 - Posted: 15 Feb 2019, 15:56:47 UTC - in response to Message 1506.
Last modified: 15 Feb 2019, 16:13:00 UTC

Agreed Jim1348 if you're a single project researcher, but there are more projects than just one; and Boinc and the account managers are all designed and equipped to run multiple projects simultaneously.

But not all projects run well with each other. CPDN is a famous example, with typical runs of 7 to 10 days (or more), and due dates of one year (ridiculous).
It is best to just run it by itself, if you are willing to devote a machine to it.

But I run multiple projects all the time. However, my machines run 24/7 and usually have no problem getting the work done. The complaints are usually from people who run their machines part time (especially laptops), and want the projects to accommodate their desire for long buffers. I wouldn't, since that delays the science for all of them.

As far as "short turnarounds" go in relation to due dates, that's technically an irrelevant consideration because the task gets completed when the task gets completed.

But the due date is relevant if the tasks don't get completed on time. Again, many people want the projects to accommodate their usage of their machines. I would do it the other way, and keep short buffers so that the machines can accommodate how the crunchers use them. Otherwise, the whole system (for all projects) is delayed, results are returned late, etc.

(It is up to the projects to decide how best to maximize their output of course. I am just offering my own opinions.)

additude
Joined: 3 Feb 19
Posts: 10
Credit: 710,675
RAC: 0
United States
Message 1508 - Posted: 15 Feb 2019, 17:57:16 UTC - in response to Message 1507.
Last modified: 15 Feb 2019, 18:13:14 UTC

Hi Jim1348,

I appreciate your feedback.
What I read from what you've told me is that an extended task due date wouldn't affect you or your process one way or the other.

Extended or not, you just keep on doing what you do and remain completely unaffected one way or the other so it really makes no difference in the process you follow, or your results.

That's the beauty of it. If you weren't informed of it, you wouldn't even be aware of it in your process.

One thing I would not be in total agreement with you about is when you say "It is up to the projects to decide how best to maximize their output".
I would remind you that there are numerous configuration settings for personalizing the BOINC process to suit a user's needs and requirements for efficiency and maximizing throughput. BOINC has also detailed a few technical studies focusing on BOINC efficiencies, report deadlines being a part of those trials.

From my experience, extending a task due date has more benefit than detriment. Especially when it has none, or almost zero ( I don't know of any ), impact on current users or even current tasks in process.

It's as simple as trial and error. There is no harm in trying; it is simple to implement and just as simple to reset if something doesn't work out.

Jim1348
Joined: 29 Dec 16
Posts: 87
Credit: 21,013,002
RAC: 0
United States
Message 1509 - Posted: 15 Feb 2019, 22:29:23 UTC - in response to Message 1508.
Last modified: 15 Feb 2019, 22:29:54 UTC

From my experience, extending a task due date has more benefit than detriment. Especially when it has none, or almost zero ( I don't know of any ), impact on current users or even current tasks in process.

If it is OK with the project, it is OK with me. But some projects need the result of one work unit to calculate the starting point for another. That is the case with Folding@Home in particular. In fact, they developed their own client for it (they don't use BOINC), so that as one work unit is 99% complete, they start downloading the next, in order to minimize the overlap. In effect, they keep a zero buffer. There are some BOINC ones that do something similar, except that they are not so strict in their requirements. It depends on the science.

But in the extreme case of CPDN, since they have a one-year time limit, if a machine crashes and the work unit is lost, they don't even know about it for a year. So they can't send out a replacement until then. If you want to get your science done, you need something faster.



Copyright © 2024 CNR-TN & UniTN