Posts by valterc
21) Message boards : News : Summer break (Message 3270)
Posted 10 Jul 2023 by Profile valterc
I just put the last Vitis vinifera genes into our queue. At the current pace it will take about a week, maybe less, to distribute them all. The new datasets are not ready yet... Also, I will be on vacation for the next two weeks, and it wouldn't be advisable to start new work without being able to easily oversee the system.
Therefore, in a few days, the work generator will no longer be able to produce new tasks; only resends will be floating around.
I hope to be able to restart the system with a new project at the beginning of August.

Thank you all!
22) Message boards : Number crunching : OUT of tasks (Message 3268)
Posted 6 Jul 2023 by Profile valterc
Morning,

Which project is next? The current one will be over in a few days...

Thx!
Javi F

I don't really know right now. There are two possibilities, both related to humans. The scientists are working on the input datasets but, during the summertime, everything tends to slow down.
23) Message boards : News : The end of the FANTOM-1 experiment (Message 3266)
Posted 5 Jul 2023 by Profile valterc
As a follow-up, I would like to mention that all the results of the FANTOM-1 experiment have been collected and locally double-checked. A couple of gene expansions were "corrupted", probably as a consequence of our faulty file system. However, this issue has been fixed, and now all the data is available to the scientists. We are also working on setting up a suitable way to publish the results, which will include a dedicated web application and scientific publications.
Summertime is here, and related vacations will somewhat slow this process.
24) Message boards : Number crunching : OUT of tasks (Message 3217)
Posted 26 May 2023 by Profile valterc
I might try this solution but I don't think it will work. I will try to explain why.
The speed of the work generator depends only on the number of genes in our dataset. The program itself, from the computational point of view, is rather simple: it builds 2000 random permutations of (in the case of Vitis) around 22000 numbers, then it makes tiles (slices) of them and (now) it packs them 600 per workunit. It's a little bit more complex than this but this gives you the idea.
The speed of the PC algorithm on a "tile" (the application that you run) depends on the "structure" of the dataset, and on Vitis-Vespucci it is faster than on the previous one (Human-FANTOM).
Right now a single run of the work generator builds 77 workunits (154 tasks because of the validation requirements). Say, for example, that a workunit is computed on an ideal computer in one hour. This will keep that ideal computer busy for 77 hours. If I, theoretically, packed all the tiles into a single workunit, it would keep that computer busy for 77 hours: exactly the same time, but with a higher risk of computational errors. The work generator would take about the same time, say a few seconds less because it would create just one file instead of 77.
Anyway I may be wrong... I will slightly increase the tiles per workunit starting from the next batch (it will take a couple of days).
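
To make the description above a bit more concrete, here is a minimal Python sketch of the permutation, tiling and packing steps. It is not the project's actual work generator (which, as noted, is more complex); the function names are hypothetical, and it assumes that each tile holds the seed gene plus a slice of the shuffled remaining genes. The toy sizes are chosen only so that it runs instantly.

```python
import random


def make_tiles(gene_ids, seed_gene, tile_size, n_permutations, rng):
    """Shuffle the non-seed genes n_permutations times and slice each
    shuffle into tiles; every tile is the seed gene plus up to
    tile_size - 1 other genes (assumed layout, for illustration)."""
    others = [g for g in gene_ids if g != seed_gene]
    step = tile_size - 1  # room left in a tile after the seed gene
    tiles = []
    for _ in range(n_permutations):
        perm = others[:]
        rng.shuffle(perm)
        for start in range(0, len(perm), step):
            tiles.append([seed_gene] + perm[start:start + step])
    return tiles


def pack_workunits(tiles, tiles_per_workunit):
    """Group consecutive tiles into workunits; the last one may be smaller."""
    return [tiles[i:i + tiles_per_workunit]
            for i in range(0, len(tiles), tiles_per_workunit)]


# Toy sizes: 50 genes, tiles of 10, 4 permutations, 6 tiles per workunit.
rng = random.Random(0)
tiles = make_tiles(range(50), seed_gene=7, tile_size=10, n_permutations=4, rng=rng)
workunits = pack_workunits(tiles, tiles_per_workunit=6)
print(len(tiles), len(workunits))  # 24 tiles packed into 4 workunits
```

Packing only groups tiles that already exist, so changing the number of tiles per workunit does not change how many tiles must be generated, which fits the point that the work generation time stays about the same.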
25) Message boards : Number crunching : OUT of tasks (Message 3209)
Posted 24 May 2023 by Profile valterc
We are at 0 (Zero) available tasks again.

I suggest running multiple work generators in parallel

This would be good but we don't have the resources. Only one instance of the work generator can run at a time (it needs a rewrite). Also, our current hardware would not be able to support it. We need new hardware and a (slightly) new work generator. We are working on it, but it is not that easy (finding money).
26) Message boards : Number crunching : OUT of tasks (Message 3205)
Posted 23 May 2023 by Profile valterc
I just modified wus*core from 8 to 6 (don't remember if I also need to restart the server), also reduced the deadline from 5 to 4 days.
27) Message boards : Number crunching : OUT of tasks (Message 3202)
Posted 23 May 2023 by Profile valterc
I don't like your definition of a work unit. A work unit is one unit that a BOINC worker downloads, aka an "expansion" in your terms is a "work unit" in my terms.

"Work unit" = 1 unit of work downloaded by BOINC = 1 task

What you have there is not a work unit but more like a gene-slice

Anyway, we need more work units and a bigger queue of available work units on the BOINC server side

That's exactly the definition of a workunit. What you actually download is a collection of 600 computational chunks (you also download the expression dataset, shared among the genes). Each chunk is a run of the PC algorithm on a tile of size 1000, made up of the "seed" gene and a random subset of the other genes.
A single gene expansion is made up of 77 workunits. We wait until all of them come back and build up the gene expansion list.
You can also infer this by looking at the name of a workunit, like:
236784_Vv_vv-VIT-01s0150g00140_wu-57_1684831345366
236784 an internal id
Vv the organism
vv-VIT-01s0150g00140 the gene name
wu-57 workunit 57 (out of 77)
1684831345366 a timestamp
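
The naming scheme above can be unpacked programmatically. The following Python sketch is only an illustration: the `WorkunitName` class and `parse_workunit_name` function are hypothetical names, and reading the last field as milliseconds since the Unix epoch is an assumption (it does yield a date matching this post, 23 May 2023).

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass(frozen=True)
class WorkunitName:
    internal_id: int
    organism: str
    gene: str
    index: int          # position of this workunit within the gene expansion
    created: datetime   # assumed: timestamp field is Unix epoch milliseconds


def parse_workunit_name(name: str) -> WorkunitName:
    # Split two fields off each end, so a gene name containing "_" survives.
    internal_id, organism, rest = name.split("_", 2)
    gene, wu_part, stamp = rest.rsplit("_", 2)
    if not wu_part.startswith("wu-"):
        raise ValueError(f"unexpected workunit field: {wu_part!r}")
    epoch = datetime(1970, 1, 1, tzinfo=timezone.utc)
    return WorkunitName(
        internal_id=int(internal_id),
        organism=organism,
        gene=gene,
        index=int(wu_part[len("wu-"):]),
        created=epoch + timedelta(milliseconds=int(stamp)),
    )


wu = parse_workunit_name("236784_Vv_vv-VIT-01s0150g00140_wu-57_1684831345366")
print(wu.gene, wu.index, wu.created.isoformat())
```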
28) Message boards : Science : The new Vitis vinifera project is underway (Message 3200)
Posted 23 May 2023 by Profile valterc
Sorry for that, but I made a mistake in my previous statement... it's not 800 but 600
29) Message boards : Number crunching : OUT of tasks (Message 3198)
Posted 23 May 2023 by Profile valterc
> Another solution to this problem is to double workunit length

But is generating longer (or larger?) work units harder than, or the same as, generating small work units?
So they will take up 6 hours of CPU time rather than 3 hours on average?

There are, right now, 600 small computational chunks inside a workunit (the last one for every gene is smaller). I could easily increase or decrease that number; the computational time is proportional to it. The work generation time is almost independent of it, so it will take around the same time to split an "expansion" into, say, 77 (right now) or 144 or 35 workunits.
The choice of the right number is a compromise between a lot of things. A very fast workunit will overuse the network connection; a very long one will not be good for people with slow computers or intermittent dedicated time, and it increases the chance of wasting resources due to computational errors. The deadline should also be adjusted accordingly.
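
As a rough worked example of this trade-off, the Python sketch below counts workunits for different chunk sizes. It uses figures quoted in these threads (22393 genes, tiles of size 1000, 2000 permutations) and assumes that each tile is the seed gene plus 999 others, which is a guess at the layout rather than a confirmed detail. The function names, and the chunk sizes 320 and 1315, are hypothetical and picked only to land on 144 and 35 workunits.

```python
import math


def tiles_per_expansion(n_genes, tile_size, n_permutations):
    """Total tiles for one gene, assuming seed + (tile_size - 1) others per tile."""
    per_permutation = math.ceil((n_genes - 1) / (tile_size - 1))
    return per_permutation * n_permutations


def workunit_layout(total_tiles, chunks_per_workunit):
    """Return (number of workunits, chunks in the last workunit)."""
    count = math.ceil(total_tiles / chunks_per_workunit)
    last = total_tiles - (count - 1) * chunks_per_workunit
    return count, last


total = tiles_per_expansion(n_genes=22393, tile_size=1000, n_permutations=2000)
for chunks in (320, 600, 1315):
    count, last = workunit_layout(total, chunks)
    print(f"{chunks:>5} chunks/workunit -> {count:>3} workunits, last holds {last}")
```

Under these assumptions, 600 chunks per workunit gives 77 workunits with 400 chunks in the last one, which matches the numbers reported in these posts.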
30) Message boards : Science : Homo Sapiens (OneGenE - FANTOM-1) - End (Message 3186)
Posted 22 May 2023 by Profile valterc
You may see some Hs (FANTOM) workunits floating around again. After a preliminary check of the results we figured out the need to expand another dozen genes.
31) Message boards : Number crunching : OUT of tasks (Message 3185)
Posted 22 May 2023 by Profile valterc
Since there are 22393 genes in the V. vinifera project, could we run 22393 work-unit generators in parallel, on different servers?

Just a dozen in parallel would be more than enough (without stressing the system: db and storage)
32) Message boards : Number crunching : OUT of tasks (Message 3181)
Posted 21 May 2023 by Profile valterc
How difficult is it to build a parallel version of the work generator?

It shouldn't be too difficult. It doesn't need real parallelism; it would be more than enough to have a version that supports multiple instances running at the same time.
The problems are that it's rather complicated code that must carefully interact with the local gene db, and that it's written in Python (a language that I refused to learn ;)
33) Message boards : Number crunching : OUT of tasks (Message 3179)
Posted 20 May 2023 by Profile valterc
What would it take to generate work units faster? New software? More CPU cores? Faster cores?
The amount of available work units is pretty low for both the TN-GRID and Rosetta projects.

A brand new server :)
And a new parallel version of the work generator
34) Message boards : Number crunching : OUT of tasks (Message 3176)
Posted 16 May 2023 by Profile valterc
After moving to the new storage system, is the work generator still limited by the 600 WUs per 14 minutes?

Maybe a little bit faster (better I/O), but the performance is similar.
35) Message boards : Science : The new Vitis vinifera project is underway (Message 3175)
Posted 16 May 2023 by Profile valterc
A single gene expansion is made up of 77 workunits. Each one contains 800 small computational chunks (PCs); the last one (#77) contains just 400 of them. (I may change the number of chunks per workunit in the future.)
36) Message boards : News : The Vespucci-2023 Vitis vinifera OneGenE experiment (Message 3168)
Posted 15 May 2023 by Profile valterc
The Vespucci-2023 Vitis vinifera OneGenE experiment has officially started. For more scientific info about it, please look at this thread: https://gene.disi.unitn.it/test/forum_thread.php?id=370.
Thank you all for your support!
37) Message boards : Science : The new Vitis vinifera project is underway (Message 3166)
Posted 12 May 2023 by Profile valterc
For now I have just put a few genes into the queue. If everything goes well I will switch to full throttle this Monday, also adding the progress counter.
38) Message boards : News : Maintenance (filesystem and o.s.) (Message 3162)
Posted 11 May 2023 by Profile valterc
What is the current upgrade status?

The first (test) workunits will be out today
39) Message boards : News : Maintenance (filesystem and o.s.) (Message 3159)
Posted 3 May 2023 by Profile valterc
fixed it, and made better ;)

(it was Ubuntu 18, I prefer to make just one step at a time)
40) Message boards : News : Maintenance (filesystem and o.s.) (Message 3157)
Posted 3 May 2023 by Profile valterc
Update:
- the server is now running using the new filesystem
- the server has been updated to Ubuntu 20
I will keep BOINC switched off for a while until I fix the usual PHP problems in the web pages (please tell me if you find any of them)

The next experiment should be on Vitis vinifera but the preparation work for another one on Humans is also almost done.




Copyright © 2024 CNR-TN & UniTN