proterna_vae64lat dataset

valterc
Project administrator
Project tester
Joined: 30 Oct 13
Posts: 623
Credit: 34,677,535
RAC: 3
Italy
Message 2303 - Posted: 10 May 2021, 14:34:51 UTC
Last modified: 12 May 2021, 9:08:16 UTC

There is a new dataset floating around: proterna_vae64lat.

It was built by "crossing", with machine learning techniques, two different datasets collected from human tissues: protein expression levels and RNA expression levels. More (scientific) information will follow.

Only a small set of workunits will be available, at least at the beginning. You can recognize them by the workunit name, which contains the string "Hs_PRT-genename". These workunits should be quick to compute (around half an hour on a decent computer).
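For anyone who wants to filter their task list, here is a minimal sketch of the name check described above. It assumes only that the name contains the marker "Hs_PRT-" followed by a gene name; the sample names and the helper `is_proterna_workunit` are hypothetical and do not reflect the project's real naming scheme.

```python
# Marker taken from the post: workunit names contain "Hs_PRT-genename".
WORKUNIT_MARKER = "Hs_PRT-"


def is_proterna_workunit(name: str) -> bool:
    """Return True if the workunit name contains the proterna marker."""
    return WORKUNIT_MARKER in name


# Hypothetical names, made up for illustration only.
names = ["example_Hs_PRT-GENE1_wu_0", "example_other_wu_1"]
matching = [n for n in names if is_proterna_workunit(n)]
print(matching)
```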

Please let me know if you notice anything strange with them.

Falconet
Joined: 21 Dec 16
Posts: 105
Credit: 3,092,711
RAC: 0
Portugal
Message 2304 - Posted: 10 May 2021, 21:08:15 UTC - in response to Message 2303.

Thanks for the info.

Buro87 [Lombardia]
Joined: 23 Nov 16
Posts: 100
Credit: 4,000,541
RAC: 0
Italy
Message 2306 - Posted: 12 May 2021, 10:46:16 UTC - in response to Message 2304.

Great news 😁

Aurum
Joined: 18 Jul 18
Posts: 97
Credit: 291,386,295
RAC: 0
United States
Message 2307 - Posted: 12 May 2021, 12:11:46 UTC - in response to Message 2303.

Quote (Message 2303): More (scientific) information will follow.

Waiting with bated breath.

toma
Project scientist
Joined: 6 Jun 18
Posts: 3
Credit: 0
RAC: 0
Italy
Message 2308 - Posted: 12 May 2021, 19:48:59 UTC

The Genotype-Tissue Expression (GTEx) project is an ongoing effort to build a public resource of tissue-specific human gene expression. The Enhancing GTEx (eGTEx) project extends GTEx by combining gene expression with additional molecular measurements on the same tissues. Within the eGTEx project, protein and RNA expression levels in 32 normal human tissues from 14 individuals were obtained by quantitative mass spectrometry and RNA sequencing, respectively. The original dataset was downloaded from https://www.gtexportal.org/home/datasets. Data on protein and RNA levels for approximately 12,000 genes were integrated with an approach based on a variational autoencoder, yielding a latent representation composed of 64 variables.
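To give a rough idea of what "a latent representation composed of 64 variables" means, here is a minimal sketch of the encoder side of a variational autoencoder in plain Python. It is not the project's actual model: the linear encoder, the random untrained weights, the concatenation of one protein value and one RNA value per tissue, and all function names are assumptions made for illustration. Only the numbers 32 (tissues) and 64 (latent variables) come from the description above. The decoder and the training loop are omitted.

```python
import math
import random

N_TISSUES = 32   # from the post: 32 tissues
LATENT_DIM = 64  # from the post: 64 latent variables


def init_layer(rng, n_out, n_in, scale=0.05):
    """Random (untrained) weights and zero biases for a linear layer."""
    weights = [[rng.gauss(0.0, scale) for _ in range(n_in)] for _ in range(n_out)]
    return weights, [0.0] * n_out


def linear(weights, bias, x):
    """Compute W @ x + b using plain lists."""
    return [sum(w * xi for w, xi in zip(row, x)) + b
            for row, b in zip(weights, bias)]


def encode(params, protein, rna):
    """Map one gene's concatenated protein+RNA profile to (mu, log_var)."""
    x = list(protein) + list(rna)  # assumed input layout
    return linear(*params["mu"], x), linear(*params["log_var"], x)


def sample_latent(rng, mu, log_var):
    """Reparameterization trick: z = mu + sigma * eps, with eps ~ N(0, 1)."""
    return [m + math.exp(0.5 * lv) * rng.gauss(0.0, 1.0)
            for m, lv in zip(mu, log_var)]


def kl_divergence(mu, log_var):
    """KL divergence between N(mu, sigma^2) and the standard normal prior."""
    return -0.5 * sum(1.0 + lv - m * m - math.exp(lv)
                      for m, lv in zip(mu, log_var))


rng = random.Random(0)
params = {
    "mu": init_layer(rng, LATENT_DIM, 2 * N_TISSUES),
    "log_var": init_layer(rng, LATENT_DIM, 2 * N_TISSUES),
}
# Made-up expression values for a single hypothetical gene.
protein = [rng.random() for _ in range(N_TISSUES)]
rna = [rng.random() for _ in range(N_TISSUES)]

mu, log_var = encode(params, protein, rna)
z = sample_latent(rng, mu, log_var)
print(len(z))
```

In a trained model the KL term is added to a reconstruction loss, and the mean vector `mu` (or a sample `z`) serves as the compact description of each gene.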

Aurum
Joined: 18 Jul 18
Posts: 97
Credit: 291,386,295
RAC: 0
United States
Message 2309 - Posted: 13 May 2021, 9:51:21 UTC

Thanks, toma. It sounds like the combinations are endless. I'm not familiar with the field of gene expression, but things like various skin diseases and osteoporosis come to mind. Are there examples of how eGTEx might lead to new or better diagnostics or treatments?

valterc
Project administrator
Project tester
Joined: 30 Oct 13
Posts: 623
Credit: 34,677,535
RAC: 3
Italy
Message 2312 - Posted: 18 May 2021, 14:08:37 UTC

I just inserted in the queue 51 genes (51*18 workunits) related to leukemia (see https://omim.org/), to be "expanded" within the proterna_vae64lat dataset.
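As a quick back-of-the-envelope check of the batch size: the counts below come from this post, while the half-hour runtime is the rough per-workunit figure from the first message in this thread, so the total is only an approximate estimate.

```python
GENES_QUEUED = 51        # from the post
WORKUNITS_PER_GENE = 18  # from the post
HOURS_PER_WORKUNIT = 0.5 # rough figure: "around half an hour on a decent computer"

total_workunits = GENES_QUEUED * WORKUNITS_PER_GENE
estimated_cpu_hours = total_workunits * HOURS_PER_WORKUNIT

print(total_workunits)      # 918
print(estimated_cpu_hours)  # 459.0
```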

[VENETO] boboviz
Joined: 12 Dec 13
Posts: 183
Credit: 4,641,505
RAC: 0
Italy
Message 2313 - Posted: 20 May 2021, 5:29:09 UTC - in response to Message 2312.

Quote (Message 2312): I just inserted in the queue 51 genes (51*18 workunits) related to leukemia (see https://omim.org/), to be "expanded" within the proterna_vae64lat dataset.

Great!!

toma
Project scientist
Joined: 6 Jun 18
Posts: 3
Credit: 0
RAC: 0
Italy
Message 2314 - Posted: 20 May 2021, 20:50:09 UTC - in response to Message 2309.

Unfortunately, the eGTEx dataset does not currently contain samples from bone or skin; our first analyses will therefore focus on genes responsible for neurodegenerative diseases and hematologic malignancies.



Copyright © 2024 CNR-TN & UniTN