The Gene Expansion Network project

Gene@home is a scientific project that is part of the TrentoGrid infrastructure. It aims to expand networks of genes, and to perform this task it uses the computational power of volunteers through BOINC, a platform for distributed computing.
The experiment is carried out on the plant Arabidopsis thaliana.

Every living being has a genetic code and a set of genes, which carry the coded information needed to produce proteins. Genes are necessary for the life and maintenance of organisms, and are expressed inside cells: the information they contain is transcribed and translated into proteins.
Gene expression is based on a complex chain of events in which particular proteins act on gene regions; it can be simplified as a causal relationship between two genes.
Causality is a cause-and-effect link between two variables: the occurrence of one causes the occurrence of the other.

Gene expression information is usually represented as a Gene Regulatory Network (GRN), in which an edge indicates a causal relationship between two genes. This representation is very useful for predicting and manipulating the behavior of a system.
Every GRN can be expanded by adding or suggesting new genes related to the ones already known, which broadens the research and analysis of the network. However, only a few methods are available to perform the expansion, which is still an open challenge in bioinformatics.
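
As a minimal sketch of the representation described above, a GRN can be stored as a directed graph in which each gene maps to the set of genes it is assumed to act on. The gene names, the edges, and the helper regulators_of are hypothetical and only illustrate the idea; they are not taken from the project's data or code.

```python
# Hypothetical toy GRN: gene names and edges are illustrative only.
# Each key is a gene; its value is the set of genes it has an edge to,
# read as "key is modeled as a cause of each gene in the set".
grn = {
    "geneA": {"geneB", "geneC"},
    "geneB": {"geneC"},
    "geneC": set(),
}


def regulators_of(network, target):
    """Return, in sorted order, the genes that have an edge pointing at `target`."""
    return sorted(g for g, targets in network.items() if target in targets)


# Expanding the network means adding a candidate gene and its suggested edges.
# Here a hypothetical candidate "geneX" is suggested as a cause of geneA.
expanded = {gene: set(targets) for gene, targets in grn.items()}
expanded["geneX"] = {"geneA"}

print(regulators_of(grn, "geneC"))       # genes with an edge to geneC
print(regulators_of(expanded, "geneA"))  # genes with an edge to geneA after expansion
```

The original network is copied before the candidate is added, so the known GRN and its expanded version can be compared side by side.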

To perform the GRN expansion, gene@home exploits an algorithm called PC-IM.
It is an iterative implementation of the PC algorithm, which finds a gene network and studies its causal relationships; PC-IM aims to estimate whether the genes in a list of new genes can have a causal relationship with an already known GRN.
In particular, the new genes are partitioned into blocks and each block is merged with the GRN; the PC algorithm is then applied to each block to look for possible new relationships. At the end of the process the algorithm evaluates its own performance and, based on this, decides the final network to return as output.
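
The block-wise procedure described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the project's implementation. The functions partition, expand and toy_find_edges are hypothetical. The real PC algorithm is replaced by a toy stand-in that returns a fixed set of edges. Reshuffling the candidates at each iteration and scoring each candidate by how often it is linked to the GRN are assumptions. The final self-evaluation step is omitted.

```python
import random


def partition(genes, block_size, rng):
    """Shuffle the candidate genes and split them into blocks of at most block_size."""
    shuffled = list(genes)
    rng.shuffle(shuffled)
    return [shuffled[i:i + block_size] for i in range(0, len(shuffled), block_size)]


def expand(grn_genes, candidates, block_size, iterations, find_edges, seed=0):
    """Return, for each candidate, the fraction of iterations in which it was
    linked to at least one GRN gene.

    find_edges is a stand-in for the PC algorithm (an assumption of this sketch):
    it takes a list of genes and returns a list of (gene, gene) edges.
    """
    rng = random.Random(seed)
    grn = set(grn_genes)
    counts = {gene: 0 for gene in candidates}
    for _ in range(iterations):
        for block in partition(candidates, block_size, rng):
            # Merge the block with the known GRN genes and look for edges.
            edges = find_edges(list(grn_genes) + block)
            linked = set()
            for a, b in edges:
                if a in grn and b not in grn:
                    linked.add(b)
                elif b in grn and a not in grn:
                    linked.add(a)
            for gene in linked:
                counts[gene] += 1
    return {gene: count / iterations for gene, count in counts.items()}


def toy_find_edges(genes):
    """Toy stand-in for PC: reports a fixed, made-up set of edges when both ends are present."""
    made_up_edges = [("g1", "c2"), ("g2", "c4")]
    present = set(genes)
    return [(a, b) for a, b in made_up_edges if a in present and b in present]


frequencies = expand(
    grn_genes=["g1", "g2"],
    candidates=["c1", "c2", "c3", "c4", "c5"],
    block_size=2,
    iterations=3,
    find_edges=toy_find_edges,
)
print(frequencies)
```

Because every block is merged with the whole GRN, each candidate is tested against all known genes once per iteration, however the candidates are shuffled. Candidates that are linked in most iterations would be the ones suggested for the expansion.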