Welcome to the Institute of Genomics
The CNRGH is the French national research center for scientific questions that require high-throughput sequencing and genotyping, which it addresses by developing and deploying innovative integrated technologies. Its organization streamlines genetic and genomic research on human diseases by linking cohort constitution (DNA samples), identification of the responsible genes, and study of the transcriptome and epigenome.
The information technology and bioinformatics laboratories of the Institut de Génomique (IG) are involved in generating the data and in their primary and secondary processing. In these initial stages, the data produced by the sequencers are analyzed to estimate their quality, then filtered, interpreted and annotated. The resulting information is then organized and distributed to the biology research teams in a comprehensible form.
The IT infrastructure of the IG is centered on the data. The storage capacity, provided by network-attached file servers, is on the order of 1 PB (petabyte). The compute resources directly connected to the data servers are mainly dual-processor x86_64 servers (about 500 cores, typically with 8 GB of memory per core).
Sample management and sequencing operations are tracked by a laboratory information management system (LIMS) developed in-house or with contract organizations. These tools allow daily monitoring of operations and track every process from sample receipt through DNA extraction to sequencing and computational analysis. They also centralize the metrics used for quality control of the data generated.
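As an illustration only, the kind of step-by-step tracking described above can be sketched in a few lines of Python. The stage names, the `SampleRecord` class and the sample identifier are hypothetical and are not taken from the LIMS used at the IG; the sketch simply shows how a record can enforce the order receipt, DNA extraction, sequencing, analysis.

```python
from dataclasses import dataclass, field
from enum import Enum


class Stage(Enum):
    # Hypothetical stage names mirroring the steps listed in the text.
    RECEIVED = 1
    DNA_EXTRACTED = 2
    SEQUENCED = 3
    ANALYZED = 4


@dataclass
class SampleRecord:
    sample_id: str
    history: list = field(default_factory=list)  # stages completed so far, in order

    @property
    def current_stage(self):
        """Return the last completed stage, or None if nothing is recorded yet."""
        return self.history[-1] if self.history else None

    def advance(self, stage: Stage) -> None:
        """Record the next stage, refusing any step that is out of order."""
        last = self.current_stage
        expected_value = Stage.RECEIVED.value if last is None else last.value + 1
        if stage.value != expected_value:
            raise ValueError(
                f"{self.sample_id}: cannot move to {stage.name} from "
                f"{last.name if last else 'no stage'}"
            )
        self.history.append(stage)


# Example usage with a made-up sample identifier.
record = SampleRecord("S-001")
record.advance(Stage.RECEIVED)
record.advance(Stage.DNA_EXTRACTED)
```

Keeping the full history, not just the current stage, is what makes it possible to audit each sample after the fact.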
The raw data generated by sequencing are handled by a set of IT procedures that compute quality metrics, intended to verify that the sequencing operations were carried out in compliance with the specifications. The results of these calculations (e.g. sequence coverage rate, duplication rate, contamination) are then reviewed and validated by the quality team before the data are provided to the scientific teams.
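To make two of these metrics concrete, here is a minimal Python sketch, not the procedures actually used at the IG. It assumes that the duplication rate is the fraction of reads flagged as duplicates and that coverage is expressed as mean depth (aligned bases divided by target size). The `RunStats` fields and the thresholds in `passes_qc` are hypothetical values chosen for illustration.

```python
from dataclasses import dataclass


@dataclass
class RunStats:
    total_reads: int      # reads produced for the sample
    duplicate_reads: int  # reads flagged as duplicates
    aligned_bases: int    # bases aligned to the target region
    target_size: int      # size of the target region, in bases


def duplication_rate(stats: RunStats) -> float:
    """Fraction of reads that are duplicates (0.0 if there are no reads)."""
    if stats.total_reads == 0:
        return 0.0
    return stats.duplicate_reads / stats.total_reads


def mean_coverage(stats: RunStats) -> float:
    """Mean depth of coverage: aligned bases per base of target."""
    if stats.target_size <= 0:
        raise ValueError("target_size must be positive")
    return stats.aligned_bases / stats.target_size


def passes_qc(stats: RunStats,
              min_coverage: float = 30.0,
              max_duplication: float = 0.2) -> bool:
    """Compare both metrics with illustrative (assumed) thresholds."""
    return (mean_coverage(stats) >= min_coverage
            and duplication_rate(stats) <= max_duplication)


# Example with made-up numbers.
stats = RunStats(total_reads=1000, duplicate_reads=100,
                 aligned_bases=3_000_000, target_size=100_000)
print(duplication_rate(stats), mean_coverage(stats), passes_qc(stats))
```

In practice the thresholds would come from the project specifications, and an automated check of this kind would only precede, not replace, the review by the quality team.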
Data interpretation is carried out by suites of software known as bioinformatics pipelines. The scientific software is developed by our teams and by a broad scientific community, which makes numerous tools available to the biology teams. At the CNRGH, we support pipelines in a variety of fields.
CEA is a French government-funded technological research organisation active in four main areas: low-carbon energies, defense and security, information technologies and health technologies. A prominent player in the European Research Area, it is involved in setting up collaborative projects with many partners around the world.