Life Science datasets

OVERVIEW: This demonstrator will leverage EOSC resources to improve the scientific reproducibility of datasets uploaded to the European Genome-phenome Archive (EGA). In doing so, the new datasets will also be made available in a FAIR manner, with metadata added for the attributes chosen as contributing most strongly to the FAIR principles. Pipelines will be developed as part of this demonstrator to automate the latter process. This pilot will have a pragmatic impact by demonstrating how to make analyses portable (tools and workflows), how to increase findability, how to leverage security technologies for sensitive data, how to deploy the workflow into a cloud and how to make data FAIR. It will also have a long-term impact by increasing the usability of EGA-hosted data, assuring potential users that up-to-date versions of assured quality are available to download.

SCIENTIFIC OBJECTIVES OF THE DEMONSTRATOR:

The European Genome-phenome Archive (EGA) (https://ega-archive.org) is a repository that facilitates access to, and management of, bio-molecular data for long-term archival. The main objectives of this Science Demonstrator (SD) are to enhance the reproducibility of data analysis and to explore new added-value services by leveraging EOSC resources. Applying the FAIR principles (Findability, Accessibility, Interoperability and Reusability) to our datasets and their associated information is a mission we have accepted from the community.

A set of results data has been reproduced using a portable version of the pipeline.
The same result set has been updated by re-analyzing it with a current version.
FAIRified metadata on both result sets is available at a testing EGA server and/or at an appropriate repository.

MAIN ACHIEVEMENTS:

A set of results data has been reproduced using a portable version of the pipeline.
The same result set has been updated by re-analyzing it with a current version of the pipeline and the reference data.
FAIRified metadata on both result sets is available at a testing EGA server and/or at an appropriate repository.

IMPACT: This pilot will have a pragmatic impact by demonstrating how to make analyses portable (tools and workflows), how to increase findability by using persistent identifiers, how to leverage security technologies for sensitive data, how to deploy the workflow into a cloud and how to make data FAIR. It will also have a long-term impact by increasing the usability of EGA-hosted data, assuring potential users that up-to-date versions of assured quality are available to download.

The success of the project will be monitored using well-defined use cases and by ensuring their reproducibility across sites and platforms. This monitoring will occur through space (i.e. across sites) and time (i.e. reproduction and updating of existing results).
The potential scientific and socio-economic impact is extremely significant at a time when in silico analyses are being routinely deployed in a medical context, an approach expected to dominate so-called precision medicine in the next decade.

RECOMMENDATIONS FOR THE IMPLEMENTATION:

Possessing huge amounts of data is a first step, but it is not enough to achieve the main goal: fostering research. The bio-molecular data that repositories currently store need to be made more useful. The EOSC project is a unique framework for adding this necessary layer of standardization and interoperability while unifying the files and associated pipelines and making them discoverable.

Public Attachment:

ega_life_science_datasets.pdf

fliera4_ega_web.pdf
