Life Sciences – Genome Research - Bioimaging: Mining a large image repository to extract new biological knowledge about human gene function

SD ID: Mining a large image repository to extract new biological knowledge about human gene function


Organisations & contacts

Prof Jason Swedlow, University of Dundee, Euro-BioImaging

Dr Alvis Brazma and Dr Jan Ellenberg, EMBL, Euro-BioImaging

Dr Jean-Karim Hériché, EMBL

Mr Balaji Ramalingam, Mr Josh Moore and Dr Simon Li, University of Dundee, Open Microscopy Environment


OVERVIEW: The astrophysics community has set up a new suite of cutting-edge Milky Way surveys that provide homogeneous coverage of the entire Galactic Plane and have already started to transform the view of our Galaxy as a global star formation engine. New instruments have delivered information of unprecedented depth and spatial detail across the electromagnetic spectrum. The proposed approach is to integrate into the EOSCpilot e-infrastructure a visual analytics environment based on VisIVO (Visualization Interface for the Virtual Observatory) and its module VLVA (ViaLactea Visual Analytics).


The VisIVO visual analytics tool is devoted to the study of star formation regions. It handles massive and heterogeneous volumes of information from several Galactic surveys, processed by data mining algorithms and advanced analysis techniques. Thanks to its highly interactive visual interfaces, the tool offers scientists the opportunity for in-depth understanding of massive, noisy, and high-dimensional data. These challenges demand increasing archiving and computing resources, as well as a federated and interoperable virtual environment, such as the one being developed within the European Open Science Cloud, that enables collaboration and re-use of data and knowledge.

  • Comprehensive machine learning analysis

  • Dataset reuse

  • Reusable infrastructure and analysis for generating value from published image data


The project is expected to produce the following outcomes for the astrophysics community engaged in star formation studies: (i) integration of visual analytics tools with EOSC services; (ii) optimization of the archiving of multi-wavelength surveys; (iii) increased computing resources for analysis (e.g. for the calculation of spectral energy distributions); (iv) a federated virtual environment enabling collaboration and re-use of data and knowledge.


  • Demonstrate how users can run their own analyses in the cloud on publicly available image data sets using the infrastructure

  • Show how image data can be reused in a research context

  • Showcase reuse of the results by the community, e.g. for searching images by similarity or for implementing supervised machine-learning methods to mine the repository
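As an illustration of what searching images by similarity could look like, the following minimal Python sketch ranks images by the cosine similarity of their feature vectors. It is not the demonstrator's actual method: the function names, the image identifiers and the three-element feature vectors are all hypothetical, and it assumes that each image has already been reduced to a numeric feature vector by some earlier analysis step.

```python
import math


def cosine_similarity(a, b):
    """Return the cosine similarity of two equal-length numeric vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        # A zero vector has no direction, so treat it as dissimilar to everything.
        return 0.0
    return dot / (norm_a * norm_b)


def most_similar(query, features, top_k=3):
    """Rank images by similarity to the query vector, most similar first.

    `features` maps an image identifier to its feature vector.
    """
    scored = [
        (image_id, cosine_similarity(query, vector))
        for image_id, vector in features.items()
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]


# Hypothetical feature vectors; real ones would come from image analysis.
features = {
    "image_a": [1.0, 0.0, 0.0],
    "image_b": [0.9, 0.1, 0.0],
    "image_c": [0.0, 1.0, 0.0],
}

for image_id, score in most_similar([1.0, 0.0, 0.0], features, top_k=2):
    print(image_id, round(score, 3))
```

A linear scan like this is only practical for small collections. A repository-scale search would more likely rely on an approximate nearest-neighbour index, which this sketch does not attempt.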


The VisIVO Science Demonstrator requires European Open Science Cloud technologies for its archive services and for intensive analysis through the connection with the ViaLactea Science Gateway. Implementing the Science Demonstrator required integrating the ViaLactea Science Gateway with federated Identity Providers (via the EGI Check-in service) and with the EGI Federated Cloud, which expands the computing capabilities by means of a dedicated virtual appliance stored in the EGI Applications Database.

