To build the Virtual Platform, several workstreams are conducted in a coordinated, user-driven way to create its components. Each workstream has a different Work Focus (WF).
To learn more about our strategic planning:
- This WF builds on the results of the community surveys performed in years one and two of the project to map and transmit research needs from ERNs and researchers. We identify and engage with different types of stakeholders in order to describe their needs with regard to the Virtual Platform and to facilitate their subsequent participation in the agile development cycles. These stakeholders include RD clinicians, researchers, biobank and registry owners, and patient representatives, as well as partners from other Pillars of EJP RD. In a streamlined workflow, WF Overall Architecture then assesses the captured use cases for functional and non-functional requirements.
- Co-chairs: Mary Wang (Fondazione Telethon, IT) and Marco Roos (Leiden University Medical Center, NL)
- The EJP RD would like rare disease data resources to be prepared for efficient computer processing by making them ‘findable, accessible, interoperable, and reusable, for humans and computers’ (FAIR). Many activities in Pillar 2 work towards this goal. ‘FAIRification’ pertains to research and development of the process of making RD resources FAIR. Targets of FAIRification range from central resources to locally maintained resources, from the access point of a database to the data records in a database, and from core components of the Virtual Platform to distributed resources such as ERN registries. Research encompasses identifying standards and tools to help make resources FAIR, defining smart guidelines for different types of stakeholders, and investigating management models for FAIRification projects.
- Background: The purpose of the FAIR principles is to make discovery and analysis over multiple data resources as efficient as possible, regardless of where the data are or how large the resources are. The 15 guiding principles by which to achieve this are an IRDiRC-recognized resource. FAIR is for humans and computers. The ‘for computers’ aspect in particular has our attention as an efficiency booster towards the goals of IRDiRC and the rare disease community. While web sites are a familiar ‘common way’ to express for humans what an RD resource is, few RD resources have a ‘common way’ to express for computers what the resource is, how to access it, and what its data elements mean (a prerequisite for efficient data integration). Without FAIR, it takes analysts months to find and prepare data for every analysis. FAIRification in the EJP RD therefore aims to facilitate a FAIR ecosystem where every RD resource expresses in a common way for computers how it and its data records can be found, accessed, interoperated, and reused. This is a challenge, because it requires both domain expertise and technical expertise, as well as workflows and guidelines compiled from more than 80 standards that address various aspects of FAIR. We note that FAIR data is not ‘Open’ data: FAIR implies that access conditions are made explicit and can be assessed by computers.
- This WF aims to serve the needs for depositing, integrating and storing quality-controlled data and metadata produced by EJP-RD partners and the overall RD community by building on existing resources including registries, patient cohorts, biobanks, cell lines, mouse models, raw omics data and genome-phenome platforms. Work is being done to guide data producers to submit data to appropriate public repositories and resources, making them discoverable through the platform.
- The results from this WF can be accessed through EJP-RD Deliverable D11.12.
- The aim of this WF is to support and upscale software and tools for the analysis of existing and new data provided by ERNs or other EJP-RD funded projects. Tasks for this WF have been divided into three groups: User-friendly Genomics Analysis (tasks to improve online functionalities and tools), Cloud computing and Multi-Omics Analysis (with the aim of creating new cloud instances for analysis), and Information and Annotation Resources (which works on improving annotation tools).
- Results from this WF include, among others, a new tool to collate phenotypic information (GPAP-Phenostore), new variant interpretation tools for DECIPHER, new cloud instances with pipelines for data analysis, and improved tools for splice motif identification (DOLPHIN system, with a manuscript under preparation) and for variant prediction (Ensembl VEP).
- The aim of this work focus is to build a knowledge base of biological pathways relevant to rare diseases for the existing pathway portal available on WikiPathways, which is one of the core analysis resources for this project. The relevant diseases are selected in collaboration with ERNs and Solve-RD. To create these pathways, we use literature review including FAIR text-mining predictive methods, data mining and expert input from ERNs and FAIR to identify disease-relevant genes, and we use the knowledge available in neXtProt and other resources (e.g. protein-protein interactions, co-expression, co-localisation) to create a network structure.
- We created a portal on WikiPathways dedicated to rare genetic diseases [http://raredisease.wikipathways.org]. To date, this portal contains 68 rare disease pathways, 45 of which have been created within EJP-RD so far. In collaboration with the WF Use case, the section on congenital anomalies of the kidney and urinary tract (CAKUT) pathways has been created. Other pathways, such as the groups of pathways around disorders of sex development and fertility, ciliopathies and rare thyroid cancers, were created following requests from ERN experts.
- The RDF background of WikiPathways allows federated queries with other resources from the VP.
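To illustrate what such a federated query could look like, here is a minimal sketch in Python that only builds the SPARQL text; it does not send it anywhere. It rests on several assumptions that are not stated on this page: the WikiPathways SPARQL endpoint URL, the WikiPathways vocabulary terms (`wp:Pathway`, `wp:GeneProduct`, `dcterms:isPartOf`), and a purely hypothetical `ex:geneSymbol` property standing in for whatever a VP resource actually exposes. Whether the gene labels join in practice depends on how both sides encode their literals.

```python
# Assumed endpoint URL; verify against the WikiPathways documentation before use.
WIKIPATHWAYS_SPARQL = "https://sparql.wikipathways.org/sparql"

# The ex: namespace and ex:geneSymbol are hypothetical placeholders for a
# VP resource's own vocabulary. The wp: terms are assumed from WikiPathways.
QUERY_TEMPLATE = """\
PREFIX wp: <http://vocabularies.wikipathways.org/wp#>
PREFIX dc: <http://purl.org/dc/elements/1.1/>
PREFIX dcterms: <http://purl.org/dc/terms/>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX ex: <http://example.org/vp#>

SELECT DISTINCT ?record ?geneLabel ?pathwayTitle
WHERE {{
  # Local part: records in the queried VP resource that name a gene.
  ?record ex:geneSymbol ?geneLabel .

  # Remote part: delegated to the WikiPathways endpoint.
  SERVICE <{service}> {{
    ?geneProduct a wp:GeneProduct ;
                 rdfs:label ?geneLabel ;
                 dcterms:isPartOf ?pathway .
    ?pathway a wp:Pathway ;
             dc:title ?pathwayTitle .
  }}
}}
LIMIT {limit}
"""


def build_federated_query(service: str = WIKIPATHWAYS_SPARQL, limit: int = 10) -> str:
    """Return SPARQL 1.1 text that joins local records with remote pathways."""
    if not isinstance(limit, int) or limit < 1:
        raise ValueError("limit must be a positive integer")
    return QUERY_TEMPLATE.format(service=service, limit=limit)


if __name__ == "__main__":
    print(build_federated_query(limit=5))
```

The resulting text would be submitted to the SPARQL endpoint of the VP resource that holds the local records; the `SERVICE` clause asks that endpoint to fetch the pathway part from WikiPathways.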
This WF aims to enable mapping of genes to variants (and vice versa), to collect information about variants from different resources, and to include these steps in a multi-omics data analysis workflow and in disease network building.
The WF AOP/Environment aims to develop strategies to include toxicology, nutrition and drug data in rare genetic disease networks, which allows, for example, investigation of overlapping targets of rare diseases and drugs or nutritional compounds.
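As a rough illustration of what investigating overlapping targets can mean computationally, the sketch below intersects the gene set of each disease with the target set of each compound. It is not the WF's actual workflow. The function name and all identifiers in the example data (`disease_1`, `compound_x`, `GENE_A` and so on) are hypothetical placeholders.

```python
from itertools import product


def overlapping_targets(disease_genes, compound_targets):
    """Map each (disease, compound) pair to the sorted genes they share.

    Both arguments are dicts from a name to a set of gene identifiers.
    Pairs without any shared gene are left out of the result.
    """
    overlaps = {}
    for (disease, genes), (compound, targets) in product(
        disease_genes.items(), compound_targets.items()
    ):
        shared = genes & targets  # set intersection
        if shared:
            overlaps[(disease, compound)] = sorted(shared)
    return overlaps


if __name__ == "__main__":
    # Placeholder data only; not taken from any real resource.
    disease_genes = {
        "disease_1": {"GENE_A", "GENE_B"},
        "disease_2": {"GENE_C"},
    }
    compound_targets = {
        "compound_x": {"GENE_B", "GENE_D"},
        "compound_y": {"GENE_E"},
    }
    print(overlapping_targets(disease_genes, compound_targets))
```

In a network setting the same overlap would typically appear as shared nodes between a disease network and a compound's target neighbourhood; the set intersection is the simplest form of that idea.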
- WF Networks is the cumulative workflow development work focus, in which all information and workflow parts from pathways, variant information and drug/toxicology/nutrition information are included. This WF aims to develop workflows that incorporate all this information into disease network construction and to draw conclusions from these networks.
- We are working on specific Use Cases to demonstrate the added value of FAIRification and data analysis and integration workflows developed within WP13. Currently, there are four use cases:
- Huntington's disease (from a public transcriptomics dataset)
- Congenital anomalies of the kidney and urinary tract (CAKUT, ERKNet) – INSERM, University of Toulouse (contact Joost Schanstra)
- Inclusion body myositis (IBM, EURO-NMD) – Tampere University Hospital (contact Mridul Johari, Marco Sarvarese, Bjarne Udd)
- Idiopathic non-cirrhotic intrahepatic portal hypertension (INCPH, Rare-Liver) – Hospital Clinic, IDIBAPS and CIBERehd, Barcelona (contact Juan Carlos Garcia-Pagan)