Training Webinar: Real-World data, Machine learning and Deep analytics in rare diseases: Regulatory grade data collection for marketing authorization submissions – what is buzz, what is realistic?

  • 26 January 2024 - 14:00
  • Online

Marc Van Dijk (UCB) and Luis Pinheiro (EMA) will present views of industry and regulators on various facets of collecting, processing, and using data for research, regulatory and healthcare decisions. The two keynote talks will provide an overview of patient needs and challenges in rare disease R&D and of data and technologies that offer solutions to those challenges. 

A panel discussion among representatives of patients, academia, regulatory science, industry and clinicians, with participation from the audience, will offer different perspectives on challenges and solutions in the context of regulatory realities and imperatives. Panelists will also give examples of solutions already in use and their limitations.

Medicine developers and researchers are witnessing the convergence of three major trends:

  • increasing potential use of Real World Data (RWD) but with the need to ensure that the data quality can support data-driven regulatory decision making,
  • expansion of data sources and data sets, including from digital tools and sensors, some held in curated research data assets but most needing a great deal of work before being suitable for research,
  • fast evolution of AI and machine learning methods.

For the rare disease community, these novel tools and digital technologies hold much promise, including shortening the diagnostic odyssey, identifying potential treatment candidates for drug repurposing based on the disease's metabolic pathways and the medicine's mechanism of action, identifying relevant endpoints, accelerating trial recruitment, studying the natural history of the disease, and modelling external control arms for clinical trials. Generating deeper insights from “big” data creates new uses, but also new challenges for data collection. Data and database management need to adapt to enable deep analysis.

From a regulatory standpoint, the use of RWD and artificial intelligence (AI) in the medicinal product lifecycle still presents substantial challenges in deriving actionable evidence[1].

Data are dispersed across disparate data sets that need to be curated, standardized, combined, analyzed, and integrated. An analysis revealed that the number of European databases that meet minimum regulatory requirements across a broad range of regulatory use cases and are readily accessible is disappointingly low and geographically skewed. A recent survey of European rare disease registries, conducted in the context of Screen4Care, showed high barriers to the secondary use of data and limited availability of the data for machine learning.

Additionally, health data span many different languages, medical ontologies, systems and structures, and are subject to challenging policy restrictions and technology issues. Fragmentation, lack of scale and data privacy further compound these issues in rare diseases. Some databases, for example, cannot be used for rare diseases if queries return too few patients.

When data feed into regulatory decision making, it is of utmost importance that data quality, traceability, and embedded lineage and provenance are adequately addressed in order to tackle the challenges of applying RWD and AI in rare disease research. By standardizing and harmonizing data capture processes, curation methods, and reporting practices across stakeholders, researchers can overcome many obstacles related to data inconsistencies and improve the overall reliability and usability of the collected data.

To address the unmet needs of the Rare Disease (RD) community in a timely and meaningful manner, projects at all stages of development need to contribute to regulatory approvals.


  • Brasil, S., et al. (2019). Artificial Intelligence (AI) in Rare Diseases: Is the Future Brighter? Genes, 10(12), 978.
  • Cave, A., Kurz, X., & Arlett, P. (2019). Real-World Data for Regulatory Decision Making: Challenges and Possible Solutions for Europe. Clinical Pharmacology & Therapeutics, 106(1), 36-39.
  • Crown, W. H. (2019). Real-World Evidence, Causal Inference, and Machine Learning. Value in Health, 22(5), 587-592.
  • Liu, J., et al. (2022). Natural History and Real-World Data in Rare Diseases: Applications, Limitations, and Future Perspectives. The Journal of Clinical Pharmacology, 62(Suppl 2), S38-S55.
  • EMA concept paper on AI and the Data Quality Framework for EU medicines regulation

[1] Guidance is available from the FDA and the EMA on the use of registries for supporting regulatory decision making for drugs and biological products.

During this webinar, panelists and participants will discuss:

  • How to plan for the data needs for the future of medicine development for rare diseases.
  • How to anticipate the scope, depth, and quality of data that will be required to generate reliable evidence suitable for regulatory use cases.
  • The tools that are available to make data collection accessible for these uses.

The training webinar is organized as a series of lectures presented by experts in the specific topics. Interaction between participants and lecturers will be facilitated by moderated question & answer sessions.

The workshop is open to the international research community, clinicians, medical specialists, healthcare professionals and patient advocacy groups with knowledge of RD clinical trials.

Registration is mandatory.
Please register here.
REGISTRATION DEADLINE: 25 January 2024 at 23:45

The training and registration are free of charge.
The training organisers will not cover expenses incurred by the participants in any way. 

At the end of the training, participants will be asked to fill in an online questionnaire to provide feedback for learning and impact assessment.

A certificate of attendance can be requested via email. No Continuing Medical Education credits will be issued.

The language of the training webinar is English.

Online. Connection details will be sent to registered participants on the day of the webinar.

This Training webinar is organized by Universitaetsklinikum Aachen (Germany) and Assistance Publique Hopitaux de Paris (France) in cooperation with the European Federation of Pharmaceutical Industries and Associations (EFPIA).

If you have questions, please contact the course organisers by email.

Please indicate in the subject line: WP20 January training webinar




  • Start Date: 26 January 2024
  • Start Time: 14:00
  • End Date: 26 January 2024
  • End Time: 16:00
  • Location: Online