In recent years, data-driven medicine has gained increasing importance in diagnosis, treatment, and research due to the exponential growth of healthcare data [Sascha-2021]. Linking health data from various sources, including genomics, and analysing it with innovative approaches based on artificial intelligence (AI) has advanced the understanding of risk factors, causes, and the development of optimal treatments in different disease areas. It has also contributed to the development of a high-quality, accessible health care system. However, medical study results often depend on the amount of available patient data, and this dependency is especially pronounced for rare diseases. Typically, the more data is available for the intended analysis or the scientific hypotheses, the more accurate the results are [Sascha-2021].

Nevertheless, the reuse of patient data for medical research is often limited to data sets available at a single medical centre. The main reasons why medical data is not widely shared for research across institutional borders stem from ethical, legal and privacy considerations and rules. Data protection regulations prohibit data centralisation for analysis purposes because of privacy risks such as the accidental disclosure of personal data to third parties. Therefore, in order to (1) enable health data sharing across national borders, (2) fully comply with present GDPR privacy guidelines and (3) innovate by pushing research beyond the state of the art, this project proposes a robust decentralised infrastructure. It will empower researchers, innovators and healthcare professionals to exploit the full potential of larger sets of multi-source health data via tailor-made AI tools for comparing, integrating and analysing data in a secure, cost-effective fashion, with the ultimate aim of supporting the improvement of citizens’ health outcomes.
  • Objective 1

    Overcome cross-border barriers to health data integration, access, FAIRification and preprocessing


    Guide medical centres in collecting patients’ data following a common schema in order to promote interoperability and re-use of the datasets in scope. This includes legal, ethical and data protection authorisations, as well as data documentation, cataloguing, and mapping to well-established and therefore widely understood ontologies. Legal and ethical implications will be duly considered, and procedures for data access and re-use will be proposed. As a default preprocessing step, data pseudonymisation will be performed to mitigate the risk of personal data leaks; this will be followed by data quality and integrity assessment. Finally, this objective enables the integration of a BETTER station at each medical centre's premises, validating access to the relevant local datasets, including genomics.
  • Objective 2

    Health data fusion and open data integration


    Although nontrivial with traditional approaches, a meaningful way to improve health outcomes involves (1) fusing multiple data sources, including genomics and external datasets, to better describe the pathology, (2) comparing protocols and strategies with international peers, and (3) leveraging multi-source historical data to support diagnostic and therapeutic decisions. Thanks to the proposed distributed framework, data fusion will be delivered in two directions: within a single medical centre, and across centres by leveraging each other's historical datasets. Innovative analytical and AI tools will be researched and developed.
  • Objective 3

    Deploy a distributed analytics framework for cross-border data processing and analysis


    Deploy, test and utilise BETTER, a PHT (Personal Health Train) distributed analytics platform composed of stations hosted on each medical centre's premises. In addition, a central service will be hosted by UKK (Klinikum Der Universitaet Zu Koeln) to monitor and orchestrate activities. Importantly, this framework will support the development of analytics and AI tools via both Federated and Incremental Learning modalities. In line with GDPR, data will not leave the medical centre where it is held. The framework will be used by researchers, data scientists and software developers to securely build applications for analysing multiple health datasets, including genomics.
  • Objective 4

    Development of distributed tools leveraging artificial intelligence capabilities


    Within each use case, tailored tools are developed to properly answer clinical needs. Some of them will exploit DA (Distributed Analytics) and AI to push data analysis beyond the state of the art. Crucially, multiple data sources, including genomics, will be fused together with the aim of better understanding the risk factors, causes and development of the studied diseases. The tools will be developed using a co-creation methodology in which medical end-users closely collaborate with researchers and technology providers, enabling new concepts to emerge. Finally, trustworthy AI guidelines (digital-strategy.ec.europa.eu/en/library/ethics-guidelines-trustworthy-ai) will be followed throughout the development lifecycle, with particular attention devoted to the explainability of the developed tools.
  • Objective 5

    Ethical, Legal and Societal aspects (ELSA) awareness in the AI lifecycle


    Most data science projects, whether federated or centralised, do not co-create or co-develop using a methodology that includes the ethical, legal and societal aspects (ELSA) involved in the data science lifecycle. More specifically, FL (Federated Learning) approaches have focused on privacy during the development phase, but do not incorporate the application phase. In this objective, we will start with the end goal in mind (the intended AI use) and will develop ELSA-awareness tools and methods for the co-creation and co-development of AI models. Experts from social sciences and humanities (SSH) disciplines (ethics, law) and from technical disciplines are co-creating these methodologies and applying them to the proposed use cases. This will help establish the appropriateness and clinical effectiveness of the developed AI tools, while considering the safety, value and sustainability of the AI.
  • Objective 6

    Plan, coordinate and implement 3 clinical Use Cases


    Integration and analysis of real-world health-related data, including genomics, represents one of the core pillars of the project. Six medical centres are committed to integrating, validating, and utilising digital tools built on top of the BETTER platform, where multiple data sources from different centres can be fused and exploited to improve clinical outcomes. Importantly, different use cases will collect and analyse whole genome sequence data. Despite foreseeable obstacles, including technical and IT issues, organisational issues, and training on new tools, this objective aims to demonstrate the effectiveness of the proposed framework in clinical settings, enabling the sharing not only of data but also of protocols, skills and know-how, with the aim of improving patients’ health outcomes.
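The federated modality named under Objective 3 can be illustrated with a minimal sketch. It is not the BETTER or PHT implementation: the `Station` class, `local_update`, `federated_average` and `train` are hypothetical names, and the model is a toy linear regression chosen only to keep the example short. What it shows is the general idea that each station trains on its own data and only model weights, never patient records, are sent to the central service for averaging.

```python
from dataclasses import dataclass
from typing import List, Sequence, Tuple

Weights = List[float]


@dataclass
class Station:
    """Hypothetical stand-in for a station at one medical centre."""
    name: str
    features: List[List[float]]  # local records; never sent anywhere
    targets: List[float]

    def local_update(self, weights: Sequence[float],
                     lr: float = 0.2, epochs: int = 5) -> Tuple[Weights, int]:
        """Run a few gradient-descent steps on local data only.

        Returns the updated weights and the local sample count,
        which is all the central service ever receives.
        """
        w = list(weights)
        n = len(self.targets)
        for _ in range(epochs):
            grad = [0.0] * len(w)
            for x, y in zip(self.features, self.targets):
                err = sum(wi * xi for wi, xi in zip(w, x)) - y
                for j, xj in enumerate(x):
                    grad[j] += 2.0 * err * xj / n
            w = [wi - lr * g for wi, g in zip(w, grad)]
        return w, n


def federated_average(updates: Sequence[Tuple[Weights, int]]) -> Weights:
    """Average station weights, weighted by each station's sample count."""
    total = sum(n for _, n in updates)
    dim = len(updates[0][0])
    return [sum(w[j] * n for w, n in updates) / total for j in range(dim)]


def train(stations: Sequence[Station], dim: int, rounds: int = 200) -> Weights:
    """Central orchestration loop: broadcast weights, collect, average."""
    weights = [0.0] * dim
    for _ in range(rounds):
        updates = [s.local_update(weights) for s in stations]
        weights = federated_average(updates)
    return weights


if __name__ == "__main__":
    # Synthetic data following y = 2*x + 1; the second feature is a bias term.
    a = Station("centre_a", [[0.0, 1.0], [0.5, 1.0], [1.0, 1.0]], [1.0, 2.0, 3.0])
    b = Station("centre_b", [[0.25, 1.0], [0.75, 1.0]], [1.5, 2.5])
    print(train([a, b], dim=2))
```

Weighting the average by sample count is one common design choice, so that larger centres contribute proportionally more. An incremental modality would instead pass the model from one station to the next in sequence rather than averaging in parallel.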
The project has received funding from the European Union's Horizon Europe research and innovation programme under grant agreement No 101136262. The communication reflects only the author's view and the Commission is not responsible for any use that may be made of the information it contains.
