Nevertheless, the reuse of patient data for medical research is often limited to data sets available at a single medical centre. The most prominent reasons why medical data is not widely shared for research across institutional borders stem from ethical, legal and privacy considerations and rules. Data protection regulations prohibit data centralisation for analysis purposes because of privacy risks such as the accidental disclosure of personal data to third parties. Therefore, in order to (1) enable health data sharing across national borders, (2) fully comply with current GDPR privacy guidelines and (3) innovate by pushing research beyond the state of the art, this project proposes a robust decentralised infrastructure. It will empower researchers, innovators and healthcare professionals to exploit the full potential of larger sets of multi-source health data via tailor-made AI tools that compare, integrate and analyse these data in a secure, cost-effective fashion, with the ultimate aim of supporting the improvement of citizens’ health outcomes.
Overcome cross-border barriers to health data integration, access, FAIRification and preprocessing
Description: Guide medical centres in collecting patients’ data following a common schema in order to promote interoperability and re-use of the datasets in scope. This includes legal, ethical and data protection authorisations, as well as data documentation, cataloguing, and mapping to well-established and therefore widely understood ontologies. Legal and ethical implications will be duly considered, and procedures for data access and re-use will be proposed. As a default preprocessing step, data pseudonymisation will be performed to mitigate the risk of personal data leaks; this will be followed by data quality and integrity assessment. Finally, this objective enables the integration of a BETTER station at each medical centre’s premises, validating access to the relevant local datasets, including genomics.
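As an illustration of the default pseudonymisation step, the following is a minimal sketch only. The project text does not specify a pseudonymisation method; keyed hashing (HMAC-SHA256) is assumed here as one common option, and the function names, the `patient_id` field and the sample record are hypothetical.

```python
import hashlib
import hmac


def pseudonymise(patient_id: str, secret_key: bytes) -> str:
    """Map an identifier to a stable pseudonym using a keyed hash.

    Assumption: the secret key stays inside the medical centre, so the
    pseudonym cannot be recomputed by anyone outside it.
    """
    digest = hmac.new(secret_key, patient_id.encode("utf-8"), hashlib.sha256)
    return digest.hexdigest()


def pseudonymise_record(record: dict, secret_key: bytes,
                        id_field: str = "patient_id") -> dict:
    """Return a copy of the record with its identifier field replaced."""
    cleaned = dict(record)  # shallow copy; the input record is not modified
    cleaned[id_field] = pseudonymise(str(record[id_field]), secret_key)
    return cleaned


# Hypothetical record, for illustration only.
example = {"patient_id": "P-001", "age": 54}
pseudonymised = pseudonymise_record(example, secret_key=b"local-centre-secret")
```

The same identifier always yields the same pseudonym under a given key, so records belonging to one patient can still be linked within a centre, while a different key at another centre yields unrelated pseudonyms.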
Health data fusion and open data integration
Description: Although nontrivial with traditional approaches, a meaningful way to improve health outcomes involves (1) fusing multiple data sources, including genomics and external datasets, to better describe the pathology, (2) comparing protocols and strategies with international peers, and (3) leveraging multi-source historical data to support diagnostic and therapeutic decisions. Thanks to the proposed distributed framework, data fusion will be delivered in two directions: within a single medical centre, and across centres by leveraging each other's historical datasets. Innovative analytical and AI tools will be researched and developed.
Deploy a distributed analytics framework for cross-border data processing and analysis
Description: Deploy, test and utilise BETTER, a PHT (Personal Health Train) distributed analytics platform composed of stations hosted at each medical centre’s premises. In addition, a central service will be hosted by UKK (Klinikum Der Universitaet Zu Koeln) in order to monitor and orchestrate activities. Importantly, this framework will support the development of analytics and AI tools via both Federated and Incremental Learning modalities; in line with the GDPR, data will not leave the medical centre that holds it. The framework will be used by researchers, data scientists and software developers to securely build applications for analysing multiple health datasets, including genomics.
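The difference between the Federated and Incremental Learning modalities can be sketched on a toy problem. This is a minimal illustration under stated assumptions, not the BETTER implementation: the linear model, the learning rate, the weighted-averaging rule and the toy station data are all hypothetical. In both modes only model weights move between stations; the records themselves stay local.

```python
from typing import List, Sequence, Tuple

Record = Tuple[float, float]  # (feature x, outcome y); toy data only


def local_update(weights: List[float], data: Sequence[Record],
                 lr: float = 0.05, epochs: int = 20) -> List[float]:
    """Gradient descent for y ~ w0 + w1 * x on one station's local data."""
    w0, w1 = weights
    n = len(data)
    for _ in range(epochs):
        g0 = sum((w0 + w1 * x - y) for x, y in data) / n
        g1 = sum((w0 + w1 * x - y) * x for x, y in data) / n
        w0 -= lr * g0
        w1 -= lr * g1
    return [w0, w1]


def federated_round(weights: List[float],
                    stations: Sequence[Sequence[Record]]) -> List[float]:
    """Federated mode: each station trains from the same starting weights;
    the results are averaged, weighted by local sample count."""
    total = sum(len(data) for data in stations)
    averaged = [0.0, 0.0]
    for data in stations:
        local = local_update(list(weights), data)
        for i in range(2):
            averaged[i] += local[i] * len(data) / total
    return averaged


def incremental_round(weights: List[float],
                      stations: Sequence[Sequence[Record]]) -> List[float]:
    """Incremental mode: the model visits the stations one after another,
    each continuing from the weights left by the previous one."""
    for data in stations:
        weights = local_update(list(weights), data)
    return weights


# Three hypothetical stations whose data follow y = 1 + 2x.
stations = [
    [(x, 1 + 2 * x) for x in (0.0, 1.0, 2.0, 3.0)],
    [(x, 1 + 2 * x) for x in (1.0, 2.0, 4.0)],
    [(x, 1 + 2 * x) for x in (0.0, 2.0, 3.0, 5.0)],
]

fed = [0.0, 0.0]
inc = [0.0, 0.0]
for _ in range(100):
    fed = federated_round(fed, stations)
    inc = incremental_round(inc, stations)
```

On this noise-free toy data both modes recover weights close to the underlying intercept and slope; with real, heterogeneous data the two modes can behave differently, which is one reason a framework might support both.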
Development of distributed tools leveraging artificial intelligence capabilities
Description: Within each use case, tailored tools are developed in order to properly answer clinical needs. Some of them will exploit DA (Distributed Analytics) and AI to push the boundaries of data analysis beyond the state of the art. Crucially, multiple data sources, including genomics, will be fused together with the aim of better understanding the risk factors, causes and development of the studied diseases. The tools will be developed using a co-creation methodology in which medical end-users collaborate closely with researchers and technology providers, enabling new concepts to emerge. Finally, trustworthy AI guidelines (digital-strategy.ec.europa.eu/en/library/ethics-guidelines-trustworthy-ai) will be followed throughout the development lifecycle, and particular attention will be devoted to the explainability of the developed tools.
Ethical, Legal and Societal aspects (ELSA) awareness in the AI lifecycle
Description: Most data science projects, whether federated or centralised, do not co-create or co-develop using a methodology that includes the ethical, legal and societal aspects (ELSA) involved in the data science lifecycle. More specifically, federated learning (FL) approaches have focused on privacy during the development phase, but do not incorporate the application phase. In this objective, we will start with the end goal in mind (the intended use of the AI) and will develop ELSA-awareness tools and methods for the co-creation and co-development of AI models. Experts from social sciences and humanities (SSH) disciplines (ethics, law) and from technical disciplines are co-creating these methodologies and applying them to the proposed use cases. This is intended to embed the appropriateness and clinical effectiveness of the developed AI tools, while considering the safety, value and sustainability of the AI.
Plan, coordinate and implement 3 clinical Use Cases
Description: Integration and analysis of real-world health-related data, including genomics, represent one of the core pillars of the project. Six medical centres are committed to integrating, validating and utilising digital tools that build on top of the BETTER platform, where multiple data sources from different centres can be fused and exploited to improve clinical outcomes. Importantly, different use cases will collect and analyse whole-genome sequence data. Despite foreseeable obstacles, including technical and IT issues, organisational issues, and training on new tools, this objective aims to demonstrate the effectiveness of the proposed framework in clinical settings, enabling the sharing not only of data but also of protocols, skills and know-how, with the aim of improving patients’ health outcomes.