Barts Health Datasets 

Barts Health NHS Trust makes several sets of health data available to external organisations through the Barts Health Data Platform (BHDP). As a rule, this data stays in the secure data environment (SDE) of the BHDP unless we can be certain it can be made anonymous. Any data leaving the SDE is strictly monitored and controlled, and release may not be allowed. The different types of data are described below:

Structured Clinical Data

This type of dataset includes coded information about a patient’s diagnosis, treatments, procedures, medicines, test results, and measurements like blood pressure, pulse rate, body temperature, and oxygen saturation. Barts Health collects around 2 million new data points every day, covering a population of close to 3 million patients, over 40% of whom are non-white.

Free text

This covers all documents for patients in Barts Health, including scanned letters and reports, which are available for analysis within the BHDP. Without the appropriate approvals or consent, patient names and other identifying information will be removed, and the resulting data can only be analysed within the BHDP. With over 170 million documents already in our records and hundreds of thousands more being added each day, this provides a unique resource for analysis.

OMOP clinical data

This is a type of Structured Clinical dataset which has been standardised into the OMOP (Observational Medical Outcomes Partnership) common data model. This makes it easier to analyse data across multiple hospitals. Over 60,000 distinct concepts have been standardised in OMOP.

Imaging

Over 13 million medical images (DICOM) are available from radiology (X-rays and MRIs), with reports explaining what the images show. The patient's name and identifiable information are automatically removed from the images and reports. Digital pathology with reports and findings is expected to be added soon.

Maternity

There are over 160,000 births, each linking the health records of both mother and baby. This also includes data from the Neonatal Intensive Care Unit. Patient names and identifiable information are removed before the data is used for analysis inside the BHDP.

Geographic

All addresses over the course of patient care are mapped to geographical areas (LSOA and UPRN) and linked with deprivation (poverty levels) and air quality data. This allows patient data to be tracked over many years and compared with data on up to 15 different pollutants to see how these factors affect a person's health.

Blood Counts 


Data Set                   Identified   Anonymous
Structured Clinical Data   Yes          Yes
Free text                  Yes          Yes (if in BHDP)
OMOP Clinical Data         Yes          Yes
Imaging                    Yes          Yes
Maternity                  Yes          Yes

Ethical approvals

All requests to access data from Barts Health must go through an application and review process. The review process is managed by the Barts Health Data Access Committee, which includes public contributors and members from the technical, commercial, clinical, research, nursing, information governance and research governance teams.

When access to data is granted, it will be limited to the minimum amount necessary for the request. If the request includes a large amount of data or data with sensitive information, a more detailed review will be needed.

If any personal data is required, then a complete ethical review process is needed, which may include individual patient consent for the data set. Further approval or sign-off is needed for data where personal details have been removed.

Some structured data sets can be made anonymous by removing fields with known patient details from the record. Unstructured free text fields can have specific patient-related information removed, but this may still leave identifiable details relating to the patient (e.g., family) in the text. Such data would still need careful ethical review and potentially Caldicott Guardian sign-off, and would not normally be allowed outside the BHDP.
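
As an illustration only, removing fields with known patient details from a structured record could be sketched as follows. The field names and the example record are hypothetical and are not taken from the BHDP; the real list of identifying fields is not given here. Dropping direct identifiers is only one step, and the remaining fields would still be subject to the review process described above.

```python
# Hypothetical list of fields holding known patient details (an assumption,
# not the BHDP's actual schema).
IDENTIFYING_FIELDS = {"name", "nhs_number", "address", "date_of_birth"}


def strip_identifiers(record, identifying_fields=IDENTIFYING_FIELDS):
    """Return a copy of the record without the known identifying fields."""
    return {
        field: value
        for field, value in record.items()
        if field not in identifying_fields
    }


# Made-up example record.
example_record = {
    "name": "Jane Example",
    "nhs_number": "0000000000",
    "diagnosis_code": "I10",
    "systolic_bp": 128,
}

print(strip_identifiers(example_record))
```

The original record is left unchanged, so the identified and de-identified versions can be held under different access controls.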

Data linkage

How we link data

We use various methods for data linkage:
• Data Transfers: We can either send Barts Health data to a partner's secure environment or receive data from partners to link with our datasets.
• Federated Analysis: In some cases, we analyse datasets separately without moving them, then combine the results using statistical methods.
• Information Governance: Before linking data, we ensure that all necessary approvals are in place. This includes checking whether the identifiers (like NHS numbers) are already authorised for use. We assess the sensitivity of the data to determine how safely it can be shared. For example, sharing a list of NHS numbers is generally low risk, while sharing names and addresses is much more sensitive.
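
The federated analysis idea above can be illustrated with a minimal sketch, assuming each site shares only summary statistics (a count, a sum and a sum of squares) rather than patient-level rows. The names SiteSummary, summarise and combine, and the example values, are hypothetical; the statistical methods actually used are not specified here.

```python
from dataclasses import dataclass


@dataclass
class SiteSummary:
    """Aggregate statistics that a site can share without moving its data."""
    n: int
    total: float
    total_sq: float


def summarise(values):
    """Run locally at each site: reduce patient-level values to aggregates."""
    return SiteSummary(
        n=len(values),
        total=sum(values),
        total_sq=sum(v * v for v in values),
    )


def combine(summaries):
    """Combine site-level aggregates into a pooled mean and sample variance."""
    n = sum(s.n for s in summaries)
    total = sum(s.total for s in summaries)
    total_sq = sum(s.total_sq for s in summaries)
    mean = total / n
    variance = (total_sq - n * mean * mean) / (n - 1)
    return mean, variance


# Made-up measurements held separately at two sites.
site_a = summarise([120.0, 130.0])
site_b = summarise([110.0, 140.0, 150.0])

pooled_mean, pooled_variance = combine([site_a, site_b])
print(pooled_mean, pooled_variance)
```

Only the three aggregates per site cross the boundary between organisations. In practice, very small counts would also need checking before release, because an aggregate over a handful of patients can itself be identifying.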

Linkage methods

We have several approaches to linking data:
• Partner Matching: One partner matches identifiers based on existing approvals, which is straightforward and low-risk.
• Third-Party Matching: A trusted third party can perform the matching, ensuring that neither partner sees the other's full dataset.
• Cryptographic Matching: Instead of sharing identifiable information directly, we use cryptographic hashes. This means that sensitive identifiers are transformed into a secure format that cannot be reversed, enhancing privacy. 
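
A minimal sketch of cryptographic matching is shown below, using Python's standard hmac and hashlib modules. It assumes a keyed hash (HMAC-SHA256) with a secret key agreed between the partners; that choice is an assumption made for this example, not a description of the scheme actually used. A key is used because short numeric identifiers could otherwise be recovered from an unkeyed hash by hashing every possible value. The identifiers, the key and the function names are made up.

```python
import hashlib
import hmac


def pseudonymise(identifier: str, key: bytes) -> str:
    """Turn an identifier into a keyed hash that cannot be reversed without the key."""
    normalised = identifier.replace(" ", "").strip()
    return hmac.new(key, normalised.encode("utf-8"), hashlib.sha256).hexdigest()


def match(our_identifiers, partner_hashes, key):
    """Return our identifiers whose hashes also appear in the partner's list."""
    ours = {pseudonymise(i, key): i for i in our_identifiers}
    return sorted(ours[h] for h in ours.keys() & set(partner_hashes))


# Hypothetical shared key; a real key would be generated and exchanged securely.
shared_key = b"example-key-agreed-by-both-partners"

# The partner sends only hashes, never the identifiers themselves.
partner_hashes = [pseudonymise(i, shared_key) for i in ["9434765919", "9000000009"]]

matched = match(["943 476 5919", "9000000017"], partner_hashes, shared_key)
print(matched)
```

Both sides must normalise identifiers in the same way (here, by removing spaces) before hashing, otherwise the same patient would produce different hashes and the match would be missed.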

Data transfer options

Once data is linked, we can transfer it in different ways:
• Manual Transfers: Researchers can download and upload data manually when they have the necessary approvals and permissions.
• Team Transfers: Our Barts Health Data Platform team can handle transfers directly for researchers.
• API Integration: For more advanced setups, we can automate data transfers through APIs, making the process seamless.

By continuously improving our data linkage strategies, we aim to support impactful research while safeguarding patient confidentiality.