Bennett Institute for Applied Data Science First Annual Report

Introduction

This is the first annual report for the new Bennett Institute for Applied Data Science. It is produced as an internal report to the Bennett Institute Advisory Board; however, we expect to modify the content lightly and then post a version online at bennett.ox.ac.uk, as part of our work to promote the full range of activities at the Bennett Institute, alongside the individual workstreams, tools and research outputs of the team.

It would be an understatement to say that we have had an exciting first year. We have grown over the past few years from a small research group into a flourishing ecosystem of teams that all support and reinforce each other’s work, delivering on our vision of taking large datasets and making them drive change in the world. We are tremendously grateful to Peter Bennett and the Peter Bennett Foundation for their support, and commitment to our work. With this backing we have been able - more than ever - to take our grand plans and turn them into the hard, concrete delivery set out below.

How we work

We take large datasets and turn them into high-impact research outputs in academic journals, across a range of topics including clinical epidemiology, informatics methods, variation in clinical care, behaviour change, research integrity, policy analysis, and more.

But we also set out to make these datasets impactful in the world for all, by developing new methods and services that make it easier, safer, and faster for everyone to work with data. To do this, we recruit people with deep expertise in a huge variety of skills: our researchers and clinicians work alongside software developers, Information Governance and Policy professionals. By pooling skills and knowledge, all these diverse specialisms become part of the core creative academic group.

We also take best practice from each of the diverse specialisms in our group. We work in an open and collaborative way by default, and we embody this in our technical approaches, sharing all our code as open source for review and re-use globally. Where we encounter problems, we aim to fix them, not just for ourselves, but also for the whole community, by engaging with the policy community and sharing workable solutions to common challenges.

OpenSAFELY

In normal times, good analysis of health data can inform policy, improve patient care, and ultimately save lives. The Covid-19 pandemic made the need for detailed, accurate and timely analyses urgent, but the most detailed data in the UK - the full NHS Primary Care records - were not readily available for use, due to legitimate concerns about patient confidentiality and the complexity of the event-level data.

OpenSAFELY was created to resolve these issues, and enable near real-time analysis of these data. We developed and deployed a secure access layer, installed at the existing centres where the data are held. This included modules to curate and facilitate the use of the raw data, and an expert support service, to simplify the analytical process and remove the need to ever transfer or view the raw data. OpenSAFELY is installed in the data-centres of TPP and EMIS, which enables analysis of the GP records of >99% of the population of England.

This year has seen continued growth in the service, both in the volume of data available to analyse in conjunction with the primary care records, and in the range and number of users and projects. In addition to sources such as the Hospital Episode Statistics (HES) database, and details of Covid tests and vaccinations, the Bennett Institute have also worked in partnership with ONS to link data from the ONS Covid Infection Survey to the Primary Care records, enabling more complex analyses than would be possible from either source alone, and only possible due to the design and security of OpenSAFELY.

This growth in the service has been facilitated through implementation of our innovative Co-Piloting scheme, where new users are supported by an experienced user of OpenSAFELY from the Bennett Institute through the first six weeks of their project. This scheme helps researchers quickly understand how to use the service, code and conduct their analysis efficiently, and produce their first outputs within that time. This service has been successful, with consistently positive feedback from researchers, high-quality analyses completed, and a growing number of collaborative working relationships between the Bennett Institute and research organisations.

Trust in OpenSAFELY

OpenSAFELY has earned access to an unprecedented scale of data by listening to the concerns of patients, professional groups, and campaigners; and then developing innovative new methods for working with patient data safely and transparently, to address those concerns through concrete action.

Support for OpenSAFELY Citizens Juries

In 2021 a series of three citizens’ juries were commissioned to address policy questions about data sharing initiatives introduced during the COVID-19 pandemic. The juries were sponsored by the NIHR ARC for Greater Manchester, the National Data Guardian and NHSx. The data sharing initiatives covered were OpenSAFELY; the NHS Data Store; and the addition of extra information to the Summary Care Record. Of these three initiatives, the juries were most supportive of the decision to introduce OpenSAFELY (77% of jurors very much in support and 33% in support) and least supportive of the decision to introduce the NHS COVID-19 Data Store and Platform (38% of jurors very much in support). This high level of support is attributable to the fact that most jurors considered OpenSAFELY to be the most transparent, trustworthy, and secure of the three data sharing initiatives. When asked at the end of the jury process whether OpenSAFELY should continue post-COVID, 87% of jurors were in support.

Support for OpenSAFELY from Patients

The OpenSAFELY public advisory group, composed of 12 members representative of different ethnicities, educational backgrounds, and genders, have continued to express the same strong level of citizen support for the continuation of OpenSAFELY beyond the pandemic.

Support for OpenSAFELY from Privacy Campaigners

Privacy Campaigners MedConfidential - who are usually critical of projects accessing large volumes of citizen-level data - have been strongly, actively and publicly supportive of OpenSAFELY, sitting on the OpenSAFELY Advisory Board, and supporting the ongoing use of OpenSAFELY on their website, via social media, and in email communication with senior policymakers.

Support for OpenSAFELY from Professional Groups

OpenSAFELY has received ongoing support from the professional community, including from the BMA, the RCGP and the Joint GP IT Committee. OpenSAFELY has also received support from the National Data Guardian, who referenced the Citizens Juries in her annual report, and from a large number of senior academics.

OpenSAFELY has operated successfully since 2020, with the legal permission to access and analyse Primary Care data being granted through the Control Of Patient Information (COPI) Notice issued by the Secretary of State for Health, to enable such use during the Covid-19 Pandemic. That COPI Notice has now expired, which risked the ongoing operation of OpenSAFELY.

We have worked closely with the Department of Health and Social Care (DHSC) and NHS England (NHSE) to ensure that OpenSAFELY can continue to operate. Both organisations are strongly supportive of - and grateful for - our work, and recognise the significant contribution made by the Bennett Institute and the research we enable. Both organisations are committed to putting in place a secure long-term legal agreement to ensure that OpenSAFELY can continue to operate without such a risk in future.

As such an agreement will take some months to finalise, DHSC agreed to our proposal that the expired COPI notice be extended, solely for OpenSAFELY, for a further six months. This was supported at Official and Ministerial level, and the extension was formally signed by the Secretary of State in October. This extension runs until 30th April 2023, and DHSC and NHSE are committed to working with us in the intervening period, to finalise the long-term legal basis for OpenSAFELY.

Outputs from OpenSAFELY

  • Over 100 Users, from around 25 organisations
  • 125 approved projects, with increasing proportion (currently around 35%) from external researchers
  • More than 30 published research outputs, including in Nature, the Lancet, the BMJ, and other high-impact journals, on topics including:
    • COVID-19 risks
    • Ethnicity and COVID risk
    • Household COVID risks
    • Long COVID
    • Monoclonal and antiviral coverage
    • Monoclonal and antiviral effectiveness
    • Vaccine effectiveness
    • Vaccine coverage
    • Vaccine safety
    • Consequences of COVID infection
    • Pharmacoepidemiology (exploring whether specific medicines increase or decrease patients’ risks from COVID)
    • Changes in clinical activity during and after pandemic

The case-studies below show examples of the high-value research made possible by OpenSAFELY.

Case Study: comparative effectiveness of Covid-19 vaccines

Objective: To compare the effectiveness of the Pfizer-BioNtech and Oxford-AstraZeneca vaccines against infection and Covid-19 disease, in health and social-care workers.

Results: The cumulative incidence of recorded infections with Covid-19, and accident and emergency attendance or hospital admission, resulting from infection, was similar for both vaccines. With both vaccines, incidence dropped sharply after 3-4 weeks, and there were no substantial differences in infection or disease up to 20 weeks after vaccination.

Conclusion: The sharp drop in incidence after 3-4 weeks is in line with expected onset of vaccine-induced immunity, and wider results suggest strong protection against COVID-19 disease for both vaccines.

Detailed Summary and Full Report

Case study: changes in English medication safety indicators throughout the COVID-19 pandemic

Objective: To assess whether the disruption to primary care services during the Covid-19 Pandemic had a negative impact on prescribing practices, as measured by the national PINCER programme, which aims to identify and correct hazardous prescribing in GP Practices.

Results: All PINCER measures were successfully implemented in OpenSAFELY (in TPP and EMIS), and analysis confirmed that levels of hazardous prescribing remained largely unchanged during the pandemic, with only small reductions in achievement of the PINCER indicators. All indicators exhibited substantial recovery by Sept 2021.

Conclusion: Good performance was maintained, during the Covid-19 Pandemic, across a diverse range of widely evaluated measures of safe prescribing.

Detailed Summary and Full Report

Case study: potentially inappropriate prescribing of DOACs to people with mechanical heart valves

Objective: To assess prescribing of direct oral anticoagulants (DOACs) amongst people with mechanical heart valves, who are not recommended to receive DOACs, following guidance issued during the COVID-19 pandemic to switch patients on warfarin to DOACs.

Results: Of the 15,457 people identified as having a mechanical heart valve, 1,058 (6.8%) of them had been prescribed a DOAC during the study period, of which 767 were still receiving a DOAC in May 2021.
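
As a quick check on the proportions in these results, the short sketch below recomputes them from the counts given in the text. The variable and function names are illustrative only, and the share of DOAC recipients still on treatment in May 2021 is derived here from the reported counts rather than quoted from the study.

```python
# Counts taken from the results paragraph above.
mechanical_valve_patients = 15_457
prescribed_doac = 1_058
still_on_doac_may_2021 = 767


def percentage(part, whole):
    """Return part/whole as a percentage rounded to one decimal place."""
    return round(100 * part / whole, 1)


# Share of people with a mechanical heart valve who were prescribed a DOAC.
share_prescribed = percentage(prescribed_doac, mechanical_valve_patients)

# Share of those prescribed a DOAC who were still receiving one in May 2021
# (a derived figure, not one reported in the text).
share_still_on = percentage(still_on_doac_may_2021, prescribed_doac)

print(f"Prescribed a DOAC: {share_prescribed}%")
print(f"Still receiving a DOAC in May 2021: {share_still_on}% of those prescribed")
```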

Conclusion: Direct alerts have been issued to clinicians through their EHR software, informing them of the issue. We show that the OpenSAFELY platform can be used for rapid audit and feedback to mitigate the indirect health impacts of COVID-19 on the NHS.

Detailed Summary and Full Report

Expanding OpenSAFELY to non-health datasets

OpenSAFELY has a proven track record of earning trust from privacy campaigners, professional groups, and the public in regard to safely managing disclosive medical records data. It has also earned a reputation for being an efficient data curation tool: this is critical, as it has been estimated that the data curation work makes up 80% of the total cost of an analysis project using NHS health data.

This presents significant opportunities for expanding OpenSAFELY as a set of data management tools on top of non-health datasets that present similar challenges around privacy, disclosivity and curation. For this work we will explore two models. Firstly, deploying OpenSAFELY on top of the data in the data centres where it already resides. Secondly, setting up our own data deposit service where collaborating organisations can send their data, and have it securely managed by our team. Both of these models will allow better access to non-health datasets; permit linkage of health data onto non-health datasets; and build deep collaborations with non-health research communities.

Next steps

We have initiated a programme of work to engage with stakeholders covering key non-health datasets including: the National Pupil Database; birth cohort data for social science and health research; and commercial data held by supermarkets. These datasets all have high research value, and have all struggled with access and productivity due to privacy and curation challenges. This work is being led by Pete Stokes, who has joined our team from his previous role running the Trusted Research Environment at the Office for National Statistics.

Expanding OpenSAFELY to non-health datasets will require significant Research Software Engineering and Information Engineering resource and expertise to deliver successfully, and specifically to expand our methodological innovation on health data into non-health datasets. It will also require funding for initial development. The Bennett Institute leadership are currently considering how to best achieve this alongside on-going activities.

PCMAU: NHS Primary Care and Medicines Analytics Unit

In late 2021, following an open and competitive tender process, the Bennett Institute was awarded a three-year contract from NHS England to establish the Primary Care and Medicines Analytics Unit. This covers the Bennett Institute’s work on NHS service monitoring and clinical informatics, in collaboration with the Medicines and Primary Care Directorates at NHS England.

The focus of the PCMAU is to develop analytical tools, conduct single analyses and provide education and training to support NHSE analysts working on primary care or medicines. During the first year the PCMAU has delivered:

  • OpenSAFELY Interactive pilot phase - the creation of a point and click interface to the OpenSAFELY trusted research environment, providing access to OpenSAFELY analysis to trusted users without the need for any code to be written.
  • PINCER - the recreation in OpenSAFELY of 13 measures, used to reduce errors in prescribing and medicines, to support monitoring. The measures are available at reports.opensafely.org, with the accompanying methods published as a preprint.
  • OpenPrescribing - ongoing maintenance and some development of OpenPrescribing is funded through PCMAU.
  • Series of papers and associated reports on the restoration of services during the pandemic
  • User research project into the current education and training available in the analytics of primary care and medicines data, and any gaps in existing material.

OpenPrescribing

Every month, the NHS in England publishes anonymised data about the drugs prescribed by GPs. However, the raw data files published are large and unwieldy, with more than 700 million rows, and, for many years, were barely usable (and, thus, hardly used). OpenPrescribing was developed to make it easy for GPs, NHS managers and the general public to explore the data, with huge success: the service has over 20,000 unique users each month.

In addition to making the data accessible, we continue to develop the range of pre-specified measures within OpenPrescribing, which highlight potential opportunities to improve the quality, safety, and cost effectiveness of prescribing, for each practice. For example, this year saw the development of the Outlier Detection Dashboards - a new tool to identify instances or patterns of unusual prescribing, intended to help local health leaders. While many outliers are wholly understandable, with appropriate local knowledge, others may occur when updated guidance on treatments has been issued by the NHS, and missed in some areas. This tool has received very positive feedback, as has OpenPrescribing more widely.

In addition to being used as the data source for a continuing series of academic papers, OpenPrescribing is also now routinely used to inform news articles on prescribing practice at a national and local level.

Next Steps for OpenPrescribing

As of Feb 2022 this work is now supported by the NHS England PCMAU contract for three years, renewable to five. In the next year, we intend to build on the continuing success of OpenPrescribing, and are looking for funding to enable production of a regular series of “State of the Nation” reports, providing a detailed picture of prescribing patterns for nationally important topics. These reports would be intended for a broad audience, to include the public, the media, policymakers, and clinicians. In addition to stimulating debate, showing the value of the data, and demonstrating the expertise within the Bennett Institute, these reports would also direct those latter parts of the audience to the OpenPrescribing tools, which would increase use, and enable them to understand and improve prescribing practices nationwide. We are also developing work to deploy OpenPrescribing across new datasets including hospital prescribing.

TrialsTracker

Clinical trials are the “gold standard” for evaluating which treatments work best in medicine. However, the results of these trials are often left unreported, which undermines the ability of doctors, researchers and patients to make informed decisions about which treatments work best. The TrialsTracker was initially developed to track these trials and highlight any which were not reported when scheduled.

Two instances of the TrialsTracker live audit tool are currently in operation: one covering all trials covered by EU guidelines (which include those in the UK), and another for those covered by US legislation. These continue to receive extensive global media coverage and, more importantly, have contributed to a significant improvement in the reporting of trial results since their launch.

Recent research outputs include an analysis showing that UK academics perform better than those from any other country in Europe at reporting their clinical trial results in compliance with the rules. This is likely due, in large part, to the work done by the Bennett Institute with the Parliamentary Science and Tech Select Committee.

What’s Next?

Nick Devito, who leads our research integrity workstream, has successfully received funding for half his time on a major EU grant, and is leading our Retractobot project, which will launch in early 2023. This tool identifies papers citing retracted publications, and then contacts half of the authors to inform them, as a randomised trial to evaluate whether informing researchers that they have cited a retracted paper reduces the frequency of future citations of retracted papers.
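
The allocation step of a trial like this can be sketched in a few lines. This is a minimal illustration only, assuming simple 1:1 randomisation at the author level; the function name, the author identifiers and the seed are hypothetical and are not taken from the Retractobot codebase or protocol.

```python
import random


def randomise_authors(author_ids, seed=0):
    """Split authors into a 'contact' arm and a 'control' arm of (near) equal size.

    Assumption: simple randomisation of individual authors. The real trial
    may stratify or randomise at a different level.
    """
    rng = random.Random(seed)  # seeded so the allocation can be reproduced
    unique_authors = sorted(set(author_ids))  # de-duplicate, fix the starting order
    rng.shuffle(unique_authors)
    half = len(unique_authors) // 2
    return {"contact": unique_authors[:half], "control": unique_authors[half:]}


# Hypothetical authors who have cited a retracted paper.
citing_authors = [f"author-{i}" for i in range(10)]
arms = randomise_authors(citing_authors, seed=42)

print(len(arms["contact"]), "authors to contact;", len(arms["control"]), "controls")
```

Only the authors in the "contact" arm would be emailed; future citations of retracted papers could then be compared between the two arms.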

Policy and Advocacy

The expertise within the Bennett Institute is highly regarded, and widely sought, across many sectors and professions, and many of our staff sit on or support Health Service, Government, Academic and other Boards, Advisory Groups and Expert Panels.

The most high-profile call on that expertise in the last year was the Goldacre Review into the use of Health Data for Research and Analysis, which was commissioned by the Secretary of State for Health and Social Care in 2021, and published in April 2022. The Review made a series of recommendations to improve the use of health data. These were well received by the Government, the Department of Health and Social Care, and the NHS, and received extensive media coverage. The recommendations from the Review are now being implemented, with strong support in particular for the two key recommendations (on TREs and open working methods) in the NHS Data Strategy 2022. Teams from the Bennett Institute are actively supporting this work in the NHS.

Case Study: the Goldacre Review for Secretary of State, 2022

In April 2022 we published the “Goldacre Review”, Better, Broader, Safer: Using Health Data for Research and Analysis. Following a six-month research process, involving interviews with more than 300 people, focus groups, and extensive desk research, the Review made approximately 185 recommendations to the UK Government across the following areas: modernising the NHS analytics workforce; open working; privacy and security; trusted research environments; information governance, ethics, and participation; and data curation.

The Review was well received by the UK Government, which officially responded to it in the NHS Data Strategy, Data Saves Lives, published in June 2022, directly picking up the Review’s recommendations related to:

  1. Privacy and Security (Trusted Research Environments - renamed Secure Data Environments)
  2. IG, Engagement, and Ethics
  3. Open Working

Of particular note were the Data Strategy’s commitments to:

  • Implementing secure data environments (TREs) as the default across the NHS
  • Developing a standard for public engagement, setting out best practice for health and care organisations, and any other body using NHS data, to engage appropriately with the public and staff across the system on data programmes and issues
  • Developing an Open Analytics policy for the NHS

The NHS is now focused on implementing these recommendations - a process that we, as the Bennett Institute, are actively involved in. More details about the policy impact of the Review are available on our blog.

Select Committees and Other Engagement

Various members of our team sit on national committees relating to data use and we are widely consulted across the system. BG gave invited oral evidence to the Science and Technology Select Committee twice in the last year:

  • Dec 2021: Reproducibility and Research Integrity
    • Evidence focussed on strengths and weaknesses of the academic publication process, and the problems caused by the lack of value placed on Research Software Engineering expertise within academic institutions
    • An article was published in The Register, summarising this evidence
  • May 2022: The Right to Privacy, Digital Data
    • Evidence focussed on risks to data privacy and the impact that failure to mitigate these has on public trust, and other related themes discussed in the Goldacre Review

Organisation

The Bennett Institute has grown fast and is now in the process of segmenting its work out into single teams with the right degree of autonomy to maintain productivity and retain the creative spark that has driven us to success over the preceding years at smaller scale. We have developed a model of eight core teams, each of which is described below.

Operations

This is the senior leadership team for the Institute, responsible for its overall delivery. Its responsibilities are: setting the strategic direction for the Institute; prioritising and directing existing and new activities; building strong relationships and leading engagement with stakeholder and partner organisations; coordinating promotion of the Institute, its tools and expertise; prioritising and coordinating grant applications; managing grants, finances and resourcing; and coordinating formal governance.

Data

The Data team develop tools to enable high-quality epidemiology and service analytics, by making it easy to understand and work with Primary Care data, and support the use of these.

Pipeline

The Pipeline team provide and support the platform and tools which enable reproducible research, including the creation of studies and running of analyses. They also work to automate various components of the wider service offered to researchers.

Product

The Product team engage with users and operators of the OpenSAFELY products, to understand their needs and identify opportunities for further development and enhancements. They then work with the Data and Pipeline teams to agree and prioritise the most appropriate solutions to deliver these.

Epidemiology

The Epidemiology team use the platforms and tools run by the Bennett Institute, to produce high-quality and reproducible epidemiological research. They advocate for better ways of working in epidemiology, and develop and maintain tools to facilitate this.

Clinical Informatics

The Clinical Informatics team ensure Bennett Institute products are clinically accurate and meaningful, and routinely used to deliver actionable insights to users. They work with users and clinicians to test existing tools, and develop ideas for new ones.

NHS Service Analytics

The NHS Service Analytics team use the data platforms and tools run by the Bennett Institute to produce high-quality and reproducible NHS Service Analytics, and work to make it easier for others to do this.

Research Integrity and Policy

The Research Integrity and Policy Team analyses the technical, regulatory, and cultural barriers to better use of data, and the production of reproducible and rigorous research. They turn the findings into actionable policy insights and offer workable, practical solutions.

Governance

Day-to-day work is led, prioritised and coordinated by the Operations Team. Senior strategic input to the Bennett Institute is provided by the Advisory Board, which tracks progress against core objectives, and provides support and advice in achieving these. The Institute also runs and engages with a number of other committees, to inform and direct specific areas of work.

Operations Team Meetings

The Ops Team meet weekly to discuss progress across the Institute, prioritise activity and resource, and plan future work.

Bennett Institute Advisory Board

The Advisory Board provides strategic advice, guidance and support to the Institute’s senior management in developing and reviewing the Institute’s strategy, research and performance, in relation to the key policy and societal challenges which it should be exploring, and to ensure that the Bennett Institute takes an international perspective. The Advisory Board will support the Bennett Institute’s profile and impact globally, and will advise upon the circulation of its research and achievements to a global audience. The Advisory Board was established at the same time as the Institute and is chaired by Professor Patrick Grant, the Pro Vice-Chancellor (Research).

OpenSAFELY Oversight Board

The core work of OpenSAFELY is overseen by the Oversight Board, chaired by Professor Sir Nigel Shadbolt (who is also on the Bennett Institute Advisory Board), with all key stakeholders represented, including Data Controllers, partner organisations, and a representative of the patients whose data are analysed.

OpenSAFELY Digital Critical Friends Group

It is crucial that patients and the public are involved in all our work, that they have sight of what we do, and that we have structures in place to hear back from those who are affected by our work, both in terms of platforms and single research paper outputs. Public and Patient support for both the principles of OpenSAFELY and the analysis undertaken through the service is particularly crucial. To help maintain this, we meet with a group of patient representatives every two weeks, to discuss and seek feedback on developments with the service and future plans. This group also provides feedback directly to the Oversight Board, where a representative of this group is a core member, to ensure all stakeholders are aware of the views of the wider group.

OpenSAFELY/ONS Analytical Working Group

This group is chaired by ONS, and manages the partnership between ONS and the Bennett Institute, established when data from the Covid Infection Survey were made available through OpenSAFELY. It is used to agree and prioritise research using these data and identify further opportunities for collaboration.

Funding

The Bennett Institute was established following a generous donation from the Peter Bennett Foundation, to pioneer the better use of data, evidence and digital tools in healthcare and policy and so optimise the impact of interventions to achieve improved outcomes.

In addition to the Peter Bennett Foundation donation, the Bennett Institute also regularly applies for research grants to fund specific projects. Previously awarded grants have come from: the Laura and John Arnold Foundation, the NHS National Institute for Health Research (NIHR), the NIHR School of Primary Care Research, NHS England, the NIHR Oxford Biomedical Research Centre, the Mohn-Westlake Foundation, NIHR Applied Research Collaboration Oxford and Thames Valley, the Wellcome Trust, the Good Thinking Foundation, Health Data Research UK, the Health Foundation, the World Health Organisation, UKRI MRC, Asthma UK, the British Lung Foundation, and both the Longitudinal Health and Wellbeing and Data and Connectivity strands of the National Core Studies programme.