ICD-10 to SNOMED CT mapping tool | Bennett Institute for Applied Data Science

We’ve created a tool to help map codelists from ICD-10 to SNOMED CT. We are making it available more widely in the hope that others find it helpful.

ICD-10 and SNOMED CT

ICD-10 and SNOMED CT are both widely used clinical coding systems, although each is designed for a different purpose.

ICD-10 is an international disease classification system developed and maintained by the World Health Organization. It is used primarily for recording diagnoses for reporting, monitoring, billing and other administrative purposes, rather than for direct clinical care. ICD-10 codes are mutually exclusive, relatively high-level, and designed to remain stable over time. In the UK, ICD-10 is the mandated standard for coding diagnoses in secondary care admissions and is used extensively in datasets such as Hospital Episode Statistics. The WHO’s international ICD-10 edition consists of around 14,000 diagnostic codes.

SNOMED CT, by contrast, is an international clinical terminology designed to comprehensively record clinical information at the point of care. It covers concepts ranging from patient demographics, through findings and observations, to procedures and diagnoses. It also allows concepts to be recorded at different levels of detail, for example, as a broad diagnosis such as diabetes mellitus, or as a more specific concept such as type 1 diabetes with hypoglycaemia. In the UK, SNOMED CT is the mandated standard for primary care clinical coding and is embedded across general practice systems. SNOMED CT terminology consists of over 200,000 concepts.

Codelist mapping

Codelist mapping is the process of taking a codelist defined in one clinical terminology and identifying corresponding concepts in a different terminology. Maps between terminologies are often produced and maintained by international organisations.

SNOMED International, in collaboration with the World Health Organization, publishes a map from SNOMED CT to ICD-10. This map is intended as a semi-automated clinical coding aid, to be used in conjunction with patient records and clinical judgement to determine appropriate code selection. This direction of mapping generally works because ICD-10 is a high-level classification, and multiple SNOMED CT concepts can appropriately map to a single ICD-10 code. In the UK, NHS England also publishes a SNOMED CT to ICD-10 map, built on the SNOMED CT UK Edition, which includes the International Edition together with UK-specific extension content. Using the UK Edition allows SNOMED CT concepts that are only present in the UK extension to be mapped to ICD-10.

However, these SNOMED CT to ICD-10 maps are not designed for use in reverse. Because SNOMED CT is a much larger terminology, with many more concepts, mapping from ICD-10 to SNOMED CT is much more problematic. A single ICD-10 code often corresponds to multiple SNOMED CT concepts with different meanings and levels of detail.

For example, an ICD-10 code for asthma may map to a wide range of SNOMED CT concepts, including current asthma, asthma in remission, exercise-induced asthma, or an acute asthma exacerbation. These concepts are not interchangeable, and choosing between them requires assumptions about clinical intent that are not captured by the original ICD-10 code.
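The one-to-many problem can be illustrated with a minimal Python sketch that inverts a forward (SNOMED CT to ICD-10) map. All identifiers here are hypothetical placeholders chosen for illustration; they are not real SNOMED CT or ICD-10 codes and do not reflect the content of any published map.

```python
from collections import defaultdict

# Hypothetical forward map: SNOMED CT concept -> set of ICD-10 codes.
# Every identifier is a made-up placeholder, not real map content.
FORWARD_MAP = {
    "SCT_ASTHMA": {"ICD_ASTHMA"},
    "SCT_ASTHMA_IN_REMISSION": {"ICD_ASTHMA"},
    "SCT_EXERCISE_INDUCED_ASTHMA": {"ICD_ASTHMA"},
    "SCT_ACUTE_ASTHMA_EXACERBATION": {"ICD_ASTHMA"},
    "SCT_TYPE_1_DIABETES": {"ICD_DIABETES"},
}


def invert_map(forward_map):
    """Turn concept -> ICD-10 codes into ICD-10 code -> concepts."""
    reverse = defaultdict(set)
    for concept, icd_codes in forward_map.items():
        for icd_code in icd_codes:
            reverse[icd_code].add(concept)
    return dict(reverse)


REVERSE_MAP = invert_map(FORWARD_MAP)

# One ICD-10 code fans out to several concepts with different meanings.
for concept in sorted(REVERSE_MAP["ICD_ASTHMA"]):
    print(concept)
```

Inverting the map is mechanically trivial; the difficulty is that nothing in the inverted output says which of the candidate concepts the original codelist author intended.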

IMPORTANT WARNING

In most cases, codelist mapping is not recommended.

Constructing a codelist is a complex process. Decisions about inclusion and exclusion are made throughout, often based on clinical judgement, data quality, and the intended use of the codelist. These decisions are often not apparent from the codelist alone, which is why good metadata and documentation matter.

When a codelist is mapped from one terminology to another, this context is usually lost. Even if the mapped codelist looks reasonable, it still needs a detailed review. In practice, it is often harder to spot omissions or inappropriate inclusions in a mapped codelist than it is to rebuild the codelist directly in the new terminology.

As a result, our default approach is to recreate the intended clinical concept by curating a new codelist in the target coding system, rather than to rely on automated mapping.

So why did we create this?

We are currently working with the Global Burden of Disease Study to explore how OpenSAFELY might be used to derive estimates of incidence and prevalence for selected diseases and conditions in the UK population. As a well-established study, GBD already maintains a large set of ICD-10 codelists to derive disease metrics from ICD-10 coded data sources.

To extract equivalent information from primary care records using OpenSAFELY, SNOMED CT codelists are required, as SNOMED CT is the mandated terminology in primary care. These codelists need to reflect the same intent as the existing ICD-10 codelists. Given the scale of the project and the number of codelists involved, we wanted to explore whether automated approaches could be used to generate candidate SNOMED CT codelists.

A custom mapping tool

To support this work, we built a custom-made mapping tool, available online, which allows us to explore and curate mappings between ICD-10 codes and SNOMED CT concepts.

SNOMED Editions

The tool supports three mapping configurations, using SNOMED CT to ICD-10 maps published by NHS England:

- SNOMED CT International Edition mappings
- SNOMED CT UK Edition mappings
- Both International and UK Edition mappings

Mapping version selector

The different options help explore the difference between the mappings, while the inclusion of the UK Edition mapping accounts for UK-specific clinical usage.
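As a rough sketch of what choosing between these configurations could look like, the Python below selects or merges two forward maps. The function name `select_map`, the configuration labels, and the sample data are all assumptions made for illustration; the tool's actual implementation is not described here.

```python
def select_map(config, international_map, uk_map):
    """Return a concept -> ICD-10 codes map for the chosen configuration.

    `config` is one of "international", "uk" or "both" (hypothetical labels).
    """
    if config == "international":
        return {c: set(t) for c, t in international_map.items()}
    if config == "uk":
        return {c: set(t) for c, t in uk_map.items()}
    if config == "both":
        combined = {c: set(t) for c, t in international_map.items()}
        for concept, targets in uk_map.items():
            # Merge targets when a concept appears in both maps.
            combined.setdefault(concept, set()).update(targets)
        return combined
    raise ValueError(f"unknown configuration: {config}")


# Hypothetical placeholder data, not real map content.
INTERNATIONAL_MAP = {"SCT_SHARED": {"ICD_A"}, "SCT_INTL_ONLY": {"ICD_B"}}
UK_MAP = {"SCT_SHARED": {"ICD_A", "ICD_C"}, "SCT_UK_ONLY": {"ICD_C"}}

both = select_map("both", INTERNATIONAL_MAP, UK_MAP)

# Concepts that would be missed when using only the International map.
uk_only = sorted(set(UK_MAP) - set(INTERNATIONAL_MAP))
print(uk_only)
```

Comparing the key sets of the two maps, as in the last lines, is one simple way to see which concepts a given configuration leaves out.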

In our experience, the UK Edition mapping produced more irrelevant matches, likely due to the greater degree of detail in this mapping file. We’d therefore recommend starting with the International Edition mapping, while carefully considering which UK-specific terms will be missed as a result.

Mapping there and back again

The SNOMED CT to ICD-10 mapping files are not designed for automated use, and they are not intended to support mapping from ICD-10 to SNOMED CT. Using them in this way can easily introduce incorrect or overly broad concepts.

To mitigate this, we take an additional validation step. For each SNOMED CT concept generated through mapping, we map it back to all associated ICD-10 codes and examine how many of those appear in the original ICD-10 codelist. This allows us to assess how well the SNOMED CT concept aligns with the intent of the original codelist.

Where a SNOMED CT concept maps to many ICD-10 codes, but only a small proportion of those are present in the original codelist, this suggests a weak or potentially misleading match. In these cases, it may be possible to apply a cut-off threshold to automatically exclude such concepts before manual review.
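The map-back check can be sketched in Python as follows. This is a minimal illustration under stated assumptions: the function names, the placeholder codes, and the 0.5 cut-off are all hypothetical, and the tool may compute or apply its threshold differently.

```python
def map_back_score(concept, forward_map, original_codelist):
    """Share of a concept's ICD-10 targets that are in the original codelist."""
    targets = forward_map.get(concept, set())
    if not targets:
        return 0.0
    return len(targets & original_codelist) / len(targets)


def filter_candidates(candidates, forward_map, original_codelist, threshold):
    """Split candidate concepts into kept and dropped lists by score."""
    kept, dropped = [], []
    for concept in sorted(candidates):
        score = map_back_score(concept, forward_map, original_codelist)
        (kept if score >= threshold else dropped).append(concept)
    return kept, dropped


# Hypothetical placeholder data, not real map content.
ORIGINAL_ICD10 = {"ICD_A", "ICD_B"}
FORWARD = {
    "SCT_1": {"ICD_A"},                             # 1 of 1 in codelist
    "SCT_2": {"ICD_A", "ICD_X", "ICD_Y", "ICD_Z"},  # 1 of 4 in codelist
    "SCT_3": {"ICD_A", "ICD_B", "ICD_X"},           # 2 of 3 in codelist
}

# The 0.5 threshold is an arbitrary illustrative choice.
kept, dropped = filter_candidates(FORWARD, FORWARD, ORIGINAL_ICD10, threshold=0.5)
print("kept:", kept)
print("dropped:", dropped)
```

Here `SCT_2` maps back to four ICD-10 codes of which only one is in the original codelist, so it falls below the cut-off and is set aside before manual review.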

Reverse mapping example

Using hierarchy to provide clinical context

A key feature of the tool is its hierarchy view. Rather than presenting mappings as a flat list, codes are displayed within the SNOMED CT ontology, allowing users to see parent and child relationships.

Hierarchy view

Mapped concepts are shown in blue, while unmapped concepts appear in grey. This visual distinction makes it much easier to understand where mappings sit in the wider clinical hierarchy, and to identify gaps or inconsistencies that might otherwise be missed.
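A text-only analogue of this view can be sketched in Python. The parent-to-children dictionary, the concept identifiers, and the `[mapped]` / `[unmapped]` markers are assumptions standing in for the tool's blue and grey colouring; this is not the tool's own rendering code.

```python
def render_tree(root, children, mapped, depth=0):
    """Return indented lines for `root` and its descendants.

    SNOMED CT allows a concept to have several parents, so a concept
    may be listed more than once, once under each parent.
    """
    marker = "[mapped]" if root in mapped else "[unmapped]"
    lines = ["  " * depth + f"{marker} {root}"]
    for child in sorted(children.get(root, ())):
        lines.extend(render_tree(child, children, mapped, depth + 1))
    return lines


# Hypothetical placeholder hierarchy, not real SNOMED CT content.
CHILDREN = {
    "SCT_ROOT": {"SCT_A", "SCT_B"},
    "SCT_A": {"SCT_A1"},
}
MAPPED = {"SCT_A", "SCT_A1"}

tree_lines = render_tree("SCT_ROOT", CHILDREN, MAPPED)
print("\n".join(tree_lines))
```

Even in this plain form, an unmapped sibling such as `SCT_B` sitting next to mapped concepts stands out as something to review.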

Grouping to potential parent codes

We have also added a feature called Group to Potential Parent Codes. This brings additional structure to the mapping process by identifying relevant parent concepts, and in some cases, child concepts, that may not have been captured in the initial mapping.

Parent finder example

This step is particularly useful for identifying clinically appropriate higher-level concepts that might be suitable for inclusion but are not explicitly present in the ICD-10 source codes. It can also reveal child terms of this “found” parent term that may have been missed.
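One way such a parent search could work is sketched below in Python: a parent is suggested when a large enough share of its direct children is already in the mapped set. The function name, the 0.6 threshold, and the placeholder hierarchy are assumptions for illustration; the tool's actual logic is not specified here.

```python
def suggest_parents(mapped, children, threshold):
    """Suggest unmapped parents whose direct children are mostly mapped."""
    suggestions = {}
    for parent, kids in children.items():
        if parent in mapped or not kids:
            continue
        share = len(kids & mapped) / len(kids)
        if share >= threshold:
            suggestions[parent] = {
                "share": share,
                # Children of the found parent that the mapping missed.
                "missing_children": sorted(kids - mapped),
            }
    return suggestions


# Hypothetical placeholder hierarchy, not real SNOMED CT content.
HIERARCHY = {
    "SCT_PARENT": {"SCT_A", "SCT_B", "SCT_C", "SCT_D"},
    "SCT_OTHER": {"SCT_A", "SCT_X", "SCT_Y", "SCT_Z"},
}
MAPPED_CONCEPTS = {"SCT_A", "SCT_B", "SCT_C"}

# The 0.6 parent inclusion threshold is an arbitrary illustrative choice.
found = suggest_parents(MAPPED_CONCEPTS, HIERARCHY, threshold=0.6)
print(found)
```

In this example `SCT_PARENT` is suggested because three of its four children are mapped, and `SCT_D` is surfaced as a possibly missed child; `SCT_OTHER` is not suggested because only one of its four children is mapped.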

Usage counts

We also pull in data from the OpenCodeCounts project, which shows how often individual SNOMED CT codes are used in practice. This helps focus review on the codes that are most commonly recorded and provides some reassurance that rarely used or obscure codes are unlikely to have a large impact on analyses.
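To show how usage counts can focus a review, the Python sketch below ranks candidate codes by count and tracks the cumulative share of recorded usage. The counts and identifiers are invented placeholders, and the dictionary layout is an assumption; it is not the OpenCodeCounts data format.

```python
def rank_by_usage(codes, usage_counts):
    """Return (code, count, cumulative_share) rows, most used first."""
    total = sum(usage_counts.get(code, 0) for code in codes)
    ranked = sorted(codes, key=lambda code: (-usage_counts.get(code, 0), code))
    rows, running = [], 0
    for code in ranked:
        count = usage_counts.get(code, 0)
        running += count
        share = running / total if total else 0.0
        rows.append((code, count, share))
    return rows


# Hypothetical placeholder counts, not real usage data.
USAGE = {"SCT_A": 900, "SCT_B": 90, "SCT_C": 10}
CANDIDATES = ["SCT_A", "SCT_B", "SCT_C", "SCT_D"]  # SCT_D has no recorded usage

usage_rows = rank_by_usage(CANDIDATES, USAGE)
for code, count, share in usage_rows:
    print(f"{code}\t{count}\t{share:.2%}")
```

With these made-up numbers, the first two codes account for 99% of recorded usage, which suggests where review effort is likely to matter most.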

Example of counts of code usage

Pulling it all together

At a high level, the mapping process follows these steps:

ICD-10 codes are mapped to SNOMED CT concepts using one (or both) of the SNOMED CT maps
These mappings are processed through the custom mapping tool to generate a structured list of SNOMED CT codes
Thresholds are applied to:
- remove codes that fall below a mapped-back-to-ICD-10 threshold
- identify and include missing parent codes that meet a parent inclusion threshold

From here, a codelist CSV file can be generated and imported into OpenCodelists. This resulting codelist would still need careful review and refinement within OpenCodelists to ensure accuracy and completeness.
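The final export step might look something like the Python sketch below, which serialises a list of concepts to CSV text. The `code` and `term` column names and the sample rows are assumptions for illustration; check the column layout that OpenCodelists actually expects before importing.

```python
import csv
import io


def codelist_to_csv(rows):
    """Serialise (code, term) pairs to CSV text with a header row."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["code", "term"])  # assumed column names
    for code, term in sorted(rows):
        writer.writerow([code, term])
    return buffer.getvalue()


# Hypothetical placeholder codelist, not real SNOMED CT content.
FINAL_CODELIST = [
    ("SCT_B", "Hypothetical term B"),
    ("SCT_A", "Hypothetical term A"),
]

csv_text = codelist_to_csv(FINAL_CODELIST)
print(csv_text, end="")
```

Writing to an in-memory buffer keeps the sketch self-contained; in practice the same writer could target a file opened with `newline=""`.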

Bring your own codelists

We believe in the benefits of being open about the work we do and sharing the tools we build. This includes being transparent about how our methods work and how decisions are made, as well as sharing code so others can see exactly what has been done.

We originally developed this tool for a specific set of GBD codelists, as part of an ongoing GBD-Bennett collaboration. We continue to explore other ways to make codelist curation more efficient. At this stage, we do not plan further upgrades to this tool, but recognise it might still be useful to share more widely. For this reason, we have made it possible to upload your own ICD-10 codelist to the tool to support mapping to SNOMED CT and to draw insights about relationships between the two coding systems.

PLEASE NOTE: This tool and the associated mappings are based on the SNOMED CT and ICD-10 versions available at the point of development. The tool is not under active maintenance, and the mappings are not updated as terminology releases evolve. As SNOMED CT content, extensions, and mappings change over time, the accuracy and relevance of the outputs generated by this tool may decrease.

Upload

Although we do not currently plan further development of this tool, we will keep it available for as long as it appears helpful. This work reflects our current understanding and experience, and we recognise that others may have approached similar challenges in different or more effective ways. We would welcome any feedback, suggestions, or shared learning from those who use the tool. Please do get in touch at bennett@phc.ox.ac.uk.