Find duplicates: $match

This page describes how to use the $match operation to search for potential duplicate patient records, including request format, query parameters, and response structure.

To use the $match operation, you need to set up an MPI. Read the MPI manual to learn how to run and use it.

Currently, the $match operation is available only for Patient resources. If you are interested in extending this functionality to other resource types, please contact us.

The $match operation performs a probabilistic search based on a matching model. The model compares the patient record you provide with other patient records in the system across multiple features and estimates how similar they are. The structure of the matching model and its parameters are described on the Matching Model Explanation page.

The result is a list of potential duplicates, each with a calculated match score and a detailed breakdown of feature similarity.

This page provides key information about using $match. For full API details, refer to our Swagger documentation.

$match

The match operation can be initiated either through the MPI user interface or by using the API.

The $match operation supports several query parameters that let you control how matching is performed and how results are returned:

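| Name      | Type    | Default | Description                                                            | Example |
|-----------|---------|---------|------------------------------------------------------------------------|---------|
| model     | string  | model   | Matching model ID to be used for matching                              | model   |
| threshold | integer | 0       | Minimum score threshold for a candidate to appear in the match results | 0       |
| page      | integer | 1       | Page number of results                                                 | 1       |
| size      | integer | 10      | Number of results per page                                             | 10      |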
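
As a rough illustration of how the parameters listed above combine into a request URL, the following Python sketch builds the query string from the documented defaults. The helper name `match_url` is hypothetical and is not part of the MPI API; only the path and the parameter names come from this page.

```python
from urllib.parse import urlencode


def match_url(model="model", threshold=0, page=1, size=10):
    """Build the relative URL for a $match call.

    The default values mirror the defaults documented on this page.
    `match_url` itself is an illustrative helper, not an MPI function.
    """
    query = urlencode(
        {"model": model, "threshold": threshold, "page": page, "size": size}
    )
    # Only the query string is encoded; the "$" in the path is left as-is.
    return f"/fhir/Patient/$match?{query}"


# Request the second page of 20 candidates scoring at least 10.
url = match_url(threshold=10, page=2, size=20)
```

Omitting an argument falls back to the documented default, so `match_url()` asks for the first page of 10 results with no score cutoff.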

To call the $match operation, send a FHIR Parameters resource that includes the patient record for which you want to find potential duplicates. Typically, this record contains demographic data such as:

Name (given and family)
Address (e.g., city, state)
Birth date
Other identifying attributes if available (e.g., telecom, identifiers)

For example, the request can look like this:

POST /fhir/Patient/$match?model=model&threshold=10&page=1&size=10
Content-Type: application/json

{
  "resourceType": "Parameters",
  "parameter": [
    {
      "name": "resource",
      "resource": {
        "name": [
          {
            "given": [
              "Freya"
            ],
            "family": "Shah"
          }
        ],
        "address": [
          {
            "city": "London"
          }
        ],
        "birthDate": "1970-12-17"
      }
    }
  ]
}
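
The same request can be prepared from code. The sketch below uses only the Python standard library to wrap a patient record in a Parameters resource and build the POST request. The base URL `http://localhost:8080` and the helper name `build_match_request` are assumptions for illustration, and authentication is omitted because it depends on your installation.

```python
import json
import urllib.request


def build_match_request(base_url, patient,
                        query="model=model&threshold=10&page=1&size=10"):
    """Wrap `patient` in a FHIR Parameters resource and prepare the POST.

    `build_match_request` is a hypothetical helper; the URL path, the
    "resource" parameter name, and the query string follow the example above.
    """
    body = {
        "resourceType": "Parameters",
        "parameter": [{"name": "resource", "resource": patient}],
    }
    return urllib.request.Request(
        url=f"{base_url}/fhir/Patient/$match?{query}",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


patient = {
    "name": [{"given": ["Freya"], "family": "Shah"}],
    "address": [{"city": "London"}],
    "birthDate": "1970-12-17",
}

# The base URL is a placeholder; replace it with your own server address.
request = build_match_request("http://localhost:8080", patient)

# Sending is left commented out so the sketch runs without a server:
# with urllib.request.urlopen(request) as response:
#     candidates = json.load(response)
```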

As a result, you will receive the following:

A list of candidate duplicate patient records
For each candidate record:
- match_weight — an overall similarity score calculated by the matching model
- match_details — per-feature similarity contributions (e.g., name similarity, date of birth match, address closeness, etc.)
- resource — the full FHIR Patient resource for that candidate

The response is sorted by match_weight in descending order so that the most similar records appear first.

For example:

[
  {
    "match_details": {
      "fn": 13.336495228175629,
      "dob": 10.59415069916466,
      "ext": -10.517360697819983,
      "sex": 0
    },
    "match_weight": 13.413285229520307,
    "resource": {
      "id": "236",
      "resourceType": "Patient",
      "name": [
        {
          "given": [
            "Freya"
          ],
          "family": "Shah"
        }
      ],
      "address": [
        {
          "city": "Londodn"
        }
      ],
      "birthDate": "1970-12-17",
      "identifier": [
        {
          "value": "62",
          "system": "cluster"
        }
      ]
    }
  },
  {
    "match_details": {
      "fn": 13.336495228175629,
      "dob": 10.59415069916466,
      "ext": -10.517360697819983,
      "sex": 0
    },
    "match_weight": 13.413285229520307,
    "resource": {
      "id": "242",
      "resourceType": "Patient",
      "name": [
        {
          "given": [
            "Freya"
          ],
          "family": "Shah"
        }
      ],
      "address": [
        {
          "city": "Lonnod"
        }
      ],
      "birthDate": "1970-12-17",
      "identifier": [
        {
          "value": "62",
          "system": "cluster"
        }
      ]
    }
  },
  {
    "match_details": {
      "fn": 13.104401641242227,
      "dob": 10.59415069916466,
      "ext": -10.517360697819983,
      "sex": 0
    },
    "match_weight": 13.181191642586905,
    "resource": {
      "id": "238",
      "resourceType": "Patient",
      "name": [
        {
          "given": [
            "Shah"
          ],
          "family": "Freya"
        }
      ],
      "address": [
        {
          "city": "London"
        }
      ],
      "telecom": [
        {
          "value": "[email protected]",
          "system": "email"
        }
      ],
      "birthDate": "1970-12-17",
      "identifier": [
        {
          "value": "62",
          "system": "cluster"
        }
      ]
    }
  }
]
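
A response like the one above can be post-processed on the client. The following Python sketch works on a trimmed copy of the example response: it checks the descending sort order, keeps candidates at or above a chosen weight, and compares each match_weight with the sum of its match_details values. In this example the two agree; whether that holds for every matching model is an assumption, and the function names are hypothetical.

```python
import math

# Trimmed copy of the example response (resource reduced to its id).
candidates = [
    {"match_details": {"fn": 13.336495228175629, "dob": 10.59415069916466,
                       "ext": -10.517360697819983, "sex": 0},
     "match_weight": 13.413285229520307, "resource": {"id": "236"}},
    {"match_details": {"fn": 13.336495228175629, "dob": 10.59415069916466,
                       "ext": -10.517360697819983, "sex": 0},
     "match_weight": 13.413285229520307, "resource": {"id": "242"}},
    {"match_details": {"fn": 13.104401641242227, "dob": 10.59415069916466,
                       "ext": -10.517360697819983, "sex": 0},
     "match_weight": 13.181191642586905, "resource": {"id": "238"}},
]


def is_sorted_descending(items):
    """Return True if match_weight never increases from one item to the next."""
    weights = [item["match_weight"] for item in items]
    return all(a >= b for a, b in zip(weights, weights[1:]))


def summarize(items, min_weight=0.0):
    """List candidates at or above `min_weight` with a per-feature sum check."""
    rows = []
    for item in items:
        weight = item["match_weight"]
        if weight < min_weight:
            continue
        details_total = sum(item["match_details"].values())
        rows.append({
            "id": item["resource"]["id"],
            "match_weight": weight,
            # Assumption: the overall weight is the sum of feature weights.
            "sums_to_weight": math.isclose(weight, details_total, rel_tol=1e-9),
        })
    return rows


strong = summarize(candidates, min_weight=13.2)
```

Here `strong` keeps the records with ids 236 and 242, since the third candidate scores below 13.2. Filtering on the client is only a convenience; the threshold query parameter applies the same kind of cutoff on the server.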
