Bulk API

Bulk export & import operations


$dump

You can dump all resources of a specific type with the $dump operation: GET [resource-type]/$dump. Aidbox responds with a Chunked Transfer Encoding ndjson stream. This is a memory-efficient operation: Aidbox simply streams the database cursor to the socket. If your HTTP client supports processing of chunked encoding, you can handle resources in the stream one by one, without waiting for the end of the response.

GET /Patient/$dump
HTTP/1.1 200 OK
Content-Type: application/ndjson
Transfer-Encoding: chunked
{"resourceType": "Patient", "id": .............}
{"resourceType": "Patient", "id": .............}
{"resourceType": "Patient", "id": .............}
{"resourceType": "Patient", "id": .............}

Here is an example of how you can dump all patients from your box (assuming you have a client with an appropriate access policy):

curl -u client:secret -H 'content-type:application/json' \
https://<box-url>/Patient/\$dump | gzip > patients.ndjson.gz
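Because the response is chunked, a client can act on each resource as soon as its line arrives. A minimal Python sketch of that idea, where the chunk iterator stands in for any streaming HTTP response body (e.g. the byte-chunk iterator your HTTP library exposes); the resource ids are illustrative:

```python
import json

def iter_ndjson(chunks):
    """Yield one parsed resource per NDJSON line from an iterable of byte chunks.

    A line may be split across chunk boundaries, so the tail is buffered
    until its terminating newline arrives.
    """
    buf = b""
    for chunk in chunks:
        buf += chunk
        while b"\n" in buf:
            line, buf = buf.split(b"\n", 1)
            if line.strip():
                yield json.loads(line)
    if buf.strip():  # last line may lack a trailing newline
        yield json.loads(buf)

# In-memory chunks standing in for the streamed HTTP response;
# note the second resource is split across two chunks.
chunks = [
    b'{"resourceType": "Patient", "id": "pt-1"}\n{"resourceType": "Pat',
    b'ient", "id": "pt-2"}\n',
]
ids = [r["id"] for r in iter_ndjson(chunks)]
# ids == ["pt-1", "pt-2"]
```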

Dump data

GET /[resource-type]/$dump

Dumps data as NDJSON, optionally in the FHIR format and/or GZIPped.

Path Parameters
- name of the resource type to be exported

Query Parameters
- Date in ISO format; if present, the exported data will contain only resources created after that date.
- Convert data to the FHIR format (fhir); if disabled, the data is exported in the Aidbox format.
- GZIP the result; if enabled, HTTP headers for gzip encoding are also set.

200: OK
NDJSON representing the resources.

Example request: GET /Appointment/$dump?fhir=true


$dump-sql

Takes an SQL query and responds with a chunked-encoded stream in CSV format. Useful for exporting data for analytics.

POST /$dump-sql
query: select id, resource#>>'{name,0,family}'
format: csv # ndjson; sql; elastic-bulk?
HTTP/1.1 200 OK
Content-Type: application/CSV
Transfer-Encoding: chunked
pt-1 Doe John
pt-2 Smith Mike
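As with $dump, the chunked CSV response can be consumed row by row rather than buffered whole. A sketch using Python's csv module, with an in-memory body standing in for the decoded HTTP response (the row values are illustrative):

```python
import csv
import io

def iter_csv_rows(lines):
    """Parse rows from an iterable of decoded text lines, such as a
    $dump-sql response body read line by line."""
    return csv.reader(lines)

# Stand-in for the streamed response body.
body = "pt-1,Doe\npt-2,Smith\n"
rows = list(iter_csv_rows(io.StringIO(body)))
# rows == [["pt-1", "Doe"], ["pt-2", "Smith"]]
```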


$load

You can efficiently load data into Aidbox in ndjson .gz format from an external web service or bucket. There are two versions of $load: /$load and /[resourceType]/$load. The first can load multiple resource types from one ndjson file; the second is more efficient, but loads only one specific resource type. Both operations accept a body with a source element, which must be a publicly accessible URL. If you want to secure your import, use Signed URLs from Amazon S3 or Google Storage.

Each also comes in two flavors: /fhir/$load accepts data in the FHIR format, while /$load works with the Aidbox format.

Keep in mind that, for the sake of performance, $load does not validate inserted resources. Be mindful of the data you insert, and use the URL that matches your data format.

Here is how you can load 100 Synthea patients (see the tutorial):

POST /fhir/Patient/$load
source: 'https://storage.googleapis.com/aidbox-public/synthea/100/Patient.ndjson.gz'
{total: 124}

Or load the whole synthea package:

POST /fhir/$load
source: 'https://storage.googleapis.com/aidbox-public/synthea/100/all.ndjson.gz'
# resp
{CarePlan: 356, Observation: 20382, MedicationAdministration: 150, .... }
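The file behind source must be gzipped NDJSON: one JSON document per line, compressed with gzip. A minimal sketch of producing such a file locally (the resource content and file path are illustrative):

```python
import gzip
import json
import os
import tempfile

resources = [
    {"resourceType": "Patient", "id": "pt-1"},
    {"resourceType": "Patient", "id": "pt-2"},
]

path = os.path.join(tempfile.gettempdir(), "Patient.ndjson.gz")

# One JSON document per line, gzip-compressed -- the shape $load
# expects to find behind the source URL.
with gzip.open(path, "wt", encoding="utf-8") as f:
    for r in resources:
        f.write(json.dumps(r) + "\n")

# Sanity check: read the file back and count resources.
with gzip.open(path, "rt", encoding="utf-8") as f:
    total = sum(1 for _ in f)
# total == 2
```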

$import & /fhir/$import

$import is an implementation of the upcoming FHIR Bulk Import API. It is an asynchronous operation that returns a URL for monitoring progress. Here is a self-descriptive example:

POST /fhir/$import
id: synthea
inputFormat: application/fhir+ndjson
contentEncoding: gzip
mode: bulk
inputs:
- resourceType: Encounter
  url: https://storage.googleapis.com/aidbox-public/synthea/100/Encounter.ndjson.gz
- resourceType: Organization
  url: https://storage.googleapis.com/aidbox-public/synthea/100/Organization.ndjson.gz
- resourceType: Patient
  url: https://storage.googleapis.com/aidbox-public/synthea/100/Patient.ndjson.gz

You post the import body with an id, and can then monitor the progress of the import using:

GET /BulkImportStatus/[id]
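Since $import is asynchronous, a client typically polls the status resource until the import finishes. A sketch of such a polling loop, with the HTTP call abstracted into a callable so the loop itself can be exercised without a server; the "status"/"finished" field and value are assumptions for illustration, not taken from the source:

```python
import time

def wait_for_import(fetch_status, is_done, interval=5.0, timeout=600.0,
                    sleep=time.sleep):
    """Poll fetch_status() until is_done(status) is true or timeout expires.

    fetch_status -- callable performing GET /BulkImportStatus/[id]
    is_done      -- predicate over the returned status document
    """
    deadline = time.monotonic() + timeout
    while True:
        status = fetch_status()
        if is_done(status):
            return status
        if time.monotonic() >= deadline:
            raise TimeoutError("import did not finish in time")
        sleep(interval)

# Usage with a stub in place of the real HTTP call (field names are
# hypothetical):
responses = iter([{"status": "in-progress"},
                  {"status": "in-progress"},
                  {"status": "finished"}])
result = wait_for_import(lambda: next(responses),
                         lambda s: s["status"] == "finished",
                         sleep=lambda _: None)
# result == {"status": "finished"}
```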
