Bulk API

Bulk export & import operations

$dump

You can dump all resources of a specific type with the $dump operation - GET [resource-type]/$dump - Aidbox will respond with a Chunked Transfer Encoding ndjson stream. This is a memory-efficient operation: Aidbox simply streams the database cursor to the socket. If your HTTP client supports processing of chunked encoding, you can process resources in the stream one by one without waiting for the end of the response.

GET /Patient/$dump
#response
HTTP/1.1 200 OK
Content-Type: application/ndjson
Transfer-Encoding: chunked
{"resourceType": "Patient", "id": .............}
{"resourceType": "Patient", "id": .............}
{"resourceType": "Patient", "id": .............}
{"resourceType": "Patient", "id": .............}
.........

Here is an example of how you can dump all patients from your box (assuming you have a client with an access policy):

curl -u client:secret -H 'content-type:application/json' \
https://<box-url>/Patient/\$dump | gzip > patients.ndjson.gz
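If your client exposes the response body as a line iterator, the stream can be consumed incrementally. Here is a minimal Python sketch of that parsing step; the live HTTP response is replaced by an in-memory buffer, so the only Aidbox-specific part is the endpoint URL mentioned in the comment:

```python
import io
import json

def iter_ndjson(stream):
    """Yield resources one by one from an NDJSON byte stream,
    so the full response never has to fit in memory."""
    for line in stream:
        line = line.strip()
        if line:
            yield json.loads(line)

# In real use `stream` would be the body of GET /Patient/$dump
# (any HTTP client that yields the body line by line works);
# here it is simulated with an in-memory buffer.
simulated = io.BytesIO(
    b'{"resourceType": "Patient", "id": "pt-1"}\n'
    b'{"resourceType": "Patient", "id": "pt-2"}\n'
)
ids = [r["id"] for r in iter_ndjson(simulated)]
print(ids)  # ['pt-1', 'pt-2']
```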

GET [base]/:resourceType/$dump

Dumps data as NDJSON, optionally in FHIR format or GZIPped.

Path parameters:
- resourceType (string, required): name of the resource type to be exported.

Query parameters:
- _since (string, optional): date in ISO format; if present, exported data will contain only the resources created after that date.
- fhir (boolean, optional): convert data to the FHIR format. If disabled, the data is exported in the Aidbox format.
- gzip (boolean, optional): GZIP the result. If enabled, HTTP headers for gzip encoding are also set.

Response:
200 OK: NDJSON representing the resources. Example request: GET /Appointment/$dump?fhir=true
{"id":"ap-1","meta":{"versionId":15,"lastUpdated":"2021-04-02T16:03:31.057462+03:00","extension":[{"url":"ex:createdAt","valueInstant":"2021-04-02T16:03:09.419823+03:00"}]},"start":"2021-02-02T16:02:50.997+03:00","status":"fulfilled","participant":[{"status":"accepted"}],"resourceType":"Appointment"}
{"id":"ap-2","meta":{"versionId":26,"lastUpdated":"2021-04-02T16:04:24.695862+03:00","extension":[{"url":"ex:createdAt","valueInstant":"2021-04-02T16:03:38.168497+03:00"}]},"start":"2020-02-02T16:02:50.997+03:00","status":"fulfilled","participant":[{"status":"accepted"}],"resourceType":"Appointment"}
{"id":"ap-3","meta":{"versionId":40,"lastUpdated":"2021-04-02T16:08:41.198887+03:00","extension":[{"url":"ex:createdAt","valueInstant":"2021-04-02T16:03:55.869199+03:00"}]},"start":"2021-04-02T16:02:50.996+03:00","status":"fulfilled","participant":[{"status":"accepted"}],"resourceType":"Appointment"}
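When the gzip parameter is enabled, the client has to decompress the body before parsing it. A hypothetical round trip in Python, with the compressed response simulated in memory:

```python
import gzip
import io
import json

# Simulate what the server sends with gzip=true: the NDJSON
# body compressed with gzip. In real use `body` would be the
# raw bytes of the HTTP response.
body = gzip.compress(b'{"resourceType": "Appointment", "id": "ap-1"}\n')

# Decompress and parse line by line, the same way as a plain dump.
with gzip.open(io.BytesIO(body), "rt", encoding="utf-8") as f:
    resources = [json.loads(line) for line in f if line.strip()]

print(resources[0]["id"])  # ap-1
```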

$dump-sql

Takes an SQL query and responds with a Chunked Encoded stream in CSV format. Useful for exporting data for analytics.

POST /$dump-sql
query: select id, resource#>>'{name,0,family}' from patient
format: csv # ndjson; sql; elastic-bulk?
HTTP/1.1 200 OK
Content-Type: application/CSV
Transfer-Encoding: chunked
pt-1 Doe John
pt-2 Smith Mike
................
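The CSV stream can likewise be consumed row by row. A sketch in Python, with the chunked HTTP response body again simulated by an in-memory buffer:

```python
import csv
import io

def iter_csv(stream):
    """Yield rows from a streamed CSV export one at a time."""
    yield from csv.reader(stream)

# Simulated $dump-sql response body; in real use you would feed
# the decoded HTTP response lines into the reader instead.
simulated = io.StringIO("pt-1,Doe\npt-2,Smith\n")
rows = list(iter_csv(simulated))
print(rows)  # [['pt-1', 'Doe'], ['pt-2', 'Smith']]
```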

$load

You can efficiently load data into Aidbox from an external web service or bucket in gzipped ndjson format. There are two variants of $load: /$load and /[resourceType]/$load. The first can load multiple resource types from one ndjson file; the second is more efficient but loads only a specific resource type. Both operations accept a body with a source element, which should be a publicly accessible URL. If you want to secure your import, use Signed URLs from Amazon S3 or Google Storage.

There are also two format versions of this operation: /fhir/$load accepts data in FHIR format, while /$load works with the Aidbox format.

Keep in mind that $load does not validate inserted resources for the sake of performance. Be mindful of the data you insert and use the correct URL for your data format.

Here is how you can load 100 Synthea patients (see tutorial):

POST /fhir/Patient/$load
source: 'https://storage.googleapis.com/aidbox-public/synthea/100/Patient.ndjson.gz'
#resp
{total: 124}

Or load the whole synthea package:

POST /fhir/$load
source: 'https://storage.googleapis.com/aidbox-public/synthea/100/all.ndjson.gz'
# resp
{CarePlan: 356, Observation: 20382, MedicationAdministration: 150, .... }
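The file layout $load expects is one resource per line, gzipped. A hedged Python sketch that writes such a file locally (uploading it to a publicly reachable or signed URL, and POSTing that URL as source, is out of scope here):

```python
import gzip
import json

# Hypothetical resources to pack into the gzipped NDJSON layout
# $load expects: one JSON resource per line.
resources = [
    {"resourceType": "Patient", "id": "pt-1"},
    {"resourceType": "Patient", "id": "pt-2"},
]

with gzip.open("patients.ndjson.gz", "wt", encoding="utf-8") as f:
    for r in resources:
        f.write(json.dumps(r) + "\n")

# Sanity check: read the file back the same way a loader would.
with gzip.open("patients.ndjson.gz", "rt", encoding="utf-8") as f:
    count = sum(1 for line in f if line.strip())
print(count)  # 2
```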

$import & /fhir/$import

$import is an implementation of the upcoming FHIR Bulk Import API. It is an asynchronous operation that returns a URL to monitor progress. Here is a self-descriptive example:

POST /fhir/$import
id: synthea
inputFormat: application/fhir+ndjson
contentEncoding: gzip
mode: bulk
inputs:
- resourceType: Encounter
url: https://storage.googleapis.com/aidbox-public/synthea/100/Encounter.ndjson.gz
- resourceType: Organization
url: https://storage.googleapis.com/aidbox-public/synthea/100/Organization.ndjson.gz
- resourceType: Patient
url: https://storage.googleapis.com/aidbox-public/synthea/100/Patient.ndjson.gz

You post the import body with an id and can monitor the progress of the import using:

GET /BulkImportStatus/[id]
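A minimal sketch of driving this flow from a script: the body below mirrors the example above (trimmed to one input), while the box URL, credentials, and actual HTTP calls are left as comments because they need a live Aidbox instance:

```python
import json

# Build the $import body shown above (one input kept for brevity).
import_body = {
    "id": "synthea",
    "inputFormat": "application/fhir+ndjson",
    "contentEncoding": "gzip",
    "mode": "bulk",
    "inputs": [
        {
            "resourceType": "Patient",
            "url": "https://storage.googleapis.com/aidbox-public/synthea/100/Patient.ndjson.gz",
        },
    ],
}

payload = json.dumps(import_body)
# With a live box you would then:
#   POST https://<box-url>/fhir/$import   (body: payload)
# and poll the status resource until the import finishes:
#   GET  https://<box-url>/BulkImportStatus/synthea
print(json.loads(payload)["mode"])  # bulk
```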
