$export

FHIR Bulk Data Export
The FHIR Bulk Data Export feature lets you export FHIR resources in ndjson format.
Aidbox supports patient-level, group-level, and system-level export. When an export request is submitted, the server returns a URL for checking the export status. When the export is finished, the status endpoint returns URLs for downloading the resources.
Only one export process can run at a time. If you submit an export request while another export is active, you get a 429 Too Many Requests error.
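Because only one export runs at a time, a client may want to retry a submission that was rejected with 429. The following is a minimal Python sketch, not an Aidbox client. It assumes a caller-supplied `submit` function (hypothetical) that sends the export request and returns the HTTP status code. The attempt count and delay are illustrative values, not Aidbox recommendations.

```python
import time
from typing import Callable


def submit_with_retry(
    submit: Callable[[], int],
    attempts: int = 5,
    delay_seconds: float = 30.0,
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Call `submit` until it returns something other than 429.

    `submit` is a hypothetical callable that sends the $export request
    and returns the HTTP status code. Returns the last status seen.
    """
    status = submit()
    for _ in range(attempts - 1):
        if status != 429:
            break
        sleep(delay_seconds)  # give the active export time to finish
        status = submit()
    return status
```

If the function still returns 429 after the last attempt, the earlier export is most likely still running; checking its status URL is then more useful than retrying further.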

Setup storage

Aidbox can export data to a GCP or AWS cloud storage bucket. Export results are placed in a <datetime>_<uuid> folder in the bucket.

GCP

Create a bucket and a service account that has read and write access to the bucket.
Create a GcpServiceAccount resource in Aidbox. Example:
private-key: |
  -----BEGIN PRIVATE KEY-----
  your-key-here
  -----END PRIVATE KEY-----
service-account-email: service-account@email
id: gcp-service-account
resourceType: GcpServiceAccount
Set the following environment variables:
  • box_bulk__storage_backend=gcp — backend for export
  • box_bulk__storage_gcp_service__account — id of the GcpServiceAccount resource
  • box_bulk__storage_gcp_bucket — bucket name

AWS

Create an S3 bucket and an IAM user that has read and write access to the bucket.
Create an AwsAccount resource in Aidbox. Example:
region: us-east-1
access-key-id: your-key-id
secret-access-key: key
id: aws-account
resourceType: AwsAccount
Set the following environment variables:
  • box_bulk__storage_backend=aws — backend for export
  • box_bulk__storage_aws_account — id of the AwsAccount resource
  • box_bulk__storage_aws_bucket — bucket name
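The settings for both backends can be checked locally before Aidbox starts. The sketch below is an illustration only and is not Aidbox's own validation. It assumes the variables are set with exactly the lowercase names listed above, and the function name `missing_storage_settings` is hypothetical.

```python
from typing import List, Mapping

# Variable names are taken from the setup steps above.
REQUIRED = {
    "gcp": ["box_bulk__storage_gcp_service__account", "box_bulk__storage_gcp_bucket"],
    "aws": ["box_bulk__storage_aws_account", "box_bulk__storage_aws_bucket"],
}


def missing_storage_settings(env: Mapping[str, str]) -> List[str]:
    """Return a list of problems found in the bulk storage settings."""
    backend = env.get("box_bulk__storage_backend")
    if not backend:
        return ["box_bulk__storage_backend is not set"]
    if backend not in REQUIRED:
        return [f"box_bulk__storage_backend has unsupported value: {backend}"]
    # Report every backend-specific variable that is absent or empty.
    return [f"{name} is not set" for name in REQUIRED[backend] if not env.get(name)]
```

Calling `missing_storage_settings(os.environ)` returns an empty list when all variables for the chosen backend are present. It does not verify that the referenced account resource or bucket exists.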

Parameters

Parameter
Description
_outputFormat
Specifies the format in which the server generates files. The following formats are supported:
  • application/fhir+ndjson: .ndjson files will be saved
  • application/fhir+ndjson+gzip: .ndjson.gz files will be saved
_type
Includes only the specified types. This list is comma-separated.
_since
Includes only resources changed after the specified time.
patient
Exports only data that belongs to the listed patients. Format: comma-separated list of patient ids. Available only for patient-level export.
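These parameters are passed as query parameters on the export URL. The Python sketch below shows one way to assemble such a URL. The helper name `build_export_url` and the host in the usage note are hypothetical, and the function does not check that `patient` is used only with patient-level export.

```python
from typing import Optional, Sequence
from urllib.parse import urlencode


def build_export_url(
    base_url: str,
    path: str = "/fhir/Patient/$export",
    output_format: Optional[str] = None,
    types: Optional[Sequence[str]] = None,
    since: Optional[str] = None,
    patients: Optional[Sequence[str]] = None,
) -> str:
    """Build an $export URL using the query parameters described above."""
    params = {}
    if output_format:
        params["_outputFormat"] = output_format
    if types:
        params["_type"] = ",".join(types)
    if since:
        params["_since"] = since
    if patients:
        params["patient"] = ",".join(patients)
    url = base_url.rstrip("/") + path
    if params:
        # Percent-encode values ("+" becomes %2B) but keep list commas readable.
        url += "?" + urlencode(params, safe=",")
    return url
```

For example, `build_export_url("https://aidbox.example.com", types=["Patient", "Observation"])` yields a URL ending in `?_type=Patient,Observation`. Encoding the `+` in the format name matters because an unencoded `+` in a query string is commonly decoded as a space.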

Patient-level export

Patient-level export exports all Patient resources and resources associated with them. This association is defined by FHIR Compartments.
To start an export, make a request to /fhir/Patient/$export:
Request
GET /fhir/Patient/$export
Accept: application/fhir+json
Prefer: respond-async
Response
Status
202 Accepted
Headers
  • Content-Location — Link to check export status (e.g. /fhir/$export-status/<id>)
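The kick-off exchange above could be scripted roughly as follows, using only the Python standard library. This is a sketch under assumptions: the function names are hypothetical, authentication is omitted and depends on how your Aidbox instance is configured, and `urlopen` raises `urllib.error.HTTPError` for error statuses such as 429.

```python
import urllib.request


def build_kickoff_request(
    base_url: str, path: str = "/fhir/Patient/$export"
) -> urllib.request.Request:
    """Prepare the asynchronous kick-off request with the headers shown above."""
    return urllib.request.Request(
        base_url.rstrip("/") + path,
        headers={"Accept": "application/fhir+json", "Prefer": "respond-async"},
        method="GET",
    )


def start_export(request: urllib.request.Request) -> str:
    """Send the request and return the status link from Content-Location."""
    with urllib.request.urlopen(request) as response:
        if response.status != 202:
            raise RuntimeError(f"unexpected status {response.status}")
        location = response.headers.get("Content-Location")
    if not location:
        raise RuntimeError("response has no Content-Location header")
    return location
```

Building the request separately from sending it keeps the header logic easy to inspect and test without a running server.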
Make a request to the export status endpoint to check the status:
Request
GET /fhir/$export-status/<id>
Response (completed)
Status
200 OK
Body
{
  "status": "completed",
  "transactionTime": "2021-12-08T08:28:06.489Z",
  "requiresAccessToken": false,
  "request": "[base]/fhir/Patient/$export",
  "output": [
    {
      "type": "Patient",
      "url": "https://storage/some-url",
      "count": 2
    },
    {
      "type": "Person",
      "url": "https://storage/some-other-url",
      "count": 1
    }
  ]
}
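Once the status response reports completion, the download links can be read from its output array. A small Python sketch, assuming the response body has the shape shown above; the function name `list_export_files` is hypothetical.

```python
import json
from typing import List, Tuple


def list_export_files(body: str) -> List[Tuple[str, str, int]]:
    """Return (resource type, download URL, count) for each output entry."""
    status = json.loads(body)
    return [
        (entry["type"], entry["url"], entry.get("count", 0))
        for entry in status.get("output", [])
    ]
```

A missing output array is treated as an empty export rather than an error, which is a choice of this sketch and not documented behavior.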
A DELETE request to the export status endpoint cancels the export.
Request
DELETE /fhir/$export-status/<id>
Response
Status
202 Accepted
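Cancellation can be scripted in the same way as the other calls. The sketch below only prepares the DELETE request with the Python standard library; the function name is hypothetical and authentication is again left out.

```python
import urllib.request


def build_cancel_request(status_url: str) -> urllib.request.Request:
    """Prepare a DELETE request for the export status URL."""
    return urllib.request.Request(status_url, method="DELETE")
```

Passing the result to `urllib.request.urlopen` sends the request. `status_url` is the absolute form of the link returned in Content-Location when the export was started.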

Group-level export

Group-level export exports all Patient resources that belong to the specified group and the resources associated with them. This association is defined by FHIR Compartments. Characteristics of the group are not exported.
To start an export, make a request to /fhir/Group/<group-id>/$export:
Request
GET /fhir/Group/<group-id>/$export
Accept: application/fhir+json
Prefer: respond-async
Response
Status
202 Accepted
Headers
  • Content-Location — Link to check export status (e.g. /fhir/$export-status/<id>)
Make a request to the export status endpoint to check the status:
Request
GET /fhir/$export-status/<id>
Response (completed)
Status
200 OK
Body
{
  "status": "completed",
  "transactionTime": "2021-12-08T08:28:06.489Z",
  "requiresAccessToken": false,
  "output": [
    {
      "type": "Patient",
      "url": "https://storage/some-url",
      "count": 2
    },
    {
      "type": "Person",
      "url": "https://storage/some-other-url",
      "count": 1
    }
  ]
}
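Because the export runs asynchronously, a client usually polls the status endpoint until the completed response arrives. The Python sketch below shows one possible loop. It assumes that the endpoint answers 202 while the export is still running, as described in the FHIR Bulk Data specification; this page does not confirm that for Aidbox. `fetch_status` is a hypothetical caller-supplied function, and the interval and poll limit are illustrative.

```python
import time
from typing import Callable, Tuple


def wait_for_export(
    fetch_status: Callable[[], Tuple[int, dict]],
    interval_seconds: float = 5.0,
    max_polls: int = 120,
    sleep: Callable[[float], None] = time.sleep,
) -> dict:
    """Poll until the status endpoint answers 200 and return its body.

    `fetch_status` is a hypothetical callable that requests the status URL
    and returns (HTTP status code, parsed JSON body).
    Assumption: 202 means the export is still in progress.
    """
    for _ in range(max_polls):
        code, body = fetch_status()
        if code == 200:
            return body
        if code != 202:
            raise RuntimeError(f"export status check failed with HTTP {code}")
        sleep(interval_seconds)
    raise TimeoutError("export did not finish within the polling limit")
```

Injecting `fetch_status` and `sleep` keeps the loop independent of any HTTP library and makes it testable without a server.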
A DELETE request to the export status endpoint cancels the export.
Request
DELETE /fhir/$export-status/<id>
Response
Status
202 Accepted

System-level export

System-level export exports data from a FHIR server, whether or not it is associated with a patient. You may restrict the resources returned using the _type parameter.
Limitation: the export operation works only for standard FHIR resources, not for custom resources.
Request
GET /fhir/$export
Accept: application/fhir+json
Prefer: respond-async
Response (completed)
Status
200 OK
Body
{
  "status": "completed",
  "transactionTime": "2021-12-08T08:28:06.489Z",
  "requiresAccessToken": false,
  "output": [
    {
      "type": "Patient",
      "url": "https://storage/some-url",
      "count": 2
    },
    {
      "type": "Person",
      "url": "https://storage/some-other-url",
      "count": 1
    }
  ]
}
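Each url in the output array points to an ndjson file, which holds one JSON resource per line. The Python sketch below reads such a payload after it has been downloaded. The function name `iter_ndjson` is hypothetical, and the `gzipped` flag is meant for files exported with the application/fhir+ndjson+gzip format.

```python
import gzip
import json
from typing import Iterator


def iter_ndjson(data: bytes, gzipped: bool = False) -> Iterator[dict]:
    """Yield one parsed resource per non-empty line of an ndjson payload."""
    if gzipped:
        data = gzip.decompress(data)
    # Split on "\n" only, so other Unicode line separators inside strings are kept.
    for line in data.decode("utf-8").split("\n"):
        if line.strip():
            yield json.loads(line)
```

This version holds the whole file in memory, which is fine for small exports. For large files, reading the download as a stream line by line would be the better fit.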
A DELETE request to the export status endpoint cancels the export.
Request
DELETE /fhir/$export-status/<id>
Response
Status
202 Accepted

Troubleshooting guide

The $export operation expects you to set up external storage that Aidbox exports data into. In most cases, issues with $export are the consequence of incorrect Aidbox configuration. To rule this out, run the following RPC:
POST /rpc
Content-Type: text/yaml
method: aidbox.bulk/storage-healthcheck
Normally, you should see something like this in the response body:
result:
  message: ok
  storage:
    type: gcp
    bucket: my_bucket
    account:
      id: gcp-acc
      resourceType: GcpServiceAccount
This means that the integration between Aidbox and your storage is set up correctly.
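A deployment script could run this healthcheck and stop early when the storage is not ready. The Python sketch below only interprets a response that has already been parsed into a dictionary (for example with a YAML parser). It assumes that message is nested under result as in the sample above, and the function name is hypothetical.

```python
def storage_is_healthy(response: dict) -> bool:
    """Return True when the parsed healthcheck response reports message: ok."""
    # Assumption: "message" sits under "result", as in the sample response.
    result = response.get("result") or {}
    return result.get("message") == "ok"
```

Anything other than `ok` is treated as unhealthy here, since this page does not describe the exact shape of error responses.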
Other responses you may see

Storage-type not specified

The Storage-type not specified error means that the box_bulk__storage_backend environment variable is not set. Valid values are aws and gcp.

Unsupported storage-type

The Unsupported storage-type error means that the box_bulk__storage_backend environment variable has an invalid value. Valid values are aws and gcp.

bulk-storage account not specified

This error means that the account is not specified. Set one of the following environment variables:
  • box_bulk__storage_gcp_service__account for GCP
  • box_bulk__storage_aws_account for AWS

Account not found

This error means that no account resource was found for AWS or GCP.
Create an AwsAccount or GcpServiceAccount resource, depending on your configuration.

Bucket is not specified

This error means that the bucket is not specified.
Specify box_bulk__storage_gcp_bucket for GCP.
Specify box_bulk__storage_aws_bucket for AWS.