Synthea by Bulk API
In this guide we will generate Synthea data and load it into Aidbox.

Generate Synthea data

We will generate synthetic data with the Synthea project:
# brew install gradle
git clone https://github.com/synthetichealth/synthea
cd synthea

# edit src/main/resources/synthea.properties
# set exporter.fhir.bulk_data = true

# generate 100 patients
./run_synthea -p 100

cd output/fhir
ls -lah
# create all.ndjson
cat *.ndjson > all.ndjson

# gzip all ndjson files
gzip *.ndjson
ls -lah

# upload to storage
gsutil cp *.ndjson.gz gs://your-bucket/dir/
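Before uploading, it can help to know how many resources each file holds, so that the totals returned by the load operations can be checked later. The following is a minimal Python sketch, assuming the gzipped files are named `<ResourceType>.ndjson.gz` as produced above. The function name `count_ndjson_resources` is illustrative and is not part of Synthea or Aidbox.

```python
import gzip
from pathlib import Path


def count_ndjson_resources(directory="."):
    """Return {file name: number of non-empty lines} for every *.ndjson.gz file."""
    counts = {}
    for path in sorted(Path(directory).glob("*.ndjson.gz")):
        # NDJSON stores one JSON resource per line, so counting lines counts resources.
        with gzip.open(path, "rt", encoding="utf-8") as lines:
            counts[path.name] = sum(1 for line in lines if line.strip())
    return counts


if __name__ == "__main__":
    # Run from output/fhir after the gzip step.
    for name, total in count_ndjson_resources().items():
        print(f"{name}\t{total}")
```

Run it from `output/fhir` after the gzip step. Each line of output pairs a file name with its resource count.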

Load by Resource Type

Now we can load, for example, Patients and Observations into your box:
POST /fhir/Patient/$load
Accept: text/yaml
Content-Type: text/yaml

source: 'https://storage.googleapis.com/aidbox-public/synthea/100/Patient.ndjson.gz'

# response
{total: 124}
POST /fhir/Observation/$load
Accept: text/yaml
Content-Type: text/yaml

source: 'https://storage.googleapis.com/aidbox-public/synthea/100/Observation.ndjson.gz'

# response
{total: 20382}
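The same requests can be issued from a script. The sketch below only builds the request with Python's standard library and does not send it. The base URL, the Basic-auth client credentials, and the helper name `build_load_request` are assumptions for illustration. The body is sent as JSON rather than YAML, on the assumption that your box accepts `application/json` for this operation.

```python
import base64
import json
import urllib.request


def build_load_request(base_url, resource_type, source, client_id, client_secret):
    """Build (but do not send) a POST /fhir/<ResourceType>/$load request."""
    token = base64.b64encode(f"{client_id}:{client_secret}".encode()).decode()
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/fhir/{resource_type}/$load",
        data=json.dumps({"source": source}).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Accept": "application/json",
            "Authorization": f"Basic {token}",
        },
        method="POST",
    )


# Hypothetical values: replace with your own box URL and client credentials.
request = build_load_request(
    "http://localhost:8888",
    "Patient",
    "https://storage.googleapis.com/aidbox-public/synthea/100/Patient.ndjson.gz",
    "my-client",
    "my-secret",
)
print(request.get_method(), request.full_url)
```

To actually send it, pass the object to `urllib.request.urlopen(request)` and decode the response body.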
Let's look at the data in Aidbox:
GET /Patient?_ilike=John&_revinclude=Observation:patient

Load all at once with $load

Using /fhir/$load, you can load an ndjson file that contains multiple resource types in one step:
POST /fhir/$load
Accept: text/yaml
Content-Type: text/yaml

source: 'https://storage.googleapis.com/aidbox-public/synthea/100/all.ndjson.gz'

# response
{CarePlan: 356, Observation: 20382, MedicationAdministration: 150, Goal: 301, Patient: 124, DiagnosticReport: 1430, Practitioner: 181, ExplanationOfBenefit: 3460, Immunization: 1636, Claim: 4488, MedicationRequest: 1028, Encounter: 3460, Condition: 871, Procedure: 2854, Organization: 181, AllergyIntolerance: 40, ImagingStudy: 134}
Let's see database stats:
SELECT relname, reltuples
FROM pg_class r
JOIN pg_namespace n ON (relnamespace = n.oid)
WHERE relkind = 'r' AND n.nspname = 'public'
ORDER BY reltuples DESC
LIMIT 20

---

observation 20382
attribute   7257
claim       4488
encounter   3460

Clean up data

Truncate the tables from the DB console:
truncate CarePlan;
truncate Observation;
truncate MedicationAdministration;
truncate Goal;
truncate Patient;
truncate DiagnosticReport;
truncate Practitioner;
truncate ExplanationOfBenefit;
truncate Immunization;
truncate Claim;
truncate MedicationRequest;
truncate Encounter;
truncate Condition;
truncate "procedure";
truncate Organization;
truncate AllergyIntolerance;
truncate ImagingStudy;
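PostgreSQL accepts several table names in one TRUNCATE statement, so the cleanup can be generated rather than typed. The following is a sketch, assuming each resource type is stored in a table named after the lowercased type, which is consistent with the table names in the stats output earlier. `truncate_statement` is a hypothetical helper.

```python
def truncate_statement(resource_types):
    """Return one TRUNCATE statement covering the table of every resource type."""
    # Quote each lowercased name so keywords such as "procedure" are always safe.
    tables = ", ".join(f'"{rt.lower()}"' for rt in resource_types)
    return f"TRUNCATE {tables};"


print(truncate_statement(["Patient", "Observation", "Procedure"]))
```

Paste the printed statement into the DB console. Extend the list with the other resource types you loaded.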

Load with Bulk $import

To load data asynchronously, use the new FHIR Bulk $import operation:
POST /fhir/$import
Accept: text/yaml
Content-Type: text/yaml

id: synthea
inputFormat: application/fhir+ndjson
contentEncoding: gzip
mode: bulk
inputs:
- resourceType: AllergyIntolerance
  url: https://storage.googleapis.com/aidbox-public/synthea/100/AllergyIntolerance.ndjson.gz
- resourceType: CarePlan
  url: https://storage.googleapis.com/aidbox-public/synthea/100/CarePlan.ndjson.gz
- resourceType: Claim
  url: https://storage.googleapis.com/aidbox-public/synthea/100/Claim.ndjson.gz
- resourceType: Condition
  url: https://storage.googleapis.com/aidbox-public/synthea/100/Condition.ndjson.gz
- resourceType: DiagnosticReport
  url: https://storage.googleapis.com/aidbox-public/synthea/100/DiagnosticReport.ndjson.gz
- resourceType: Encounter
  url: https://storage.googleapis.com/aidbox-public/synthea/100/Encounter.ndjson.gz
- resourceType: ExplanationOfBenefit
  url: https://storage.googleapis.com/aidbox-public/synthea/100/ExplanationOfBenefit.ndjson.gz
- resourceType: Goal
  url: https://storage.googleapis.com/aidbox-public/synthea/100/Goal.ndjson.gz
- resourceType: ImagingStudy
  url: https://storage.googleapis.com/aidbox-public/synthea/100/ImagingStudy.ndjson.gz
- resourceType: Immunization
  url: https://storage.googleapis.com/aidbox-public/synthea/100/Immunization.ndjson.gz
- resourceType: MedicationAdministration
  url: https://storage.googleapis.com/aidbox-public/synthea/100/MedicationAdministration.ndjson.gz
- resourceType: MedicationRequest
  url: https://storage.googleapis.com/aidbox-public/synthea/100/MedicationRequest.ndjson.gz
- resourceType: Observation
  url: https://storage.googleapis.com/aidbox-public/synthea/100/Observation.ndjson.gz
- resourceType: Organization
  url: https://storage.googleapis.com/aidbox-public/synthea/100/Organization.ndjson.gz
- resourceType: Patient
  url: https://storage.googleapis.com/aidbox-public/synthea/100/Patient.ndjson.gz
- resourceType: Practitioner
  url: https://storage.googleapis.com/aidbox-public/synthea/100/Practitioner.ndjson.gz
- resourceType: Procedure
  url: https://storage.googleapis.com/aidbox-public/synthea/100/Procedure.ndjson.gz
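Writing one `inputs` entry per resource type by hand is error-prone. The sketch below generates the same request body as a Python dictionary, assuming every file is named `<ResourceType>.ndjson.gz` under a common URL prefix, as in the request above. `build_import_body` and `RESOURCE_TYPES` are hypothetical names. The result would still need to be serialized to JSON or YAML before posting.

```python
import json

RESOURCE_TYPES = [
    "AllergyIntolerance", "CarePlan", "Claim", "Condition", "DiagnosticReport",
    "Encounter", "ExplanationOfBenefit", "Goal", "ImagingStudy", "Immunization",
    "MedicationAdministration", "MedicationRequest", "Observation",
    "Organization", "Patient", "Practitioner", "Procedure",
]


def build_import_body(import_id, base_url, resource_types=RESOURCE_TYPES):
    """Return the $import request body with one input per resource type."""
    prefix = base_url.rstrip("/")
    return {
        "id": import_id,
        "inputFormat": "application/fhir+ndjson",
        "contentEncoding": "gzip",
        "mode": "bulk",
        "inputs": [
            {"resourceType": rt, "url": f"{prefix}/{rt}.ndjson.gz"}
            for rt in resource_types
        ],
    }


body = build_import_body(
    "synthea", "https://storage.googleapis.com/aidbox-public/synthea/100"
)
print(json.dumps(body, indent=2))
```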
The operation returns 200 immediately, and you can monitor the import status with:
GET /BulkImportStatus/synthea
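Because the import runs in the background, a script usually has to poll this endpoint until the work is done. The loop below is a sketch under stated assumptions: it expects the status resource to carry a top-level `status` field, and the terminal values `finished` and `failed` (as well as the `in-progress` value in the demo) are guesses that should be checked against the responses of your Aidbox version. The HTTP call is injected as a function, so the loop itself stays independent of any client library.

```python
import time


def wait_for_import(fetch_status, terminal=("finished", "failed"),
                    interval=2.0, max_polls=150):
    """Poll until the status resource reports a terminal state, then return it.

    fetch_status: callable returning the parsed BulkImportStatus resource as a
                  dict, e.g. a wrapper around GET /BulkImportStatus/synthea.
    terminal:     ASSUMED terminal values of the top-level `status` field.
    """
    for attempt in range(max_polls):
        resource = fetch_status()
        if resource.get("status") in terminal:
            return resource
        if attempt < max_polls - 1:
            time.sleep(interval)
    raise TimeoutError(f"import did not finish after {max_polls} polls")


# Stand-in for a real HTTP call, used here only to demonstrate the loop.
responses = iter([{"status": "in-progress"}, {"status": "finished"}])
result = wait_for_import(lambda: next(responses), interval=0)
print(result["status"])
```

Bounding the number of polls keeps a stuck import from blocking the script forever.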