Aidbox User Docs
Run Aidbox locallyRun Aidbox in SandboxTalk to us Ask community
  • Aidbox FHIR platform documentation
    • Features
    • Architecture
  • Getting Started
    • Run Aidbox in Sandbox
    • Run Aidbox locally
    • Run Aidbox on AWS
    • Upload Sample Data
  • Tutorials
    • CRUD, Search Tutorials
      • Delete data
      • Set up uniqueness in Resource
      • Search Tutorials
        • Custom SearchParameter tutorial
        • Create custom Aidbox Search resource
        • Multilingual search tutorial
        • Migrate from Aidbox SearchParameter to FHIR SearchParameter
        • Change sort order by locale collation
    • Bulk API Tutorials
      • 🎓Synthea by Bulk API
      • 🎓$dump-sql tutorial
    • Security & Access Control Tutorials
      • Allow patients to see their own data
      • Restrict operations on resource type
      • Relationship-based access control
      • Creating user & set up full user access
      • Restricting Access to Patient Data
      • Create and test access control
      • RBAC
        • Flexible RBAC built-in to Aidbox
        • RBAC with JWT containing role
        • RBAC with ACL
      • Set-up token introspection
      • Prohibit user to login
      • Managing Admin Access to the Aidbox UI Using Okta Groups
      • Run Multibox locally
      • How to enable labels-based access control
      • How to enable patient data access API
      • How to enable SMART on FHIR on Patient Access API
      • How to enable hierarchical access control
      • How to configure Audit Log
      • How is an HTTP request processed in Aidbox
      • How to configure SSO with another Aidbox instance to access Aidbox UI
      • How to configure SSO with Okta to access Aidbox UI
      • How to configure sign-in with Apple for access to the Aidbox UI
      • How to configure Azure AD SSO for access to the Aidbox UI
      • How to configure Microsoft AD FS for access to the Aidbox UI
      • How to configure Azure AD SSO with certificate authentication for access to the Aidbox UI
      • How to configure GitHub SSO for access to Aidbox UI
      • How to configure Keycloak for access for AidboxUI
      • How to implement Consent-based Access Control using FHIR Search and Aidbox Access Policy
      • Debug Access Control
      • AccessPolicy best practices
    • Terminology Tutorials
      • Load ICD-10 terminology into Aidbox
      • Uploading IG terminology content to external FHIR terminology server
    • Validation Tutorials
      • Upload FHIR Implementation Guide
        • Environment Variable
        • Aidbox UI
          • IG Package from Aidbox Registry
          • Public URL to IG Package
          • Local IG Package
        • Aidbox FHIR API
        • UploadFIG Tool
      • ISiK
      • Carin BB
      • US Core
      • Davinci Pdex
      • mCode
    • Integration Toolkit Tutorials
      • Postmark integration tutorial
      • Mailgun integration tutorial
    • Subscriptions Tutorials
      • AidboxTopicSubscription NATS tutorial
    • Other tutorials
      • Run Aidbox with FHIR R6
      • Migrate from Multibox to Aidbox
      • SDC with Custom Resources
      • How to create FHIR NPM package
      • Migrate from legacy licence portal to Aidbox portal
      • How to run Aidbox in GCP Cloud Run
  • Overview
    • Licensing and Support
    • Aidbox user portal
      • Projects
      • Licenses
      • Members
    • Aidbox UI
      • Aidbox Notebooks
      • REST Console
      • Database Console
      • Attrs stats
      • DB Tables
      • DB Queries
    • Versioning
    • Release Notes
    • Contact us
  • Configuration
    • Settings
    • Configure Aidbox and Multibox
    • Init Bundle
  • API
    • REST API
      • CRUD
        • Create
        • Read
        • Update
        • Patch
        • Delete
      • FHIR Search
        • SearchParameter
        • Include and Revinclude
        • Chaining
      • Aidbox Search
      • Bundle
      • History
      • $everything on Patient
      • Other
        • Aidbox & FHIR formats
        • Capability Statement
        • $document
        • Observation/$lastn
        • $validate
        • SQL endpoints
        • $matcho
        • $to-format
        • Aidbox version
        • Health check
    • Bulk API
      • Configure Access Policies for Bulk API
      • $dump
      • $dump-sql
      • $dump-csv
      • $export
      • $load & /fhir/$load
      • $import & /fhir/$import
      • aidbox.bulk data import
      • Bulk import from an S3 bucket
    • Batch/Transaction
    • GraphQL API
    • Other APIs
      • Plan API
        • Provider Directory API
          • Practitioner
          • PractitionerRole
          • Organization
          • OrganizationAffiliation
        • Plan API Overview
      • Archive/Restore API
        • create-archive
        • restore-archive
        • prune-archived-data
        • delete-archive
      • ETAG support
      • Cache
      • Changes API
      • RPC API
      • Sequence API
      • Encryption API
      • Batch Upsert
  • Modules
    • Profiling and validation
      • FHIR Schema Validator
        • Aidbox FHIR IGs Registry
        • Setup Aidbox with FHIR Schema validation engine
      • Skip validation of references in resource using request header
      • Asynchronous resource validation
    • Access Control
      • Identity Management
        • User Management
        • Application/Client Management
      • Authentication
        • Basic HTTP Authentication
        • OAuth 2.0
        • Token Introspector
        • SSO with External Identity Provider
      • Authorization
        • Access Policies
        • SMART on FHIR
          • SMART Client Authorization
            • SMART App Launch
            • SMART Backend services
          • SMART Client Authentication
            • SMART: Asymmetric (/"private key JWT") authentication
            • SMART: Symmetric (/"client secret") authentication
          • SMART: Scopes for Limiting Access
          • Pass Inferno tests with Aidbox
          • Example: SMART App Launch using Aidbox and Keycloak
          • Example: SMART App Launch using Smartbox and Keycloak
        • Scoped API
          • Organization-based hierarchical access control
          • Compartments API
          • Patient data access API
        • Label-based Access Control
      • Audit & Logging
    • Observability
      • Getting started
        • Run Aidbox with OpenTelemetry locally
        • How to export telemetry to the OTEL collector
      • Logs
        • How-to guides
          • OpenTelemetry logs
          • Elastic Logs and Monitoring Integration
          • Datadog Log management integration
          • Loki Log management integration
        • Tutorials
          • Log analysis and visualization tutorial
          • Export logs to Datadog tutorial
        • Extending Aidbox Logs
        • Technical reference
          • Log appenders
          • Log transformations
          • Log Schema
          • OTEL logs exporter parameters
      • Metrics
        • How-to guides
          • How to export metrics to the OTEL collector
          • Use Aidbox Metrics Server
          • Set-up Grafana integration
        • Technical reference
          • OpenTelemetry Metrics
          • OTEL metrics exporter parameters
      • Traces
        • How to use tracing
        • OTEL traces exporter parameters
    • Subscriptions
      • Aidbox topic-based subscriptions
        • Kafka AidboxTopicDestination
        • Webhook AidboxTopicDestination
        • GCP Pub/Sub AidboxTopicDestination
        • Tutorial: produce QuestionnaireResponse to Kafka topic
      • Aidbox SubSubscriptions
    • Aidbox Forms
      • Getting started
      • Aidbox Forms Interface
      • Aidbox UI Builder
        • UI Builder Interface
        • Form creation
          • Form Settings
          • Widgets
          • Components
          • Versioning
          • Form customisation in Theme Editor
          • Form signature
          • How-to guides
            • How to: populate forms with data
            • How to extract data from forms
            • How to calculate form filling percentage
          • Multilingual forms
          • FHIRPath Editor
        • Import Questionnaire
        • Form sharing
        • Printing forms
          • Template-based PDF generation
        • FHIR versions
        • Offline forms
        • Embedding
          • Request Interception
        • Configuration
        • Forms multitenancy
        • Building reports using SQL on FHIR
        • Integration with external terminology servers
        • External FHIR servers as a data backend
        • Store attachments in S3-like storages
      • Access Control in Forms
      • Audit Logging in Forms
      • Aidbox Form Gallery
    • Define extensions
      • Extensions using StructureDefinition
      • Extensions using FHIRSchema
    • Custom Resources
      • Custom resources using FHIR Schema
      • Custom resources using StructureDefinition
      • Migrate to FHIR Schema
        • Migrate custom resources defined with Entity & Attributes to FHIR Schema
        • Migrate custom resources defined with Zen to FHIR Schema
    • Aidbox terminology module
      • Concept
        • $translate-concepts
        • Handling hierarchies using ancestors
      • ValueSet
        • ValueSet Expansion
        • ValueSet Code Validation
        • Create a ValueSet
      • CodeSystem
        • CodeSystem Concept Lookup
        • CodeSystem Subsumption testing
        • CodeSystem Code Composition
      • Import external terminologies
        • Import flat file (/CSV)
        • $import operation
        • Ready-to-use terminologies
      • $translate on ConceptMap
    • SQL on FHIR
      • Defining flat views with View Definitions
      • Query data from flat views
      • Reference
    • Integration toolkit
      • C-CDA / FHIR Converter
        • List of supported templates
          • Admission Diagnosis Section (/V3)
          • Advance Directives Section (/entries optional) (/V3)
          • Advance Directives Section (/entries required) (/V3)
          • Allergies and Intolerances Section (/entries optional) (/V3)
          • Allergies and Intolerances Section (/entries required) (/V3)
          • Assessment Section
          • Chief Complaint Section
          • Chief Complaint and Reason for Visit Section
          • Complications Section (/V3)
          • Course of Care Section
          • DICOM Object Catalog Section - DCM 121181
          • Default Section Rules
          • Discharge Diagnosis Section (/V3)
          • Document Header
          • Encounters Section (/entries optional) (/V3)
          • Encounters Section (/entries required) (/V3)
          • Family History Section (/V3)
          • Functional Status Section (/V2)
          • General Status Section
          • Goals Section
          • Health Concerns Section (/V2)
          • History of Present Illness Section
          • Hospital Consultations Section
          • Hospital Course Section
          • Hospital Discharge Instructions Section
          • Hospital Discharge Physical Section
          • Hospital Discharge Studies Summary Section
          • Immunizations Section (/entries optional) (/V3)
          • Immunizations Section (/entries required) (/V3)
          • Medical (/General) History Section
          • Medical Equipment Section (/V2)
          • Medications Administered Section (/V2)
          • Medications Section (/entries optional) (/V2)
          • Medications Section (/entries required) (/V2)
          • Mental Status Section (/V2)
          • Notes
          • Nutrition Section
          • Objective Section
          • Operative Note Fluids Section
          • Operative Note Surgical Procedure Section
          • Past Medical History (/V3)
          • Payers Section (/V3)
          • Plan of Treatment Section (/V2)
          • Postprocedure Diagnosis Section (/V3)
          • Preoperative Diagnosis Section (/V3)
          • Problem Section (/entries optional) (/V3)
          • Problem Section (/entries required) (/V3)
          • Procedure Description Section
          • Procedure Disposition Section
          • Procedure Estimated Blood Loss Section
          • Procedure Implants Section
          • Procedure Specimens Taken Section
          • Procedures Section (/entries optional) (/V2)
          • Procedures Section (/entries required) (/V2)
          • Reason for Visit Section
          • Results Section (/entries optional) (/V3)
          • Results Section (/entries required) (/V3)
          • Review of Systems Section
          • Social History Section (/V3)
          • Vital Signs Section (/entries optional) (/V3)
          • Vital Signs Section (/entries required) (/V3)
        • How to deploy the service
        • Producing C-CDA documents
        • How to customize conversion rules
      • HL7 v2 Integration
        • HL7 v2 integration with Aidbox Project
        • Mappings with lisp/mapping
      • X12 message converter
      • Analytics
        • Power BI
      • Mappings
      • Email Providers integration
        • Setup SMTP provider
    • SMARTbox | FHIR API for EHRs
      • Get started
        • Set up Smartbox locally
        • Deploy Smartbox with Kubernetes
      • (/g)(/10) Standardized API for patient and population services
      • The B11 Decision Support Interventions
        • Source attributes
        • Feedback Sections
      • How-to guides
        • Pass Inferno tests with Smartbox
        • Perform EHR launch
        • Pass Inferno Visual Inspection and Attestation
        • Revoke granted access
        • Set up EHR-level customization
        • Check email templates
        • Setup email provider
        • Register users
        • Set up SSO with Auth0
        • Publish Terms of Use link onto the documentation page
        • Find out what resources were exported during the $export operation
        • Find documentation endpoint
      • Background information
        • Considerations for Testing with Inferno ONC
        • Adding Clients for Inferno tests
        • Multitenancy approach
        • What is Tenant
        • Email templating
    • ePrescription
      • Getting started
      • Authentication with mTLS
      • Pharmacies synchronization
      • Prescribing
        • NewRx Message
        • CancelRx Message
        • How to test Callback
      • Directory
        • DirectoryDownload Message
        • GetProviderLocation Message
        • AddProviderLocation Message
        • UpdateProviderLocation Message
        • DisableProviderLocation Message
      • Medications
        • FDB
      • References
        • Environment Variables
      • Frequently Asked Questions
    • Other modules
      • MDM
        • Train model
        • Configure MDM module
        • Find duplicates: $match
        • Mathematical details
      • MCP
  • Database
    • Overview
    • Database schema
    • PostgreSQL Extensions
    • AidboxDB
      • HA AidboxDB
    • Tutorials
      • Migrate to AidboxDB 16
      • Working with pgAgent
  • File storage
    • AWS S3
    • GCP Cloud Storage
    • Azure Blob Storage
    • Oracle Cloud Storage
  • Deployment and maintenance
    • Deploy Aidbox
      • Run Aidbox on Kubernetes
        • Deploy Production-ready Aidbox to Kubernetes
        • Deploy Aidbox with Helm Charts
        • Highly Available Aidbox
        • Self-signed SSL certificates
      • Run Aidbox on managed PostgreSQL
      • How to inject env variables into Init Bundle
    • Backup and Restore
      • Crunchy Operator (/pgBackRest)
      • pg_dump
      • pg_basebackup
      • WAL-G
    • Indexes
      • Get suggested indexes
      • Create indexes manually
  • App development
    • Use Aidbox with React
    • Aidbox SDK
      • Aidbox JavaScript SDK
      • Apps
      • NodeJs SDK
      • Python SDK
    • Examples
  • Reference
    • Matcho DSL reference
    • FHIR Schema reference
    • Settings reference
      • General
      • FHIR
      • Security & Access Control
      • Modules
      • Database
      • Web Server
      • Observability
      • Zen Project
    • Environment variables
      • Aidbox required environment variables
      • Optional environment variables
      • AidboxDB environment variables
    • System resources reference
      • IAM Module Resources
      • SDC Module Resources
      • Base Module Resources
      • Bulk Module Resources
      • AWF Module Resources
      • Cloud Module Resources
      • HL7v2 Module Resources
      • SQL on FHIR Module Resources
    • Email Providers reference
      • Notification resource reference
      • Mailgun environment variables
      • Postmark environment variables
    • Aidbox Forms reference
      • FHIR SDC API
      • Aidbox SDC API
      • Generating Questionnaire from PDF API
    • Aidbox SQL functions
  • Deprecated
    • Deprecated
      • Zen-related
        • RPC reference
          • aidbox
            • mdm
              • aidbox.mdm/update-mdm-tables
              • aidbox.mdm/match
        • FTR
        • Aidbox configuration project
          • Run Aidbox locally using Aidbox Configuraiton project
          • Aidbox configuration project structure
          • Set up and use configuration projects
          • Enable IGs
          • Repository
          • Seed Import
          • Manage Indexes in Zen Project
          • Seed v2
          • 🎓Migrate to git Aidbox Configuration Projects
          • Aidbox Configuration project reference
            • Zen Configuration
            • Aidbox project RPC reference
            • aidbox.config/config
          • Custom resources using Aidbox Project
          • First-Class Extensions using Zen
          • Zen Indexes
        • US Core IG
          • US Core IG support reference
        • Workflow Engine
          • Task
            • Aidbox Built-in Tasks
            • Task Executor API
            • Task User API
          • Workflow
            • Workflow User API
          • Services
          • Monitoring
        • FHIR conformance Deprecated guides
          • Touchstone FHIR 4.0.1 basic server
          • Touchstone FHIR USCore ClinData
          • How to enable US Core IG
            • Start Aidbox locally with US Core IG enabled
            • Add US Core IG to a running Aidbox instance
          • HL7 FHIR Da Vinci PDex Plan Net IG
        • Terminology Deprecated Tutorials
          • Inferno Test-Suite US Core 3.1.1
        • API constructor (/beta)
        • zen-lang validator
          • Write a custom zen profile
          • Load zen profiles into Aidbox
        • FHIR topic-based subscriptions
          • Set up SubscriptionTopic
          • Tutorial: Subscribe to Topic (/R4B)
          • API Reference
            • Subscription API
        • 🏗️FHIR Terminology Repository
          • FTR Specification
          • Create an FTR instance
            • FTR from CSV
            • FTR from FHIR IG
            • FTR from FTR — Direct Dependency
            • FTR from FTR — Supplement
          • FTR Manifest
          • Load SNOMED CT into Aidbox
          • Load LOINC into Aidbox
          • Load ICD-10-CM into Aidbox
          • Load RxNorm into Aidbox
          • Load US VSAC Package to Aidbox
          • Import via FTR
        • Zen Search Parameters
      • Entity / Attribute
        • Entities & Attributes
        • First-Class Extensions using Attribute
        • Custom Resources using Entity
        • Working with Extensions
        • Aidbox Search Parameters
      • Forms
      • Other
        • Custom Search
        • SearchQuery
        • Subscribe to new Patient resource
        • App Development Deprecated Tutorials
          • Receive logs from your app
            • X-Audit header
          • Patient Encounter notification Application
        • Other Deprecated Tutorials
          • Resource generation with map-to-fhir-bundle-task and subscription triggers
          • APM Aidbox
          • Automatically archive AuditEvent resources in GCP storage guide
          • HL7 v2 pipeline with Patient mapping
          • How to migrate to Apline Linux
          • How to migrate transaction id to bigint
          • How to fix broken dates
          • Configure multi-tenancy
        • AidboxProfile
        • GCP Pub/Sub
Powered by GitBook
On this page
  • aidbox.bulk/load-from-bucket
  • Files content and naming requirement
  • Parameters
  • aidbox.bulk/load-from-bucket-status

Was this helpful?

Edit on GitHub
  1. API
  2. Bulk API

Bulk import from an S3 bucket

Load data from files in an existing Amazon Simple Storage Service (Amazon S3) with maximum performance.

Previousaidbox.bulk data importNextBatch/Transaction

Last updated 22 days ago

Was this helpful?

If you want to import S3 objects with a presigned url, refer to .

aidbox.bulk/load-from-bucket

It allows loading data from a bunch of .ndjson.gz files on an AWS bucket directly to the Aidbox database with maximum performance.

Be careful You should run only one replica of Aidbox to use aidbox.bulk/load-from-bucket operation.

Files content and naming requirement

  1. The file must consist of Resources of the same type.

  2. The file name should start with a name of the Resource type, then some postfix is possible, and extension .ndjson is required. Files can be placed in subdirectories of any level. Files with the wrong path structure will be ignored.

  3. Every resource in .ndjson files MUST contain id property.

Resource requirements for aidbox.bulk/load-from-bucket:

Operation
id
resourceType

aidbox.bulk/load-from-bucket

Required

Not required

Valid file structure example:

fhir/1/Patient.ndjson.gz
fhir/1/patient-01.ndjson.gz
Observation.ndjson.gz

Invalid file structure example:

import.ndjson
01-patient.ndjson.gz
fhir/Patient

Parameters

Object with the following structure:

  • bucket * defines your bucket connection string in formats3://<bucket-name>

  • thread-num defines how many threads will process the import. The default is 4.

  • account credential:

    • access-key-id * AWS key ID

    • secret-access-key * AWS secret key

    • region * AWS Bucket region

  • disable-idx? the default is false. Allows to drop all indexes for resources, which data are going to be loaded. Indexes will be restored at the end of successful import. All information about dropped indexes is stored at DisabledIndex resources.

  • drop-primary-key? the default is false. The same as the previous parameter, but drops primary key constraint for resources tables. This parameter disables all checks for duplicates for imported resources.

  • upsert? the default is false. If upsert? is false, import for files with id uniqueness constraint violation will fail with an error, if true - records in the database will be overridden with records from import. Even when upsert? is true, it's still not allowed to have more than one record with the same id in one import file. Setting this option to true will cause a decrease in performance.

  • scheduler possible values: optimal , by-last-modified, the default is optimal . Establishes the order in which the files are processed. The optimal value provides the best performance. by-last-modified should be used with thread-num = 1 to guarantee a stable order of file processing.

  • prefixes array of prefixes to specify which files should be processed. Example: with value ["fhir/1/", "fhir/2/Patient"] only files from directory "fhir/1" and Patient files from directory "fhir/2" will be processed.

  • connect-timeout the default is 0. Specifies the number of milliseconds after which the file is considered as failed if connection to the resource could not be established. (e.g. in case of network issues). Zero is interpreted as an infinite timeout.

  • read-timeout the default is 0. Specifies the number of milliseconds after which the file is considered as failed if there is no data available to read (e.g. in case of network issues). Zero is interpreted as an infinite timeout.

Returns the string "Upload started"

Returns error message

Example

POST /rpc
content-type: text/yaml
accept: text/yaml

method: aidbox.bulk/load-from-bucket
params:
  bucket: s3://your-bucket-id
  thread-num: 4
  account:
    access-key-id: your-key-id
    secret-access-key: your-secret-access-key
    region: us-east-1
Status: 200
result:
  message: Upload from bucket <s3://your-bucket-id> started. 6 new files added.
  progress:
    total: 6
  new-files-count: 6

Loader File

For each file being imported via load-from-bucket method, Aidbox creates LoaderFile resource. To find out how many resources were imported from a file, check the loaded field.

Loader File Example

{
  "end": "2022-04-11T14:50:27.893Z",
  "file": "/tmp/patient.ndjson.gz",
  "size": 100,
  "type": "Patient",
  "bucket": "local",
  "loaded": 20,
  "status": "done"
}
{
  "end": "2022-04-11T14:50:27.893Z",
  "file": "/tmp/patient.ndjson.gz",
  "size": 100,
  "type": "Patient",
  "bucket": "local",
  "status": "error",
  "error": {
    "code": "23505",
    "source": "postgres"
  },
  "message": "23505: ERROR: duplicate key value violates unique constraint \"patient_pkey\""
}

Sources of Error

There are the following sources of error for this request.

  • AWS Error

  • PostgreSQL Error

  • Aidbox Error\

AWS Error

Code
Description

InvalidAccount

The AWS access key ID or AWS secret access key that you provided is not valid.

NoSuchKey

The specified S3 bucket or S3 object key does not exist.

\

PostgreSQL Error

Aidbox Error

Any other errors than the above can be caught as Aidbox Error. An error message will be provided if available.\

How to reload a file one more time

On launch aidbox.bulk/load-from-bucket checks if files from the bucket were planned to import and decides what to do:

  • If ndjson.gz file has it's related LoaderFile resource, the loader skips this file from import

  • If there is no related LoaderFile resource, Aidbox puts this file to the queue creating a LoaderFile resource

In order to import a file one more time you should delete related LoaderFile resource and relaunch aidbox.bulk/load-from-bucket.

Files are processed completely. The loader doesn't support partial re-import.\

AWS User Policy: Minimal Example

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "MinimalUserPolicyForBulkImport",
      "Effect": "Allow",
      "Action": [
        "s3:ListBucket",
        "s3:GetObject"
      ],
      "Resource": [
        "arn:aws:s3:::<your-bucket-name>",
        "arn:aws:s3:::<your-bucket-name>/*"
      ]
    }
  ]
}

\

aidbox.bulk/load-from-bucket-status

Returns status and progress of import for specified bucket. Possible states are: in-progress, completed, interrupted.

State interrupted means that aidbox was restarted during the loading process. If you run aidbox.bulk/load-from-bucket operation again on the same bucket, it will be continued.

Example

POST /rpc
content-type: text/yaml
accept: text/yaml

method: aidbox.bulk/load-from-bucket-status
params:
  bucket: s3://your-bucket-id
Status: 200
result:
  state: in-progress
  progress:
    total: 6
    pending: 2
    done: 4

See .\

aidbox.bulk data import
Documentation of PostgreSQL