Use Aidbox Metrics Server

This guide explains how to set up and use the Aidbox Metrics Server.

Set up the environment variable

Define the BOX_METRICS_PORT environment variable with the monitoring server port number.

Start metrics server

Aidbox starts the monitoring server automatically on startup.

To check that the monitoring server works, make a GET <AIDBOX_BASE_URL>:<BOX_METRICS_PORT> request. The response should be the string "aidbox metrics".
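The check described above can be sketched in Python with the standard library alone. The function name `metrics_server_is_up` is hypothetical, and the sketch assumes the response body is exactly the string "aidbox metrics" apart from surrounding whitespace.

```python
import urllib.request


def metrics_server_is_up(base_url: str, port: int, timeout: float = 5.0) -> bool:
    """Return True if GET <base_url>:<port> answers with "aidbox metrics"."""
    url = f"{base_url}:{port}"
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            body = response.read().decode("utf-8", errors="replace")
    except OSError:
        # URLError subclasses OSError, so this covers refused
        # connections, HTTP error statuses, and timeouts.
        return False
    # Assumption: the body matches the documented string exactly.
    return body.strip() == "aidbox metrics"
```

For example, `metrics_server_is_up("http://localhost", 9999)` checks a local instance; both the host and the port here are placeholders for your own <AIDBOX_BASE_URL> and <BOX_METRICS_PORT>.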

Metrics server endpoints

Aidbox collects and exposes three types of metrics, which differ in update frequency. All endpoints are served on a separate port, e.g. GET <AIDBOX_BASE_URL>:<BOX_METRICS_PORT>/metrics.

| Endpoint | Update frequency |
| --- | --- |
| GET /metrics | continuous |
| GET /metrics/minutes | every minute |
| GET /metrics/hours | every hour |

The /metrics/hours response can take some time since it collects a lot of information from the database. Make sure your metrics scraper timeout is sufficient.

Example Prometheus scrape configuration

global:
  # omitted global configuration values
  external_labels:
    monitor: 'aidbox'
scrape_configs:
  # omitted default scrapers configuration
  
  - job_name: aidbox
    honor_labels: true
    scrape_interval: 10s
    metrics_path: /metrics
    static_configs:
      - targets: [ 'aidbox.example.com:9999' ]  # should be <AIDBOX_BASE_URL>:<BOX_METRICS_PORT>

  - job_name: aidbox-minutes
    honor_labels: true
    scrape_interval: 1m
    metrics_path: /metrics/minutes
    static_configs:
      - targets: [ 'aidbox.example.com:9999' ]  # should be <AIDBOX_BASE_URL>:<BOX_METRICS_PORT>

  - job_name: aidbox-hours
    honor_labels: true
    scrape_interval: 10m
    scrape_timeout: 30s                         # increased timeout
    metrics_path: /metrics/hours
    static_configs:
      - targets: [ 'aidbox.example.com:9999' ]  # should be <AIDBOX_BASE_URL>:<BOX_METRICS_PORT>

Collected metrics

HTTP

| Metric | Update frequency | Description |
| --- | --- | --- |
| aidbox_http_request_duration_seconds_bucket | continuous | request duration, as cumulative bucket counters |
| aidbox_http_request_duration_seconds_count | continuous | count of request duration events |
| aidbox_http_request_duration_seconds_sum | continuous | sum of request duration event values |
| aidbox_http_request_wait_seconds_bucket | continuous | queue waiting time, as cumulative bucket counters |
| aidbox_http_request_wait_seconds_count | continuous | count of queue waiting time events |
| aidbox_http_request_wait_seconds_sum | continuous | sum of queue waiting time event values |
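The _sum and _count series can be combined into a mean request duration. The sketch below parses Prometheus text exposition output and divides one by the other. The sample payload and its `method` label are invented for illustration, and `parse_samples` is a simplified, hypothetical helper that adds values across label sets; it is not a complete parser of the format.

```python
from collections import defaultdict
from typing import Dict, Optional


def parse_samples(text: str) -> Dict[str, float]:
    """Sum sample values per metric name, ignoring labels and comments."""
    totals: Dict[str, float] = defaultdict(float)
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and HELP/TYPE comments
        if "{" in line:
            # "<name>{labels} <value> [timestamp]"
            name = line.split("{", 1)[0]
            value = line.rsplit("}", 1)[1].split()[0]
        else:
            # "<name> <value> [timestamp]"
            name, value = line.split()[:2]
        totals[name] += float(value)
    return totals


def mean_request_duration(text: str) -> Optional[float]:
    """Mean request duration in seconds, or None if nothing was recorded."""
    samples = parse_samples(text)
    count = samples.get("aidbox_http_request_duration_seconds_count", 0.0)
    if count == 0:
        return None
    return samples.get("aidbox_http_request_duration_seconds_sum", 0.0) / count


# Invented sample payload; real output has different labels and values.
SAMPLE = """\
# TYPE aidbox_http_request_duration_seconds histogram
aidbox_http_request_duration_seconds_sum{method="GET"} 1.5
aidbox_http_request_duration_seconds_count{method="GET"} 6
aidbox_http_request_duration_seconds_sum{method="POST"} 0.5
aidbox_http_request_duration_seconds_count{method="POST"} 2
"""

print(mean_request_duration(SAMPLE))  # 0.25
```

Assuming these series behave like standard Prometheus histogram counters, the result covers everything recorded since the counters started. To get a mean for a time window, subtract the values of an earlier scrape from a later one before dividing.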

Postgres

| Metric | Update frequency | Description |
| --- | --- | --- |
| pg_requests_total | continuous | number of executed select statements |
| pg_inserts_total | continuous | number of executed insert statements |
| pg_updates_total | continuous | number of executed update statements |
| pg_deletes_total | continuous | number of executed delete statements |
| pg_blks_hit | continuous | number of shared block cache hits |
| pg_blks_read | continuous | number of shared blocks read |
| pg_tup_fetched | continuous | number of fetched tuples |
| pg_tup_returned | continuous | number of returned tuples |
| pg_errors_total | continuous | number of errors |
| pg_activity_count | continuous | number of PG workers |
| pg_idx_scan | every minute | number of index scans |
| pg_seq_scan | every minute | number of sequential scans |
| pg_stat_statements_total_calls | every minute | number of times a statement was executed |
| pg_stat_statements_stddev_execution_time | every minute | standard deviation of statement execution time |
| pg_stat_statements_mean_execution_time | every minute | mean statement execution time |
| pg_table_size | every hour | table size |
| pg_database_size | every hour | database size |
| pg_activity_max | every hour | maximum number of connections |
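One common use of the block counters is a buffer cache hit ratio. The sketch below assumes that pg_blks_hit and pg_blks_read mirror the blks_hit and blks_read columns of PostgreSQL's pg_stat_database view, so that reads count blocks not found in the shared buffer cache. The function name `cache_hit_ratio` is hypothetical.

```python
from typing import Optional


def cache_hit_ratio(blks_hit: float, blks_read: float) -> Optional[float]:
    """Share of block requests served from the shared buffer cache."""
    total = blks_hit + blks_read
    if total == 0:
        return None  # no block activity recorded yet
    return blks_hit / total


# Illustrative numbers only, not real Aidbox output.
print(cache_hit_ratio(blks_hit=990, blks_read=10))  # 0.99
```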

Hikari (Postgres connection pool)

| Metric | Update frequency | Description |
| --- | --- | --- |
| hikari_active_count | continuous | number of active connections |
| hikari_idle_count | continuous | number of idle connections |
| hikari_acquire_created_seconds_bucket | continuous | time taken to create an actual physical connection |
| hikari_acquire_created_seconds_count | continuous | number of created physical connections |
| hikari_acquire_created_seconds_sum | continuous | total amount of time to create all physical connections |
| hikari_acquired_total | continuous | number of obtained connections |
| hikari_acquire_wait_seconds_bucket | continuous | time taken to obtain a connection |
| hikari_acquire_wait_seconds_sum | continuous | total amount of time to obtain all connections |
| hikari_acquire_used_seconds_bucket | continuous | time consumed by a connection |
| hikari_max_size | every hour | maximum number of connections |
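As a rough illustration, the active connection count can be compared with the pool's maximum size to see how close the pool is to saturation. Note that the two values come from different endpoints: hikari_active_count is updated continuously, while hikari_max_size is reported hourly. The function name `pool_utilization` is hypothetical.

```python
from typing import Optional


def pool_utilization(active_count: float, max_size: float) -> Optional[float]:
    """Fraction of the pool's maximum size that is currently in use."""
    if max_size <= 0:
        return None  # pool size unknown or not yet reported
    return active_count / max_size


# Illustrative numbers only, not real Aidbox output.
print(pool_utilization(active_count=8, max_size=10))  # 0.8
```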

JVM

| Metric | Update frequency | Description |
| --- | --- | --- |
| jvm_gc_time | continuous | garbage collector execution time |
| jvm_gc_count | continuous | number of garbage collector runs |
| jvm_heap_memory | continuous | heap memory usage |
| jvm_non_heap_memory | continuous | non-heap memory usage |
| jvm_thread_count | continuous | number of live threads, including both daemon and non-daemon threads |
| jvm_thread_peak_count | continuous | peak live thread count |
| jvm_thread_daemon_count | continuous | number of daemon threads |
| jvm_available_processors_size | every hour | number of processors available to the JVM |
| jvm_max_memory_size | every hour | maximum amount of memory that the JVM will attempt to use |
| jvm_total_memory_size | every hour | total amount of memory in the JVM |

Disable PostgreSQL metrics

If you already use a separate PostgreSQL exporter, you can disable the Aidbox PostgreSQL metrics to avoid duplicate metrics.

In this case, set BOX_METRICS_POSTGRES_ON to false.
