# Monitoring & Observability

Concourse Server publishes real-time operational metrics that let you observe server health, diagnose performance issues, and track resource utilization without any third-party tooling. Metrics are exposed through four complementary channels:

| Channel | Best for | How to consume |
|---|---|---|
| `concourse monitor` CLI | Ad-hoc inspection, watch mode, scripts | Shell |
| JMX MBeans | Rich attribute inspection, in-process monitoring | JConsole, VisualVM, JMX clients |
| Prometheus endpoint | Scraping into Prometheus / Grafana / GKE Managed Prometheus | HTTP |
| OpenTelemetry (OTLP) push | Datadog, New Relic, OTel Collector | gRPC or HTTP |

All four channels read from the same underlying set of JMX MBeans. The Prometheus and OpenTelemetry channels are driven by the `jmx_prometheus_javaagent` attached at JVM startup, which translates JMX attributes into Prometheus or OTLP metrics.


## concourse monitor CLI

`concourse monitor <subcommand>` renders a live snapshot of one slice of server state. All subcommands accept the flags below:

| Flag | Default | Description |
|---|---|---|
| `--jmx-port` | `9010` | JMX port used to connect to the server |
| `--json` | off | Emit JSON instead of a formatted dashboard |
| `--watch` | off | Continuously refresh; prints per-interval deltas for counter-style metrics |
| `--interval` | `2` | Refresh period in seconds (used with `--watch`) |
| `-e` / `--environment` | default env | Scope per-environment metrics to a specific environment |

### Subcommands

| Subcommand | What it reports |
|---|---|
| `overview` | Aggregated summary of every section |
| `storage` | Segments, disk space, seek mix, cache hit rates |
| `operations` | Per-operation count, average and max latency |
| `transactions` | Start / commit / fail / abort counts |
| `locks` | Active locks, parked threads, wait and hold times |
| `heap` | Heap and non-heap memory usage |
| `gc` | Per-collector collection counts and pause totals |
| `threads` | Live, peak, daemon, total-started thread counts |
| `transport` | Buffer → Database transport progress |
| `compaction` | Compaction progress and queue depth |

### Examples

```shell
# One-shot overview
concourse monitor overview

# Watch operations with a 5-second refresh
concourse monitor operations --watch --interval 5

# Scope metrics to a specific environment
concourse monitor storage -e production

# Machine-readable JSON for a metrics pipeline
concourse monitor locks --json
```

In `--watch` mode, counter-style metrics display per-interval deltas (for example, `TransactionsCommitted 1,204 (+18/s)`), making it easy to see throughput in real time.
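The per-second rate shown in watch mode is just the counter delta divided by the refresh interval. A minimal sketch of that arithmetic (the counter values here are hypothetical):

```shell
# Two consecutive snapshots of a counter-style metric (hypothetical values)
prev=1168
curr=1204
interval=2  # seconds between refreshes, per --interval

# Per-second rate over the interval: (1204 - 1168) / 2 = 18
rate=$(( (curr - prev) / interval ))
echo "TransactionsCommitted ${curr} (+${rate}/s)"
```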


## JMX MBeans

Concourse Server registers two MXBeans under the `com.cinchapi.concourse` domain:

| ObjectName | Scope |
|---|---|
| `com.cinchapi.concourse:type=Server` | Server-wide |
| `com.cinchapi.concourse:type=Engine,environment=<env>` | Per environment |

### `type=Server` — selected attributes

| Attribute | Meaning |
|---|---|
| `Version` | Build version of the running server |
| `ActiveSessions` | Client sessions currently connected |
| `ActiveTransactions` | Transactions currently open |
| `RunningPlugins` | Active plugin processes |
| `EnvironmentCount` | Initialized environments |
| `TransactionsStarted` | Cumulative started transactions |
| `TransactionsCommitted` | Cumulative successful commits |
| `TransactionsFailed` | Cumulative failed commits |
| `TransactionsAborted` | Cumulative client aborts |
| `AtomicCommits` | Cumulative successful atomic commits |
| `AtomicRetries` | Cumulative atomic operation retries |

### `type=Engine,environment=<env>` — key attribute groups

**Locks:** LockCount, RangeLockCount, ParkedThreadCount, ReadLockRequests, WriteLockRequests, FailedTryLocks, ReadLockAvgWaitNanos, WriteLockAvgWaitNanos, ReadLockAvgHoldNanos, WriteLockAvgHoldNanos, MaxReadLockHoldNanos, MaxWriteLockHoldNanos.

**Transport:** TransportCompleted, TransportAvgDurationNanos, TransportInProgress, TimeSinceLastTransportNanos.

**Storage:** SegmentCount, DiskSpaceAvailable, DiskSpaceTotal, MemorySeeks, DiskSeeks, BloomFilterGuards, BloomFilterFalsePositives, plus per-chunk-type variants (Table*, Index*, Corpus*), TotalDataBytes, MinSegmentBytes, MaxSegmentBytes, AvgSegmentBytes, BufferPageCount, BufferAvgWritesPerPage.

**Caches:** IndexCacheHitRate, TableCacheHitRate, CorpusCacheHitRate, and associated hit/miss/eviction counts.

**Operations:** for each of Select, Find, Browse, Add, Remove, Search, Audit, Gather, and Chronicle — {Op}Count, {Op}AvgNanos, {Op}MaxNanos.

**Compaction:** CompactionShiftIndex, CompactionShiftCount, CompactionGarbageQueueSize, CompactionShiftsAttempted, CompactionShiftsSucceeded, CompactionSegmentsGarbageCollected.

The `jmx_port` setting controls which port the JMX RMI connector listens on (default `9010`). You can attach any JMX client — JConsole, VisualVM, jmxterm, or a custom JMX consumer — at `service:jmx:rmi:///jndi/rmi://<host>:<port>/jmxrmi`.
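As a concrete sketch, the connection URL can be assembled like this (the host is a placeholder; the jmxterm invocation is illustrative, assumes a running server, and is therefore left commented out):

```shell
# Hypothetical host; jmx_port defaults to 9010
HOST=localhost
JMX_PORT=9010
JMX_URL="service:jmx:rmi:///jndi/rmi://${HOST}:${JMX_PORT}/jmxrmi"
echo "$JMX_URL"

# Example non-interactive read of one Server attribute with jmxterm
# (requires a live server, so it is not executed here):
# echo "get -b com.cinchapi.concourse:type=Server ActiveSessions" | \
#   java -jar jmxterm-uber.jar -l ${HOST}:${JMX_PORT} -n
```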


## Prometheus

Enable the Prometheus endpoint by setting one option in `concourse.yaml`:

```yaml
monitoring:
  export_metrics: true
  metrics_port: 9091   # optional; defaults to 9091
```

When this is enabled, Concourse Server attaches the `jmx_prometheus_javaagent` at JVM startup. The agent listens on `metrics_port` and exposes the server's JMX MBeans as Prometheus metrics at `http://<host>:<metrics_port>/metrics`.

### Metric names

The agent's translation rules (installed automatically by the server) map MBean attributes to these naming conventions:

- Attributes of `type=Engine,environment=X` ending in `Nanos` become `concourse_engine_*Seconds{environment="X"}` (values converted from nanoseconds to seconds).
- Attributes of `type=Engine,environment=X` ending in `HitRate` become `concourse_engine_*HitRate{environment="X"}` (gauge, 0 to 1).
- Any other `type=Engine,environment=X` attribute becomes `concourse_engine_<attr>{environment="X"}`.
- Any `type=Server` attribute becomes `concourse_server_<attr>`.
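To make the naming rules concrete, here is a small shell sketch that applies them to Engine attributes. It is illustrative only — it is not the agent's actual rewriting code, and it covers just the name mapping, not the nanoseconds-to-seconds conversion of the values:

```shell
# Map an Engine MBean attribute to its Prometheus metric name per the rules above.
# Usage: to_prom <attribute> <environment>
to_prom() {
  local attr=$1 env=$2 name
  case "$attr" in
    *Nanos) name="concourse_engine_${attr%Nanos}Seconds" ;;  # ns metrics exported in seconds
    *)      name="concourse_engine_${attr}" ;;               # HitRate and everything else keep their names
  esac
  echo "${name}{environment=\"${env}\"}"
}

to_prom ReadLockAvgWaitNanos production   # concourse_engine_ReadLockAvgWaitSeconds{environment="production"}
to_prom IndexCacheHitRate production      # concourse_engine_IndexCacheHitRate{environment="production"}
```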

### Prometheus scrape configuration

```yaml
scrape_configs:
  - job_name: concourse
    scrape_interval: 15s
    static_configs:
      - targets:
          - concourse-host-1:9091
          - concourse-host-2:9091
```
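Once scraped, the metrics can drive alerting. A hypothetical Prometheus alerting rule on failed commits — the metric name follows the translation rules above, but the threshold and durations are placeholders to tune for your workload:

```yaml
groups:
  - name: concourse
    rules:
      - alert: ConcourseTransactionFailures
        # Fires if any commits failed over the last 5 minutes, sustained for 10 minutes
        expr: rate(concourse_server_TransactionsFailed[5m]) > 0
        for: 10m
        labels:
          severity: warning
        annotations:
          summary: "Concourse is seeing failed transaction commits"
```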

## OpenTelemetry (OTLP)

For collectors such as the OpenTelemetry Collector, Datadog Agent, New Relic OTLP endpoint, or Grafana Cloud, enable OTLP push from the same metrics pipeline:

```yaml
monitoring:
  export_metrics: true              # required
  enable_opentelemetry: true
  opentelemetry_endpoint: http://otel-collector:4317
  opentelemetry_protocol: grpc      # grpc or http
  opentelemetry_interval: 15        # seconds between pushes
```

Defaults:

| Option | Default |
|---|---|
| `opentelemetry_endpoint` | `http://localhost:4317` |
| `opentelemetry_protocol` | `grpc` (OTLP/gRPC); alternative: `http` (OTLP/HTTP) |
| `opentelemetry_interval` | `15` seconds |

The OTLP exporter uses the same JMX translation rules as the Prometheus endpoint, so the metric names you see in your OTLP backend are the `concourse_engine_*` / `concourse_server_*` families listed above.
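For reference, a minimal OpenTelemetry Collector pipeline that accepts this push looks roughly like the following sketch — the `debug` exporter is only for verifying that metrics arrive; substitute the exporter for your real backend:

```yaml
receivers:
  otlp:
    protocols:
      grpc:
        endpoint: 0.0.0.0:4317   # matches opentelemetry_endpoint above

exporters:
  debug: {}                      # replace with your backend's exporter

service:
  pipelines:
    metrics:
      receivers: [otlp]
      exporters: [debug]
```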

> **Agent jar required:** Both Prometheus and OTLP export require `agents/jmx_prometheus_javaagent.jar` in the server installation. If the agent jar is not present at startup, Concourse prints a warning and continues without metrics export.


## Picking a channel

- **Interactive troubleshooting:** start with `concourse monitor overview` and drill into `locks`, `transport`, or `operations` as needed. Add `--watch` to see a problem unfold in real time.
- **One-off scripts or health checks:** `concourse monitor <subcommand> --json` is easy to parse in shell pipelines.
- **Long-term dashboards and alerting:** enable the Prometheus endpoint and point Grafana (or any PromQL-compatible tool) at it.
- **Centralized metrics across many services:** enable OpenTelemetry push so Concourse participates in the same metrics pipeline as the rest of your stack.

For the underlying configuration keys, see Configuration.