Monitoring & Observability
Concourse Server publishes real-time operational metrics that let you observe server health, diagnose performance issues, and track resource utilization without any third-party tooling. Metrics are exposed through four complementary channels:
| Channel | Best for | How to consume |
|---|---|---|
| concourse monitor CLI | Ad-hoc inspection, watch mode, scripts | Shell |
| JMX MBeans | Rich attribute inspection, in-process monitoring | JConsole, VisualVM, JMX clients |
| Prometheus endpoint | Scraping into Prometheus / Grafana / GKE Managed Prometheus | HTTP |
| OpenTelemetry (OTLP) push | Datadog, New Relic, OTel Collector | gRPC or HTTP |
All four channels read from the same underlying set of JMX MBeans. The Prometheus and OpenTelemetry channels are driven by the jmx_prometheus_javaagent attached at JVM startup, which translates JMX attributes into Prometheus or OTLP metrics.
concourse monitor CLI
concourse monitor <subcommand> renders a live snapshot of one
slice of server state. All subcommands accept the flags below:
| Flag | Default | Description |
|---|---|---|
| --jmx-port | 9010 | JMX port used to connect to the server |
| --json | off | Emit JSON instead of a formatted dashboard |
| --watch | off | Continuously refresh; prints per-interval deltas for counter-style metrics |
| --interval | 2 | Refresh period in seconds (used with --watch) |
| -e / --environment | default env | Scope per-environment metrics to a specific environment |
Subcommands
| Subcommand | What it reports |
|---|---|
| overview | Aggregated summary of every section |
| storage | Segments, disk space, seek mix, cache hit rates |
| operations | Per-operation count, average and max latency |
| transactions | Start / commit / fail / abort counts |
| locks | Active locks, parked threads, wait and hold times |
| heap | Heap and non-heap memory usage |
| gc | Per-collector collection counts and pause totals |
| threads | Live, peak, daemon, total-started thread counts |
| transport | Buffer → Database transport progress |
| compaction | Compaction progress and queue depth |
Examples
In --watch mode, counter-style metrics display per-interval
deltas (for example, TransactionsCommitted 1,204 (+18/s)),
making it easy to see throughput in real time.
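The delta shown in watch mode can be reproduced from two successive readings of a cumulative counter. The sketch below assumes the displayed figure is the increase divided by the refresh interval; the function names and the sample values are illustrative and not part of Concourse.

```python
def per_second_rate(previous: int, current: int, interval_seconds: float) -> float:
    """Average per-second increase of a cumulative counter between two samples."""
    if interval_seconds <= 0:
        raise ValueError("interval_seconds must be positive")
    return (current - previous) / interval_seconds


def format_counter(name: str, previous: int, current: int, interval_seconds: float) -> str:
    """Render a counter as '<name> <total> (+<rate>/s)'."""
    rate = per_second_rate(previous, current, interval_seconds)
    return f"{name} {current:,} ({rate:+.0f}/s)"


# Hypothetical samples taken 2 seconds apart (the default --interval).
print(format_counter("TransactionsCommitted", 1168, 1204, 2))
# prints: TransactionsCommitted 1,204 (+18/s)
```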
JMX MBeans
Concourse Server registers two MXBeans under the
com.cinchapi.concourse domain:
| ObjectName | Scope |
|---|---|
| com.cinchapi.concourse:type=Server | Server-wide |
| com.cinchapi.concourse:type=Engine,environment=<env> | Per environment |
type=Server: selected attributes
| Attribute | Meaning |
|---|---|
| Version | Build version of the running server |
| ActiveSessions | Client sessions currently connected |
| ActiveTransactions | Transactions currently open |
| RunningPlugins | Active plugin processes |
| EnvironmentCount | Initialized environments |
| TransactionsStarted | Cumulative started transactions |
| TransactionsCommitted | Cumulative successful commits |
| TransactionsFailed | Cumulative failed commits |
| TransactionsAborted | Cumulative client aborts |
| AtomicCommits | Cumulative successful atomic commits |
| AtomicRetries | Cumulative atomic operation retries |
type=Engine,environment=<env>: key attribute groups
Locks: LockCount, RangeLockCount, ParkedThreadCount,
ReadLockRequests, WriteLockRequests, FailedTryLocks,
ReadLockAvgWaitNanos, WriteLockAvgWaitNanos,
ReadLockAvgHoldNanos, WriteLockAvgHoldNanos,
MaxReadLockHoldNanos, MaxWriteLockHoldNanos.
Transport: TransportCompleted,
TransportAvgDurationNanos, TransportInProgress,
TimeSinceLastTransportNanos.
Storage: SegmentCount, DiskSpaceAvailable,
DiskSpaceTotal, MemorySeeks, DiskSeeks,
BloomFilterGuards, BloomFilterFalsePositives, plus
per-chunk-type variants (Table*, Index*, Corpus*),
TotalDataBytes, MinSegmentBytes, MaxSegmentBytes,
AvgSegmentBytes, BufferPageCount,
BufferAvgWritesPerPage.
Caches: IndexCacheHitRate, TableCacheHitRate,
CorpusCacheHitRate, and associated hit/miss/eviction
counts.
Operations: for each of Select, Find, Browse, Add, Remove,
Search, Audit, Gather, and Chronicle — {Op}Count,
{Op}AvgNanos, {Op}MaxNanos.
Compaction: CompactionShiftIndex, CompactionShiftCount,
CompactionGarbageQueueSize, CompactionShiftsAttempted,
CompactionShiftsSucceeded,
CompactionSegmentsGarbageCollected.
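The {Op} placeholder in the Operations group expands to one attribute per operation and suffix. As a small sketch, the full set of names a JMX consumer might query can be generated like this; the helper name is illustrative, while the operation and suffix lists come from the group description above.

```python
# Operations and suffixes as listed in the Operations attribute group.
OPERATIONS = [
    "Select", "Find", "Browse", "Add", "Remove",
    "Search", "Audit", "Gather", "Chronicle",
]
SUFFIXES = ["Count", "AvgNanos", "MaxNanos"]


def operation_attributes() -> list:
    """Expand the {Op}Count / {Op}AvgNanos / {Op}MaxNanos templates."""
    return [f"{op}{suffix}" for op in OPERATIONS for suffix in SUFFIXES]


for attribute in operation_attributes():
    print(attribute)
```

This yields 27 attribute names, from SelectCount through ChronicleMaxNanos.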
The jmx_port setting controls which port the JMX RMI
connector listens on (default 9010). You can attach any JMX
client — JConsole, VisualVM, jmxterm, or a custom JMX
consumer — at
service:jmx:rmi:///jndi/rmi://<host>:<port>/jmxrmi.
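When scripting around JMX clients it can help to build the connector URL from the host and port rather than hard-coding it. A minimal sketch, with a hypothetical helper name and the default port taken from the text above:

```python
def jmx_service_url(host: str, port: int = 9010) -> str:
    """Build the JMX RMI connector URL for a Concourse Server host."""
    return f"service:jmx:rmi:///jndi/rmi://{host}:{port}/jmxrmi"


print(jmx_service_url("localhost"))
# prints: service:jmx:rmi:///jndi/rmi://localhost:9010/jmxrmi
```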
Prometheus
Enable the Prometheus endpoint by setting one option in
concourse.yaml.
When this is enabled, Concourse Server attaches the
jmx_prometheus_javaagent at JVM startup. The agent listens on
metrics_port and exposes the server’s JMX MBeans as Prometheus
metrics at http://<host>:<metrics_port>/metrics.
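For a quick check without a Prometheus server, the endpoint can be read directly, since it serves the plain-text Prometheus exposition format. The sketch below is illustrative only: the function names are hypothetical, it assumes each sample line has the form `<series> <value>` with no trailing timestamp, and the sample metric names in the usage note are examples rather than captured output.

```python
from urllib.request import urlopen


def parse_metrics(text: str) -> dict:
    """Map each series (name plus labels) to its float value.

    Comment lines (# HELP / # TYPE) and blank lines are skipped.
    """
    samples = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        # The value is the last space-separated token on the line.
        series, _, value = line.rpartition(" ")
        samples[series] = float(value)
    return samples


def scrape(host: str, metrics_port: int, timeout: float = 5.0) -> dict:
    """Fetch http://<host>:<metrics_port>/metrics and parse the response."""
    url = f"http://{host}:{metrics_port}/metrics"
    with urlopen(url, timeout=timeout) as response:
        return parse_metrics(response.read().decode("utf-8"))
```

Calling `scrape("localhost", port)` with your configured metrics_port returns a dictionary keyed by series, for example `concourse_server_ActiveSessions` or `concourse_engine_SegmentCount{environment="default"}`.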
Metric names
The agent’s translation rules (installed automatically by the server) map MBean attributes to these naming conventions:
- type=Engine,environment=X ⇒ *Nanos becomes concourse_engine_*Seconds{environment="X"} (units converted from nanoseconds to seconds).
- type=Engine,environment=X ⇒ *HitRate becomes concourse_engine_*HitRate{environment="X"} (gauge, 0 to 1).
- Any other type=Engine,environment=X attribute becomes concourse_engine_<attr>{environment="X"}.
- Any type=Server attribute becomes concourse_server_<attr>.
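To predict the metric name for a given MBean attribute, the rules above can be expressed as a small function. This is a sketch of the mapping as described here, not the agent's actual rule file; the function name and the sample values are hypothetical, and attribute case is preserved exactly as the rules show it.

```python
NANOS_PER_SECOND = 1_000_000_000


def translate(mbean_type, attribute, value, environment=None):
    """Return (metric_name, metric_value) for one MBean attribute."""
    if mbean_type == "Server":
        return f"concourse_server_{attribute}", value
    if mbean_type != "Engine":
        raise ValueError(f"unknown MBean type: {mbean_type}")
    if environment is None:
        raise ValueError("Engine attributes need an environment")
    labels = f'{{environment="{environment}"}}'
    if attribute.endswith("Nanos"):
        # *Nanos is renamed to *Seconds and converted from ns to s.
        renamed = attribute[: -len("Nanos")] + "Seconds"
        return f"concourse_engine_{renamed}{labels}", value / NANOS_PER_SECOND
    # *HitRate and every other Engine attribute keep their name and value.
    return f"concourse_engine_{attribute}{labels}", value


print(translate("Engine", "ReadLockAvgWaitNanos", 2_500_000, "default"))
print(translate("Server", "ActiveSessions", 3))
```

For instance, a ReadLockAvgWaitNanos reading of 2,500,000 in the default environment maps to concourse_engine_ReadLockAvgWaitSeconds{environment="default"} with a value of 0.0025.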
Prometheus scrape configuration
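Add a scrape job to your Prometheus configuration that targets <host>:<metrics_port>. Prometheus requests the /metrics path by default, which is the path the agent serves.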
OpenTelemetry (OTLP)
For collectors such as the OpenTelemetry Collector, Datadog Agent, New Relic OTLP endpoint, or Grafana Cloud, enable OTLP push from the same metrics pipeline.
Defaults:
| Option | Default |
|---|---|
| opentelemetry_endpoint | http://localhost:4317 |
| opentelemetry_protocol | grpc (OTLP/gRPC); alternative: http (OTLP/HTTP) |
| opentelemetry_interval | 15 seconds |
The OTLP exporter uses the same JMX translation rules as the
Prometheus endpoint, so the metric names you see in your OTLP
backend are the concourse_engine_* / concourse_server_*
families listed above.
Agent jar required
Both Prometheus and OTLP export require
agents/jmx_prometheus_javaagent.jar in the server
installation. If the agent jar is not present at startup,
Concourse prints a warning and continues without metrics
export.
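Because a missing jar only produces a startup warning, a deployment script may want to verify it beforehand. A minimal pre-flight sketch, assuming the relative path above is resolved against the server installation directory; the function name is hypothetical and the directory in the usage note is a placeholder:

```python
from pathlib import Path

# Relative location of the agent jar inside the installation, per the note above.
AGENT_JAR = Path("agents") / "jmx_prometheus_javaagent.jar"


def agent_jar_present(install_dir) -> bool:
    """Return True if the metrics agent jar exists under install_dir."""
    return (Path(install_dir) / AGENT_JAR).is_file()
```

A deploy step could call `agent_jar_present("/path/to/concourse-server")` and fail early when it returns False, instead of discovering later that no metrics are being exported.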
Picking a channel
- Interactive troubleshooting: start with concourse monitor overview and drill into locks, transport, or operations as needed. Switch to --watch to watch a problem unfold.
- One-off script or health check: concourse monitor <subcommand> --json is easy to parse in shell pipelines.
- Long-term dashboards and alerting: enable the Prometheus endpoint and point Grafana (or any PromQL-compatible tool) at it.
- Centralized metrics across many services: enable OpenTelemetry push so Concourse participates in the same metrics pipeline as the rest of your stack.
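For the script and health-check case, the --json output can be consumed from any language that can spawn a process. The sketch below wraps the CLI from Python. It assumes the concourse executable is on the PATH and that flag values are passed as separate arguments; the function names are hypothetical, and the structure of the returned JSON is not specified here, so the code returns it unmodified.

```python
import json
import subprocess


def build_command(subcommand, environment=None, jmx_port=None):
    """Assemble the argument list for 'concourse monitor <subcommand> --json'."""
    command = ["concourse", "monitor", subcommand, "--json"]
    if environment is not None:
        command += ["--environment", environment]
    if jmx_port is not None:
        command += ["--jmx-port", str(jmx_port)]
    return command


def snapshot(subcommand, environment=None, jmx_port=None, run=subprocess.run):
    """Run the CLI once and return the parsed JSON document.

    The 'run' parameter exists so the process call can be replaced in tests.
    """
    result = run(
        build_command(subcommand, environment, jmx_port),
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError on a non-zero exit code
    )
    return json.loads(result.stdout)
```

A health check could call `snapshot("overview")` and treat a `subprocess.CalledProcessError` or a JSON decoding error as a failed probe.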
For the underlying configuration keys, see Configuration.