
Distributed Concourse#

Alpha Feature

Distributed Concourse is currently in alpha. It is available for testing and evaluation but is not yet recommended for production deployments.

Concourse supports forming a distributed cluster of multiple nodes that together act as a single logical database. Data is automatically partitioned and replicated across nodes for fault tolerance and availability.

Architecture#

Peer-to-Peer#

Concourse uses a peer-to-peer architecture with no single leader. Every node in the cluster can accept client connections and process requests. Requests are transparently routed to the node(s) that own the relevant data.

Consistency Model#

Distributed Concourse is a CP system (Consistency and Partition tolerance) with optimistic availability:

  • Strong consistency: All reads reflect the most recent write. There are no stale reads.
  • Partition tolerance: The cluster continues to operate correctly when network partitions occur, as long as a majority of nodes are reachable.
  • Optimistic availability: The system remains available for reads and writes as long as a quorum of replicas is reachable.
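The quorum rule behind these guarantees can be made concrete: an operation succeeds only when a majority of the replicas for the affected data acknowledge it, so any two quorums overlap in at least one replica. A minimal sketch of that arithmetic (illustrative, not Concourse internals):

```java
// Sketch: the majority-quorum rule behind "optimistic availability".
// An operation succeeds only if a majority of the replicas for the data
// acknowledge it, so any two quorums overlap in at least one replica and
// a quorum read always observes the most recent committed write.
public class Quorum {

    // Minimum number of replicas (out of replicationFactor) that must respond.
    static int quorumSize(int replicationFactor) {
        return replicationFactor / 2 + 1;
    }

    public static void main(String[] args) {
        System.out.println(quorumSize(3)); // 2 of 3 replicas must respond
        System.out.println(quorumSize(2)); // 2 of 2 replicas must respond
        System.out.println(quorumSize(1)); // 1 of 1
    }
}
```

Note that with a replication factor of 2, the quorum is both replicas, which is why higher replication factors trade write latency for tolerance of replica failures.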

Replication#

Each piece of data is stored on multiple nodes according to the configured replication_factor. When replication_factor is not set explicitly, Concourse derives a default from the cluster size:

Cluster size (nodes)   Default replication_factor
1                      1
2 – 3                  2
4 or more              3

In all cases the default gives you a majority — a quorum read is guaranteed to see at least one copy of every committed write. Override the default only if you have a specific durability/latency tradeoff in mind.
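The default derivation in the table above can be restated as a simple function of cluster size (a restatement of the table, not Concourse source code):

```java
// Restates the default replication_factor table as code.
public class DefaultReplication {

    static int defaultReplicationFactor(int clusterSize) {
        if (clusterSize <= 1) {
            return 1; // single node: no replication possible
        } else if (clusterSize <= 3) {
            return 2; // 2-3 nodes
        } else {
            return 3; // 4 or more nodes
        }
    }

    public static void main(String[] args) {
        System.out.println(defaultReplicationFactor(1)); // 1
        System.out.println(defaultReplicationFactor(3)); // 2
        System.out.println(defaultReplicationFactor(5)); // 3
    }
}
```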

Configuration#

To form a cluster, configure the cluster section in concourse.yaml on every node:

cluster:
  nodes:
    - node1.example.com:1717
    - node2.example.com:1717
    - node3.example.com:1717
  replication_factor: 2

Settings#

Setting                      Description
cluster.nodes                List of all node addresses (host:port) in the cluster, including this node
cluster.replication_factor   Number of nodes that store a copy of each piece of data

Requirements#

  • All nodes must list the same set of nodes in their configuration.
  • All nodes must use the same replication_factor.
  • All nodes must run the same version of Concourse.
  • All nodes must have identical configurations (aside from node-specific settings like buffer_directory).
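For example, two nodes in the cluster configured above would share a byte-for-byte identical cluster section while differing only in node-specific paths (the paths shown are illustrative):

```yaml
# concourse.yaml on node1.example.com -- the cluster section is
# identical on every node in the cluster:
cluster:
  nodes:
    - node1.example.com:1717
    - node2.example.com:1717
    - node3.example.com:1717
  replication_factor: 2

# Node-specific settings like buffer_directory may differ per node;
# node2 would point this at its own local path.
buffer_directory: /var/lib/concourse/node1/buffer
```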

Client Connections#

Clients can connect to any node in the cluster. The node receiving the connection acts as a coordinator, transparently routing requests to the appropriate node(s) that own the data.

// Java - connect to any node
Concourse concourse = Concourse.at()
    .host("node1.example.com")
    .port(1717)
    .connect();

From the client’s perspective, the cluster behaves identically to a single-node deployment. No changes to application code are required.

Operating a Cluster#

Starting up#

All nodes must be started with the same cluster.nodes list. When a node starts, it reaches out to its peers and waits until it can form a quorum before accepting client connections. In practice this means:

  • If you start the nodes roughly in parallel (for example, via a Kubernetes StatefulSet or a Terraform apply), they discover each other and form a cluster within seconds.
  • If you start a single node in isolation, its startup blocks waiting for peers. This is the expected behavior — a node by itself cannot safely accept writes for a cluster of size > 1.
  • There is no separate bootstrap command. All nodes have the same configuration and the same role.

Health and failure detection#

Nodes communicate peer-to-peer using a gossip protocol. Each node periodically exchanges state with a random subset of peers, so failures and membership information propagate through the cluster without a central coordinator.

Use concourse monitor on each node to observe its local view of the cluster. Key operational signals:

  • TransportInProgress / TransportCompleted, which indicate whether background indexing is keeping up with the buffer.
  • ActiveSessions on the connected node.
  • JMX attributes expose per-environment engine state that is local to each node.

Central metric aggregation via Prometheus or OpenTelemetry is the recommended way to see cluster-wide health on one dashboard.
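As one possible setup, if each node's JMX attributes are exported in Prometheus format (for example via the standard Prometheus JMX exporter agent), a single scrape job can put all three nodes on one dashboard. The job name and port below are placeholders for this sketch, not Concourse defaults:

```yaml
# prometheus.yml (fragment): scrape every node in the cluster.
scrape_configs:
  - job_name: concourse-cluster     # placeholder job name
    static_configs:
      - targets:                    # JMX-exporter port is an assumption
          - node1.example.com:9404
          - node2.example.com:9404
          - node3.example.com:9404
```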

Recovery from node failure#

  • Crash of a single node (within quorum): The surviving quorum continues to serve reads and writes. When the crashed node returns, it re-syncs from its peers before rejoining and serving clients again.
  • Loss of quorum: The remaining nodes refuse writes rather than accepting operations that could not be durably replicated. Reads that the surviving nodes can answer locally still succeed; reads that require a quorum block or fail until quorum is restored.
  • Permanent loss of a node: Because the alpha release does not support dynamic membership changes (see below), replacing a permanently-lost node requires bringing the cluster down and restarting it with a new cluster.nodes list on each surviving node. Plan capacity accordingly.

Network partitions#

Distributed Concourse is a CP system: during a network partition, the side of the partition that contains a majority quorum continues to serve reads and writes, while the minority side refuses to write until the partition heals. No data diverges across a partition — there is a single committed history at all times.
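The majority rule during a partition can be stated precisely: a side of the partition may accept writes only if it can reach a strict majority of the configured cluster.nodes. A sketch of that check (illustrative, not Concourse internals):

```java
// Which side of a partition keeps serving writes? Only the side that
// can reach a strict majority of the configured cluster membership.
public class PartitionCheck {

    static boolean hasMajority(int reachableNodes, int clusterSize) {
        return reachableNodes > clusterSize / 2;
    }

    public static void main(String[] args) {
        // A 3-node cluster split 2 / 1:
        System.out.println(hasMajority(2, 3)); // true: this side serves writes
        System.out.println(hasMajority(1, 3)); // false: this side refuses writes
        // An even split of a 4-node cluster leaves no majority on either side:
        System.out.println(hasMajority(2, 4)); // false
    }
}
```

This is also why odd cluster sizes are the better operational choice: an even split of an even-sized cluster leaves both sides unable to accept writes until the partition heals.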

Current Limitations#

The alpha release of distributed Concourse has the following limitations:

  • No conversion from standalone: Existing single-node deployments cannot be converted to a cluster. Clusters must be started fresh.
  • No dynamic membership: Nodes cannot be added to or removed from a running cluster. The cluster membership is fixed at startup.
  • Identical configurations: All nodes must have identical configurations.
  • Same version required: All nodes must run the same version of Concourse. Rolling upgrades are not supported.

These limitations will be addressed in future releases as the distributed feature matures.