OriginChain docs
02 · core concepts

Core concepts.

OriginChain is a single managed engine with a Plan tree on top. SQL, vector, full-text, and graph are not separate engines - they are different query modes and different Plan operators over the same store. Understand the engine, the schemas, and the Plan, and the rest of the surface follows.

01 · the engine

A single managed engine.

The engine is fronted by a write-ahead log. Every write is appended to the log, fsynced, then applied. Reads go through a process-wide page cache. There is no row-store / column-store / vector-engine split - every query mode is a different way to read the same row.
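
The append → fsync → apply order can be sketched in a few lines. This is a toy illustration only, not the engine's implementation: `TinyWal`, the JSON-lines record format, and the in-memory table are all assumptions made for the example.

```python
import json
import os
import tempfile


class TinyWal:
    """Toy write-ahead log: append, fsync, then apply to an in-memory table."""

    def __init__(self, path):
        self.path = path
        self.table = {}
        # Replay whatever is already in the log so state survives a restart.
        if os.path.exists(path):
            with open(path, "r", encoding="utf-8") as f:
                for line in f:
                    self._apply(json.loads(line))
        self.log = open(path, "a", encoding="utf-8")

    def _apply(self, record):
        self.table[record["key"]] = record["value"]

    def put(self, key, value):
        record = {"key": key, "value": value}
        self.log.write(json.dumps(record) + "\n")  # 1. append to the log
        self.log.flush()
        os.fsync(self.log.fileno())                # 2. force it to disk
        self._apply(record)                        # 3. only then apply

    def close(self):
        self.log.close()


wal_path = os.path.join(tempfile.mkdtemp(), "toy.wal")
wal = TinyWal(wal_path)
wal.put("order-1", {"status": "paid"})
wal.close()

# A fresh instance rebuilds the same state purely from the log.
recovered = TinyWal(wal_path)
recovered_row = recovered.table["order-1"]
recovered.close()
```

Because the record is durable before it is applied, a crash between steps 2 and 3 loses nothing: replay on restart reapplies it. The sketch ignores torn tail records and group commit.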

Each tenant gets a single, region-isolated managed instance. No shared compute, no noisy neighbour. Writes go to one primary; on Tier 2 and above a sync follower replicates in lockstep for RPO=0. The follower bootstraps from a snapshot transfer and then tails the live log.

Fig. - request path: HTTP → plan → engine → follower + backup
[Figure: one managed instance per tenant, region-isolated. The client speaks HTTP to the Plan tree, which drives the engine (durable commit, group commit, page cache). The engine feeds a sync follower replica and a continuous backup archive with PITR.]

02 · schemas

Declared in TOML.

A schema manifest declares the table's namespace + name, primary key, columns, secondary indexes, graph relations to other tables, and derived JSON extractions. The catalog is itself stored as rows - adding a column is a write, not a downtime migration.

# schemas/orders.toml
namespace   = "shop"
table       = "orders"
primary_key = ["id"]

[[columns]]
name = "id"
ty   = "str"      # ULIDs / UUIDs travel as text
required = true

[[columns]]
name = "customer"
ty   = "str"

[[columns]]
name = "amount_cents"
ty   = "i64"      # money in minor units - never f64

[[columns]]
name = "status"
ty   = "str"

[[columns]]
name = "placed_ms"
ty   = "u64"      # epoch milliseconds

[[indexes]]
name    = "by_status"
columns = ["status"]

[[indexes]]
name    = "by_customer_placed"
columns = ["customer", "placed_ms"]

[[relations]]
name          = "by_customer"
from_col      = "customer"
bidirectional = true

[relations.target]
namespace = "shop"
table     = "customers"
pk        = "id"

Six column types only - str, i64, u64, f64, bool, bytes. Vector and full-text indexes are NOT declared here; they live on their own runtime endpoints (see vector, fts) and link back to rows by primary key. Indexes and relations are honoured at write time - no separate "build index" step. See schemas reference for the full grammar.

03 · query modes

How data shapes work.

A single row is reachable through several query modes - SQL, secondary index, relation walk, full-text, and vector - and the engine keeps all of them in lockstep on every write. Each mode is a different way to read the same row, not a different store.

| Mode | Purpose |
| --- | --- |
| Rows | The primary user-facing record. PK is one or more columns (ULIDs / UUIDs travel as str). Read via the typed /rows API or SQL. |
| Secondary indexes | Speed up equality filters and left-prefix range scans on declared columns. Maintained automatically on every write. |
| Relations | Graph edges between rows. Forward and reverse traversal are both O(degree) - declare a relation in the manifest and walk it with neighbors / BFS / Dijkstra. |
| Full-text | BM25 inverted index. Stored on a separate runtime endpoint - index text under (table, field, doc_id), then search by query string. Boolean, BM25, phrase, fuzzy modes. |
| Vector | Embeddings indexed with HNSW (default), IVF, or IVF-PQ. Stored on a separate runtime endpoint - put one vector per row by primary key, then top-k by similarity. Optional metadata filter. |
| Plan cache | Compiled Plan tree for a /ask question template. Skips the rule-grammar and LLM compile on cache hit; replays the tree through the executor. |

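
To make "left-prefix range scan" concrete, here is a sketch over a composite index shaped like by_customer_placed (customer, placed_ms). The sorted-list layout and the `scan` helper are assumptions for illustration; the engine's on-disk index format is not described here.

```python
import bisect

# Toy rows: (id, customer, placed_ms)
ROWS = [
    ("o1", "alice", 1000),
    ("o2", "bob", 1500),
    ("o3", "alice", 2000),
    ("o4", "alice", 3000),
    ("o5", "carol", 2500),
]

# The index is a sorted list of (customer, placed_ms, primary_key) entries.
INDEX = sorted((customer, placed_ms, pk) for pk, customer, placed_ms in ROWS)


def scan(customer, start_ms=None, end_ms=None):
    """Primary keys for one customer, optionally limited to [start_ms, end_ms)."""
    # A shorter tuple sorts before any longer tuple sharing its prefix, so
    # bisecting on the prefix lands on the first matching entry.
    lo_key = (customer,) if start_ms is None else (customer, start_ms)
    lo = bisect.bisect_left(INDEX, lo_key)
    out = []
    for cust, placed_ms, pk in INDEX[lo:]:
        if cust != customer or (end_ms is not None and placed_ms >= end_ms):
            break
        out.append(pk)
    return out


all_alice = scan("alice")                 # equality on the leading column
alice_window = scan("alice", 2000, 3000)  # equality + range on the next column
nobody = scan("dave")
```

A filter on placed_ms alone cannot use this index, because entries for one time window are scattered across every customer; that is what the left-prefix restriction means.
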
04 · the plan tree

Eleven operators, one tree.

Both /sql and /ask compile to the same Plan tree. The tree is JSON-serialisable, cached by question hash, and replayable. Every shipped query shape is one of these operators or a composition.
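
"JSON-serialisable, cached by question hash, and replayable" can be pictured as follows. Everything here is a hypothetical sketch: the dict field names, the whitespace/case normalisation, SHA-256 as the hash, and `compile_question` as a stand-in for the real compile step are all assumptions.

```python
import hashlib
import json

# A Plan tree as plain nested dicts. The field names are illustrative only.
plan = {
    "op": "Limit", "n": 100,
    "input": {
        "op": "Filter", "predicate": {"col": "status", "eq": "paid"},
        "input": {"op": "Scan", "table": "shop.orders"},
    },
}

plan_cache = {}
compile_calls = 0


def question_key(question):
    """Normalise whitespace and case, then hash, so trivial variants share a key."""
    normalised = " ".join(question.lower().split())
    return hashlib.sha256(normalised.encode("utf-8")).hexdigest()


def compile_question(question):
    """Stand-in for the real compile step; always returns the fixed plan above."""
    global compile_calls
    compile_calls += 1
    return plan


def plan_for(question):
    key = question_key(question)
    if key not in plan_cache:
        # Store the serialised form to show the tree survives a JSON round trip.
        plan_cache[key] = json.dumps(compile_question(question), sort_keys=True)
    return json.loads(plan_cache[key])


first = plan_for("Show paid orders")
second = plan_for("show   paid orders")  # cache hit: no second compile
```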

Scan
Full scan of a table. The fallback when no index applies.
ColumnScan
Projection-aware scan that decodes only the requested fields.
IndexScan
Indexed lookup. Used when WHERE has an indexed equality.
Filter
Predicate evaluation. Pushed under projection where possible by the optimiser.
Project
Column selection - drops fields the user did not ask for before they reach the wire.
Limit
Truncates the stream. Pushed below sort when the sort key admits a top-K shortcut.
Sort
External-merge sort with spill-to-disk. Exposed via /ask; ORDER BY through the SQL translator is on the roadmap.
Aggregate
GROUP BY with COUNT / SUM / AVG / MIN / MAX.
HashJoin
Build-side hash table on the smaller input, probe with the larger. INNER joins. Up to 32 tables per query.
OuterJoin
LEFT, RIGHT, and FULL variants - emits NULL-filled rows for unmatched rows on the preserved side (both sides for FULL). Up to 32-table left-deep chains.
RelationHop
Walks forward or reverse edges. Powers neighbors, BFS, path, and Dijkstra.

For example, the query in the SQL comment below maps to the Plan tree under it:

-- SELECT c.name, SUM(o.amount_cents) AS total
--   FROM shop.orders o
--   JOIN shop.customers c ON c.id = o.customer
--  WHERE o.status = 'paid'
--  GROUP BY c.name
--  LIMIT 100;

Limit { 100 }
└── Aggregate { group: [c.name], agg: SUM(o.amount_cents) AS total }
    └── Project { c.name, o.amount_cents }
        └── HashJoin { o.customer = c.id }
            ├── IndexScan { shop.orders, status = "paid" }
            └── Scan { shop.customers }
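
The example tree above can be walked by hand over toy data. This is a sketch of the technique each operator names, under obvious assumptions: rows are Python dicts, the "index" is a dict keyed by status, and none of the function names belong to the real executor.

```python
from collections import defaultdict

orders = [
    {"id": "o1", "customer": "c1", "amount_cents": 1200, "status": "paid"},
    {"id": "o2", "customer": "c2", "amount_cents": 500, "status": "paid"},
    {"id": "o3", "customer": "c1", "amount_cents": 800, "status": "paid"},
    {"id": "o4", "customer": "c2", "amount_cents": 900, "status": "refunded"},
]
customers = [
    {"id": "c1", "name": "Ada"},
    {"id": "c2", "name": "Grace"},
]

# IndexScan stand-in: a dict from status to matching rows.
by_status = defaultdict(list)
for row in orders:
    by_status[row["status"]].append(row)


def hash_join(left, right, left_key, right_key):
    """INNER join: build a hash table on the smaller input, probe with the larger."""
    swap = len(left) > len(right)
    build, probe = (right, left) if swap else (left, right)
    build_key, probe_key = (right_key, left_key) if swap else (left_key, right_key)
    table = defaultdict(list)
    for row in build:
        table[row[build_key]].append(row)
    for probe_row in probe:
        for build_row in table.get(probe_row[probe_key], []):
            # Always yield (left_row, right_row), whichever side was built.
            yield (probe_row, build_row) if swap else (build_row, probe_row)


def run_plan():
    paid = by_status["paid"]                                   # IndexScan
    joined = hash_join(paid, customers, "customer", "id")      # HashJoin
    projected = ((c["name"], o["amount_cents"]) for o, c in joined)  # Project
    totals = defaultdict(int)                                  # Aggregate (SUM)
    for name, cents in projected:
        totals[name] += cents
    rows = [{"name": n, "total": t} for n, t in totals.items()]
    return rows[:100]                                          # Limit


result = run_plan()
totals_by_name = {r["name"]: r["total"] for r in result}
```

The refunded order never reaches the join because the index lookup on status filters it out first, which is the point of putting IndexScan at the leaf.
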
05 · replication model

Active-passive, sync.

One primary, plus sync followers on Tier 2 and above (one on Tier 2, two on Tier 3). The log replicates before the primary returns 200. A follower joining a running cluster bootstraps via a snapshot transfer - a chunked transfer of the full store - then tails the live stream from the snapshot's LSN.

| Mode | Tiers | RPO | RTO | Notes |
| --- | --- | --- | --- | --- |
| Primary only | Tier 1 | ~0.5 s (commit fsync) | ~5-10 min (archive restore) | Single AZ, no follower. Restore replays the continuous backup archive. |
| Sync follower | Tier 2, Tier 3, Enterprise | 0 | ~25 s (drilled) | Multi-AZ failover. Verified end-to-end with snapshot bootstrap. Tier 2 has 1 follower; Tier 3 has 2. |

On Tier 2 and above, active-passive sync replication is the production path. Commits durably ack only after the follower has the frame on disk - RPO=0, RTO ~25 s. See ops → failover for the promotion procedure.
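
The ack rule is what gives RPO=0: a commit the client saw succeed is, by construction, already on the follower. A minimal sketch of that invariant follows. The `Primary` and `Follower` classes are hypothetical, and the behaviour on follower failure (returning 503 and dropping the local entry) is purely an assumption for the example; the real failure handling is covered under ops → failover.

```python
class Follower:
    def __init__(self, healthy=True):
        self.healthy = healthy
        self.frames = []

    def persist(self, lsn, frame):
        """Pretend to write the frame to disk; return True once it is durable."""
        if not self.healthy:
            return False
        self.frames.append((lsn, frame))
        return True


class Primary:
    def __init__(self, follower):
        self.follower = follower
        self.log = []
        self.acked_lsn = 0

    def commit(self, frame):
        """Return an HTTP-style status: 200 only after the follower has the frame."""
        lsn = len(self.log) + 1
        self.log.append((lsn, frame))
        if not self.follower.persist(lsn, frame):
            # Assumption for this sketch: an unreplicated commit is rejected.
            self.log.pop()
            return 503
        self.acked_lsn = lsn
        return 200


follower = Follower()
primary = Primary(follower)
status_ok = primary.commit(b"put orders/o1")

follower.healthy = False
status_failed = primary.commit(b"put orders/o2")
```

After both calls, every frame the primary acknowledged is present on the follower, so promoting the follower loses no acknowledged write.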

06 · versioning

Single-row optimistic CAS.

Every row carries an internal _oc_row_version field. The API exposes put_row_cas, get_row_versioned, and delete_row_cas for optimistic concurrency. A CAS that loses the race fails the entire batch with a deterministic error - no partial application. Idempotency keys make retries safe: the same key with the same body returns the original response; the same key with a different body returns 409.
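
The semantics can be modelled with an in-memory store. Only the names put_row_cas and get_row_versioned come from the API above; their signatures here, the `ToyStore` class, the `request` wrapper, and "version 0 means the row does not exist" are assumptions made for the sketch.

```python
class VersionConflict(Exception):
    """Raised when the caller's expected version is stale."""


class ToyStore:
    def __init__(self):
        self.rows = {}       # pk -> (version, row)
        self.responses = {}  # idempotency key -> (body, response)

    def get_row_versioned(self, pk):
        # Assumption: a missing row reads as version 0.
        return self.rows.get(pk, (0, None))

    def put_row_cas(self, pk, row, expected_version):
        current, _ = self.get_row_versioned(pk)
        if current != expected_version:
            raise VersionConflict(f"{pk}: expected {expected_version}, found {current}")
        self.rows[pk] = (current + 1, row)
        return current + 1

    def request(self, idempotency_key, body):
        """Replay-safe write; body is {'pk', 'row', 'expected_version'}."""
        if idempotency_key in self.responses:
            seen_body, response = self.responses[idempotency_key]
            # Same key + same body replays; same key + different body is 409.
            return response if seen_body == body else (409, None)
        version = self.put_row_cas(body["pk"], body["row"], body["expected_version"])
        response = (200, version)
        self.responses[idempotency_key] = (body, response)
        return response


store = ToyStore()
v1 = store.put_row_cas("o1", {"status": "new"}, expected_version=0)

# A second writer still holding version 0 loses the race.
try:
    store.put_row_cas("o1", {"status": "paid"}, expected_version=0)
    lost_race = False
except VersionConflict:
    lost_race = True

body = {"pk": "o1", "row": {"status": "paid"}, "expected_version": 1}
first = store.request("key-1", body)
retry = store.request("key-1", body)  # replayed, not re-executed
clash = store.request("key-1", {**body, "row": {"status": "void"}})
```

Note that the retry returns the stored response instead of running the CAS again; re-executing would fail, since the row is already at version 2.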

07 · backups & pitr

Continuous backup archive.

Two streams flow to the archive in parallel: durable checkpoints shipped on roll, and a continuous backup stream that flushes the open log every few hundred milliseconds. Restore-to-timestamp resolves to sub-second precision with the paid PITR add-on.

# restore an instance to a wall-clock timestamp
oc-pitr restore \
  --tenant acme \
  --target "2026-04-29T18:42:00Z" \
  --into   acme-restore-001

Continuous backups (segment-boundary granularity, ~5–10 min restore window) are included on every tier. Sub-second precise PITR (~0.5–1.5 s data-loss window) is a paid add-on - see pricing.