2. Knowledge-base article (row + vector + FTS)

what this does

Save one help-center article as three shapes - the structured row in kb.articles, an embedding of the title and body for semantic similarity, and a BM25 keyword index on the body. No graph edges - this is the simplest recipe that still covers a real RAG / search backend.

when to use it

Help-center search backends. Keyword search handles "exact phrase" queries; vector search handles "what does this user mean".
RAG retrieval - rank candidates by vector similarity, optionally re-rank with FTS, then fetch the body from the row store.
Any corpus where you want both lexical and semantic recall over the same documents.
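
The RAG flow above (vector ranking, optional keyword re-rank, then a row fetch) can be sketched in memory. This is only an illustration of the ordering of the steps, not the OriginChain query API: `cosine`, `keyword_overlap`, and `retrieve` are hypothetical helpers, the two-dimensional vectors are toy data, and the keyword score is a crude stand-in for BM25.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def keyword_overlap(query, text):
    """Fraction of query terms present in the text (stand-in for BM25)."""
    terms = set(query.lower().split())
    words = set(text.lower().split())
    return len(terms & words) / len(terms) if terms else 0.0


def retrieve(query, query_vec, vectors, rows, k=2):
    # 1. Rank candidates by vector similarity and keep the top k.
    candidates = sorted(
        vectors, key=lambda doc_id: cosine(query_vec, vectors[doc_id]), reverse=True
    )[:k]
    # 2. Re-rank by keyword overlap; the stable sort keeps vector order on ties.
    candidates.sort(
        key=lambda doc_id: keyword_overlap(query, rows[doc_id]["body"]), reverse=True
    )
    # 3. Fetch the bodies from the row store (a dict here).
    return [(doc_id, rows[doc_id]["body"]) for doc_id in candidates]


# Toy data standing in for the vector index and the row store.
vectors = {"a": [1.0, 0.0], "b": [0.9, 0.1], "c": [0.0, 1.0]}
rows = {
    "a": {"body": "Reset your password from the login page"},
    "b": {"body": "Install on Windows using the installer"},
    "c": {"body": "Billing and invoices"},
}
results = retrieve("install on windows", [1.0, 0.0], vectors, rows)
```

Here the vector step prefers "a", but the keyword re-rank moves "b" to the front because it contains every query term.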

the schema

Plain row schema - no [[relations]] because there's no graph edge to write.

# kb/articles.toml
namespace   = "kb"
table       = "articles"
primary_key = ["id"]

[[columns]]
name = "id"
ty   = "str"
required = true

[[columns]]
name = "title"
ty   = "str"
required = true

[[columns]]
name = "body"
ty   = "str"
required = true

[[columns]]
name = "url"
ty   = "str"

[[columns]]
name = "created_ms"
ty   = "u64"

[[indexes]]
name    = "by_created"
columns = ["created_ms"]
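
A client-side check against this schema can catch bad rows before the POST. The sketch below is a hypothetical helper, not part of any SDK. It assumes that columns without `required = true` are optional and that `u64` means a non-negative integer below 2**64; the server's own validation rules may differ.

```python
# Column name -> (type, required), copied by hand from kb/articles.toml.
SCHEMA = {
    "id": ("str", True),
    "title": ("str", True),
    "body": ("str", True),
    "url": ("str", False),         # assumption: optional when `required` is omitted
    "created_ms": ("u64", False),  # assumption: optional when `required` is omitted
}


def validate_row(row):
    """Return a list of problems; an empty list means the row looks valid."""
    errors = []
    for name, (ty, required) in SCHEMA.items():
        if name not in row:
            if required:
                errors.append(f"missing required column: {name}")
            continue
        value = row[name]
        if ty == "str" and not isinstance(value, str):
            errors.append(f"{name} must be a string")
        elif ty == "u64":
            # bool is a subclass of int in Python, so exclude it explicitly.
            is_int = isinstance(value, int) and not isinstance(value, bool)
            if not is_int or not 0 <= value < 2**64:
                errors.append(f"{name} must be an integer in [0, 2**64)")
    return errors


good = {
    "id": "kb-2026-001",
    "title": "How atomic multi-shape writes work",
    "body": "Each shape has its own endpoint.",
    "created_ms": 1747900000000,
}
bad = {"id": "kb-2026-002", "created_ms": -1}
```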

call 1 of 3 - the row

POST /v1/tenants/:t/rows/kb.articles

curl -X POST "$ORIGINCHAIN_URL/v1/tenants/$T/rows/kb.articles" \
  -H "Authorization: Bearer $OC_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "id":         "kb-2026-001",
    "title":      "How atomic multi-shape writes work",
    "body":       "Each shape (row, vector, FTS, graph) has its own endpoint. Every call is atomic individually. Idempotency keys make retries safe.",
    "url":        "/docs/concepts/atomic-multi-shape",
    "created_ms": 1747900000000
  }'

db.rows.put("kb.articles", {
    "id":         "kb-2026-001",
    "title":      "How atomic multi-shape writes work",
    "body":       "Each shape (row, vector, FTS, graph) has its own endpoint. Every call is atomic individually. Idempotency keys make retries safe.",
    "url":        "/docs/concepts/atomic-multi-shape",
    "created_ms": 1747900000000,
})

// The TypeScript SDK does not wrap row writes yet
// (shipping in the next release). Use `fetch` for now.

await fetch(`${BASE_URL}/v1/tenants/${TENANT}/rows/kb.articles`, {
  method: "POST",
  headers: {
    "Authorization": `Bearer ${OC_TOKEN}`,
    "Content-Type":  "application/json",
  },
  body: JSON.stringify({
    id:         "kb-2026-001",
    title:      "How atomic multi-shape writes work",
    body:       "Each shape (row, vector, FTS, graph) has its own endpoint. Every call is atomic individually. Idempotency keys make retries safe.",
    url:        "/docs/concepts/atomic-multi-shape",
    created_ms: 1747900000000,
  }),
});

// The Go SDK does not wrap row writes yet
// (shipping in the next release). Use net/http for now.

body, _ := json.Marshal(map[string]any{
    "id":         "kb-2026-001",
    "title":      "How atomic multi-shape writes work",
    "body":       "Each shape (row, vector, FTS, graph) has its own endpoint. Every call is atomic individually. Idempotency keys make retries safe.",
    "url":        "/docs/concepts/atomic-multi-shape",
    "created_ms": uint64(1747900000000),
})
req, _ := http.NewRequestWithContext(ctx, "POST",
    BASE_URL+"/v1/tenants/"+TENANT+"/rows/kb.articles",
    bytes.NewReader(body))
req.Header.Set("Authorization", "Bearer "+OC_TOKEN)
req.Header.Set("Content-Type",  "application/json")

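resp, err := http.DefaultClient.Do(req)
if err != nil {
    // resp is nil on error, so return before touching resp.Body.
    // Assumes the enclosing function returns an error.
    return err
}
defer resp.Body.Close()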

call 2 of 3 - the embedding (title + body concatenated)

Embed both fields together. A title-only embedding misses everything the body says, and that's where most of the meaningful tokens live.

POST /v1/tenants/:t/vector/kb.articles/put

curl -X POST "$ORIGINCHAIN_URL/v1/tenants/$T/vector/kb.articles/put" \
  -H "Authorization: Bearer $OC_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "id":        "kb-2026-001",
    "embedding": [0.0211, -0.0612, 0.0341, /* ... 768 floats ... */],
    "dim":       768,
    "metric":    "cosine"
  }'

# Embed the title and body together so semantic search hits both.
text = f"{title}\n\n{body}"
embedding_768d = embed(text)              # any embedding model

db.vector.put(
    "kb.articles",
    "kb-2026-001",
    embedding_768d,
)

// Embed title + body together. embedding768d is your number[] of length 768.
await db.vectorPut("kb.articles", {
  id:        "kb-2026-001",
  embedding: embedding768d,
  dim:       768,
  metric:    "cosine",
});

// Embed title + body together. embedding768d is your []float32 of length 768.
err := db.VectorPut(ctx, "kb.articles", originchain.VectorPutRequest{
    ID:        "kb-2026-001",
    Embedding: embedding768d,
    Dim:       768,
    Metric:    "cosine",
})
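
The curl body above elides the vector with a comment, which strict JSON does not allow, so a real request needs the full array. A minimal sketch of building that body follows. The field names come from the curl example; `build_vector_put` itself is a hypothetical helper, and the length and finiteness checks are local guesses at what is worth verifying, not documented server behavior.

```python
import json
import math


def build_vector_put(doc_id, embedding, dim=768, metric="cosine"):
    """Build the JSON body for the vector put call, checking the vector first."""
    if len(embedding) != dim:
        raise ValueError(f"expected {dim} floats, got {len(embedding)}")
    if not all(math.isfinite(x) for x in embedding):
        raise ValueError("embedding contains NaN or infinity")
    return json.dumps(
        {"id": doc_id, "embedding": list(embedding), "dim": dim, "metric": metric}
    )


# A zero vector stands in for real model output here.
payload = build_vector_put("kb-2026-001", [0.0] * 768)
```

The resulting string can be passed to curl with `-d "$PAYLOAD"` or sent as the request body from any HTTP client.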

call 3 of 3 - the keyword index (body only)

Index the body for BM25 keyword search. Most help-center queries are keyword-shaped ("install on Windows"), so this is the workhorse.

POST /v1/tenants/:t/fts/kb.articles/index

curl -X POST "$ORIGINCHAIN_URL/v1/tenants/$T/fts/kb.articles/index" \
  -H "Authorization: Bearer $OC_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "field":  "body",
    "doc_id": "kb-2026-001",
    "text":   "Each shape (row, vector, FTS, graph) has its own endpoint. Every call is atomic individually. Idempotency keys make retries safe."
  }'

db.fts.index(
    "kb.articles",
    "body",
    doc_id="kb-2026-001",
    text=body,
)

await db.ftsIndex("kb.articles", {
  field:  "body",
  docId:  "kb-2026-001",
  text:   body,
});

err := db.FTSIndex(ctx, "kb.articles", originchain.FTSIndexRequest{
    Field: "body",
    DocID: "kb-2026-001",
    Text:  body,
})

about atomicity

The three calls are separate; there is no single "write everything" endpoint, and each call is atomic only by itself. The SDKs auto-attach an Idempotency-Key to every mutating call, which makes retries safe. If the FTS call fails after the row and vector writes succeeded, retry only the FTS call. Re-sending the row write would not create a duplicate, but it is unnecessary.
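
The retry pattern can be sketched as follows. Everything here is illustrative: `TransientError`, `run_with_retry`, and `write_article` are hypothetical names, and the fake steps stand in for the real row, vector, and FTS calls. The sketch generates one key per logical write and reuses it on every retry of that write, which is the usual idempotency-key convention. Whether the server accepts a caller-supplied Idempotency-Key on raw HTTP requests is an assumption.

```python
import uuid


class TransientError(Exception):
    """Hypothetical marker for failures worth retrying (timeouts, 5xx)."""


def run_with_retry(call, max_attempts=3):
    """Run one write, reusing the same idempotency key on every attempt."""
    key = str(uuid.uuid4())
    for attempt in range(1, max_attempts + 1):
        try:
            return call(key)
        except TransientError:
            if attempt == max_attempts:
                raise


def write_article(steps, max_attempts=3):
    """Run (name, call) steps in order; only a failing step is retried."""
    completed = []
    for name, call in steps:
        run_with_retry(call, max_attempts)
        completed.append(name)
    return completed


# Demo with fake calls: the FTS step times out once, then succeeds.
log = []


def fake_step(name, failures=0):
    remaining = [failures]

    def call(idempotency_key):
        log.append((name, idempotency_key))
        if remaining[0] > 0:
            remaining[0] -= 1
            raise TransientError(f"{name} timed out")

    return call


completed = write_article([
    ("row", fake_step("row")),
    ("vector", fake_step("vector")),
    ("fts", fake_step("fts", failures=1)),
])
```

In the demo the row and vector steps each run once, and only the FTS step runs twice, both times with the same key.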

common mistakes

Embedding only the title. A title carries only a small fraction of an article's meaning. Concatenate title + body before embedding so semantic similarity can match on body content.
Indexing the title in FTS but not the body. The opposite mistake. The body is where the searchable keywords live.
Forgetting to re-index on update. If you edit an article, you have to re-put the row, re-put the vector, and re-index the FTS field. None of the three rides along with the others.
Embedding huge bodies as one vector. Past ~1k tokens, semantic similarity gets muddy. For long articles, chunk the body, write one row per chunk with a parent article_id, and embed each chunk separately.
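
The chunking advice in the last item can be sketched like this. It is a rough illustration under stated assumptions: whitespace-separated words approximate tokens (a real tokenizer will count differently), and the `id` format and the `chunk_index` column are hypothetical. Only the parent `article_id` comes from the text above, and the schema would need matching columns.

```python
def chunk_article(article_id, body, max_tokens=1000):
    """Split a body into chunk rows of at most max_tokens words each."""
    words = body.split()
    chunks = []
    for start in range(0, len(words), max_tokens):
        index = len(chunks)
        chunks.append({
            "id": f"{article_id}#{index}",  # hypothetical chunk id format
            "article_id": article_id,        # parent article
            "chunk_index": index,            # hypothetical ordering column
            "body": " ".join(words[start:start + max_tokens]),
        })
    return chunks


# A 2500-word body yields chunks of 1000, 1000, and 500 words.
long_body = " ".join(f"w{i}" for i in range(2500))
chunks = chunk_article("kb-2026-001", long_body)
```

Each chunk row would then get its own vector put and FTS index call, keyed by the chunk's `id`.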