Vector search

Vector search finds the rows whose embedding (a fixed-length list of numbers produced by a model) is closest to a query embedding. It underpins semantic search, RAG, recommendation systems, and image-similarity search.

This page is the query reference. For how to save a vector, see Insert → vector. For how to declare vector columns on a schema, see Schemas → vector fields. All examples assume you have a client set up - see Quickstart.

1. Top-k search.

what this does

Given a query embedding, return the top k rows whose stored embeddings are closest to it.

POST /v1/tenants/:t/vector/:table/topk

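# The /* ... */ placeholder stands for your 768 floats. JSON has no comments, so replace it before sending.
curl -X POST "https://$OC_HOST/v1/tenants/$OC_TENANT/vector/shop.products/topk" \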
  -H "Authorization: Bearer $OC_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "query":  [0.011, -0.082, 0.046, /* ... 768 floats ... */],
    "k":      10,
    "dim":    768,
    "metric": "cosine"
  }'

# query_768d is your query embedding - any list of 768 floats.
hits = db.vector_topk(
    "shop.products",
    query=query_768d,
    k=10,
    dim=768,
    metric="cosine",
)
for hit in hits:
    print(hit.id, hit.score)

const hits = await db.vectorTopk("shop.products", {
  query:  query768d,      // number[] of length 768
  k:      10,
  dim:    768,
  metric: "cosine",
});
for (const hit of hits) {
  console.log(hit.id, hit.score);
}

hits, err := db.VectorTopK(ctx, "shop.products", originchain.VectorTopKRequest{
    Query:  query768d,      // []float32 of length 768
    K:      10,
    Dim:    768,
    Metric: "cosine",
})
if err != nil { /* handle */ }
for _, h := range hits {
    fmt.Println(h.ID, h.Score)
}

what each field means

Field	Type	Required	What it is
query	float[]	yes	The query embedding. Length must match the column's `dim`.
k	int	yes	How many results to return. Typical values: 10, 50, 100.
dim	int	yes	The vector's length. Must match the table's configured dimension.
metric	string	no	How "closeness" is measured. Must match what was used at insert time. See distance metric.
filter	object	no	Metadata filter. See filter by metadata.
mode	string	no	`"high_recall"` (default) or `"fast"`. See speed vs recall.

what you get back

An array of { id, score } objects, ordered from closest to farthest. The id is the row's primary key (so you can look up the full row); score tells you how close the match is.

[
  { "id": "sku-9281", "score": 0.92 },
  { "id": "sku-4017", "score": 0.88 },
  { "id": "sku-3320", "score": 0.85 },
  ...
]

Score ordering depends on the metric. With cosine and dot, higher is closer. With l2 (and manhattan, which is also a distance), lower is closer. Results always come back in correct ranking order - you do not need to sort them yourself.
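
If you apply a score cut-off in your own code, the comparison has to flip with the metric. A minimal Python sketch, assuming hits are plain dicts shaped like the JSON above. `keep_close_hits` and the threshold are hypothetical and not part of the client, and treating manhattan like l2 is an assumption based on both being distances:

```python
# Direction of "closer" per metric. cosine, dot, and l2 follow the text above;
# manhattan is ASSUMED to behave like l2 because it is also a distance.
HIGHER_IS_CLOSER = {"cosine": True, "dot": True, "l2": False, "manhattan": False}


def keep_close_hits(hits, metric, threshold):
    """Return the hits at least as close as `threshold`, preserving their order."""
    if metric not in HIGHER_IS_CLOSER:
        raise ValueError(f"unknown metric: {metric!r}")
    if HIGHER_IS_CLOSER[metric]:
        return [h for h in hits if h["score"] >= threshold]
    return [h for h in hits if h["score"] <= threshold]


cosine_hits = [
    {"id": "sku-9281", "score": 0.92},
    {"id": "sku-4017", "score": 0.88},
    {"id": "sku-3320", "score": 0.85},
]
print([h["id"] for h in keep_close_hits(cosine_hits, "cosine", 0.87)])
# prints ['sku-9281', 'sku-4017']
```

A threshold tuned for one metric does not carry over to another, so treat the value as something to calibrate per table.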

common mistakes

Wrong dim. If your query vector is 1536 floats but the table was set up for 768, you get 400 dim_mismatch.
Wrong metric. The metric must match the one used when the table was first populated. Querying with a different metric returns 400 metric_mismatch.
Empty or short results from a brand-new table. If you just created the table and the index hasn't built yet (very rare - usually instant), the first query can return fewer than k hits. Retry once.
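
Two of these mistakes can be guarded against on the client side. The sketch below is one way to do it, not part of the client library: `check_dim` and `search_with_one_retry` are hypothetical helpers, the half-second delay is an arbitrary choice, and `search` stands for any zero-argument callable that performs the top-k request.

```python
import time


def check_dim(query, dim):
    """Fail locally, before the request is sent, if the lengths disagree."""
    if len(query) != dim:
        raise ValueError(f"query has {len(query)} floats but dim is {dim}")


def search_with_one_retry(search, k, delay_s=0.5):
    """Call search(); if it returns fewer than k hits, wait and call it once more."""
    hits = search()
    if len(hits) < k:
        time.sleep(delay_s)
        hits = search()
    return hits


# Stand-in for a real call such as db.vector_topk(...): empty first, then one hit.
responses = [[], [{"id": "sku-9281", "score": 0.92}]]
check_dim([0.0] * 768, 768)
hits = search_with_one_retry(lambda: responses.pop(0), k=1, delay_s=0)
print(hits)
```

A table that holds fewer than k rows will also trigger the retry. That costs one extra request and does not change the result.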

2. Filter by metadata.

what this does

Restrict the search to vectors whose metadata matches a filter. Useful for things like "find similar products but only in the shoes category".

POST /v1/tenants/:t/vector/:table/topk

curl -X POST "https://$OC_HOST/v1/tenants/$OC_TENANT/vector/shop.products/topk" \
  -H "Authorization: Bearer $OC_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "query":  [/* 768 floats */],
    "k":      10,
    "dim":    768,
    "metric": "cosine",
    "filter": { "category": "running-shoes" }
  }'

hits = db.vector_topk(
    "shop.products",
    query=query_768d,
    k=10,
    dim=768,
    metric="cosine",
    filter={"category": "running-shoes"},
)

const hits = await db.vectorTopk("shop.products", {
  query:  query768d,
  k:      10,
  dim:    768,
  metric: "cosine",
  filter: { category: "running-shoes" },
});

hits, err := db.VectorTopK(ctx, "shop.products", originchain.VectorTopKRequest{
    Query:  query768d,
    K:      10,
    Dim:    768,
    Metric: "cosine",
    Filter: map[string]any{"category": "running-shoes"},
})

Filters use exact equality on metadata fields you stored at insert time. The filter is applied during the search, not after - so a highly selective filter (e.g., only 1% of rows match) is still fast.

common mistakes

Filtering on a field you didn't store. The filter looks at the metadata object you passed at insert time - not at the row's other columns. If you want to filter on a column, include it in metadata.
Range filters. Only exact equality is supported today (category = "shoes"). For range filters (price < 100), filter the results in your app after the search returns, and ask for a larger k so that enough hits survive the cut.
Very selective filters returning fewer than k hits. If only 5 rows match your filter, you get 5 hits even if you asked for 50. That's not a bug - it's all there is.
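
The range-filter workaround can be sketched as follows. This is an illustration under assumptions: `post_filter` is a hypothetical helper, `rows_by_id` stands for however you look up full rows by primary key, and the prices are made-up sample data.

```python
def post_filter(hits, rows_by_id, predicate, k):
    """Keep hits whose full row satisfies `predicate`, in rank order, up to k."""
    kept = []
    for hit in hits:
        row = rows_by_id.get(hit["id"])
        if row is not None and predicate(row):
            kept.append(hit)
            if len(kept) == k:
                break
    return kept


# Sample hits, as if the search had been asked for more than the 2 we need.
hits = [
    {"id": "sku-1", "score": 0.95},
    {"id": "sku-2", "score": 0.91},
    {"id": "sku-3", "score": 0.90},
    {"id": "sku-4", "score": 0.86},
]
rows_by_id = {
    "sku-1": {"price": 140},
    "sku-2": {"price": 80},
    "sku-3": {"price": 95},
    "sku-4": {"price": 60},
}
cheap = post_filter(hits, rows_by_id, lambda row: row["price"] < 100, k=2)
print([h["id"] for h in cheap])
# prints ['sku-2', 'sku-3']
```

Because the search already returns hits in ranking order, the helper only drops rows and never reorders them.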

3. Distance metric.

what this does

Picks the math used to compare two vectors. You set the metric when you create the table - the search has to use the same one.

pick by your model

Metric	Use it when
cosine	Default for text. Use this with OpenAI, Cohere, Voyage, BGE, E5, and most other text embedding models. Looks at the angle between two vectors - magnitude doesn't matter.
dot	Use when your vectors are already unit-normalised (length 1). On unit vectors it produces the same ranking as cosine and is slightly cheaper to compute. Some image-embedding pipelines emit normalised vectors.
l2	Use when the absolute distance between vectors matters - some image-feature pipelines, certain audio embeddings, and a few specialty research models.
manhattan	Like L2, but sums the absolute per-dimension differences instead of squaring them, so one or two dimensions with large differences dominate the distance less. Niche.

Not sure? Pick cosine. It is the right choice for the large majority of text embedding models.
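
For intuition, here are the four metrics written out in plain Python. These are the standard textbook definitions. Whether the service's score uses exactly these forms (for example, squared versus unsquared L2) is not stated here, so treat the functions as an illustration and not as the server's implementation.

```python
import math


def dot(a, b):
    """Dot product: sum of element-wise products."""
    return sum(x * y for x, y in zip(a, b))


def norm(a):
    """Euclidean length of a vector."""
    return math.sqrt(dot(a, a))


def cosine(a, b):
    """Cosine similarity: dot product divided by both lengths."""
    return dot(a, b) / (norm(a) * norm(b))


def l2(a, b):
    """Euclidean distance: square root of summed squared differences."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def manhattan(a, b):
    """Manhattan (L1) distance: summed absolute differences."""
    return sum(abs(x - y) for x, y in zip(a, b))


def normalise(a):
    """Scale a vector to length 1."""
    n = norm(a)
    return [x / n for x in a]


a, b = [3.0, 4.0], [4.0, 3.0]
doubled = [2 * x for x in a]
print(round(cosine(a, b), 6), round(cosine(doubled, b), 6))  # magnitude is ignored
print(dot(a, b), dot(doubled, b))                            # magnitude is not ignored
print(round(dot(normalise(a), normalise(b)), 6))             # matches cosine on unit vectors
print(round(l2(a, b), 6), manhattan(a, b))
```

Doubling `a` leaves the cosine unchanged but doubles the dot product, which is why dot is only a drop-in replacement for cosine when every vector has length 1.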

4. Speed vs recall.

what this does

Approximate nearest-neighbor search trades a small amount of accuracy for huge speed gains. The mode field picks where on that trade-off you want to land.

POST /v1/tenants/:t/vector/:table/topk

# Add "mode" to the topk body. "high_recall" (default) or "fast".
curl -X POST "https://$OC_HOST/v1/tenants/$OC_TENANT/vector/shop.products/topk" \
  -H "Authorization: Bearer $OC_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "query":  [/* 768 floats */],
    "k":      10,
    "dim":    768,
    "metric": "cosine",
    "mode":   "fast"
  }'

hits = db.vector_topk(
    "shop.products",
    query=query_768d,
    k=10,
    dim=768,
    metric="cosine",
    mode="fast",      # "fast" | "high_recall" (default)
)

const hits = await db.vectorTopk("shop.products", {
  query:  query768d,
  k:      10,
  dim:    768,
  metric: "cosine",
  mode:   "fast",     // "fast" | "high_recall"
});

hits, err := db.VectorTopK(ctx, "shop.products", originchain.VectorTopKRequest{
    Query:  query768d,
    K:      10,
    Dim:    768,
    Metric: "cosine",
    Mode:   originchain.ModeFast,    // ModeFast | ModeHighRecall
})

two modes

Mode	What you get	Use when
high_recall (default)	~96% of the truly-closest vectors. Higher latency.	Product search, similar-customer lookup, anywhere first-pass accuracy matters.
fast	~70% of the truly-closest vectors. ~3x faster.	RAG with a re-ranker, hot dashboards, anywhere latency dominates.
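
The recall figures in the table can be checked on your own data by comparing the ids a mode returns against an exact brute-force top k for the same query. A minimal sketch, where `recall_at_k` is a hypothetical helper and the id lists are made-up sample data:

```python
def recall_at_k(approx_ids, exact_ids):
    """Fraction of the exact (brute-force) top-k ids that the index also returned."""
    exact = set(exact_ids)
    if not exact:
        raise ValueError("exact_ids must not be empty")
    return len(exact & set(approx_ids)) / len(exact)


exact_top5 = ["sku-1", "sku-2", "sku-3", "sku-4", "sku-5"]
approx_top5 = ["sku-1", "sku-2", "sku-4", "sku-7", "sku-9"]
print(recall_at_k(approx_top5, exact_top5))  # 3 of 5 found -> 0.6
```

Averaging this value over a sample of real queries gives a per-table estimate you can compare across the two modes.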

"Recall" means: of the truly closest vectors that a brute-force scan would return, what fraction did the index find? Both modes return their hits in correct ranking order - the difference is how often a true top-k neighbor is missing from the list.

5. Index choice.

what this does

OriginChain supports several index types for vector data. Most users should stick with the default (HNSW); the others are for very large or memory-constrained corpora.

Index	Pick when
HNSW (default)	Up to ~10M vectors per table. Best accuracy. The right choice for almost everyone.
IVF	Above ~10M vectors. Cheaper memory at the cost of a small recall hit. See IVF reference.
IVF-PQ	Above ~50M vectors or memory-constrained. Compresses vectors 64×-768×. See IVF-PQ reference.
Binary quantization	When you need 32× memory savings and can tolerate the recall hit. See Quantization reference.
Sparse vectors	For models like SPLADE / uniCOIL that emit sparse vectors instead of dense ones.

The index is picked when you declare the vector column on the schema. See Schemas → vector fields for the declaration syntax.
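
The compression ratios in the table are easier to weigh as absolute numbers. A back-of-envelope sketch, assuming vectors are stored as 4-byte float32 values (the Go client takes []float32, but the storage format is not stated here) and ignoring any index overhead. The function names are illustrative.

```python
BYTES_PER_FLOAT32 = 4  # assumption: one 32-bit float per dimension
GB = 10**9


def raw_bytes(num_vectors, dim):
    """Uncompressed size of the vectors alone, without index structures."""
    return num_vectors * dim * BYTES_PER_FLOAT32


def compressed_bytes(num_vectors, dim, ratio):
    """Size after applying a compression ratio such as 32 (binary quantization)."""
    return raw_bytes(num_vectors, dim) // ratio


print(raw_bytes(10_000_000, 768) / GB)             # 30.72 GB raw for 10M x 768
print(compressed_bytes(10_000_000, 768, 32) / GB)  # 0.96 GB at 32x
print(compressed_bytes(50_000_000, 768, 64) / GB)  # 2.4 GB for 50M x 768 at 64x
```

The 32× figure for binary quantization corresponds to replacing each 32-bit float with a single bit.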