Vector search
Vector search finds rows whose embedding (a list of numbers from a model) is closest to a query embedding. It's how semantic search, RAG, recommendation systems, and image-similarity all work under the hood.
This page is the query reference. For how to save a vector, see Insert → vector. For how to declare vector columns on a schema, see Schemas → vector fields. All examples assume you have a client set up - see Quickstart.
1. Top-k search.
Given a query embedding, return the top k rows whose stored embeddings are closest to it.
curl -X POST "https://$OC_HOST/v1/tenants/$OC_TENANT/vector/shop.products/topk" \
-H "Authorization: Bearer $OC_TOKEN" \
-H "Content-Type: application/json" \
-d '{
"query": [0.011, -0.082, 0.046, /* ... 768 floats ... */],
"k": 10,
"dim": 768,
"metric": "cosine"
}'

# query_768d is your query embedding - any list of 768 floats.
hits = db.vector_topk(
"shop.products",
query=query_768d,
k=10,
dim=768,
metric="cosine",
)
for hit in hits:
    print(hit.id, hit.score)

const hits = await db.vectorTopk("shop.products", {
query: query768d, // number[] of length 768
k: 10,
dim: 768,
metric: "cosine",
});
for (const hit of hits) {
  console.log(hit.id, hit.score);
}

hits, err := db.VectorTopK(ctx, "shop.products", originchain.VectorTopKRequest{
Query: query768d, // []float32 of length 768
K: 10,
Dim: 768,
Metric: "cosine",
})
if err != nil { /* handle */ }
for _, h := range hits {
	fmt.Println(h.ID, h.Score)
}

| Field | Type | Required | What it is |
|---|---|---|---|
| query | float[] | yes | The query embedding. Length must match the column's dim. |
| k | int | yes | How many results to return. Typical values: 10, 50, 100. |
| dim | int | yes | The vector's length. Must match the table's configured dimension. |
| metric | string | no | How "closeness" is measured. Must match what was used at insert time. See distance metric. |
| filter | object | no | Metadata filter. See filter by metadata. |
| mode | string | no | "high_recall" (default) or "fast". See speed vs recall. |
The response is an array of { id, score } objects, ordered from closest to farthest. The id is the row's primary key (so you can look up the full row); score tells you how close the match is.
[
{ "id": "sku-9281", "score": 0.92 },
{ "id": "sku-4017", "score": 0.88 },
{ "id": "sku-3320", "score": 0.85 },
...
]

Score ordering depends on the metric. With cosine and dot, higher is closer. With L2, lower is closer. Results always come back in correct ranking order - you do not need to sort yourself.
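To make the ordering rule concrete, here is a minimal brute-force sketch in plain Python. It is not how the service computes results; the helper names (`brute_force_topk`, `SCORERS`, `HIGHER_IS_CLOSER`) and the toy rows are illustrative assumptions, and it assumes the score for l2 is the raw distance.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def cosine(a, b):
    return dot(a, b) / (math.sqrt(dot(a, a)) * math.sqrt(dot(b, b)))

def l2(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

# Hypothetical lookup tables for this sketch only.
SCORERS = {"cosine": cosine, "dot": dot, "l2": l2}
# True: a higher score is closer. False: a lower score is closer.
HIGHER_IS_CLOSER = {"cosine": True, "dot": True, "l2": False}

def brute_force_topk(rows, query, k, metric="cosine"):
    """Score every row against the query and return the k closest (id, score) pairs."""
    score = SCORERS[metric]
    scored = [(row_id, score(query, vec)) for row_id, vec in rows.items()]
    # Sort descending for similarity metrics, ascending for distances.
    scored.sort(key=lambda pair: pair[1], reverse=HIGHER_IS_CLOSER[metric])
    return scored[:k]

rows = {
    "sku-a": [1.0, 0.0],
    "sku-b": [0.6, 0.8],
    "sku-c": [0.0, 1.0],
}
query = [1.0, 0.0]

print(brute_force_topk(rows, query, k=2, metric="cosine"))  # scores fall down the list
print(brute_force_topk(rows, query, k=2, metric="l2"))      # scores rise down the list
```

Both calls rank sku-a first and sku-b second; only the direction in which the score changes down the list differs.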
- Wrong dim. If your query vector is 1536 floats but the table was set up for 768, you get 400 dim_mismatch.
- Wrong metric. The metric must match the one used when the table was first populated. Mixing returns 400 metric_mismatch.
- Empty results from a brand-new table. If you just created the table and the index hasn't built yet (very rare - usually instant), the first query can return fewer than k hits. Retry once.
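Two of the cases above can be handled on the client before or around the call. The sketch below is one possible shape, assuming nothing about the client library: `check_query` and `search_with_one_retry` are hypothetical helpers, and `search` is any zero-argument callable you supply that runs the query.

```python
from typing import Callable, List, Sequence

def check_query(query: Sequence[float], dim: int) -> None:
    """Fail locally if the query length does not match the table's dim."""
    if len(query) != dim:
        raise ValueError(f"query has {len(query)} floats, table expects {dim}")

def search_with_one_retry(search: Callable[[], List], k: int) -> List:
    """Run search(); if it returns fewer than k hits, run it exactly once more."""
    hits = search()
    if len(hits) < k:
        hits = search()
    return hits
```

The retry is only meaningful for unfiltered queries on a table that holds at least k rows; with a selective filter, fewer than k hits is the expected answer.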
2. Filter by metadata.
Restrict the search to vectors whose metadata matches a filter. Useful for things like "find similar products but only in the shoes category".
curl -X POST "https://$OC_HOST/v1/tenants/$OC_TENANT/vector/shop.products/topk" \
-H "Authorization: Bearer $OC_TOKEN" \
-H "Content-Type: application/json" \
-d '{
"query": [/* 768 floats */],
"k": 10,
"dim": 768,
"metric": "cosine",
"filter": { "category": "running-shoes" }
}'

hits = db.vector_topk(
"shop.products",
query=query_768d,
k=10,
dim=768,
metric="cosine",
filter={"category": "running-shoes"},
)

const hits = await db.vectorTopk("shop.products", {
query: query768d,
k: 10,
dim: 768,
metric: "cosine",
filter: { category: "running-shoes" },
});

hits, err := db.VectorTopK(ctx, "shop.products", originchain.VectorTopKRequest{
Query: query768d,
K: 10,
Dim: 768,
Metric: "cosine",
Filter: map[string]any{"category": "running-shoes"},
})

Filters use exact equality on metadata fields you stored at insert time. The filter is applied during the search, not after - so a highly selective filter (e.g., only 1% of rows match) is still fast.
- Filtering on a field you didn't store. The filter looks at the metadata object you passed at insert time - not at the row's other columns. If you want to filter on a column, include it in metadata.
- Range filters. Only exact equality is supported today (category = "shoes"). For range filters (price < 100), filter the result in your app after the search returns.
- Very selective filters returning fewer than k hits. If only 5 rows match your filter, you get 5 hits even if you asked for 50. That's not a bug - it's all there is.
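The range-filter workaround can be sketched in a few lines of Python. This is an illustration under assumptions: hits are shown as plain dicts shaped like the JSON response, `rows_by_id` stands in for however you look up full rows, and `post_filter_by_price` is a hypothetical helper. Because post-filtering discards hits, it usually helps to request more than k from the search and trim afterwards.

```python
def post_filter_by_price(hits, rows_by_id, max_price, k):
    """Keep the first k hits whose row price is below max_price.

    hits must already be ordered closest-first, as returned by the search.
    """
    kept = []
    for hit in hits:
        row = rows_by_id.get(hit["id"])
        if row is not None and row["price"] < max_price:
            kept.append(hit)
        if len(kept) == k:
            break
    return kept

# Pretend the search was asked for 4 hits so that 2 can survive the filter.
hits = [
    {"id": "sku-1", "score": 0.92},
    {"id": "sku-2", "score": 0.88},
    {"id": "sku-3", "score": 0.85},
    {"id": "sku-4", "score": 0.80},
]
rows_by_id = {
    "sku-1": {"price": 140.0},
    "sku-2": {"price": 79.0},
    "sku-3": {"price": 95.0},
    "sku-4": {"price": 60.0},
}

cheap = post_filter_by_price(hits, rows_by_id, max_price=100.0, k=2)
print([h["id"] for h in cheap])  # ['sku-2', 'sku-3']
```

The surviving hits keep their original ranking order, so no re-sort is needed.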
3. Distance metric.
The metric picks the math used to compare two vectors. You set the metric when you create the table - the search has to use the same one.
| Metric | Use it when |
|---|---|
| cosine | Default for text. Use this with OpenAI, Cohere, Voyage, BGE, E5, and most other text embedding models. Looks at the angle between two vectors - magnitude doesn't matter. |
| dot | Use when your vectors are already unit-normalised (length 1). Slightly cheaper than cosine for the same result. Some image-embedding pipelines emit normalised vectors. |
| l2 | Use when the absolute distance between vectors matters - some image-feature pipelines, certain audio embeddings, and a few specialty research models. |
| manhattan | Same as L2 but uses absolute differences instead of squared ones. More forgiving of one or two big-difference dimensions. Niche. |
Not sure? Pick cosine. It works for 90% of text embeddings.
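The claims in the table can be checked with a little arithmetic. The sketch below uses plain Python and toy 2-d vectors; the function names are illustrative and nothing here is part of the client library.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def norm(a):
    return math.sqrt(dot(a, a))

def cosine(a, b):
    return dot(a, b) / (norm(a) * norm(b))

def normalise(a):
    n = norm(a)
    return [x / n for x in a]

def l2(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def manhattan(a, b):
    return sum(abs(x - y) for x, y in zip(a, b))

a = [3.0, 4.0]
b = [4.0, 3.0]

# Cosine only looks at the angle: scaling a vector does not change it.
scaled_a = [10 * x for x in a]
assert math.isclose(cosine(a, b), cosine(scaled_a, b))

# For unit-length vectors, dot and cosine give the same number.
ua, ub = normalise(a), normalise(b)
assert math.isclose(dot(ua, ub), cosine(a, b))

# l2 and manhattan are distances, and both do change with magnitude.
assert l2(scaled_a, b) > l2(a, b)
assert manhattan(scaled_a, b) > manhattan(a, b)
```

This is why dot is only a safe substitute for cosine when the stored and query vectors are both normalised.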
4. Speed vs recall.
Approximate nearest-neighbor search trades a small amount of accuracy for huge speed gains. The mode field picks where on that trade-off you want to land.
# Add "mode" to the topk body. "high_recall" (default) or "fast".
curl -X POST "https://$OC_HOST/v1/tenants/$OC_TENANT/vector/shop.products/topk" \
-H "Authorization: Bearer $OC_TOKEN" \
-H "Content-Type: application/json" \
-d '{
"query": [/* 768 floats */],
"k": 10,
"dim": 768,
"metric": "cosine",
"mode": "fast"
}'

hits = db.vector_topk(
"shop.products",
query=query_768d,
k=10,
dim=768,
metric="cosine",
mode="fast", # "fast" | "high_recall" (default)
)

const hits = await db.vectorTopk("shop.products", {
query: query768d,
k: 10,
dim: 768,
metric: "cosine",
mode: "fast", // "fast" | "high_recall"
});

hits, err := db.VectorTopK(ctx, "shop.products", originchain.VectorTopKRequest{
Query: query768d,
K: 10,
Dim: 768,
Metric: "cosine",
Mode: originchain.ModeFast, // ModeFast | ModeHighRecall
})

| Mode | What you get | Use when |
|---|---|---|
| high_recall (default) | ~96% of the truly-closest vectors. Higher latency. | Product search, similar-customer lookup, anywhere first-pass accuracy matters. |
| fast | ~70% of the truly-closest vectors. ~3x faster. | RAG with a re-ranker, hot dashboards, anywhere latency dominates. |
"Recall" means: of the truly closest vectors that a brute-force search would return, what fraction did the index find? Both modes return their hits in correct ranking order - the difference is how often a true top-k neighbor is missed.
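If you want to measure recall on your own data, the definition above translates directly into a small function. This is a generic sketch, not a built-in feature: `recall_at_k` is a hypothetical helper, and it assumes you can obtain the exact top-k ids separately, for example from a brute-force pass over a sample.

```python
def recall_at_k(approx_ids, exact_ids):
    """Fraction of the exact top-k ids that also appear in the approximate result."""
    exact = set(exact_ids)
    if not exact:
        raise ValueError("exact_ids must not be empty")
    return len(set(approx_ids) & exact) / len(exact)

# Toy example: the index found 7 of the 10 true nearest neighbors.
exact_ids = [f"sku-{i}" for i in range(10)]
approx_ids = [f"sku-{i}" for i in range(7)] + ["sku-x", "sku-y", "sku-z"]

print(recall_at_k(approx_ids, exact_ids))  # 0.7
```

Averaging this value over a few hundred representative queries gives a more stable picture than any single query.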
5. Index choice.
OriginChain supports several different index types for vector data. Most users should stick with the default (HNSW). The other types are for very large or memory-constrained corpora.
| Index | Pick when |
|---|---|
| HNSW (default) | Up to ~10M vectors per table. Best accuracy. The right choice for almost everyone. |
| IVF | Above ~10M vectors. Cheaper memory at the cost of a small recall hit. See IVF reference. |
| IVF-PQ | Above ~50M vectors or memory-constrained. Compresses vectors 64×-768×. See IVF-PQ reference. |
| Binary quantization | When you need 32× memory savings and can tolerate the recall hit. See Quantization reference. |
| Sparse vectors | For models like SPLADE / uniCOIL that emit sparse vectors instead of dense ones. |
The index is picked when you declare the vector column on the schema. See Schemas → vector fields for the declaration syntax.
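A rough memory estimate can help when weighing these options. The arithmetic below is a back-of-the-envelope sketch under stated assumptions: vectors are stored as 4-byte floats (the Go example uses []float32), binary quantization keeps one bit per dimension, and index overhead such as graph links is ignored. The function names are illustrative, and the service's actual storage layout may differ.

```python
def raw_vector_bytes(num_vectors: int, dim: int, bytes_per_value: int = 4) -> int:
    """Bytes needed for uncompressed vectors, assuming float32 values by default."""
    return num_vectors * dim * bytes_per_value

def binary_quantized_bytes(num_vectors: int, dim: int) -> int:
    """Bytes needed at 1 bit per dimension, rounded up to whole bytes per vector."""
    return num_vectors * ((dim + 7) // 8)

num_vectors = 10_000_000
dim = 768

raw = raw_vector_bytes(num_vectors, dim)
bq = binary_quantized_bytes(num_vectors, dim)

print(f"raw float32: {raw / 1e9:.2f} GB")  # raw float32: 30.72 GB
print(f"binary:      {bq / 1e9:.2f} GB")   # binary:      0.96 GB
print(f"savings:     {raw / bq:.0f}x")     # savings:     32x
```

Under these assumptions, 10M vectors at 768 dimensions need about 30.72 GB before any index overhead, which is roughly where the table suggests looking beyond the default.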