4. Phrase search

what this does

Returns documents whose description contains the exact phrase "noise cancellation": both tokens, in that order, with no other token between them. Unlike a boolean AND query, position matters.

when to use it

Branded names and product models: "Bose QC45", "Sony WH-1000XM5".
Multi-word concepts where the order carries the meaning: "connection refused", "out of memory".
Log-template matching: searching for the fixed, literal part of an emitted log line.

the request

GET /v1/tenants/:t/fts/:schema/:field?q=...&mode=phrase

curl -G "https://$OC_HOST/v1/tenants/$OC_TENANT/fts/shop.products/description" \
  -H "Authorization: Bearer $OC_TOKEN" \
  --data-urlencode "q=noise cancellation" \
  --data-urlencode "mode=phrase"

hits = db.fts.search(
    "shop.products",
    "description",
    q="noise cancellation",
    mode="phrase",
)
for doc_id in hits.doc_ids:
    print(doc_id)

const hits = await db.ftsSearch("shop.products", "description", {
  q: "noise cancellation",
  mode: "phrase",
});
for (const docId of hits.doc_ids) {
  console.log(docId);
}

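hits, err := db.FTSSearch(ctx, "shop.products", "description", originchain.FTSSearchRequest{
    Q:    "noise cancellation",
    Mode: "phrase",
})
if err != nil {
    // Don't discard the error: hits is not usable if the request failed.
    log.Fatalf("phrase search failed: %v", err)
}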
for _, docID := range hits.DocIDs {
    fmt.Println(docID)
}

what you get back

{
  "mode": "phrase",
  "doc_ids": ["p001"]
}

how it works

The query is tokenised the same way as the indexed text.
The engine pulls the posting list for each token, along with the token's positions inside each document.
A doc matches only if the positions are consecutive: position(t[i+1]) = position(t[i]) + 1 for every adjacent pair of query tokens t[i], t[i+1].
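
The three steps above can be sketched in a few lines of Python. This is a toy in-memory model, not the engine's actual implementation: the helpers tokenise, build_index and phrase_search and the sample documents are hypothetical, and the analyser is assumed to do nothing more than lowercase the text and split on non-alphanumeric characters.

```python
import re
from collections import defaultdict


def tokenise(text):
    # Assumed analyser: lowercase, split on anything that is not a letter or digit.
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(docs):
    # Positional index: token -> doc_id -> list of positions of that token.
    index = defaultdict(lambda: defaultdict(list))
    for doc_id, text in docs.items():
        for pos, token in enumerate(tokenise(text)):
            index[token][doc_id].append(pos)
    return index


def phrase_search(index, query):
    # Step 1: tokenise the query the same way as the indexed text.
    tokens = tokenise(query)
    if not tokens:
        return []
    # Step 2: a candidate doc must appear in every token's posting list.
    candidates = set(index.get(tokens[0], {}))
    for token in tokens[1:]:
        candidates &= set(index.get(token, {}))
    hits = []
    for doc_id in sorted(candidates):
        # Step 3: try each occurrence of the first token as the phrase start
        # and require token i to sit at position start + i.
        for start in index[tokens[0]][doc_id]:
            if all(start + i in index[t][doc_id] for i, t in enumerate(tokens)):
                hits.append(doc_id)
                break
    return hits


# Hypothetical sample data; all three docs contain both tokens.
docs = {
    "p001": "Headphones with active noise cancellation",
    "p002": "Cancellation policy: no noise complaints",
    "p003": "Noise reduction and echo cancellation",
}
index = build_index(docs)
print(phrase_search(index, "noise cancellation"))  # ['p001']
```

A boolean AND over the same data would return all three documents. Only p001 has the two tokens at consecutive positions, so it is the only phrase match.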

common mistakes

Stemming changes the match. If the analyser stems cancellation to cancel, then "noise cancellation" and "noise cancel" match the same docs. Know which analyser the field uses.
Punctuation between tokens. Most analysers strip punctuation, so "noise, cancellation" in the source text still phrase-matches q=noise cancellation.
Expecting fuzzy behaviour. Phrase mode is strict: one typo in the query and the phrase doesn't match. Use BM25 with fuzzy=1 if you need typo tolerance.
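
To see the three pitfalls side by side, here is a minimal sketch that assumes a toy analyser. The analyse and phrase_match helpers and the STEMS lookup table are hypothetical stand-ins for whatever analyser the field is actually configured with; a real stemmer applies general rules, not a lookup table.

```python
import re

# Hypothetical stem table standing in for a real stemmer.
STEMS = {"cancellation": "cancel", "cancelling": "cancel"}


def analyse(text, stem=False):
    # Lowercase and drop punctuation; optionally map tokens to their stems.
    tokens = re.findall(r"[a-z0-9]+", text.lower())
    if stem:
        tokens = [STEMS.get(t, t) for t in tokens]
    return tokens


def phrase_match(doc_text, query, stem=False):
    # True if the analysed query occurs as a contiguous run in the analysed doc.
    doc = analyse(doc_text, stem)
    q = analyse(query, stem)
    n = len(q)
    return n > 0 and any(doc[i:i + n] == q for i in range(len(doc) - n + 1))


# Stemming: "noise cancel" only matches when the analyser stems.
print(phrase_match("active noise cancellation", "noise cancel"))             # False
print(phrase_match("active noise cancellation", "noise cancel", stem=True))  # True

# Punctuation: the comma is stripped, so the tokens are still adjacent.
print(phrase_match("Noise, cancellation built in", "noise cancellation"))    # True

# Typos: "cancelation" is a different token, so the phrase does not match.
print(phrase_match("active noise cancellation", "noise cancelation"))        # False
```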