OriginChain docs

4. Phrase search

what this does

Returns documents whose description contains the exact phrase "noise cancellation": both tokens, in that order, with no other token between them. Unlike boolean AND, position matters.

when to use it
  • Branded names and product models: "Bose QC45", "Sony WH-1000XM5".
  • Multi-word concepts where the order carries the meaning: "connection refused", "out of memory".
  • Log-template matching - searching for the literal stem of an emitted log line.
the request
GET /v1/tenants/:t/fts/:schema/:field?q=...&mode=phrase
curl -G "https://$OC_HOST/v1/tenants/$OC_TENANT/fts/shop.products/description" \
  -H "Authorization: Bearer $OC_TOKEN" \
  --data-urlencode "q=noise cancellation" \
  --data-urlencode "mode=phrase"
what you get back
{
  "mode": "phrase",
  "doc_ids": ["p001"]
}
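For callers who prefer a script to curl, the same request and response can be handled with the Python standard library. This is a minimal sketch, not an official client: the function names (build_phrase_request, doc_ids_from_response, search_phrase) are hypothetical, and the response shape is assumed to be exactly the two fields shown above.

```python
import json
from urllib.parse import quote, urlencode
from urllib.request import Request, urlopen


def build_phrase_request(host, tenant, schema, field, phrase, token):
    """Build (but do not send) the phrase-search GET request."""
    # Percent-encode each path segment so unusual names cannot break the URL.
    path = "/v1/tenants/{}/fts/{}/{}".format(
        quote(tenant, safe=""), quote(schema, safe=""), quote(field, safe="")
    )
    # quote_via=quote encodes the space in the phrase as %20.
    query = urlencode({"q": phrase, "mode": "phrase"}, quote_via=quote)
    return Request(
        f"https://{host}{path}?{query}",
        headers={"Authorization": f"Bearer {token}"},
        method="GET",
    )


def doc_ids_from_response(body):
    """Extract doc_ids from a JSON body shaped like the example response."""
    payload = json.loads(body)
    if payload.get("mode") != "phrase":
        raise ValueError(f"unexpected mode: {payload.get('mode')!r}")
    return payload["doc_ids"]


def search_phrase(host, tenant, schema, field, phrase, token):
    """Send the request and return the matching document ids."""
    req = build_phrase_request(host, tenant, schema, field, phrase, token)
    with urlopen(req) as resp:
        return doc_ids_from_response(resp.read())
```

Calling search_phrase(OC_HOST, OC_TENANT, "shop.products", "description", "noise cancellation", OC_TOKEN) with your own values should return a list such as ["p001"] for the data used in this example.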
how it works
  • The query is tokenised the same way as the indexed text.
  • The engine pulls posting lists for each token along with token positions inside each document.
  • A doc matches only if the positions are consecutive: position(t_(i+1)) = position(t_i) + 1 for every adjacent token pair.
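The three steps above can be sketched with a toy positional index. This illustrates the general technique, not the engine's actual implementation: the whitespace tokeniser, the in-memory index layout, and the sample documents p002 and p003 are all assumptions made for the example.

```python
from collections import defaultdict

# Sample documents; only p001 mirrors the response shown earlier.
DOCS = {
    "p001": "Wireless headphones with active noise cancellation",
    "p002": "cancellation policy covers noise complaints",
    "p003": "noise from the fan with no cancellation",
}


def tokenise(text):
    # Hypothetical analyser: lowercase, split on whitespace.
    return text.lower().split()


def build_index(docs):
    """Map token -> {doc_id: [positions of that token in the doc]}."""
    index = defaultdict(dict)
    for doc_id, text in docs.items():
        for pos, tok in enumerate(tokenise(text)):
            index[tok].setdefault(doc_id, []).append(pos)
    return index


def phrase_search(index, phrase):
    # Step 1: tokenise the query the same way as the indexed text.
    tokens = tokenise(phrase)
    if not tokens:
        return []
    # Step 2: pull the posting list (with positions) for each token.
    postings = [index.get(tok, {}) for tok in tokens]
    candidates = set(postings[0])
    for posting in postings[1:]:
        candidates &= set(posting)
    # Step 3: keep docs where some start position is followed by
    # every later token at start + 1, start + 2, and so on.
    hits = []
    for doc_id in sorted(candidates):
        later = [set(posting[doc_id]) for posting in postings[1:]]
        for start in postings[0][doc_id]:
            if all(start + i + 1 in positions for i, positions in enumerate(later)):
                hits.append(doc_id)
                break
    return hits


INDEX = build_index(DOCS)
print(phrase_search(INDEX, "noise cancellation"))
```

All three sample documents contain both tokens, so a boolean AND would return all of them; only p001 has them in adjacent positions and in the queried order.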
common mistakes
  • Stemming changes the match. If the analyser stems cancellation to cancel, then "noise cancellation" and "noise cancel" match the same docs. Know which analyser the field uses.
  • Punctuation between tokens. Most analysers strip punctuation - "noise, cancellation" in the source text still phrase-matches q=noise cancellation.
  • Expecting fuzzy behaviour. Phrase is strict - one typo and the phrase doesn't match. Use BM25 with fuzzy=1 if you need typo tolerance.
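The effect of the analyser on these three cases can be reproduced with a toy example. Everything here is assumed for illustration: the regex tokeniser, the two-entry stem table, and the helper names stand in for whatever analyser the field really uses, and do not describe the engine's behaviour.

```python
import re

# Toy stem table standing in for a real stemmer (assumption).
TOY_STEMS = {"cancellation": "cancel", "cancelling": "cancel"}


def analyse(text, stem=False):
    """Lowercase, drop punctuation, and optionally apply the toy stems."""
    tokens = re.findall(r"[a-z0-9]+", text.lower())
    if stem:
        tokens = [TOY_STEMS.get(tok, tok) for tok in tokens]
    return tokens


def contains_phrase(text, phrase, stem=False):
    """True if the analysed phrase appears as a consecutive run in the text."""
    doc = analyse(text, stem)
    query = analyse(phrase, stem)
    if not query:
        return False
    last_start = len(doc) - len(query)
    return any(doc[i:i + len(query)] == query for i in range(last_start + 1))


# Punctuation is stripped before matching, so the comma does not block the phrase.
print(contains_phrase("Active noise, cancellation built in", "noise cancellation"))
# With stemming, "cancel" and "cancellation" collapse to the same token.
print(contains_phrase("noise cancel mode", "noise cancellation", stem=True))
# A typo produces a different token, so the strict phrase check fails.
print(contains_phrase("active noise cancellation", "noise cancelation"))
```

Running the same query with stem=False against "noise cancel mode" does not match, which is why it matters to know which analyser the field uses.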