Every capability, one page.
Five query shapes on a single managed database. The tables below list what the engine does today - one row per capability, with a short note on what it gives you.
SQL.
Relational queries against the same row store the vector, full-text and graph shapes share - one consistency model, one bearer token, one writer.
| Feature | What it does |
|---|---|
| SELECT / INSERT / UPDATE / DELETE | The standard four. UPDATE and DELETE address rows by primary key or by a WHERE that resolves to a key range. |
| GROUP BY + HAVING + aggregates | SUM, AVG, MIN, MAX, COUNT, and COUNT(DISTINCT col). HAVING filters after aggregation. Group keys can be expressions, not just bare columns. |
| INNER / LEFT / RIGHT / FULL OUTER JOIN | All four canonical join types with arbitrary boolean ON clauses. The planner picks hash, merge, or nested-loop based on key cardinality. |
| JOIN USING / NATURAL JOIN | Shorthand joins when both sides share a column name. Useful when junction tables follow a consistent key convention. |
| CROSS JOIN | Cartesian product for generated-data scenarios and small lookup-cross-product reports. |
| Joins up to 5 tables | Single-statement joins across up to five tables. Beyond five, break the query into intermediate views or pipeline the result through /ask. |
| Correlated subqueries | EXISTS, scalar subqueries in SELECT, and IN (subquery) in WHERE. The outer-row binding is resolved without materializing the inner side. |
| Window functions | ROW_NUMBER, RANK, DENSE_RANK, LAG, LEAD, SUM OVER, AVG OVER, with PARTITION BY, ORDER BY, and ROWS BETWEEN frames. |
| Recursive CTEs | WITH RECURSIVE for hierarchical walks (org charts, comment trees, dependency closures) that don't need the full Cypher surface. |
| JSON path operators | -> and ->> read a field from a JSON column. Indexable when the path is fixed; otherwise scanned. |
| EXPLAIN ANALYZE | Full plan tree with row-count estimates, actual rows, per-node time, and the index choice the planner made. Use ?explain=true on any /query call. |
| MVCC transactions (snapshot isolation) | Begin a transaction with /tx/begin, stage multi-statement writes, then commit or roll back atomically. Snapshot isolation today; serializable is on the roadmap. |
| Atomic cross-shape writes | A single INSERT can stage a row, its embedding, its full-text shards, and its outbound edges in one transaction. No two-phase glue in the app. |
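
As a rough illustration of the recursive-CTE row above, the sketch below runs a WITH RECURSIVE org-chart walk. It uses Python's built-in sqlite3 module purely as a local stand-in, so the `employees` table, its columns, and the `org_chart` helper are hypothetical, and the engine's own dialect and /query endpoint may differ in detail.

```python
import sqlite3

# Local stand-in: an in-memory SQLite database, NOT the managed engine.
# The table and column names are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE employees (id INTEGER PRIMARY KEY, name TEXT, manager_id INTEGER)"
)
conn.executemany(
    "INSERT INTO employees VALUES (?, ?, ?)",
    [(1, "Ada", None), (2, "Grace", 1), (3, "Linus", 1), (4, "Ken", 2)],
)

# Anchor row: the chosen root. Recursive step: everyone whose manager
# is already in the result, one level deeper each time.
ORG_WALK = """
WITH RECURSIVE reports(id, name, depth) AS (
    SELECT id, name, 0 FROM employees WHERE id = ?
    UNION ALL
    SELECT e.id, e.name, r.depth + 1
    FROM employees AS e
    JOIN reports AS r ON e.manager_id = r.id
)
SELECT name, depth FROM reports ORDER BY depth, name
"""


def org_chart(root_id):
    """Return (name, depth) for the root and everyone beneath it."""
    return conn.execute(ORG_WALK, (root_id,)).fetchall()


for name, depth in org_chart(1):
    print("  " * depth + name)
```

The same statement shape covers comment trees and dependency closures: only the anchor row and the join condition change.
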
Cypher / graph query.
A read-write subset of the Cypher grammar against edges and labelled nodes stored on the same substrate as rows. No second cluster to provision.
| Feature | What it does |
|---|---|
| MATCH + WHERE | Pattern-match nodes and edges, then filter on properties. The standard entry point for every graph query. |
| CREATE / SET / DELETE / MERGE | Add nodes and edges, update properties, remove them, or upsert with MERGE. Writes commit atomically with the row store. |
| OPTIONAL MATCH | Left-outer-join semantics for graph patterns. The match either binds or returns NULL for the optional part. |
| WITH | Pipeline operator. Project intermediate results, aggregate, then pass into the next match - the way you chain stages in a Cypher query. |
| Single-hop, multi-hop, undirected | Patterns like (a)-[:R]->(b), longer chains, and undirected (a)-[:R]-(b) for symmetric relationships. |
| Backward hop | <-[:R]- hops work both as a standalone match and inside chains. The inverse edge index makes 'who points at me' as cheap as forward expansion. |
| Variable-length paths | (a)-[:R*1..5]->(b) for paths between one and five hops. Bounded so the planner can pick BFS depth-limit instead of unbounded expansion. |
| length(p) / nodes(p) / relationships(p) | Built-in functions for path inspection. length(p) returns hop count; nodes(p) and relationships(p) return the path's vertex and edge lists. |
| UNWIND | Turn a list into rows. Works on literals, the output of nodes(p) / relationships(p), and mixed lists from prior WITH clauses. |
| Parameters ($param) | Bind query inputs by name in WHERE, RETURN, and UNWIND. Lets you cache the compiled plan across calls with different bound values. |
| List comprehensions | [x IN coll WHERE pred \| expr] over literal lists at compile time and over column-path lists at runtime - the standard Cypher filter-and-map form. |
| FOREACH (nested up to 3 levels) | Iterate a list and run a write block per element. Nested FOREACH unrolls to a flat write sequence so each element commits atomically. |
| CALL { subquery } | Inline a read-only subquery with CALL { ... } and consume its results via YIELD. Compose larger queries without breaking the single-statement boundary. |
| shortestPath | Shortest path between two nodes by hop count. For weighted shortest path, see the Graph algorithms section. |
| ORDER BY / SKIP / LIMIT / DISTINCT | Sort and paginate Cypher RETURN clauses. DISTINCT deduplicates result rows the same way it does in SQL. |
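
To make the bounded variable-length pattern concrete, here is a minimal sketch of what a pattern like (a)-[:R*1..5]->(b) asks for: every path from a start node with between one and five hops. It is plain Python over a toy adjacency map, not the engine's planner; the `bounded_paths` name and the sample graph are hypothetical, and the rule that a path never reuses an edge is assumed from common Cypher semantics.

```python
from typing import Dict, FrozenSet, Iterator, List, Tuple

# node -> outgoing :R neighbours (hypothetical toy data)
Graph = Dict[str, List[str]]


def bounded_paths(
    graph: Graph, start: str, min_hops: int = 1, max_hops: int = 5
) -> Iterator[List[str]]:
    """Yield node lists for every path from start with min_hops..max_hops edges.

    No edge is used twice within one path and expansion stops at max_hops,
    so the walk terminates even when the graph contains cycles.
    """
    stack: List[Tuple[List[str], FrozenSet[Tuple[str, str]]]] = [
        ([start], frozenset())
    ]
    while stack:
        path, used = stack.pop()
        hops = len(path) - 1
        if hops >= min_hops:
            yield path
        if hops == max_hops:
            continue  # upper bound reached: do not expand further
        for nxt in graph.get(path[-1], []):
            edge = (path[-1], nxt)
            if edge not in used:
                stack.append((path + [nxt], used | {edge}))


graph = {"a": ["b", "c"], "b": ["c"], "c": ["a"]}
paths = sorted(bounded_paths(graph, "a", min_hops=1, max_hops=2))
for p in paths:
    print(" -> ".join(p))
```

The upper bound is what keeps the work finite: without it, the cycle a -> c -> a would have to be cut off by some other rule.
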
Graph algorithms.
First-class algorithms exposed at the engine level - call them directly through the API without lifting the data into a separate analytics layer.
| Feature | What it does |
|---|---|
| Neighbors / BFS / DFS | Forward expansion from a starting node. Configurable depth cap; results stream as the frontier expands. |
| Reverse traversal | Same expansion against the inverse edge index, so "who points at me" queries are O(degree), not a full scan. |
| Shortest path (unweighted) | BFS-based shortest path between two nodes. Returns the path itself, not just the length. |
| Weighted shortest path (Dijkstra) | Dijkstra on a numeric edge property. Useful for cost, latency, or distance routing. |
| K-shortest paths (Yen's algorithm) | The K shortest loop-free paths between two nodes, in order of cost. Use when one route isn't enough - rerouting, alternative-recommendation, lineage trees. |
| All simple paths | Enumerate every loop-free path between two nodes up to a hop bound. Bounded so the call terminates on dense graphs. |
| Connected components | Partition the graph into reachability classes. Returns a component id per node. |
| Triangle counting | Count triangles globally or per node. The local count is a clustering-coefficient input. |
| Degree centrality | Rank nodes by their in-degree, out-degree, or total degree. |
| Betweenness centrality | Brandes' algorithm. Ranks nodes by the share of shortest paths that pass through them, which surfaces the bridge nodes whose removal would most fragment the graph. |
| Eigenvector centrality | Iterative eigenvector solve. Captures recursive importance - the well-connected-to-the-well-connected signal. |
| PageRank | Damped random-walk centrality with tunable damping factor and convergence tolerance. The standard ranking primitive. |
| Label propagation | Community detection by majority-label propagation. Converges quickly on most real graphs. |
| Louvain community detection | Modularity-optimisation community detection. Finds tighter communities than label propagation on graphs with clearer hierarchical structure. |
| Random walk + biased Node2Vec walks | Uniform random walks and Node2Vec p/q-biased walks. The walk stream feeds either downstream analytics or the embedding trainer. |
| Node2Vec embeddings | Skip-gram with negative sampling over Node2Vec walks. Persisted on the engine; topk_node2vec serves 'find similar nodes' without round-tripping through Python. |
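
For readers who want to see what the weighted shortest path row computes, the following is a minimal textbook Dijkstra in plain Python. It is a sketch of the algorithm only, not the engine's implementation or API: the `dijkstra_path` name, the adjacency-list layout, and the sample weights are hypothetical, and edge weights are assumed to be non-negative.

```python
import heapq
from typing import Dict, List, Optional, Tuple

# node -> list of (neighbour, edge weight); hypothetical toy layout
WeightedGraph = Dict[str, List[Tuple[str, float]]]


def dijkstra_path(
    graph: WeightedGraph, source: str, target: str
) -> Optional[Tuple[float, List[str]]]:
    """Return (total cost, node path) or None if target is unreachable.

    Assumes every edge weight is non-negative.
    """
    dist: Dict[str, float] = {source: 0.0}
    prev: Dict[str, str] = {}
    heap: List[Tuple[float, str]] = [(0.0, source)]
    settled = set()
    while heap:
        d, node = heapq.heappop(heap)
        if node in settled:
            continue  # stale heap entry left over from an earlier relaxation
        settled.add(node)
        if node == target:
            break
        for nbr, weight in graph.get(node, []):
            candidate = d + weight
            if candidate < dist.get(nbr, float("inf")):
                dist[nbr] = candidate
                prev[nbr] = node
                heapq.heappush(heap, (candidate, nbr))
    if target not in dist:
        return None
    # Walk the predecessor links back from target to rebuild the path.
    path = [target]
    while path[-1] != source:
        path.append(prev[path[-1]])
    path.reverse()
    return dist[target], path


routes = {
    "a": [("b", 4), ("c", 1)],
    "c": [("b", 2), ("d", 7)],
    "b": [("d", 1)],
}
print(dijkstra_path(routes, "a", "d"))
```

Like the unweighted row above it, the sketch returns the path itself rather than only its cost.
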
Vector search.
Dense and sparse vectors stored next to the row they describe. The vector index is part of the same instance - no separate service to sync.
| Feature | What it does |
|---|---|
| Dense vectors + top-K | Submit an embedding and receive the top-K nearest documents by the configured distance metric. |
| HNSW index | Hierarchical Navigable Small World graph. Default index for any table over a few thousand vectors. Tunable ef_search per query. |
| Brute-force fallback | For small tables or when you want exact recall, top-K falls back to a linear scan. No code change - the planner picks it. |
| Filtered top-K (metadata) | Restrict the candidate set by an equality predicate on a metadata field before ranking. Stays fast for selective filters. |
| Pre-filter HNSW walker | Filter inside the HNSW walk instead of post-filtering the top-K. Selective filters never make you over-fetch then discard hits. |
| Sparse vectors | BM25-style sparse vectors for lexical retrieval. Same put/topk endpoints; the metric is dot-product. |
| Sparse IVF | Inverted-file index over sparse vectors. Scales sparse retrieval past the brute-force cliff without giving up the BM25-style scoring shape. |
| Hybrid sparse + dense (RRF) | Reciprocal rank fusion of a sparse and a dense query. One call, one merged ranked list - no client-side stitching. |
| IVF (inverted file index) | Voronoi-cell partitioning with k-means-trained centroids. Search visits the top-nprobe cells - the standard scale path past the HNSW comfort zone. |
| IVF-PQ (IVF + product quantization) | IVF partitioning layered on a PQ-compressed posting list. Roughly 32x less memory per resident vector at the same recall - the path to ten-million-plus tables. |
| Scalar quantization (int8) | Per-dimension int8 quantization. About 4x memory reduction with under 2% recall loss on typical embeddings. |
| Binary quantization (1-bit) | 1-bit-per-dimension quantization with asymmetric scoring. About 32x memory reduction; recall stays at or above 0.80 on structured corpora. |
| Distance metrics: L2, cosine, dot, Manhattan | Four metrics, picked per table at the first put. The metric is locked once chosen so every query is comparable. |
| SIMD-accelerated kernels (f32) | AVX2 / NEON float32 kernels for distance computation. Owned in-tree - no math library wrapper to upgrade. |
| DELETE vector | Removes a vector by id. HNSW tombstones the node; IVF / IVF-PQ evict it from its posting list. Put-after-delete cleanly resurrects the slot. |
| Per-tenant isolation | Each tenant runs on its own instance. Your vectors share no memory, no cache, and no scheduler with anyone else's. |
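
The hybrid row relies on reciprocal rank fusion, which is simple enough to sketch. The code below merges two ranked id lists by summing 1 / (k + rank) per document. It is an illustration under stated assumptions, not the engine's endpoint: the `rrf_fuse` name and document ids are hypothetical, and k = 60 is the constant commonly used for RRF rather than a value taken from this page.

```python
from collections import defaultdict
from typing import Dict, List, Sequence


def rrf_fuse(
    rankings: Sequence[Sequence[str]], k: int = 60, top_k: int = 10
) -> List[str]:
    """Merge ranked lists of document ids with reciprocal rank fusion.

    k = 60 is an assumed, conventional constant, not an engine setting.
    """
    scores: Dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by id so output is deterministic.
    ordered = sorted(scores.items(), key=lambda item: (-item[1], item[0]))
    return [doc_id for doc_id, _ in ordered[:top_k]]


dense_hits = ["d3", "d1", "d7"]   # hypothetical dense top-K
sparse_hits = ["d1", "d9", "d3"]  # hypothetical sparse top-K
fused = rrf_fuse([dense_hits, sparse_hits])
print(fused)
```

Because only ranks are used, the dense and sparse scores never need to be put on a common scale, which is why the two lists can be merged without normalisation.
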
Full-text search.
Ranked text retrieval against the same row store - no separate search cluster, no eventual-consistency window between writes and search hits.
| Feature | What it does |
|---|---|
| Unicode tokenization (UAX #29) | Unicode-correct word segmentation. Accented Latin, Arabic, and emoji all break on the expected boundaries. |
| ICU tokenizer (CJK / Thai / Khmer) | ICU segmenter for scripts that don't put spaces between words. Picked per (table, field) when the corpus is CJK or Southeast Asian. |
| Boolean queries (AND / OR / NOT) | Combine clauses with the standard boolean operators. Returns the full matching set, unranked, when ranking is not needed. |
| BM25 scoring | Standard BM25 with the canonical defaults (k1=1.2, b=0.75). Returns top-K ranked hits with the raw score per document. |
| Phrase queries | Match the exact ordered sequence of terms. Uses position lists, so documents that contain the terms out of order or apart from each other are filtered out. |
| Fuzzy / typo-tolerant matching | Levenshtein edit-distance matching with the term~N syntax. Edit-distance cap of 3 keeps the candidate set bounded on large indexes. |
| Synonyms | Customer-managed synonym maps, bidirectional expansion, configurable cap per term. Synonyms apply at query time, so changes take effect without a reindex. |
| Stemming (9 languages) | English Snowball plus stemmers for Spanish, French, German, Italian, Dutch, Swedish, Russian, and Portuguese. Reduces inflections to a shared root. |
| Multi-language analyzers | LocaleAnalyzer with built-in stopword lists for 9 languages. One field per locale or one mixed-locale field - your call. |
| Per-locale stopwords config | Override the built-in stopword list per (table, field). Useful when a default stopword ('the', 'a') is meaningful in your corpus. |
| Highlighting / snippets | Returns matched-snippet text per hit with configurable open/close tags. Overlapping matches are merged for clean display. |
| Faceted search / aggregations | Group counts alongside the ranked hits, capped at 1000 buckets per facet. Supports the standard 'narrow by category' UI without a second query. |
| JSON document indexing | Index a JSON column field-by-field. Each addressable path becomes its own searchable field; the dotted path is the search field name. |
| Multi-field weighted merge | Search several fields in one call with per-field weights. Combine title and body, weighting title higher, in a single ranked list. |
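
To show how the BM25 defaults in the table shape a score, here is a small self-contained scorer using k1 = 1.2 and b = 0.75. It is a sketch, not the engine's scorer: the whitespace tokenizer, the `bm25_scores` name, the sample documents, and the particular IDF variant, ln(1 + (N - n + 0.5) / (n + 0.5)), are all assumptions made for illustration.

```python
import math
from collections import Counter
from typing import Dict

K1, B = 1.2, 0.75  # the defaults quoted in the table


def bm25_scores(query: str, docs: Dict[str, str]) -> Dict[str, float]:
    """Score each document against the query with BM25.

    Tokenization is a naive lowercase whitespace split (an assumption).
    """
    tokenized = {doc_id: text.lower().split() for doc_id, text in docs.items()}
    n_docs = len(tokenized)
    avg_len = sum(len(terms) for terms in tokenized.values()) / n_docs

    # Document frequency: in how many documents each term appears.
    df: Counter = Counter()
    for terms in tokenized.values():
        df.update(set(terms))

    scores: Dict[str, float] = {}
    for doc_id, terms in tokenized.items():
        tf = Counter(terms)
        score = 0.0
        for term in query.lower().split():
            if tf[term] == 0:
                continue
            # Assumed IDF variant; always positive, even for common terms.
            idf = math.log(1 + (n_docs - df[term] + 0.5) / (df[term] + 0.5))
            # b controls how strongly longer-than-average documents are penalised.
            length_norm = 1 - B + B * len(terms) / avg_len
            score += idf * tf[term] * (K1 + 1) / (tf[term] + K1 * length_norm)
        scores[doc_id] = score
    return scores


docs = {
    "a": "graph database with vector search",
    "b": "vector vector search engine",
    "c": "relational rows and joins",
}
scores = bm25_scores("vector search", docs)
for doc_id in sorted(scores, key=scores.get, reverse=True):
    print(doc_id, round(scores[doc_id], 3))
```

Document b ranks first because it repeats a query term and is shorter than average; k1 caps how much that repetition can add.
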
Everything above, one bearer token.
Every feature on this page runs against the same managed instance - one endpoint, one writer, one consistency model. No glue between shapes.