diff --git a/README.md b/README.md
index 6061068..b3f9ddf 100644
--- a/README.md
+++ b/README.md
@@ -459,6 +459,26 @@ SET hnsw.streaming = on;
 SET ivfflat.streaming = on;
 ```
 
+However, there are some important caveats.
+
+### Streaming Caveats
+
+With streaming queries, it’s possible for rows to be slightly out of order by distance. For strict ordering, use:
+
+```sql
+WITH approx_order AS MATERIALIZED (
+    SELECT *, embedding <-> '[1,2,3]' AS distance FROM items WHERE ... ORDER BY distance LIMIT 5
+) SELECT * FROM approx_order ORDER BY distance;
+```
+
+For distance filters, use a CTE and place the filter outside it.
+
+```sql
+WITH approx_order AS MATERIALIZED (
+    SELECT *, embedding <-> '[1,2,3]' AS distance FROM items WHERE ... ORDER BY distance LIMIT 5
+) SELECT * FROM approx_order WHERE distance < 0.1 ORDER BY distance;
+```
+
 ### Streaming Options
 
 Since scanning a large portion of the index is expensive, there are options to control when the scan ends.
@@ -492,24 +512,6 @@ Specify the max number of probes
 SET ivfflat.max_probes = 100;
 ```
 
-### Streaming Caveats
-
-With streaming queries, it’s possible for rows to be slightly out of order by distance. For strict ordering, use:
-
-```sql
-WITH approx_order AS MATERIALIZED (
-    SELECT *, embedding <-> '[1,2,3]' AS distance FROM items WHERE ... ORDER BY distance LIMIT 5
-) SELECT * FROM approx_order ORDER BY distance;
-```
-
-Distance filters should be placed outside the CTE for best performance.
-
-```sql
-WITH approx_order AS MATERIALIZED (
-    SELECT *, embedding <-> '[1,2,3]' AS distance FROM items WHERE ... ORDER BY distance LIMIT 5
-) SELECT * FROM approx_order WHERE distance < 0.1 ORDER BY distance;
-```
-
 ## Half-Precision Vectors
 
 *Added in 0.7.0*