diff --git a/README.md b/README.md
index db37a1f..017519d 100644
--- a/README.md
+++ b/README.md
@@ -528,6 +528,30 @@ SELECT id, content FROM items, plainto_tsquery('hello search') query
 
 You can use [Reciprocal Rank Fusion](https://github.com/pgvector/pgvector-python/blob/master/examples/hybrid_search_rrf.py) or a [cross-encoder](https://github.com/pgvector/pgvector-python/blob/master/examples/hybrid_search.py) to combine results.
 
+## Subvector Indexing
+
+*Unreleased*
+
+Use expression indexing to index subvectors
+
+```sql
+CREATE INDEX ON items USING hnsw ((subvector(embedding, 1, 3)::vector(3)) vector_cosine_ops);
+```
+
+Get the nearest neighbors by cosine distance
+
+```sql
+SELECT * FROM items ORDER BY subvector(embedding, 1, 3)::vector(3) <=> subvector('[1,2,3,4,5]'::vector, 1, 3) LIMIT 5;
+```
+
+Re-rank by the full vectors for better recall
+
+```sql
+SELECT * FROM (
+    SELECT * FROM items ORDER BY subvector(embedding, 1, 3)::vector(3) <=> subvector('[1,2,3,4,5]'::vector, 1, 3) LIMIT 20
+) ORDER BY embedding <=> '[1,2,3,4,5]' LIMIT 5;
+```
+
 ## Performance
 
 ### Tuning
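
The two-stage query added in this diff (shortlist by subvector distance, then re-rank by full-vector distance) can be illustrated outside the database. The following is a minimal pure-Python sketch, not pgvector's implementation: `cosine_distance`, `subvector`, `rerank_search`, and the sample `items` are hypothetical names and data made up for illustration. It assumes `subvector(v, start, count)` uses a 1-based `start`, and it scans every row rather than using an HNSW index.

```python
import math


def cosine_distance(a, b):
    """Cosine distance (1 - cosine similarity), the metric behind `<=>`."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


def subvector(v, start, count):
    """Hypothetical stand-in for subvector(v, start, count); start is 1-based."""
    return v[start - 1:start - 1 + count]


def rerank_search(items, query, start=1, count=3, candidates=20, k=5):
    """Shortlist by subvector distance, then re-rank by full-vector distance."""
    q_sub = subvector(query, start, count)
    # Stage 1: coarse ranking on the subvector (the inner SELECT ... LIMIT 20).
    shortlist = sorted(
        items,
        key=lambda item: cosine_distance(subvector(item[1], start, count), q_sub),
    )[:candidates]
    # Stage 2: exact ranking on the full vector (the outer ORDER BY ... LIMIT 5).
    return sorted(shortlist, key=lambda item: cosine_distance(item[1], query))[:k]


# Made-up rows of (id, embedding); not taken from any real table.
items = [
    (1, [1, 2, 3, 4, 5]),
    (2, [1, 2, 3, -4, -5]),
    (3, [1, 2, 2, 4, 5]),
    (4, [-1, -2, -3, 4, 5]),
    (5, [3, 2, 1, 0, 0]),
]
query = [1, 2, 3, 4, 5]

top = rerank_search(items, query, candidates=3, k=2)
print([item_id for item_id, _ in top])  # prints [1, 3]
```

Row 2 matches the query exactly on the first three dimensions, so a subvector-only search would rank it alongside row 1. Re-ranking the shortlist by the full vectors demotes it in favor of row 3, which is the recall improvement the added README text refers to.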