From 609d01f4c6aef99faba86808417e9ae511fb1912 Mon Sep 17 00:00:00 2001
From: Andrew Kane
Date: Sun, 26 Apr 2026 12:56:07 -0700
Subject: [PATCH] Added more scaling advice to readme [skip ci]

---
 README.md | 21 +++++++++++++++++++--
 1 file changed, 19 insertions(+), 2 deletions(-)

diff --git a/README.md b/README.md
index 1a8de7e..ad19a3d 100644
--- a/README.md
+++ b/README.md
@@ -11,6 +11,8 @@ Store your vectors with the rest of your data. Supports:
 
 Plus [ACID](https://en.wikipedia.org/wiki/ACID) compliance, point-in-time recovery, JOINs, and all of the other [great features](https://www.postgresql.org/about/) of Postgres
 
+Have a lot of vectors? Use [half-precision vectors](#half-precision-vectors) and [binary quantization](#binary-quantization) to scale
+
 [![Build Status](https://github.com/pgvector/pgvector/actions/workflows/build.yml/badge.svg)](https://github.com/pgvector/pgvector/actions)
 
 ## Installation
@@ -314,6 +316,8 @@ For a large number of workers, you may need to increase `max_parallel_workers` (
 
 The [index options](#index-options) also have a significant impact on build time (use the defaults unless seeing low recall)
 
+Use [binary quantization](#binary-quantization) for faster build times at scale
+
 ### Indexing Progress
 
 Check [indexing progress](https://www.postgresql.org/docs/current/progress-reporting.html#CREATE-INDEX-PROGRESS-REPORTING)
@@ -673,6 +677,10 @@ SHOW shared_buffers;
 
 Be sure to restart Postgres for changes to take effect.
 
+### Storing
+
+Use the `halfvec` type instead of `vector` for a smaller working set.
+
 ### Loading
 
 Use `COPY` for bulk loading data ([example](https://github.com/pgvector/pgvector-python/blob/master/examples/loading/example.py)).
@@ -687,6 +695,8 @@ Add any indexes *after* loading the initial data for best performance.
 
 See index build time for [HNSW](#index-build-time) and [IVFFlat](#index-build-time-1).
 
+Use [binary quantization](#binary-quantization) for smaller indexes and faster build times at scale.
+
 In production environments, create indexes concurrently to avoid blocking writes.
 
 ```sql
@@ -717,6 +727,8 @@ SELECT * FROM items ORDER BY embedding <#> '[3,1,2]' LIMIT 5;
 
 #### Approximate Search
 
+Use [binary quantization](#binary-quantization) with re-ranking to keep indexes in-memory at scale.
+
 To speed up queries with an IVFFlat index, increase the number of inverted lists (at the expense of recall).
 
 ```sql
@@ -759,10 +771,13 @@ COMMIT;
 
 ## Scaling
 
-Scale pgvector the same way you scale Postgres.
-
 Scale vertically by increasing memory, CPU, and storage on a single instance. Use existing tools to [tune parameters](#tuning) and [monitor performance](#monitoring).
 
+For a smaller working set:
+
+1. Use the `halfvec` type instead of `vector` for tables
+2. Use [binary quantization](#binary-quantization) for indexes (with re-ranking for search)
+
 Scale horizontally with [replicas](https://www.postgresql.org/docs/current/hot-standby.html), or use [Citus](https://github.com/citusdata/citus) or another approach for sharding ([example](https://github.com/pgvector/pgvector-python/blob/master/examples/citus/example.py)).
 
 ## Languages
@@ -878,6 +893,8 @@ No, but like other index types, you’ll likely see better performance if they d
 SELECT pg_size_pretty(pg_relation_size('index_name'));
 ```
 
+Use [half-precision indexing](#half-precision-indexing) or [binary quantization](#binary-quantization) for smaller indexes.
+
 ## Troubleshooting
 
 #### Why isn’t a query using an index?