Added more scaling advice to readme [skip ci]

2026-07-23 04:20:56 +08:00 · 2026-04-26 12:56:07 -07:00
parent a7551a61ca
commit 609d01f4c6
1 changed file with 19 additions and 2 deletions
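
The advice added in this commit (store with `halfvec`, index with binary quantization, re-rank at query time) might look like the following in practice. This is a minimal sketch, not part of the commit itself: it assumes pgvector 0.7.0 or later, and the `items` table, its 3-dimensional `embedding` column, the sample rows, and the candidate limit of 20 are all illustrative choices.

```sql
-- Assumes the pgvector extension (0.7.0+) is available on the server.
CREATE EXTENSION IF NOT EXISTS vector;

-- Store embeddings as half-precision vectors for a smaller working set.
-- (Hypothetical table; 3 dimensions chosen only to keep the example short.)
CREATE TABLE items (id bigserial PRIMARY KEY, embedding halfvec(3));

INSERT INTO items (embedding) VALUES ('[1,-2,3]'), ('[-1,2,-3]'), ('[3,1,2]');

-- Index the binary-quantized form of each embedding (one bit per dimension)
-- instead of the full vector, for a smaller index and a faster build.
CREATE INDEX ON items
    USING hnsw ((binary_quantize(embedding)::bit(3)) bit_hamming_ops);

-- Search in two steps: fetch candidates by Hamming distance on the
-- quantized bits, then re-rank those candidates by cosine distance
-- on the original half-precision vectors.
SELECT * FROM (
    SELECT * FROM items
    ORDER BY binary_quantize(embedding)::bit(3)
             <~> binary_quantize('[1,-2,3]'::halfvec)
    LIMIT 20
) AS candidates
ORDER BY embedding <=> '[1,-2,3]'
LIMIT 5;
```

The inner `LIMIT` controls how many candidates are re-ranked: a larger value tends to improve recall at the cost of more distance computations on the full vectors.
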
--- a/README.md
+++ b/README.md
@@ -11,6 +11,8 @@ Store your vectors with the rest of your data. Supports:

 Plus [ACID](https://en.wikipedia.org/wiki/ACID) compliance, point-in-time recovery, JOINs, and all of the other [great features](https://www.postgresql.org/about/) of Postgres

+Have a lot of vectors? Use [half-precision vectors](#half-precision-vectors) and [binary quantization](#binary-quantization) to scale
+
 [![Build Status](https://github.com/pgvector/pgvector/actions/workflows/build.yml/badge.svg)](https://github.com/pgvector/pgvector/actions)

 ## Installation
@@ -314,6 +316,8 @@ For a large number of workers, you may need to increase `max_parallel_workers` (

 The [index options](#index-options) also have a significant impact on build time (use the defaults unless seeing low recall)

+Use [binary quantization](#binary-quantization) for faster build times at scale
+
 ### Indexing Progress

 Check [indexing progress](https://www.postgresql.org/docs/current/progress-reporting.html#CREATE-INDEX-PROGRESS-REPORTING)
@@ -673,6 +677,10 @@ SHOW shared_buffers;

 Be sure to restart Postgres for changes to take effect.

+### Storing
+
+Use the `halfvec` type instead of `vector` for a smaller working set.
+
 ### Loading

 Use `COPY` for bulk loading data ([example](https://github.com/pgvector/pgvector-python/blob/master/examples/loading/example.py)).
@@ -687,6 +695,8 @@ Add any indexes *after* loading the initial data for best performance.

 See index build time for [HNSW](#index-build-time) and [IVFFlat](#index-build-time-1).

+Use [binary quantization](#binary-quantization) for smaller indexes and faster build times at scale.
+
 In production environments, create indexes concurrently to avoid blocking writes.

 ```sql
@@ -717,6 +727,8 @@ SELECT * FROM items ORDER BY embedding <#> '[3,1,2]' LIMIT 5;

 #### Approximate Search

+Use [binary quantization](#binary-quantization) with re-ranking to keep indexes in-memory at scale.
+
 To speed up queries with an IVFFlat index, increase the number of inverted lists (at the expense of recall).

 ```sql
@@ -759,10 +771,13 @@ COMMIT;

 ## Scaling

-Scale pgvector the same way you scale Postgres.
-
 Scale vertically by increasing memory, CPU, and storage on a single instance. Use existing tools to [tune parameters](#tuning) and [monitor performance](#monitoring).

+For a smaller working set:
+
+1. Use the `halfvec` type instead of `vector` for tables
+2. Use [binary quantization](#binary-quantization) for indexes (with re-ranking for search)
+
 Scale horizontally with [replicas](https://www.postgresql.org/docs/current/hot-standby.html), or use [Citus](https://github.com/citusdata/citus) or another approach for sharding ([example](https://github.com/pgvector/pgvector-python/blob/master/examples/citus/example.py)).

 ## Languages
@@ -878,6 +893,8 @@ No, but like other index types, you’ll likely see better performance if they d
 SELECT pg_size_pretty(pg_relation_size('index_name'));
 ```

+Use [half-precision indexing](#half-precision-indexing) or [binary quantization](#binary-quantization) for smaller indexes.
+
 ## Troubleshooting

 #### Why isn’t a query using an index?