From 609d01f4c6aef99faba86808417e9ae511fb1912 Mon Sep 17 00:00:00 2001
From: Andrew Kane <andrew@ankane.org>
Date: Sun, 26 Apr 2026 12:56:07 -0700
Subject: [PATCH] Added more scaling advice to readme [skip ci]

---
 README.md | 21 +++++++++++++++++++--
 1 file changed, 19 insertions(+), 2 deletions(-)

diff --git a/README.md b/README.md
index 1a8de7e..ad19a3d 100644
--- a/README.md
+++ b/README.md
@@ -11,6 +11,8 @@ Store your vectors with the rest of your data. Supports:
 
 Plus [ACID](https://en.wikipedia.org/wiki/ACID) compliance, point-in-time recovery, JOINs, and all of the other [great features](https://www.postgresql.org/about/) of Postgres
 
+Have a lot of vectors? Use [half-precision vectors](#half-precision-vectors) and [binary quantization](#binary-quantization) to scale
+
 [![Build Status](https://github.com/pgvector/pgvector/actions/workflows/build.yml/badge.svg)](https://github.com/pgvector/pgvector/actions)
 
 ## Installation
@@ -314,6 +316,8 @@ For a large number of workers, you may need to increase `max_parallel_workers` (
 
 The [index options](#index-options) also have a significant impact on build time (use the defaults unless seeing low recall)
 
+Use [binary quantization](#binary-quantization) for faster build times at scale
+
 ### Indexing Progress
 
 Check [indexing progress](https://www.postgresql.org/docs/current/progress-reporting.html#CREATE-INDEX-PROGRESS-REPORTING)
@@ -673,6 +677,10 @@ SHOW shared_buffers;
 
 Be sure to restart Postgres for changes to take effect.
 
+### Storing
+
+Use the `halfvec` type instead of `vector` to store 2 bytes per dimension instead of 4, for a smaller working set.
+
 ### Loading
 
 Use `COPY` for bulk loading data ([example](https://github.com/pgvector/pgvector-python/blob/master/examples/loading/example.py)).
@@ -687,6 +695,8 @@ Add any indexes *after* loading the initial data for best performance.
 
 See index build time for [HNSW](#index-build-time) and [IVFFlat](#index-build-time-1).
 
+Use [binary quantization](#binary-quantization) for smaller indexes and faster build times at scale.
+
 In production environments, create indexes concurrently to avoid blocking writes.
 
 ```sql
@@ -717,6 +727,8 @@ SELECT * FROM items ORDER BY embedding <#> '[3,1,2]' LIMIT 5;
 
 #### Approximate Search
 
+Use [binary quantization](#binary-quantization) with re-ranking to keep indexes in memory at scale.
+
 To speed up queries with an IVFFlat index, increase the number of inverted lists (at the expense of recall).
 
 ```sql
@@ -759,10 +771,13 @@ COMMIT;
 
 ## Scaling
 
-Scale pgvector the same way you scale Postgres.
-
 Scale vertically by increasing memory, CPU, and storage on a single instance. Use existing tools to [tune parameters](#tuning) and [monitor performance](#monitoring).
 
+For a smaller working set:
+
+1. Use the `halfvec` type instead of `vector` for tables
+2. Use [binary quantization](#binary-quantization) for indexes (re-rank results by the original vectors when searching)
+
 Scale horizontally with [replicas](https://www.postgresql.org/docs/current/hot-standby.html), or use [Citus](https://github.com/citusdata/citus) or another approach for sharding ([example](https://github.com/pgvector/pgvector-python/blob/master/examples/citus/example.py)).
 
 ## Languages
@@ -878,6 +893,8 @@ No, but like other index types, you’ll likely see better performance if they d
 SELECT pg_size_pretty(pg_relation_size('index_name'));
 ```
 
+Use [half-precision indexing](#half-precision-indexing) or [binary quantization](#binary-quantization) for smaller indexes.
+
 ## Troubleshooting
 
 #### Why isn’t a query using an index?