Commit Graph

947 Commits

Author SHA1 Message Date
Andrew Kane
c01cf8a315 Renamed ApplyChanges to UpdateGraph [skip ci] 2024-01-22 17:02:02 -08:00
Andrew Kane
3ecb9a3cb2 Renamed HnswInsertElement to HnswFindElementNeighbors [skip ci] 2024-01-22 16:59:08 -08:00
Andrew Kane
a069e18fe4 Improved function names [skip ci] 2024-01-22 16:48:50 -08:00
Andrew Kane
5174a23094 Updated comment [skip ci] 2024-01-22 16:45:42 -08:00
Andrew Kane
16d7de79f6 Improved function names [skip ci] 2024-01-22 16:43:50 -08:00
Andrew Kane
e54ec4d637 Improved code [skip ci] 2024-01-22 16:37:32 -08:00
Andrew Kane
cc641002d3 Updated comments [skip ci] 2024-01-22 10:59:46 -08:00
Heikki Linnakangas
2f9b1e2893 Add overview comment on how HNSW build works (#419)
And rewrite some of the comments in InsertTuple(), to also give more
of a high-level overview of the flow.
2024-01-22 10:50:12 -08:00
Andrew Kane
a3e4fbf6aa Use shared lock for copying neighbors to local memory 2024-01-19 13:44:25 -08:00
Andrew Kane
09a4ec29a0 Added InsertTupleInMemory 2024-01-19 01:27:45 -08:00
Heikki Linnakangas
ca3b4cd029 Remove HnswSpool
It was just used to pass heap/index relations to
HnswParallelScanAndInsert. I think it was copied from nbtsort.c, which
is more complicated. I don't think we need a struct like this.

(That said, I actually think that we should have a state object that
would hold fields like 'heap', 'index', 'procinfo', 'collation'
etc. Passing that object around would simplify the signatures of many
functions. But that's a different story).
2024-01-19 00:26:47 -08:00
Heikki Linnakangas
d96e486274 Remove unused 'scantuplesortstates' field 2024-01-19 00:26:47 -08:00
Heikki Linnakangas
88213186a5 Remove unused argument 2024-01-19 00:26:47 -08:00
Andrew Kane
7dd9534894 Use same locking as insert 2024-01-19 00:18:29 -08:00
Andrew Kane
d801a843f4 Removed HnswPtrSetNull to avoid setting relptr_off directly 2024-01-16 17:08:13 -08:00
Andrew Kane
1458c7bb2a Improved code [skip ci] 2024-01-16 14:03:28 -08:00
Andrew Kane
cad48d9203 Improved locking 2024-01-16 13:34:55 -08:00
Heikki Linnakangas
719b4b7436 Use LWLocks instead of SpinLocks (#410)
Spinlocks should be held only for a few instructions, for multiple
reasons:

- You have to be very careful not to elog() out while holding a
  spinlock, because there is no mechanism to release the spinlock on
  error.

- Waiters can waste a lot of cycles spinning if the lock is
  contended. If you wait on a spinlock for too long, the PostgreSQL
  implementation will actually PANIC, see s_lock_stuck().

The flushLock is particularly problematic. It is held in exclusive
mode, which means it holds a spinlock, over the call to
FlushPages(). FlushPages() performs lots of I/O so it can take a very
long time (>= minutes), and can also easily error out for various
reasons.

allocatorLock would perhaps be OK as a spinlock, but even that feels
a bit heavy, so I converted that to an LWLock, too.

entryLock is usually held for a very short time, in shared mode, so
that would be fine as a spinlock. However, in the rare case that the
entry point is updated, it's held for a very long time. An LWLock used
in shared mode is about as fast as a spinlock; that path is pretty
heavily optimized.

I think we have some problems with the per-element spinlocks too. In
HnswUpdateNeighborPagesInMemory(), it's held over a call to
HnswUpdateConnection(), but HnswUpdateConnection() can error out at
least in case of an out-of-memory error (it uses lappend(), which
calls palloc()). It also calls the distance function, and I don't
think they are guaranteed to be ereport-free either. However, I didn't
address that in this PR; it needs a bit more thinking.
2024-01-16 13:25:03 -08:00
Andrew Kane
fa0acbf62d Fixed CI 2024-01-15 19:55:46 -08:00
Andrew Kane
1612b84069 Fixed error on Windows [skip ci] 2024-01-15 19:33:16 -08:00
Andrew Kane
2f9371516d Leave space for other objects in shared memory 2024-01-15 19:17:50 -08:00
Andrew Kane
9d3e4e74df Added support for in-memory parallel index builds for HNSW 2024-01-15 15:07:31 -08:00
Andrew Kane
0ce497a1b1 Updated Homebrew note [skip ci] 2024-01-15 12:12:04 -08:00
Andrew Kane
c7d60346d8 Improved macro [skip ci] 2024-01-13 20:02:41 -08:00
Andrew Kane
597bfdc76b Added HnswGetNeighbors macro 2024-01-13 20:00:34 -08:00
Andrew Kane
cbf3eb4fa5 Improved HNSW build and insert code 2024-01-13 10:07:42 -08:00
Andrew Kane
cacd389f6d Improved pattern for duplicates 2024-01-12 14:30:13 -08:00
Andrew Kane
423cc2b06c Homebrew now adds to postgresql@15 as well [skip ci] 2024-01-11 16:45:50 -08:00
Andrew Kane
85c4ef6a14 Updated Postgres versions in readme [skip ci] 2024-01-11 12:36:24 -08:00
Andrew Kane
c6160a783a Homebrew now adds to postgresql@16 [skip ci] 2024-01-11 12:32:14 -08:00
Andrew Kane
1881b857f9 Simplified code 2024-01-09 18:53:31 -08:00
Andrew Kane
51bde5fb22 Updated readme [skip ci] 2024-01-09 14:38:25 -08:00
Andrew Kane
10e65ce349 Added note about maintenance_work_mem [skip ci] 2024-01-09 14:31:54 -08:00
Andrew Kane
61279f5a59 Updated readme [skip ci] 2024-01-09 14:26:55 -08:00
Andrew Kane
72b3889e26 Updated readme [skip ci] 2024-01-09 14:22:19 -08:00
Andrew Kane
bb21b2decf Updated readme [skip ci] 2024-01-09 14:19:01 -08:00
Andrew Kane
8a65c0e831 Moved section [skip ci] 2024-01-09 13:33:03 -08:00
Andrew Kane
7d75d423e4 Added section on index build time [skip ci] 2024-01-09 13:27:27 -08:00
Andrew Kane
6cad1f5de0 Updated example [skip ci] 2024-01-09 13:04:47 -08:00
Andrew Kane
67eeade63c Moved HNSW first in readme [skip ci] 2024-01-09 13:04:18 -08:00
Andrew Kane
108fb09d7b Improved code [skip ci] 2024-01-08 17:54:49 -08:00
Andrew Kane
65d060ac86 Reverted FlushPages pattern for parallel builds 2024-01-08 10:45:31 -08:00
Andrew Kane
62ee33bb92 Improved locking code 2024-01-08 09:05:12 -08:00
Andrew Kane
520e274dde Improved locking code 2024-01-07 22:34:41 -08:00
Andrew Kane
9e680884bd Moved indtuples to HnswGraph 2024-01-07 22:23:49 -08:00
Andrew Kane
19a0e1b341 Moved graph to separate struct 2024-01-07 20:15:30 -08:00
Andrew Kane
c7fe1571ee Improved code 2024-01-07 18:30:51 -08:00
Andrew Kane
cb4c770df2 Switched to slist for elements to reduce allocations and remove limit 2024-01-07 18:26:19 -08:00
Andrew Kane
85fdecd79b Moved FlushPages before HnswEndParallel 2024-01-07 17:50:46 -08:00
Andrew Kane
6132428914 Improved number of parallel workers for HNSW index builds - closes #397 2024-01-05 19:46:08 -08:00