AI & ML Advanced By Samson Tanimawo, PhD Published Aug 25, 2026 5 min read

Vector Index Types: HNSW, IVF, ScaNN, DiskANN

Vector databases usually hide the index type behind a default. Knowing what each index is doing matters once you scale past a few million vectors.

HNSW (Hierarchical Navigable Small World)

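Graph-based. Each vector connects to a few nearest neighbours, across several layers that are sparse at the top and dense at the bottom. Search starts at the top layer and descends the graph greedily towards the query. Excellent recall (typically 95-99% when tuned), low query latency. Memory-hungry: the graph and the vectors both stay in RAM. Default in most vector DBs.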
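To make the greedy graph walk concrete, here is a minimal sketch in plain Python. It is a deliberate simplification and not HNSW itself: it uses a single layer instead of a hierarchy, and it builds the neighbour graph by brute force. The names `build_neighbour_graph` and `graph_search`, and the parameters `m` (links per node) and `ef` (size of the candidate list), are illustrative choices rather than any particular library's API.

```python
import heapq
import math
import random


def build_neighbour_graph(vectors, m):
    """Brute-force build: link each vector to its m nearest neighbours, both ways."""
    graph = {i: set() for i in range(len(vectors))}
    for i, v in enumerate(vectors):
        others = (j for j in range(len(vectors)) if j != i)
        for j in sorted(others, key=lambda j: math.dist(v, vectors[j]))[:m]:
            graph[i].add(j)
            graph[j].add(i)
    return graph


def graph_search(vectors, graph, query, entry, ef):
    """Best-first walk from `entry`, keeping the ef closest nodes seen so far."""
    d0 = math.dist(query, vectors[entry])
    visited = {entry}
    frontier = [(d0, entry)]   # min-heap: closest unexplored node first
    best = [(-d0, entry)]      # max-heap via negated distance: current top ef
    while frontier:
        d, node = heapq.heappop(frontier)
        if d > -best[0][0]:
            break  # the closest unexplored node is worse than every kept result
        for nb in graph[node]:
            if nb in visited:
                continue
            visited.add(nb)
            dn = math.dist(query, vectors[nb])
            if len(best) < ef or dn < -best[0][0]:
                heapq.heappush(frontier, (dn, nb))
                heapq.heappush(best, (-dn, nb))
                if len(best) > ef:
                    heapq.heappop(best)  # drop the current worst result
    hits = sorted((-neg_d, i) for neg_d, i in best)
    return hits, len(visited)


# Hypothetical data: 500 random 2-D points.
rng = random.Random(0)
vectors = [(rng.random(), rng.random()) for _ in range(500)]
graph = build_neighbour_graph(vectors, m=8)
hits, n_visited = graph_search(vectors, graph, query=(0.5, 0.5), entry=0, ef=10)
print(f"visited {n_visited} of {len(vectors)} vectors; nearest id {hits[0][1]}")
```

The point of the sketch is the access pattern: the search touches only the nodes it walks through, which is why latency stays low, and it follows links at random offsets, which is why the whole structure wants to sit in RAM.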

IVF (Inverted File Index)

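Cluster the vectors into K cells, usually with k-means. At query time, search only the few cells whose centroids are nearest to the query. Trades recall for memory and speed: a true neighbour sitting in an unprobed cell is missed. Often combined with Product Quantization (PQ) for compression.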
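The cell-probing idea can be sketched in a few lines of plain Python. This is an illustration under assumptions, not a production index: it uses a basic k-means, stores uncompressed vectors (no PQ), and the names `train_centroids`, `build_inverted_lists`, `ivf_search` and the parameter `nprobe` (number of cells searched) are chosen for this example.

```python
import math
import random


def nearest_centroid(point, centroids):
    return min(range(len(centroids)), key=lambda c: math.dist(point, centroids[c]))


def train_centroids(vectors, n_cells, iters=5, seed=0):
    """Plain k-means (Lloyd's algorithm) to place one centroid per cell."""
    centroids = random.Random(seed).sample(vectors, n_cells)
    for _ in range(iters):
        cells = [[] for _ in range(n_cells)]
        for v in vectors:
            cells[nearest_centroid(v, centroids)].append(v)
        # Move each centroid to the mean of its cell; keep it if the cell is empty.
        centroids = [
            tuple(sum(xs) / len(cell) for xs in zip(*cell)) if cell else centroids[c]
            for c, cell in enumerate(cells)
        ]
    return centroids


def build_inverted_lists(vectors, centroids):
    """One list of vector ids per cell: the 'inverted file'."""
    lists = [[] for _ in centroids]
    for i, v in enumerate(vectors):
        lists[nearest_centroid(v, centroids)].append(i)
    return lists


def ivf_search(vectors, centroids, lists, query, k, nprobe):
    """Scan only the nprobe cells whose centroids are closest to the query."""
    order = sorted(range(len(centroids)), key=lambda c: math.dist(query, centroids[c]))
    candidates = [i for c in order[:nprobe] for i in lists[c]]
    return sorted(candidates, key=lambda i: math.dist(query, vectors[i]))[:k]


# Hypothetical data: 1000 random 2-D points split into 16 cells.
rng = random.Random(1)
vectors = [(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(1000)]
centroids = train_centroids(vectors, n_cells=16)
lists = build_inverted_lists(vectors, centroids)
query = (0.3, -0.2)
exact = sorted(range(len(vectors)), key=lambda i: math.dist(query, vectors[i]))[:10]

for nprobe in (1, 4, 16):
    found = ivf_search(vectors, centroids, lists, query, k=10, nprobe=nprobe)
    recall = len(set(found) & set(exact)) / len(exact)
    print(f"nprobe={nprobe:2d} recall@10={recall:.2f}")
```

Raising `nprobe` scans more cells, so recall rises and speed falls. Probing every cell degenerates into an exact brute-force scan.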

ScaNN (Google’s Scalable Nearest Neighbors)

Tree-based partitioning + asymmetric quantization (stored vectors are compressed, the query is not) + reordering of the top candidates with exact distances. Strong on the recall vs speed Pareto frontier. Less popular than HNSW because it’s harder to operationalise.
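The "score cheaply, then reorder" part of that pipeline can be sketched as follows. This is not ScaNN's implementation: the partitioning stage is left out, and a simple uniform scalar quantizer stands in for ScaNN's learned quantization. The names `quantize`, `dequantize`, `search_with_reorder` and the `shortlist` parameter are assumptions made for this example.

```python
import math
import random

LEVELS = 16  # 4-bit codes per dimension; an illustrative choice


def quantize(vec, lo, step):
    """Map each float to the index of the nearest of LEVELS evenly spaced values."""
    return tuple(round((x - lo) / step) for x in vec)


def dequantize(code, lo, step):
    return tuple(lo + c * step for c in code)


def search_with_reorder(vectors, codes, lo, step, query, k, shortlist):
    # Stage 1: asymmetric scoring, full-precision query vs compressed vectors.
    approx = sorted(
        range(len(codes)),
        key=lambda i: math.dist(query, dequantize(codes[i], lo, step)),
    )[:shortlist]
    # Stage 2: reorder only the shortlist using exact distances.
    return sorted(approx, key=lambda i: math.dist(query, vectors[i]))[:k]


# Hypothetical data: 2000 random 8-D vectors.
rng = random.Random(2)
vectors = [tuple(rng.gauss(0, 1) for _ in range(8)) for _ in range(2000)]
lo = min(min(v) for v in vectors)
hi = max(max(v) for v in vectors)
step = (hi - lo) / (LEVELS - 1)
codes = [quantize(v, lo, step) for v in vectors]

query = tuple(rng.gauss(0, 1) for _ in range(8))
exact = sorted(range(len(vectors)), key=lambda i: math.dist(query, vectors[i]))[:10]

for shortlist in (10, 50, 200):
    found = search_with_reorder(vectors, codes, lo, step, query, k=10, shortlist=shortlist)
    recall = len(set(found) & set(exact)) / len(exact)
    print(f"shortlist={shortlist:3d} recall@10={recall:.2f}")
```

A longer shortlist gives the exact reordering step more chances to recover neighbours that the compressed scores ranked too low, at the cost of more full-precision distance computations.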

DiskANN

Index lives on SSD, not RAM; only a small compressed copy of the vectors is kept in memory. Sacrifices a few ms of latency for 10-100x more vectors per machine. Often the only realistic choice for billion-scale on a single node.
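A back-of-envelope calculation shows where the capacity gain comes from. Every number below is an assumption picked for illustration: one billion 128-dimensional float32 vectors, 64 graph links per node stored as 32-bit ids, and a 32-byte compressed code per vector held in RAM for the on-disk layout. The function `index_footprint_gb` is hypothetical and ignores per-index overheads.

```python
def index_footprint_gb(n_vectors, dims, links_per_node, compressed_bytes):
    """Rough footprint of an all-in-RAM graph index vs an SSD-resident one."""
    full_vector = dims * 4            # float32 components
    graph_links = links_per_node * 4  # 32-bit neighbour ids
    per_node = full_vector + graph_links
    gb = 1e9
    return {
        # In-memory graph index: vectors and links all live in RAM.
        "in_memory_ram_gb": n_vectors * per_node / gb,
        # SSD-resident index: only the compressed codes stay in RAM...
        "on_disk_ram_gb": n_vectors * compressed_bytes / gb,
        # ...while full vectors and links move to SSD.
        "on_disk_ssd_gb": n_vectors * per_node / gb,
    }


fp = index_footprint_gb(
    n_vectors=1_000_000_000, dims=128, links_per_node=64, compressed_bytes=32
)
ram_saving = fp["in_memory_ram_gb"] / fp["on_disk_ram_gb"]
print(fp)
print(f"RAM needed shrinks by about {ram_saving:.0f}x")
```

Under these assumptions the in-memory layout needs 768 GB of RAM, while the SSD-resident layout needs 32 GB of RAM plus 768 GB of SSD, a 24x reduction in memory. Different dimensions or code sizes move that ratio up or down.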

Picking one