Vector Databases for Beginners
A vector database stores embeddings and answers ‘find me the most similar one’ queries in milliseconds. That’s the entire pitch. Here is when you actually need one.
What a vector database stores
A vector database stores high-dimensional vectors of floating-point numbers, alongside optional metadata. Each vector represents an embedding of some content: a paragraph, an image, an audio clip. The database’s headline trick is finding the vectors closest to a query vector, fast.
You’ll see vectors of 384, 768, 1536, or 3072 dimensions in 2025, depending on the embedding model that produced them. The database doesn’t care what the numbers mean, only how to find vectors close to a target by cosine similarity or Euclidean distance.
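Cosine similarity is just a dot product normalised by vector lengths. A minimal sketch with NumPy (the 4-dimensional vectors here are made-up toy values, not real embeddings):

```python
import numpy as np

def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    """Dot product divided by the product of the two magnitudes."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

# Toy 4-dimensional "embeddings" -- real ones have 384+ dimensions.
query = np.array([0.1, 0.3, 0.5, 0.2])
doc = np.array([0.2, 0.3, 0.4, 0.1])

print(round(cosine_similarity(query, doc), 4))  # close to 1.0 = similar
```

Identical directions score 1.0, orthogonal vectors score 0.0; the database ranks stored vectors by this number (or by Euclidean distance) and returns the best.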
Why a regular database doesn’t work
Postgres and friends are built for exact matches and range queries. They aren’t built for “find me the 10 rows whose 1,536-dimensional vector is closest to this query vector by cosine distance.”
You could compute cosine similarity in SQL, but on a million rows it would take seconds per query. Vector databases use approximate nearest neighbour (ANN) indices like HNSW or IVF, often combined with quantization to shrink the vectors, to bring that down to single-digit milliseconds at roughly 95-99% recall.
The tradeoff: ANN is approximate. You won’t always get the exact 10 nearest neighbours; at 95% recall, expect nine or so of the true top 10, with near-misses filling the gaps. For semantic search, that’s usually fine.
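Recall is easy to measure: run the same query exactly and approximately, and count the overlap in the top-K. A self-contained sketch using crude int8 scalar quantization as a stand-in for a real ANN index (data and parameters are made up):

```python
import numpy as np

rng = np.random.default_rng(2)
vectors = rng.normal(size=(5000, 64)).astype(np.float32)
vectors /= np.linalg.norm(vectors, axis=1, keepdims=True)  # unit length

# A query near one of the stored vectors, with a little noise added.
query = vectors[rng.integers(5000)] + rng.normal(scale=0.1, size=64)

def exact_top_k(q, k=10):
    """Ground truth: brute-force scores over every row."""
    return set(np.argsort(vectors @ q)[::-1][:k].tolist())

# Crude compression: store each float32 component as an int8.
scale = np.abs(vectors).max()
quantized = np.round(vectors / scale * 127).astype(np.int8)

def approx_top_k(q, k=10):
    """Scores computed against the lossy int8 copy."""
    return set(np.argsort(quantized.astype(np.float32) @ q)[::-1][:k].tolist())

k = 10
recall = len(exact_top_k(query, k) & approx_top_k(query, k)) / k
print(f"recall@{k} = {recall}")
```

Real HNSW or IVF indices trade recall for speed in a more sophisticated way, but the measurement is the same: overlap between approximate and exact top-K.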
The three operations you’ll use
Vector databases are simpler than relational ones. Most workloads use just three operations:
- Upsert: insert a vector with an ID and metadata. If the ID already exists, replace it.
- Query: given a vector and a top-K, return the K nearest stored vectors (with their metadata and similarity scores), optionally constrained by metadata filters.
- Delete: remove a vector by ID.
That’s 95% of vector-database usage. There’s no JOIN, no GROUP BY, no transactions in the relational sense.
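The whole interface fits in a toy in-memory sketch. This is exact brute-force search with no ANN index, and the class and method names are illustrative, not any real product's API:

```python
import numpy as np

class ToyVectorStore:
    """Illustrative in-memory store: exact search, no ANN index."""

    def __init__(self):
        self.vectors = {}   # id -> np.ndarray
        self.metadata = {}  # id -> dict

    def upsert(self, id, vector, metadata=None):
        # Insert, or replace if the ID already exists.
        self.vectors[id] = np.asarray(vector, dtype=np.float32)
        self.metadata[id] = metadata or {}

    def query(self, vector, top_k=10):
        # Score every stored vector by cosine similarity, return the best K.
        q = np.asarray(vector, dtype=np.float32)
        q = q / np.linalg.norm(q)
        scored = []
        for id, v in self.vectors.items():
            score = float(np.dot(q, v / np.linalg.norm(v)))
            scored.append((score, id, self.metadata[id]))
        scored.sort(key=lambda t: t[0], reverse=True)
        return scored[:top_k]

    def delete(self, id):
        self.vectors.pop(id, None)
        self.metadata.pop(id, None)
```

A real database hides an ANN index behind `query`, but the surface area is about this small.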
Popular options
Four worth knowing in 2025:
- Pinecone: managed-only, fast, expensive at scale. Strong choice for early-stage teams that don’t want to run infrastructure.
- Weaviate: open-source, can self-host or use managed. Good metadata filtering and built-in hybrid search.
- Chroma: open-source, embed-in-process. Designed to start as a Python library and grow into a server. Great for prototyping.
- pgvector: a Postgres extension. If you already run Postgres, you can add vector search without standing up a new service. Good for <10M vectors; specialised stores beat it at higher scale.
Pinecone is fastest to ship. pgvector is cheapest if you have Postgres. Chroma is best for laptop-scale prototyping. Weaviate is the strongest middle ground for self-hosted production.
The metadata-filter gotcha
The first real-world surprise: queries usually combine vector similarity with metadata filters. “Find documents similar to this query, where status = published and language = en.”
How the database handles this matters a lot. Two strategies:
- Pre-filter: filter by metadata first, then do similarity search on the smaller subset. Fast when the filter is selective.
- Post-filter: do similarity search across everything, then drop results that don’t match the filter. Fast when the filter passes most rows, but a selective filter can leave you with fewer than K results.
Some vector databases pre-filter only; others post-filter only; the best ones choose adaptively. Read the docs before you commit. A poor strategy choice can turn a 5ms query into a 5-second one.
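The difference is easy to see with exact search on synthetic data (all values here are made up): both strategies return the same answer, but for a selective filter, pre-filtering scores ~100 vectors while post-filtering scores all 10,000.

```python
import numpy as np

rng = np.random.default_rng(0)
n, top_k = 10_000, 10
vectors = rng.normal(size=(n, 8)).astype(np.float32)
vectors /= np.linalg.norm(vectors, axis=1, keepdims=True)
published = rng.random(n) < 0.01  # selective filter: ~1% of rows match

query = rng.normal(size=8).astype(np.float32)
query /= np.linalg.norm(query)

# Pre-filter: restrict to matching rows, then score only those (~100 dot products).
idx = np.where(published)[0]
pre = idx[np.argsort(vectors[idx] @ query)[::-1][:top_k]]

# Post-filter: score everything (10,000 dot products), then drop non-matching rows.
order = np.argsort(vectors @ query)[::-1]
post = [i for i in order if published[i]][:top_k]

assert list(pre) == post  # same answer; very different amounts of work
```

With an ANN index the picture is messier: pre-filtering can defeat the index, and post-filtering can return too few results. That is why the strategy choice matters so much in practice.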
When you actually need one
You probably don’t need a vector database under 10,000 vectors. A flat file of embeddings, loaded into memory and scanned at query time, will give you exact top-K in milliseconds. The complexity of running a vector database isn’t justified at that scale.
You probably do need one above ~100,000 vectors, especially if you query frequently: at that scale, brute-force scans start costing real latency and memory.
Between 10K and 100K, it depends on query rate and metadata complexity. Add up the engineering cost of running another service vs the latency benefit, and pick the one that hurts less.
The biggest mistake teams make is reaching for Pinecone on day one for an MVP that has 800 documents. Start with a flat file. Move to a database when you actually feel the pain.
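“Start with a flat file” can be this literal: save the embeddings with `np.save`, load them at startup, and brute-force the top-K in one matrix-vector product. A sketch, with the file name, shapes, and data all illustrative:

```python
import numpy as np

# Suppose embeddings.npy holds one 384-dim embedding per document,
# saved earlier with np.save("embeddings.npy", vectors). Here we
# fabricate 800 rows so the sketch is self-contained.
vectors = np.random.default_rng(1).normal(size=(800, 384)).astype(np.float32)
vectors /= np.linalg.norm(vectors, axis=1, keepdims=True)  # unit length

def top_k(query, k=10):
    """Exact cosine top-K: one matrix-vector product over all rows."""
    q = query / np.linalg.norm(query)
    scores = vectors @ q
    best = np.argpartition(scores, -k)[-k:]               # unordered top k
    return best[np.argsort(scores[best])[::-1]].tolist()  # sorted descending

ids = top_k(vectors[0])
print(ids[0])  # a stored vector is its own nearest neighbour
```

For 800 documents this runs in well under a millisecond, returns exact rather than approximate results, and needs no extra service.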