The most common question we get on RAG engagements is "which vector database should we use?" The most useful answer is almost never the one you've been told to use. The vector-database market has been shaped by a handful of well-marketed products and a wave of LinkedIn benchmarks; the actual decision depends on your scale, your data, and your appetite for operational work.
We have shipped production RAG on pgvector, Pinecone, Qdrant, Weaviate, and OpenSearch. They are all fine. They are not interchangeable. This post is the decision tree we walk clients through before recommending any of them.
Start with pgvector
If you already have a Postgres database — and most teams do — you should default to pgvector until you have a concrete reason not to. The reasons are not glamorous and that is exactly why they get skipped in vendor comparisons:
- Your retrieval lives next to your application data, so you can join, filter, and sort by anything in your schema in the same query.
- One database to back up, one database to monitor, one database your on-call already understands.
- Real transactions, real foreign keys, real role-based access. Things that vector-only stores either do not have or have re-implemented poorly.
- With HNSW indexes on a recent Postgres, sub-100 ms p95 on a few million vectors is achievable on a single properly-sized instance.
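To make the first bullet concrete, here is a minimal sketch of what "retrieval next to your application data" looks like. The `documents` and `tenants` tables and the column names are hypothetical; the `<=>` cosine-distance operator is pgvector's.

```python
# One SQL statement mixing vector similarity with ordinary relational
# filters and a join. Schema is illustrative; `<=>` is pgvector's
# cosine-distance operator.
PGVECTOR_QUERY = """
SELECT d.id, d.title
FROM documents d
JOIN tenants t ON t.id = d.tenant_id
WHERE t.slug = %(tenant)s
  AND d.published_at IS NOT NULL
ORDER BY d.embedding <=> %(query_embedding)s::vector
LIMIT 10;
"""

# With psycopg (3.x) this would run as, roughly:
#   rows = conn.execute(PGVECTOR_QUERY,
#                       {"tenant": "acme", "query_embedding": emb}).fetchall()
```

The point is not the specific query but that tenancy, publication state, and anything else in your schema come along for free, in one round trip.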
The teams that regret pgvector are usually the ones who did not size it correctly or who tried to push it past the scale it's comfortable at. We'll cover when to graduate next.
When to graduate from pgvector
There are three signals we treat as a real reason to reach for a dedicated vector store, and a few we don't.
Signal: write-heavy at high cardinality
If you're ingesting tens of millions of new vectors per day and want them queryable within seconds, pgvector's index build cost will start to dominate. Pinecone and Qdrant handle this gracefully because their indexes are designed for continuous writes. Postgres is happiest when the write rate is bounded.
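One common way to keep Postgres inside that bound is to buffer incoming embeddings and write them in fixed-size batches rather than row by row. A minimal sketch; the batch size is an assumption, not a pgvector requirement:

```python
from itertools import islice

def batches(stream, size=1000):
    """Group an iterable of (id, embedding) pairs into fixed-size
    batches so each INSERT (or COPY) carries many rows instead of one."""
    it = iter(stream)
    while chunk := list(islice(it, size)):
        yield chunk

# Each yielded chunk becomes one multi-row write into the vectors
# table, keeping the write rate bounded and index churn lower.
```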
Signal: hybrid filtering at very high cardinality
If your retrieval has to combine vector similarity with high-cardinality structured filters (per-tenant, per-document, per-permission, multi-axis at once), Weaviate and Qdrant have better-designed filter pushdown into the index. Postgres can do this but the planner sometimes makes the wrong choice and you'll spend time debugging query plans.
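For contrast, this is roughly what a filtered search request looks like against Qdrant's REST API, where the filter is pushed down into the index traversal rather than applied after the fact. The field names (`tenant_id`, `doc_type`) are made up; the `must` / `key` / `match` structure is Qdrant's filter format.

```python
# Body of a POST to /collections/<name>/points/search (Qdrant REST API).
# Payload field names are hypothetical; the filter shape is Qdrant's.
search_request = {
    "vector": [0.12, -0.07, 0.33],  # the query embedding (truncated here)
    "limit": 10,
    "filter": {
        "must": [
            {"key": "tenant_id", "match": {"value": "acme"}},
            {"key": "doc_type",  "match": {"value": "contract"}},
        ]
    },
}
```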
Signal: more than ~50 million vectors per tenant
pgvector with HNSW does scale into the hundreds of millions on serious hardware, but the operational work — index rebuilds, memory pressure, vacuum behavior — gets harder fast. At that scale a dedicated vector store earns its keep. Below it, the decision is largely about who you want to be on the hook for the database.
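A rough way to see why the ~50 M mark matters: an HNSW index wants to be memory-resident, and a back-of-the-envelope estimate adds up quickly. The formula below (float32 vectors plus layer-0 graph links, M=16, 4-byte link ids) is our assumption of a floor, not a vendor's sizing guide.

```python
def hnsw_ram_gib(n_vectors: int, dims: int, m: int = 16) -> float:
    """Very rough lower bound on RAM for an in-memory HNSW index:
    float32 vectors plus ~2*M layer-0 links of 4 bytes per vector.
    Real indexes carry more overhead; treat this as a floor."""
    vector_bytes = n_vectors * dims * 4
    link_bytes = n_vectors * m * 2 * 4
    return (vector_bytes + link_bytes) / 2**30

# 50M vectors at 1536 dims already wants ~292 GiB before any overhead:
print(round(hnsw_ram_gib(50_000_000, 1536)))  # prints 292
```

At that size you are no longer choosing an index; you are choosing who babysits a very large machine.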
Signals we usually ignore
- Marketing benchmarks. Every vendor publishes a benchmark on which they are the fastest. Run your own on your own data with your own filter shape before you believe any of them.
- Algorithmic novelty. "We use product quantization with sub-clustering and reranking" is interesting; it is not a reason to switch databases.
- Multi-modal support as a hypothetical. If you don't have multi-modal vectors today, do not pick a database for the day you might.
The shortlist, written honestly
For the cases where pgvector isn't the right answer, here is how we think about the credible alternatives. None of these are wrong choices; they have different strengths.
| Database | Best for | Operational model | What to watch out for |
|---|---|---|---|
| pgvector | Default for almost every team with Postgres | Self-hosted alongside your existing DB | Index build cost on heavy write rates |
| Pinecone | Teams that want zero ops and predictable scaling | Fully managed serverless | Vendor lock-in; harder to debug at the storage layer |
| Qdrant | High write rate, complex filtering, self-hosted | Open source; managed cloud available | Smaller community; less ecosystem tooling |
| Weaviate | Hybrid search, multi-tenancy, schema-driven retrieval | Open source; managed cloud available | More moving parts; opinionated schema model |
| OpenSearch | Teams already invested in Elastic / OpenSearch | Self-hosted or AWS managed | Slower vector queries and lower recall than purpose-built stores |
Decision flow
[Diagram: how we pick a vector database. Default to pgvector; graduate only when a real production signal forces the move.]
The questions that actually decide it
Before we recommend any database, we run the team through six questions. The answers usually point at the right choice before we look at any benchmark.
1. How many vectors per tenant, in 18 months?
Not today's count: the number ambitious enough that you would be embarrassed to say it out loud. Multiply by 1.5 for headroom. If the answer is under 50 M and growing modestly, pgvector is almost certainly fine. Above that, look at Pinecone or Qdrant.
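The projection itself is simple compounding plus the headroom factor. A sketch; the growth rate and starting count are placeholders:

```python
def projected_vectors(current: int, monthly_growth: float, months: int = 18,
                      headroom: float = 1.5) -> int:
    """Compound the current per-tenant vector count forward over the
    planning horizon and apply the 1.5x headroom factor."""
    return int(current * (1 + monthly_growth) ** months * headroom)

# e.g. 2M vectors today growing 20% per month:
# projected_vectors(2_000_000, 0.20) -> roughly 80M, past the ~50M line.
```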
2. What is the write pattern?
Embed-once-and-forget (a static knowledge base) is easy on any store. Continuous-ingest (streaming events, changing documents) is where the dedicated vector stores pull ahead.
3. How many filter axes per query?
One filter (tenant ID): every store handles it. Three or four filters with high cardinality: Weaviate or Qdrant. Filters that depend on joining other relational data: pgvector wins by default.
4. Hybrid retrieval — yes or no?
If you need sparse + dense (BM25 + vector), some stores have it natively (Weaviate, Qdrant) and some force you to build it yourself. Building it yourself is usually fine; if you are allergic to that work, prefer a store that ships it.
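"Building it yourself" usually means running BM25 and vector retrieval separately and fusing the two ranked lists. Reciprocal rank fusion is the standard simple approach; a minimal sketch, where k=60 is the conventional constant and the document ids are made up:

```python
from collections import defaultdict

def rrf(rankings: list[list[str]], k: int = 60) -> list[str]:
    """Reciprocal rank fusion: each list contributes 1/(k + rank) per
    document, so documents strong in either list float to the top."""
    scores: dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)

bm25_hits  = ["doc_a", "doc_b", "doc_c"]   # sparse (keyword) results
dense_hits = ["doc_b", "doc_c", "doc_a"]   # dense (vector) results
print(rrf([bm25_hits, dense_hits]))        # doc_b ranks first here
```

Ten lines of fusion logic is usually all "build it yourself" amounts to; the harder part is running two retrieval paths in production.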
5. Who owns the operations?
A two-person engineering team without a database operator should not be self-hosting Qdrant on Kubernetes. Either pay for the managed tier or stay on pgvector. A platform team with real database expertise can run anything; the question is whether they want to.
6. Does the data need to live inside a specific environment?
SOC 2 / HIPAA / VPC-only requirements rule out some managed tiers and tilt the choice toward self-hosted Postgres, Qdrant, Weaviate, or AWS-native OpenSearch. This is one of the few questions that has a hard answer rather than a preference.
How we actually run the decision
On a typical engagement the decision takes us about a day, not because the question is hard, but because the team has rarely answered the six questions above for themselves. Once they have, the right database is usually obvious. We spend the rest of the engagement on the things that actually determine retrieval quality — chunking strategy, hybrid retrieval, reranking, evaluation harness — none of which depend on which store sits underneath.
The vector database matters less than the marketing wars would suggest. Pick the simplest one that satisfies your real constraints and move on to the work that actually changes the answers.