Choosing a Vector Database: How We Wasted $40K Learning What NOT to Do

We picked Pinecone because everyone else was using it. Then we tried Milvus because it was faster. Then we finally landed on pgvector. Here's what we learned the expensive way.

The $40K Vector Database Tour

“Just use Pinecone. Everyone uses Pinecone.”

That was the advice from our tech advisor when we started building our AI-powered document search product. Six months and $40,000 in wasted engineering costs later, we finally have a vector database setup that actually works for our use case.

This is the story of how we went through three different vector database platforms before landing on the right one—and why the “obvious” choice at the start turned out to be completely wrong for us.

For the complete technical analysis behind vector database selection, check out the comprehensive comparison on CrashBytes. But if you want to hear about what actually happens when you choose wrong, keep reading.

Month 1: Pinecone (Because That’s What Everyone Said)

Our use case seemed simple enough:

  • Document search across 50K+ legal contracts
  • Semantic similarity search for clauses and terms
  • Metadata filtering by client, date, document type
  • ~1,000 searches per day initially

“That’s perfect for Pinecone,” everyone said. So we signed up, integrated their API, and got a prototype working in 2 weeks.
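
Under the hood, the prototype's embedding step was the standard pattern: chunk each contract, run the chunks through an embedding model, and keep the resulting 1,536-dimension vectors. A minimal sketch of that step — the model choice and helper shown here are illustrative assumptions, not our exact code:

# Illustrative embedding step (the model name is an assumption, not our exact setup)
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

def embed_chunks(chunks):
    # Returns one 1,536-dimension vector per chunk, matching the dimensions we used throughout
    response = client.embeddings.create(
        model="text-embedding-3-small",
        input=chunks,
    )
    return [item.embedding for item in response.data]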

The Honeymoon Phase

The first 6 weeks were great:

# Look how easy this was!
import os
import pinecone

pinecone.init(api_key=os.environ['PINECONE_KEY'])
index = pinecone.Index('contracts')

# Upload embeddings
index.upsert(vectors=[
    (doc_id, embedding, metadata)
    for doc_id, embedding, metadata in documents
])

# Search - so simple!
results = index.query(
    vector=query_embedding,
    filter={"document_type": "NDA"},
    top_k=10
)

It just worked. Our demo impressed clients. The product team loved how fast we were moving.

“See? I told you Pinecone was the right choice,” our tech advisor said smugly.

Month 2: The Bill Arrives

Then we got our first real Pinecone bill: $847.

For context, we had:

  • ~200K vectors
  • ~5,000 queries per week
  • Very light usage compared to our projections

I did some quick math. If we hit our target of 100K queries per day (our growth projection), we’d be looking at $15K-20K per month just for vector search.

Our entire AWS bill was $3K per month. Vector search would cost 5-6x our entire infrastructure combined.

But we’d already built everything around Pinecone. Sunk cost fallacy kicked in hard.

“We’ll optimize later,” I told the CEO. “Let’s focus on customer acquisition first.”

Month 3: The Performance Wall

Growth came faster than expected. We onboarded a client with 500K documents—an order of magnitude more than our prototype was designed for.

Query latency went from 80ms to 1,200ms overnight.

We frantically tried to optimize:

  • ✅ Scaled up the index pods (cost jumped to $1,500/month)
  • ✅ Reworked our embedding dimensions (which broke our existing feature set)
  • ✅ Added aggressive caching (which reduced search quality)

Nothing worked. We’d hit a fundamental limit: our use case needed complex metadata filtering on large datasets, and Pinecone’s performance degraded badly under that pattern.
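
To make that concrete, a typical production query looked less like the clean demo above and more like the shape below. This is a reconstruction for illustration — the field names and values are made up — but it captures the pattern that hurt us: rich metadata filters over a much larger index.

# Reconstruction of the query shape that degraded on the 500K-document index
# (field names and values are illustrative, not our production code)
results = index.query(
    vector=query_embedding,
    filter={
        "client_id": {"$eq": 4521},
        "document_type": {"$in": ["NDA", "MSA", "SOW"]},
        "effective_date": {"$gte": 1672531200},  # epoch seconds; Pinecone filters compare numbers
    },
    top_k=50,
    include_metadata=True,
)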

The client threatened to leave. We had 2 weeks to fix it.

That’s when we made the fatal mistake: we decided to migrate to Milvus.

Month 4-5: Milvus (Because It’s “Way Faster”)

I spent a weekend reading benchmarks. Milvus crushed Pinecone on performance tests. Sub-50ms queries on datasets with millions of vectors! And it was open source—we could self-host and save money!

Perfect. Let’s migrate.

Week 1: “How Hard Can It Be?”

Our DevOps engineer (who’d never touched Milvus before) started the deployment:

# "Just install it with Helm, they said"
helm install milvus milvus/milvus \
    --set cluster.enabled=true \
    --set etcd.replicaCount=3 \
    --set pulsar.enabled=true
    # ... 47 more configuration parameters

4 days later, we had Milvus running. Sort of. It crashed every few hours for reasons we couldn’t figure out.

Week 2-3: The Debugging Nightmare

Problems we encountered:

  1. Pulsar won’t stay up: Milvus uses Apache Pulsar for message queuing. Pulsar kept crashing. Turns out we didn’t allocate enough memory. How much memory? Nobody could tell us. We just kept increasing it.

  2. etcd fails health checks: Milvus uses etcd for coordination. etcd was failing health checks. Why? Network latency issues we didn’t understand.

  3. MinIO storage issues: Milvus stores data in object storage. We configured MinIO. MinIO ran out of disk space. We increased disk. MinIO performance tanked.

  4. Index building takes forever: Rebuilding indexes took 6 hours. During rebuilds, queries were slow or failed entirely.
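
That last item deserves a concrete picture: rebuilding an index in Milvus meant releasing the collection first, which is exactly why queries degraded or failed during rebuilds. A rough sketch of the procedure we kept running — the host and index parameters here are illustrative, not a recommendation:

# Rough sketch of an index rebuild in pymilvus (host and parameters are illustrative)
from pymilvus import connections, Collection

connections.connect(host="milvus.internal", port="19530")  # hypothetical internal host
collection = Collection("contracts")

collection.release()      # the collection has to be released before its index can be dropped
collection.drop_index()
collection.create_index(
    field_name="embedding",
    index_params={
        "index_type": "HNSW",
        "metric_type": "L2",
        "params": {"M": 16, "efConstruction": 200},
    },
)
collection.load()         # until the multi-hour build finished, queries were slow or failing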

Week 4: The Breaking Point

After 3 weeks of debugging, we had Milvus almost stable. Query performance was indeed impressive—15ms p95 latency when it worked.

But “when it worked” was the problem.

One Saturday night at 11 PM, Milvus crashed. The on-call engineer couldn’t figure out why. The logs were inscrutable. Milvus wouldn’t restart.

Our product was down for 6 hours.

Monday morning, the CEO asked me a simple question: “Why are we using technology that only one person on the team understands, and they don’t understand it well enough to keep it running?”

I didn’t have a good answer.

We’d spent 5 weeks migrating to Milvus. We’d invested ~$25K in engineering time. Our product had suffered multiple outages. And we still didn’t have a stable system.

Time for Plan C.

Month 6: pgvector (Because We Already Know PostgreSQL)

Our CTO said something in a meeting that changed everything:

“We’re a 12-person engineering team. We’re not Google. We’re not going to become Milvus experts. What if we just used PostgreSQL?”

“PostgreSQL doesn’t do vector search,” someone said.

“Actually,” our senior backend engineer spoke up, “pgvector is pretty good now. And there’s this new extension called pgvectorscale that makes it way faster.”

We looked at each other. We already run PostgreSQL. Our entire team knows PostgreSQL.

“How fast is it?” I asked.

“Not as fast as Milvus. But probably fast enough. And we can actually operate it.”

The Migration (That Actually Worked)

We spent 1 week migrating to pgvector + pgvectorscale:

-- Install extensions
CREATE EXTENSION vector;
CREATE EXTENSION vectorscale CASCADE;

-- Create table with vector column
CREATE TABLE contract_embeddings (
    id SERIAL PRIMARY KEY,
    contract_id INTEGER REFERENCES contracts(id),
    embedding vector(1536),
    document_type VARCHAR(50),
    client_id INTEGER,
    created_at TIMESTAMP DEFAULT NOW()
);

-- Create index using StreamingDiskANN (explicit cosine-distance opclass, matching the queries below)
CREATE INDEX ON contract_embeddings
USING diskann (embedding vector_cosine_ops);

-- Add metadata indexes for hybrid search
CREATE INDEX idx_document_type ON contract_embeddings(document_type);
CREATE INDEX idx_client_id ON contract_embeddings(client_id);

That’s it. No Kubernetes operators. No distributed message queues. No object storage configuration. Just PostgreSQL doing what PostgreSQL does.
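
The data move was similarly unexciting: export vectors and metadata from the old store, insert them into the new table. A minimal sketch of the backfill, assuming the pgvector Python package, a psycopg2 connection, and rows exported as (contract_id, numpy embedding, document_type, client_id) tuples — not our exact migration script:

# Minimal backfill sketch (assumes exported rows; not our exact migration script)
import os
import psycopg2
from pgvector.psycopg2 import register_vector

conn = psycopg2.connect(os.environ["DATABASE_URL"])
register_vector(conn)  # teaches psycopg2 to adapt numpy arrays to the vector column type

def backfill(rows):
    # rows: iterable of (contract_id, embedding as numpy array, document_type, client_id)
    with conn, conn.cursor() as cur:
        cur.executemany(
            """
            INSERT INTO contract_embeddings (contract_id, embedding, document_type, client_id)
            VALUES (%s, %s, %s, %s)
            """,
            rows,
        )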

The Performance Reality

We ran benchmarks comparing our three attempts:

Pinecone:

  • ⚡ Latency: 80-120ms (on small dataset), 800-1200ms (on large dataset)
  • 💰 Cost: $15-20K/month at target scale
  • 🔧 Operational complexity: Low (fully managed)
  • 📈 Scaling: Automatic but expensive

Milvus:

  • ⚡ Latency: 15-30ms (when working)
  • 💰 Cost: $800/month infrastructure + massive engineering overhead
  • 🔧 Operational complexity: Very High
  • 📈 Scaling: Flexible but requires expertise
  • 🔥 Stability: Terrible (for us, due to lack of expertise)

pgvector with pgvectorscale:

  • ⚡ Latency: 45-65ms
  • 💰 Cost: $400/month (incremental on existing PostgreSQL)
  • 🔧 Operational complexity: Low (we already operate PostgreSQL)
  • 📈 Scaling: Standard PostgreSQL approaches
  • 🔥 Stability: Excellent (same as our existing PostgreSQL)
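
Numbers like the pgvector ones are straightforward to reproduce: run the real query shape in a loop against production-sized data and take percentiles. A simplified sketch of the kind of harness that produces them (not our exact benchmarking code), assuming the psycopg2 setup above and a sample of real query embeddings:

# Simplified p95/p99 latency check against the live hybrid query
import time
import numpy as np

def measure_latency(conn, query_embeddings, client_id, doc_types, since, n=500):
    sql = """
        SELECT ce.contract_id
        FROM contract_embeddings ce
        JOIN contracts c ON c.id = ce.contract_id
        WHERE ce.client_id = %s AND ce.document_type = ANY(%s) AND c.created_at > %s
        ORDER BY ce.embedding <=> %s
        LIMIT 10
    """
    latencies_ms = []
    with conn.cursor() as cur:
        for emb in query_embeddings[:n]:
            start = time.perf_counter()
            cur.execute(sql, (client_id, doc_types, since, emb))
            cur.fetchall()
            latencies_ms.append((time.perf_counter() - start) * 1000)
    return np.percentile(latencies_ms, 95), np.percentile(latencies_ms, 99)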

The Surprise Benefits

The most valuable discovery wasn’t the performance—it was the hybrid query capabilities.

In Pinecone and Milvus, combining vector search with complex metadata filtering was awkward (there’s a sketch of the old pattern right after this list). We had to:

  1. Query by vector similarity
  2. Filter results in application code
  3. Make additional database queries for related data
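
In code, that three-step dance looked roughly like this — a simplified reconstruction of the Pinecone-era flow, with illustrative variable names:

# Simplified reconstruction of the old flow: vector search, app-side filtering,
# then a second round trip for the relational data
results = index.query(vector=query_embedding, top_k=100, include_metadata=True)

candidates = [
    m for m in results.matches
    if m.metadata.get("client_id") == client_id
    and m.metadata.get("document_type") in allowed_types
][:10]

contract_ids = [m.metadata["contract_id"] for m in candidates]
cur.execute(
    "SELECT id, title, client_name, created_at FROM contracts WHERE id = ANY(%s)",
    (contract_ids,),
)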

With pgvector, we could do everything in one SQL query:

SELECT
    ce.contract_id,
    c.title,
    c.client_name,
    ce.embedding <=> $1 AS distance,
    c.created_at,
    c.document_type
FROM contract_embeddings ce
JOIN contracts c ON c.id = ce.contract_id
WHERE
    ce.client_id = $2
    AND ce.document_type = ANY($3)
    AND c.created_at > $4
ORDER BY ce.embedding <=> $1
LIMIT 10;

One query. Joins. Complex filtering. Vector similarity. All optimized by PostgreSQL’s query planner.

This was impossible to do efficiently in our previous attempts.

The Actual Lessons (That Cost $40K)

Lesson 1: Match Technology to Team Expertise

Wrong thinking: “Use the best technology for the problem.”

Right thinking: “Use the best technology your team can successfully operate.”

Milvus is objectively better than pgvector on performance benchmarks. But we couldn’t operate it successfully, so it was the wrong choice for us.

Our team knows PostgreSQL deeply. pgvector, even if slightly slower, is the right choice because we can debug it, optimize it, and keep it running.

Lesson 2: Understand Your ACTUAL Use Case

We thought our use case was “vector search.” It wasn’t.

Our use case was “hybrid search combining vector similarity with complex relational queries and metadata filtering.”

That changes everything. pgvector’s tight integration with PostgreSQL’s query planner made it better for our use case than Milvus’s raw performance.

Benchmarks don’t capture this nuance.

Lesson 3: Consider Total Cost of Ownership

Our cost analysis was naive:

  • Pinecone: $15K/month seemed expensive
  • Milvus: $800/month infrastructure = cheap!
  • pgvector: $400/month = cheapest!

But we didn’t account for engineering time:

  • Pinecone: $15K/month + minimal engineering time
  • Milvus: $800/month + $30K+/month in engineering time spent debugging and maintaining it
  • pgvector: $400/month + minimal incremental engineering time

Total cost of ownership (TCO) looked like:

  • Pinecone: $180K/year
  • Milvus: $380K/year (!!!)
  • pgvector: $20K/year
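
The arithmetic behind those annual figures is rough by design; the engineering-time numbers are estimates, so things only reconcile approximately:

# Rough annualization of the monthly figures above (engineering time is an estimate)
pinecone_tco = 15_000 * 12               # ~$180K/year, almost entirely service cost
milvus_tco   = 800 * 12 + 30_000 * 12    # ~$370K+/year once debugging time is counted
pgvector_tco = 400 * 12 + 15_000         # ~$20K/year, including a modest maintenance allowance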

Lesson 4: Operational Simplicity Has Enormous Value

When Pinecone had issues, we filed a support ticket and they fixed it.

When pgvector has issues, we debug it with tools we already know and skills we already have.

When Milvus had issues, we were stuck googling error messages and reading GitHub issues written in Chinese.

Operational simplicity compounds. Every hour saved on operations is an hour spent building features.

Lesson 5: Start Simple, Scale Later

We made the classic mistake: optimizing for a scale problem we didn’t have yet.

Our initial traffic was 5,000 queries per week. Pinecone could have handled that for $50/month. We optimized for 100K queries per day—scale we wouldn’t hit for 2 years.

The right approach: Start with the simplest thing that works. Migrate when you actually hit scale limits, not when you imagine you might.

What I’d Do Differently

If I could go back to Month 1:

Path 1: If We Were Pre-Product/Market Fit

Use Pinecone. Don’t worry about the cost. Move fast, validate the product, get customers.

Then migrate to cost-effective solutions once you know the product works and what your actual scale/performance requirements are.

Never start with Milvus unless you have a dedicated platform engineering team with distributed systems expertise.

Path 2: If We Already Had PostgreSQL Expertise

Start with pgvector. Add pgvectorscale if you need better performance. Prove your product works.

Only migrate if you genuinely outgrow PostgreSQL’s capabilities—which might never happen.

Path 3: If We Were Building AI Infrastructure as a Product

Start with Milvus or similar. But staff appropriately—you need dedicated engineers with distributed systems expertise.

Budget for: 6-12 months of infrastructure work before it’s production-ready.

Where We Are Now: 6 Months Later

We’ve been running on pgvector + pgvectorscale for 6 months now:

Performance:

  • p95 latency: 52ms (well within our requirements)
  • p99 latency: 87ms (totally acceptable)
  • No performance degradation as we’ve scaled to 2M+ vectors

Cost:

  • Monthly infrastructure cost: $420
  • Incremental engineering maintenance: ~2 hours per month
  • Savings vs. Pinecone: $14,580/month
  • Payback period on the migration effort: ~2.5 months

Reliability:

  • Zero outages related to vector search
  • Same reliability as our PostgreSQL database (99.95%+)
  • Debugging issues takes minutes, not hours

Developer Experience:

  • New engineers productive immediately (they know SQL)
  • Complex hybrid queries are easier than before
  • No proprietary APIs to learn

The Bottom Line

Pinecone is great for:

  • Getting started quickly
  • Teams without infrastructure expertise
  • Unpredictable scaling needs
  • When cost isn’t the primary concern

Milvus is great for:

  • Extreme performance requirements
  • Teams with distributed systems expertise
  • Building AI infrastructure as a core competency
  • When operational complexity is acceptable

pgvector + pgvectorscale is great for:

  • Teams already using PostgreSQL
  • Hybrid search use cases
  • Cost-conscious operations
  • When operational simplicity matters

For us, pgvector was the right choice. But it took $40K and 6 months to figure that out.

The real lesson: Technology decisions should be based on your team’s capabilities and actual requirements, not benchmarks, hype, or what “everyone else” is doing.

For the complete technical framework and decision criteria we should have used from the start, check out the Vector Database Comparison Guide on CrashBytes.

Learn from our expensive mistakes. Choose based on your context, not ours.


Made similar mistakes with your vector database choice? Or found a different solution that works better? I’d love to hear your story. Reach out at michael@michaeleakins.com.