Amazon S3 Vectors: What It Is, Where It Fits, and the Gotchas Nobody Tells You

S3 Vectors adds native vector storage and ANN search to S3 via vector buckets and vector indexes. It’s cost‑oriented, elastic, and ideal for large volumes with sub‑second queries when throughput is moderate. It’s in preview and has sharp edges: a hard Top‑K of 30 and float32‑only vectors. This post focuses on where it fits and the gotchas that actually matter in design.


Why S3 Vectors matters

Embeddings are everywhere (RAG, recommendations, search). Most teams start with a hosted vector DB and quickly hit two things:

  1. Cost at scale: millions to billions of vectors add up.
  2. Ops overhead: clusters, replicas, upgrades, and capacity planning.

S3 Vectors flips that default: store vectors in S3 with a purpose‑built index that you can query directly. You don’t manage nodes; you pay S3‑style pricing for storage + per‑request querying. If your workload tolerates low to mid QPS but needs large capacity and durability, this is a compelling baseline.


Key Concepts

  • Vector bucket: a special S3 bucket type for vectors.
  • Vector index: a logical container inside a vector bucket where you write/query vectors (all vectors in an index share dimension and distance metric).
  • Vector: { key, values: float32[d], metadata }, where metadata can be filterable/non‑filterable.
  • Queries: approximate nearest neighbor (ANN) with optional metadata filters.

Key properties you’ll design around:

  • Dimension: 1…4096 (set at index creation).
  • Distance: Cosine or Euclidean (immutable per index).
  • Throughput: write RPS per index is low; design for batching and sharding.
  • Retrieve limit: Top‑K is capped at 30 per similarity query, with no pagination mechanism for ANN queries.
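To make the float32 and fixed‑dimension constraints concrete, here is a minimal sketch in plain Python (no AWS calls). The record shape mirrors the { key, values, metadata } structure above; treating it as the wire format is an assumption, not confirmed API:

```python
from array import array

def to_float32(values):
    """Coerce a numeric sequence to float32, the only dtype S3 Vectors
    stores for vector values. Precision beyond float32 is silently lost."""
    return list(array("f", values))

def make_vector(key, values, dim, metadata=None):
    """Build a vector record in the { key, values, metadata } shape.
    Raises if the length doesn't match the index's immutable dimension."""
    if len(values) != dim:
        raise ValueError(f"expected {dim} dimensions, got {len(values)}")
    return {"key": key, "values": to_float32(values), "metadata": metadata or {}}

# 0.1 is not exactly representable in float32, so values shift slightly on write
vec = make_vector("doc-1", [0.1, 0.2, 0.3], dim=3, metadata={"tenant": "acme"})
```

The float32 round‑trip is worth internalizing: if you compare stored vectors against float64 originals (e.g., in an offline recall test), expect tiny per‑component drift.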

Hard limits (and how to design around them)

  • Top‑K ≤ 30 (no pagination). If you need more than 30 results per query, another vector database may be a better fit.

  • Inserts: up to 500 vectors per PutVectors; 5 write RPS per index.

    • Design: batch & backpressure; parallelize across multiple indexes (logical shards) if you need higher throughput.
  • Data types: float32 only for vector values. If you pass other types, S3 Vectors converts to float32.

  • Immutable index schema: dimension, distance, and non‑filterable metadata keys can’t be changed.
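The batch‑and‑shard advice above can be sketched in plain Python (no AWS calls). In practice each batch would go to a PutVectors call against the shard’s index, with pacing to stay under the per‑index write RPS; the index‑naming scheme here is a made‑up convention:

```python
import hashlib
from itertools import islice

MAX_BATCH = 500  # PutVectors accepts up to 500 vectors per call

def batches(vectors, size=MAX_BATCH):
    """Yield successive batches of at most `size` vectors."""
    it = iter(vectors)
    while batch := list(islice(it, size)):
        yield batch

def shard_for(key, num_shards):
    """Stable shard assignment: hash the vector key to pick a logical index.
    Spreading writes over N indexes multiplies effective write RPS by N."""
    h = int(hashlib.sha256(key.encode()).hexdigest(), 16)
    return h % num_shards

# e.g., route each batch to a hypothetical index name like
# f"my-index-{shard_for(batch[0]['key'], 4)}" and sleep between calls
# to respect the ~5 write RPS per-index ceiling.
```

Hashing the key (rather than round‑robin) keeps assignment stable, so re‑ingesting a vector overwrites its previous copy instead of duplicating it in another shard.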


Things AWS docs don’t spell out (yet)

  • CloudFormation / Terraform: no native resources to create vector buckets/indexes in preview — automate with CLI/SDK from CDK or CFN Custom Resources. See AWS re:Post confirmation (no CFN/Terraform support as of Sep 2025).
  • Dedup across indexes? Not documented. Pricing/examples suggest you pay per copy per index. If you want to “reuse” embeddings, prefer a single index with metadata/filters, or partition by namespaces/tenants.
  • One bucket with many indexes or many buckets? Best practice: one bucket with multiple indexes and access control via IAM/bucket policies. Use many buckets only for strong isolation (accounts/regions/KMS/lifecycle). See S3 Vectors best practices.
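One way to “reuse” embeddings in a single index, as suggested above, is to stamp every vector with a `tenant` metadata key and filter on it at query time. A sketch of the request shape in plain Python dicts; the field names and equality‑filter grammar are assumptions, not confirmed API:

```python
def tenant_query(index_name, tenant, query_values, top_k=30):
    """Build a hypothetical QueryVectors-style request restricting
    results to a single tenant via a metadata filter."""
    return {
        "indexName": index_name,
        "queryVector": query_values,
        "topK": min(top_k, 30),       # hard cap: Top-K <= 30
        "filter": {"tenant": tenant}, # assumed equality-filter syntax
    }
```

Note that `tenant` must be a filterable metadata key, which means deciding it is not in the non‑filterable set at index creation; that choice is immutable.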

Cost intuition (rule‑of‑thumb)

S3 Vectors aims to lower total cost for large volumes with moderate QPS: you pay S3‑like storage + per‑request costs for index/query. No cluster provisioning, and batch ingest jobs are cheap.
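As a back‑of‑envelope model (both unit prices below are illustrative placeholders, not AWS’s published rates): monthly cost is roughly storage plus per‑request querying, with no baseline cluster fee.

```python
def monthly_cost(gb_stored, queries, price_per_gb=0.03, price_per_1k_queries=0.005):
    """Toy cost model: storage + per-request querying, no cluster fee.
    Both unit prices are illustrative placeholders."""
    return gb_stored * price_per_gb + (queries / 1000) * price_per_1k_queries

# 100 GB of vectors + 1M queries/month under these placeholder prices
estimate = monthly_cost(100, 1_000_000)
```

The structural point is the absence of a fixed term: at low QPS the bill scales down with usage, which is exactly where provisioned clusters waste money.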


When to choose S3 Vectors vs alternatives

| Need | S3 Vectors | OpenSearch (Serverless/Managed) | SaaS Vector DB |
|---|---|---|---|
| Massive volume, low cost, low‑to‑mid QPS | Yes | Maybe (if QPS climbs) | Maybe |
| Ultra‑low latency, high QPS, complex filters | Can fall short | Yes | Yes |
| Simple ops (no clusters) | Yes | Less | Less |
| Bedrock tie‑in (embeddings/KB) | Yes | Indirect | Varies |

S3 Vectors vs Upstash Vectors (serverless)

High‑level

| Aspect | S3 Vectors | Upstash Vectors |
|---|---|---|
| Nature | Object store with native vector indexes (preview) | Managed serverless vector DB |
| Pricing model | Storage + per‑request query (see AWS announcement) | Per‑request pricing (e.g., $0.4/100K requests in PAYG) + storage (e.g., $0.25/GB) |
| Query limits | Top‑K ≤ 30, no pagination in QueryVectors | Client‑set topK with pagination via Resumable Query (cursor) |
| Filtering | Metadata filters; non‑filterable keys declared at index creation | Metadata filtering; namespaces for isolation |
| Data type | float32 only for vector values | Accepts numeric vector payloads (effectively treated as float arrays) |
| Multi‑tenancy | One bucket, many indexes (index per tenant) recommended | Multiple indexes + namespaces; multiple DBs per project |

Implications

  • If you need scroll/infinite‑results UIs, Upstash’s resumable query is simpler. In S3 Vectors you’ll fan out across segments and re‑rank to work around Top‑K=30; since each segment still returns at most 30 results, some relevant vectors may never be retrieved.
  • If your profile is huge corpora + moderate QPS + lowest storage cost, S3 Vectors is attractive. For real‑time UX with richer ranking/pagination, Upstash is the lower‑friction choice.
  • Multi‑tenant: both solve it, but S3 Vectors centralizes observability/logging per bucket; Upstash leans on namespaces and per‑index isolation.
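The fan‑out pattern from the first bullet can be sketched as: query each segment (logical index) for its local top 30, then merge by distance and keep the global top N. Plain Python; the per‑segment result lists stand in for real QueryVectors responses:

```python
import heapq

def fan_out_top_n(segment_results, n):
    """Merge per-segment ANN results (each capped at Top-K=30) into a
    global top-n list, deduplicating by key and keeping the smallest
    distance for any key seen in multiple segments."""
    best = {}
    for results in segment_results:
        for item in results:  # item: {"key": ..., "distance": ...}
            k = item["key"]
            if k not in best or item["distance"] < best[k]["distance"]:
                best[k] = item
    return heapq.nsmallest(n, best.values(), key=lambda r: r["distance"])

# Caveat: a vector ranked 31st inside its own segment never reaches the
# merge step, which is the recall gap noted in the bullet above.
```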

Closing

S3 Vectors won’t replace every vector DB, but it changes the starting point. If your priority is cost and simplicity with large datasets and moderate QPS, start with S3 Vectors and scale out to OpenSearch or a dedicated vector DB only where the latency/query profile demands it.

What should I benchmark next? Throughput vs. shards, recall vs. filters, or evaluator design for reranking under Top‑K=30.



