Vector
Overview
Vector databases enable you to store, index, and search high-dimensional numerical vectors that represent complex data like text embeddings, images, or any feature vectors.
Vector Index provides high-performance storage and similarity search for embedding vectors with configurable distance metrics and metadata filtering. Use it to build recommendation systems, semantic search applications, and content discovery features that rely on vector similarity calculations. The system handles high-dimensional data efficiently while providing familiar database operations like insert, update, delete, and query with additional vector-specific capabilities like nearest neighbor search and batch processing.
Prerequisites
- Understanding of vector embeddings and similarity search concepts
- Raindrop framework installed in your project
- Familiarity with TypeScript and async/await patterns
- Basic knowledge of distance metrics (cosine, euclidean, dot-product)
- Embedding model or vector generation pipeline
Configuration
Define a vector index in your Raindrop application manifest with the required dimensions and distance metric:
application "my-app" {
  vector_index "embeddings" {
    dimensions = 1024
    metric = "cosine" // or "euclidean", "dot-product"
  }
}
Distance Metrics:
- cosine: Measures the angle between vectors (ignores magnitude). Best for text embeddings and semantic similarity.
- euclidean: Straight-line distance in vector space. Good when magnitude matters (coordinates, measurements).
- dot-product: Combines angle and magnitude. Useful when vector length represents confidence or importance.
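To make the difference between the three metrics concrete, here is a minimal sketch of how each could be computed by hand. The function names are illustrative only and are not part of the Raindrop API; the index performs these calculations internally.

```typescript
// Illustrative implementations; names are hypothetical, not Raindrop APIs.
function dotProduct(a: number[], b: number[]): number {
  if (a.length !== b.length) throw new Error("Vectors must have the same length");
  let sum = 0;
  for (let i = 0; i < a.length; i++) sum += a[i] * b[i];
  return sum;
}

function euclideanDistance(a: number[], b: number[]): number {
  if (a.length !== b.length) throw new Error("Vectors must have the same length");
  let sum = 0;
  for (let i = 0; i < a.length; i++) {
    const d = a[i] - b[i];
    sum += d * d;
  }
  return Math.sqrt(sum);
}

function cosineSimilarity(a: number[], b: number[]): number {
  const normA = Math.sqrt(dotProduct(a, a));
  const normB = Math.sqrt(dotProduct(b, b));
  if (normA === 0 || normB === 0) {
    throw new Error("Cosine similarity is undefined for zero vectors");
  }
  return dotProduct(a, b) / (normA * normB);
}

const a = [1, 0];
const b = [2, 0]; // same direction as a, twice the length

console.log(cosineSimilarity(a, b));  // 1 (identical direction, magnitude ignored)
console.log(dotProduct(a, b));        // 2 (grows with magnitude)
console.log(euclideanDistance(a, b)); // 1 (absolute gap between the points)
```

The two vectors point the same way, so cosine treats them as identical, while dot product and euclidean distance both react to the difference in length.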
Access
The vector index becomes available through your environment bindings, allowing you to perform all vector operations:
export default class extends Service<Env> {
  async fetch(request: Request): Promise<Response> {
    // Get index information
    const indexInfo = await this.env.EMBEDDINGS.describe();

    // Store vectors (minimal example)
    await this.env.EMBEDDINGS.insert([{
      id: "doc-1",
      values: new Float32Array(1024).fill(0.1) // Must match dimensions
    }]);

    // Store vectors with optional metadata and namespace
    await this.env.EMBEDDINGS.insert([{
      id: "doc-2",
      values: new Float32Array(1024).fill(0.2),
      namespace: "documents", // Optional: organize vectors
      metadata: {             // Optional: filterable data
        title: "Getting Started Guide",
        category: "documentation"
      }
    }]);

    // Search for similar content
    const results = await this.env.EMBEDDINGS.query(
      new Float32Array(1024).fill(0.15),
      { topK: 5 }
    );

    return new Response(JSON.stringify(results));
  }
}
Core Interfaces
VectorIndex
Main vector index interface providing all vector operations:
interface VectorIndex {
  // Get index information
  describe(): Promise<VectorIndexIndexInfo>;
  // Search by vector
  query(vector: VectorFloatArray | number[], options?: VectorIndexQueryOptions): Promise<VectorIndexMatches>;
  // Search by existing vector ID
  queryById(vectorId: string, options?: VectorIndexQueryOptions): Promise<VectorIndexMatches>;
  // Add new vectors
  insert(vectors: VectorIndexVector[]): Promise<VectorIndexAsyncMutation>;
  // Add or update vectors
  upsert(vectors: VectorIndexVector[]): Promise<VectorIndexAsyncMutation>;
  // Remove vectors by ID
  deleteByIds(ids: string[]): Promise<VectorIndexAsyncMutation>;
  // Retrieve vectors by ID
  getByIds(ids: string[]): Promise<VectorIndexVector[]>;
}
VectorIndexVector
Represents a vector with its data and metadata:
interface VectorIndexVector {
  id: string;                                          // Unique vector identifier
  values: VectorFloatArray | number[];                 // Vector data (must match index dimensions)
  namespace?: string;                                  // Optional logical grouping
  metadata?: Record<string, VectorIndexVectorMetadata>; // Optional filterable key-value pairs
}
VectorIndexMatch
Search result with similarity score:
interface VectorIndexMatch {
  id: string;                                          // Vector identifier
  score: number;                                       // Similarity score (higher = more similar)
  values?: VectorFloatArray | number[];                // Vector data (if requested)
  namespace?: string;                                  // Vector namespace
  metadata?: Record<string, VectorIndexVectorMetadata>; // Vector metadata (if requested)
}
VectorIndexMatches
Collection of search results:
interface VectorIndexMatches {
  matches: VectorIndexMatch[]; // Array of similar vectors
  count: number;               // Total number of matches found
}
VectorIndexQueryOptions
Configuration for similarity searches:
interface VectorIndexQueryOptions {
  topK?: number;                                              // Maximum results to return (max: 100)
  namespace?: string;                                         // Limit search to specific namespace
  returnValues?: boolean;                                     // Include vector values in results
  returnMetadata?: boolean | VectorIndexMetadataRetrievalLevel; // Include metadata in results
  filter?: VectorIndexVectorMetadataFilter;                   // Metadata filtering conditions
}
VectorIndexDistanceMetric
Supported distance calculation methods:
type VectorIndexDistanceMetric = 'euclidean' | 'cosine' | 'dot-product';
Query Methods
query(vector, options?)
Finds vectors most similar to the provided query vector using the configured distance metric. Control the number of results, filter by metadata, and choose what data to return.
Parameters:
- vector: Query vector as VectorFloatArray or number[] (must match index dimensions)
- options: Optional query configuration object

Returns: VectorIndexMatches containing:
- matches: Array of similar vectors with scores
- count: Total number of matches found
// Create a query vector (same dimensions as index)
const queryVector = new Float32Array([0.1, 0.3, 0.2, 0.4]);

// Basic similarity search
const results = await this.env.EMBEDDINGS.query(queryVector, {
  topK: 5,
  returnMetadata: true,
  returnValues: false // Don't return the vector values to save bandwidth
});

// Process results
for (const match of results.matches) {
  console.log(`${match.id}: ${match.score} - ${match.metadata?.title}`);
}
Advanced Query with Filtering:
// Search with metadata filtering
const filteredResults = await this.env.EMBEDDINGS.query(queryVector, {
  topK: 10,
  namespace: "articles",     // Only search within this namespace
  returnMetadata: "indexed", // Only return indexed metadata fields
  filter: {
    category: "documentation",
    publishedAt: { $ne: null }, // Must have publication date
    tags: "tutorial"            // Must include "tutorial" tag
  }
});
Query Options:
- topK: Maximum number of results to return (maximum: 100)
- namespace: Limit search to specific namespace
- returnValues: Whether to include vector values in results
- returnMetadata: Metadata return level (true, "all", "indexed", "none")
- filter: Metadata filtering conditions
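Two small client-side helpers are often useful around a query: keeping topK within the documented limit of 100, and discarding weak matches. The sketch below uses local stand-in types rather than the real Raindrop interfaces, and both the helper names and the score cutoff are hypothetical. It assumes that higher scores mean more similar, as stated for VectorIndexMatch, and it does not claim to know how the index itself reacts to an out-of-range topK.

```typescript
// Local stand-ins for the documented result shapes (not imported from Raindrop).
interface ScoredMatch {
  id: string;
  score: number;
}

interface ScoredMatches {
  matches: ScoredMatch[];
  count: number;
}

const MAX_TOP_K = 100; // documented maximum for topK

// Clamp a requested result count into the range 1..100.
function clampTopK(requested: number): number {
  return Math.min(Math.max(1, Math.floor(requested)), MAX_TOP_K);
}

// Keep only matches at or above minScore, best match first.
function filterByScore(results: ScoredMatches, minScore: number): ScoredMatch[] {
  return results.matches
    .filter((m) => m.score >= minScore)
    .sort((x, y) => y.score - x.score);
}

const sample: ScoredMatches = {
  matches: [
    { id: "doc-1", score: 0.42 },
    { id: "doc-2", score: 0.91 },
    { id: "doc-3", score: 0.77 },
  ],
  count: 3,
};

console.log(clampTopK(250));             // 100
console.log(filterByScore(sample, 0.7)); // doc-2, then doc-3
```

A sensible cutoff depends on the embedding model and the metric, so treat 0.7 as a placeholder to tune against your own data.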
queryById(vectorId, options?)
Performs similarity search using an existing vector in the index as the query. Good for finding content similar to a specific document without reconstructing the embedding vector.
Parameters:
- vectorId: ID string of the vector to use as query
- options: Optional query configuration object (same as query method)

Returns: VectorIndexMatches with similar vectors
// Find articles similar to a specific one
const similarArticles = await this.env.EMBEDDINGS.queryById("article-123", {
  topK: 5,
  returnMetadata: true,
  filter: { category: "documentation" }
});

console.log(`Found ${similarArticles.count} similar articles`);
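This page does not say whether queryById includes the source vector in its own results. If it does in your setup, a "related items" list would show the item next to itself. The helper below is a hypothetical sketch that drops the source ID from a list of matches; it works on any objects with an id field and does not call the Raindrop API.

```typescript
// Remove the vector that was used as the query from its own result list.
// Generic over any match-like object that carries an `id`.
function excludeSource<T extends { id: string }>(matches: T[], sourceId: string): T[] {
  return matches.filter((m) => m.id !== sourceId);
}

const rawMatches = [
  { id: "article-123", score: 1.0 },
  { id: "article-200", score: 0.88 },
  { id: "article-301", score: 0.81 },
];

const related = excludeSource(rawMatches, "article-123");
console.log(related.map((m) => m.id)); // ["article-200", "article-301"]
```

If the source vector does appear in results, requesting one more result than you need (for example topK: 6 for five related items) keeps the final list at the intended length.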
Management Methods
insert(vectors)
Adds new vectors to the index. If any vector ID already exists, the operation fails. Use this when you don’t want to overwrite existing vectors.
Vector Object Structure:
- Required:
  - id: Unique string identifier for the vector
  - values: Vector data as VectorFloatArray or number[] (must match index dimensions)
- Optional:
  - namespace: String to organize vectors into logical groups
  - metadata: Object with filterable key-value pairs (strings, numbers, booleans, string arrays)

Returns: VectorIndexAsyncMutation with:
- mutationId: Unique identifier for tracking this operation
const vectors = [
  {
    id: "article-123",
    values: new Float32Array([0.1, 0.2, 0.3, 0.4]), // Must match index dimensions
    namespace: "articles", // Optional organization
    metadata: {
      title: "Vector Search Basics",
      author: "Jane Smith",
      publishedAt: "2024-01-15",
      tags: ["search", "ai", "vectors"]
    }
  },
  {
    id: "article-124",
    values: [0.2, 0.1, 0.4, 0.3], // Can also use regular number arrays
    namespace: "articles",
    metadata: {
      title: "Advanced Embedding Techniques",
      author: "John Doe",
      publishedAt: "2024-01-20",
      category: "advanced"
    }
  }
];

const result = await this.env.EMBEDDINGS.insert(vectors);
console.log(`Mutation ID: ${result.mutationId}`);
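Because insert fails when an ID already exists and every vector must match the index dimensions, it can help to check a batch locally before sending it. The following is a minimal sketch with a hypothetical validateBatch helper and a local VectorInput type. It only catches duplicates within the batch itself and cannot know which IDs are already stored in the index.

```typescript
// Local stand-in for the fields this check needs (not the Raindrop type).
interface VectorInput {
  id: string;
  values: Float32Array | number[];
}

// Return a list of human-readable problems; an empty list means the batch looks OK.
function validateBatch(vectors: VectorInput[], dimensions: number): string[] {
  const problems: string[] = [];
  const seen = new Set<string>();
  for (const v of vectors) {
    if (seen.has(v.id)) {
      problems.push(`duplicate id: ${v.id}`);
    }
    seen.add(v.id);
    if (v.values.length !== dimensions) {
      problems.push(`${v.id}: expected ${dimensions} dimensions, got ${v.values.length}`);
    }
  }
  return problems;
}

const candidateBatch: VectorInput[] = [
  { id: "article-123", values: new Float32Array([0.1, 0.2, 0.3, 0.4]) },
  { id: "article-124", values: [0.2, 0.1, 0.4] },     // one value short
  { id: "article-123", values: [0.5, 0.5, 0.5, 0.5] }, // repeated id
];

console.log(validateBatch(candidateBatch, 4));
```

The expected dimension count can be read once from describe() and reused for every batch.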
upsert(vectors)
Inserts new vectors or updates existing ones. More flexible than insert since it overwrites vectors with the same ID. Good for updating document embeddings when content changes.
Parameters:
- vectors: Array of vector objects to upsert

Returns: VectorIndexAsyncMutation with mutation tracking ID
const updatedVectors = [
  {
    id: "article-123", // Will update if exists, insert if new
    values: new Float32Array([0.15, 0.25, 0.35, 0.45]), // Updated embedding
    metadata: {
      title: "Vector Search Basics - Updated",
      author: "Jane Smith",
      lastModified: "2024-02-01",
      version: 2
    }
  }
];

const result = await this.env.EMBEDDINGS.upsert(updatedVectors);
deleteByIds(ids)
Removes vectors with the specified IDs from the index. The operation is asynchronous and returns a mutation ID for tracking. Deleting non-existent vectors won’t cause errors.
Parameters:
- ids: Array of vector ID strings to delete

Returns: VectorIndexAsyncMutation with mutation tracking ID
// Remove outdated articles
const idsToDelete = ["article-old-1", "article-old-2", "draft-123"];
const result = await this.env.EMBEDDINGS.deleteByIds(idsToDelete);

console.log(`Deletion mutation ID: ${result.mutationId}`);
getByIds(ids)
Retrieves vectors by their exact IDs without performing similarity searches. Fast operation for fetching known vectors or batch processing.
Parameters:
- ids: Array of vector ID strings to retrieve

Returns: Array of VectorIndexVector objects (may be fewer than requested if some IDs don’t exist)
const vectorIds = ["article-123", "article-124", "article-125"];
const vectors = await this.env.EMBEDDINGS.getByIds(vectorIds);

for (const vector of vectors) {
  console.log(`Vector ${vector.id}: ${vector.metadata?.title}`);
  console.log(`Dimensions: ${vector.values.length}`);
}
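Since getByIds may return fewer vectors than requested, callers usually need to know which IDs were not found. A minimal sketch of that comparison follows; findMissingIds is a hypothetical helper, not part of the Raindrop API, and it only inspects the id field of each returned vector.

```typescript
// Compare the requested IDs against what came back and list the gaps,
// preserving the order of the original request.
function findMissingIds(requested: string[], returned: { id: string }[]): string[] {
  const found = new Set(returned.map((v) => v.id));
  return requested.filter((id) => !found.has(id));
}

const requestedIds = ["article-123", "article-124", "article-125"];
const returnedVectors = [{ id: "article-123" }, { id: "article-125" }];

console.log(findMissingIds(requestedIds, returnedVectors)); // ["article-124"]
```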
Info Methods
describe()
Returns information about the vector index, including its configuration and current statistics.
const indexInfo = await this.env.EMBEDDINGS.describe();

console.log(`Dimensions: ${indexInfo.dimensions}`);
console.log(`Vector count: ${indexInfo.vectorCount}`);
console.log(`Last processed: ${indexInfo.processedUpToDatetime}`);
console.log(`Mutation ID: ${indexInfo.processedUpToMutation}`);
Returns: VectorIndexIndexInfo containing:
- vectorCount: Total number of vectors in the index
- dimensions: Number of dimensions each vector must have
- processedUpToDatetime: Timestamp string of last processed mutation
- processedUpToMutation: String ID of the last processed mutation
Batch Operations
Process vectors in batches for better performance and less overhead. The vector index handles bulk operations efficiently.
// Efficient: Batch insert
const batchSize = 100;
const vectorBatches = chunkArray(allVectors, batchSize);

for (const batch of vectorBatches) {
  await this.env.EMBEDDINGS.insert(batch);
}

// Less efficient: Individual inserts
for (const vector of allVectors) {
  await this.env.EMBEDDINGS.insert([vector]); // Avoid this pattern
}
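The batching example calls a chunkArray helper that is not defined on this page and is not described as part of the Raindrop API. One possible implementation, offered as a sketch, is:

```typescript
// Split an array into consecutive chunks of at most `size` items.
// The final chunk may be shorter than `size`.
function chunkArray<T>(items: T[], size: number): T[][] {
  if (!Number.isInteger(size) || size <= 0) {
    throw new RangeError("size must be a positive integer");
  }
  const chunks: T[][] = [];
  for (let i = 0; i < items.length; i += size) {
    chunks.push(items.slice(i, i + size));
  }
  return chunks;
}

console.log(chunkArray([1, 2, 3, 4, 5], 2)); // [[1, 2], [3, 4], [5]]
```

The size check guards against a zero or negative batch size, which would otherwise make the loop run forever or produce empty output.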
Metadata Filtering
Vector search supports metadata filtering that combines with similarity search. Filters use MongoDB-style operators.
Filter Operators
The vector index supports equality and inequality filtering on metadata fields:
const searchResults = await this.env.EMBEDDINGS.query(queryVector, {
  topK: 10,
  filter: {
    // Exact match
    category: "tutorial",

    // Not equal
    status: { $ne: "draft" },

    // Must not be null
    publishedAt: { $ne: null },

    // Numeric equality
    rating: { $eq: 5 },

    // String array contains
    tags: "beginner"
  }
});
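To reason about what a filter will match, or to unit test filter definitions without an index, the semantics described above can be approximated locally. The sketch below is a client-side approximation built on assumptions: a bare value means exact match (or "contains" for string arrays), and a missing field is treated like null. The server-side behavior may differ in edge cases, and all type and function names here are hypothetical.

```typescript
type Scalar = string | number | boolean | null;
type MetadataValue = Scalar | string[] | undefined;
type FilterCondition = string | number | boolean | { $eq?: Scalar; $ne?: Scalar };

// Equality as assumed here: arrays match when they contain the string,
// and a missing field is treated the same as null.
function valueEquals(actual: MetadataValue, expected: Scalar): boolean {
  if (Array.isArray(actual)) {
    return typeof expected === "string" && actual.includes(expected);
  }
  return (actual ?? null) === expected;
}

// True when the metadata satisfies every condition in the filter.
function matchesFilter(
  metadata: Record<string, MetadataValue>,
  filter: Record<string, FilterCondition>
): boolean {
  return Object.entries(filter).every(([field, condition]) => {
    const actual = metadata[field];
    if (typeof condition === "object") {
      if ("$eq" in condition && !valueEquals(actual, condition.$eq ?? null)) return false;
      if ("$ne" in condition && valueEquals(actual, condition.$ne ?? null)) return false;
      return true;
    }
    return valueEquals(actual, condition);
  });
}

const exampleFilter: Record<string, FilterCondition> = {
  category: "tutorial",
  status: { $ne: "draft" },
  publishedAt: { $ne: null },
  rating: { $eq: 5 },
  tags: "beginner",
};

const publishedDoc: Record<string, MetadataValue> = {
  category: "tutorial",
  status: "published",
  publishedAt: "2024-01-15",
  rating: 5,
  tags: ["beginner", "ai"],
};

console.log(matchesFilter(publishedDoc, exampleFilter)); // true
```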
Supported Metadata Types
Metadata values can be strings, numbers, booleans, or string arrays:
const vector = {
  id: "comprehensive-example",
  values: embeddingVector,
  metadata: {
    // String values
    title: "Complete Guide to Vectors",
    category: "documentation",

    // Numeric values
    rating: 4.8,
    viewCount: 1250,

    // Boolean values
    featured: true,
    published: true,

    // String arrays
    tags: ["vectors", "search", "ai", "tutorial"],
    authors: ["Alice Johnson", "Bob Smith"]
  }
};
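A quick local check can flag metadata fields that fall outside the supported types (strings, numbers, booleans, string arrays) before a vector is sent. This is a sketch with hypothetical helper names; rejecting NaN and Infinity is an extra assumption made here because such values do not survive JSON serialization, not a documented Raindrop rule.

```typescript
// True for the value kinds listed as supported: string, number, boolean, string[].
function isSupportedMetadataValue(value: unknown): boolean {
  if (typeof value === "string" || typeof value === "boolean") return true;
  if (typeof value === "number") return Number.isFinite(value); // assumption: reject NaN/Infinity
  return Array.isArray(value) && value.every((item) => typeof item === "string");
}

// List the keys whose values are not one of the supported kinds.
function findUnsupportedFields(metadata: Record<string, unknown>): string[] {
  return Object.keys(metadata).filter((key) => !isSupportedMetadataValue(metadata[key]));
}

const candidateMetadata = {
  title: "Complete Guide to Vectors",
  rating: 4.8,
  featured: true,
  tags: ["vectors", "search"],
  nested: { level: 1 },     // objects are not in the supported list
  mixed: ["a", 1],          // arrays must contain only strings
  createdAt: new Date(0),   // Date objects would need converting to strings
};

console.log(findUnsupportedFields(candidateMetadata)); // ["nested", "mixed", "createdAt"]
```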
Namespace Organization
Namespaces provide logical separation within a single vector index. Organize vectors by tenant, content type, or any other categorization while maintaining unified search capabilities.
// Store vectors in different namespaces
await this.env.EMBEDDINGS.insert([
  {
    id: "user-doc-1",
    values: userDocEmbedding,
    namespace: "user-documents",
    metadata: { type: "user-generated" }
  },
  {
    id: "help-doc-1",
    values: helpDocEmbedding,
    namespace: "help-documentation",
    metadata: { type: "official" }
  }
]);

// Search within specific namespace
const userResults = await this.env.EMBEDDINGS.query(queryVector, {
  namespace: "user-documents",
  topK: 5
});
Distance Metrics
Choose the distance metric based on your embedding model and use case. The metric affects how similarity is calculated and can impact search quality.
Cosine Similarity
Best for normalized embeddings where vector magnitude doesn’t matter. Most common choice for text embeddings from models like OpenAI’s text-embedding-ada-002.
vector_index "text-embeddings" {
  dimensions = 1536
  metric = "cosine"
}
When to use:
- Text embeddings from modern language models
- When embeddings are normalized or magnitude doesn’t matter
- General-purpose semantic search applications
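If your embedding model does not already return unit-length vectors, you can normalize them yourself before storing or querying. For unit-length vectors, cosine similarity and dot product give the same value. The helper below is an illustrative sketch, not a Raindrop function.

```typescript
// Scale a vector to unit length (L2 norm of 1) so only its direction remains.
function l2Normalize(values: number[]): number[] {
  const norm = Math.sqrt(values.reduce((sum, v) => sum + v * v, 0));
  if (norm === 0) {
    throw new Error("Cannot normalize a zero vector");
  }
  return values.map((v) => v / norm);
}

const unit = l2Normalize([3, 4]);
console.log(unit); // [0.6, 0.8]
```

Whether normalization is needed depends on the embedding model, so check its documentation before adding this step.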
Euclidean Distance
Measures straight-line distance in vector space. Good when vector magnitude matters and you want to consider absolute differences.
vector_index "feature-vectors" {
  dimensions = 512
  metric = "euclidean"
}
When to use:
- Image embeddings where spatial relationships matter
- Feature vectors representing measurable quantities
- When you need to consider vector magnitude
Dot Product
Combines direction with magnitude, so a longer vector pointing the same way as the query scores higher. Good for machine learning applications where both vector direction and magnitude matter.
vector_index "ml-features" {
  dimensions = 256
  metric = "dot-product"
}
When to use:
- Specialized ML applications
- When magnitude represents confidence or importance
- Recommendation systems with weighted features