SmartBucket
SmartBucket extends the standard Bucket interface with search and chat capabilities. It adds five methods for content discovery: search(), chunkSearch(), documentChat(), getPaginatedResults(), and createPageSummary(). This combines traditional object storage with AI-powered content analysis, which suits document management systems, knowledge bases, and applications that need to search through uploaded content.
All standard bucket operations (get, put, delete, list) work exactly as with regular buckets. Files uploaded to SmartBucket become searchable through the additional methods. The system automatically processes uploaded documents to extract text content and create searchable indexes, enabling semantic search across your stored files without requiring manual content preprocessing or separate search infrastructure.
See the Bucket reference for complete documentation of standard storage operations.
Prerequisites
- Basic understanding of object storage concepts
- Raindrop framework installed and configured in your project
- Familiarity with TypeScript interfaces and async/await patterns
Creating a SmartBucket
Define a SmartBucket in your application manifest:
    application "demo-app" {
      smartbucket "documents" {}
    }

Run raindrop build generate to create the necessary files.
Accessing a SmartBucket
SmartBuckets are accessed through environment variables. The SmartBucket name from your manifest becomes an uppercase environment variable with underscores replacing dashes. This follows the same naming convention as other Raindrop resources, providing consistent access patterns across your application.
For example, smartbucket "documents" becomes env.DOCUMENTS. Once accessed, you can use both standard bucket operations and the enhanced search capabilities on the same interface, making it easy to integrate content discovery features into existing storage workflows.
    export default class extends Service<Env> {
      async fetch(request: Request): Promise<Response> {
        // Standard bucket operation
        await this.env.DOCUMENTS.put(
          'research-paper.pdf',
          pdfBuffer,
          { httpMetadata: { contentType: 'application/pdf' } }
        );

        // SmartBucket search method
        const results = await this.env.DOCUMENTS.search({
          input: "climate change research",
          requestId: "req-1"
        });

        return new Response(JSON.stringify(results));
      }
    }
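The naming rule described above (uppercase, dashes replaced by underscores) can be sketched as a small function. This is only an illustration of the rule: toEnvBindingName is a hypothetical helper, not part of Raindrop, and the framework generates the bindings for you.

```typescript
// Hypothetical helper (not a Raindrop API): mirrors the documented naming rule.
function toEnvBindingName(manifestName: string): string {
  // Uppercase the name and replace every dash with an underscore.
  return manifestName.toUpperCase().replace(/-/g, "_");
}

console.log(toEnvBindingName("documents"));       // DOCUMENTS
console.log(toEnvBindingName("research-papers")); // RESEARCH_PAPERS
```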
Core Concepts
Search Results
Search methods return SearchResult objects containing content matches with relevance scores. These objects describe where each match was found, including the source document, the extracted text content, and a relevance score to help you prioritize results. The score field uses a 0-1 scale where higher scores indicate stronger semantic similarity to your search query.
    interface SearchResult {
      chunkSignature?: string;   // Unique identifier for content chunk
      text?: string;             // Extracted text content
      source?: string;           // Source file reference
      payloadSignature?: string; // Content payload identifier
      score?: number;            // Relevance score (0-1)
      embed?: Float32Array;      // Vector embedding data
      type?: string;             // Content type classification
    }
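Because every field on SearchResult is optional, code that ranks or filters results has to handle missing scores. The following is a minimal sketch of one way to do that. ScoredResult and topMatches are hypothetical names introduced for illustration, and treating a missing score as 0 is an assumption, not documented behavior.

```typescript
// Local copy of the fields used here; all are optional in the documented interface.
interface ScoredResult {
  text?: string;
  source?: string;
  score?: number;
}

// Hypothetical helper: keep results at or above minScore, highest score first.
// Assumption: a missing score is treated as 0.
function topMatches<T extends ScoredResult>(results: T[], minScore: number): T[] {
  return results
    .filter((r) => (r.score ?? 0) >= minScore) // filter() returns a new array
    .sort((a, b) => (b.score ?? 0) - (a.score ?? 0));
}

const sampleResults: ScoredResult[] = [
  { source: "a.pdf", text: "alpha", score: 0.42 },
  { source: "b.pdf", text: "beta", score: 0.91 },
  { source: "c.pdf", text: "gamma" }, // no score: treated as 0 and filtered out
];

const strongMatches = topMatches(sampleResults, 0.4);
console.log(strongMatches.map((r) => r.source)); // [ 'b.pdf', 'a.pdf' ]
```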
Pagination
Search operations return pagination information for accessing additional results. This allows you to handle large result sets efficiently by loading results in manageable pages, improving performance and user experience. The pagination system tracks the total number of available results and provides methods to navigate through multiple pages of search results.
    interface PaginationInfo {
      total: number;      // Total available results
      page: number;       // Current page number
      pageSize: number;   // Results per page
      totalPages: number; // Total available pages
      hasMore: boolean;   // More results available
    }
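To make the relationship between these fields concrete, here is a sketch of how they would fit together under three assumptions that the reference above does not state: pages are numbered from 1, totalPages is the ceiling of total divided by pageSize, and hasMore is true whenever the current page is not the last. derivePageInfo is a hypothetical helper; the service computes these values itself.

```typescript
interface PageInfo {
  total: number;
  page: number;
  pageSize: number;
  totalPages: number;
  hasMore: boolean;
}

// Hypothetical helper. Assumes 1-based page numbers and
// totalPages = ceil(total / pageSize); neither is confirmed by the reference.
function derivePageInfo(total: number, page: number, pageSize: number): PageInfo {
  if (pageSize <= 0) {
    throw new RangeError("pageSize must be positive");
  }
  const totalPages = Math.ceil(total / pageSize);
  return { total, page, pageSize, totalPages, hasMore: page < totalPages };
}

// 45 results at 20 per page: totalPages is 3, and page 2 still has more.
console.log(derivePageInfo(45, 2, 20));
```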
Search Operations
search()
Performs semantic search across all bucket content using natural language queries. This method analyzes the meaning and context of your search terms rather than just matching keywords, allowing you to find relevant content even when the exact words don’t appear in the documents. Results are ranked by semantic similarity and returned with relevance scores to help you identify the most pertinent matches.
    const results = await smartBucket.search({
      input: "financial reports and signatures",
      requestId: "search-001"
    });

    console.log(`Found ${results.pagination.total} matches`);
    results.results.forEach(result => {
      console.log(`${result.source}: ${result.text?.substring(0, 100)}...`);
      console.log(`Relevance: ${result.score}`);
    });

    interface SearchInput {
      input: string;      // Natural language query
      requestId?: string; // Optional request tracking ID
      partition?: string; // Optional data partition filter
    }

    interface SearchOutput {
      results: SearchResult[];    // Ranked search matches
      pagination: PaginationInfo; // Result pagination details
    }
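A search can return several matches from the same file, so a common follow-up step is grouping matches by source. The sketch below shows one way to do it. SourceMatch and groupBySource are hypothetical names, and the "(unknown source)" fallback key is an illustrative choice for results that lack a source.

```typescript
interface SourceMatch {
  source?: string;
  text?: string;
  score?: number;
}

// Hypothetical helper: bucket matches by their source file, preserving the
// original (ranked) order inside each group.
function groupBySource(results: SourceMatch[]): Record<string, SourceMatch[]> {
  const groups: Record<string, SourceMatch[]> = {};
  for (const match of results) {
    const key = match.source ?? "(unknown source)";
    if (!groups[key]) {
      groups[key] = [];
    }
    groups[key].push(match);
  }
  return groups;
}

const grouped = groupBySource([
  { source: "q1-report.pdf", text: "Revenue grew", score: 0.9 },
  { source: "q2-report.pdf", text: "Costs fell", score: 0.8 },
  { source: "q1-report.pdf", text: "Signed by the CFO", score: 0.7 },
]);
console.log(Object.keys(grouped));           // [ 'q1-report.pdf', 'q2-report.pdf' ]
console.log(grouped["q1-report.pdf"].length); // 2
```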
chunkSearch()
Returns specific text chunks from documents, useful for RAG (Retrieval-Augmented Generation) applications that need text segments rather than full documents. This method breaks down document content into semantically meaningful chunks and returns the most relevant segments based on your search query. This is particularly valuable when you need to provide context to language models or when documents are too large to process in their entirety.
    const chunks = await smartBucket.chunkSearch({
      input: "climate change temperature data",
      requestId: "chunk-search-001"
    });

    const contextChunks = chunks.results
      .slice(0, 5)
      .map(chunk => chunk.text)
      .join('\n\n');

    interface RagSearchInput {
      input: string;      // Search query
      requestId: string;  // Required request ID
      partition?: string; // Optional data partition filter
    }

    interface RagSearchOutput {
      results: SearchResult[]; // Relevant text chunks
    }
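When chunks are passed to a language model, a fixed count such as the first five may overshoot or underuse the prompt budget. A possible alternative is to cap the context by size. This sketch uses a character budget as a rough stand-in for a token limit. TextChunk and buildContext are hypothetical names, and the budget value is up to the caller.

```typescript
interface TextChunk {
  text?: string;
  source?: string;
}

// Hypothetical helper: join chunk texts with blank lines, stopping before the
// combined length would exceed maxChars. Chunks without text are skipped.
function buildContext(chunks: TextChunk[], maxChars: number): string {
  const parts: string[] = [];
  let used = 0;
  for (const chunk of chunks) {
    if (!chunk.text) continue;
    // Every part after the first also costs a 2-character "\n\n" separator.
    const cost = chunk.text.length + (parts.length > 0 ? 2 : 0);
    if (used + cost > maxChars) break;
    parts.push(chunk.text);
    used += cost;
  }
  return parts.join("\n\n");
}

const context = buildContext(
  [
    { text: "Global mean temperature rose during the period." },
    { source: "empty.pdf" }, // no text: skipped
    { text: "Regional data shows larger swings." },
  ],
  60
);
console.log(context.length <= 60); // true
```

Stopping at the first chunk that does not fit keeps the highest-ranked chunks and preserves their order. Skipping oversized chunks and continuing would be another reasonable policy.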
documentChat()
Generates answers to questions about a specific document’s content using AI-powered document analysis. This method combines document retrieval with natural language processing to provide direct answers to questions about uploaded files. Instead of just finding relevant content, it generates human-readable responses based on the document’s information, making it ideal for document Q&A systems and automated content analysis.
    const answer = await smartBucket.documentChat({
      objectId: "research-paper.pdf",
      input: "What are the main findings about renewable energy costs?",
      requestId: "chat-001"
    });

    console.log(`Answer: ${answer.answer}`);

    interface DocumentChatInput {
      objectId: string;   // Target document identifier
      input: string;      // Question or prompt
      requestId: string;  // Request tracking ID
      partition?: string; // Optional data partition filter
    }

    interface DocumentChatOutput {
      answer: string; // Generated response
    }
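Since documentChat() answers one question per call, asking several questions about the same document means several calls. A minimal sketch of that loop follows. DocChatCapable is a structural type covering only the documented method shape, and askQuestions and the requestPrefix parameter are hypothetical names. Giving each call its own request ID is an assumption about good practice, not a documented requirement.

```typescript
interface DocChatInput {
  objectId: string;
  input: string;
  requestId: string;
}
interface DocChatOutput {
  answer: string;
}
// Structural type: anything with a compatible documentChat() method.
interface DocChatCapable {
  documentChat(input: DocChatInput): Promise<DocChatOutput>;
}
interface QuestionAnswer {
  question: string;
  answer: string;
}

// Hypothetical helper: ask each question in order, one call at a time.
async function askQuestions(
  bucket: DocChatCapable,
  objectId: string,
  questions: string[],
  requestPrefix: string
): Promise<QuestionAnswer[]> {
  const answered: QuestionAnswer[] = [];
  let index = 0;
  for (const question of questions) {
    const { answer } = await bucket.documentChat({
      objectId,
      input: question,
      requestId: `${requestPrefix}-${index}`, // distinct ID per call (assumption)
    });
    answered.push({ question, answer });
    index++;
  }
  return answered;
}
```

The calls run sequentially to keep the example simple and the load predictable. Promise.all over the questions would be the parallel variant.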
getPaginatedResults()
Retrieves additional pages from previous search operations using the original request ID. This method allows you to continue browsing through search results without re-executing the original search query, improving performance and providing a consistent user experience. The request ID maintains the search context and ranking, ensuring that subsequent pages contain results that follow the same relevance ordering as the initial search.
    const moreResults = await smartBucket.getPaginatedResults({
      requestId: "search-001",
      page: 2,
      pageSize: 20
    });

    console.log(`Page ${moreResults.pagination.page} of ${moreResults.pagination.totalPages}`);

    interface GetPaginatedResultsInput {
      requestId: string;  // Previous search request ID
      page?: number;      // Target page number
      pageSize?: number;  // Results per page
      partition?: string; // Optional data partition filter
    }

    interface GetPaginatedResultsOutput {
      results: SearchResult[];    // Search results for page
      pagination: PaginationInfo; // Updated pagination info
    }
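Walking through several pages can be sketched as a loop that stops when hasMore is false. PagedSource is a structural type that covers only the documented method shape, and collectPages and its maxPages safety cap are hypothetical. The in-memory stub at the end exists only to make the sketch runnable and assumes 1-based page numbers.

```typescript
interface PagedOutput<T> {
  results: T[];
  pagination: { page: number; hasMore: boolean };
}
interface PagedSource<T> {
  getPaginatedResults(input: {
    requestId: string;
    page?: number;
    pageSize?: number;
  }): Promise<PagedOutput<T>>;
}

// Hypothetical helper: fetch pages starting at startPage until hasMore is
// false or maxPages pages have been read, whichever comes first.
async function collectPages<T>(
  bucket: PagedSource<T>,
  requestId: string,
  startPage: number,
  pageSize: number,
  maxPages: number
): Promise<T[]> {
  const all: T[] = [];
  for (let page = startPage; page < startPage + maxPages; page++) {
    const out = await bucket.getPaginatedResults({ requestId, page, pageSize });
    all.push(...out.results);
    if (!out.pagination.hasMore) break;
  }
  return all;
}

// In-memory stand-in for a SmartBucket, for demonstration only.
const demoItems = ["r1", "r2", "r3", "r4", "r5"];
const demoSource: PagedSource<string> = {
  getPaginatedResults({ page = 1, pageSize = 2 }) {
    const start = (page - 1) * pageSize; // assumes 1-based pages
    return Promise.resolve({
      results: demoItems.slice(start, start + pageSize),
      pagination: { page, hasMore: start + pageSize < demoItems.length },
    });
  },
};

collectPages(demoSource, "search-001", 2, 2, 10).then((rest) => {
  console.log(rest); // [ 'r3', 'r4', 'r5' ]
});
```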
createPageSummary()
Generates intelligent summaries of search results for specific pages, providing condensed overviews of large result sets. This method analyzes search results from a previous query and creates human-readable summaries that capture key themes, topics, and insights across the specified page of results. This is particularly useful for processing large document collections where users need quick overviews before diving into specific documents.
    const summary = await smartBucket.createPageSummary({
      requestId: "search-001",
      page: 1,
      pageSize: 20
    });

    console.log(`Summary of page 1: ${summary.summary}`);

    interface CreatePageSummaryInput {
      requestId?: string; // Previous search request ID
      page: number;       // Page number to summarize
      pageSize: number;   // Results per page for summary
      partition?: string; // Optional data partition filter
    }

    interface CreatePageSummaryOutput {
      summary: string; // Generated summary text
    }
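Because the summary is built from a previous query, the search and the summary call need to share a request ID. The sketch below wires the two together. SummaryCapable is a structural type with only the two documented methods, searchWithSummary is a hypothetical helper, and skipping the summary call when the search finds nothing is a design assumption rather than documented behavior.

```typescript
interface SummaryCapable {
  search(input: { input: string; requestId?: string }): Promise<{
    results: unknown[];
    pagination: { total: number };
  }>;
  createPageSummary(input: {
    requestId?: string;
    page: number;
    pageSize: number;
  }): Promise<{ summary: string }>;
}

// Hypothetical helper: run a search, then summarize its first page using the
// same requestId so both calls refer to one query.
async function searchWithSummary(
  bucket: SummaryCapable,
  query: string,
  requestId: string,
  pageSize: number
): Promise<{ total: number; summary: string | null }> {
  const found = await bucket.search({ input: query, requestId });
  if (found.pagination.total === 0) {
    // Nothing to summarize (assumption: avoid the extra call).
    return { total: 0, summary: null };
  }
  const { summary } = await bucket.createPageSummary({ requestId, page: 1, pageSize });
  return { total: found.pagination.total, summary };
}
```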
Interface Reference
The SmartBucket interface extends the standard Bucket interface with additional AI-powered search capabilities:
    interface SmartBucket extends Bucket {
      // Search operations
      search(input: SearchInput): Promise<SearchOutput>;
      chunkSearch(input: RagSearchInput): Promise<RagSearchOutput>;
      documentChat(input: DocumentChatInput): Promise<DocumentChatOutput>;
      getPaginatedResults(input: GetPaginatedResultsInput): Promise<GetPaginatedResultsOutput>;
      createPageSummary(input: CreatePageSummaryInput): Promise<CreatePageSummaryOutput>;
    }
Code Examples
Complete implementations demonstrating SmartBucket integration patterns and common use cases.
Document Management System
    export default class extends Service<Env> {
      async uploadAndSearch(request: Request): Promise<Response> {
        const formData = await request.formData();
        const file = formData.get('document') as File;
        const searchQuery = formData.get('query') as string;

        if (!file || !searchQuery) {
          return new Response('Missing file or query', { status: 400 });
        }

        // Store document in SmartBucket
        const fileBuffer = await file.arrayBuffer();
        const uploadResult = await this.env.DOCUMENTS.put(
          file.name,
          fileBuffer,
          {
            httpMetadata: {
              contentType: file.type,
              contentDisposition: `attachment; filename="${file.name}"`
            },
            customMetadata: {
              uploadedBy: 'user-123',
              uploadDate: new Date().toISOString()
            }
          }
        );

        // Generate the request ID once so the summary refers to this search.
        // Calling Date.now() separately for each call could yield different IDs.
        const requestId = `search-${Date.now()}`;

        // Perform immediate search on uploaded content
        const searchResults = await this.env.DOCUMENTS.search({
          input: searchQuery,
          requestId
        });

        // Generate summary of first page of results
        const summary = await this.env.DOCUMENTS.createPageSummary({
          requestId,
          page: 1,
          pageSize: 10
        });

        return Response.json({
          upload: {
            key: uploadResult.key,
            size: uploadResult.size,
            uploaded: uploadResult.uploaded
          },
          search: {
            totalResults: searchResults.pagination.total,
            results: searchResults.results.slice(0, 3), // Top 3 matches
            summary: summary.summary
          }
        });
      }
    }
RAG Pipeline with Document Chat
    export default class extends Service<Env> {
      async processDocumentQuery(request: Request): Promise<Response> {
        const { documentId, question, includeContext } = await request.json();

        // Get direct answer from document
        const chatResponse = await this.env.KNOWLEDGE_BASE.documentChat({
          objectId: documentId,
          input: question,
          requestId: `chat-${Date.now()}`
        });

        let contextChunks = [];
        if (includeContext) {
          // Get relevant chunks for additional context
          const chunkResults = await this.env.KNOWLEDGE_BASE.chunkSearch({
            input: question,
            requestId: `chunks-${Date.now()}`
          });

          contextChunks = chunkResults.results
            .slice(0, 3)
            .map(chunk => ({
              text: chunk.text,
              source: chunk.source,
              relevance: chunk.score
            }));
        }

        return Response.json({
          answer: chatResponse.answer,
          context: contextChunks,
          metadata: {
            documentId,
            timestamp: new Date().toISOString(),
            hasContext: includeContext
          }
        });
      }
    }
Multi-Modal Content Discovery
    export default class extends Service<Env> {
      async advancedContentSearch(request: Request): Promise<Response> {
        const { query, filters, pageSize = 20 } = await request.json();

        // Generate the search request ID once so the summary refers to this search.
        // Calling Date.now() separately for each call could yield different IDs.
        const searchRequestId = `discovery-${Date.now()}`;

        // Initial semantic search
        const searchResults = await this.env.MEDIA_LIBRARY.search({
          input: query,
          requestId: searchRequestId
        });

        // Get detailed chunks for analysis
        const detailedChunks = await this.env.MEDIA_LIBRARY.chunkSearch({
          input: query,
          requestId: `analysis-${Date.now()}`
        });

        // Generate insights summary
        const insights = await this.env.MEDIA_LIBRARY.createPageSummary({
          requestId: searchRequestId,
          page: 1,
          pageSize: pageSize
        });

        // Process results with metadata
        const enrichedResults = searchResults.results.map(result => ({
          ...result,
          // Fall back to an empty string so a missing text does not render as "undefined..."
          preview: (result.text ?? '').substring(0, 200) + '...',
          hasHighRelevance: (result.score || 0) > 0.8
        }));

        return Response.json({
          query,
          results: {
            total: searchResults.pagination.total,
            matches: enrichedResults,
            hasMorePages: searchResults.pagination.hasMore
          },
          analysis: {
            summary: insights.summary,
            topChunks: detailedChunks.results.slice(0, 5).map(chunk => ({
              content: chunk.text,
              relevance: chunk.score,
              source: chunk.source
            }))
          },
          metadata: {
            searchTime: new Date().toISOString(),
            resultsPage: searchResults.pagination.page,
            totalPages: searchResults.pagination.totalPages
          }
        });
      }
    }
raindrop.manifest
Configure SmartBuckets in your manifest for object storage with AI-powered search capabilities:
    application "knowledge-app" {
      smartbucket "documents" {
        # SmartBucket for document storage and search
      }

      smartbucket "research-papers" {
        # Academic papers with content analysis
      }

      smartbucket "user-files" {
        # User-uploaded content with search capabilities
      }

      service "file-api" {
        domain = "files.example.com"
        # Service can access SmartBuckets via env.DOCUMENTS, env.RESEARCH_PAPERS, env.USER_FILES
      }

      actor "document-processor" {
        # Actors can also use SmartBucket storage and search
      }
    }