SmartBucket
SmartBucket extends the standard Bucket interface with search and chat capabilities. It provides four additional methods for content discovery: search(), chunkSearch(), documentChat(), and getPaginatedResults(). This creates a powerful combination of traditional object storage with AI-powered content analysis, making it ideal for document management systems, knowledge bases, and applications that need to search through uploaded content.
All standard bucket operations (get, put, delete, list) work exactly as with regular buckets. Files uploaded to SmartBucket become searchable through the additional methods. The system automatically processes uploaded documents to extract text content and create searchable indexes, enabling semantic search across your stored files without requiring manual content preprocessing or separate search infrastructure.
See the Bucket reference for complete documentation of standard storage operations.
Prerequisites
- Basic understanding of object storage concepts
- Raindrop framework installed and configured in your project
- Familiarity with TypeScript interfaces and async/await patterns
Creating a SmartBucket
Define a SmartBucket in your application manifest:
application "demo-app" {
  smartbucket "documents" {}
}
Run raindrop build generate to create the necessary files.
Accessing a SmartBucket
SmartBuckets are accessed through environment variables. The SmartBucket name from your manifest becomes an uppercase environment variable with underscores replacing dashes. This follows the same naming convention as other Raindrop resources, providing consistent access patterns across your application.
For example, smartbucket "documents" becomes env.DOCUMENTS. Once accessed, you can use both standard bucket operations and the enhanced search capabilities on the same interface, making it easy to integrate content discovery features into existing storage workflows.
export default class extends Service<Env> {
  async fetch(request: Request): Promise<Response> {
    // Standard bucket operation
    // (pdfBuffer is assumed to hold the PDF bytes, loaded elsewhere)
    await this.env.DOCUMENTS.put(
      'research-paper.pdf',
      pdfBuffer,
      { httpMetadata: { contentType: 'application/pdf' } }
    );

    // SmartBucket search method
    const results = await this.env.DOCUMENTS.search({
      input: "climate change research",
      requestId: "req-1"
    });

    return new Response(JSON.stringify(results));
  }
}
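The naming rule described above can be sketched as a small function. `envBindingName` is a hypothetical helper written for illustration, not part of the Raindrop API, and it assumes the rule is exactly "uppercase, with dashes replaced by underscores" as stated in this section.

```typescript
// Hypothetical helper (not part of Raindrop): mirrors the documented naming
// rule for turning a manifest name into its env binding name.
function envBindingName(manifestName: string): string {
  // Replace every dash with an underscore, then uppercase the result.
  return manifestName.replace(/-/g, "_").toUpperCase();
}

console.log(envBindingName("documents"));       // DOCUMENTS
console.log(envBindingName("legal-contracts")); // LEGAL_CONTRACTS
```

A helper like this is only useful for tooling or logging; inside a service you would normally reference the binding directly, for example this.env.DOCUMENTS.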
Core Concepts
Search Results
Search methods return SearchResult objects containing content matches with relevance scores. These objects provide detailed information about where matches were found, including the source document, extracted text content, and relevance scoring to help you prioritize results. The score field uses a 0-1 scale where higher scores indicate stronger semantic similarity to your search query.
interface SearchResult {
  chunkSignature?: string;               // Unique identifier for content chunk
  text?: string;                         // Extracted text content
  source?: string | { object?: string }; // Source file reference
  payloadSignature?: string;             // Content payload identifier
  score?: number;                        // Relevance score (0-1)
  embed?: string;                        // Vector embedding reference
  type?: string;                         // Content type classification
}
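Because every SearchResult field is optional and source can be either a string or an object, callers usually need a little defensive handling. The following is a minimal sketch of that handling. `sourceName` and `topMatches` are hypothetical helpers, not SmartBucket methods, and treating a missing score as 0 is an assumption made for this example.

```typescript
// Local copy of the SearchResult shape documented above.
interface SearchResult {
  chunkSignature?: string;
  text?: string;
  source?: string | { object?: string };
  payloadSignature?: string;
  score?: number;
  embed?: string;
  type?: string;
}

// Hypothetical helper: reduce either form of `source` to a plain string.
function sourceName(result: SearchResult): string {
  if (typeof result.source === "string") return result.source;
  return result.source?.object ?? "(unknown source)";
}

// Hypothetical helper: keep results at or above `minScore`, best first.
// Assumption: a result without a score is treated as score 0.
function topMatches(results: SearchResult[], minScore: number): SearchResult[] {
  return results
    .filter((r) => (r.score ?? 0) >= minScore)
    .sort((a, b) => (b.score ?? 0) - (a.score ?? 0));
}
```

For example, topMatches(output.results, 0.5).map(sourceName) would yield the source names of the stronger matches in descending score order, where output is the value returned by a search call.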
Pagination
Search operations return pagination information for accessing additional results. This allows you to handle large result sets efficiently by loading results in manageable pages, improving performance and user experience. The pagination system tracks the total number of available results and provides methods to navigate through multiple pages of search results.
interface PaginationInfo {
  total: number;      // Total available results
  page: number;       // Current page number
  pageSize: number;   // Results per page
  totalPages: number; // Total available pages
  hasMore: boolean;   // More results available
}
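To make the relationship between these fields concrete, here is a small worked example. It is a sketch under two assumptions that this page does not state explicitly: page numbers start at 1, and totalPages is the total divided by pageSize, rounded up. `describePagination` is a hypothetical function used only for illustration.

```typescript
// Local copy of the PaginationInfo shape documented above.
interface PaginationInfo {
  total: number;
  page: number;
  pageSize: number;
  totalPages: number;
  hasMore: boolean;
}

// Hypothetical helper: derive the dependent fields from total, page, pageSize.
// Assumes 1-based page numbers (an assumption, not confirmed by the source).
function describePagination(total: number, page: number, pageSize: number): PaginationInfo {
  const totalPages = Math.ceil(total / pageSize);
  return { total, page, pageSize, totalPages, hasMore: page < totalPages };
}

// 45 results at 20 per page: pages 1 and 2 are full, page 3 holds the last 5.
const info = describePagination(45, 2, 20);
console.log(info.totalPages); // 3
console.log(info.hasMore);    // true
```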
Search Operations
search()
Performs semantic search across all bucket content using natural language queries. This method analyzes the meaning and context of your search terms rather than just matching keywords, allowing you to find relevant content even when the exact words don’t appear in the documents. Results are ranked by semantic similarity and returned with relevance scores to help you identify the most pertinent matches.
const results = await smartBucket.search({
  input: "financial reports and signatures",
  requestId: "search-001"
});

console.log(`Found ${results.pagination.total} matches`);
results.results.forEach(result => {
  // source may be a plain string or an object with an `object` field
  const source = typeof result.source === 'string' ? result.source : result.source?.object;
  console.log(`${source}: ${result.text?.substring(0, 100)}...`);
  console.log(`Relevance: ${result.score}`);
});
interface SearchInput {
  input: string;      // Natural language query
  requestId?: string; // Optional request tracking ID
}

interface SearchOutput {
  results: SearchResult[];    // Ranked search matches
  pagination: PaginationInfo; // Result pagination details
}
chunkSearch()
Returns specific text chunks from documents, useful for RAG (Retrieval-Augmented Generation) applications that need text segments rather than full documents. This method breaks down document content into semantically meaningful chunks and returns the most relevant segments based on your search query. This is particularly valuable when you need to provide context to language models or when documents are too large to process in their entirety.
const chunks = await smartBucket.chunkSearch({
  input: "climate change temperature data",
  requestId: "chunk-search-001"
});

const contextChunks = chunks.results
  .slice(0, 5)
  .map(chunk => chunk.text)
  .join('\n\n');
interface RagSearchInput {
  input: string;     // Search query
  requestId: string; // Required request ID
}

interface RagSearchOutput {
  results: SearchResult[]; // Relevant text chunks
}
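When chunk text is passed to a language model, the context usually has to fit a size limit. The sketch below shows one way to assemble chunks under a character budget and wrap them in a prompt. `buildContext`, `buildPrompt`, the character-based budget, and the prompt wording are all assumptions for illustration; none of them are part of SmartBucket.

```typescript
// Subset of the SearchResult shape: only the field this sketch needs.
interface ChunkResult {
  text?: string;
}

// Hypothetical helper: join chunk texts in order until `maxChars` is reached.
// Chunks without text are skipped; the "\n\n" separator counts toward the budget.
function buildContext(chunks: ChunkResult[], maxChars: number): string {
  const parts: string[] = [];
  let used = 0;
  for (const chunk of chunks) {
    const text = chunk.text?.trim();
    if (!text) continue;
    const cost = text.length + (parts.length > 0 ? 2 : 0);
    if (used + cost > maxChars) break; // stop at the first chunk that does not fit
    parts.push(text);
    used += cost;
  }
  return parts.join("\n\n");
}

// Hypothetical prompt template; adapt the wording to your model.
function buildPrompt(question: string, context: string): string {
  return `Answer using only the context below.\n\nContext:\n${context}\n\nQuestion: ${question}`;
}
```

With the chunkSearch output from the example above, buildContext(chunks.results, 4000) would produce a context string of at most 4000 characters, preserving the order in which chunks were returned.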
documentChat()
Generates answers to questions about a specific document’s content using AI-powered document analysis. This method combines document retrieval with natural language processing to provide direct answers to questions about uploaded files. Instead of just finding relevant content, it generates human-readable responses based on the document’s information, making it ideal for document Q&A systems and automated content analysis.
const answer = await smartBucket.documentChat({
  objectId: "research-paper.pdf",
  input: "What are the main findings about renewable energy costs?",
  requestId: "chat-001"
});

console.log(`Answer: ${answer.answer}`);
interface DocumentChatInput {
  objectId: string;  // Target document identifier
  input: string;     // Question or prompt
  requestId: string; // Request tracking ID
}

interface DocumentChatOutput {
  answer: string; // Generated response
}
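Since documentChat() targets one document per call, asking the same question about several documents means issuing several calls. The following is a minimal sketch of that pattern. `DocumentChatClient` is a structural stand-in for the SmartBucket binding, and `askEachDocument` and its request ID scheme are hypothetical; whether request IDs must be unique per call is an assumption here.

```typescript
// Local copies of the input/output shapes documented above.
interface DocumentChatInput {
  objectId: string;
  input: string;
  requestId: string;
}
interface DocumentChatOutput {
  answer: string;
}

// Structural stand-in for the part of SmartBucket this sketch uses.
interface DocumentChatClient {
  documentChat(input: DocumentChatInput): Promise<DocumentChatOutput>;
}

// Hypothetical helper: ask one question about each document concurrently.
// Request IDs are derived as `${requestIdPrefix}-${index}` (an assumption).
async function askEachDocument(
  bucket: DocumentChatClient,
  objectIds: string[],
  question: string,
  requestIdPrefix: string
): Promise<{ objectId: string; answer: string }[]> {
  return Promise.all(
    objectIds.map(async (objectId, index) => {
      const { answer } = await bucket.documentChat({
        objectId,
        input: question,
        requestId: `${requestIdPrefix}-${index}`,
      });
      return { objectId, answer };
    })
  );
}
```

Promise.all rejects as soon as any single call fails; if partial results are acceptable, Promise.allSettled is the usual alternative.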
getPaginatedResults()
Retrieves additional pages from previous search operations using the original request ID. This method allows you to continue browsing through search results without re-executing the original search query, improving performance and providing a consistent user experience. The request ID maintains the search context and ranking, ensuring that subsequent pages contain results that follow the same relevance ordering as the initial search.
const moreResults = await smartBucket.getPaginatedResults({
  requestId: "search-001",
  page: 2,
  pageSize: 20
});

console.log(`Page ${moreResults.pagination.page} of ${moreResults.pagination.totalPages}`);
interface GetPaginatedResultsInput {
  requestId: string; // Previous search request ID
  page?: number;     // Target page number
  pageSize?: number; // Results per page
}

interface GetPaginatedResultsOutput {
  results: SearchResult[];    // Search results for page
  pagination: PaginationInfo; // Updated pagination info
}
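A common follow-up is to walk every page of a previous search. The loop below is a sketch of that, driven by the hasMore flag. `PagedSearchClient` is a structural stand-in for the SmartBucket binding and `collectAllResults` is a hypothetical helper. It assumes pages are numbered from 1 and that page 1 can be fetched through getPaginatedResults(), neither of which this page confirms.

```typescript
// Minimal local copies of the shapes documented above.
interface PageResult {
  text?: string;
  score?: number;
}
interface PageInfo {
  total: number;
  page: number;
  pageSize: number;
  totalPages: number;
  hasMore: boolean;
}
interface PageOutput {
  results: PageResult[];
  pagination: PageInfo;
}

// Structural stand-in for the part of SmartBucket this sketch uses.
interface PagedSearchClient {
  getPaginatedResults(input: {
    requestId: string;
    page?: number;
    pageSize?: number;
  }): Promise<PageOutput>;
}

// Hypothetical helper: fetch pages in order until hasMore is false.
// `maxPages` is a safety cap so a misbehaving backend cannot loop forever.
async function collectAllResults(
  bucket: PagedSearchClient,
  requestId: string,
  pageSize = 20,
  maxPages = 50
): Promise<PageResult[]> {
  const all: PageResult[] = [];
  for (let page = 1; page <= maxPages; page++) {
    const { results, pagination } = await bucket.getPaginatedResults({ requestId, page, pageSize });
    all.push(...results);
    if (!pagination.hasMore) break;
  }
  return all;
}
```

Pages are fetched sequentially on purpose, because the total page count is only known after the first response. For large result sets, processing each page as it arrives avoids holding everything in memory.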