SmartBucket

SmartBucket extends the standard Bucket interface with search and chat capabilities. It adds four methods for content discovery: search(), chunkSearch(), documentChat(), and getPaginatedResults(). Combining object storage with AI-powered content analysis makes it a good fit for document management systems, knowledge bases, and applications that need to search through uploaded content.

All standard bucket operations (get, put, delete, list) work exactly as with regular buckets. Files uploaded to SmartBucket become searchable through the additional methods. The system automatically processes uploaded documents to extract text content and create searchable indexes, enabling semantic search across your stored files without requiring manual content preprocessing or separate search infrastructure.

See the Bucket reference for complete documentation of standard storage operations.

Prerequisites

  • Basic understanding of object storage concepts
  • Raindrop framework installed and configured in your project
  • Familiarity with TypeScript interfaces and async/await patterns

Creating a SmartBucket

Define a SmartBucket in your application manifest:

application "demo-app" {
  smartbucket "documents" {}
}

Run raindrop build generate to create the necessary files.

Accessing a SmartBucket

SmartBuckets are accessed through environment variables. The SmartBucket name from your manifest becomes an uppercase environment variable with underscores replacing dashes. This follows the same naming convention as other Raindrop resources, providing consistent access patterns across your application.

For example, smartbucket "documents" becomes env.DOCUMENTS. Once accessed, you can use both standard bucket operations and the enhanced search capabilities on the same interface, making it easy to integrate content discovery features into existing storage workflows.

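// Import paths assume the default layout produced by `raindrop build generate`
import { Service } from '@liquidmetal-ai/raindrop-framework';
import { Env } from './raindrop.gen';

export default class extends Service<Env> {
  async fetch(request: Request): Promise<Response> {
    // Use the request body as the PDF bytes to store
    const pdfBuffer = await request.arrayBuffer();

    // Standard bucket operation
    await this.env.DOCUMENTS.put(
      'research-paper.pdf',
      pdfBuffer,
      { httpMetadata: { contentType: 'application/pdf' } }
    );

    // SmartBucket search method
    const results = await this.env.DOCUMENTS.search({
      input: "climate change research",
      requestId: "req-1"
    });

    return new Response(JSON.stringify(results), {
      headers: { 'Content-Type': 'application/json' }
    });
  }
}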
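
The naming rule described in this section can be sketched as a small function. `toBindingName` is a hypothetical helper written for illustration, not part of the Raindrop API; it applies only the two transformations stated above (uppercasing and replacing dashes with underscores).

```typescript
// Hypothetical helper (not a Raindrop API): derives the env binding name
// from a manifest resource name by uppercasing it and replacing dashes
// with underscores.
function toBindingName(resourceName: string): string {
  return resourceName.toUpperCase().replace(/-/g, "_");
}

console.log(toBindingName("documents"));       // DOCUMENTS
console.log(toBindingName("legal-contracts")); // LEGAL_CONTRACTS
```

The second name, "legal-contracts", is a made-up example used to show the dash replacement.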

Core Concepts

Search Results

Search methods return SearchResult objects containing content matches with relevance scores. These objects provide detailed information about where matches were found, including the source document, extracted text content, and relevance scoring to help you prioritize results. The score field uses a 0-1 scale where higher scores indicate stronger semantic similarity to your search query.

interface SearchResult {
  chunkSignature?: string;                // Unique identifier for content chunk
  text?: string;                          // Extracted text content
  source?: string | { object?: string };  // Source file reference
  payloadSignature?: string;              // Content payload identifier
  score?: number;                         // Relevance score (0-1)
  embed?: string;                         // Vector embedding reference
  type?: string;                          // Content type classification
}
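
Because every field is optional and `source` can be either a string or an object, callers usually need a little normalization before displaying results. The following is a minimal sketch: `sourceName` and `topMatches` are hypothetical helpers, the 0.4 threshold is arbitrary, and the sample data is made up.

```typescript
// Subset of the SearchResult interface above (only the fields used here).
interface SearchResultLike {
  text?: string;
  source?: string | { object?: string };
  score?: number;
}

// Hypothetical helper: returns the source file name whether `source`
// is a plain string or an object with an `object` field.
function sourceName(result: SearchResultLike): string {
  if (typeof result.source === "string") return result.source;
  return result.source?.object ?? "unknown";
}

// Hypothetical helper: keeps results at or above a minimum score and
// sorts them from most to least relevant. Missing scores count as 0.
function topMatches(results: SearchResultLike[], minScore: number): SearchResultLike[] {
  return results
    .filter(r => (r.score ?? 0) >= minScore)
    .sort((a, b) => (b.score ?? 0) - (a.score ?? 0));
}

// Made-up sample data for illustration.
const sampleResults: SearchResultLike[] = [
  { source: "a.pdf", score: 0.42, text: "..." },
  { source: { object: "b.pdf" }, score: 0.91, text: "..." },
  { score: 0.1 },
];

for (const r of topMatches(sampleResults, 0.4)) {
  console.log(`${sourceName(r)}: ${r.score}`);
}
// prints "b.pdf: 0.91" then "a.pdf: 0.42"
```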

Pagination

Search operations return pagination information for accessing additional results. This allows you to handle large result sets efficiently by loading results in manageable pages, improving performance and user experience. The pagination system tracks the total number of available results and provides methods to navigate through multiple pages of search results.

interface PaginationInfo {
  total: number;       // Total available results
  page: number;        // Current page number
  pageSize: number;    // Results per page
  totalPages: number;  // Total available pages
  hasMore: boolean;    // More results available
}
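
To make the relationship between these fields concrete, here is a sketch of how they would fit together under two assumptions this page does not state: pages are numbered from 1, and `totalPages` is `total / pageSize` rounded up. `describePage` is a hypothetical function, not something SmartBucket exposes.

```typescript
interface PaginationInfo {
  total: number;
  page: number;
  pageSize: number;
  totalPages: number;
  hasMore: boolean;
}

// Hypothetical illustration only. Assumes 1-based page numbers and
// totalPages = ceil(total / pageSize); the real service may differ.
function describePage(total: number, page: number, pageSize: number): PaginationInfo {
  if (pageSize <= 0) throw new Error("pageSize must be positive");
  const totalPages = Math.ceil(total / pageSize);
  return { total, page, pageSize, totalPages, hasMore: page < totalPages };
}

// 45 results at 20 per page: totalPages is 3 and hasMore is true on page 1.
console.log(describePage(45, 1, 20));
```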

Search Operations

search()

Performs semantic search across all bucket content using natural language queries. This method analyzes the meaning and context of your search terms rather than just matching keywords, allowing you to find relevant content even when the exact words don’t appear in the documents. Results are ranked by semantic similarity and returned with relevance scores to help you identify the most pertinent matches.

const results = await smartBucket.search({
  input: "financial reports and signatures",
  requestId: "search-001"
});

console.log(`Found ${results.pagination.total} matches`);
results.results.forEach(result => {
  // source may be a plain string or an object holding the file name
  const source = typeof result.source === 'string' ? result.source : result.source?.object;
  console.log(`${source}: ${result.text?.substring(0, 100)}...`);
  console.log(`Relevance: ${result.score}`);
});

interface SearchInput {
  input: string;       // Natural language query
  requestId?: string;  // Optional request tracking ID
}

chunkSearch()

Returns specific text chunks from documents, useful for RAG (Retrieval-Augmented Generation) applications that need text segments rather than full documents. This method breaks down document content into semantically meaningful chunks and returns the most relevant segments based on your search query. This is particularly valuable when you need to provide context to language models or when documents are too large to process in their entirety.

const chunks = await smartBucket.chunkSearch({
  input: "climate change temperature data",
  requestId: "chunk-search-001"
});

const contextChunks = chunks.results
  .slice(0, 5)
  .map(chunk => chunk.text)
  .join('\n\n');

interface RagSearchInput {
  input: string;      // Search query
  requestId: string;  // Required request ID
}
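
When chunks feed a language model, the context usually has to fit a size limit. The sketch below shows one way to assemble chunk text under a character budget. `buildContext` and `buildPrompt` are hypothetical helpers, the prompt wording is an arbitrary choice, and the input is assumed to already be ordered by relevance.

```typescript
// Subset of SearchResult: only the text field is needed here.
interface ChunkLike {
  text?: string;
}

// Hypothetical helper: joins chunk texts in the given order, separated by
// blank lines, and stops before the result would exceed maxChars.
// Chunks with missing or empty text are skipped.
function buildContext(chunks: ChunkLike[], maxChars: number): string {
  const parts: string[] = [];
  let used = 0;
  for (const chunk of chunks) {
    const text = chunk.text?.trim();
    if (!text) continue;
    // Each part after the first costs two extra characters for "\n\n".
    const cost = text.length + (parts.length > 0 ? 2 : 0);
    if (used + cost > maxChars) break;
    parts.push(text);
    used += cost;
  }
  return parts.join("\n\n");
}

// Hypothetical helper: wraps the context and question in a simple prompt.
function buildPrompt(question: string, context: string): string {
  return `Answer using only the context below.\n\nContext:\n${context}\n\nQuestion: ${question}`;
}

// Made-up chunks for illustration.
const exampleContext = buildContext(
  [{ text: "Average temperatures rose." }, { text: "" }, { text: "Sea ice declined." }],
  200
);
console.log(buildPrompt("What changed?", exampleContext));
```

Stopping at the first chunk that does not fit, rather than skipping it and trying smaller ones, keeps the highest-ranked chunks contiguous; either policy is reasonable.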

documentChat()

Generates answers to questions about a specific document’s content using AI-powered document analysis. This method combines document retrieval with natural language processing to provide direct answers to questions about uploaded files. Instead of just finding relevant content, it generates human-readable responses based on the document’s information, making it ideal for document Q&A systems and automated content analysis.

const answer = await smartBucket.documentChat({
  objectId: "research-paper.pdf",
  input: "What are the main findings about renewable energy costs?",
  requestId: "chat-001"
});

console.log(`Answer: ${answer.answer}`);

interface DocumentChatInput {
  objectId: string;   // Target document identifier
  input: string;      // Question or prompt
  requestId: string;  // Request tracking ID
}
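
A common pattern is asking several questions about the same document. The sketch below is one possible approach, assuming each call should carry its own request ID (this page does not say whether IDs must be unique). `DocumentChatCapable`, `askQuestions`, and `stubBucket` are hypothetical names; the stub stands in for a real SmartBucket so the example runs on its own.

```typescript
interface DocumentChatInput {
  objectId: string;
  input: string;
  requestId: string;
}

// Minimal structural type for the one method used here. Only the `answer`
// field shown in the example above is assumed on the response.
interface DocumentChatCapable {
  documentChat(input: DocumentChatInput): Promise<{ answer: string }>;
}

// Hypothetical helper: asks each question in order against one document,
// giving every call a distinct request ID derived from a prefix.
async function askQuestions(
  bucket: DocumentChatCapable,
  objectId: string,
  questions: string[],
  requestPrefix: string
): Promise<Array<{ question: string; answer: string }>> {
  const answers: Array<{ question: string; answer: string }> = [];
  for (let i = 0; i < questions.length; i++) {
    const response = await bucket.documentChat({
      objectId,
      input: questions[i],
      requestId: `${requestPrefix}-${i + 1}`,
    });
    answers.push({ question: questions[i], answer: response.answer });
  }
  return answers;
}

// Stub for local experimentation; replace with a real SmartBucket binding.
const stubBucket: DocumentChatCapable = {
  async documentChat(input) {
    return { answer: `[${input.requestId}] stub answer for "${input.input}"` };
  },
};

askQuestions(stubBucket, "research-paper.pdf", ["What is the topic?", "Who wrote it?"], "chat")
  .then(answers => answers.forEach(a => console.log(`${a.question} -> ${a.answer}`)));
```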

getPaginatedResults()

Retrieves additional pages from previous search operations using the original request ID. This method allows you to continue browsing through search results without re-executing the original search query, improving performance and providing a consistent user experience. The request ID maintains the search context and ranking, ensuring that subsequent pages contain results that follow the same relevance ordering as the initial search.

const moreResults = await smartBucket.getPaginatedResults({
  requestId: "search-001",
  page: 2,
  pageSize: 20
});

console.log(`Page ${moreResults.pagination.page} of ${moreResults.pagination.totalPages}`);

interface GetPaginatedResultsInput {
  requestId: string;  // Previous search request ID
  page?: number;      // Target page number
  pageSize?: number;  // Results per page
}
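
To walk through every remaining page, you can loop until `hasMore` is false. The sketch below assumes the response carries a `results` array alongside `pagination`, as search() does; this page only shows the `pagination` field for getPaginatedResults(). `PaginatedSource`, `collectRemainingPages`, and `stubPagedBucket` are hypothetical names, and the stub serves made-up numbers so the loop can run without a real bucket.

```typescript
interface PageInfo {
  page: number;
  totalPages: number;
  hasMore: boolean;
}

// Assumed response shape: a `results` array plus pagination details.
interface PagedResponse<T> {
  results: T[];
  pagination: PageInfo;
}

interface PaginatedSource<T> {
  getPaginatedResults(input: {
    requestId: string;
    page?: number;
    pageSize?: number;
  }): Promise<PagedResponse<T>>;
}

// Hypothetical helper: fetches pages starting at startPage until hasMore
// is false or maxPages requests have been made, whichever comes first.
async function collectRemainingPages<T>(
  bucket: PaginatedSource<T>,
  requestId: string,
  startPage: number,
  pageSize: number,
  maxPages: number
): Promise<T[]> {
  const collected: T[] = [];
  let page = startPage;
  for (let fetched = 0; fetched < maxPages; fetched++) {
    const response = await bucket.getPaginatedResults({ requestId, page, pageSize });
    collected.push(...response.results);
    if (!response.pagination.hasMore) break;
    page = response.pagination.page + 1;
  }
  return collected;
}

// In-memory stub serving the numbers 1..45; replace with a real SmartBucket.
const allItems = Array.from({ length: 45 }, (_, i) => i + 1);
const stubPagedBucket: PaginatedSource<number> = {
  async getPaginatedResults({ page = 1, pageSize = 20 }) {
    const totalPages = Math.ceil(allItems.length / pageSize);
    const start = (page - 1) * pageSize;
    return {
      results: allItems.slice(start, start + pageSize),
      pagination: { page, totalPages, hasMore: page < totalPages },
    };
  },
};

collectRemainingPages(stubPagedBucket, "search-001", 2, 20, 10)
  .then(items => console.log(`Collected ${items.length} more results`));
```

The `maxPages` cap is a safety limit so that an unexpectedly large result set cannot trigger an unbounded number of requests.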