SmartBucket

SmartBucket extends the standard Bucket interface with search and chat capabilities. It adds four methods for content discovery: search(), chunkSearch(), documentChat(), and getPaginatedResults(). Combining object storage with AI-powered content analysis makes it a good fit for document management systems, knowledge bases, and other applications that need to search uploaded content.

All standard bucket operations (get, put, delete, list) work exactly as with regular buckets. Files uploaded to SmartBucket become searchable through the additional methods. The system automatically processes uploaded documents to extract text content and create searchable indexes, enabling semantic search across your stored files without requiring manual content preprocessing or separate search infrastructure.

See the Bucket reference for complete documentation of standard storage operations.

Prerequisites

Basic understanding of object storage concepts
Raindrop framework installed and configured in your project
Familiarity with TypeScript interfaces and async/await patterns

Creating a SmartBucket

Define a SmartBucket in your application manifest:

application "demo-app" {
    smartbucket "documents" {}
}

Run raindrop build generate to create the necessary files.

Accessing a SmartBucket

SmartBuckets are accessed through environment variables. The SmartBucket name from your manifest becomes an uppercase environment variable with underscores replacing dashes. This follows the same naming convention as other Raindrop resources, providing consistent access patterns across your application.

For example, smartbucket "documents" becomes env.DOCUMENTS. Once accessed, you can use both standard bucket operations and the enhanced search capabilities on the same interface, making it easy to integrate content discovery features into existing storage workflows.

export default class extends Service<Env> {
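  async fetch(request: Request): Promise<Response> {
    // Read the uploaded PDF bytes from the request body
    const pdfBuffer = await request.arrayBuffer();

    // Standard bucket operation
    await this.env.DOCUMENTS.put(
      'research-paper.pdf',
      pdfBuffer,
      { httpMetadata: { contentType: 'application/pdf' } }
    );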

    // SmartBucket search method
    const results = await this.env.DOCUMENTS.search({
      input: "climate change research",
      requestId: "req-1"
    });

    return new Response(JSON.stringify(results));
  }
}
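
The naming rule described above (uppercase, with underscores replacing dashes) can be sketched as a small helper. `toEnvBindingName` is a hypothetical function written for illustration, not part of the Raindrop API; the framework derives the binding name for you.

```typescript
// Hypothetical helper (not a Raindrop API): mirrors the documented rule that a
// manifest name becomes an uppercase binding with underscores replacing dashes.
function toEnvBindingName(manifestName: string): string {
  return manifestName.replace(/-/g, "_").toUpperCase();
}

// "documents" is the manifest name used on this page; "user-uploads" is a
// made-up name that shows the dash replacement.
const documentsBinding = toEnvBindingName("documents");
const uploadsBinding = toEnvBindingName("user-uploads");
```

Under this rule, a manifest entry named "user-uploads" would be reached as env.USER_UPLOADS.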

Core Concepts

Search Results

Search methods return SearchResult objects containing content matches with relevance scores. These objects provide detailed information about where matches were found, including the source document, extracted text content, and relevance scoring to help you prioritize results. The score field uses a 0-1 scale where higher scores indicate stronger semantic similarity to your search query.

interface SearchResult {
  chunkSignature?: string;    // Unique identifier for content chunk
  text?: string;              // Extracted text content
  source?: string | { object?: string }; // Source file reference
  payloadSignature?: string;  // Content payload identifier
  score?: number;             // Relevance score (0-1)
  embed?: string;             // Vector embedding reference
  type?: string;              // Content type classification
}
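
Because every field is optional and source can take two shapes, calling code usually normalizes results before using them. The following is a minimal sketch, assuming only the fields shown above; `sourceName` and `topMatches` are hypothetical helpers, and treating a missing score as 0 is an assumption rather than documented behavior.

```typescript
// Local copy of the SearchResult fields used in this sketch.
interface SearchResult {
  text?: string;
  source?: string | { object?: string };
  score?: number;
}

// Hypothetical helper: return a printable name for either shape of `source`.
function sourceName(result: SearchResult): string {
  if (typeof result.source === "string") return result.source;
  return result.source?.object ?? "unknown";
}

// Hypothetical helper: keep results at or above minScore, strongest first.
// Assumption: a result without a score is treated as score 0.
function topMatches(results: SearchResult[], minScore: number): SearchResult[] {
  return results
    .filter(r => (r.score ?? 0) >= minScore)
    .sort((a, b) => (b.score ?? 0) - (a.score ?? 0));
}

// Made-up sample data covering both source shapes and a result with no score.
const sample: SearchResult[] = [
  { source: "a.pdf", score: 0.42, text: "first match" },
  { source: { object: "b.pdf" }, score: 0.91, text: "second match" },
  { text: "no score or source" },
];
const strongSources = topMatches(sample, 0.4).map(sourceName);
```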

Pagination

Search operations return pagination information for accessing additional results. This allows you to handle large result sets efficiently by loading results in manageable pages, improving performance and user experience. The pagination system tracks the total number of available results and provides methods to navigate through multiple pages of search results.

interface PaginationInfo {
  total: number;        // Total available results
  page: number;         // Current page number
  pageSize: number;     // Results per page
  totalPages: number;   // Total available pages
  hasMore: boolean;     // More results available
}
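
The fields are related in the usual way for page-based results. The sketch below illustrates that relationship under two assumptions that the text above does not state explicitly: pages are numbered from 1, and totalPages is total divided by pageSize, rounded up. The service computes these values itself; `describePagination` is a hypothetical function used only to show how they fit together.

```typescript
interface PaginationInfo {
  total: number;
  page: number;
  pageSize: number;
  totalPages: number;
  hasMore: boolean;
}

// Hypothetical illustration, assuming 1-based pages and rounding up.
function describePagination(total: number, page: number, pageSize: number): PaginationInfo {
  if (pageSize <= 0) throw new RangeError("pageSize must be positive");
  const totalPages = Math.ceil(total / pageSize);
  return { total, page, pageSize, totalPages, hasMore: page < totalPages };
}

// 45 results at 20 per page: two full pages and a final page of 5.
const firstPage = describePagination(45, 1, 20);
const lastPage = describePagination(45, 3, 20);
```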

Search Operations

search()

Performs semantic search across all bucket content using natural language queries. This method analyzes the meaning and context of your search terms rather than just matching keywords, allowing you to find relevant content even when the exact words don’t appear in the documents. Results are ranked by semantic similarity and returned with relevance scores to help you identify the most pertinent matches.

const results = await smartBucket.search({
  input: "financial reports and signatures",
  requestId: "search-001"
});

console.log(`Found ${results.pagination.total} matches`);
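results.results.forEach(result => {
  // source may be a plain string or an object with an `object` field
  const source = typeof result.source === 'string' ? result.source : result.source?.object;
  console.log(`${source}: ${result.text?.substring(0, 100)}...`);
  console.log(`Relevance: ${result.score}`);
});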

Search input and output types:

interface SearchInput {
  input: string;        // Natural language query
  requestId?: string;   // Optional request tracking ID
}

interface SearchOutput {
  results: SearchResult[];      // Ranked search matches
  pagination: PaginationInfo;   // Result pagination details
}
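
A single document can contribute several matches to one result set. When only one entry per file is wanted, the results can be collapsed after the call returns. This is a sketch under the assumption that the source field identifies the originating file; `bestMatchPerSource` is a hypothetical helper, not a SmartBucket method.

```typescript
// Local copy of the SearchResult fields used in this sketch.
interface SearchResult {
  text?: string;
  source?: string | { object?: string };
  score?: number;
}

// Hypothetical helper: keep only the highest-scoring match for each source.
// Assumption: a result without a score is treated as score 0.
function bestMatchPerSource(results: SearchResult[]): Map<string, SearchResult> {
  const best = new Map<string, SearchResult>();
  for (const result of results) {
    const key = typeof result.source === "string"
      ? result.source
      : (result.source?.object ?? "unknown");
    const current = best.get(key);
    if (current === undefined || (result.score ?? 0) > (current.score ?? 0)) {
      best.set(key, result);
    }
  }
  return best;
}

// Made-up sample data: two matches from one file, one from another.
const perSource = bestMatchPerSource([
  { source: "q1-report.pdf", score: 0.5, text: "revenue summary" },
  { source: "q1-report.pdf", score: 0.8, text: "signed by the CFO" },
  { source: { object: "q2-report.pdf" }, score: 0.3, text: "outlook" },
]);
```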

chunkSearch()

Returns specific text chunks from documents, useful for RAG (Retrieval-Augmented Generation) applications that need text segments rather than full documents. This method breaks down document content into semantically meaningful chunks and returns the most relevant segments based on your search query. This is particularly valuable when you need to provide context to language models or when documents are too large to process in their entirety.

const chunks = await smartBucket.chunkSearch({
  input: "climate change temperature data",
  requestId: "chunk-search-001"
});

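// Join the text of the top five chunks, skipping any chunk without text
const contextChunks = chunks.results
  .slice(0, 5)
  .map(chunk => chunk.text ?? '')
  .filter(text => text.length > 0)
  .join('\n\n');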

RAG input and output types:

interface RagSearchInput {
  input: string;        // Search query
  requestId: string;    // Required request ID
}

interface RagSearchOutput {
  results: SearchResult[];  // Relevant text chunks
}
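
When the chunks feed a language model, the combined context usually has to fit a size limit. The sketch below shows one way to assemble context under a character budget. `buildContext` is a hypothetical helper, and it assumes that chunkSearch() returns the most relevant chunks first, which the text above implies but does not guarantee.

```typescript
// Local copy of the SearchResult fields used in this sketch.
interface SearchResult {
  text?: string;
  score?: number;
}

// Hypothetical helper: join chunk text into one context string without
// exceeding maxChars. Chunks are taken in the order given; chunks with
// missing or empty text are skipped; assembly stops at the first chunk
// that would not fit.
function buildContext(chunks: SearchResult[], maxChars: number): string {
  const separator = "\n\n";
  const parts: string[] = [];
  let used = 0;
  for (const chunk of chunks) {
    const text = chunk.text?.trim();
    if (!text) continue;
    const cost = text.length + (parts.length > 0 ? separator.length : 0);
    if (used + cost > maxChars) break;
    parts.push(text);
    used += cost;
  }
  return parts.join(separator);
}

// Made-up chunks; the second one has no text and is skipped.
const sampleChunks: SearchResult[] = [
  { text: "Global mean temperature rose." },
  { score: 0.2 },
  { text: "Sea ice declined." },
];
const tightContext = buildContext(sampleChunks, 40);
const roomyContext = buildContext(sampleChunks, 100);
```

A character budget is only a rough stand-in for a token limit; a tokenizer-based count would be more precise where one is available.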

documentChat()

Generates answers to questions about a specific document’s content using AI-powered document analysis. This method combines document retrieval with natural language processing to provide direct answers to questions about uploaded files. Instead of just finding relevant content, it generates human-readable responses based on the document’s information, making it ideal for document Q&A systems and automated content analysis.

const answer = await smartBucket.documentChat({
  objectId: "research-paper.pdf",
  input: "What are the main findings about renewable energy costs?",
  requestId: "chat-001"
});

console.log(`Answer: ${answer.answer}`);

Chat input and output types:

interface DocumentChatInput {
  objectId: string;     // Target document identifier
  input: string;        // Question or prompt
  requestId: string;    // Request tracking ID
}

interface DocumentChatOutput {
  answer: string;       // Generated response
}
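
A document Q&A flow often asks several questions about the same file. The sketch below wraps documentChat() in a loop and runs against a stub so that it works without a deployed bucket. `ChatCapable`, `askAll`, and `stubChatBucket` are hypothetical names, and giving each call its own requestId is an assumption, not a documented requirement.

```typescript
interface DocumentChatInput {
  objectId: string;
  input: string;
  requestId: string;
}

interface DocumentChatOutput {
  answer: string;
}

// Minimal structural type so the sketch can run against a stub; a real
// SmartBucket binding is assumed to satisfy it.
interface ChatCapable {
  documentChat(input: DocumentChatInput): Promise<DocumentChatOutput>;
}

// Hypothetical helper: ask each question in turn and collect the answers,
// deriving a requestId per call from a caller-supplied prefix.
async function askAll(
  bucket: ChatCapable,
  objectId: string,
  questions: string[],
  requestPrefix: string
): Promise<Map<string, string>> {
  const answers = new Map<string, string>();
  let index = 0;
  for (const question of questions) {
    index += 1;
    const { answer } = await bucket.documentChat({
      objectId,
      input: question,
      requestId: `${requestPrefix}-${index}`,
    });
    answers.set(question, answer);
  }
  return answers;
}

// Stub used only to exercise the helper; it echoes its input back.
const stubChatBucket: ChatCapable = {
  async documentChat(input) {
    return { answer: `[${input.requestId}] ${input.objectId}: ${input.input}` };
  },
};
```

The calls run sequentially to keep the example simple; whether parallel requests are appropriate depends on limits that this page does not describe.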

getPaginatedResults()

Retrieves additional pages from previous search operations using the original request ID. This method allows you to continue browsing through search results without re-executing the original search query, improving performance and providing a consistent user experience. The request ID maintains the search context and ranking, ensuring that subsequent pages contain results that follow the same relevance ordering as the initial search.

const moreResults = await smartBucket.getPaginatedResults({
  requestId: "search-001",
  page: 2,
  pageSize: 20
});

console.log(`Page ${moreResults.pagination.page} of ${moreResults.pagination.totalPages}`);

Pagination input and output types:

interface GetPaginatedResultsInput {
  requestId: string;    // Previous search request ID
  page?: number;        // Target page number
  pageSize?: number;    // Results per page
}

interface GetPaginatedResultsOutput {
  results: SearchResult[];      // Search results for page
  pagination: PaginationInfo;   // Updated pagination info
}
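
To gather every page of a previous search, the call can be repeated until hasMore is false. The following is a minimal sketch that runs against a stub rather than a real bucket. `Pageable`, `collectPages`, and `stubPagedBucket` are hypothetical names, and the loop assumes pages are numbered from 1.

```typescript
interface SearchResult { text?: string; score?: number; }

interface PaginationInfo {
  total: number;
  page: number;
  pageSize: number;
  totalPages: number;
  hasMore: boolean;
}

interface GetPaginatedResultsInput { requestId: string; page?: number; pageSize?: number; }

interface GetPaginatedResultsOutput { results: SearchResult[]; pagination: PaginationInfo; }

// Minimal structural type; a real SmartBucket binding is assumed to satisfy it.
interface Pageable {
  getPaginatedResults(input: GetPaginatedResultsInput): Promise<GetPaginatedResultsOutput>;
}

// Hypothetical helper: fetch pages in order until hasMore is false or the
// maxPages safety cap is reached. Assumes 1-based page numbers.
async function collectPages(
  bucket: Pageable,
  requestId: string,
  pageSize: number,
  maxPages: number
): Promise<SearchResult[]> {
  const all: SearchResult[] = [];
  for (let page = 1; page <= maxPages; page++) {
    const { results, pagination } = await bucket.getPaginatedResults({ requestId, page, pageSize });
    all.push(...results);
    if (!pagination.hasMore) break;
  }
  return all;
}

// Stub holding five made-up results, used only to exercise the helper.
const stubItems: SearchResult[] = Array.from({ length: 5 }, (_, i) => ({ text: `match ${i + 1}` }));
const stubPagedBucket: Pageable = {
  async getPaginatedResults({ page = 1, pageSize = 2 }) {
    const start = (page - 1) * pageSize;
    const totalPages = Math.ceil(stubItems.length / pageSize);
    return {
      results: stubItems.slice(start, start + pageSize),
      pagination: { total: stubItems.length, page, pageSize, totalPages, hasMore: page < totalPages },
    };
  },
};
```

The maxPages cap keeps the loop bounded if a very large result set is returned.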