AI Models
Raindrop provides access to a comprehensive suite of AI models through a unified interface that abstracts the complexity of working with different AI providers while maintaining type safety and performance. The AI system supports text generation, image processing, speech recognition, language translation, embeddings, and specialized capabilities like code generation and mathematical reasoning.
The framework handles model routing automatically, providing consistent interfaces across all model types. Each model has specific input and output types that ensure compile-time safety while supporting both simple one-shot operations and complex streaming workflows. Advanced models support tool calling for function execution and integration with external systems.
Key benefits:
- Unified Interface: Single env.AI.run() method for all model types
- Type Safety: Compile-time validation of model inputs and outputs
- Tool Calling: Function execution support for compatible models
- Automatic Routing: Seamless integration across multiple AI providers
- Streaming Support: Real-time response streaming for conversational applications
- Advanced Options: Request queuing, caching, and gateway configuration
Prerequisites
- Active Raindrop project with AI binding configured
- Understanding of TypeScript generics and async/await patterns
- Familiarity with AI model concepts (LLMs, embeddings, vision models)
- Basic knowledge of your target AI use cases and model requirements
Configuration
AI capabilities are automatically available to all Raindrop applications through the env.AI interface - no manifest configuration required.
application "ai-app" { service "api" { domain = "api.example.com" # AI interface available as this.env.AI }}
Generate the service implementation:
raindrop build generate
The AI interface is available in your generated service class:
export default class extends Service<Env> {
  async fetch(request: Request): Promise<Response> {
    // AI interface available as this.env.AI
    const result = await this.env.AI.run(
      'llama-3.1-8b-instruct',
      { prompt: "Hello, AI!" }
    );
    return new Response(result.response);
  }
}
Access
Access AI models through the env.AI.run() method with model-specific parameters:
// Basic text generation
const response = await this.env.AI.run('llama-3.3-70b', {
  messages: [
    { role: "user", content: "Explain quantum computing" }
  ],
  max_tokens: 150
});
// Generate embeddings
const embeddings = await this.env.AI.run('embeddings', {
  input: "Text to embed"
});
// Process images with vision models
const analysis = await this.env.AI.run('llama-3.2-11b-vision', {
  messages: [
    {
      role: "user",
      content: [
        { type: "text", text: "Describe this image" },
        { type: "image_url", image_url: { url: imageUrl } }
      ]
    }
  ]
});
The interface automatically handles type checking and validates inputs based on the model selected.
Core Concepts
Model Routing System
Raindrop uses a sophisticated routing system that maps user-friendly model names to provider-specific endpoints. The framework automatically handles model discovery, routing, and response formatting to provide a consistent interface across all available models.
Type-Safe Interfaces
Each model has specific input and output type signatures that provide compile-time validation:
// TypeScript infers correct types automatically
const llmResponse = await env.AI.run('llama-3.3-70b', {
  messages: [{ role: "user", content: "Hello" }], // ← Typed input
  temperature: 0.7
}); // ← Returns typed LLM output
const embedResponse = await env.AI.run('bge-large-en', {
  input: ["text1", "text2"] // ← Typed embedding input
}); // ← Returns typed embedding output
Capability-Based Selection
Models are organized by capabilities to help you choose the right model for your use case:
- Chat/Completion: Conversational AI and text generation
- Vision: Image understanding and multimodal processing
- Embeddings: Text representation and semantic search
- Audio: Speech-to-text transcription
- Specialized: Code generation, mathematical reasoning, PII detection
Function Calling
Many chat models support function calling (tool calling), enabling AI models to execute specific functions and integrate with external systems. This allows models to access real-time data, perform calculations, and interact with APIs during conversations.
Standardized Model Interfaces
Raindrop provides standardized TypeScript interfaces for AI model capabilities, ensuring type safety and consistency across different model types.
Audio Processing
Used by: whisper-large-v3, whisper, whisper-large-v3-turbo, whisper-tiny, nova-3, smart-turn-v2
interface AudioInput {
  audio: number[] | ReadableStream;
  contentType: string;
  language?: string;
  response_format?: 'json' | 'text' | 'srt' | 'vtt';
}

interface AudioOutput {
  text: string;
}
Vision Analysis
Used by: llava-1.5-7b, uform-gen2-qwen-500m
interface VisionInput {
  messages: Array<{
    role: 'system' | 'user' | 'assistant';
    content: Array<{
      type: 'text' | 'image_url';
      text?: string;
      image_url?: {
        url: string;
        detail?: 'low' | 'high' | 'auto';
      };
    }>;
  }>;
  model: string;
  max_tokens?: number;
  temperature?: number;
}

interface VisionOutput {
  choices: Array<{
    message: {
      role: 'assistant';
      content: string;
    };
    finish_reason?: string;
  }>;
}
Text-to-Speech
Used by: aura-1, melotts
interface TTSInput {
  text: string;
  voice?: string;
  speed?: number;
  response_format?: 'mp3' | 'wav' | 'ogg';
}

interface TTSOutput {
  audio: ArrayBuffer | Uint8Array;
  response_format?: string;
}
Image Generation
Used by: flux-1-schnell, stable-diffusion-xl-base-1.0, phoenix-1.0, etc.
interface ImageGenerationInput {
  prompt: string;
  negative_prompt?: string;
  width?: number;
  height?: number;
  steps?: number;
  guidance_scale?: number;
}

interface ImageGenerationOutput {
  data: Array<{
    url?: string;
    b64_json?: string;
  }>;
}
Document Reranking
Used by: bge-reranker-base
interface RerankerInput {
  query: string;
  documents: string[];
  top_k?: number;
}

interface RerankerOutput {
  ranked_documents: Array<{
    index: number;
    document: string;
    relevance_score: number;
  }>;
}
Text Classification
Used by: distilbert-sst-2-int8
interface TextClassificationInput {
  text: string;
  labels?: string[];
}

interface TextClassificationOutput {
  label: string;
  score: number;
}
Translation
Used by: m2m100-1.2b
interface TranslationInput {
  text: string;
  source_language?: string;
  target_language: string;
}

interface TranslationOutput {
  translation: string;
  source_lang?: string;
  target_lang: string;
}
Summarization
Used by: bart-large-cnn
interface SummarizationInput {
  text: string;
  max_length?: number;
  min_length?: number;
}

interface SummarizationOutput {
  summary: string;
}
Image Classification
Used by: resnet-50
interface ImageClassificationInput {
  image: string | File | Blob;
  prompt?: string;
}

interface ImageClassificationOutput {
  label: string;
  score: number;
}
Chat Completion
Used by: All chat models
interface OpenAIChatInput {
  messages: Array<{
    role: 'system' | 'user' | 'assistant';
    content: string;
  }>;
  model?: string;
  temperature?: number;
  max_tokens?: number;
  tools?: Array<{
    type: 'function';
    function: {
      name: string;
      description?: string;
      parameters?: Record<string, unknown>;
    };
  }>;
  tool_choice?: 'auto' | 'required' | 'none';
}

interface OpenAIChatOutput {
  choices: Array<{
    message: {
      role: 'assistant';
      content: string;
    };
    finish_reason?: string;
  }>;
  usage?: {
    prompt_tokens: number;
    completion_tokens: number;
    total_tokens: number;
  };
}
Embeddings
Used by: embeddings, bge-m3, bge-large-en, bge-base-en, bge-small-en, embeddinggemma-300m
type EmbeddingInput = string | { prompt: string } | { input: string }
interface OpenAIEmbeddingOutput {
  data: Array<{
    embedding: number[];
    index: number;
  }>;
  usage?: {
    prompt_tokens: number;
    total_tokens: number;
  };
}
Interface Benefits
- Type Safety: Compile-time validation prevents runtime errors
- Consistency: Predictable patterns across model types and providers
- Flexibility: Automatic format conversion by the model router
- Future-Proofing: New models inherit appropriate interfaces automatically
Text Generation Models
Text generation models handle conversational AI, content creation, and language understanding tasks.
High-Performance Models
Advanced models for complex reasoning and long-context applications:
- llama-3.3-70b: Meta Llama 3.3 70B with advanced reasoning capabilities
- deepseek-r1: DeepSeek R1 with advanced chain-of-thought reasoning
- deepseek-v3-0324: DeepSeek V3 high-performance model with long context
- kimi-k2: Kimi K2 with tool integration capabilities
- qwen-3-32b: Qwen 3 32B with advanced multilingual capabilities
- llama-4-maverick-17b: Llama 4 Maverick 17B advanced model
- gpt-oss-120b: GPT OSS 120B with chain-of-thought capabilities
- llama-3.1-70b-instruct: Meta Llama 3.1 70B large language model
- qwen-coder-32b: Qwen 2.5 Coder 32B specialized for code generation
- deepseek-r1-distill-qwen-32b: DeepSeek R1 distilled model with JSON mode
- qwen-qwq-32b: QwQ 32B reasoning model capable of thinking and reasoning
- mistral-small-3.1: Mistral Small 3.1 with vision understanding and 128K context
- gemma-3-12b: Gemma 3 12B multimodal model with 128K context
- llama-3.2-11b-vision: Llama 3.2 11B with visual recognition capabilities
- llama-4-scout-17b: Meta Llama 4 Scout 17B multimodal model with MoE architecture
- qwen-1.5-14b: Qwen 1.5 14B with AWQ quantization
- llama-3.3-70b-instruct-fp8: Llama 3.3 70B quantized to FP8 for speed
Efficient Models
Optimized for speed and resource efficiency:
- llama-3.1-8b-external: Meta Llama 3.1 8B for efficient processing
- llama-3.1-8b-instant: Ultra-fast responses with Llama 3.1 8B
- gemma-9b-it: Google Gemma 9B instruction-tuned model
- gpt-oss-20b: GPT OSS 20B efficient reasoning model
- llama-3.1-8b-instruct: Fast and efficient general-purpose model
- llama-3-8b-instruct: Reliable model for general tasks
- llama-3.2-3b-instruct: Compact and efficient model
- gemma-2b: Lightweight model for basic tasks
- llama-3.1-8b-instruct-fast: Llama 3.1 8B optimized for speed
- llama-3.1-8b-instruct-fp8: Llama 3.1 8B quantized to FP8
- llama-3.1-8b-instruct-awq: Llama 3.1 8B with AWQ quantization
- llama-3-8b-instruct-awq: Llama 3 8B with AWQ quantization
- llama-3.2-1b-instruct: Ultra-compact 1B parameter model
- gemma-7b: Gemma 7B with LoRA adapter support
- gemma-7b-it: Gemma 7B instruction-tuned model
- qwen-1.5-7b: Qwen 1.5 7B with AWQ quantization
- qwen-1.5-1.8b: Qwen 1.5 1.8B lightweight model
- qwen-1.5-0.5b: Qwen 1.5 0.5B ultra-lightweight model
- tinyllama-1.1b-chat-v1.0: TinyLlama 1.1B chat model
Reasoning Models
Models specifically optimized for reasoning and problem-solving:
- deepseek-r1-distill-llama-70b: Fast reasoning with long context support
- qwen-qwq-32b: Reasoning model capable of thinking and reasoning
Code Generation Models
Models specialized for programming and code generation:
- qwen-coder-32b: Qwen 2.5 Coder 32B specialized for code generation
- deepseek-coder-6.7b: DeepSeek Coder 6.7B instruction-tuned for code
- deepseek-coder-6.7b-base: DeepSeek Coder 6.7B base model
- sqlcoder-7b: SQL code generation model for database queries
Mathematical Models
Models specialized for mathematical reasoning:
- deepseek-math-7b: DeepSeek Math 7B specialized for mathematical reasoning
Multilingual Models
Models optimized for specific languages or multilingual tasks:
- llama-3.3-swallow-70b: Japanese-optimized model
- discolm-german-7b-v1-awq: German language specialized model
Safety & Moderation Models
Models for content safety and moderation:
- llama-guard-3-8b: Content safety classification
- llamaguard-7b-awq: LLM prompt and response safety classification
Specialized Chat Models
Other specialized conversational models:
- starling-lm-7b-beta: Reinforcement learning trained model
- neural-chat-7b-v3-1-awq: Intel Gaudi optimized model
- mistral-7b-instruct: Mistral 7B instruction-tuned model
- mistral-7b-instruct-v0.2: Mistral 7B v0.2 with 32K context
- mistral-7b-instruct-v0.2-lora: Mistral 7B v0.2 with LoRA
- mistral-7b-instruct-awq: Mistral 7B with AWQ quantization
- una-cybertron-7b-v2-bf16: Cybertron 7B v2 unified alignment model
- falcon-7b-instruct: Falcon 7B instruction-tuned model
- hermes-2-pro-mistral-7b: Hermes 2 Pro with function calling support
- openhermes-2.5-mistral-7b-awq: OpenHermes 2.5 with code training
- zephyr-7b-beta-awq: Zephyr 7B with AWQ quantization
- openchat-3.5: OpenChat 3.5 with C-RLFT training
- phi-2: Microsoft Phi-2 transformer model
- llama-4-scout-17b: Meta Llama 4 Scout 17B multimodal model
Legacy Models
Older models maintained for compatibility:
- llama-2-7b-chat-fp16: Llama 2 7B full precision
- llama-2-7b-chat-int8: Llama 2 7B quantized
- llama-2-7b-chat-hf-lora: Llama 2 7B with LoRA support
- llama-2-13b-chat-awq: Llama 2 13B with AWQ quantization
Text Generation Usage
Basic Chat Completion:
const response = await env.AI.run('llama-3.3-70b', {
  messages: [
    { role: "system", content: "You are a helpful assistant." },
    { role: "user", content: "Write a haiku about coding" }
  ],
  max_tokens: 100,
  temperature: 0.7
});
console.log(response.choices[0].message.content);
Streaming Responses:
const stream = await env.AI.run('llama-3.1-8b-instruct', {
  messages: [{ role: "user", content: "Tell me a story" }],
  stream: true
});

for await (const chunk of stream) {
  process.stdout.write(chunk.choices[0]?.delta?.content || '');
}
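Inside a service handler, the same stream can be relayed to the client as it arrives. A minimal sketch, assuming a Workers-style runtime with TransformStream; the chunk shape follows the example above:

// Hypothetical handler fragment: relay model output to the client as it streams.
async function streamToClient(env: Env): Promise<Response> {
  const stream = await env.AI.run('llama-3.1-8b-instruct', {
    messages: [{ role: "user", content: "Tell me a story" }],
    stream: true
  });

  const { readable, writable } = new TransformStream();
  const writer = writable.getWriter();
  const encoder = new TextEncoder();

  // Pump chunks in the background so the Response can start immediately.
  (async () => {
    for await (const chunk of stream) {
      const text = chunk.choices[0]?.delta?.content || '';
      if (text) await writer.write(encoder.encode(text));
    }
    await writer.close();
  })();

  return new Response(readable, {
    headers: { "Content-Type": "text/plain; charset=utf-8" }
  });
}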
JSON Mode:
const analysis = await env.AI.run('deepseek-r1-distill-llama-70b', {
  messages: [
    { role: "user", content: "Analyze this data and return structured JSON" }
  ],
  response_format: { type: "json_object" }
});
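The structured output still arrives as a string in the message content, so parse it defensively. A minimal sketch:

// Parse the structured output; guard against malformed JSON.
const raw = analysis.choices[0].message.content;
let parsed: Record<string, unknown>;
try {
  parsed = JSON.parse(raw);
} catch {
  throw new Error(`Model returned non-JSON content: ${raw.slice(0, 100)}`);
}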
Function Calling:
const response = await env.AI.run('llama-3.3-70b', {
  messages: [
    { role: "user", content: "What's the weather like in San Francisco?" }
  ],
  tools: [
    {
      type: "function",
      function: {
        name: "get_weather",
        description: "Get the current weather for a location",
        parameters: {
          type: "object",
          properties: {
            location: {
              type: "string",
              description: "The city and state, e.g. San Francisco, CA"
            },
            unit: {
              type: "string",
              enum: ["celsius", "fahrenheit"],
              description: "Temperature unit"
            }
          },
          required: ["location"]
        }
      }
    }
  ],
  tool_choice: "auto"
});
// Handle tool calls in response
if (response.choices[0].message.tool_calls) {
  const toolCall = response.choices[0].message.tool_calls[0];
  if (toolCall.function.name === "get_weather") {
    const args = JSON.parse(toolCall.function.arguments);
    // Execute function and provide result back to model
  }
}
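To complete the loop, execute the function and send its result back so the model can produce a final answer. The sketch below follows common OpenAI-style conventions; the "tool" role, the tool_call_id field, and the getWeather helper are illustrative assumptions, not confirmed Raindrop API:

// Hypothetical application function (stub for illustration).
async function getWeather(location: string, unit?: string) {
  return { location, temperature: 18, unit: unit ?? "celsius" };
}

const toolCall = response.choices[0].message.tool_calls[0];
const args = JSON.parse(toolCall.function.arguments);
const weather = await getWeather(args.location, args.unit);

// Assumed OpenAI-style follow-up: append the assistant turn and the tool result.
const followUp = await env.AI.run('llama-3.3-70b', {
  messages: [
    { role: "user", content: "What's the weather like in San Francisco?" },
    response.choices[0].message, // assistant message containing the tool call
    { role: "tool", tool_call_id: toolCall.id, content: JSON.stringify(weather) }
  ]
});

console.log(followUp.choices[0].message.content);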
Required Function Calling:
const response = await env.AI.run('kimi-k2', {
  messages: [
    { role: "user", content: "Calculate the compound interest on $1000 for 5 years at 3% annual rate" }
  ],
  tools: [
    {
      type: "function",
      function: {
        name: "calculate_compound_interest",
        description: "Calculate compound interest",
        parameters: {
          type: "object",
          properties: {
            principal: { type: "number", description: "Initial investment amount" },
            rate: { type: "number", description: "Annual interest rate (as decimal)" },
            time: { type: "number", description: "Time period in years" },
            frequency: { type: "number", description: "Compounding frequency per year", default: 1 }
          },
          required: ["principal", "rate", "time"]
        }
      }
    }
  ],
  tool_choice: "required"
});
Vision Models
Vision models process and understand images alongside text for multimodal applications.
Available Vision Models
Vision-Language Models:
- llava-1.5-7b: LLaVA 1.5 7B vision-language model for image analysis
- uform-gen2-qwen-500m: Compact vision-language model for image captioning and VQA
Vision Model Usage
Image Description:
const description = await env.AI.run('llava-1.5-7b', {
  messages: [{
    role: "user",
    content: [
      { type: "text", text: "What's in this image?" },
      {
        type: "image_url",
        image_url: {
          url: "data:image/jpeg;base64,/9j/4AAQSkZJRgABAQAAAQ...",
          detail: "high"
        }
      }
    ]
  }],
  max_tokens: 300
});
Image Analysis with Context:
const analysis = await env.AI.run('llama-3.2-11b-vision', {
  messages: [
    {
      role: "system",
      content: "You are an expert art critic. Analyze images in detail."
    },
    {
      role: "user",
      content: [
        { type: "text", text: "Analyze the artistic style and techniques" },
        { type: "image_url", image_url: { url: imageUrl } }
      ]
    }
  ],
  temperature: 0.3
});
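When the image arrives as an upload rather than a hosted URL, it can be inlined as a base64 data URL, as in the first example above. A minimal sketch, assuming the runtime provides btoa; toDataUrl is an illustrative helper name:

// Convert an uploaded image into a data URL accepted by image_url.url.
async function toDataUrl(blob: Blob): Promise<string> {
  const bytes = new Uint8Array(await blob.arrayBuffer());
  let binary = '';
  for (const b of bytes) binary += String.fromCharCode(b);
  return `data:${blob.type || 'image/jpeg'};base64,${btoa(binary)}`;
}

const imageUrl = await toDataUrl(await request.blob());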
Embedding Models
Embedding models convert text into numerical vectors for semantic search, similarity comparison, and retrieval-augmented generation.
Available Embedding Models
High-Quality Embeddings:
- embeddings: Default embeddings model (BGE Large English v1.5)
- bge-large-en: BGE Large English 1024-dimensional embeddings
- bge-base-en: BGE Base English 768-dimensional embeddings
- bge-small-en: BGE Small English 384-dimensional embeddings
Multilingual Embeddings:
- bge-m3: Multilingual embeddings supporting 100+ languages
- embeddinggemma-300m: Gemma 3-based 300M parameter embedding model
Specialized:
- pii-detection: PII Detection service for identifying personally identifiable information
Embedding Usage
Single Text Embedding:
const embedding = await env.AI.run('embeddings', {
  input: "Natural language processing with embeddings"
});
const vector = embedding.data[0].embedding;
Batch Text Embedding:
const embeddings = await env.AI.run('embeddings', {
  input: [
    "First document text",
    "Second document text",
    "Third document text"
  ]
});

embeddings.data.forEach((item, index) => {
  console.log(`Document ${index}: ${item.embedding.length} dimensions`);
});
Multilingual Embedding:
const multilingualEmbedding = await env.AI.run('bge-m3', {
  text: ["Hello world", "Hola mundo", "Bonjour le monde"]
});
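For semantic search, embeddings are typically compared with cosine similarity. A minimal sketch built on the output shape shown above; the query text and documents are illustrative:

// Cosine similarity between two equal-length vectors.
function cosineSimilarity(a: number[], b: number[]): number {
  let dot = 0, normA = 0, normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Embed a query and documents together, then rank documents by similarity.
const queryText = "AI model routing";
const docs = ["Raindrop deployment guide", "Cooking with cast iron", "How model routing works"];
const result = await env.AI.run('embeddings', { input: [queryText, ...docs] });
const [queryVec, ...docVecs] = result.data.map(d => d.embedding);

const ranked = docVecs
  .map((vec, i) => ({ doc: docs[i], score: cosineSimilarity(queryVec, vec) }))
  .sort((a, b) => b.score - a.score);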
Audio Models
Audio models provide speech-to-text transcription with support for multiple languages and output formats.
Available Audio Models
Speech Recognition:
- whisper-large-v3: Advanced speech-to-text transcription
- whisper: General-purpose speech recognition model
- whisper-large-v3-turbo: Faster, more accurate speech-to-text
- whisper-tiny: Lightweight English-only speech recognition
- nova-3: Deepgram's speech-to-text model
- smart-turn-v2: Audio turn detection model
Audio Usage
Basic Transcription:
// Audio file from request
const audioFile = await request.blob();

const transcription = await env.AI.run('whisper', {
  file: audioFile,
  response_format: 'text'
});

console.log(transcription.text);
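The AudioInput interface above also accepts raw bytes plus an explicit content type, which is useful when the audio is not already a Blob. A sketch under that interface; whether a given model expects file or audio/contentType is an assumption worth verifying:

// Transcribe raw audio bytes using the documented AudioInput shape.
const audioBytes = new Uint8Array(await request.arrayBuffer());

const result = await env.AI.run('whisper-large-v3', {
  audio: Array.from(audioBytes),
  contentType: 'audio/wav',
  language: 'en',
  response_format: 'json'
});

console.log(result.text);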
Text-to-Speech Models
Text-to-speech models convert written text into natural-sounding speech audio.
Available TTS Models
High-Quality TTS:
- aura-1: Context-aware TTS with natural pacing and expressiveness
- melotts: High-quality multilingual text-to-speech
TTS Usage
Basic Text-to-Speech:
const speech = await env.AI.run('aura-1', {
  text: "Hello, this is a text-to-speech example.",
  voice: "default",
  response_format: "mp3"
});

const audioBuffer = speech.audio;
Multi-language TTS:
const speech = await env.AI.run('melotts', {
  text: "Bonjour le monde",
  voice: "french",
  speed: 1.0
});
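Inside a service handler, the generated audio can be returned straight to the client. A minimal sketch; the audio/mpeg content type is an assumption tied to the mp3 response_format:

// Handler fragment: serve synthesized speech directly.
const speech = await env.AI.run('aura-1', {
  text: "Your order has shipped.",
  response_format: "mp3"
});

// speech.audio is an ArrayBuffer or Uint8Array per TTSOutput.
return new Response(speech.audio, {
  headers: { "Content-Type": "audio/mpeg" }
});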
Image Generation Models
Image generation models create images from text descriptions using diffusion techniques.
Available Image Generation Models
Advanced Generation:
- flux-1-schnell: FLUX.1 12B parameter model for high-quality image generation
- stable-diffusion-xl-base-1.0: SDXL base model for high-resolution images
- stable-diffusion-xl-lightning: SDXL Lightning for fast generation
- phoenix-1.0: Leonardo.AI model with exceptional prompt adherence
- lucid-origin: Leonardo.AI's most adaptable and prompt-responsive model
Specialized Generation:
- stable-diffusion-v1-5-inpainting: Stable Diffusion with inpainting capability
- stable-diffusion-v1-5-img2img: Generate images from input images
- dreamshaper-8-lcm: Fine-tuned for photorealism
Image Generation Usage
Basic Image Generation:
const image = await env.AI.run('flux-1-schnell', {
  prompt: "A serene mountain landscape at sunset",
  width: 1024,
  height: 1024
});

const imageData = image.data[0].url || image.data[0].b64_json;
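Depending on the model, the output arrives as a hosted URL or as base64 in b64_json (see ImageGenerationOutput above). A handler-fragment sketch assuming the runtime provides atob; the image/png content type is illustrative:

const result = await env.AI.run('flux-1-schnell', { prompt: "A watercolor fox" });
const item = result.data[0];

if (item.url) {
  // Redirect (or fetch and proxy) the hosted image.
  return Response.redirect(item.url, 302);
}

// Decode base64 into bytes and serve directly.
const binary = atob(item.b64_json!);
const bytes = Uint8Array.from(binary, c => c.charCodeAt(0));
return new Response(bytes, { headers: { "Content-Type": "image/png" } });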
Advanced Image Generation:
const image = await env.AI.run('phoenix-1.0', {
  prompt: "Professional headshot of a businesswoman, studio lighting",
  negative_prompt: "blurry, low quality, distorted",
  guidance_scale: 7.5,
  steps: 20
});
Image Inpainting:
const inpaintedImage = await env.AI.run('stable-diffusion-v1-5-inpainting', {
  prompt: "A red car in the driveway",
  image: originalImageBlob,
  mask: maskImageBlob
});
Text Classification Models
Text classification models categorize and analyze text content.
Available Text Classification Models
Sentiment Analysis:
- distilbert-sst-2-int8: Sentiment classification (positive/negative)
Document Ranking:
- bge-reranker-base: Document relevance ranking and scoring
Text Classification Usage
Sentiment Analysis:
const sentiment = await env.AI.run('distilbert-sst-2-int8', {
  text: "This product is amazing and works perfectly!"
});

console.log(sentiment.label); // "POSITIVE"
console.log(sentiment.score); // 0.98
Document Reranking:
const rankings = await env.AI.run('bge-reranker-base', {
  query: "machine learning algorithms",
  documents: [
    "Neural networks and deep learning techniques",
    "Traditional statistical methods",
    "Computer vision applications"
  ],
  top_k: 3
});

rankings.ranked_documents.forEach(doc => {
  console.log(`Score: ${doc.relevance_score} - ${doc.document}`);
});
Image Classification Models
Image classification models identify and categorize objects within images.
Available Image Classification Models
Object Recognition:
- resnet-50: 50-layer CNN trained on ImageNet for object classification
Image Classification Usage
Basic Image Classification:
const classification = await env.AI.run('resnet-50', {
  image: imageBlob
});

console.log(classification.label); // "golden_retriever"
console.log(classification.score); // 0.95
Translation Models
Translation models convert text between different languages.
Available Translation Models
Multilingual Translation:
- m2m100-1.2b: Many-to-many multilingual translation model
Translation Usage
Basic Translation:
const translation = await env.AI.run('m2m100-1.2b', {
  text: "Hello, how are you today?",
  source_language: "en",
  target_language: "es"
});
console.log(translation.translation);
Summarization Models
Summarization models create concise summaries of longer text content.
Available Summarization Models
Abstractive Summarization:
- bart-large-cnn: BART model fine-tuned for text summarization
Summarization Usage
Text Summarization:
const summary = await env.AI.run('bart-large-cnn', {
  text: "Very long article text here...",
  max_length: 150,
  min_length: 50
});
console.log(summary.summary);
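Summarization models have input length limits, so long documents are often split, summarized per chunk, and the partial summaries summarized again. A minimal sketch; the 4000-character chunk size is an illustrative assumption, not a documented limit:

// Summarize a long document by chunking, then summarizing the combined partials.
async function summarizeLong(env: Env, text: string): Promise<string> {
  const chunkSize = 4000; // assumed budget; tune for the model's actual limit
  const chunks: string[] = [];
  for (let i = 0; i < text.length; i += chunkSize) {
    chunks.push(text.slice(i, i + chunkSize));
  }

  const partials: string[] = [];
  for (const chunk of chunks) {
    const s = await env.AI.run('bart-large-cnn', { text: chunk, max_length: 100 });
    partials.push(s.summary);
  }
  if (partials.length === 1) return partials[0];

  const final = await env.AI.run('bart-large-cnn', {
    text: partials.join(' '),
    max_length: 150,
    min_length: 50
  });
  return final.summary;
}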
PII Detection Models
Personally Identifiable Information (PII) detection models identify sensitive data in text.
Available PII Detection Models
PII Identification:
- pii-detection: Identifies personally identifiable information in text
PII Detection Usage
Basic PII Detection:
const piiResult = await env.AI.run('pii-detection', {
  text: "My name is John Doe and my email is john@example.com. My SSN is 123-45-6789."
});

piiResult.pii_detection.forEach(entity => {
  console.log(`${entity.entity_type}: ${entity.text} (confidence: ${entity.confidence})`);
});
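A common follow-up is redacting detected spans before storing or logging the text. A minimal sketch using the output shape shown above; the placeholder format is illustrative:

// Replace each detected PII span with a typed placeholder, e.g. [EMAIL].
function redact(text: string, entities: Array<{ entity_type: string; text: string }>): string {
  let redacted = text;
  for (const e of entities) {
    redacted = redacted.split(e.text).join(`[${e.entity_type.toUpperCase()}]`);
  }
  return redacted;
}

const safeText = redact(
  "My name is John Doe and my email is john@example.com.",
  piiResult.pii_detection
);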