RESTful API Endpoints
Clean, well-documented REST endpoints for transcription, translation, subtitle formatting, and file management. Predictable URL structure, standard HTTP methods, and JSON responses make integration straightforward.
Developer API
Integrate CapsAI's transcription and subtitle generation directly into your application, platform, or workflow with our production-ready REST API. Official SDKs for Python, Node.js, and Go ship with type definitions, error handling, and retry logic built in. Webhooks notify your system when processing completes, batch endpoints handle bulk operations, and enterprise rate limits allow 10,000+ requests per minute. Build captioning into your product without training your own models.
99.9% API uptime SLA
3 official SDKs
<2s average response time
Features
Register webhook URLs to receive real-time POST notifications when transcription jobs complete, fail, or reach milestones. No polling required: your system reacts to processing events as they happen.
Production-ready SDK libraries for Python, Node.js, and Go with type definitions, error handling, retry logic, and streaming support built in. Install via pip, npm, or go get and start building in minutes.
Submit hundreds of files in a single API call with our batch endpoint. Track bulk job progress, receive completion webhooks, and download results programmatically without managing individual file states.
API key authentication with environment-scoped keys (development, staging, production). IP allowlisting, request signing, and OAuth 2.0 support for enterprise integrations with strict security requirements.
Transparent rate limits from 100 requests/minute (free) to 10,000+/minute (enterprise). Pay-per-minute pricing with volume discounts, committed-use contracts, and real-time usage dashboards.
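Whatever your tier, a client that respects rate limits usually backs off and retries when a request is rejected. The sketch below assumes the API signals a rate-limit rejection with HTTP 429, which this page does not confirm; `call_with_backoff` and `backoff_delays` are hypothetical helper names, and the official SDKs already include their own retry logic.

```python
import random
import time


def backoff_delays(max_retries=5, base=0.5, cap=30.0):
    """Yield capped exponential delays with full jitter, one per retry."""
    for attempt in range(max_retries):
        yield random.uniform(0, min(cap, base * 2 ** attempt))


def call_with_backoff(send, max_retries=5, sleep=time.sleep):
    """Call send() until it returns a non-429 status or retries run out.

    `send` is any zero-argument callable returning (status_code, body).
    Assumption: a rate-limited request comes back with status 429.
    """
    status, body = send()
    for delay in backoff_delays(max_retries):
        if status != 429:
            break
        sleep(delay)
        status, body = send()
    return status, body
```

Passing `sleep` as a parameter keeps the helper easy to test without real waiting.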
Workflow

Step 1
Sign up and generate API keys from your dashboard. Create separate keys for development, staging, and production environments with independent rate limits and permissions.

Step 2
Install our Python, Node.js, or Go SDK. Submit audio/video files via upload or public URL. The API returns a job ID for tracking the asynchronous transcription process.
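If you want to see what a submission might look like without the SDK, here is a minimal sketch using only the Python standard library. The base URL, the `/transcriptions` path, the `media_url` and `webhook_url` fields, the bearer-token header, and the `job_id` response field are all assumptions made for illustration; check the API reference for the real names.

```python
import json
import urllib.request

# Hypothetical base URL and path; the real ones are in the API reference.
API_BASE = "https://api.example.com/v1"


def build_transcription_request(media_url, api_key, webhook_url=None):
    """Build (but do not send) a POST request that submits a public media URL."""
    payload = {"media_url": media_url}  # field name is an assumption
    if webhook_url:
        payload["webhook_url"] = webhook_url  # field name is an assumption
    return urllib.request.Request(
        f"{API_BASE}/transcriptions",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",  # auth scheme is an assumption
            "Content-Type": "application/json",
        },
        method="POST",
    )


def submit(request):
    """Send the request and return the job ID from the JSON response."""
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.load(response)["job_id"]  # response field is an assumption
```

Keeping request construction separate from sending makes it easy to inspect or log exactly what would go over the wire before you use a production key.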

Step 3
Get notified via webhook when transcription completes, or poll the status endpoint. The response includes the full transcript, timestamps, speaker labels, and confidence scores.
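If you choose polling over webhooks, a loop along these lines is one way to wait for a job. It is a sketch, not the SDK's behavior: `fetch_status` is a function you supply to call the status endpoint, and the `"completed"` and `"failed"` state names are assumptions.

```python
import time


def wait_for_job(fetch_status, job_id, interval=5.0, timeout=600.0, sleep=time.sleep):
    """Poll until the job reaches a terminal state, then return the final payload.

    `fetch_status(job_id)` is a caller-supplied function returning a dict with
    a "status" key. The state names checked below are assumptions.
    """
    deadline = time.monotonic() + timeout
    while True:
        job = fetch_status(job_id)
        if job["status"] in ("completed", "failed"):
            return job
        if time.monotonic() >= deadline:
            raise TimeoutError(f"job {job_id} still {job['status']} after {timeout}s")
        sleep(interval)
```

A fixed interval keeps the example short; for long media files, a growing interval reduces the number of requests that count against your rate limit.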

Step 4
Download generated subtitles as SRT, VTT, JSON, or plain text via the results endpoint. Apply translations, styling, or custom formatting through additional API parameters.
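The results endpoint returns ready-made SRT, but if you request JSON and want to render subtitles yourself, the conversion is short. This sketch assumes each segment is a `(start_seconds, end_seconds, text)` tuple; the actual JSON shape returned by the API may differ.

```python
def srt_timestamp(seconds):
    """Format a time in seconds as an SRT timestamp: HH:MM:SS,mmm."""
    total_ms = round(seconds * 1000)
    hours, rest = divmod(total_ms, 3_600_000)
    minutes, rest = divmod(rest, 60_000)
    secs, ms = divmod(rest, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def to_srt(segments):
    """Render (start, end, text) tuples as an SRT document.

    Each cue is a 1-based index, a time range, and the text; cues are
    separated by a blank line.
    """
    cues = []
    for index, (start, end, text) in enumerate(segments, start=1):
        cues.append(f"{index}\n{srt_timestamp(start)} --> {srt_timestamp(end)}\n{text}\n")
    return "\n".join(cues)
```

For example, `to_srt([(0.0, 1.5, "Hello")])` yields a single cue running from `00:00:00,000` to `00:00:01,500`.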
Use Cases
Auto-generate subtitles for every video uploaded to your platform. Trigger transcription on upload events and attach results to video metadata without manual intervention.
Ensure every course video has accurate captions by integrating caption generation into your content pipeline. Meet accessibility compliance automatically at scale.
Add transcription to your digital asset management pipeline. Index video content by transcript for searchability and auto-generate metadata for content libraries.
Ship captioning features in your product without building ML infrastructure. Our API handles the heavy lifting while you focus on your core product experience.
FAQ
Free tier: 100 requests/minute, 10 concurrent jobs. Pro tier: 1,000 requests/minute, 50 concurrent jobs. Enterprise: 10,000+ requests/minute with custom concurrent job limits and dedicated infrastructure.
We provide official SDKs for Python (pip install capsai), Node.js (npm install @capsai/sdk), and Go (go get github.com/capsai/capsai-go). Community SDKs exist for Ruby, PHP, and Java.
Register a webhook URL via the API or dashboard. When a transcription job completes or fails, we send a signed POST request to your URL with the job results. Failed deliveries are retried with exponential backoff for up to 24 hours.
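Because webhook requests are signed, your handler should verify the signature before trusting the payload. The signing scheme is not specified here, so this sketch assumes a common one: an HMAC-SHA256 of the raw request body, keyed with a shared secret and sent as a hex string. The function names are hypothetical; confirm the real algorithm and header in the webhook documentation.

```python
import hashlib
import hmac


def sign(body: bytes, secret: bytes) -> str:
    """Compute a hex HMAC-SHA256 of the raw body (assumed signing scheme)."""
    return hmac.new(secret, body, hashlib.sha256).hexdigest()


def is_valid_signature(body: bytes, received_signature: str, secret: bytes) -> bool:
    """Return True if the received signature matches the one we compute.

    compare_digest runs in constant time, which avoids leaking how many
    leading characters of the signature were correct.
    """
    expected = sign(body, secret)
    return hmac.compare_digest(expected.encode("utf-8"), received_signature.encode("utf-8"))
```

Verify against the raw bytes of the request body, not a re-serialized copy of the parsed JSON, since re-serialization can change whitespace or key order and break the comparison.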
Pricing is per minute of audio processed. The free tier includes 60 minutes/month. Pro starts at $0.006/minute with volume discounts. Enterprise gets custom pricing with committed-use contracts and SLA guarantees.
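As a rough budgeting aid, the Pro list price can be turned into an estimate like the one below. It deliberately ignores volume discounts and any per-file rounding, since neither is specified here, so treat the result as an upper bound at list price; `estimate_pro_cost` is a hypothetical helper.

```python
from decimal import Decimal

# Pro list price per minute of audio, before any volume discount.
PRO_RATE_PER_MINUTE = Decimal("0.006")


def estimate_pro_cost(audio_minutes):
    """Return the undiscounted Pro cost in dollars, rounded to cents."""
    cost = Decimal(str(audio_minutes)) * PRO_RATE_PER_MINUTE
    return cost.quantize(Decimal("0.01"))
```

At this rate, 10,000 minutes of audio comes to $60.00 before discounts.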
Our REST API is optimized for file-based (asynchronous) transcription. For real-time streaming transcription, use our WebSocket API endpoint, which delivers partial results with sub-second latency as speech is detected.
Get your free API key and start generating subtitles programmatically in minutes. Official SDKs for Python, Node.js, and Go, a 99.9% uptime SLA, and enterprise support are available.