TranscriptAI is a free, open-source audio transcription tool that uses OpenAI's Whisper model for speech-to-text conversion. Whether you need to transcribe meetings, interviews, podcasts, or other audio content, TranscriptAI provides two interfaces:
- Web Interface: Browser-based tool for quick transcriptions with drag-and-drop functionality
- CLI Tool: Command-line interface for developers and power users requiring batch processing and automation
TranscriptAI supports 99+ languages with automatic language detection and translation to English. It accepts MP3, WAV, M4A, FLAC, OGG, and WebM in both interfaces, plus AAC in the CLI only.
Web Interface
Browser-based transcription with drag & drop
CLI Tool
Command-line interface for batch processing
AI-Powered
OpenAI Whisper for highest accuracy
Privacy-First
Your API key is sent only to OpenAI, never to our servers
🚀 Quick Start
🌐 Web Interface (Easiest)
- Visit the web app
- Upload your audio file
- Choose demo mode or add your OpenAI API key
- Get instant transcription results
💻 CLI Installation
# Clone the repository
git clone https://github.com/ombharatiya/transcript-ai.git
cd transcript-ai/cli
# Run setup script
python setup.py
# Start transcribing
source venv/bin/activate
python src/audio_transcriber.py input/audio.mp3
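For batch work, the single-file command above can be wrapped in a small script. The following is a minimal sketch, not part of TranscriptAI itself: `build_commands` and `run_batch` are hypothetical helper names, and it assumes the script is run from the `cli` directory with the virtual environment active and that the transcriber exits with a non-zero status on failure.

```python
import subprocess
import sys
from pathlib import Path

# Extensions the CLI is documented to accept (AAC is converted via FFmpeg).
AUDIO_EXTENSIONS = {".mp3", ".wav", ".m4a", ".flac", ".ogg", ".webm", ".aac"}


def build_commands(input_dir, script="src/audio_transcriber.py"):
    """Return one command list per audio file in input_dir, sorted by name."""
    files = sorted(
        p for p in Path(input_dir).iterdir()
        if p.is_file() and p.suffix.lower() in AUDIO_EXTENSIONS
    )
    # sys.executable is the interpreter running this script, so the
    # activated virtual environment is reused for every file.
    return [[sys.executable, script, str(p)] for p in files]


def run_batch(input_dir):
    """Run the transcriber once per file and return the paths that failed."""
    failed = []
    for cmd in build_commands(input_dir):
        result = subprocess.run(cmd)
        # Assumption: a non-zero exit status signals a failed transcription.
        if result.returncode != 0:
            failed.append(cmd[-1])
    return failed
```

Calling `run_batch("input")` would process every supported file in the `input` folder one at a time and return the list of files that need another look.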
🌐 Web Interface Guide
Two Modes Available
Demo Mode
Perfect for testing
- No API key required
- Uses sample text responses
- Test all UI features
- Validate audio file formats
Real AI Mode
Actual transcription
- Requires OpenAI API key
- Real Whisper AI transcription
- Support for all languages
- Translation capabilities
Using Your OpenAI API Key
Security Guarantee
- ✅ Stored only in browser memory
- ✅ Never sent to our servers
- ✅ Direct communication with OpenAI
- ✅ Cleared when you close the tab
Get your API key: OpenAI Platform
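To make "direct communication with OpenAI" concrete, here is a rough sketch of the kind of request involved, written in Python with only the standard library. It is an illustration, not the web app's actual code (which runs in the browser): `build_request` is a hypothetical helper, and the request is only constructed here, not sent.

```python
import urllib.request
import uuid
from pathlib import Path

# OpenAI's speech-to-text endpoint; the key travels only in this request.
API_URL = "https://api.openai.com/v1/audio/transcriptions"


def build_request(audio_path, api_key, model="whisper-1"):
    """Build (but do not send) a multipart upload of one audio file."""
    path = Path(audio_path)
    boundary = uuid.uuid4().hex  # random separator between form fields
    model_part = (
        f"--{boundary}\r\n"
        'Content-Disposition: form-data; name="model"\r\n\r\n'
        f"{model}\r\n"
    ).encode()
    file_header = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="file"; filename="{path.name}"\r\n'
        "Content-Type: application/octet-stream\r\n\r\n"
    ).encode()
    closing = f"\r\n--{boundary}--\r\n".encode()
    body = model_part + file_header + path.read_bytes() + closing
    return urllib.request.Request(
        API_URL,
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": f"multipart/form-data; boundary={boundary}",
        },
    )
```

Passing the result to `urllib.request.urlopen` would send the file to OpenAI, whose default JSON response carries the transcript in a `text` field. In a script, reading the key from an environment variable rather than hard-coding it keeps it out of source control.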
✨ Features
Audio Format Support
Web Interface: MP3, WAV, M4A, FLAC, OGG, WebM
CLI Tool: All above + AAC (converted via FFmpeg)
AI Models
- Whisper-1: Web interface (OpenAI API)
- Tiny/Base/Small/Medium/Large: CLI tool options
Languages
99+ languages supported with automatic detection
Built-in translation to English
🔧 Troubleshooting
Common Issues
Web Interface
- Invalid API key: Check that your OpenAI key starts with 'sk-'
- File format error: Use a supported format (AAC works only in the CLI)
- Rate limit: Wait a few minutes between requests
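If you call the OpenAI API from your own scripts, waiting by hand can be replaced with automatic retries. The sketch below uses exponential backoff with jitter, which is a common general technique and not something TranscriptAI is documented to do. `RateLimitError` and `with_backoff` are hypothetical names for illustration; substitute whatever error your HTTP client raises for a 429 response.

```python
import random
import time


class RateLimitError(Exception):
    """Hypothetical stand-in for the error that signals a rate limit."""


def with_backoff(call, max_attempts=5, base_delay=1.0, sleep=time.sleep):
    """Run call(), retrying on RateLimitError with growing delays."""
    for attempt in range(max_attempts):
        try:
            return call()
        except RateLimitError:
            if attempt == max_attempts - 1:
                raise  # out of attempts, let the caller see the error
            # Double the wait each time and add jitter so that several
            # clients do not all retry at the same moment.
            delay = base_delay * (2 ** attempt) + random.uniform(0, base_delay)
            sleep(delay)
```

The `sleep` parameter exists so the function can be exercised without real waiting, for example by passing `delays.append` to record the computed delays.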
CLI Tool
- FFmpeg not found: Run the setup script or install manually
- Out of memory: Use smaller model (tiny/base)
- Permission denied: Check file permissions
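The out-of-memory advice can be turned into a simple rule of thumb. This sketch picks the largest model that fits a given memory budget; `pick_model` is a hypothetical helper, and the figures are the approximate memory requirements listed in the open-source Whisper README, so treat them as rough guidance and not as guarantees for this CLI.

```python
# Approximate memory needs in GB, largest model first. These follow the
# figures published in the open-source Whisper README and are estimates.
MODEL_MEMORY_GB = [
    ("large", 10.0),
    ("medium", 5.0),
    ("small", 2.0),
    ("base", 1.0),
    ("tiny", 1.0),
]


def pick_model(available_gb):
    """Return the largest model whose estimated need fits available_gb."""
    for name, needed_gb in MODEL_MEMORY_GB:
        if available_gb >= needed_gb:
            return name
    # Nothing fits the estimate; fall back to the smallest model.
    return "tiny"
```

For example, a machine with about 4 GB free would be steered to `small`, which matches the advice to drop to a smaller model when memory runs out.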
🎯 Use Cases
Business
- Meeting transcriptions
- Interview documentation
- Customer support analysis
Education
- Lecture notes
- Research interviews
- Language learning
Content Creation
- Podcast show notes
- Video subtitles
- Social media captions
Accessibility
- Hearing accessibility
- Voice disabilities
- Multi-language support
🤝 Contributing
TranscriptAI is open source and welcomes contributions!
Ways to Contribute
- Report bugs: GitHub Issues
- Suggest features: Open a feature request
- Improve documentation: Submit PRs for docs
- Add translations: Help with internationalization
Development Setup
# Fork the repository on GitHub, then clone your fork
git clone https://github.com/yourusername/transcript-ai.git
# Set up CLI development
cd transcript-ai/cli
python setup.py
# Set up web development
cd ../web
npm install # if using build tools
python -m http.server 8000 # serve locally
❓ Frequently Asked Questions
What is TranscriptAI and how does it work?
TranscriptAI is a free, open-source audio transcription tool that uses OpenAI's Whisper model to convert speech to text. The web interface sends your audio to OpenAI's Whisper API, while the CLI tool runs Whisper models locally on your machine. Both handle 99+ languages.
How accurate is TranscriptAI's transcription?
TranscriptAI uses OpenAI's Whisper model, which achieves industry-leading accuracy rates of 95%+ for clear audio in supported languages. Accuracy depends on audio quality, speaker clarity, background noise, and language/accent.
What audio formats does TranscriptAI support?
The web interface supports MP3, WAV, M4A, FLAC, OGG, and WebM formats. The CLI tool supports all these formats plus AAC (converted via FFmpeg). Maximum file size is 25 MB for the web interface.
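These rules are easy to check before uploading. The following is a minimal pre-flight sketch; `check_file` is a hypothetical helper, and it assumes the 25 MB limit means 25 × 1024 × 1024 bytes, which the documentation does not specify.

```python
from pathlib import Path

WEB_FORMATS = {".mp3", ".wav", ".m4a", ".flac", ".ogg", ".webm"}
CLI_FORMATS = WEB_FORMATS | {".aac"}  # the CLI converts AAC via FFmpeg
WEB_MAX_BYTES = 25 * 1024 * 1024  # assumption: 25 MB counted as 25 MiB


def check_file(path, interface="web"):
    """Return a list of problems; an empty list means the file looks fine."""
    p = Path(path)
    problems = []
    formats = WEB_FORMATS if interface == "web" else CLI_FORMATS
    if p.suffix.lower() not in formats:
        problems.append(f"unsupported format: {p.suffix or '(none)'}")
    # The size limit is documented for the web interface only.
    if interface == "web" and p.exists() and p.stat().st_size > WEB_MAX_BYTES:
        problems.append("file exceeds 25 MB web limit")
    return problems
```

An AAC file, for instance, is reported as unsupported for `interface="web"` but passes for `interface="cli"`.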
Is TranscriptAI really free? Are there any hidden costs?
TranscriptAI is completely free and open source. For real AI transcription, you need your own OpenAI API key (which has usage-based pricing from OpenAI). The demo mode and CLI tool (with local models) are entirely free.
How do I install and use the CLI tool?
Clone the repository from GitHub, navigate to the CLI directory, run 'python setup.py' for automatic setup, then use 'python src/audio_transcriber.py [audio_file]' to transcribe. The setup script installs all dependencies including FFmpeg.
What languages are supported for transcription?
TranscriptAI supports 99+ languages including English, Spanish, French, German, Italian, Portuguese, Russian, Japanese, Korean, Chinese, Arabic, Hindi, and many more. It features automatic language detection and can translate any supported language to English.
Is my data secure? Do you store my audio files?
Your data is completely secure. We never store your audio files or API keys. The web interface processes files client-side and sends them directly to OpenAI (when using real mode). The CLI tool processes everything locally on your machine.
Can I use TranscriptAI for commercial purposes?
Yes! TranscriptAI is released under the MIT License, allowing commercial use. However, when using OpenAI's API, you must comply with OpenAI's terms of service for commercial usage.
How do I get an OpenAI API key?
Visit platform.openai.com/api-keys, create an account, and generate a new API key. You'll need to add billing information to your OpenAI account to use the API, but you only pay for actual usage.
What's the difference between demo mode and real AI mode?
Demo mode shows sample transcription text to test the interface without requiring an API key. Real AI mode uses your OpenAI API key to perform actual transcription using the Whisper model, providing accurate results for your audio files.