Question 1

Does my audio get uploaded to a server?

Accepted Answer

No. Transcription runs entirely in your browser using Whisper, an open-source speech recognition model from OpenAI. The audio never leaves your device. The Whisper model files download once from Hugging Face and are then cached locally — subsequent uses are fully offline.

Question 2

How large is the model download?

Accepted Answer

The English Fast model (whisper-tiny.en) is a download of approximately 39 MB on first use. The Multilingual model (whisper-base) is approximately 73 MB, and the High Accuracy model (whisper-small) is approximately 237 MB. All sizes are for the default quantization (q8). After the first download, the model is cached in your browser and no further network traffic occurs.
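One way to sanity-check these figures is to assume that q8 stores roughly one byte per weight, so the file size in MB lands close to the parameter count in millions. The sketch below uses the parameter counts published for the Whisper checkpoints (39M, 74M, 244M). The helper name `estimateQ8Megabytes` is made up for illustration, and real files differ somewhat because of format overhead and tensors that are not quantized.

```typescript
// Published parameter counts for the Whisper checkpoints named above.
const WHISPER_PARAMS: Record<string, number> = {
  "whisper-tiny.en": 39_000_000,
  "whisper-base": 74_000_000,
  "whisper-small": 244_000_000,
};

// Assumption: q8 stores about one byte per weight; file overhead is ignored.
// Hypothetical helper, not part of any library.
function estimateQ8Megabytes(paramCount: number, bytesPerWeight: number = 1): number {
  return (paramCount * bytesPerWeight) / 1_000_000;
}

for (const model of Object.keys(WHISPER_PARAMS)) {
  const megabytes = estimateQ8Megabytes(WHISPER_PARAMS[model]);
  console.log(`${model}: roughly ${megabytes.toFixed(0)} MB at q8`);
}
```

The estimates come out in the same range as the download sizes listed above, which is all this rough model is meant to show.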

Question 3

Which model should I use for English speech?

Accepted Answer

English Fast (whisper-tiny.en) is the right starting point for most English content: meetings, interviews, lectures, and podcasts. It is the smallest model and runs fastest. Switch to High Accuracy (whisper-small) if you need better results on technical jargon, heavy accents, or noisy audio. The Multilingual model (whisper-base) is intended for non-English audio.
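The guidance above can be written down as a small decision function. This is only a sketch of the rule of thumb in this answer, not the tool's actual logic. The `AudioTraits` shape and the `suggestModel` name are hypothetical.

```typescript
type ModelId = "whisper-tiny.en" | "whisper-base" | "whisper-small";

// Hypothetical description of a recording, for illustration only.
interface AudioTraits {
  english: boolean;
  // True for technical jargon, heavy accents, or noisy audio.
  difficult: boolean;
}

function suggestModel(traits: AudioTraits): ModelId {
  // Non-English audio goes to the Multilingual model, as the answer suggests.
  if (!traits.english) {
    return "whisper-base";
  }
  // English audio starts on the fast model and moves up only when it is hard.
  return traits.difficult ? "whisper-small" : "whisper-tiny.en";
}

console.log(suggestModel({ english: true, difficult: false }));
```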

Question 4

How fast is transcription?

Accepted Answer

Speed depends on your hardware, the model you choose, and which backend the browser uses. On devices with a compatible GPU, the WebGPU backend is used and transcription is significantly faster than on CPU. Without WebGPU, the WebAssembly backend runs on the CPU. As a rough frame of reference, a device running in CPU mode might process a 10-minute file in anywhere from 2 to 15 minutes, depending on the processor. These figures are estimates, not guarantees.
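As a rough illustration, backend selection and the CPU time range might be sketched as follows. This is not the tool's code. `NavigatorLike`, `pickBackend`, and `estimateCpuMinutes` are hypothetical names, and the estimate assumes processing time scales linearly with audio length, which the answer above does not promise.

```typescript
type Backend = "webgpu" | "wasm";

// Structural stand-in for the browser's navigator object,
// so the function also runs outside a browser.
interface NavigatorLike {
  gpu?: unknown;
}

// A present navigator.gpu only shows that the WebGPU API is exposed.
// requestAdapter() can still resolve to null, so a real page would check that too.
function pickBackend(nav: NavigatorLike): Backend {
  return nav.gpu !== undefined && nav.gpu !== null ? "webgpu" : "wasm";
}

// CPU range quoted above: a 10-minute file takes 2 to 15 minutes,
// which is 0.2x to 1.5x the audio duration.
const CPU_FACTOR = { best: 2 / 10, worst: 15 / 10 };

// Assumption: time grows linearly with audio length.
function estimateCpuMinutes(audioMinutes: number): { best: number; worst: number } {
  return {
    best: audioMinutes * CPU_FACTOR.best,
    worst: audioMinutes * CPU_FACTOR.worst,
  };
}

console.log(pickBackend({}), estimateCpuMinutes(60));
```

Under these assumptions, a one-hour recording on the CPU backend would take somewhere between about 12 and 90 minutes.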

Question 5

What is the practical limit on file length?

Accepted Answer

There is no hard limit imposed by this tool. In practice, longer files require more RAM to hold the decoded PCM in memory and more processing time. Files up to about 2 hours are generally manageable on modern devices. For very long recordings, consider extracting the audio track first (use the Extract Audio tool) to reduce the file size before transcribing.
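To get a feel for the RAM involved, the arithmetic below assumes the audio is held as 16 kHz mono 32-bit float samples, which is the input format Whisper expects. Whether this tool stores the PCM exactly that way is an assumption, and a browser may briefly hold a larger buffer while decoding at the file's native sample rate.

```typescript
// Assumption: decoded audio is 16 kHz, mono, 32-bit float (4 bytes per sample).
const SAMPLE_RATE_HZ = 16_000;
const BYTES_PER_SAMPLE = 4;

function pcmBytes(durationSeconds: number): number {
  return durationSeconds * SAMPLE_RATE_HZ * BYTES_PER_SAMPLE;
}

const twoHoursInSeconds = 2 * 60 * 60;
const megabytes = pcmBytes(twoHoursInSeconds) / 1_000_000;
console.log(`${megabytes.toFixed(1)} MB`); // prints "460.8 MB"
```

That works out to 64 KB per second of audio, so a 2-hour recording needs roughly 460 MB for the samples alone, before counting the model itself.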

Transcribe Audio

On-device audio transcription — free, private, no upload
