Voice models

Pick the engine
per meeting.

Mac Note Taker ships with Parakeet TDT v3 by default. Need 99-language support? Switch to Whisper Large v3. Want Whisper accuracy at 2-3× the speed? Distil-Large. On a 8 GB Mac? Small.en. One click in Settings → Models. Each model downloads once, caches locally, and runs entirely on the Apple Neural Engine.

Get lifetime - $149 Pipeline architecture →

All five models

Speed vs accuracy, your call.

Parakeet TDT v3
ParakeetDefault
FluidAudio's CoreML port of NVIDIA Parakeet TDT 0.6B. Streaming-friendly. Lowest end-to-end latency on M1+. Tight on battery.
Use when The default. Best balance for daily standups and 1:1s.
Disk
250 MB
Peak RAM
~500 MB
Throughput
~17× real-time on M2 Max
Languages
25 EU languages
Whisper Large v3
WhisperBest accuracy
OpenAI's flagship multilingual model. Highest accuracy on accented speech and noisy rooms. Heavier RAM and slower on Intel.
Use when High-stakes meetings: legal calls, customer interviews, multilingual standups.
Disk
1550 MB
Peak RAM
~2200 MB
Throughput
~1× real-time on M2 Pro
Languages
99 languages
Whisper Distil-Large v3
WhisperSweet spot
Knowledge-distilled from Large v3. Same English accuracy at 2-3× the speed. Recommended Whisper variant for most users.
Use when When you want Whisper accuracy without the Large RAM tax.
Disk
760 MB
Peak RAM
~1100 MB
Throughput
~3× real-time on M1
Languages
English (distilled)
Whisper Small.en
WhisperLightweight
OpenAI's compact English model. Fits comfortably alongside other apps. Best fit for older Macs and battery-sensitive workflows.
Use when 8 GB Macs, long meetings on battery, older M1 hardware.
Disk
480 MB
Peak RAM
~700 MB
Throughput
~5× real-time on M1
Languages
English
Parakeet TDT v2
ParakeetLegacy
Older Parakeet build kept around for hardware regressions. Slightly slower than v3. Use only if v3 misbehaves on your machine.
Use when Fallback only.
Disk
240 MB
Peak RAM
~480 MB
Throughput
~10× real-time on M1
Languages
English

How switching works.

Open Settings → Models. Pick a row. The first time you select a Whisper variant the app pulls the model from HuggingFace (Parakeet pulls from NVIDIA's CoreML cache). Every subsequent launch loads from disk - no network round-trip, no API key, no SaaS account.

Diarization (pyannote-segmentation-3.0 + CAM++) runs alongside whatever transcription model you pick. Speaker turns, voice fingerprints, and cross-meeting recognition stay identical regardless of which engine produced the words.

Live in-app stats panel shows current memory and CPU per model so you can see exactly what each choice costs.

Open Settings → Models

Voice model picker lists Parakeet v3 (active by default), Parakeet v2, Whisper Large v3, Distil-Large v3, Small.en.

Click 'Use'

First click downloads the model with a real progress bar. Subsequent clicks switch instantly.

Records as usual

Auto-detect, mic + system audio, diarization, AI summary - all flow through the model you picked.

Re-transcribe any meeting

Right-click any past meeting → Re-transcribe. Same audio, new model output. Compare and keep the better one.

FAQ

Common questions.

+ Do all models run on-device?

Yes. Parakeet runs via FluidAudio's CoreML port. Whisper variants run via WhisperKit's CoreML port. No audio leaves the Mac. The only network call is the daily license-validate ping.

+ Which model is most accurate?

Whisper Large v3 has the highest raw accuracy across 99 languages and is the recommendation for legal/customer interviews. Distil-Large v3 matches it on English at 2-3× the speed. Parakeet TDT v3 is competitive on English and far faster end-to-end - a better default for daily meetings.

+ Can I run multiple models in parallel?

Not yet. The picker activates one model at a time. You can re-transcribe a past meeting with a different model to compare outputs side-by-side.

+ How much disk space?

Parakeet v3: 250 MB. Whisper Large v3: 1.55 GB. Whisper Distil-Large v3: 760 MB. Whisper Small.en: 480 MB. Parakeet v2: 240 MB. Models are cached under ~/Documents/huggingface/ (Whisper) and ~/Library/Application Support/FluidAudio/ (Parakeet). Delete a folder to free space.

+ Will Whisper run on Intel Macs?

Mac Note Taker requires Apple Silicon (M1+). Whisper Large is borderline real-time on M2 Pro and slower on M1; pick Distil-Large or Small.en on older hardware.

+ What about diarization quality across models?

Diarization is the same regardless of which transcription model you pick - it runs separately via pyannote-segmentation-3.0 + CAM++ embeddings. Speaker labels and voice fingerprints stay consistent.

Five models. Zero subscription.

$149 lifetime · 3 Macs · code FOUDNER for $79.

Pick the engineper meeting.

Speed vs accuracy, your call.

Parakeet TDT v3

Whisper Large v3

Whisper Distil-Large v3

Whisper Small.en

Parakeet TDT v2

How switching works.

Common questions.

Five models. Zero subscription.

Pick the engine
per meeting.