Import audio · generate summary
Drop a file.
Get the summary.
Already have an audio file from a meeting, an interview, a podcast, or a phone call? Drag it into Mac Note Taker and we hand back a diarized transcript, a 5-bullet summary, action items, and chapter timestamps. All on-device.
Drag audio here
customer-call-may-23.m4a
42 min · 38 MB · 2 speakers
- Transcribe (Parakeet TDT v3) · 1m 24s
- Diarize + speaker re-id · 18s
- Generate summary + actions + chapters · 11s · Ollama
Total ~1m 53s for a 42-min file on M3 Pro.
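For anyone checking the math, here is a small sketch of how the stage timings add up. It uses only the figures from the example above; the variable and function names are illustrative.

```python
# Stage durations from the example above, in seconds.
stages = {
    "transcribe": 1 * 60 + 24,  # 1m 24s
    "diarize": 18,
    "summarize": 11,
}
audio_seconds = 42 * 60  # the 42-minute example file

total = sum(stages.values())
speedup = audio_seconds / total  # how many times faster than real time


def fmt(seconds: int) -> str:
    """Format a whole number of seconds as 'Xm YYs'."""
    minutes, secs = divmod(seconds, 60)
    return f"{minutes}m {secs:02d}s"


print(fmt(total))         # 1m 53s
print(round(speedup, 1))  # 22.3
```

In other words, the whole pipeline in this example runs at roughly 22x real time.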
Four steps
Same pipeline as a live recording.
Whether the audio came from a live call or a file you dragged in, the rest of the pipeline is identical. Same ASR, same diarization, same summary prompt, same export targets.
- 01
Drop or import
Choose File → Import audio, or drop the file onto the app window. We support every format AVFoundation can decode: WAV, M4A, MP3, AAC, MP4, MOV, FLAC, OGG, AIFF.
- 02
Transcribe on-device
Parakeet TDT v3 transcribes by default. A 30-minute file finishes in roughly 90 seconds on an M3 Pro. Switch to Whisper Large v3 in Settings for harder audio or non-English content.
- 03
Diarize + summarize
pyannote-segmentation-3.0 splits the audio into speaker turns. CAM++ embeddings match against your voice profile library. An LLM (Ollama or BYO key) writes the 5-bullet summary, the action items, and the chapters.
- 04
Export
Markdown, JSON, SRT, plain text. Or land action items straight in Linear / Notion. The transcript is the audit trail; the summary is the draft.
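To make the SRT export concrete, here is a minimal sketch of how diarized speaker turns could be written as SRT cues. The `Segment` fields and the "Speaker: text" prefix are assumptions for illustration, not the app's actual export schema; only the SRT cue layout (index, `HH:MM:SS,mmm --> HH:MM:SS,mmm`, text, blank line) is standard.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    """One diarized speaker turn (hypothetical shape)."""
    speaker: str
    start: float  # seconds from the start of the recording
    end: float
    text: str


def srt_timestamp(seconds: float) -> str:
    """Convert seconds to the SRT timestamp form HH:MM:SS,mmm."""
    ms = round(seconds * 1000)
    hours, ms = divmod(ms, 3_600_000)
    minutes, ms = divmod(ms, 60_000)
    secs, ms = divmod(ms, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def to_srt(segments: list[Segment]) -> str:
    """Render segments as numbered SRT cues separated by blank lines."""
    blocks = []
    for index, seg in enumerate(segments, start=1):
        blocks.append(
            f"{index}\n"
            f"{srt_timestamp(seg.start)} --> {srt_timestamp(seg.end)}\n"
            f"{seg.speaker}: {seg.text}\n"
        )
    return "\n".join(blocks)


# Made-up sample turns, for illustration only.
sample = [
    Segment("Speaker 1", 0.0, 4.2, "Thanks for joining."),
    Segment("Speaker 2", 4.2, 83.5, "Happy to be here."),
]
print(to_srt(sample))
```

Keeping the speaker label inside the cue text means any subtitle player can show who is talking without needing a custom field.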
What can you actually drop in?
Voice Memos exports
Recordings from iPhone Voice Memos import as .m4a files. Diarization can sometimes separate two speakers recorded on a single microphone; quality depends on how far each speaker was from it.
Zoom local recordings
Zoom's local recording feature drops a .m4a per call. Drop it in and skip the bot - get a real diarized summary.
Microsoft Teams downloads
Recordings exported from Teams come as .mp4. We extract the audio, transcribe it, and ignore the video track.
Podcast / interview recordings
Two-mic + room interview recordings diarize cleanly. The voice fingerprint library remembers the host, so re-imports label them automatically.
Phone call recordings
Voicemail, conference calls, support call recordings: any audio file Apple's AVFoundation can decode.
Old archives
A folder of meetings from last year? Batch import, generate summaries, build the searchable archive you wish you had at the time.
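If you want to see what a batch import would pick up before you drag a folder in, a rough sketch like the one below can list the candidates. The extension set mirrors the formats named in the import step; filtering by extension is only an approximation (whether a file actually decodes is up to AVFoundation), and the function names are hypothetical.

```python
from pathlib import Path

# Extensions for the formats listed in the import step above.
AUDIO_EXTENSIONS = {
    ".wav", ".m4a", ".mp3", ".aac", ".mp4", ".mov", ".flac", ".ogg", ".aiff",
}


def is_importable(name: str) -> bool:
    """True if the file name has one of the listed extensions (case-insensitive)."""
    return Path(name).suffix.lower() in AUDIO_EXTENSIONS


def find_importable(folder: str) -> list[Path]:
    """Recursively collect candidate files under a folder, sorted by path."""
    root = Path(folder)
    return sorted(
        p for p in root.rglob("*") if p.is_file() and is_importable(p.name)
    )


# Usage (path is a placeholder):
# for path in find_importable("/path/to/last-years-meetings"):
#     print(path)
```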
Why not a free online summarizer?
The web is full of free audio-to-summary tools. They are useful for ten-minute YouTube clips. They become a problem the moment the audio is sensitive.
| Aspect | Mac Note Taker | Free online summarizer |
|---|---|---|
| Where audio goes | Stays on your Mac | Uploaded to vendor |
| Length cap | None (limited only by disk space) | 10-30 min on free tier |
| Diarization | On-device, named speakers | Often none or per-call IDs |
| Re-id across files | Yes (voice fingerprint) | No |
| AI provider | Ollama or your own key | Vendor cloud |
| Action items | Owner / task / timestamp | Bullet list |
| Export | MD / JSON / SRT / Linear / Notion | Copy-paste |
| Cost | $149 once | Free until paywall |
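
To illustrate the difference between a structured action item and a plain bullet, here is a minimal sketch of an owner / task / timestamp record rendered as Markdown and JSON. The field names, the checkbox style, and the sample item are assumptions for illustration; the app's real export schema may differ.

```python
import json
from dataclasses import asdict, dataclass


@dataclass
class ActionItem:
    """One action item (hypothetical shape based on the table row above)."""
    owner: str
    task: str
    timestamp: str  # position in the recording, e.g. "00:12:40"


def to_markdown(items: list[ActionItem]) -> str:
    """Render items as a Markdown task list."""
    return "\n".join(
        f"- [ ] {item.owner}: {item.task} ({item.timestamp})" for item in items
    )


def to_json(items: list[ActionItem]) -> str:
    """Render items as a JSON array of objects."""
    return json.dumps([asdict(item) for item in items], indent=2)


# Made-up sample item, for illustration only.
items = [ActionItem(owner="Speaker 1", task="Send the revised quote", timestamp="00:12:40")]
print(to_markdown(items))
print(to_json(items))
```

Because each item carries a timestamp, you can jump back to the exact point in the transcript where the commitment was made.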
The file never leaves your Mac.
We do not upload audio. We do not send the transcript to our cloud. If you select OpenAI for the summary step, the transcript goes directly from your Mac to OpenAI under your own API key; we don't proxy it. If you select Ollama, nothing leaves your Mac at all.
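For the curious, this is roughly what a local-only summary request looks like. The sketch below targets Ollama's default local endpoint (`http://localhost:11434/api/generate`), so the transcript never goes further than your own machine. The model name and the prompt wording are assumptions, not the prompt the app actually uses.

```python
import json
import urllib.request

# Ollama listens on localhost:11434 by default.
OLLAMA_URL = "http://localhost:11434/api/generate"


def build_summary_request(transcript: str, model: str = "llama3.1") -> urllib.request.Request:
    """Build (but do not send) a POST request to the local Ollama server.

    The model name and prompt are placeholders for illustration.
    """
    prompt = (
        "Summarize the following transcript in 5 bullets, "
        "then list action items.\n\n" + transcript
    )
    body = json.dumps({"model": model, "prompt": prompt, "stream": False})
    return urllib.request.Request(
        OLLAMA_URL,
        data=body.encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def summarize(transcript: str, model: str = "llama3.1") -> str:
    """Send the request; needs a running Ollama server with the model pulled."""
    with urllib.request.urlopen(build_summary_request(transcript, model)) as resp:
        return json.loads(resp.read())["response"]
```

The only network address in the request is `localhost`, which is the whole point: with Ollama selected, there is no remote host involved.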
Got an audio file? Get a summary.
$149 lifetime · 3 Macs · code FOUNDER for $79.