ScreenCaptureKit · macOS 14.2+
Both sides of the call.
Both clean.
Mic plus system audio captured side-by-side, mixed at the source, diarized into one transcript. No bot in the meeting. No virtual audio cable. No QuickTime gymnastics.
AVAudioEngine · 16 kHz mono · zero added latency
ScreenCaptureKit · audio-only filter · OS-level tap
The old way
A bot in your meeting.
Cloud notetakers join the call as a participant, upload your audio to a third-party server, and bill you monthly. Your customers see the bot. Your compliance team sees a vendor list. Your finance team sees a recurring charge.
The Mac Note Taker way
A tap on macOS.
ScreenCaptureKit, Apple's capture framework, gained system-audio capture in macOS 13. It lets a signed Mac app subscribe to the system audio stream. We mix that with your mic locally, run on-device ASR + diarization, and you get a finished, named-speaker transcript when the call ends. Nothing leaves the Mac.
Tested with
Anything that plays audio through your Mac is in scope. ScreenCaptureKit doesn't care about the meeting client - it taps the OS mixer.
Pipeline
Two streams in. One transcript out.
- 01
Permissions, once
On first launch, Mac Note Taker asks for Microphone and Screen Recording permission. macOS handles the prompts. We don't store any extra data because of them; the permissions just unlock the capture APIs.
- 02
Two captures, parallel
AVAudioEngine taps the input device for your mic. ScreenCaptureKit subscribes to a system-audio-only stream (no display recording, no video frames). Both stream into the same on-device buffer.
- 03
VAD splits speech
FluidAudio's voice-activity detector cuts each stream into speech segments. Silence and music get dropped before they hit the heavier models.
- 04
ASR + diarization, on the Neural Engine
Parakeet TDT v3 transcribes; pyannote-segmentation-3.0 + CAM++ split the segments into speaker turns. All on Apple Neural Engine. Real-time on M1 and newer.
- 05
Merge by timestamp
The diarized turns from both streams land in one timeline. Your mic stream is force-labeled "You"; remote speakers get matched against the cross-meeting voice-fingerprint database.
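For the curious, the merge in step 05 can be pictured with a minimal sketch. This is an illustration under stated assumptions, not Mac Note Taker's actual code: the `Turn` record, the `merge_turns` function, and the sample speaker name "Dana" are all hypothetical, and the remote turns are assumed to arrive already matched to names.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Turn:
    """One diarized speaker turn (hypothetical record, for illustration only)."""
    start: float   # seconds from the start of the recording
    end: float
    speaker: str
    text: str


def merge_turns(mic_turns: List[Turn], system_turns: List[Turn]) -> List[Turn]:
    """Merge two diarized streams into one timeline ordered by start time."""
    # Every turn from the mic stream is relabeled "You",
    # regardless of the label the diarizer assigned.
    relabeled = [Turn(t.start, t.end, "You", t.text) for t in mic_turns]
    # sorted() is stable, so if two turns share a start time,
    # the mic turn stays ahead of the system-audio turn.
    return sorted(relabeled + system_turns, key=lambda t: t.start)


# Sample data: the mic stream with a raw diarizer label,
# and the system-audio stream with an already-matched remote speaker.
mic = [
    Turn(0.0, 2.1, "spk_0", "Hi, can you hear me?"),
    Turn(5.0, 7.4, "spk_0", "Great, let's start."),
]
remote = [Turn(2.3, 4.8, "Dana", "Yes, loud and clear.")]

for turn in merge_turns(mic, remote):
    print(f"[{turn.start:5.1f}s] {turn.speaker}: {turn.text}")
```

Sorting on start time alone keeps overlapping speech from both sides in the transcript; a real implementation might also split or flag overlapping turns.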
No bot. No upload. No subscription.
$149 lifetime, three Macs. First 100 buyers pay $79 with code FOUNDER.
Common questions
Do I need BlackHole or a virtual audio device?
No. ScreenCaptureKit is the supported way to capture system audio on macOS 13+. Virtual cables were a workaround from before macOS offered a system-audio capture API, and they are no longer needed.
Will the meeting client see anything weird?
No. ScreenCaptureKit reads at the OS mixer level. The meeting client doesn't see a tap or a virtual device added.
Does it work for FaceTime?
Yes. Same path as Zoom - system audio is system audio.
What about meetings on iPhone, with the Mac taking notes?
Route the iPhone's audio to the Mac via Continuity / AirPlay or take the call on the Mac. ScreenCaptureKit can't see another device.
Battery hit?
About 6-8% of charge for a one-hour meeting on an M3 Pro. That's less than the Zoom client itself uses.