OpenAI Whisper

Whisper runs OpenAI's speech-to-text model locally on your machine. Fast, accurate transcription across dozens of languages, and audio never leaves your device.

Install:

npx clawhub@latest install openai-whisper

What it does

Transcribes audio files to text using the Whisper model running locally. Supports .mp3, .mp4, .m4a, .wav, .webm, and most other common audio formats.
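
For readers curious what local transcription looks like under the hood, here is a minimal sketch. It assumes the open-source openai-whisper Python package and ffmpeg are installed; the skill itself may wrap Whisper differently. The helper names (is_supported, transcribe) and the "base" default model are illustrative choices, not part of the skill.

```python
from pathlib import Path

# Extensions named in this section; the skill may accept more.
SUPPORTED_EXTENSIONS = {".mp3", ".mp4", ".m4a", ".wav", ".webm"}


def is_supported(path: str) -> bool:
    """Return True if the file extension is one of the listed formats."""
    return Path(path).suffix.lower() in SUPPORTED_EXTENSIONS


def transcribe(path: str, model_name: str = "base") -> str:
    """Transcribe one audio file locally and return the text.

    Assumption: the open-source `openai-whisper` package and ffmpeg
    are installed. This is a sketch, not the skill's own code.
    """
    if not is_supported(path):
        raise ValueError(f"Unsupported audio format: {path}")

    import whisper  # imported lazily so is_supported() works without it

    model = whisper.load_model(model_name)  # fetches weights on first use
    result = model.transcribe(str(Path(path).expanduser()))
    return result["text"].strip()
```

Calling transcribe("~/voice-memo.m4a") would return the transcript as a plain string, which can then be written to a notes file.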

Use case        Example
Meeting notes   Transcribe a Zoom/Meet recording to a structured notes file
Voice memos     Convert voice recordings to searchable text in MEMORY.md
Podcast clips   Pull quotes from an episode without manual listening
Interviews      Transcribe a recorded interview for a written piece
Dictation       Speak your thoughts, get a draft document

Basic usage

Transcribe the audio file at ~/recordings/meeting-2026-05-25.m4a
Transcribe ~/voice-memo.m4a and save the result to ~/notes/2026-05-25-memo.md
Transcribe this meeting recording and format the output as structured notes with: attendees, key decisions, and action items.
~/recordings/team-standup.mp3
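
The structured-notes prompt above asks for three sections. As a rough illustration of the target shape, the sketch below renders such a file from lists that have already been extracted from the transcript. The function name format_meeting_notes and the exact layout are hypothetical; the skill's real output may differ.

```python
def format_meeting_notes(title, attendees, decisions, action_items):
    """Render a notes file with the three sections named in the prompt."""
    sections = (
        ("Attendees", attendees),
        ("Key decisions", decisions),
        ("Action items", action_items),
    )
    lines = [title, ""]
    for heading, items in sections:
        lines.append(heading)
        lines.extend(f"- {item}" for item in items)
        lines.append("")  # blank line between sections
    return "\n".join(lines).rstrip() + "\n"
```

An empty list simply yields a heading with no entries, which makes gaps in the transcript easy to spot when reviewing the notes.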

Local processing

Whisper runs fully on your machine, so no audio is sent to OpenAI's servers or any external service. This matters for:

  • Confidential meetings and calls
  • Legal or medical recordings
  • Any audio you wouldn't want uploaded to a third party

The first run downloads the Whisper model weights to your machine (roughly 1–3 GB depending on model size). Subsequent runs are fast and fully offline.
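
To see whether weights are already on disk, and how much space they take, a small check like the following can help. It assumes the default cache location used by the open-source openai-whisper package (~/.cache/whisper, or under XDG_CACHE_HOME when that is set) and its .pt file naming; the skill may store its weights elsewhere.

```python
import os
from pathlib import Path


def default_cache_dir() -> Path:
    """Default weights directory of the open-source whisper package (assumed)."""
    base = os.getenv(
        "XDG_CACHE_HOME", os.path.join(os.path.expanduser("~"), ".cache")
    )
    return Path(base) / "whisper"


def cached_models(cache_dir: Path) -> dict[str, float]:
    """Map each cached weights file (*.pt) to its size in GB."""
    if not cache_dir.is_dir():
        return {}  # nothing downloaded yet
    return {
        p.name: p.stat().st_size / 1e9
        for p in sorted(cache_dir.glob("*.pt"))
    }
```

An empty result from cached_models(default_cache_dir()) suggests the next run will need network access to download a model.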

Pair with other skills

  • Morning brief: transcribe your morning voice note and include the summary in the brief
  • Memory: save transcriptions to dated files in ~/memory/ for searchable recall
  • Humanizer: raw transcriptions keep the rhythm of natural speech, including filler words and false starts; use Humanizer to clean up transcribed text before publishing
  • Nano PDF: transcribe a recorded presentation, then edit the accompanying PDF to match
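
The Memory pairing comes down to writing each transcript to a dated file. A minimal sketch, assuming a YYYY-MM-DD-slug.md naming pattern modeled on the ~/notes/2026-05-25-memo.md example earlier; the helper names and the slug argument are hypothetical, and the Memory skill may organize files differently.

```python
from datetime import date
from pathlib import Path


def memory_path(day: date, slug: str, root: Path = Path("~/memory")) -> Path:
    """Build a dated file path such as ~/memory/2026-05-25-memo.md."""
    return root.expanduser() / f"{day.isoformat()}-{slug}.md"


def save_transcript(
    text: str, day: date, slug: str, root: Path = Path("~/memory")
) -> Path:
    """Write the transcript to its dated file and return the path."""
    path = memory_path(day, slug, root)
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(text.strip() + "\n", encoding="utf-8")
    return path
```

Because ISO dates sort lexicographically, a plain directory listing of ~/memory/ stays in chronological order.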