Capturing tasks on the go has always involved some friction for me. Typing on a phone is tedious, and by the time I’m back at my computer, I’ve often forgotten half of what I needed to capture. Capturing every task as it comes up is probably the most important part of task management.
Enter voice notes. They’re fast, natural, and perfect for brain dumps. But then you’re stuck with a pile of audio files that need to be transcribed and processed. This is the problem the Todoist Ramble feature set out to solve, and I wanted a rudimentary replica of it. What I ended up with works really well, and even lets me record audio on my smartwatch and have it processed into my inbox file. It is available here.
The Problem
I wanted to:
- Speak naturally: “Call the dentist next Wednesday, finish the quarterly report by Friday, and study pharmacology tomorrow”
- Have it automatically become properly formatted org-mode tasks with deadlines, scheduled dates, tags, and priorities
- Do it all locally (no sending my voice to cloud APIs)
The Solution
I built a Python script that:
- Watches a folder for new audio recordings (synced via Syncthing from my phone)
- Transcribes them locally using OpenAI’s Whisper model (via faster-whisper)
- Parses natural language to extract:
  - Individual tasks from rambling speech
  - Dates (“next Wednesday”, “by February 26th”, “tomorrow”)
  - Context tags (@home, @computer, @phone)
  - Priorities (“urgent”, “important”)
  - Whether something is a deadline vs. scheduled date
- Generates org-mode entries with proper formatting and properties
- Archives the audio for reference
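The transcription step can be sketched roughly as follows. This is a minimal illustration rather than the script's actual code: it assumes the faster-whisper package is installed, and the model size ("base"), CPU device, and int8 compute type are placeholder choices. The function names `join_segments` and `transcribe` are hypothetical.

```python
from pathlib import Path


def join_segments(segments) -> str:
    """Concatenate the text of each transcribed segment into one string."""
    return " ".join(seg.text.strip() for seg in segments if seg.text.strip())


def transcribe(audio_path: Path, model_size: str = "base") -> str:
    """Transcribe one recording locally; no audio leaves the machine."""
    # Imported lazily so join_segments works without the package installed.
    from faster_whisper import WhisperModel

    # Model size, device, and compute type are illustrative assumptions.
    # A long-running service would load the model once and reuse it.
    model = WhisperModel(model_size, device="cpu", compute_type="int8")
    segments, _info = model.transcribe(str(audio_path))
    return join_segments(segments)
```

Calling `transcribe(Path("Voice_001.m4a"))` would return the whole ramble as a single string, ready for the parsing step.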
Example
I record on my phone:
Call the dentist next Wednesday, finish quarterly report by Friday tag that as important, and study cardiovascular pharmacology on Tuesday
The Python script automatically creates these entries in my inbox.org file:
* TODO Call the dentist
SCHEDULED: <2026-02-11 Wed>
:PROPERTIES:
:RECORDED: [2026-02-08 15:30]
:RECORDING_FILE: Voice_001.m4a
:TRANSCRIPT: Call the dentist next Wednesday...
:END:
* TODO [#A] Finish quarterly report
DEADLINE: <2026-02-13 Fri>
:PROPERTIES:
:RECORDED: [2026-02-08 15:30]
:RECORDING_FILE: Voice_001.m4a
:TRANSCRIPT: Call the dentist next Wednesday...
:END:
* TODO Study cardiovascular pharmacology :@notetaking:
SCHEDULED: <2026-02-10 Tue>
:PROPERTIES:
:RECORDED: [2026-02-08 15:30]
:RECORDING_FILE: Voice_001.m4a
:TRANSCRIPT: Call the dentist next Wednesday...
:END:
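Generating an entry like these is mostly string assembly. The sketch below is one way to do it, not the script's actual implementation: `org_date` and `org_entry` are hypothetical names, the property keys are taken from the example, and the priority cookie is written into the headline, which is where Org expects it.

```python
from datetime import date, datetime

DAY_NAMES = ["Mon", "Tue", "Wed", "Thu", "Fri", "Sat", "Sun"]


def org_date(day: date) -> str:
    """Format an active Org timestamp such as <2026-02-11 Wed>."""
    return f"<{day:%Y-%m-%d} {DAY_NAMES[day.weekday()]}>"


def org_entry(title, recorded, recording_file, transcript,
              scheduled=None, deadline=None, priority=None, tags=()):
    """Build one TODO entry: headline, planning line, property drawer."""
    headline = "* TODO "
    if priority:
        headline += f"[#{priority}] "
    headline += title
    if tags:
        headline += " :" + ":".join(tags) + ":"

    lines = [headline]
    # The planning line must directly follow the headline.
    planning = []
    if deadline:
        planning.append(f"DEADLINE: {org_date(deadline)}")
    if scheduled:
        planning.append(f"SCHEDULED: {org_date(scheduled)}")
    if planning:
        lines.append(" ".join(planning))

    lines += [
        ":PROPERTIES:",
        f":RECORDED: [{recorded:%Y-%m-%d %H:%M}]",
        f":RECORDING_FILE: {recording_file}",
        # Property values are single-line, so collapse any whitespace.
        f":TRANSCRIPT: {' '.join(transcript.split())}",
        ":END:",
    ]
    return "\n".join(lines) + "\n"


entry = org_entry(
    "Call the dentist",
    recorded=datetime(2026, 2, 8, 15, 30),
    recording_file="Voice_001.m4a",
    transcript="Call the dentist next Wednesday...",
    scheduled=date(2026, 2, 11),
)
```

Appending `entry` to inbox.org is then a plain file write in append mode.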
How It Works
The script uses:
- faster-whisper for local transcription (no API calls, fully private)
- watchdog for monitoring file system changes
- Regular expressions for parsing natural language dates and keywords
- Pattern matching to split rambles into discrete tasks
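To give a feel for the pattern matching, here is a small sketch that splits a ramble on punctuation and "and", then picks out a priority and context tags by keyword. The split rule, the keyword tables, and the names `split_ramble` and `parse_chunk` are all assumptions made for illustration; the real rules in the script may differ. Date phrases are deliberately left in the title here.

```python
import re

# Assumed split rule: commas, periods, or semicolons, optionally
# followed by "and" and/or "then".
SPLIT_RE = re.compile(r"\s*[,.;]\s*(?:and\s+)?(?:then\s+)?", re.IGNORECASE)

# Hypothetical keyword tables; both priority words map to [#A] here.
PRIORITY_WORDS = {"urgent": "A", "important": "A"}
TAG_KEYWORDS = {"study": "@notetaking", "email": "@computer", "clean": "@home"}

PRIORITY_RE = re.compile(
    r"\b(?:(?:tag|mark)\s+(?:that|this|it)\s+as\s+)?(urgent|important)\b",
    re.IGNORECASE,
)


def split_ramble(text: str) -> list[str]:
    """Split one transcript into candidate task chunks."""
    return [chunk.strip() for chunk in SPLIT_RE.split(text) if chunk.strip()]


def parse_chunk(chunk: str) -> dict:
    """Extract priority and tags, and tidy the remaining text into a title."""
    priority = None
    match = PRIORITY_RE.search(chunk)
    if match:
        priority = PRIORITY_WORDS[match.group(1).lower()]
        chunk = PRIORITY_RE.sub("", chunk)  # drop the spoken instruction

    words = set(re.findall(r"[a-z']+", chunk.lower()))
    tags = sorted({tag for word, tag in TAG_KEYWORDS.items() if word in words})

    title = " ".join(chunk.split())
    title = title[:1].upper() + title[1:]
    return {"title": title, "priority": priority, "tags": tags}


ramble = (
    "Call the dentist next Wednesday, finish quarterly report by Friday "
    "tag that as important, and study cardiovascular pharmacology on Tuesday"
)
tasks = [parse_chunk(chunk) for chunk in split_ramble(ramble)]
```

A purely regex-based split like this will occasionally cut a task in the wrong place (for example on a decimal point), which is an acceptable trade-off for an inbox that gets reviewed anyway.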
It handles edge cases like:
- Files syncing via Syncthing (which don’t always trigger normal file creation events)
- Multiple date formats (“tomorrow”, “next Friday”, “February 26th”)
- Distinguishing deadlines from scheduled dates based on context (“by Friday” vs “on Friday”)
- Automatic tagging based on keywords
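The date handling could look something like the sketch below. It is an illustration under stated assumptions, not the script's real parser: "next Wednesday" is treated as the next Wednesday after today, the words "by", "before", and "due" mark a deadline while anything else is a scheduled date, and `parse_date` and `next_weekday` are hypothetical names.

```python
import re
from datetime import date, timedelta

WEEKDAYS = ["monday", "tuesday", "wednesday", "thursday",
            "friday", "saturday", "sunday"]
MONTHS = ["january", "february", "march", "april", "may", "june", "july",
          "august", "september", "october", "november", "december"]
DEADLINE_WORDS = {"by", "before", "due"}  # assumed deadline markers

DATE_RE = re.compile(
    r"\b(?P<prep>by|before|due|on)?\s*(?:"
    r"(?P<tomorrow>tomorrow)"
    r"|(?:next\s+)?(?P<weekday>" + "|".join(WEEKDAYS) + r")"
    r"|(?P<month>" + "|".join(MONTHS) + r")\s+(?P<day>\d{1,2})(?:st|nd|rd|th)?"
    r")\b",
    re.IGNORECASE,
)


def next_weekday(today: date, name: str) -> date:
    """Return the first date strictly after today that falls on `name`."""
    delta = (WEEKDAYS.index(name) - today.weekday() - 1) % 7 + 1
    return today + timedelta(days=delta)


def parse_date(text: str, today: date):
    """Return (kind, date, title) with the date phrase removed from the title."""
    m = DATE_RE.search(text)
    if not m:
        return None, None, text
    if m.group("tomorrow"):
        when = today + timedelta(days=1)
    elif m.group("weekday"):
        when = next_weekday(today, m.group("weekday").lower())
    else:
        # Impossible dates such as "February 30th" raise ValueError here.
        month = MONTHS.index(m.group("month").lower()) + 1
        when = date(today.year, month, int(m.group("day")))
        if when < today:  # already passed this year, so assume next year
            when = when.replace(year=today.year + 1)
    prep = (m.group("prep") or "").lower()
    kind = "DEADLINE" if prep in DEADLINE_WORDS else "SCHEDULED"
    title = " ".join((text[:m.start()] + text[m.end():]).split())
    return kind, when, title
```

Passing `today` explicitly rather than calling `date.today()` inside the function makes the behavior easy to check against a fixed recording date.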
Running It
The script runs as a background service on my Mac (via launchd), constantly watching the sync folder. When a new voice memo appears, it’s processed within seconds.
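One plausible way to watch the folder with watchdog is sketched below. It rests on assumptions that are not confirmed by the script itself: that Syncthing writes to a temporary file (named like `.syncthing.<name>.tmp`) and renames it when the transfer finishes, so the finished recording can arrive as a move event rather than a creation event. The suffix list and the names `finished_audio` and `watch` are hypothetical.

```python
import time
from pathlib import Path

AUDIO_SUFFIXES = {".m4a", ".mp3", ".wav"}  # assumed recording formats


def finished_audio(path_str: str):
    """Return the path if it looks like a completed recording, else None."""
    path = Path(path_str)
    # Skip in-progress Syncthing temp files (naming is an assumption).
    if path.name.startswith(".syncthing.") or path.suffix.lower() == ".tmp":
        return None
    if path.suffix.lower() not in AUDIO_SUFFIXES:
        return None
    return path


def watch(folder: Path, process) -> None:
    """Call process(path) once for each finished recording in folder."""
    # Imported lazily so finished_audio works without watchdog installed.
    from watchdog.events import FileSystemEventHandler
    from watchdog.observers import Observer

    seen = set()

    class Handler(FileSystemEventHandler):
        def on_created(self, event):
            self._handle(event.src_path, event.is_directory)

        def on_moved(self, event):
            # A temp file renamed to its final name arrives here.
            self._handle(event.dest_path, event.is_directory)

        def _handle(self, path_str, is_directory):
            path = None if is_directory else finished_audio(path_str)
            if path is not None and path not in seen:
                seen.add(path)
                process(path)

    observer = Observer()
    observer.schedule(Handler(), str(folder), recursive=False)
    observer.start()
    try:
        while True:
            time.sleep(1)
    finally:
        observer.stop()
        observer.join()
```

Under launchd, a call such as `watch(Path.home() / "Sync" / "VoiceMemos", handle_recording)` would simply run until the service is stopped; the folder path and `handle_recording` are placeholders.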
The Workflow
The friction of capture is real. Every extra step between “having a thought” and “having it in your system” is a chance for that thought to be lost. Voice notes remove that friction, but only if they’re automatically processed into your actual workflow.
Now my capture flow is:
- Pull out phone or tap record on my watch
- Record voice note
- Done
The tasks appear in my org-mode inbox, properly formatted, with the right dates and contexts. No typing, no manual processing, no friction. I still need to process my inbox later so that each task gets refiled to the correct spot, but that’s part of my daily review workflow.



