Devlog by @Churu - Stardance

@Churu on Jarvis · 6 days ago

10h 24m 46s logged

so i made this voice assistant “jarvis” in python
basically it listens for the wake word “Jarvis”, then you talk to it, and it replies using TTS or runs stuff like opening apps / running commands / answering questions through an AI/API layer / whatever else i wired into it

How it works (in simple terms)

The flow is basically:

1. Microphone input: the program constantly listens through the mic until it detects the wake word.
2. Wake word detection: once “Jarvis” is detected, it switches into active listening mode.
3. Speech-to-text: your speech is converted into text using a speech recognition engine.
4. Processing layer: the text is either
   - matched to predefined commands (like opening apps, searching, etc.), or
   - sent to an AI model/API for a response
5. Response output: the response is converted back to speech and played through speakers.
6. State control: it manages states like “listening”, “thinking”, and “speaking” so it doesn’t overlap audio input and output.
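
rough sketch of the processing + state part of that flow, in case it helps picture it. this is not the actual project code: the command table, `open_app`, `ask_ai` and the reply strings are all made up for illustration, and the real AI/API call is just a placeholder.

```python
from enum import Enum, auto
from typing import Callable, Dict


class State(Enum):
    LISTENING = auto()
    THINKING = auto()
    SPEAKING = auto()


def open_app(text: str) -> str:
    # Hypothetical command handler; a real one might launch a program.
    return "opening app"


# Hypothetical command table: trigger phrase -> handler.
COMMANDS: Dict[str, Callable[[str], str]] = {
    "open": open_app,
}


def ask_ai(text: str) -> str:
    # Placeholder for the AI/API call, which the devlog doesn't specify.
    return f"(AI answer for: {text})"


def route(text: str) -> str:
    """Match predefined commands first, otherwise fall back to the AI layer."""
    lowered = text.lower().strip()
    for trigger, handler in COMMANDS.items():
        if lowered.startswith(trigger):
            return handler(lowered)
    return ask_ai(text)


def handle_utterance(text: str, log: list) -> str:
    """Walk one utterance through thinking -> speaking -> listening."""
    log.append(State.THINKING)
    reply = route(text)
    log.append(State.SPEAKING)
    # text-to-speech would play `reply` here
    log.append(State.LISTENING)
    return reply
```

the point of recording the state explicitly is that the mic loop can check it before doing anything, instead of guessing whether audio is still playing.
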

at first it was kinda simple but then everything started breaking

mic input was the first pain
it either didn’t hear me or picked up the wrong device
spent way too long just figuring out why it was “silent” when it was actually listening to the wrong mic
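
one way to avoid the wrong-mic thing is to print every device and pick one by name instead of trusting the default. minimal sketch below, assuming you can get a list of device names from your audio library (with the SpeechRecognition package that would be `sr.Microphone.list_microphone_names()`, but the devlog doesn't say which library is used). `pick_mic_index` and the device names are made up.

```python
from typing import List, Optional


def pick_mic_index(device_names: List[str], wanted: str) -> Optional[int]:
    """Return the index of the first device whose name contains `wanted`."""
    wanted = wanted.lower()
    for index, name in enumerate(device_names):
        if wanted in name.lower():
            return index
    return None  # caller can fall back to the system default


# Hypothetical device list; with SpeechRecognition it could come from
# sr.Microphone.list_microphone_names().
names = ["HDMI Output", "Webcam Mic", "USB Headset Microphone"]
for i, n in enumerate(names):
    print(i, n)
print("using:", pick_mic_index(names, "usb headset"))
```
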

then speech recognition started being annoying
like it would randomly misunderstand words or just freeze if i spoke too fast

also made the classic mistake of letting it listen while it was speaking
so it would hear its own voice and start looping responses, so i had to add a lock so it doesn’t listen while it’s talking
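
a sketch of what that kind of gate can look like. it uses a `threading.Event` as the flag rather than a literal `Lock` (which one the project actually uses isn't stated), and the TTS call is replaced with a stand-in so the example runs on its own.

```python
import threading

# Set while the assistant is talking; the listener checks it before recording.
speaking = threading.Event()


def speak(text: str, spoken: list) -> None:
    """Mark the assistant as speaking for the duration of playback."""
    speaking.set()
    try:
        spoken.append(text)  # stand-in for the real text-to-speech call
    finally:
        speaking.clear()  # always reopen the mic, even if TTS raises


def should_listen() -> bool:
    """The listen loop skips audio while the assistant is talking."""
    return not speaking.is_set()
```

the `try/finally` matters here: if the flag never gets cleared after a TTS error, the assistant just goes deaf.
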

biggest headache was the API / env stuff
worked fine when i ran python main.py
then i packaged it and suddenly everything was “missing key / file not found / module not found”

turned out i had hardcoded paths and assumed my machine layout was universal. fixed that by making paths dynamic + cleaning up env loading
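
roughly what “dynamic paths + cleaner env loading” can mean in practice. this is a sketch under assumptions, not the project's code: `app_dir`, `load_env`, `get_api_key` and the `JARVIS_API_KEY` variable name are invented here. the `sys.frozen` / `sys._MEIPASS` attributes are the ones PyInstaller sets in packaged builds.

```python
import os
import sys
from pathlib import Path


def app_dir() -> Path:
    """Folder the app's files live in, for both script and frozen builds."""
    if getattr(sys, "frozen", False):
        # PyInstaller sets sys.frozen; one-file builds unpack to sys._MEIPASS.
        return Path(getattr(sys, "_MEIPASS", Path(sys.executable).parent))
    return Path(__file__).resolve().parent


def load_env(path: Path) -> dict:
    """Tiny KEY=VALUE parser; a missing file just yields an empty dict."""
    values = {}
    if not path.exists():
        return values
    for line in path.read_text(encoding="utf-8").splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def get_api_key(env_file: Path) -> str:
    # JARVIS_API_KEY is a made-up variable name for this sketch.
    key = os.environ.get("JARVIS_API_KEY") or load_env(env_file).get("JARVIS_API_KEY")
    if not key:
        raise RuntimeError(f"JARVIS_API_KEY not set and not found in {env_file}")
    return key
```

typical use would be something like `get_api_key(app_dir() / ".env")`, and failing loudly with the path in the message makes the “missing key” case much easier to debug than a silent `None`.
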

also packaging with pyinstaller was pain
some imports just randomly didn’t show up in dist build and i had to manually force include them
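
for the force-include part, PyInstaller has a `--hidden-import` option for modules its analysis misses. small sketch that builds the argument list; the helper name, the `jarvis` output name and the module name `some_pkg.backend` are placeholders, since the devlog doesn't say which imports were missing.

```python
from typing import List


def build_pyinstaller_args(entry: str, hidden_imports: List[str]) -> List[str]:
    """Build a PyInstaller argument list that force-includes extra modules."""
    args = [entry, "--onefile", "--name", "jarvis"]
    for module in hidden_imports:
        # --hidden-import names a module PyInstaller's analysis did not find.
        args += ["--hidden-import", module]
    return args


# Hypothetical module name; use whatever the build reports as missing.
args = build_pyinstaller_args("main.py", ["some_pkg.backend"])
print("pyinstaller " + " ".join(args))
# With PyInstaller installed, the same list can be passed to
# PyInstaller.__main__.run(args) instead of using the command line.
```

keeping the list in one place means every “module not found” in the dist build turns into a one-line addition instead of a longer command to remember.
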

now it mostly works but i still feel like it’s fragile, one change and it might break again.

but yeah overall it taught me:

audio stuff is way harder than it looks
packaging python apps is cursed
and debugging voice stuff is basically just “try everything until it works”