Jesse Moraga

I Built a Voice AI That Lives in My Headset — Full Source Code

I built a voice AI that lives in my headset and has full access to my computer. No subscription. No app to open. Just talk.

Why I built it

Every AI assistant on the market requires you to stop what you’re doing, pick up your phone, open an app, and type. I work in the field. I’m moving. I needed something that worked the way I do — hands-free, always on, zero friction.

So I built one.

What it actually does

J.A.R.V.I.S. runs locally on my Mac. It’s always listening through my JLab headset. I speak, it processes through a 70B language model via OpenRouter, and the response comes back through my earpiece in under two seconds — in a voice I chose, using Fish Audio’s TTS engine.

It has full access to my system. Not a demo. Not sandboxed. Real access:

See my screen and describe what’s on it
Run any shell command and speak the result back
Search the web and summarize findings
Read my calendar and email
Open apps, browse URLs, control Spotify
Spawn a full coding terminal with an uncensored model
Wake on a custom phrase — no button press

How I built it

The backend is FastAPI running a WebSocket server on my machine. Voice input comes through the Web Speech API in Chrome. TTS runs through Fish Audio. The LLM is served via OpenRouter. The frontend is a Three.js audio-reactive orb that pulses when she’s thinking and reacts to her voice in real time.

The whole thing is maybe 3,000 lines of Python and TypeScript. No cloud dependency. No monthly API bill beyond what I already use. It runs on my hardware and goes wherever my headset goes.

Voice AI. Full system access. Lives in your headset.
Jesse Moraga

It’s open source

The full codebase is on GitHub. Clone it, swap in your own API keys, and you’ve got a voice AI running locally in about 20 minutes.

→ github.com/JesseMoraga/jarvis

This is the same approach I take with every tool I build — I run it on my own operation first, prove it works under real conditions, then open it up. The software isn’t theoretical. It’s what I’m actually using right now.