monkeyspeak final devlog before ship ( hopefully )
tl;dr
- so brave and edge broke everything ( ye read what broke for more info )
- deepgram mode now transcribes live text again (words dissolve, wpm moves, momentum still reacts to your voice)
- brave and edge no longer hang for 25 seconds then fake a mic error
- production uses a render websocket proxy + vercel env, same path that worked locally
- github: nothariharan/monkeyspeak (commit
095b731and earlier speech routing work onmain) - live: monkeyspeak-delta.vercel.app
what i changed
speech routing (back to something that made sense)
mode behavior browser web speech api only. no deepgram hijack on brave/edge mount. deepgram try deepgram first (proxy → bridge fallback on chrome only). if that fails, fall back to web speech with a clear error.the ui now shows errors for the provider that actually failed, not a generic “mic blocked” when stt died for other reasons.
deepgram client hardening
- prefer
var name ur wishwebsocket proxy over the vercel http bridge - brave/edge without a reachable proxy: fail fast with a useful message (no 25s timeout)
-
utterance_end_mslocked to1000everywhere (deepgram rejects live ws with400below that) - bridge watchdog clears on
BridgeReady, not on first random chunk - server bridge waits for upstream deepgram before closing the socket
the transcript fix (the big one)
- client parses deepgram json whether it arrives as a string, blob, or arraybuffer
- render proxy forwards deepgram replies as utf-8 text frames instead of opaque binary
infra
-
backend/express +wsproxy deployable on render (render.yamlincluded) -
set up env with ur var name at vercel
-
redeployed frontend via vercel cli after env update
small polish
- clearer vad fallback logs (worker load fail vs no voice detected vs timeout)
- config bar hint on brave/edge when browser mode is selected
- momentum sprites + gsap monkey states (from earlier in the sprint)
what broke (and why it looked cursed)
-
brave and edge block or mishandle the vercel http audio bridge (duplex
fetchupload), so deepgram connections hung ~25s and never returned transcripts. -
those browsers also cannot open an authenticated websocket straight to
api.deepgram.com, so they need the render ws proxy (wss://…/api/deepgram/proxy) instead of the chrome-friendly paths. -
even after the proxy worked, deepgram sent json in binary ws frames and the client ignored anything that was not a string, so words never updated until we parsed blob/arraybuffer payloads.
what’s next (maybe)
- vendor ort wasm for vad so the worker stops complaining
- gsap scale split for clean console
- render keep-alive or paid instance if cold starts hurt demos
- preview env on vercel for pr deployments (production is wired today)
sorry if i yapped a lot and the fixes where actually slighly more techy stuff so i didnt wanna yap abt that as wel so yeah if u want to know lmk in replies :)
Comments 0
No comments yet. Be the first!
Sign in to join the conversation.