I made a desktop agent (still in development).
Now I'm making a browser extension that lets it attach to browser sessions the user is already signed in to.
Cool stuff:
Push-based DOM snapshots (MutationObserver auto-detects page changes)
Stable element IDs via WeakMap (same element always has same ID)
Works with Gmail, Google Forms, real sessions
Tab management: switch, open, close specific tabs
File upload/download support
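A minimal sketch of the stable-ID idea, in case it helps. This is not the extension's actual code: `StableIdRegistry`, `idFor`, and the `el-N` ID format are made-up names for illustration, and plain objects stand in for DOM elements. The point is that a WeakMap keyed by the element returns the same ID for as long as the element exists, without keeping removed elements alive.

```typescript
// Hypothetical helper: assigns an ID the first time it sees an object and
// returns that same ID on every later call for the same object.
class StableIdRegistry {
  // WeakMap keys are held weakly, so an element that leaves the page
  // can still be garbage collected.
  private ids = new WeakMap<object, string>();
  private next = 1;

  idFor(node: object): string {
    let id = this.ids.get(node);
    if (id === undefined) {
      id = `el-${this.next++}`; // assumed ID format, purely illustrative
      this.ids.set(node, id);
    }
    return id;
  }
}

const registry = new StableIdRegistry();
const button = { tag: "button" }; // stand-in for a real DOM element
const input = { tag: "input" };

console.log(registry.idFor(button)); // el-1
console.log(registry.idFor(input));  // el-2
console.log(registry.idFor(button)); // el-1 again, even in a later snapshot
```

In a real content script the keys would be `Element` objects, so an ID survives across snapshots as long as the page keeps the same node.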
Technical stuff:
Python FastAPI daemon (port 8788)
Manifest V3 Chrome extension
Content script walks DOM → structured JSON
200ms debounce for settle detection, 3s hard cap
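Here is roughly how the settle detection could work, as a sketch under assumptions rather than the real implementation. `SettleDetector` and its method names are hypothetical; the only values taken from the list above are the 200ms quiet window and the 3s hard cap. Time is passed in explicitly so the logic can be checked without a browser.

```typescript
// Hypothetical settle detector: a burst of mutations counts as settled once
// nothing has changed for quietMs, or once capMs has passed since the first
// mutation of the burst, whichever comes first.
class SettleDetector {
  private readonly quietMs: number;
  private readonly capMs: number;
  private firstAt: number | null = null;
  private lastAt: number | null = null;

  constructor(quietMs: number = 200, capMs: number = 3000) {
    this.quietMs = quietMs;
    this.capMs = capMs;
  }

  // Call this from the mutation callback with the current time in ms.
  recordMutation(now: number): void {
    if (this.firstAt === null) this.firstAt = now;
    this.lastAt = now;
  }

  // Time at which a snapshot should be pushed, or null if nothing is pending.
  deadline(): number | null {
    if (this.firstAt === null || this.lastAt === null) return null;
    return Math.min(this.lastAt + this.quietMs, this.firstAt + this.capMs);
  }

  // Returns true once the deadline has been reached, and resets for the next burst.
  flushIfDue(now: number): boolean {
    const due = this.deadline();
    if (due === null || now < due) return false;
    this.firstAt = null;
    this.lastAt = null;
    return true;
  }
}

const detector = new SettleDetector();
detector.recordMutation(0);
detector.recordMutation(100);
console.log(detector.deadline()); // 300: 200ms after the last mutation
```

In a content script, the `MutationObserver` callback would call `recordMutation(performance.now())` and a timer would call `flushIfDue` at the deadline. The hard cap matters on pages that never go quiet (spinners, tickers), where a plain debounce would never fire.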
Biggest limitation: cross-origin iframes are blocked by browser security, so it can’t interact with YouTube embeds, payment forms, etc. It still handles roughly 80% of normal browsing fine.
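One way a snapshot could at least flag those frames so the agent knows they are off limits is to compare origins. This is only a sketch: `classifyFrame` and the `FrameAccess` labels are hypothetical, and it only classifies http(s) URLs (things like `about:blank` or `data:` frames are reported as unknown instead of guessed at).

```typescript
type FrameAccess = "same-origin" | "cross-origin" | "unknown";

// Hypothetical helper: compares an iframe's src against the embedding page's URL.
function classifyFrame(frameSrc: string, pageUrl: string): FrameAccess {
  const page = new URL(pageUrl);
  const frame = new URL(frameSrc, page); // resolves relative src values against the page

  // This sketch does not try to model non-http(s) frames.
  if (frame.protocol !== "http:" && frame.protocol !== "https:") return "unknown";

  // An origin is scheme + host + port; all three must match.
  return frame.origin === page.origin ? "same-origin" : "cross-origin";
}

console.log(classifyFrame("https://www.youtube.com/embed/abc", "https://example.com/post")); // cross-origin
console.log(classifyFrame("/widgets/form.html", "https://example.com/post"));                // same-origin
```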
Right now it does work.