Devlog by @raviiiibro

@raviiiibro on swarmtrace · about 2 months ago

3h 51m 53s logged

🐝 SwarmTrace — Developer Log 1
Project: SwarmTrace
Version: 0.1.7
Built for: AMD Hackathon
Author: Ravi Kumar

What is it?
SwarmTrace is a tracing and observability library for LLM-based multi-agent systems — essentially “pytest for AI agents.” The core idea: AI swarms are unpredictable, and existing tools give you no visibility into what happened, where it broke, or whether a prompt change caused a regression. SwarmTrace fixes that.

What it does
Tracing — wrap any Python function (sync or async) with @observe and every call is automatically recorded: inputs, outputs, latency, token usage, cost, errors, and parent–child relationships between nested agent calls.
Token Budget — the @budget decorator tracks cumulative token usage across calls and warns you (or hard-stops) before an agent blows past its limit.
Regression Detection — the compare() function runs two prompt versions against the same inputs and uses an LLM to score output similarity, flagging regressions automatically.
Tool Attention (ISO Scoring) — implements the technique from arXiv:2604.21816. Instead of injecting all tool schemas into every prompt, it uses FAISS + sentence embeddings to select only the top-k relevant tools per query — reducing tool token overhead by up to 95%.
Web Scraping Traces — scraper.scrape(url) wraps Scrapling with full trace recording so scrape calls show up in the call tree like any other agent action.
CLI — swarmtrace prints a rich call tree with costs and statuses. swarmtrace-replay replays any individual trace. swarmtrace-export dumps everything to JSON or CSV.
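To make the tracing item above concrete, here is a minimal sketch of how a decorator like @observe could record nested calls. It is not SwarmTrace's actual implementation: the in-memory TRACES list, the helpers _start and _finish, and the trace field names are all assumptions made for illustration, and token usage and cost are left out.

```python
import contextvars
import functools
import inspect
import time
import uuid

# Hypothetical in-memory sink; the real library persists traces to SQLite.
TRACES = []

# ID of the trace currently executing; None at the top level.
_current_trace_id = contextvars.ContextVar("current_trace_id", default=None)


def _start(fn, args, kwargs):
    """Open a trace record and mark it as the current parent."""
    trace = {
        "id": uuid.uuid4().hex,
        "parent_id": _current_trace_id.get(),
        "name": fn.__name__,
        "args": repr((args, kwargs)),
        "output": None,
        "error": None,
    }
    token = _current_trace_id.set(trace["id"])
    return trace, token, time.perf_counter()


def _finish(trace, token, started):
    """Restore the previous parent and store the finished trace."""
    _current_trace_id.reset(token)
    trace["latency_s"] = time.perf_counter() - started
    TRACES.append(trace)


def observe(fn):
    """Wrap a sync or async function so every call is recorded."""
    if inspect.iscoroutinefunction(fn):
        @functools.wraps(fn)
        async def async_wrapper(*args, **kwargs):
            trace, token, started = _start(fn, args, kwargs)
            try:
                trace["output"] = await fn(*args, **kwargs)
                return trace["output"]
            except Exception as exc:
                trace["error"] = repr(exc)
                raise
            finally:
                _finish(trace, token, started)
        return async_wrapper

    @functools.wraps(fn)
    def sync_wrapper(*args, **kwargs):
        trace, token, started = _start(fn, args, kwargs)
        try:
            trace["output"] = fn(*args, **kwargs)
            return trace["output"]
        except Exception as exc:
            trace["error"] = repr(exc)
            raise
        finally:
            _finish(trace, token, started)
    return sync_wrapper


@observe
def worker(x):
    return x * 2


@observe
def planner(x):
    # The nested call picks up planner's trace ID as its parent_id.
    return worker(x) + 1


planner(3)
```

In this sketch the child trace is appended before its parent, because a trace is stored when its call finishes. A call tree can be rebuilt afterwards by grouping records on parent_id.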

Frontend Dashboard
A React + TanStack Router frontend visualises live traces from the FastAPI backend:

Stat bar — total traces, cumulative tokens, total cost, error count at a glance
Call Tree — hierarchical view of parent→child agent calls for a full run
Waterfall — timeline view showing which agents ran in parallel vs sequentially
Token Chart — per-trace token usage over time
Failures page — isolated view of every errored trace with its call chain and error message
Detail Drawer — click any trace to inspect its full args, output, latency, and lineage
Live Mode — polls the backend every 2 seconds for real-time trace updates

Storage
All traces land in a local SQLite database (~/.tracely.db) with WAL mode enabled. Once the table grows past 10,000 rows, the oldest rows are purged automatically so the file stays bounded. Rows are indexed on timestamp for fast ordered reads. Export to JSON/CSV is available via the CLI.
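The storage behaviour described above could look roughly like the following sketch. The table name, column names, and the helpers open_store and save_trace are hypothetical, not SwarmTrace's real schema, and the default path here is an in-memory database so the snippet runs without touching the home directory.

```python
import sqlite3
import time

MAX_ROWS = 10_000  # row cap mentioned in this devlog


def open_store(path=":memory:"):
    """Open the trace store. The schema here is an assumed example."""
    conn = sqlite3.connect(path)
    # WAL lets readers (e.g. a dashboard) proceed while a writer is active.
    # On an in-memory database SQLite ignores this and keeps "memory" mode.
    conn.execute("PRAGMA journal_mode=WAL")
    conn.execute(
        "CREATE TABLE IF NOT EXISTS traces ("
        "id INTEGER PRIMARY KEY AUTOINCREMENT, "
        "timestamp REAL NOT NULL, "
        "name TEXT NOT NULL, "
        "payload TEXT)"
    )
    conn.execute(
        "CREATE INDEX IF NOT EXISTS idx_traces_timestamp ON traces(timestamp)"
    )
    return conn


def save_trace(conn, name, payload, max_rows=MAX_ROWS):
    """Insert one trace, then drop the oldest rows beyond max_rows."""
    with conn:  # commits on success, rolls back on error
        conn.execute(
            "INSERT INTO traces (timestamp, name, payload) VALUES (?, ?, ?)",
            (time.time(), name, payload),
        )
        conn.execute(
            "DELETE FROM traces WHERE id NOT IN ("
            "SELECT id FROM traces ORDER BY timestamp DESC, id DESC LIMIT ?)",
            (max_rows,),
        )


conn = open_store()
save_trace(conn, "planner", '{"tokens": 42}')
```

Purging on every insert keeps the logic simple. A real implementation might purge only every N writes to reduce overhead.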

Stack
Layer: Technology
Core library: Python 3.8+, SQLite, contextvars
Regression LLM: Claude Haiku via litai
Tool selection: FAISS + sentence-transformers
API: FastAPI + Pydantic
Frontend: React, TanStack Router/Start, Tailwind, Recharts
CLI display: rich
Packaging: setuptools, pip-installable

Key design decisions
contextvars for parent tracking — parent IDs propagate automatically through async code, and through threads that run with a copy of the caller's context (for example via asyncio.to_thread), without any manual wiring. Nested @observe calls build the tree on their own.
SQLite over a hosted DB — zero setup for the user; works offline; the auto-purge keeps it from growing unboundedly.
Graceful degradation everywhere — if sentence-transformers/FAISS aren’t installed, Tool Attention falls back to returning the first k tools. If tiktoken isn’t installed, budget tracking falls back to a rough estimate of len(text) // 4 tokens. Storage errors never crash the agent being traced.
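As an illustration of the tiktoken fallback and how a budget decorator could build on it, here is a small sketch. The names approx_tokens, count_tokens, BudgetExceeded, the hard_stop and counter parameters, and the tokens_used attribute are assumptions for this example, as is the choice to count string arguments and the stringified result. SwarmTrace's real @budget may count differently.

```python
import functools
import warnings

try:
    import tiktoken  # optional dependency
    _ENCODING = tiktoken.get_encoding("cl100k_base")
except Exception:  # package missing, or encoding files unavailable
    _ENCODING = None


def approx_tokens(text):
    """Rough fallback: about four characters per token."""
    return len(text) // 4


def count_tokens(text):
    """Use tiktoken when available, otherwise the rough estimate."""
    if _ENCODING is not None:
        return len(_ENCODING.encode(text))
    return approx_tokens(text)


class BudgetExceeded(RuntimeError):
    """Raised in hard-stop mode when a call would exceed the budget."""


def budget(max_tokens, hard_stop=False, counter=count_tokens):
    """Track cumulative tokens across calls of the decorated function."""
    def decorator(fn):
        @functools.wraps(fn)
        def wrapper(*args, **kwargs):
            # Assumption: only string positional arguments count as input.
            incoming = sum(counter(a) for a in args if isinstance(a, str))
            if wrapper.tokens_used + incoming > max_tokens:
                msg = f"{fn.__name__} would exceed its budget of {max_tokens} tokens"
                if hard_stop:
                    raise BudgetExceeded(msg)  # stop before the call runs
                warnings.warn(msg)
            result = fn(*args, **kwargs)
            wrapper.tokens_used += incoming + counter(str(result))
            return result

        wrapper.tokens_used = 0
        return wrapper
    return decorator


@budget(max_tokens=10, hard_stop=True, counter=approx_tokens)
def echo(prompt):
    return prompt
```

Checking the input before the call is what allows a hard stop to happen before the agent spends more tokens. The output can only be counted afterwards.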

Current limitations

Failures page still uses static demo data instead of the live API
swarm_module_0–4.py are five identical placeholder files (to be collapsed into one)
No authentication on the API (development only)
pyproject.toml is incomplete; setup.py not yet migrated

Next up: I have to finish the UI first, then I will do the remaining API work, since the API has to be connected to the UI and the project was originally built as a Python framework. Bye!