rumi - Devlog #6
June 11, 2026
so yea… pulled an all nighter for this one 🥀
most of the work this time wasn’t adding entirely new systems.
it was stress testing RUMI hard enough to find where the architecture starts breaking.
and honestly i found a lot more than i expected.
- Track B Architecture Improvements
spent a lot of time refining Track B after the initial curiosity pipeline integration.
instead of just generating curiosity questions, it now runs through much more of the actual discovery stack.
lots of internal changes here that don’t really show up visually but massively affect the quality of the final reports.
💔 Constraint Pipeline Fix
found a pretty nasty issue where parts of the generated curiosity constraints weren’t making it all the way through the pipeline.
after fixing it, the difference was immediately obvious.
latest dual-track run:
Track A Unique Theories: 9
Track B Unique Theories: 8
Shared Theories: 0
which is honestly one of the strongest signals i’ve seen so far that the dual-track system is actually working as intended.
both tracks are now exploring completely different hypothesis spaces instead of converging on the same ideas.
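for anyone wondering how counts like these can be computed, here's a minimal sketch. it assumes each theory can be reduced to a comparable key (a normalized title string here), which is purely an assumption for illustration and not necessarily how RUMI compares theories. `compare_tracks` and the sample titles are made up.

```python
def normalize(title):
    """Lowercase and collapse whitespace so trivial differences don't hide an overlap."""
    return " ".join(title.lower().split())


def compare_tracks(track_a, track_b):
    """Return (unique to A, unique to B, shared) as sets of normalized titles."""
    a = {normalize(t) for t in track_a}
    b = {normalize(t) for t in track_b}
    return a - b, b - a, a & b


# placeholder titles, not real RUMI output
track_a = ["Theory A1", "Theory A2", "Shared Idea"]
track_b = ["theory b1", "shared  idea"]

only_a, only_b, shared = compare_tracks(track_a, track_b)
print(len(only_a), len(only_b), len(shared))
```

plain string matching like this only catches near-identical titles. two theories that say the same thing in different words would still count as unique, so a real comparison would likely need something fuzzier.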
- Claude Fable 5 Experiment
managed to get temporary access to Claude Fable 5 and immediately decided to throw RUMI at it.
before i could even test it though…
the integration broke 💔
spent a while fixing provider compatibility issues and getting everything working again.
eventually got a discovery run started on:
What happens to information when it crosses a black hole event horizon?
Fable managed to reach roughly Phase 8.5 before the free trial looked at RUMI’s request count and basically said:
aight imma head out 😭
- What Fable Revealed
this ended up being way more valuable than i expected.
while comparing Fable’s outputs with the models i normally run, i noticed a massive difference.
most models tend to generate:
Hidden Variable
↓
Mechanism Description
↓
Prediction
Fable was generating:
Hidden Variable
↓
Mechanism
↓
Equation
↓
Parameter Extraction
↓
Derivation
↓
Numerical Validation
↓
Prediction
actual equations.
actual variables.
actual derivation chains.
actual numerical checks.
🥀 Switching Back To MiMo
after the Fable credits died, i moved everything back to MiMo and started comparing outputs.
and that’s when another weakness became obvious.
the pipeline itself wasn’t failing.
MiMo simply wasn’t producing the mathematical depth needed for the mechanism stage.
which means the current bottleneck isn’t discovery anymore.
it’s mathematical formalization.
- Mathematical Formalization Architecture
one of the most interesting things i noticed while testing Claude Fable 5 was how differently it handled mechanism generation.
most models would generate something like:
Hidden Variable
↓
Mechanism Description
↓
Prediction
while Fable was generating:
Hidden Variable
↓
Mechanism
↓
Equation
↓
Parameter Extraction
↓
Derivation
↓
Numerical Validation
↓
Prediction
and honestly that immediately exposed one of RUMI’s biggest weaknesses.
theories and mechanisms were already being generated.
the math wasn’t.
so instead of just complaining about it, i started redesigning the mechanism pipeline around that pattern.
RUMI now has the foundations for a much more quantitative discovery process, which is lwk tuff ig 🥀✌
mechanisms aren’t just descriptions anymore. they now come with:
- equations
- variables
- derivations
- parameter extraction
- numerical validation
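a rough sketch of what a mechanism record with those pieces could look like. the stage names come from the Fable-style chain, but the `Mechanism` class, its field names, and `missing_stages` are all hypothetical and not RUMI's actual code.

```python
from dataclasses import dataclass, field


@dataclass
class Mechanism:
    # field names are hypothetical; the stages mirror the longer chain in this devlog
    hidden_variable: str
    description: str
    equation: str = ""                                     # e.g. a LaTeX string
    parameters: dict = field(default_factory=dict)         # name -> extracted value
    derivation: list = field(default_factory=list)         # ordered derivation steps
    numerical_checks: list = field(default_factory=list)   # (label, passed) pairs
    prediction: str = ""

    def missing_stages(self):
        """List the quantitative stages that are still empty."""
        stages = {
            "equation": self.equation,
            "parameter extraction": self.parameters,
            "derivation": self.derivation,
            "numerical validation": self.numerical_checks,
            "prediction": self.prediction,
        }
        return [name for name, value in stages.items() if not value]


# a description-only mechanism, like the shorter chain most models produce
m = Mechanism(
    hidden_variable="<hidden variable>",
    description="<mechanism text>",
    prediction="<prediction>",
)
print(m.missing_stages())
# → ['equation', 'parameter extraction', 'derivation', 'numerical validation']
```

a check like `missing_stages` would make it easy to see when a model stops at the description and skips the math.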
🥀 Switching Back To MiMo
after Fable ran out of credits i moved everything back to MiMo and started testing the updated architecture.
still testing, cause MiMo is hella slow bruh. normally RUMI takes 30-40 mins, but with MiMo she’s taking around 2 hr 30 mins 💔
Comments 1
dont mind the trash writing and formatting of the devlog 💔✌ im so fried rn 🥀