Devlog by @subhansh

@subhansh on rumi · about 2 months ago

15h 14m 19s logged

rumi - Devlog #6
June 11, 2026

so yea… pulled an all nighter for this one 🥀

most of the work this time wasn’t adding entirely new systems.

it was stress testing RUMI hard enough to find where the architecture starts breaking.

and honestly i found a lot more than i expected.

Track B Architecture Improvements

spent a lot of time refining Track B after the initial curiosity pipeline integration.

instead of just generating curiosity questions, it now runs through much more of the actual discovery stack.

lots of internal changes here that don’t really show up visually but massively affect the quality of the final reports.

💔 Constraint Pipeline Fix

found a pretty nasty issue where parts of the generated curiosity constraints weren’t making it all the way through the pipeline.

after fixing it, the difference was immediately obvious.
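for anyone curious what this kind of bug looks like, here's a minimal sketch of a guard that catches constraints getting dropped between stages. everything in it is an assumption for illustration: the dict-shaped payload, the "constraints" key, and the stage names are hypothetical, not RUMI's actual internals.

```python
from typing import Callable

# hypothetical payload shape: a plain dict with a "constraints" list
Payload = dict
Stage = Callable[[Payload], Payload]


def run_with_constraint_check(
    payload: Payload, stages: list[tuple[str, Stage]]
) -> tuple[Payload, dict[str, list[str]]]:
    """Run stages in order and record any constraint a stage drops."""
    dropped: dict[str, list[str]] = {}
    for name, stage in stages:
        before = set(payload.get("constraints", []))
        payload = stage(payload)
        after = set(payload.get("constraints", []))
        lost = before - after
        if lost:
            dropped[name] = sorted(lost)
    return payload, dropped


def generate_questions(p: Payload) -> Payload:
    # well-behaved stage: copies everything and adds its own output
    return {**p, "questions": ["q1"]}


def build_hypotheses(p: Payload) -> Payload:
    # buggy on purpose: rebuilds the payload and keeps only the first constraint
    return {"questions": p["questions"], "constraints": p["constraints"][:1]}


final, dropped = run_with_constraint_check(
    {"constraints": ["c1", "c2", "c3"]},
    [("generate_questions", generate_questions),
     ("build_hypotheses", build_hypotheses)],
)
print(dropped)  # {'build_hypotheses': ['c2', 'c3']}
```

the point of the guard is that a stage silently rebuilding the payload gets named in the report instead of quietly degrading the final output.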

latest dual-track run:

Track A Unique Theories: 9
Track B Unique Theories: 8
Shared Theories: 0

which is honestly one of the strongest signals i’ve seen so far that the dual-track system is actually working as intended.

both tracks are now exploring completely different hypothesis spaces instead of converging on the same ideas.
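a rough sketch of how counts like these could be computed, assuming each track returns a list of theory titles. the function names and the sample titles are placeholders, not real RUMI output.

```python
def normalize(title: str) -> str:
    """Lowercase and collapse whitespace so trivially different titles match."""
    return " ".join(title.lower().split())


def compare_tracks(track_a: list[str], track_b: list[str]) -> dict[str, int]:
    """Count theories unique to each track and theories both tracks produced."""
    a = {normalize(t) for t in track_a}
    b = {normalize(t) for t in track_b}
    return {
        "track_a_unique": len(a - b),
        "track_b_unique": len(b - a),
        "shared": len(a & b),
    }


# placeholder titles only
result = compare_tracks(
    ["Theory A1", "Theory A2", "shared idea"],
    ["theory b1", "Shared  Idea"],
)
print(result)  # {'track_a_unique': 2, 'track_b_unique': 1, 'shared': 1}
```

one caveat with this approach: string matching only catches theories with near-identical titles. two tracks could describe the same idea in different words and still count as unique, so a semantic comparison would be a stricter test of divergence.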

Claude Fable 5 Experiment

managed to get temporary access to Claude Fable 5 and immediately decided to throw RUMI at it.

before i could even test it though…

the integration broke 💔

spent a while fixing provider compatibility issues and getting everything working again.

eventually got a discovery run started on:

What happens to information when it crosses a black hole event horizon?

Fable managed to reach roughly Phase 8.5 before the free trial looked at RUMI’s request count and basically said:

aight imma head out 😭

What Fable Revealed

this ended up being way more valuable than i expected.

while comparing Fable’s outputs with the models i normally run, i noticed a massive difference.

most models tend to generate:

Hidden Variable
↓
Mechanism Description
↓
Prediction

Fable was generating:

Hidden Variable
↓
Mechanism
↓
Equation
↓
Parameter Extraction
↓
Derivation
↓
Numerical Validation
↓
Prediction

actual equations.

actual variables.

actual derivation chains.

actual numerical checks.

🥀 Switching Back To MiMo

after the Fable credits died, i moved everything back to MiMo and started comparing outputs.

and that’s when another weakness became obvious.

the pipeline itself wasn’t failing.

MiMo simply wasn’t producing the mathematical depth needed for the mechanism stage.

which means the current bottleneck isn’t discovery anymore.

it’s mathematical formalization.

Mathematical Formalization Architecture

the Fable comparison above is what this part is built around: most models go hidden variable → mechanism description → prediction, while Fable put an equation, parameter extraction, a derivation, and numerical validation between the mechanism and the prediction.

and honestly that immediately exposed one of RUMI’s biggest weaknesses.

theories and mechanisms were already being generated.

the math wasn’t.

so instead of just complaining about it, i started redesigning the mechanism pipeline around that pattern.

RUMI now has the foundations for a much more quantitative discovery process, which is lwk tuff ig 🥀✌

mechanisms aren't just descriptions anymore. they're meant to come with:

equations
variables
derivations
parameter extraction
numerical validation
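a minimal sketch of what a mechanism record with those pieces could look like, plus a check for which ones a model left empty. the class name, field names, and sample values are all hypothetical and not taken from RUMI's code.

```python
from dataclasses import dataclass, field


@dataclass
class Mechanism:
    """Hypothetical record: a mechanism plus its quantitative parts."""
    hidden_variable: str
    description: str
    equations: list[str] = field(default_factory=list)
    variables: dict[str, str] = field(default_factory=dict)
    derivation: list[str] = field(default_factory=list)
    parameters: dict[str, float] = field(default_factory=dict)
    numerical_checks: list[str] = field(default_factory=list)
    prediction: str = ""

    def missing_stages(self) -> list[str]:
        """Return the quantitative fields that are still empty."""
        required = [
            "equations", "variables", "derivation",
            "parameters", "numerical_checks", "prediction",
        ]
        return [name for name in required if not getattr(self, name)]


# a description-only mechanism, the kind most models produce
descriptive_only = Mechanism(
    hidden_variable="x",
    description="words only",
    prediction="some prediction",
)
print(descriptive_only.missing_stages())
# ['equations', 'variables', 'derivation', 'parameters', 'numerical_checks']
```

a check like this makes the gap measurable: a model that only writes a description and a prediction fails on five fields, instead of the weakness only showing up when reading the final report.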

Switching Back To MiMo 🥀

after Fable ran out of credits i moved everything back to MiMo and started testing the updated architecture.

still at it, cause MiMo is hella slow bruh. normally RUMI takes up to 30-40 mins, but with MiMo she's taking around 2 hr 30 mins 💔