Devlog by @subhansh

@subhansh on rumi · about 2 months ago

2h 17m 35s logged

spent basically the whole day inside RUMI’s discovery engine and ngl it was pain 💔

fixed 50+ bugs across the pipeline. some of the biggest ones were:

empty hypotheses
theory tournament crashing
unicode errors on windows
graph contamination
novelty scoring issues
mechanism validation deleting everything 😭
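
side note on the windows unicode one: here's a minimal sketch of the usual cause and fix, assuming a Python pipeline (this devlog doesn't say what RUMI is written in, and `save_report` / `load_report` / `make_console_utf8` are made-up names, not RUMI's code). on Windows, `open()` without an `encoding` can fall back to the locale code page (often cp1252), which can't encode characters like `→` or `β`, so the write dies with `UnicodeEncodeError`. passing the encoding explicitly sidesteps that:

```python
import sys
import tempfile
from pathlib import Path


def save_report(path: Path, text: str) -> None:
    """Write text as UTF-8 regardless of the OS default code page."""
    # Explicit encoding: without it, Windows may pick cp1252 and raise
    # UnicodeEncodeError on characters outside that code page.
    path.write_text(text, encoding="utf-8")


def load_report(path: Path) -> str:
    """Read the file back with the same explicit encoding."""
    return path.read_text(encoding="utf-8")


def make_console_utf8() -> None:
    """Best-effort: make print() safe for non-ASCII output on Windows consoles."""
    # reconfigure() exists on text streams in Python 3.7+;
    # skip quietly if stdout was swapped for something without it.
    if hasattr(sys.stdout, "reconfigure"):
        sys.stdout.reconfigure(encoding="utf-8", errors="replace")


# Made-up sample text containing characters that cp1252 cannot encode.
sample = "magnetar → FRB, α/β test ✓"
with tempfile.TemporaryDirectory() as tmp:
    report = Path(tmp) / "report.txt"
    save_report(report, sample)
    roundtrip = load_report(report)
```

calling something like `make_console_utf8()` once at startup would cover the `print()` side of the same problem.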

also built a few new systems:

math verification engine
counterfactual/“what if” hypothesis generation
entity enrichment (NASA, PubChem, UniProt, PDB, etc.)
better novelty scoring
tournament winner override

before the fixes, some discoveries were scoring:

0/100
40/100
45/100

after everything:

FRB Magnetar: 66/100 (B)
KRAS G12D: 65/100 (B)

the coolest run today was on Fast Radio Bursts and magnetars.

RUMI went through 69 papers, generated new hidden variables, built mechanisms around them, made testable predictions, then tried to destroy its own ideas through adversarial testing.
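
roughly, that flow could be sketched like this. this is NOT RUMI's actual code, just a toy outline assuming a Python pipeline; every name here (`Hypothesis`, `run_discovery`, and the stage callables `propose`, `build_mechanism`, `predict`, `attack`) is hypothetical. the one idea it shows is: generate first, drop anything that came out empty, then let an adversarial pass knock out whatever doesn't survive:

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Hypothesis:
    hidden_variable: str   # proposed unobserved factor
    mechanism: str = ""    # how it would produce the effect
    predictions: List[str] = field(default_factory=list)  # testable claims


def run_discovery(
    papers: List[str],
    propose: Callable[[List[str]], List[str]],
    build_mechanism: Callable[[str], str],
    predict: Callable[[Hypothesis], List[str]],
    attack: Callable[[Hypothesis], bool],
) -> List[Hypothesis]:
    """Generate hypotheses from papers, then keep only those that survive attack()."""
    candidates = []
    for variable in propose(papers):
        h = Hypothesis(hidden_variable=variable)
        h.mechanism = build_mechanism(variable)
        h.predictions = predict(h)
        # Drop empty hypotheses early instead of scoring them later.
        if h.mechanism and h.predictions:
            candidates.append(h)
    # Adversarial pass: attack() returns True when it refutes the hypothesis.
    return [h for h in candidates if not attack(h)]


# Toy stand-ins for the real stages (hypothetical, for illustration only).
survivors = run_discovery(
    papers=["paper A", "paper B"],
    propose=lambda ps: ["var_x", "var_y", "var_z"],
    build_mechanism=lambda v: "" if v == "var_z" else f"{v} drives the signal",
    predict=lambda h: (
        [f"if {h.hidden_variable} changes, the signal changes"] if h.mechanism else []
    ),
    attack=lambda h: h.hidden_variable == "var_y",
)
```

in this toy run `var_z` gets dropped for having no mechanism, `var_y` gets refuted by the attack, and only `var_x` survives.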

finally it actually felt like the pipeline was working end-to-end instead of randomly exploding lol 😭

still got a long way to go though. biggest weaknesses right now are novelty and mathematical rigor. RUMI is getting better at explaining science, but now I need to make her better at finding genuinely new stuff.

back to debugging tomorrow frfr 💔🥀😭