Devlog by @ArjunCodess - Stardance

@ArjunCodess on MINTS · about 2 months ago

58m 16s logged

showed model performance before the interpretability analysis.
separated task performance from probe results.

combined all interpretability analyses into one pipeline.
made CTCF the main case study.
made the overall framing clearer.

explained every analysis threshold.
validated thresholds with sensitivity tests and control experiments.
showed that the main results stay consistent across different thresholds.
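
a rough sketch of what a threshold sensitivity test with a shuffled control can look like. this is not the actual MINTS pipeline: the function names (`overlap_fraction`, `sensitivity_sweep`), the synthetic scores, and the motif window are all hypothetical, chosen only to illustrate the idea of checking that a result holds across thresholds and disappears under a control.

```python
import random


def overlap_fraction(scores, motif_mask, threshold):
    """Fraction of positions scoring >= threshold that fall inside the motif."""
    selected = [m for s, m in zip(scores, motif_mask) if s >= threshold]
    return sum(selected) / len(selected) if selected else 0.0


def sensitivity_sweep(scores, motif_mask, thresholds, n_shuffles=200, seed=0):
    """For each threshold, compare the observed overlap to a shuffled-label control."""
    rng = random.Random(seed)
    rows = []
    for t in thresholds:
        observed = overlap_fraction(scores, motif_mask, t)
        # Control: shuffle motif labels to break any real score/motif association.
        shuffled = list(motif_mask)
        control = []
        for _ in range(n_shuffles):
            rng.shuffle(shuffled)
            control.append(overlap_fraction(scores, shuffled, t))
        rows.append({
            "threshold": t,
            "observed": observed,
            "control_mean": sum(control) / len(control),
        })
    return rows


# Synthetic example (hypothetical data): 100 positions, with a 20-position
# "motif" window that receives high attribution scores.
motif_mask = [1 if 40 <= i < 60 else 0 for i in range(100)]
scores = [0.9 if m else 0.1 for m in motif_mask]

for row in sensitivity_sweep(scores, motif_mask, thresholds=[0.3, 0.5, 0.7]):
    print(row)
```

if the observed overlap stays well above the control mean at every threshold, the finding is less likely to be an artifact of one particular cutoff.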
