
20m 14s logged

penumbra-markdown: the parser arc

so I was supposed to work on other stuff today.

I did not work on other stuff today.

instead I wrote an entire markdown pipeline from scratch. parser, AST, HTML renderer, plain text extractor, streaming layer for live editing, and 639 lines of tests. because apparently “notes app needs notes parsing” was more urgent than “notes app needs to save notes.”

No I did not do this in 15 minutes. I spent much more time on it. I’m just, like I said, one commit behind (no longer!!!)


the custom syntax

the whole point of this crate is two custom inline types that make Penumbra’s markdown different from regular markdown:

NoteEmbed: [[some-note-id]]. double-bracket references to other notes on the canvas. the parser tracks bracket depth so nested brackets don’t break things.

TagRef: #tag-name. inline tags. the parser distinguishes #tag from # heading by checking if there’s a space after the hash and whether the preceding character is a word boundary.
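
a rough sketch of how those two detection rules could look. this is not the crate's actual code: `find_embed` and `tag_at` are hypothetical names, and the exact rules (what counts as a tag character, how nesting is counted) are my assumptions.

```rust
/// Hypothetical helper: find the first `[[...]]` embed in `text`.
/// Returns (start, end) byte offsets plus the id between the brackets.
/// Assumption: "bracket depth" counts single `[` / `]` characters.
fn find_embed(text: &str) -> Option<(usize, usize, &str)> {
    let start = text.find("[[")?;
    let mut depth = 0usize;
    for (offset, &byte) in text.as_bytes()[start..].iter().enumerate() {
        match byte {
            b'[' => depth += 1,
            b']' => {
                depth -= 1;
                if depth == 0 {
                    let end = start + offset + 1;
                    // The embed has to close with `]]`, not a lone `]`.
                    if text.as_bytes()[end - 2] != b']' {
                        return None;
                    }
                    return Some((start, end, &text[start + 2..end - 2]));
                }
            }
            _ => {}
        }
    }
    None // brackets never balanced
}

/// Hypothetical helper: if a tag starts at byte offset `hash`, return its name.
/// Assumption: tag names are alphanumerics, `-` and `_`.
fn tag_at(text: &str, hash: usize) -> Option<&str> {
    let rest = text.get(hash..)?.strip_prefix('#')?;
    // The character before the hash must not be part of a word (or another hash).
    if let Some(prev) = text[..hash].chars().next_back() {
        if prev.is_alphanumeric() || prev == '_' || prev == '#' {
            return None;
        }
    }
    // `# heading` has a space right after the hash, so the name is empty.
    let len = rest
        .find(|c: char| !(c.is_alphanumeric() || c == '-' || c == '_'))
        .unwrap_or(rest.len());
    if len == 0 { None } else { Some(&rest[..len]) }
}
```

with these rules `[[a[1]]]` yields the id `a[1]`, `# heading` is not a tag, and the `#bar` in `foo#bar` is ignored because the hash sits in the middle of a word.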

these get parsed as structured AST nodes, not raw text. when the HTML renderer hits them, they become custom elements: <pe-embed data-ref="id"> and <pe-tag data-name="name">. the Dioxus UI will intercept these and make them interactive (click to pan, click to filter).
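
a minimal sketch of that rendering step, assuming a tiny `Inline` enum and an `escape_html` helper (both hypothetical, the real AST has far more variants). the element and attribute names come from above; emitting an explicit closing tag and escaping the attribute value are my assumptions.

```rust
/// Hypothetical, trimmed-down inline AST: just text plus the two custom nodes.
enum Inline {
    Text(String),
    NoteEmbed { id: String },
    TagRef { name: String },
}

/// Escape the characters that are unsafe in HTML text and attribute values.
fn escape_html(input: &str) -> String {
    let mut out = String::with_capacity(input.len());
    for c in input.chars() {
        match c {
            '&' => out.push_str("&amp;"),
            '<' => out.push_str("&lt;"),
            '>' => out.push_str("&gt;"),
            '"' => out.push_str("&quot;"),
            '\'' => out.push_str("&#39;"),
            _ => out.push(c),
        }
    }
    out
}

/// Render one inline node; custom nodes become custom elements.
fn render_inline(node: &Inline) -> String {
    match node {
        Inline::Text(text) => escape_html(text),
        Inline::NoteEmbed { id } => {
            format!("<pe-embed data-ref=\"{}\"></pe-embed>", escape_html(id))
        }
        Inline::TagRef { name } => {
            format!("<pe-tag data-name=\"{}\"></pe-tag>", escape_html(name))
        }
    }
}
```

keeping the id and the name in `data-` attributes means the UI layer can read them back without re-parsing any text.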

I know they’re technically also Obsidian things, but smh my head, I did it too


the parser

612 lines wrapping pulldown-cmark. every text chunk goes through split_text_for_custom() for embed/tag extraction before hitting the inline stack. uses a frame stack with saved positions for nesting (strong, emphasis, links, etc). tables required a state machine. tight list items needed special flush handling. the usual “markdown is simple until it isn’t” experience.
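
one way to read "frame stack with saved positions", as a sketch: when a container opens, remember how long the output list is; when it closes, everything pushed since then becomes its children. the `Event`, `Kind` and `Inline` types here are simplified stand-ins I made up for illustration, not pulldown-cmark's event types and not the crate's real AST.

```rust
#[derive(Debug, PartialEq)]
enum Inline {
    Text(String),
    Strong(Vec<Inline>),
    Emphasis(Vec<Inline>),
}

#[derive(Clone, Copy)]
enum Kind {
    Strong,
    Emphasis,
}

/// Simplified stand-in for a parser event stream.
enum Event<'a> {
    Start(Kind),
    End,
    Text(&'a str),
}

fn build_inlines(events: &[Event<'_>]) -> Vec<Inline> {
    let mut out: Vec<Inline> = Vec::new();
    // Each frame stores the container kind and the length of `out` when it opened.
    let mut frames: Vec<(Kind, usize)> = Vec::new();
    for event in events {
        match event {
            Event::Start(kind) => frames.push((*kind, out.len())),
            Event::Text(text) => out.push(Inline::Text(text.to_string())),
            Event::End => {
                // Unbalanced End events are ignored in this sketch.
                if let Some((kind, start)) = frames.pop() {
                    // Everything pushed since the saved position is a child.
                    let children: Vec<Inline> = out.drain(start..).collect();
                    out.push(match kind {
                        Kind::Strong => Inline::Strong(children),
                        Kind::Emphasis => Inline::Emphasis(children),
                    });
                }
            }
        }
    }
    out
}
```

nesting falls out for free: an inner container is wrapped up before the outer one closes, so it gets drained into the outer one's children as a single node.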

The image shows my excellent idea for inline stuff: remember how println!() works with vars? Okay, no, this is a pretty common thing now that I think about it… sigh. It’s the thought that counts


streaming

wraps mdstream for live editing. instead of re-parsing the whole document on every keystroke, it tracks which blocks changed and only re-parses those. append(), finalize(), snapshot(), reset(). the common case (typing at the end) is fast.
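
to illustrate the idea without guessing at mdstream's API, here is a std-only toy version. `BlockStream` is a hypothetical type, and it cheats by treating a blank line as the only block boundary (real markdown needs more care, e.g. blank lines inside fenced code). only the four method names come from the list above.

```rust
/// Toy append-only stream: finished blocks are committed once and never
/// touched again; only the unfinished tail would need re-parsing.
#[derive(Default)]
struct BlockStream {
    committed: Vec<String>,
    pending: String,
}

impl BlockStream {
    /// Add text at the end; returns how many blocks were newly committed.
    fn append(&mut self, chunk: &str) -> usize {
        self.pending.push_str(chunk);
        let mut newly_committed = 0;
        while let Some(pos) = self.pending.find("\n\n") {
            // Cut everything up to and including the blank line out of the tail.
            let block: String = self.pending.drain(..pos + 2).collect();
            let block = block.trim();
            if !block.is_empty() {
                self.committed.push(block.to_string());
                newly_committed += 1;
            }
        }
        newly_committed
    }

    /// Current view: committed blocks plus the still-open tail, if any.
    fn snapshot(&self) -> Vec<&str> {
        let mut blocks: Vec<&str> = self.committed.iter().map(String::as_str).collect();
        let tail = self.pending.trim();
        if !tail.is_empty() {
            blocks.push(tail);
        }
        blocks
    }

    /// End of input: the tail becomes a committed block.
    fn finalize(&mut self) -> Vec<String> {
        let tail = self.pending.trim().to_string();
        if !tail.is_empty() {
            self.committed.push(tail);
        }
        self.pending.clear();
        self.committed.clone()
    }

    /// Drop all state, e.g. when a different note is opened.
    fn reset(&mut self) {
        self.committed.clear();
        self.pending.clear();
    }
}
```

typing at the end only ever grows `pending`, so the work per keystroke depends on the size of the last block, not the size of the document.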


the quadtree fix

also fixed a subtle bug in Barnes-Hut. the force calculation was clamping distance AFTER the sqrt but using unclamped dist_sq in the same formula. moved the .max(1.0) to dist_sq so both values stay consistent. the old version could produce weirdly large forces for very close nodes.
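
the before and after, as a sketch. the force law here (`strength / dist_sq` along the unit direction) and the function names are assumptions for illustration; only the placement of `.max(1.0)` is taken from the description above.

```rust
/// Old behaviour: `dist` is clamped, but the magnitude still divides by the
/// raw `dist_sq`, which can be tiny for near-overlapping nodes.
fn repulsion_old(dx: f64, dy: f64, strength: f64) -> (f64, f64) {
    let dist_sq = dx * dx + dy * dy;
    let dist = dist_sq.sqrt().max(1.0); // clamp AFTER the sqrt
    let magnitude = strength / dist_sq; // unclamped: blows up near zero
    (magnitude * dx / dist, magnitude * dy / dist)
}

/// Fixed behaviour: clamp `dist_sq` first so `dist` is derived from the
/// same clamped value.
fn repulsion_fixed(dx: f64, dy: f64, strength: f64) -> (f64, f64) {
    let dist_sq = (dx * dx + dy * dy).max(1.0);
    let dist = dist_sq.sqrt();
    let magnitude = strength / dist_sq;
    (magnitude * dx / dist, magnitude * dy / dist)
}
```

for two nodes 0.01 apart with `strength = 1.0`, the old version gives a force of roughly 100 along x while the fixed one gives 0.01. for nodes further apart than 1.0 the two agree.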

Yeah, this is why I’m ripping it out and replacing it


tests and sanitization

boring stuff, just writing the same thing with some tiny changes a lot of times


ten crates now. TEN CRATES??!?!!?
