Penumbra

  • 11 Devlogs
  • 13 Total hours

Penumbra is a spatial notes app where your ideas live on a giant map instead of a list. When you finish writing a note and hit Esc to go to the main screen, the card drifts across the canvas and snaps next to its related notes using local embeddings, vector search, and a tiny physics engine. It feels like the map is thinking with you.

2h 40m 58s logged

the polish commit (that didn’t polish enough)

okay so this commit looks massive and impressive on paper. I touched basically everything. and the code changes are real. but I need to be honest: the app still looks and feels terrible.

like, genuinely bad. the UI is broken in ways that make me want to close the window and pretend I never opened it.

but the changes are real so let me document them anyway.


layout tuning

repulsion went from 1000 to 8000, attraction from 0.01 to 0.004, ideal_length from 50 to 220. notes were clumping into an unreadable blob. the physics are better now even if you can’t tell because the rendering is fighting itself.
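for a rough sense of what those three knobs do, here is a minimal sketch of a pairwise force in a generic force-directed layout. the formula (inverse-square repulsion plus a linear spring toward ideal_length) and the names LayoutParams and pair_force are assumptions for illustration, not Penumbra's actual engine. only the three parameter names and their old/new values come from the post.

```rust
/// Hypothetical parameter bundle. Only the three field names and their
/// values come from the devlog; the struct itself is illustrative.
#[derive(Clone, Copy)]
pub struct LayoutParams {
    pub repulsion: f32,
    pub attraction: f32,
    pub ideal_length: f32,
}

pub const OLD: LayoutParams = LayoutParams { repulsion: 1000.0, attraction: 0.01, ideal_length: 50.0 };
pub const NEW: LayoutParams = LayoutParams { repulsion: 8000.0, attraction: 0.004, ideal_length: 220.0 };

/// Signed force along the line between two linked notes at distance `d`.
/// Positive pushes them apart, negative pulls them together.
/// (Assumed model: inverse-square repulsion minus a linear spring.)
pub fn pair_force(p: LayoutParams, d: f32) -> f32 {
    let d = d.max(1.0); // avoid dividing by zero when cards overlap
    let repel = p.repulsion / (d * d);
    let spring = p.attraction * (d - p.ideal_length);
    repel - spring
}
```

under this assumed model, two linked notes 100 units apart get pulled together with the old values and pushed apart with the new ones, which is consistent with the "less blob" result.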


theme overhaul

flattened the entire theme system into one struct with semantic names. every hardcoded color replaced with CSS vars. dark and light themes. toggle in settings. this part actually works correctly and I’m proud of it even though nobody can see it because the layout is broken.
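a minimal sketch of what "one struct with semantic names, rendered to CSS vars" could look like. the field names, the variable names, and every color value here are made up for illustration; only the idea of a to_css_vars() method that emits CSS custom properties comes from the devlogs.

```rust
/// Hypothetical flattened theme. Field names and colors are illustrative.
pub struct Theme {
    pub background: &'static str,
    pub surface: &'static str,
    pub text: &'static str,
    pub accent: &'static str,
}

pub const DARK: Theme = Theme {
    background: "#111014",
    surface: "#1c1a22",
    text: "#ece9f5",
    accent: "#9b6dff",
};

pub const LIGHT: Theme = Theme {
    background: "#faf9fc",
    surface: "#ffffff",
    text: "#1b1922",
    accent: "#7a4de0",
};

impl Theme {
    /// Render the theme as a string of CSS custom property declarations,
    /// suitable for a `style` attribute on the root element.
    pub fn to_css_vars(&self) -> String {
        format!(
            "--bg: {}; --surface: {}; --text: {}; --accent: {};",
            self.background, self.surface, self.text, self.accent
        )
    }
}
```

components then only reference var(--accent) and friends, so switching between dark and light is just swapping which struct gets rendered.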


TipTap editor

replaced the textarea with TipTap via JS eval. StarterKit with bold/italic/strike/code/headings/lists. toolbar with format buttons. sends HTML back via dioxus.send(). this is genuinely nice when it works. which is actually most of the time wow.


manual linking, note list, auto-linking on save, context menu, keyboard shortcuts, zoom controls, empty state

all implemented. all technically functional. all look like they were designed by someone who has never seen a GUI before. (me. I’m that someone. I am not good at RSX, which is funny because it SHOULD just be a different form of HTML. you would think.)


what I learned

dioxus-desktop’s webview is not a browser with devtools. iterating on UI by recompiling a Rust binary every time you move a div 2 pixels is actual torture. I should have designed the UI in plain HTML first, gotten it looking right in a real browser with real devtools, and THEN ported it to rsx.

I did not do that. I have been doing it backwards this entire time. smh my silly head.

next step: HTML mockup first, then port. the backend is solid. the architecture is clean. the UI just needs to catch up.

still broken but less broken than before. progress? maybe?

onwards (to HTML mockups) :}

1h 11m 21s logged

making it actually work (kind of)

so remember how last commit I said “everything is broken”? I spent tonight fixing the worst of it. it’s not done but it’s a lot less broken.


the big fix: drag + layout fighting

the core problem was that when you drag a note, the layout engine keeps running and overwriting your drag position every 16ms. the note jitters back and forth between where you’re dragging it and where the physics wants it.

fix: new SetNodePosition event. when you drag a note, the UI publishes it through the event bus. the layout worker drains these events at the top of each cycle and calls engine.set_position() before stepping. so the layout engine knows where you put the card and computes forces from there instead of fighting you.

also: while any note is being dragged (tracked by dragged_set), the layout event loop skips updating positions entirely. other notes stay still instead of jittering from stale force calculations. once you release, everything syncs up again.
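the drain-then-step order can be sketched like this. it is a simplified stand-in, not the real worker: the event bus is replaced with a std mpsc channel, Engine is a toy with a no-op physics step, and layout_cycle plus all field layouts are hypothetical. only SetNodePosition, set_position() and dragged_set are names from the post.

```rust
use std::collections::{HashMap, HashSet};
use std::sync::mpsc::Receiver;

/// Hypothetical event payload; only the name comes from the devlog.
pub struct SetNodePosition {
    pub id: u32,
    pub x: f32,
    pub y: f32,
}

/// Toy stand-in for the layout engine.
#[derive(Default)]
pub struct Engine {
    pub positions: HashMap<u32, (f32, f32)>,
    pub steps: u64,
}

impl Engine {
    pub fn set_position(&mut self, id: u32, x: f32, y: f32) {
        self.positions.insert(id, (x, y));
    }

    /// Placeholder for the physics step: only counts invocations.
    pub fn step(&mut self) {
        self.steps += 1;
    }
}

/// One layout cycle: apply pending drag events first, then step.
/// Returns positions to publish, or None while any note is being dragged.
pub fn layout_cycle(
    engine: &mut Engine,
    events: &Receiver<SetNodePosition>,
    dragged_set: &HashSet<u32>,
) -> Option<HashMap<u32, (f32, f32)>> {
    // Drain without blocking so the cycle never waits on the UI.
    for ev in events.try_iter() {
        engine.set_position(ev.id, ev.x, ev.y);
    }
    engine.step();
    if dragged_set.is_empty() {
        Some(engine.positions.clone())
    } else {
        None
    }
}
```

the point of the ordering is that the engine always steps from the user's latest position, so its output can never be older than the drag.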


mouse events moved to root div

pan/drag/mouseup handlers were on the canvas element, which meant releasing the mouse over a note card didn’t fire mouseup on the canvas. dragging would “stick.” moved all the move/up/leave handlers to the root div so they fire regardless of what’s under the cursor.


canvas renderer simplified

removed the node drawing code from WebCanvasRenderer entirely. nodes were being drawn twice: once on the canvas in Rust/web-sys, once as DOM cards in Dioxus. the canvas now only draws the dot grid and bezier edges. the DOM handles cards. no more double rendering.

the inline JS RAF loop was also replaced with a standalone canvas-draw.js that gets called explicitly via window.__penumbra_draw() whenever render state changes. no more requestAnimationFrame spinning at 60fps when nothing changed.


spring-animated card positions

cards now spring-animate to their new positions instead of teleporting. each AnimatedCard tracks target x/y and animates via dioxus-motion springs when the target changes. so when the layout engine repositions notes, they glide smoothly. the spring from the ideas issue, basically.
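for intuition, a damped spring is only a few lines. this is a generic semi-implicit Euler integrator, not dioxus-motion's API; the Spring struct and the stiffness/damping values used in the note below are assumptions.

```rust
/// Generic damped spring for one coordinate (illustrative, not dioxus-motion).
pub struct Spring {
    pub value: f32,
    pub velocity: f32,
    pub target: f32,
}

impl Spring {
    /// Advance the spring by `dt` seconds using semi-implicit Euler:
    /// update velocity from the current displacement, then move the value.
    pub fn step(&mut self, stiffness: f32, damping: f32, dt: f32) {
        let accel = stiffness * (self.target - self.value) - damping * self.velocity;
        self.velocity += accel * dt;
        self.value += self.velocity * dt;
    }
}
```

when the layout engine moves a note, only target changes; value keeps its current position and velocity, so the card eases over instead of teleporting. with stiffness 170 and damping 26 at 60 steps per second the motion is close to critically damped, so it settles without much overshoot.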


sidebar panels are real now

the floating sidebar buttons actually do things now. four panels:

Search: reactive search input that queries the hybrid search engine as you type. results show title, preview, and similarity score. click a result and the camera drifts to that note.

Tags: shows all tags sorted by count. click a tag to filter the graph view to only notes with that tag. click again to clear the filter. below the tag list, shows the filtered notes.

Pins: lists all pinned notes. click to pan to them.

Settings: just shows note count and link count for now. placeholder.


context menu on right-click

right-click a note card and you get a context menu with “Open in editor,” “Pin to canvas” / “Unpin,” and “Delete note.” the pin toggle calls a new toggle_pin method on AppState that flips the meta flag, persists, and publishes an event.


note cards render markdown

NoteCard preview now runs through markdown_to_html() and uses dangerous_inner_html. (is that an XSS hole? yes.) the editor got a preview toggle button that switches between the textarea and rendered HTML.


real embeddings on init

switched from SimpleEmbedder to CandleEmbedder::load() at startup. downloads the real Snowflake model from HuggingFace on first launch. falls back to SimpleEmbedder if the download fails.
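the fallback pattern, sketched with stand-in types. the real CandleEmbedder::load() signature isn't shown in the post, so the `online` flag, the Embedder trait shape, and init_embedder are all hypothetical.

```rust
/// Hypothetical minimal trait; the real one presumably embeds text.
pub trait Embedder {
    fn name(&self) -> &'static str;
}

pub struct SimpleEmbedder;
pub struct CandleEmbedder;

impl Embedder for SimpleEmbedder {
    fn name(&self) -> &'static str { "simple" }
}

impl Embedder for CandleEmbedder {
    fn name(&self) -> &'static str { "candle" }
}

impl CandleEmbedder {
    /// Stand-in for the real loader. `online` fakes whether the
    /// model download succeeds.
    pub fn load(online: bool) -> Result<Self, String> {
        if online {
            Ok(CandleEmbedder)
        } else {
            Err("model download failed".to_string())
        }
    }
}

/// Try the real model first; fall back to the simple embedder on any error.
pub fn init_embedder(online: bool) -> Box<dyn Embedder> {
    match CandleEmbedder::load(online) {
        Ok(embedder) => Box::new(embedder),
        Err(err) => {
            eprintln!("falling back to SimpleEmbedder: {}", err);
            Box::new(SimpleEmbedder)
        }
    }
}
```

returning a trait object means the rest of the app never needs to know which embedder it ended up with.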


position persistence

the layout event loop now debounce-saves positions to storage every 2 seconds. on next launch, saved positions are fed into the layout engine before the first step so notes start where you left them instead of random positions.
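one way to read "debounce-saves every 2 seconds" is a dirty flag plus a minimum interval between writes. a minimal sketch under that assumption; DebouncedSaver and its methods are hypothetical names, and time is passed in so the logic is easy to test.

```rust
use std::time::{Duration, Instant};

/// Hypothetical helper: allows at most one save per `interval`,
/// and only when something actually changed.
pub struct DebouncedSaver {
    interval: Duration,
    last_save: Option<Instant>,
    dirty: bool,
}

impl DebouncedSaver {
    pub fn new(interval: Duration) -> Self {
        DebouncedSaver { interval, last_save: None, dirty: false }
    }

    /// Call whenever positions change.
    pub fn mark_dirty(&mut self) {
        self.dirty = true;
    }

    /// Returns true when the caller should write positions to storage now.
    pub fn should_save(&mut self, now: Instant) -> bool {
        if !self.dirty {
            return false;
        }
        let due = match self.last_save {
            None => true,
            Some(prev) => now.duration_since(prev) >= self.interval,
        };
        if due {
            self.last_save = Some(now);
            self.dirty = false;
        }
        due
    }
}
```

the layout loop would call mark_dirty() on every position update and should_save(Instant::now()) once per cycle, so a burst of movement costs one write instead of sixty per second.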


other stuff done as well but i’m eepyyyy

3h 24m 54s logged

the Dioxus commit (it’s broken but it exists)

okay so.

it does not work properly. I want to be upfront about that. it compiles, it launches, things appear on screen, but the interaction is janky, the canvas rendering fights with the DOM cards, and the layout worker doesn’t sync positions correctly yet. it’s a WIP commit and I’m committing it anyway because there’s genuinely a lot of infrastructure here and I don’t want to lose it.


three new crates

penumbra-theme: dark and light themes defined as Rust structs. colors, radii, glass blur config. to_css_vars() generates the full CSS custom properties string so the theme can be injected into the DOM at runtime. purple accent because penumbra means shadow and shadows are purple, obviously.

penumbra-canvas: a GraphCanvasRenderer trait with a WebCanvasRenderer that draws to an HTML canvas via web-sys. dot grid background, bezier curve edges between nodes, rounded-rect cards with titles. plus a NullCanvasRenderer for when there’s no canvas available. the RenderState struct holds the camera, nodes, edges, and selection state.

penumbra-thread: cross-platform threading. std::thread on native, wasm_thread on WASM. one #[cfg] block (the only one in the project, and it’s in a platform abstraction crate where it belongs). Worker struct with atomic cancellation flag. spawn_worker() and spawn_detached() helpers.
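a native-only sketch of the Worker idea from penumbra-thread, using std::thread and an AtomicBool. the struct fields, the closure signature, and the cancel()/join() methods are assumptions; the real crate also has a WASM path that isn't shown here.

```rust
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::Arc;
use std::thread::{self, JoinHandle};

/// Handle to a background thread with a cooperative cancellation flag.
pub struct Worker {
    cancel: Arc<AtomicBool>,
    handle: JoinHandle<()>,
}

impl Worker {
    /// Ask the worker to stop. The body has to poll the flag itself.
    pub fn cancel(&self) {
        self.cancel.store(true, Ordering::Relaxed);
    }

    /// Block until the worker thread has exited.
    pub fn join(self) {
        let _ = self.handle.join();
    }
}

/// Spawn `body` on a new thread, handing it the shared cancel flag.
pub fn spawn_worker<F>(body: F) -> Worker
where
    F: FnOnce(Arc<AtomicBool>) + Send + 'static,
{
    let cancel = Arc::new(AtomicBool::new(false));
    let flag = Arc::clone(&cancel);
    let handle = thread::spawn(move || body(flag));
    Worker { cancel, handle }
}
```

a layout loop would then look like `while !cancel.load(Ordering::Relaxed) { step(); }`, and shutting down is cancel() followed by join().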


the app

Build/ui/penumbra-app/ is a full Dioxus desktop app. the component count looks scary but most of it is from the dioxus-components library (badge, button, card, dialog, dropdown menu, separator, sheet, sidebar, skeleton, tabs, tooltip). I styled them but didn’t write them. the custom ones are:

NoteCard: the little frosted-glass card that represents a note on the canvas. title, preview, positioned absolutely in world coordinates.

GraphCards: renders all notes as AnimatedCards inside a CSS-transformed container. each card gets a spring animation on mount via dioxus-motion so it scales in from zero. cards are draggable.

FloatingSidebar: the vertical icon bar on the left. grid, search, pin, tag, settings. just state toggles for now, no panels wired up yet.

TopBar: centered pill bar with the app name and search area.

Fab: “New note” button in the bottom right.

NoteEditor: full-screen editor view with title, body, and tags fields. auto-saves on back.


the bridge

bridge/mod.rs connects the UI to the backend. load_graph() and load_positions() pull from storage. restore_state() inserts everything into the graph and index. create_layout_engine() builds the GPU layout engine with all current nodes. start_layout_worker() spawns a background thread that steps the layout at 60fps, syncs node additions/removals, and publishes position updates through the event bus. sleeps longer when the layout has converged.


the interaction model

pan: mousedown on canvas starts tracking, mousemove applies delta to camera, mouseup stops.

zoom: wheel events on canvas adjust zoom level around the cursor position.

drag: mousedown on a note card captures the offset, mousemove in drag mode updates the note’s position directly.

note creation: click fab -> create empty note in graph -> wait for layout engine to assign position -> camera drifts to the new note (spring animation) -> switch to editor view. this is the flow from the ideas issue. it doesn’t work smoothly yet but the state machine is there.
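the pan and zoom math above is small enough to sketch. this assumes a camera where screen = world * zoom + pan; the Camera struct, its method names, and the zoom clamp range are hypothetical.

```rust
/// Hypothetical 2D camera: screen = world * zoom + pan.
#[derive(Clone, Copy, Debug, PartialEq)]
pub struct Camera {
    pub pan_x: f32,
    pub pan_y: f32,
    pub zoom: f32,
}

impl Camera {
    pub fn screen_to_world(&self, sx: f32, sy: f32) -> (f32, f32) {
        ((sx - self.pan_x) / self.zoom, (sy - self.pan_y) / self.zoom)
    }

    /// Pan: apply the mouse delta (in screen pixels) directly.
    pub fn pan_by(&mut self, dx: f32, dy: f32) {
        self.pan_x += dx;
        self.pan_y += dy;
    }

    /// Zoom by `factor` while keeping the world point under the
    /// cursor (cx, cy) fixed on screen.
    pub fn zoom_at(&mut self, cx: f32, cy: f32, factor: f32) {
        let (wx, wy) = self.screen_to_world(cx, cy);
        self.zoom = (self.zoom * factor).clamp(0.1, 4.0);
        // Solve cx = wx * zoom + pan_x for the new pan.
        self.pan_x = cx - wx * self.zoom;
        self.pan_y = cy - wy * self.zoom;
    }
}
```

the trick in zoom_at is to remember which world point was under the cursor before changing zoom, then pick the pan that puts it back under the cursor afterwards.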


what’s broken

everything, kind of. the canvas renderer and the DOM cards are two separate rendering paths that don’t coordinate well. the layout worker publishes positions but the signals don’t always pick them up in time. the camera drift animation triggers but sometimes snaps instead of drifting. the editor saves but doesn’t trigger re-embedding. the sidebar buttons toggle state but nothing happens.

it’s a prototype. sigh.

40m 29s logged

sync conflict detection + wiremock tests

two changes in one commit: the sync worker now rejects stale pushes, and the Rust sync client has a real test suite.


conflict detection

if the client sends a snapshotId with its push, the worker checks it against the current server snapshot. if they don’t match, someone else pushed in between, and you get a 409 back with the current server snapshot ID and a message to pull first.

the snapshotId used to be passed through from the client or generated fresh. now it’s always generated server-side. the client’s snapshotId is only used for the conflict check, never as the new snapshot. small change, but it means two clients can’t accidentally create the same snapshot ID.
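the check itself is tiny. the worker is JS, so this is just the logic transcribed into Rust as a sketch; PushOutcome, check_push, and the choice to accept a push when the server has no snapshot yet are assumptions.

```rust
/// Hypothetical result of handling a push request.
#[derive(Debug, PartialEq)]
pub enum PushOutcome {
    Accepted { new_snapshot_id: String },
    Conflict { status: u16, server_snapshot_id: String },
}

/// Compare the client's snapshot ID (if it sent one) against the server's.
/// The new snapshot ID always comes from `fresh_id`, never from the client.
pub fn check_push(
    client_snapshot: Option<&str>,
    server_snapshot: Option<&str>,
    fresh_id: impl FnOnce() -> String,
) -> PushOutcome {
    if let (Some(client), Some(server)) = (client_snapshot, server_snapshot) {
        if client != server {
            // Someone else pushed in between: tell the client to pull first.
            return PushOutcome::Conflict {
                status: 409,
                server_snapshot_id: server.to_string(),
            };
        }
    }
    PushOutcome::Accepted { new_snapshot_id: fresh_id() }
}
```

keeping ID generation behind fresh_id mirrors the "always generated server-side" rule: the client's value is read for the comparison and then thrown away.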


last_sync is now a Mutex

WorkerSyncProvider.last_sync went from Option<DateTime> to Mutex<Option<DateTime>>. it gets updated whenever a push or a pull succeeds. this matters because SyncProvider is behind an Arc in the real app, so updating last_sync through a shared reference needs interior mutability. also added #[serde(rename_all = "camelCase")] on SyncSnapshot so the JSON fields match what the worker actually sends.
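why the Mutex is needed, in miniature: methods on something shared through an Arc only get &self, so a plain Option field can't be written to. this sketch uses std's SystemTime as a stand-in for the DateTime type and elides the HTTP call; the method bodies are assumptions.

```rust
use std::sync::Mutex;
use std::time::SystemTime;

/// Simplified stand-in for the real provider.
pub struct WorkerSyncProvider {
    last_sync: Mutex<Option<SystemTime>>,
}

impl WorkerSyncProvider {
    pub fn new() -> Self {
        WorkerSyncProvider { last_sync: Mutex::new(None) }
    }

    /// Takes &self, not &mut self, so it can be called through an Arc.
    pub fn push(&self) {
        // ... the HTTP request would go here ...
        *self.last_sync.lock().unwrap() = Some(SystemTime::now());
    }

    pub fn last_sync(&self) -> Option<SystemTime> {
        *self.last_sync.lock().unwrap()
    }
}
```

with a bare Option<SystemTime> field, push() would need &mut self, and an Arc only ever hands out shared references.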


wiremock tests

238 lines of new tests using wiremock to mock the HTTP worker. each test spins up a local mock server, registers response expectations, and exercises the WorkerSyncProvider against it. no real network, no real worker, fully deterministic.

covers: connect (200 and 500), push (returns snapshot, forwards snapshot ID, handles 500, handles 409 conflict), pull (returns notes/embeddings/positions, sends since query param), status (parses all fields), last_sync (none initially, updated after push).

the with_rt() helper builds a single-threaded tokio runtime with IO enabled for each test since wiremock needs async + network.


the TODO

“Cloud sync” is checked off. one item left: Dioxus UI.

aaaaaaaaa

55m 41s logged

making Candle actually WASM-compatible

so remember the “no #[cfg] in cross-cutting interfaces” rule? the CandleEmbedder was breaking it. not with #[cfg] exactly, but with hf-hub, which uses filesystem APIs (dirs, mmap) that don’t exist in WASM. the model loading was desktop-only and I was pretending that was fine.

it was not fine. I fixed it.


hf-hub is gone

replaced with reqwest. instead of hf-hub’s sync filesystem API that downloads to a local cache directory, the embedder now does raw HTTP GETs to huggingface.co/{model}/resolve/main/{file}. three downloads: model.safetensors, config.json, tokenizer.json. the bytes stay in memory, no filesystem touch.

reqwest is platform-conditional in Cargo.toml: rustls on native (no OpenSSL dependency), bare defaults on wasm32 (uses browser fetch under the hood). plus getrandom with the wasm_js feature so random number generation works in WASM.


two feature flags instead of one

candle now just gives you the model and tokenizer types. no download capability, no reqwest. you construct a CandleEmbedder from bytes you already have.

candle-load adds reqwest and the CandleEmbedder::load() method that downloads from HuggingFace. this is the one that pulls in the network stack.

the split matters because the WASM build might want to load model bytes from OPFS or a bundled asset instead of downloading every time. the download path is opt-in.


mmap is gone too

from_mmaped_safetensors (with its unsafe block) became from_buffered_safetensors. loads from a byte vec instead of memory-mapping a file path. works everywhere. no unsafe. the vocab_size is now read from config.json instead of hardcoded to 30522.

Tokenizer::from_file became Tokenizer::from_bytes. same pattern.


error handling cleanup

every candle operation was using ? directly, which only works if PenumbraError implements From<candle_core::Error>. it doesn’t, and it shouldn’t, because candle errors are an implementation detail. added an e_msg helper that wraps any Display into PenumbraError::Embedding, and switched every candle call to .map_err(e_msg)?. verbose but correct.

ArcticEmbedXS::new and forward now return candle’s own error type instead of PenumbraError. the boundary between “candle stuff” and “penumbra stuff” is cleaner. the CandleEmbedder wrapper handles the translation.
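the e_msg pattern in isolation, with a fake error type standing in for candle_core::Error so it compiles without candle. FakeCandleError, candle_op, and embed are hypothetical; e_msg and PenumbraError::Embedding are the names from the post, but their exact definitions here are guesses.

```rust
use std::fmt;

#[derive(Debug, PartialEq)]
pub enum PenumbraError {
    Embedding(String),
}

/// Stand-in for candle_core::Error.
#[derive(Debug)]
pub struct FakeCandleError(pub &'static str);

impl fmt::Display for FakeCandleError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "{}", self.0)
    }
}

/// Wrap anything printable into PenumbraError::Embedding.
pub fn e_msg<E: fmt::Display>(err: E) -> PenumbraError {
    PenumbraError::Embedding(err.to_string())
}

/// Pretend candle call that may fail with candle's own error type.
fn candle_op(ok: bool) -> Result<f32, FakeCandleError> {
    if ok { Ok(1.0) } else { Err(FakeCandleError("shape mismatch")) }
}

/// The translation happens at the boundary: `?` alone would not compile
/// here because PenumbraError has no From<FakeCandleError> impl.
pub fn embed(ok: bool) -> Result<f32, PenumbraError> {
    let value = candle_op(ok).map_err(e_msg)?;
    Ok(value)
}
```

the error type stays out of PenumbraError's public surface, at the price of one .map_err(e_msg) per call.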


tests

New candle tests behind #[cfg(feature = "candle")]. the trick: building a synthetic safetensors file and a minimal WordLevel tokenizer entirely in memory. no model download needed for the test suite.

test_safetensors() generates fake embedding weights and encoder weights with deterministic values, writes the safetensors layout by hand (length prefix + JSON metadata + raw f32 bytes; much better than pulling it in), and hands it to VarBuilder. test_tokenizer_bytes() builds a 10-word vocabulary with a Whitespace pre-tokenizer.
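the safetensors layout is simple enough to write by hand: an 8-byte little-endian header length, a JSON header describing each tensor, then the raw tensor bytes. a single-tensor sketch follows; tiny_safetensors is a hypothetical helper (not the actual test code), and it assumes the tensor name needs no JSON escaping. the official writer may also pad the header for alignment, which this skips.

```rust
/// Build a one-tensor safetensors buffer in memory.
/// Layout: u64 LE header length | JSON header | raw f32 data (LE).
pub fn tiny_safetensors(name: &str, shape: &[usize], values: &[f32]) -> Vec<u8> {
    // Raw tensor bytes, little-endian f32.
    let data: Vec<u8> = values.iter().flat_map(|v| v.to_le_bytes()).collect();

    // Shape as a comma-separated list for the JSON header.
    let shape_json = shape
        .iter()
        .map(|d| d.to_string())
        .collect::<Vec<_>>()
        .join(",");

    // data_offsets are byte offsets relative to the start of the data section.
    let header = format!(
        "{{\"{}\":{{\"dtype\":\"F32\",\"shape\":[{}],\"data_offsets\":[0,{}]}}}}",
        name,
        shape_json,
        data.len()
    );

    let mut out = Vec::with_capacity(8 + header.len() + data.len());
    out.extend_from_slice(&(header.len() as u64).to_le_bytes());
    out.extend_from_slice(header.as_bytes());
    out.extend_from_slice(&data);
    out
}
```

a multi-tensor version would add more entries to the header and lay the tensors out back to back, with each data_offsets pair pointing at its own slice.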

tests cover: forward pass output shape, L2 normalization, non-zero output, embedder dimensions, embed_text roundtrip. plus one candle-load gated test that actually downloads the real model from HuggingFace (only runs when you explicitly pass --features candle-load).


the “no #[cfg] in cross-cutting interfaces” rule now holds for real. the entire embed pipeline compiles on wasm32 without conditional compilation. reqwest handles the platform difference internally. candle handles the compute. the embedder trait doesn’t know or care.


Quick note, sorry about there mostly being terminal or VSCode images lol, there’s no UI to demo rn…

55m 56s logged

the sync backend exists now

so the TODO said “Cloud sync” and I was like “how hard can it be”

it was not hard actually. which is suspicious. but it works.

new directory: Backend/sync-worker/. it’s a Cloudflare Worker backed by R2 for storage. the whole sync API is three JS files and a wrangler config.


the API

  • POST /sync/push - batch upload notes, embeddings, and positions
  • POST /sync/pull - batch download everything (or since a snapshot)
  • GET /sync/status - storage stats (note count, bytes used, 512MB limit)
  • POST /sync/clear - nuke everything
  • plus individual CRUD routes for single notes, embeddings, and positions

notes are JSON. embeddings are raw f32 bytes (384 * 4 = 1,536 bytes per note). positions are JSON. everything lives in R2 under a clean key scheme: notes/{id}.json, embeddings/{id}.bin, positions/{id}.json.
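the embedding encoding and key scheme, sketched from the client side. little-endian byte order is an assumption (the post only says "raw f32 bytes"), and the function names are hypothetical.

```rust
/// Embedding dimension from the devlog: 384 floats = 1,536 bytes.
pub const DIM: usize = 384;

/// Serialize an embedding as raw little-endian f32 bytes.
pub fn embedding_to_bytes(v: &[f32]) -> Vec<u8> {
    v.iter().flat_map(|x| x.to_le_bytes()).collect()
}

/// Parse raw bytes back into floats. Returns None if the length
/// is not a multiple of 4.
pub fn bytes_to_embedding(b: &[u8]) -> Option<Vec<f32>> {
    if b.len() % 4 != 0 {
        return None;
    }
    Some(
        b.chunks_exact(4)
            .map(|c| f32::from_le_bytes([c[0], c[1], c[2], c[3]]))
            .collect(),
    )
}

/// R2 object keys following the scheme in the post.
pub fn note_key(id: &str) -> String {
    format!("notes/{}.json", id)
}

pub fn embedding_key(id: &str) -> String {
    format!("embeddings/{}.bin", id)
}

pub fn position_key(id: &str) -> String {
    format!("positions/{}.json", id)
}
```

storing the floats as raw bytes instead of a JSON array keeps each embedding at a fixed 1,536 bytes and skips float-to-text parsing on both ends.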


the storage layer

storage.js is the R2 abstraction. getJson, putJson, getBinary, putBinary, del, listKeys. plus a manifest system that tracks note count, last modified time, and the current snapshot ID. snapshots are versioned so the client can eventually do “pull only what changed since X” (right now it just dumps everything, but the plumbing is there).


push and pull

push accepts a JSON body with notes, embeddings, and positions objects. writes them all to R2, updates the manifest, creates a snapshot record, returns the snapshot ID.

pull lists all note keys, fetches each note plus its embedding and position, and returns the whole bundle. embeddings get round-tripped through Float32Array so the binary format is preserved correctly.

CORS is wide open (*) because this is a personal sync worker, not a public API. the client will be the Penumbra app running in the browser or on desktop.
I’ll probably change this tho


the generated types file

wrangler generated a 14,000-line TypeScript declaration file for the Cloudflare runtime. I committed it because it’s how wrangler works and I don’t want to fight it. it defines the R2Bucket binding and every other Workers API type. it’s big. it’s fine.


this is the first non-Rust code in the repo. feels weird. but the sync layer is inherently a server-side thing and Cloudflare Workers with R2 is genuinely the simplest way to get object storage with an HTTP API. no server to manage, no database to provision, just a bucket and some routing.


Also, had to kick out usearch since it wasn’t WASM compatible. now everything but Candle is WASM compatible!


21m 13s logged

auto-linker + TODO overhaul

new crate: penumbra-auto-link. 158 lines. this is the thing that makes the canvas feel fun.

the pipeline is simple: embed the note, search the vector index for neighbours, create implicit links for anything above the similarity threshold. process_note() takes a Note, embeds its text via the EmbeddingProvider, inserts the embedding into the index (so it’s immediately discoverable by future saves), searches for the top-k closest neighbours, filters by score, checks for duplicates, and creates implicit links in the graph. explicit links are never touched. I don’t mess with your links. (why does that sound funny lol)

configurable: top_k (default 10), min_score (default 0.75), max_links per pass (default 5). the AutoLinker holds Arc references to the embedder, index, graph, and event bus. every new link fires a LinkAdded event so the UI can animate the card drifting to its new neighbours.

the duplicate check matters. between the search returning results and the link being created, another thread could have already linked the same pair. so it locks the graph, checks if the link exists, and only creates it if it doesn’t. two process_note() calls on the same note produce links on the first call and an empty vec on the second.
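the filter-then-link part of that pipeline can be sketched with std only. this is not the real process_note(): the embedding and search steps are replaced by a ready-made list of hits, the graph is a plain set of id pairs, and AutoLinkConfig, LinkSet and link_neighbours are hypothetical names. the part it tries to show is the check-and-insert happening under one lock.

```rust
use std::collections::HashSet;
use std::sync::Mutex;

/// Hypothetical config mirroring the defaults quoted above.
struct AutoLinkConfig {
    top_k: usize,
    min_score: f32,
    max_links: usize,
}

impl Default for AutoLinkConfig {
    fn default() -> Self {
        Self { top_k: 10, min_score: 0.75, max_links: 5 }
    }
}

/// Stand-in for the graph: undirected implicit links stored as (low id, high id).
type LinkSet = Mutex<HashSet<(u64, u64)>>;

/// Turn search hits (assumed sorted best-first) into new implicit links for `note`.
/// Returns only the links created by this call.
fn link_neighbours(
    note: u64,
    hits: &[(u64, f32)],
    graph: &LinkSet,
    cfg: &AutoLinkConfig,
) -> Vec<(u64, u64)> {
    let mut created = Vec::new();
    // Hold the lock across check-and-insert so a concurrent call cannot
    // add the same pair in between.
    let mut links = graph.lock().unwrap();
    for &(other, score) in hits.iter().take(cfg.top_k) {
        if created.len() >= cfg.max_links {
            break;
        }
        if other == note || score < cfg.min_score {
            continue; // skip self-links and weak matches
        }
        let pair = (note.min(other), note.max(other));
        // HashSet::insert returns false when the pair already exists.
        if links.insert(pair) {
            created.push(pair);
        }
    }
    created
}
```

calling it twice with the same hits gives links the first time and an empty vec the second time, which is the behaviour described above.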

Some tests in auto_link_matrix.rs: creates implicit links, skips self-links, obeys score threshold, respects max_links cap, handles empty index (no candidates), no duplicate links on re-process, empty body doesn’t panic, top_k=0 returns empty.


also cleaned up some stray em-dashes and arrow symbols in comments across the codebase. vscode keeps auto-correcting -- to em-dashes and -> to the arrow symbol and I keep not noticing until I read the diff. idk if it’s a bug or a feature but it’s annoying.


the TODO got a real overhaul. the old flat checklist is gone. everything that’s done is removed. what’s left is organized into sections: Dioxus UI plans (spring animations, camera lerp, pinned stars, cached positions), WASM worker threading, storage and sync (Google Drive, GitHub repo, offline mode, merge strategy), and future stuff (encrypted vault). it reads like a roadmap now instead of a grocery list.

eleven crates. too much? no.
modularity is never “too much”; it’s better to be more modular than less. (unless you have something insane like a file per function)


21m 59s logged

GPU layout engine + collision avoidance

ripped out the entire hand-rolled ForceAtlas2 + Barnes-Hut implementation and replaced it with vibe-graph-layout-gpu. (OMG i do not like that name, you could have called it anything. LITERALLY ANYTHING else)

195 lines of quadtree code: deleted. RIP. the custom ForceAccumulator, NodeState, adaptive speed calculation: all gone. replaced by a wgpu-backed force-directed layout that runs on the GPU. (no it’s not that intensive, only if you have like a million notes, and even then, we be smart and only calculate the stuff for notes you can see (animation and gamedev tip #1) rather than everything!)

the engine now wraps GpuLayout with lazy initialization. the GPU doesn’t spin up until the first step() call, so construction stays synchronous. if the graph changes (node added, removed, links updated), it sets a dirty flag and re-inits on the next step. pinned nodes get their positions restored after each GPU step since the GPU doesn’t know about pinning.
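the lazy-init, dirty-flag and pin-restore logic is independent of the GPU, so here is a rough sketch with a fake backend standing in for GpuLayout. every name here (FakeBackend, the fields, init_count) is hypothetical and only there to make the control flow visible.

```rust
use std::collections::HashMap;

/// Stand-in for the GPU layout; the real one would upload nodes and links.
struct FakeBackend {
    positions: Vec<(f32, f32)>,
}

impl FakeBackend {
    fn new(positions: &[(f32, f32)]) -> Self {
        Self { positions: positions.to_vec() }
    }

    /// Pretend force step: nudges every node, pinned or not.
    fn step(&mut self) {
        for p in &mut self.positions {
            p.0 += 1.0;
            p.1 += 1.0;
        }
    }
}

struct LayoutEngine {
    positions: Vec<(f32, f32)>,
    pinned: HashMap<usize, (f32, f32)>,
    backend: Option<FakeBackend>, // None until the first step()
    dirty: bool,
    init_count: usize, // only here so the sketch can show when re-init happens
}

impl LayoutEngine {
    /// Construction does no backend work, so it stays cheap and synchronous.
    fn new() -> Self {
        Self { positions: Vec::new(), pinned: HashMap::new(), backend: None, dirty: false, init_count: 0 }
    }

    fn add_node(&mut self, pos: (f32, f32)) -> usize {
        self.positions.push(pos);
        self.dirty = true; // graph changed: rebuild the backend on the next step
        self.positions.len() - 1
    }

    fn pin(&mut self, idx: usize) {
        self.pinned.insert(idx, self.positions[idx]);
    }

    fn step(&mut self) {
        if self.backend.is_none() || self.dirty {
            self.backend = Some(FakeBackend::new(&self.positions));
            self.dirty = false;
            self.init_count += 1;
        }
        let backend = self.backend.as_mut().unwrap();
        backend.step();
        self.positions.clone_from(&backend.positions);
        // The backend knows nothing about pinning, so put pinned nodes back.
        for (&idx, &pos) in &self.pinned {
            self.positions[idx] = pos;
            backend.positions[idx] = pos;
        }
    }
}
```

two steps in a row initialise the backend once; adding a node in between forces a second initialisation.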

step_neighborhood() is now just step(). with GPU acceleration the full graph is always computed, so neighborhood-only updates don’t make sense anymore. the test got rewritten to match: step_moves_unpinned_nodes instead of the old neighborhood-specific assertions.


collision avoidance

the GPU handles forces and attraction but it doesn’t know about card sizes. nodes are points to it. so after the GPU step runs, there’s now a CPU-side collision resolution pass that pushes apart overlapping bounding boxes.

each node can have a Bounds (width, height) set by the UI. after forces are applied and pinned nodes are restored, resolve_collisions() runs up to 5 iterative passes. each pass checks every overlapping pair, computes a center-to-center separation vector, and pushes both nodes apart by half the overlap distance (clamped to 20px per pass). pinned nodes never move from collision push.

the fallback for coincident centers is fun: when two nodes land on the exact same pixel, there’s no center-to-center vector to push along. so it derives a deterministic angle from the node indices using big primes as hash seeds. deterministic because the same pair should always separate in the same direction. not random because randomness in layout makes things jitter.
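a sketch of what such a pass could look like. assumptions, loudly: “overlap distance” is taken as the smaller of the two axis overlaps, the primes in fallback_direction are illustrative rather than the ones in the crate, and Node is a simplified stand-in for the real node state.

```rust
#[derive(Clone, Copy, Debug, PartialEq)]
struct Node {
    x: f32,
    y: f32,
    bounds: Option<(f32, f32)>, // (width, height); None means "skip this node"
    pinned: bool,
}

const MAX_PASSES: usize = 5;
const MAX_PUSH: f32 = 20.0; // per-node clamp, in pixels per pass

/// Deterministic fallback direction for two nodes sharing a center.
/// The same index pair always yields the same angle, so there is no jitter.
fn fallback_direction(i: usize, j: usize) -> (f32, f32) {
    let seed = (i as u64).wrapping_mul(73_856_093) ^ (j as u64).wrapping_mul(19_349_663);
    let angle = ((seed % 360) as f32).to_radians();
    (angle.cos(), angle.sin())
}

fn resolve_collisions(nodes: &mut [Node]) {
    for _ in 0..MAX_PASSES {
        let mut any_overlap = false;
        for i in 0..nodes.len() {
            for j in (i + 1)..nodes.len() {
                let (a, b) = (nodes[i], nodes[j]);
                let ((aw, ah), (bw, bh)) = match (a.bounds, b.bounds) {
                    (Some(ab), Some(bb)) => (ab, bb),
                    _ => continue, // nodes without bounds are ignored
                };
                let (dx, dy) = (b.x - a.x, b.y - a.y);
                let overlap_x = (aw + bw) / 2.0 - dx.abs();
                let overlap_y = (ah + bh) / 2.0 - dy.abs();
                if overlap_x <= 0.0 || overlap_y <= 0.0 {
                    continue; // boxes do not overlap
                }
                any_overlap = true;
                let dist = (dx * dx + dy * dy).sqrt();
                let (ux, uy) = if dist > 1e-6 {
                    (dx / dist, dy / dist)
                } else {
                    fallback_direction(i, j)
                };
                // Each node moves half the overlap, clamped per pass.
                let push = (overlap_x.min(overlap_y) / 2.0).min(MAX_PUSH);
                if !a.pinned {
                    nodes[i].x -= ux * push;
                    nodes[i].y -= uy * push;
                }
                if !b.pinned {
                    nodes[j].x += ux * push;
                    nodes[j].y += uy * push;
                }
            }
        }
        if !any_overlap {
            break; // nothing left to separate
        }
    }
}
```

because of the clamp, heavily overlapping cards may still overlap a little after five passes; the next layout step gets another go at them.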

three new tests: overlapping nodes get separated, nodes without bounds are skipped (no crash), pinned nodes stay put during collision resolution.

the Cargo.lock diff (image) is… yeah. wgpu pulls in the entire graphics stack. metal, vulkan (ash), naga (oooh fun fact! this is the Hindi name for a bunch of semi-divine half-snake beings!), glow, spirv (pov you try to type spin but absolutely FAIL). the dependency tree roughly tripled. but the layout runs on the GPU now so I’m not complaining.

also deleted the test_events.rs example from penumbra-markdown that I forgot to clean up last commit. oops.

and also made the LayoutEngine WASM-compatible by making it async. WHOOPS


20m 14s logged

penumbra-markdown: the parser arc

so I was supposed to work on other stuff today.

I did not work on other stuff today.

instead I wrote an entire markdown pipeline from scratch. parser, AST, HTML renderer, plain text extractor, streaming layer for live editing, and 639 lines of tests. because apparently “notes app needs notes parsing” was more urgent than “notes app needs to save notes.”

No I did not do this in 15 minutes. I spent much more time on it. I’m just, like I said, one commit behind (no longer!!!)


the custom syntax

the whole point of this crate is two custom inline types that make Penumbra’s markdown different from regular markdown:

NoteEmbed: [[some-note-id]]. double-bracket references to other notes on the canvas. the parser tracks bracket depth so nested brackets don’t break things.

TagRef: #tag-name. inline tags. the parser distinguishes #tag from # heading by checking if there’s a space after the hash and whether the preceding character is a word boundary.

these get parsed as structured AST nodes, not raw text. when the HTML renderer hits them, they become custom elements: <pe-embed data-ref="id"> and <pe-tag data-name="name">. the Dioxus UI will intercept these and make them interactive (click to pan, click to filter).
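to make the two rules concrete, here is a stripped-down sketch of the splitting step. it is not split_text_for_custom() itself: the Inline enum, the tag character set (alphanumerics plus - and _) and the exact boundary rule are assumptions, and the renderer skips HTML escaping to stay short.

```rust
#[derive(Debug, PartialEq)]
enum Inline {
    Text(String),
    NoteEmbed(String),
    TagRef(String),
}

/// Find the index just past the closing `]]` of an embed starting at `start`,
/// tracking bracket depth so nested brackets inside the reference survive.
fn find_embed_end(chars: &[char], start: usize) -> Option<usize> {
    let mut depth = 0usize;
    for k in start..chars.len() {
        match chars[k] {
            '[' => depth += 1,
            ']' => {
                depth -= 1;
                if depth == 0 {
                    // Only accept a proper `]]` terminator.
                    return if chars[k - 1] == ']' { Some(k + 1) } else { None };
                }
            }
            _ => {}
        }
    }
    None
}

fn flush(buf: &mut String, out: &mut Vec<Inline>) {
    if !buf.is_empty() {
        out.push(Inline::Text(std::mem::take(buf)));
    }
}

fn split_inline(text: &str) -> Vec<Inline> {
    let chars: Vec<char> = text.chars().collect();
    let (mut out, mut buf, mut i) = (Vec::new(), String::new(), 0);
    while i < chars.len() {
        if chars[i] == '[' && chars.get(i + 1) == Some(&'[') {
            if let Some(end) = find_embed_end(&chars, i) {
                flush(&mut buf, &mut out);
                out.push(Inline::NoteEmbed(chars[i + 2..end - 2].iter().collect()));
                i = end;
                continue;
            }
        }
        if chars[i] == '#' {
            // A tag needs a word boundary before the hash and a name right after it,
            // so "# heading" (hash followed by a space) stays plain text.
            let boundary = i == 0 || !chars[i - 1].is_alphanumeric();
            let len = chars[i + 1..]
                .iter()
                .take_while(|c| c.is_alphanumeric() || **c == '-' || **c == '_')
                .count();
            if boundary && len > 0 {
                flush(&mut buf, &mut out);
                out.push(Inline::TagRef(chars[i + 1..i + 1 + len].iter().collect()));
                i += 1 + len;
                continue;
            }
        }
        buf.push(chars[i]);
        i += 1;
    }
    flush(&mut buf, &mut out);
    out
}

/// Render the custom nodes as the custom elements named above (no escaping here).
fn render_html(parts: &[Inline]) -> String {
    parts
        .iter()
        .map(|p| match p {
            Inline::Text(t) => t.clone(),
            Inline::NoteEmbed(id) => format!("<pe-embed data-ref=\"{id}\"></pe-embed>"),
            Inline::TagRef(name) => format!("<pe-tag data-name=\"{name}\"></pe-tag>"),
        })
        .collect()
}
```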

I know they’re technically also Obsidian things, but smh my head, I did it too


the parser

612 lines wrapping pulldown-cmark. every text chunk goes through split_text_for_custom() for embed/tag extraction before hitting the inline stack. uses a frame stack with saved positions for nesting (strong, emphasis, links, etc). tables required a state machine. tight list items needed special flush handling. the usual “markdown is simple until it isn’t” experience.

The image shows my excellent idea for inline stuff. remember how println!() works with vars? Okay, no, this is a pretty common thing now that I think about it… sigh. It’s the thought that counts


streaming

wraps mdstream for live editing. instead of re-parsing the whole document on every keystroke, it tracks which blocks changed and only re-parses those. append(), finalize(), snapshot(), reset(). the common case (typing at the end) is fast.


the quadtree fix

also fixed a subtle bug in Barnes-Hut. the force calculation was clamping distance AFTER the sqrt but using unclamped dist_sq in the same formula. moved the .max(1.0) to dist_sq so both values stay consistent. the old version could produce weirdly large forces for very close nodes.
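side by side, the bug and the fix look roughly like this. the inverse-square force formula and the function names are assumptions for illustration; only the placement of .max(1.0) is taken from the description above.

```rust
/// Old behaviour: the distance is clamped after the sqrt, but the raw
/// dist_sq is still used for the magnitude, so the two values disagree.
fn repulsion_buggy(dx: f32, dy: f32, strength: f32) -> (f32, f32) {
    let dist_sq = dx * dx + dy * dy;
    let dist = dist_sq.sqrt().max(1.0);
    let magnitude = strength / dist_sq; // unclamped: blows up for close nodes
    (dx / dist * magnitude, dy / dist * magnitude)
}

/// Fixed behaviour: clamp dist_sq first, then derive dist from it,
/// so both values describe the same (clamped) distance.
fn repulsion_fixed(dx: f32, dy: f32, strength: f32) -> (f32, f32) {
    let dist_sq = (dx * dx + dy * dy).max(1.0);
    let dist = dist_sq.sqrt();
    let magnitude = strength / dist_sq;
    (dx / dist * magnitude, dy / dist * magnitude)
}
```

for nodes more than one unit apart the two versions agree; the difference only shows up when nodes get very close, which is exactly where the old forces went weird.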

Yeah this is why i’m ripping it out, and replacing it


tests and sanitization

boring stuff, just writing the same thing with some tiny changes a lot of times


ten crates now. TEN CRATES??!?!!?


1h 10m 17s logged

Oh wow I locked in

so remember yesterday when I said “today was all architecture, all decisions, all research”?

yeah.

today I wrote the architecture. all of it. 27 files. 3,795 lines added (1,900 ish is just Cargo.lock btw). one commit.

I sat down to “maybe scaffold the embed crate” and then I just… didn’t stop. the hyperfixation hit and I looked up and every single crate had real code in it. not stubs. not //! penumbra-whatever. actual implementations with actual tests.

let me try to explain what happened.


three embedders walked into a crate

(no the title is NOT a punchline to a joke, however much it sounds like one)

penumbra-embed went from a one-line comment to three full embedding providers:

CandleEmbedder is the real one. it loads Snowflake-Arctic-Embed-XS from HuggingFace Hub, tokenizes input, runs it through a BERT-style forward pass (word embeddings, mean pooling, linear encoder), and L2-normalizes the output into a 384-dim vector. the model weights come from safetensors via candle_nn::VarBuilder::from_mmaped_safetensors. the whole Candle pipeline is behind a feature flag so you don’t pull in the ML universe if you don’t need it.

SimpleEmbedder is the clever one. it slides character trigrams across the input, hashes each one, and accumulates them into buckets across a fixed-dimension vector. then L2-normalizes. it’s deterministic, fast, needs zero ML deps, and produces embeddings where similar text actually lands in similar regions of the vector space. not semantically meaningful in the way a transformer is, but way better than random for development and testing. (You know? Well actually probably you don’t know, most people didn’t spend all of yesterday doing a bunch of NLP and ML research lollll)
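the trigram-hashing idea fits in a few lines of std-only Rust. this is a sketch of the technique, not the SimpleEmbedder source: the FNV-1a hash, the lowercasing and the function names are assumptions.

```rust
/// FNV-1a over the UTF-8 bytes of a character window.
/// The choice of hash is an assumption; any stable hash would do.
fn fnv1a(chars: &[char]) -> u64 {
    let mut hash: u64 = 0xcbf2_9ce4_8422_2325;
    let mut buf = [0u8; 4];
    for c in chars {
        for b in c.encode_utf8(&mut buf).as_bytes() {
            hash ^= u64::from(*b);
            hash = hash.wrapping_mul(0x0000_0100_0000_01b3);
        }
    }
    hash
}

/// Scale the vector to unit length; an all-zero vector is left untouched.
fn l2_normalize(v: &mut [f32]) {
    let norm = v.iter().map(|x| x * x).sum::<f32>().sqrt();
    if norm > 0.0 {
        for x in v.iter_mut() {
            *x /= norm;
        }
    }
}

/// Slide a 3-character window over the text, hash each trigram into one of
/// `dim` buckets, count hits per bucket, then L2-normalize.
fn embed_trigrams(text: &str, dim: usize) -> Vec<f32> {
    assert!(dim > 0, "dimension must be positive");
    let chars: Vec<char> = text.to_lowercase().chars().collect();
    let mut v = vec![0.0f32; dim];
    for tri in chars.windows(3) {
        let bucket = (fnv1a(tri) % dim as u64) as usize;
        v[bucket] += 1.0;
    }
    l2_normalize(&mut v);
    v
}

/// Dot product; equals cosine similarity for unit-length inputs.
fn cosine(a: &[f32], b: &[f32]) -> f32 {
    a.iter().zip(b).map(|(x, y)| x * y).sum()
}
```

texts that share most of their trigrams end up with a high cosine score, which is all the “similar text lands in similar regions” claim needs. inputs shorter than three characters come out as a zero vector in this sketch.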

NullEmbedder is the lazy one. returns a zero vector. exists for when you need the trait satisfied but don’t care about the output. sometimes you just need a stub and that’s fine. (aka a lot of tests, DO NOT USE THIS anywhere else… or maybeee hehe /j)

all three implement EmbeddingProvider from core. swap them at construction time. the rest of the system doesn’t know or care which one is running.
(i love doing that pattern. it’s just the best. Thanks, Saikuro)
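a sketch of that swap-at-construction pattern. the trait shape here is hypothetical (the real EmbeddingProvider in core may well be async or fallible), and ByteCountEmbedder and NoteService are made-up stand-ins.

```rust
use std::sync::Arc;

/// Hypothetical shape of the trait; the real one may differ.
trait EmbeddingProvider: Send + Sync {
    fn dimensions(&self) -> usize;
    fn embed_text(&self, text: &str) -> Vec<f32>;
}

/// Stub provider: always returns a zero vector.
struct NullEmbedder {
    dim: usize,
}

impl EmbeddingProvider for NullEmbedder {
    fn dimensions(&self) -> usize {
        self.dim
    }
    fn embed_text(&self, _text: &str) -> Vec<f32> {
        vec![0.0; self.dim]
    }
}

/// Toy provider for the sketch: counts bytes into buckets.
struct ByteCountEmbedder {
    dim: usize,
}

impl EmbeddingProvider for ByteCountEmbedder {
    fn dimensions(&self) -> usize {
        self.dim
    }
    fn embed_text(&self, text: &str) -> Vec<f32> {
        let mut v = vec![0.0; self.dim];
        for b in text.bytes() {
            v[b as usize % self.dim] += 1.0;
        }
        v
    }
}

/// The consumer only sees the trait object, so the provider is chosen once,
/// at construction time, and never mentioned again.
struct NoteService {
    embedder: Arc<dyn EmbeddingProvider>,
}

impl NoteService {
    fn new(embedder: Arc<dyn EmbeddingProvider>) -> Self {
        Self { embedder }
    }

    fn embed_note(&self, body: &str) -> Vec<f32> {
        self.embedder.embed_text(body)
    }
}
```

swapping providers is then a one-line change where the service is built, e.g. NoteService::new(Arc::new(NullEmbedder { dim: 384 })) in tests.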


the vector index

penumbra-index wraps USearch with a clean VectorIndex trait: insert, remove, search, len, is_empty. the USearchIndex backend handles all the HNSW configuration. SearchHit carries a NoteId and a score.

the trait exists so I can mock the index in tests without spinning up the real HNSW structure every time. (learned that lesson from Honzo’s test suite.)
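roughly what the trait plus a test double could look like. the method signatures are guesses based on the names listed above, NoteId is reduced to a u64, and BruteForceIndex is a hypothetical mock that does exact cosine search instead of HNSW.

```rust
use std::collections::HashMap;

type NoteId = u64; // stand-in; the real NoteId is its own type

#[derive(Debug, Clone, PartialEq)]
struct SearchHit {
    id: NoteId,
    score: f32,
}

/// Assumed signatures; only the method names come from the text above.
trait VectorIndex {
    fn insert(&mut self, id: NoteId, vector: Vec<f32>);
    fn remove(&mut self, id: NoteId) -> bool;
    fn search(&self, query: &[f32], k: usize) -> Vec<SearchHit>;
    fn len(&self) -> usize;
    fn is_empty(&self) -> bool {
        self.len() == 0
    }
}

fn cosine_similarity(a: &[f32], b: &[f32]) -> f32 {
    let dot: f32 = a.iter().zip(b).map(|(x, y)| x * y).sum();
    let na = a.iter().map(|x| x * x).sum::<f32>().sqrt();
    let nb = b.iter().map(|x| x * x).sum::<f32>().sqrt();
    if na == 0.0 || nb == 0.0 { 0.0 } else { dot / (na * nb) }
}

/// Brute-force index for tests: exact cosine similarity over every stored vector.
#[derive(Default)]
struct BruteForceIndex {
    vectors: HashMap<NoteId, Vec<f32>>,
}

impl VectorIndex for BruteForceIndex {
    fn insert(&mut self, id: NoteId, vector: Vec<f32>) {
        self.vectors.insert(id, vector);
    }

    fn remove(&mut self, id: NoteId) -> bool {
        self.vectors.remove(&id).is_some()
    }

    fn search(&self, query: &[f32], k: usize) -> Vec<SearchHit> {
        let mut hits: Vec<SearchHit> = self
            .vectors
            .iter()
            .map(|(id, v)| SearchHit { id: *id, score: cosine_similarity(query, v) })
            .collect();
        // Best match first, then keep the top k.
        hits.sort_by(|a, b| b.score.total_cmp(&a.score));
        hits.truncate(k);
        hits
    }

    fn len(&self) -> usize {
        self.vectors.len()
    }
}
```

a mock like this is linear in the number of notes, which is fine for tests and small canvases and keeps the rest of the code unaware of which backend sits behind the trait.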


the graph got a cleanup

penumbra-graph didn’t get new features but it got a cargo fmt.

NoteId lost its redundant to_string() method because… it already derives Display. why did I write that. past me, explain yourself. what were you planning.


core got an Index error variant

PenumbraError::Index(String). three lines. because the index crate needed its own error type and I wasn’t going to abuse Embedding for it.


the stuff I did but still isn’t fully done

the layout engine with Barnes-Hut quadtree (i tried a simpler manual implementation, but it’s icky, i’m switching to a different crate), the hybrid search engine (i should probably switch this to a usearch method or different crate idk), and the storage layer (c’est mostly bon) all got real implementations in this commit too. plus a full test suite: core_matrix.rs, embed_matrix.rs, events_matrix.rs, graph_matrix.rs, index_matrix.rs, search_matrix.rs, storage_matrix.rs. seven test files. the naming convention carries over from Honzo because it works and I’m not fixing what isn’t broken.


You might notice that I’m running one commit behind lol.
Welllll I am, yeah
Ooops
Yeahhhh devlogging after every commit is the best way to do it, but sometimes you just wanna code


47m 50s logged

The Architecture Day (aka 0.4h on Hackatime but like 10h in reality)

So uh.

Hi.

New project.

Today was one of those days where I didn’t really code. Like, Hackatime says 0.4 hours. ZERO POINT FOUR. That’s nothing.

But here’s the thing: I spent basically the entire day thinking. Researching. Drawing arrows between boxes in my head. Googling “is candle-core wasm compat” and “best wasm compatible tokio alternative, smol or futures” and going down rabbit holes about OPFS and #[cfg] gates and whether petgraph is pure Rust (it is, thank god).

And I think I came out the other side with something actually good?


the architecture

so Penumbra is a spatial notes app. notes live on a canvas. related notes pull toward each other. unrelated notes drift apart. the whole thing is powered by embeddings and force-directed layout and it should feel like a living map of your thoughts.

and the ENTIRE thing needs to work on web AND desktop without platform #[cfg] spaghetti. that’s the rule. no #[cfg(target_arch = "wasm32")] in any cross-cutting interface. if I have to write #[cfg] I’ve already lost.

here’s what I landed on:

penumbra-core        # domain types + traits. zero deps beyond serde/uuid/thiserror
penumbra-events      # async-channel event bus
penumbra-graph       # petgraph-based graph store
penumbra-layout      # ForceAtlas2 + Barnes-Hut
penumbra-embed       # Candle embedding provider
penumbra-index       # USearch vector index
penumbra-search      # hybrid search (vector + text + tags + temporal)
penumbra-storage     # OPFS-backed persistence
penumbra-sync        # cloud sync abstraction

nine crates. for a notes app. am I overengineering this? maybe. do I care? absolutely not. Saikuro taught me that clean crate boundaries make everything easier later. I’d rather have nine small crates with clear jobs than one megacrate that does everything and hates me. (learned never to entangle your code from HTMLPlayer v2 (yeah that still exists, but i kinda need to press the big Archive this Repo button, I’m making a better new one))


the dependency flow

this is the part I’m actually proud of:

everything depends on core. core depends on nothing interesting. the graph feeds into layout. embed feeds into index feeds into search. storage and sync are siblings. events is the bridge to the UI.

no cycles. no tangled imports. one-directional flow. I drew this like five times on paper before I was happy with it. (I should really learn to use Mermaid for everything but my paper and pen are faster for brainstorming, sue me)


the WASM thing

okay so this was the big research rabbit hole. the whole point of Penumbra is true cross-platform. not “it works on desktop and we have a janky web version.” ACTUAL cross-platform. same code. same behavior. same everything.

the trick is picking libraries that just… work on both targets without needing platform gates:

  • serde/serde_json: universal. obviously.
  • uuid with the js feature: uses js-sys getRandomValues on wasm32. no #[cfg].
  • chrono with wasmbind: uses js-sys Date on wasm32. no #[cfg].
  • futures + async-trait: universal. no runtime dependency.
  • petgraph: pure Rust. no system deps. just works.
  • opfs: this crate is kind of magic. tokio::fs on native, browser OPFS on wasm. one API. the whole reason I don’t need a StorageProvider trait.
  • candle: wasm32 with SIMD backend. Snowflake-Arctic-Embed-XS quantized to GGUF.
  • usearch: wasm32 compatible. HNSW index with add/search/remove.
  • async-channel: universal. no runtime coupling.

I am unreasonably happy about this list.

The Architecture Day (aka 0.4h on Hackatime but like 10h in reality)

So uh.

Hi.

New project.

Today was one of those days where I didn’t really code. Like, Hackatime says 0.4 hours. ZERO POINT FOUR. That’s nothing.

But here’s the thing: I spent basically the entire day thinking. Researching. Drawing arrows between boxes in my head. Googling “is candle-core wasm compat” and “best wasm compatible tokio alternative, smol or futures” and going down rabbit holes about OPFS and #[cfg] gates and whether petgraph is pure Rust (it is, thank god).

And I think I came out the other side with something actually good?


the architecture

so Penumbra is a spatial notes app. notes live on a canvas. related notes pull toward each other. unrelated notes drift apart. the whole thing is powered by embeddings and force-directed layout and it should feel like a living map of your thoughts.

and the ENTIRE thing needs to work on web AND desktop without platform #[cfg] spaghetti. that’s the rule. no #[cfg(target_arch = "wasm32")] in any cross-cutting interface. if I have to write #[cfg] I’ve already lost.

here’s what I landed on:

penumbra-core        # domain types + traits. zero deps beyond serde/uuid/thiserror
penumbra-events      # async-channel event bus
penumbra-graph       # petgraph-based graph store
penumbra-layout      # ForceAtlas2 + Barnes-Hut
penumbra-embed       # Candle embedding provider
penumbra-index       # USearch vector index
penumbra-search      # hybrid search (vector + text + tags + temporal)
penumbra-storage     # OPFS-backed persistence
penumbra-sync        # cloud sync abstraction

nine crates. for a notes app. am I overengineering this? maybe. do I care? absolutely not. Saikuro taught me that clean crate boundaries make everything easier later. I’d rather have nine small crates with clear jobs than one megacrate that does everything and hates me. (HTMLPlayer v2 taught me never to let my code get entangled. yeah, that still exists, but I kinda need to press the big Archive this Repo button; I’m making a better new one.)


the dependency flow

this is the part I’m actually proud of:

everything depends on core. core depends on nothing interesting. the graph feeds into layout. embed feeds into index feeds into search. storage and sync are siblings. events is the bridge to the UI.

no cycles. no tangled imports. one-directional flow. I drew this like five times on paper before I was happy with it. (I should really learn to use Mermaid for everything but my paper and pen are faster for brainstorming, sue me)
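
the "no cycles" claim is easy to check mechanically. here's a small sketch that topologically sorts the crate graph with Kahn's algorithm. the edge list is my reading of the paragraph above (every crate depends on core, layout on graph, index on embed, search on index), so treat `CRATES` as an assumption rather than the real Cargo manifests. `build_order` is a hypothetical helper name.

```rust
use std::collections::{BTreeMap, VecDeque};

/// Crate name -> crates it depends on. Assumed from the prose, not from Cargo.toml.
pub const CRATES: &[(&str, &[&str])] = &[
    ("penumbra-core", &[]),
    ("penumbra-events", &["penumbra-core"]),
    ("penumbra-graph", &["penumbra-core"]),
    ("penumbra-layout", &["penumbra-core", "penumbra-graph"]),
    ("penumbra-embed", &["penumbra-core"]),
    ("penumbra-index", &["penumbra-core", "penumbra-embed"]),
    ("penumbra-search", &["penumbra-core", "penumbra-index"]),
    ("penumbra-storage", &["penumbra-core"]),
    ("penumbra-sync", &["penumbra-core"]),
];

/// Returns a build order (dependencies first), or `None` if the graph has a cycle.
pub fn build_order<'a>(deps: &[(&'a str, &[&'a str])]) -> Option<Vec<&'a str>> {
    // Number of unmet dependencies per crate.
    let mut pending: BTreeMap<&str, usize> = BTreeMap::new();
    // Reverse edges: dependency -> crates that depend on it.
    let mut dependents: BTreeMap<&str, Vec<&str>> = BTreeMap::new();

    for &(name, needs) in deps {
        pending.insert(name, needs.len());
        for &dep in needs {
            pending.entry(dep).or_insert(0);
            dependents.entry(dep).or_default().push(name);
        }
    }

    // Start with crates that depend on nothing.
    let mut ready: VecDeque<&str> = pending
        .iter()
        .filter_map(|(&name, &n)| (n == 0).then_some(name))
        .collect();

    let mut order = Vec::with_capacity(pending.len());
    while let Some(name) = ready.pop_front() {
        order.push(name);
        if let Some(users) = dependents.get(name) {
            for &user in users {
                let n = pending.get_mut(user).expect("every crate is registered above");
                *n -= 1;
                if *n == 0 {
                    ready.push_back(user);
                }
            }
        }
    }

    // If a cycle exists, some crates never reach zero pending dependencies.
    (order.len() == pending.len()).then_some(order)
}
```

`build_order(CRATES)` comes back with core first and all nine crates in the list. add an edge from core back to search and it returns `None`, which is exactly the tangle the one-directional rule is meant to prevent.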


the WASM thing

okay so this was the big research rabbit hole. the whole point of Penumbra is true cross-platform. not “it works on desktop and we have a janky web version.” ACTUAL cross-platform. same code. same behavior. same everything.

the trick is picking libraries that just… work on both targets without needing platform gates:

  • serde/serde_json: universal. obviously.
  • uuid with the js feature: uses js-sys getRandomValues on wasm32. no #[cfg].
  • chrono with wasmbind: uses js-sys Date on wasm32. no #[cfg].
  • futures + async-trait: universal. no runtime dependency.
  • petgraph: pure Rust. no system deps. just works.
  • opfs: this crate is kind of magic. tokio::fs on native, browser OPFS on wasm. one API. the whole reason I don’t need a StorageProvider trait.
  • candle: wasm32 with SIMD backend. Snowflake-Arctic-Embed-XS quantized to GGUF.
  • usearch: wasm32 compatible. HNSW index with add/search/remove.
  • async-channel: universal. no runtime coupling.

I am unreasonably happy about this list.
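
for a feel of the add/search/remove shape the vector index needs, here's a std-only sketch. it is a brute-force cosine scan, not HNSW and not the usearch API. `FlatIndex` and its methods are hypothetical names, and a real index would trade this exact scan for an approximate graph search.

```rust
use std::collections::HashMap;

/// Exact nearest-neighbour index: a brute-force stand-in for an HNSW index.
#[derive(Default)]
pub struct FlatIndex {
    vectors: HashMap<u64, Vec<f32>>,
}

impl FlatIndex {
    /// Insert or overwrite the embedding stored under `key`.
    pub fn add(&mut self, key: u64, vector: Vec<f32>) {
        self.vectors.insert(key, vector);
    }

    /// Remove an embedding. Returns `true` if the key was present.
    pub fn remove(&mut self, key: u64) -> bool {
        self.vectors.remove(&key).is_some()
    }

    /// Return up to `k` (key, similarity) pairs, most similar first.
    pub fn search(&self, query: &[f32], k: usize) -> Vec<(u64, f32)> {
        let mut hits: Vec<(u64, f32)> = self
            .vectors
            .iter()
            .map(|(&key, v)| (key, cosine(query, v)))
            .collect();
        hits.sort_by(|a, b| b.1.total_cmp(&a.1)); // descending by similarity
        hits.truncate(k);
        hits
    }
}

/// Cosine similarity; returns 0.0 when either vector has zero length.
fn cosine(a: &[f32], b: &[f32]) -> f32 {
    let dot: f32 = a.iter().zip(b).map(|(x, y)| x * y).sum();
    let norm = |v: &[f32]| v.iter().map(|x| x * x).sum::<f32>().sqrt();
    let denom = norm(a) * norm(b);
    if denom == 0.0 {
        0.0
    } else {
        dot / denom
    }
}
```

the nice part of keeping the surface this small is that the rest of the app only ever sees three calls, so the backing index can change without touching search.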
