VeritasAI

  • 17 Devlogs
  • 110 Total hours

VeritasAI - A Chrome extension that reads the news article you're on, extracts its factual claims, and verifies each one against live sources in real time. For every claim it returns a confidence level, a list of supporting and contradicting sources with political lean labels, and a divergence summary showing how outlets across the spectrum are framing the same underlying fact. No verdicts, just sources, so you can judge for yourself.

1h 13m 30s logged

I did a cleanup session today. I moved the hardcoded API keys from config.js into .env, added a startup check that exits immediately with a clear message if any required key is missing, and split a 650-line index.js into pipeline.js, cli.js, and utils.js. The tool runs cleanly on my machine and I have been using it on articles I read this past week, but it needs two API keys to run, so nobody else can actually use it yet. Figuring out distribution is probably the next real question.
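A minimal sketch of what such a startup check could look like. It assumes the .env file has already been loaded into process.env (for example with the dotenv package), and the two key names are hypothetical, since the post only says that two keys are required.

```javascript
// Hypothetical key names; the devlog only says two API keys are required.
const REQUIRED_KEYS = ["ANTHROPIC_API_KEY", "BRAVE_API_KEY"];

// Return the names of required keys that are unset or blank in `env`.
function findMissingKeys(env, required = REQUIRED_KEYS) {
  return required.filter((name) => !env[name] || env[name].trim() === "");
}

// Exit with a readable message instead of failing later mid-pipeline.
function assertConfig(env = process.env) {
  const missing = findMissingKeys(env);
  if (missing.length > 0) {
    console.error(`Missing required keys in .env: ${missing.join(", ")}`);
    process.exit(1);
  }
}
```

Calling assertConfig() as the first line of the CLI entry point would make a missing key fail fast, before any network request is attempted.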

7h 6m 19s logged

This session was supposed to take two hours and took seven because I assumed fetching and parsing a news article from a URL would be straightforward. It is not. Plain axios plus cheerio works fine for AP and Reuters, but NYT and WaPo render their bodies client-side, so the raw HTML comes back nearly empty. Puppeteer solved that but added 5 seconds of headless Chrome startup time to every run. The @mozilla/readability package worked better across most sites but choked on paywalls. I ended up with a three-layer fallback that tries readability first, then the longest <article> tag, then the largest div by text length, which covers about 70% of sites. The session ended with a cleanly formatted terminal report using chalk for color-coded confidence labels, which was the first time the tool looked like something a person would actually use.
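The fallback chain can be sketched roughly as below. This is not the project's code: the extractors are passed in so the chain itself stays dependency-free, the minChars threshold is an assumed value, and longestArticleTag is a crude regex stand-in for the second layer (a real version would more likely use cheerio or a DOM parser).

```javascript
// Try each extractor in order and keep the first result long enough to be a
// real article body. `minChars` is an assumed threshold, not from the devlog.
function extractWithFallback(html, extractors, minChars = 500) {
  for (const { name, extract } of extractors) {
    let text = null;
    try {
      text = extract(html);
    } catch {
      text = null; // a throwing layer just falls through to the next one
    }
    if (typeof text === "string" && text.trim().length >= minChars) {
      return { layer: name, text: text.trim() };
    }
  }
  return null; // nothing usable: likely a paywall or a client-rendered page
}

// Crude stand-in for layer two: longest <article>...</article> by text length.
function longestArticleTag(html) {
  const bodies = [...html.matchAll(/<article\b[^>]*>([\s\S]*?)<\/article>/gi)]
    .map((m) => m[1].replace(/<[^>]+>/g, " ").replace(/\s+/g, " ").trim());
  return bodies.sort((a, b) => b.length - a.length)[0] ?? null;
}
```

Returning the name of the layer that succeeded makes it easy to log which sites fall through to the weaker layers.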

3h 31m 34s logged

I rewrote the whole pipeline in Node.js and parallelized all the agents because the sequential Python version took 68 seconds on a 7-claim article, which is not usable. The rewrite took about 90 minutes and the rest of the session was debugging differences between the Python and Node versions, mostly around how I had structured the retry logic. Firing 14 API requests at once got me rate-limited immediately, so I added a concurrency queue capped at 4 simultaneous calls, and the same article dropped from 68 seconds to 14.
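A concurrency cap like this can be sketched in plain JavaScript with a small worker pool. The function name runWithLimit is hypothetical, and the project may well use a library for this instead.

```javascript
// Run async task functions with at most `limit` in flight at once.
// Results come back in the same order as `tasks`.
async function runWithLimit(tasks, limit) {
  const results = new Array(tasks.length);
  let next = 0;

  // Each worker claims the next unstarted task until none remain.
  async function worker() {
    while (next < tasks.length) {
      const index = next++;
      results[index] = await tasks[index]();
    }
  }

  const workers = Array.from({ length: Math.min(limit, tasks.length) }, () => worker());
  await Promise.all(workers);
  return results;
}
```

With 14 tasks and a limit of 4, four workers start immediately and each picks up a new task as soon as its previous one settles. In this sketch a rejected task rejects the whole call, so per-request retries would need to live inside each task.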

59m 37s logged

Two things kept crashing longer test runs, so I fixed both today. Claude was returning malformed JSON in about one run out of fifteen despite explicit instructions, either wrapping the output in backticks or adding a trailing comma, so I added a cleanup step before JSON.parse and a retry that sends the raw response back with “this was not valid JSON, return only the array with no other text.” The second issue was unverifiable claims burning three full search iterations and returning nothing, so I added a classification step upfront where subjective claims skip the loop entirely and return a subjective_claim flag instead.
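A rough sketch of the cleanup-then-retry step, under a few assumptions: askModel is a hypothetical async function that takes a prompt and returns the model's raw text, the retry count is a guess, and the fence marker is built with repeat() only so the snippet can be shown inside a code fence.

```javascript
const FENCE = "`".repeat(3); // three backticks, built indirectly

// Strip a wrapping code fence and trailing commas, then parse.
// Caveat: the trailing-comma regex would also alter a string containing ",]".
function parseModelJson(raw) {
  let text = raw.trim();
  text = text.replace(new RegExp("^" + FENCE + "(?:json)?\\s*", "i"), "");
  text = text.replace(new RegExp("\\s*" + FENCE + "$"), "");
  text = text.replace(/,\s*([\]}])/g, "$1");
  return JSON.parse(text);
}

// `askModel(prompt)` is hypothetical: any async function returning raw text.
async function getClaims(askModel, prompt, maxRetries = 1) {
  let raw = await askModel(prompt);
  for (let attempt = 0; ; attempt++) {
    try {
      return parseModelJson(raw);
    } catch (err) {
      if (attempt >= maxRetries) throw err;
      // Send the bad response back with the correction from the devlog.
      raw = await askModel(
        raw + "\n\nThis was not valid JSON, return only the array with no other text."
      );
    }
  }
}
```

Cleaning first and retrying only on a parse failure keeps the common formatting slips from costing an extra API call.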

3h 35m 57s logged

I added the divergence layer today, which surfaces how outlets with different political leans frame the same underlying fact rather than just whether the fact is accurate. I first tried scraping lean ratings from AllSides at runtime but scrapped it after 45 minutes because it was fragile, probably against their ToS, and too slow for a per-request lookup. I replaced it with a hardcoded lookup table sourced from Ad Fontes Media data covering about 40 major outlets, where anything not in the table returns “Unknown” rather than a guess. A test on a border crossings story came back with CNN describing the encounter numbers as a humanitarian crisis, Fox describing the same numbers as a Biden policy failure, and the DHS primary source showing the raw figure both outlets were citing.
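The lookup itself can be very small. The sketch below uses placeholder hostnames and labels only; it does not reproduce the real table or any actual outlet ratings, and the function name leanFor is hypothetical.

```javascript
// Placeholder entries only: the real table (about 40 outlets, per the devlog)
// and its labels are not reproduced here.
const LEAN_BY_HOST = new Map([
  ["left.example", "Left"],
  ["center.example", "Center"],
  ["right.example", "Right"],
]);

// Look up an outlet's lean by hostname; never guess for unlisted outlets.
function leanFor(url) {
  let host;
  try {
    host = new URL(url).hostname.toLowerCase().replace(/^www\./, "");
  } catch {
    return "Unknown"; // not a parseable URL
  }
  return LEAN_BY_HOST.get(host) ?? "Unknown";
}
```

Matching on the exact hostname means a subdomain that is not in the table also comes back as "Unknown", which fits the rule of not guessing.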

15h 16m 59s logged

I started with SerpAPI for web search and spent the first two hours realizing every result it returned was months old because of aggressive free-tier caching, then switched to Brave Search API and got current results immediately. My first verification approach was a single prompt passing the claim and search results to Claude at once, but the outputs were shallow and the first search query was usually wrong anyway, so I built an iterative loop where Claude picks a query, I run the search, and Claude decides whether to go deeper or return a result. The loop ran forever in early testing until I added a hard stopping rule to the prompt: “If you have at least 2 independent sources that confirm or deny the claim, stop searching and return your assessment now.” By the end of the day, the pipeline surfaced CNN and Fox both citing the same BLS statistic with completely different framings alongside the actual BLS.gov primary source, which was exactly the output I had been trying to build toward.
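The shape of that loop can be sketched as follows. Both decide and search are hypothetical injected functions standing in for the Claude call and the Brave Search call. The code-side cap of three iterations is an assumption added as a backstop; in the devlog the stopping rule lives in the prompt.

```javascript
// `decide` and `search` are hypothetical injected async functions:
//   decide(claim, evidence) -> { action: "search", query }
//                           or { action: "answer", assessment }
//   search(query)           -> array of result objects
async function verifyClaim(claim, { decide, search, maxIterations = 3 }) {
  const evidence = [];
  for (let i = 0; i < maxIterations; i++) {
    const step = await decide(claim, evidence);
    if (step.action === "answer") {
      return { claim, assessment: step.assessment, evidence, iterations: i };
    }
    const results = await search(step.query);
    evidence.push({ query: step.query, results });
  }
  // Hard cap reached: return what was gathered rather than looping forever.
  return { claim, assessment: "inconclusive", evidence, iterations: maxIterations };
}
```

Keeping a cap in code as well as in the prompt means a model that ignores the stopping rule still cannot run the loop indefinitely.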

1h 7m 44s logged

I caught a bad bug where the extractor was hallucinating numbers: I ran the same AP unemployment article three times and the reported rate came out as 4.2%, 4.1%, and 3.9% across the three runs when the article clearly says 4.2%. The model was filling in from training data instead of reading the text in front of it, which is a subtle failure mode because the numbers look plausible enough that you would never notice without explicitly checking. The fix was adding a verbatim extraction instruction to the system prompt and setting temperature to 0.
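The fix described above lives in the prompt and the temperature setting. As a complementary guard, and purely as an assumption rather than something the project is known to do, extracted claims could also be checked so that every number in a claim appears somewhere in the article text. The function names are hypothetical.

```javascript
// Pull numeric tokens such as "4.2", "1,200", or "68" out of a string.
// Percent signs and units are ignored so "4.2%" and "4.2 percent" compare equal.
function numbersIn(text) {
  return text.match(/\d+(?:,\d{3})*(?:\.\d+)?/g) ?? [];
}

// Return the numbers in `claim` that do not appear anywhere in the article.
function unsupportedNumbers(claim, articleText) {
  const articleNumbers = new Set(numbersIn(articleText));
  return numbersIn(claim).filter((n) => !articleNumbers.has(n));
}
```

A non-empty result would flag a claim like "the rate was 4.1%" when the article only ever says 4.2%, which is the kind of plausible-looking drift that is hard to spot by eye.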

5h 51m 24s logged

I started this after my dad and I spent 45 minutes arguing over an immigration article where we were each citing sources that reported different numbers for the same statistic, and neither of us could figure out who was right because we couldn’t agree on which source to trust. My first instinct was a binary fact-checker that returns TRUE or FALSE, but the first time I prompted Claude that way I got confident verdicts with no sources and realized immediately that a confidently wrong answer with no citations is worse than no answer at all. I tried GPT-4 for claim extraction and kept getting paragraph-length analysis instead of a JSON array, then switched to Claude and got clean structured output on the first attempt once I tightened the prompt.

Extract factual claims from the article below. Return ONLY a JSON array of strings.
No explanation before the array. No explanation after. No markdown fences.

I finished the session with a Python script that takes pasted article text and returns a JSON array of 5-8 discrete factual claims, all traceable to specific sentences in the article.
