DevOps is the stuff that lurks under my bed and keeps me up at night: DevLog #3

Dramatic title, but there is definitely truth to the statement.

From the beginning, my goal was to make this extension available to anyone, anywhere, at all times. You might think “Oh Sai, doesn’t that mean you have to actually deploy your app to the webstore?”

Yes, dear reader, it does mean that. It also means I have to get through miles of red tape set up by the dear, dear developers at Google.

This hellish landscape of DevOps is shaped by monoliths of Docker images and rolling hills of AWS provisioning.

I’ve spent over 10 hours configuring AWS setups and begging my Dockerfile to do what I want it to, with only 3 hours logged to show for it.

If only there were a way to log hours outside the IDE, I guess.

So what all did I accomplish?

Well, my extension runs on a Python backend with a PostgreSQL database. Yes, I could have used something like Supabase, but that’s for cowards; real ones build their own.

I’m just kidding, I chose not to use Supabase because:

I wanted to build my full-stack skills
Costs need to be low, I’m kinda broke
I could have more freedom with my database if I used my own configuration

So wish me luck as I push through the long haul of making my project prod-ready

@saigiridhar_chitturi on PrivacyLens · about 2 months ago

4h 31m 46s logged

Devlog 2!!

It turns out there’s a reason that web scraping is such a pain for so many people, and I found that out the HARD way. Normally, programs trying to get info off of a website are expected to check the site’s robots.txt file to see if the creators of the website:

Are okay with you scraping
Will make it easy for you to scrape
Will sue you into oblivion (satire)
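
For anyone curious what that robots.txt check looks like in practice, here is a minimal sketch using Python’s standard-library `urllib.robotparser`. The robots.txt contents, the `example-bot` user agent, and the URLs are all made up for illustration; none of this is taken from PrivacyLens itself.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt contents, invented for this example.
SAMPLE_ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
Allow: /
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given robots.txt rules let user_agent fetch url."""
    parser = RobotFileParser()
    # parse() takes the file as a list of lines, so no network request is made.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    # "example-bot" and both URLs are placeholders.
    print(is_allowed(SAMPLE_ROBOTS_TXT, "example-bot", "https://example.com/privacy"))
    print(is_allowed(SAMPLE_ROBOTS_TXT, "example-bot", "https://example.com/private/data"))
```

A real crawler would download the file from the site first (for example with `set_url()` and `read()` on the same parser) instead of parsing a hard-coded string.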

These things are great for regulating scrapers, but unfortunately, browser extensions are collateral damage.

The way a browser extension gets info off a webpage is VERY similar to web scraping but not exactly the same thing, so websites hostile to web crawlers are also inadvertently harming extensions that try to read site content.

What exact problems did I face to find all this out, you ask?

Well, my project reads privacy policies off of websites, which happen to be copyrighted material, so web developers take care to prevent any automated stealing of the policies. However, my project needs to read those policies in order to send them to the backend and summarize the whole thing.

Most of my initial attempts resulted in the scraped information being cluttered with JS, HTML tags, formatting, and other junk that wayyy increased the number of tokens I used when inputting the policy into the AI.

This is when I discovered Mozilla’s Readability library. This was SUCH a lifesaver, as it automatically took out the junk and left the REAL text. It way outperformed my previous methods and cut token usage by up to a factor of three!
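
To give a feel for why stripping the junk matters, here is a rough sketch in Python. To be clear, this is NOT Readability (which is a JavaScript library and far smarter about finding the main article content); it is only a standard-library stand-in that drops `<script>`, `<style>`, and `<noscript>` blocks and all tags, then compares character counts as a crude proxy for token usage. The sample HTML and every name in it are invented for illustration.

```python
from html.parser import HTMLParser


class VisibleTextExtractor(HTMLParser):
    """Collects text nodes, skipping anything inside script/style/noscript."""

    SKIPPED_TAGS = {"script", "style", "noscript"}

    def __init__(self):
        super().__init__()
        self._skip_depth = 0  # > 0 while inside a skipped element
        self._chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIPPED_TAGS:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIPPED_TAGS and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Keep only non-empty text that is outside skipped elements.
        if self._skip_depth == 0 and data.strip():
            self._chunks.append(data.strip())

    def text(self) -> str:
        return " ".join(self._chunks)


def extract_visible_text(html: str) -> str:
    parser = VisibleTextExtractor()
    parser.feed(html)
    parser.close()
    return parser.text()


# Hypothetical page, made up for this example.
SAMPLE_HTML = """
<html><head><style>p { color: red; }</style>
<script>console.log("tracking");</script></head>
<body><h1>Privacy Policy</h1><p>We collect your email address.</p></body></html>
"""

if __name__ == "__main__":
    cleaned = extract_visible_text(SAMPLE_HTML)
    print(cleaned)
    print(len(SAMPLE_HTML), "characters before,", len(cleaned), "after")
```

Character count is only a loose stand-in for tokens; the actual savings depend on the page and on the tokenizer the AI model uses.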

Currently, I just imported the library’s JS file into my frontend, but I know that’s not prod-ready, so I’ll likely have to import it another way and add a build step 😑

(Image is of the Readability Library)

@saigiridhar_chitturi on PrivacyLens · about 2 months ago

1h 2m 23s logged

Are you tired of getting your data stolen by companies without even knowing why?

I’m fixing that with PrivacyLens, an AI tool that summarizes a site’s privacy policy and gives you the information you need to know when you visit it. PrivacyLens shows you the major red flags and some green flags in the website’s privacy policy. Right now, I’m working on making the UI better and looking for some clean, thematic layouts that match the investigator-type look I’m going for.