
@Ansh904 on Pokedex · 5 days ago

10h 2m 4s logged

Cosmidex Web app is complete !!!

Devlog #7

The web version of the Cosmidex app is also finished, FINALLYYY!!!

Here is what I have added:

Converted the TFLite model to TFJS. This took a lot of time because many libraries were deprecated, so I had to manually remove the imports for those.
Created separate web components using .web.js to upload/capture images -> preprocess -> predict, so my codebase for both the Android and the web version of the app is still the same.
Created separate page components for the web and changed the design of the Pokemon page and the Home page.

What’s left?

The app is mostly ready for its first ship. I have hit the 10 hr mark, which is why I am uploading a devlog early, but there’s not much left to do (I know I have been saying this for the last 2 devlogs). Here’s what is left:

Remove all the console logs that still linger in components from debugging.
The camera doesn’t close in the web version after capturing and predicting, so I have to fix that.
Probably change the home page of the Android version to match the web.
Add some cool animations to it.
Finally, I want to push this project a bit further and add a capturing feature: you capture a photo of a pokemon and it gets added to your caught pokemon in the pokedex, and after capturing all 151 pokemon you get a medal.


@Ansh904 on Pokedex · 15 days ago

10h 11m 55s logged

Pokedex Mobile app is finally finished

Pokedex Devlog #6

The Pokedex mobile app is now finished and I am working on finishing the web version of the app too.

Here is what I have added:

A NotConfident page that is displayed when the model can’t give a solid prediction about the image.
An AllPokemon page that displays all 151 pokemon from generation 1 in a FlatList.
I was using a very messy state-based navigation, so I updated it to Stack navigation.
I also added a tab at the bottom of the Pokemon, NotConfident and AllPokemon pages, though it sometimes overlaps the page content, as in the Mewtwo picture.
Created a separate AppContext for the web version to load the model on the web.
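
As a rough illustration of the NotConfident idea, the routing decision could be sketched like this in Python. The 0.6 threshold, the `pick_page` name and the assumption that the model returns one probability per class are all hypothetical, not taken from the app's code.

```python
def pick_page(probabilities, labels, threshold=0.6):
    """Choose which page to show for one prediction.

    `probabilities` holds one score per class (assumed to sum to 1) and
    `labels` holds the class names in the same order. The threshold is an
    illustrative value, not the one used in the app.
    """
    # Index of the highest-scoring class.
    best = max(range(len(probabilities)), key=probabilities.__getitem__)
    if probabilities[best] >= threshold:
        return ("Pokemon", labels[best])
    # No class is convincing enough, so fall back to the NotConfident page.
    return ("NotConfident", None)
```

A higher threshold sends more borderline photos to the NotConfident page, while a lower one risks showing the wrong pokemon.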

What’s left in the project before the ship?

The mobile app is finished. I just have to make the Camera and Upload components for the web version of the app, and then the project will officially be complete.

I had my exams, so I wasn’t able to code for a while :(


@Ansh904 on Pokedex · about 1 month ago

15h 26m 53s logged

Pokedex App is finally complete (mostly)

Devlog #5

The app is now mostly complete, and this phase of the project went very smoothly.

Here is what I have added:

After predicting the pokemon, the app sends a request to PokeAPI at the https://pokeapi.co/api/v2/pokemon/{pokemon_name} endpoint and gets the detailed information about the pokemon.
The information is displayed on a pokemon page which includes basic stats, the type of the pokemon, height, weight, etc.
I have also worked a lot on the UI; the majority of the time was spent designing it. However, the home page still looks empty: it is just a div with an ImageBackground, a logo, two buttons and 1 scanner icon that I designed in Canva.
I have also added type-based backgrounds, so that every pokemon type has a separate space-themed background. They are mostly nebula images that I found in the NASA Gallery, but boyyyy do I like the themes, they look soooo gooood!!! I am so proud of myself….
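
For illustration, the lookup step could be sketched in Python as below. It assumes the per-pokemon resource lives under the `pokemon/` path of PokeAPI v2; the helper names are hypothetical and the app itself does this in JavaScript. PokeAPI reports height in decimetres and weight in hectograms, so both are divided by 10.

```python
from urllib.parse import quote

POKEAPI_BASE = "https://pokeapi.co/api/v2/pokemon/"

def pokemon_url(name):
    """Build the lookup URL for a predicted class name (hypothetical helper)."""
    return POKEAPI_BASE + quote(name.strip().lower())

def summarize(payload):
    """Pick the fields a pokemon page needs out of a decoded JSON response."""
    return {
        "name": payload["name"],
        "types": [slot["type"]["name"] for slot in payload["types"]],
        "height_m": payload["height"] / 10,   # decimetres -> metres
        "weight_kg": payload["weight"] / 10,  # hectograms -> kilograms
        "stats": {s["stat"]["name"]: s["base_stat"] for s in payload["stats"]},
    }

# Hand-written, trimmed sample shaped like a PokeAPI response (illustrative values).
SAMPLE = {
    "name": "gloom",
    "height": 8,
    "weight": 86,
    "types": [{"slot": 1, "type": {"name": "grass"}},
              {"slot": 2, "type": {"name": "poison"}}],
    "stats": [{"base_stat": 60, "stat": {"name": "hp"}}],
}
```

Calling `summarize(SAMPLE)` gives a flat dictionary that a page component can render without knowing the nested layout of the response.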

What’s left?

While the app is mostly ready, there are a few things that I still need/want to add:

A page displaying all 151 pokemon in generation 1
A page that is displayed when there is no confident prediction
A web version of the app, which should be pretty easy: I just need to change the uploading and capturing logic for it to work on the web

So hopefully this is my second-to-last devlog for this project, stay tuned!!!!


@Ansh904 on Pokedex · about 1 month ago

14h 48m 12s logged

Pokedex Devlog #4

Heyyyyy Everyone!!!

The training of the model is finally complete and I have made the first version of the Pokedex app. It captures an image using react-native-vision-camera, crops only the middle portion, resizes the image to match my model’s input size of 224x224 px, then gets the raw pixel buffer, converts it into RGB and normalises the values from 0-255 to 0-1.0. This normalised pixel data is then fed to the model, which predicts the class of the image. There were a lot of problems in making the app and there are still a lot more to come, but here is everything I did:

1. An Ironic Training

After collecting roughly 75,000 images with my custom scraper, I zipped all of those images and uploaded them directly to my Colab notebook; uploading them took half an hour though 😫. I ran the training script for 50 epochs, and training the model took another 8 hours. It was really ironic to see the model train in Google Colab on the same images that Google was trying to hide from my little scraper.

2. A Successful Model

After the training was completed, I had to test the model to see how it performed and whether I needed to retrain it. The results shocked me: I gave the model 12 images from Pinterest that it had never seen before and it predicted all of them correctly. The accuracy was way better than what I expected from a model trained on raw data. However, if needed I will clean the dataset and train again.

3. React Native Camera Hell

Everything was going very smoothly at this point, but I was not prepared for the next phase. I built a blank React Native project, installed react-native-vision-camera and added a camera to the app.js. The camera was working, great, so I built a function to capture a photo and display it on the screen, and that worked too. But the next step was to process the raw image pixel data into the format that my model required.

I searched online for how to resize an image in react-native-vision-camera, and the AI results told me to use the resizer package that comes with it. I spent three, three whole days!!! figuring out why the resizer was not able to resize the image, only to realise that it takes a frame, not an image, so I rewrote the whole camera code to use a live stream instead of capturing a photo. The resizer was working, so I normalized the pixel values while converting them into RGB format and fed the data to my model, which output complete gibberish. All that work for nothing!!! 😭 I tried logging the shape of the model’s input and output, the pixel format of the buffer and the raw pixel data, but all of those were giving either undefined or 0. Nothing seemed to be working at this point; there was something seriously wrong with how I was processing the images. But then, going through the docs of react-native-vision-camera, I saw that the image object returned by my previous photo-capturing code, which I had dumped, was a react-native-nitro-image object and it had a built-in resizeAsync function.

I got frustrated again and rewrote the entire code, this time with the previous capture-photo logic. After cropping, resizing and getting the RGB data into the required format, I tested the app with a photo of Gloom displayed on my laptop screen and it predicted it correctly. I was sooooo happyyy that it worked, and it is still working, though the accuracy is low due to pixel noise.
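
The crop and normalise steps described in this devlog can be sketched in plain Python as follows. This is only an illustration: it assumes the raw buffer is RGBA with 8 bits per channel, the function names are made up, and the resize to 224x224 is left to the image library (the app uses resizeAsync for that).

```python
def center_crop_box(width, height):
    """Return (left, top, right, bottom) of the largest centered square."""
    side = min(width, height)
    left = (width - side) // 2
    top = (height - side) // 2
    return (left, top, left + side, top + side)

def rgba_to_normalized_rgb(buffer):
    """Drop the alpha channel and scale each colour value from 0-255 to 0.0-1.0.

    Assumes `buffer` is a flat RGBA byte sequence (an assumption about the
    pixel format, which the devlog does not state).
    """
    pixels = []
    for i in range(0, len(buffer), 4):
        r, g, b = buffer[i], buffer[i + 1], buffer[i + 2]
        pixels.extend((r / 255.0, g / 255.0, b / 255.0))
    return pixels
```

For a 224x224 input the normalised list has 224 * 224 * 3 = 150,528 values, which is a handy shape check before feeding the model.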

What’s next?

The hardest parts of the project are completed now. I just need to add an API call after the model predicts the pokemon to get the details of the pokemon, like its type, pokedex entry and other things. The UI also sucks, so I will work on that. I think I want to make it space themed; I don’t know how it would look but I will try. Till then… stay tuned!!!


@Ansh904 on Pokedex · about 2 months ago

2h 27m 52s logged

ME Vs GOOGLE

And I WON!!!

Pokedex Devlog #3

Hey everyone

I have been working on making a real-life pokedex, one in which you just take a photo and get the stats of that pokemon on your screen. To achieve this I decided to train a yolov8n-cls model on images of pokemon. But getting those images was a nightmare due to GOOGLE!!!

Fighting Google’s anti bot systems

I built a custom scraper to scrape pokemon images from Google Images. But since the very start of the project, Google has been blocking me. It took a lot of time for me, who had never touched selenium before, to bypass Google’s anti-bot systems. Since my laptop is very low end, I was using selenium in headless mode, which triggered even more red flags for my bot. I switched from regular selenium to undetected_chromedriver, but Google’s system was still flagging me with a big captcha that my little bot couldn’t solve. The main issue was that in headless mode selenium runs with a very small window size, which leaves clear machine tracks. I fixed that by using a larger window size and adding a user agent string chosen from a list of 10 different UA strings.
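
A minimal sketch of that fix in Python: build the Chrome command-line arguments for a larger window and a randomly chosen user agent. The user agent strings below are placeholders rather than the actual list of 10, and the 1920x1080 size and the `build_chrome_args` name are assumptions.

```python
import random

# Placeholder user agent strings, not the list used by the scraper.
USER_AGENTS = [
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 "
    "(KHTML, like Gecko) Chrome/135.0.0.0 Safari/537.36",
    "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 "
    "(KHTML, like Gecko) Chrome/135.0.0.0 Safari/537.36",
]

def build_chrome_args(user_agents, width=1920, height=1080, rng=random):
    """Return Chrome flags for a headless run that looks less like a bot."""
    user_agent = rng.choice(user_agents)  # a different UA can be picked per run
    return [
        "--headless=new",
        f"--window-size={width},{height}",  # avoid the tiny default headless window
        f"--user-agent={user_agent}",
    ]
```

Each returned string would then be passed to `add_argument` on the Chrome options object before the driver is created.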

Getting those Images

Not being flagged by Google was easy in comparison to this. Targeting the images inside the search results was an absolute nightmare because, instead of fixed static classes or ids, Google uses dynamic strings that change frequently. I wanted to write code that anyone can use anywhere, and that is why I couldn’t target those ever-changing class names. I tried a lot of things, from targeting images inside nested divs to images inside divs with specific roles (eg: listitems) or images that have /imgres in their href attribute. It took me 6+ hours of coding and trying, and a lot more researching and inspecting the elements of the Google Images results, but I found out that by targeting the div with the role of main and the images inside it that have a src or data-deferred attribute, using the XPath //div[@role='main']//img[@data-deferred or @src], I get the images that are inside the search results, and I can then just use infinite scrolling to load them.
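
To show what that selector picks up, here is a small Python sketch that runs the same idea over a hand-written page using only the standard library. ElementTree's XPath subset has no `or` in predicates, so the attribute check is done in Python; in Selenium the full XPath would be passed to `find_elements(By.XPATH, ...)` directly. The sample markup and the `result_images` name are made up for the example.

```python
import xml.etree.ElementTree as ET

# Hand-written stand-in for a results page (not real Google markup).
SAMPLE_PAGE = """
<html><body>
  <div role="banner"><img src="logo.png" /></div>
  <div role="main">
    <div><img src="thumb-1.jpg" /></div>
    <div><img data-deferred="1" /></div>
    <div><img alt="spacer" /></div>
  </div>
</body></html>
"""

def result_images(page_source):
    """Return img elements under div[role=main] that carry src or data-deferred."""
    root = ET.fromstring(page_source.strip())
    candidates = root.findall(".//div[@role='main']//img")
    return [
        img for img in candidates
        if "src" in img.attrib or "data-deferred" in img.attrib
    ]
```

On the sample page this keeps the two thumbnails and skips both the logo outside the main region and the attribute-less spacer.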

The Results

Knowing how to target those images, I absolutely destroyed Google in the fight by making the scraper and using it to scrape a total of 74,718 images in one night without getting blocked a single time. My goal was 500 images for each pokemon in generation 1, that is 75,500 images in total. The scraper worked better than expected and I was just under 800 images (782) behind the goal. But even for those missing images, the culprit was Google, not my little scraper: Google didn’t have that many images for a few unpopular pokemon to fulfill my goal.
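
The numbers above can be checked with a few lines of Python (the function name is just for illustration):

```python
def dataset_gap(scraped, classes=151, per_class=500):
    """Return (target, shortfall) for a fixed number of images per class."""
    target = classes * per_class
    return target, target - scraped

target, shortfall = dataset_gap(74_718)
print(f"target={target}, shortfall={shortfall}")
```

With 151 classes at 500 images each the target is 75,500, so 74,718 scraped images leave a gap of 782.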

What’s next?

After winning the battle against Google, I will use the images to train my model today; I have already split them. Since my laptop is slow I will use Google Colab for training, using their own tool to train the model on the images they were trying so hard to hide XD….. and I really hope that scraping that many images is not illegal.


@Ansh904 on Pokedex · about 2 months ago

10h 26m 13s logged

Heyy Everyone

Pokedex Devlog 2 !!!

The selenium scraper is finally complete and it is working perfectly, except that it sometimes hits a captcha, though that is rare.

I started building this scraper without any knowledge of selenium. I started with a very basic bot that went to Google and typed pokedex, then slowly added more features like headless mode and WebDriverWait instead of hardcoded time delays. I once crashed my computer while printing the encrypted source code of the Google results page, XD. But I was also encountering captchas a lot, so I switched to undetected_chromedriver and gave it a user agent string so that it wouldn’t be flagged.

My first bot was able to scrape the headers of the Google search results. I then moved on to an image-scraping bot. The first version was very simple: it went to Google Images and downloaded every img tag that it could find. However, regardless of the target it scraped only 21 images, the first being the Google logo. This was because of lazy loading.

To fix this I implemented an infinite scrolling loop that scrolled until it reached the end of the page or hit the target. But it still wasn’t near the target. After running the script without headless mode, I found out that the user agent string I gave it was for Chrome 91, which was released in 2021. After switching to a newer version, Chrome 135, it was able to reach its target, but it was also downloading junk images like logos, icons or spacers that were present inside the search results.
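
A scrolling loop of this kind could look roughly like the Python sketch below. It only relies on the driver's `execute_script` method; `count_images`, `pause` and `max_rounds` are hypothetical parameters added for the example, and the real scraper may decide when to stop differently.

```python
import time

def scroll_until_done(driver, count_images, target, pause=1.0, max_rounds=200):
    """Scroll to the bottom repeatedly until the target is met or the page stops growing.

    `driver` is anything with a Selenium-style `execute_script` method and
    `count_images` is a callable returning how many images are loaded so far.
    """
    last_height = driver.execute_script("return document.body.scrollHeight")
    for _ in range(max_rounds):
        if count_images() >= target:
            break  # enough thumbnails have been lazy loaded
        driver.execute_script("window.scrollTo(0, document.body.scrollHeight);")
        time.sleep(pause)  # give lazy loading time to add new results
        new_height = driver.execute_script("return document.body.scrollHeight")
        if new_height == last_height:
            break  # page did not grow, so the end was reached
        last_height = new_height
    return count_images()
```

The `max_rounds` cap is a safety net so a page that keeps growing forever cannot trap the script.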

I spent 2 days fixing this problem, trying to point to the img tags of the thumbnail images we see in Google Images. I wanted to write code that would work every time, which is why I couldn’t just point to the ever-changing class name of the img tag. I tried targeting imgs inside specific divs, like an img inside the div with the role of listitems, or img tags whose href attribute started with “/imgres”, but it didn’t work. After a lot of tries I was able to get it right by targeting the imgs that are inside the div with the role of main and whose dimensions are more than 100x100 px. It worked like a charm. I then quickly turned it into a function and gave it 10 pokemon with a target of 100 imgs each, and boom, I got 1000 images in total.
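
The size check can be sketched in Python like this. It assumes each element exposes a Selenium-style `size` dictionary with `width` and `height` keys; the function names are hypothetical.

```python
def is_thumbnail(element, min_side=100):
    """True when the element is larger than min_side in both dimensions."""
    size = element.size  # Selenium elements report {"width": ..., "height": ...}
    return size["width"] > min_side and size["height"] > min_side

def keep_thumbnails(elements, min_side=100):
    """Filter out logos, icons and spacers by their rendered size."""
    return [el for el in elements if is_thumbnail(el, min_side)]
```

Small icons and 1x1 spacers fail the check, while normal result thumbnails pass it.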

I have also decided that labelling 15,100 images would take ridiculously long and the model would also not be very accurate. So instead I am building a classification model, and labelling will be easy because I am downloading each pokemon’s images into a folder with its name. Now that I don’t have to worry about spending hours downloading or labelling them, I will use 500 images per pokemon and create a much more accurate model. Currently I am running the script over all pokemon in gen 1, 151 in total. It has reached Rattata, pokedex number 19, which means images of 18 out of 151 pokemon are downloaded. I will keep the script running until all are complete and start training the classification model tomorrow.
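
The folder-per-pokemon idea means the label comes for free from the directory name. A small Python sketch of reading such a layout is shown below; the suffix list and the `labelled_images` name are assumptions for the example.

```python
from pathlib import Path

IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png", ".webp"}

def labelled_images(root):
    """Return (image_path, label) pairs where the label is the folder name."""
    root = Path(root)
    pairs = []
    for class_dir in sorted(p for p in root.iterdir() if p.is_dir()):
        for image in sorted(class_dir.iterdir()):
            # Skip anything that is not an image, such as stray text files.
            if image.suffix.lower() in IMAGE_SUFFIXES:
                pairs.append((image, class_dir.name))
    return pairs
```

With this layout, adding a new class is just a matter of creating another folder and dropping images into it.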


@Ansh904 on Pokedex · about 2 months ago

1h 53m 44s logged

Heyy everyone

I am currently building an Android-based pokedex app. The goal is simple: point your phone’s camera towards a pokemon and it will use real-time object detection to identify it and use PokeAPI to fetch detailed stats about the pokemon.

When I first thought of this project, I wanted to make a pokedex for all pokemon. However, some simple calculations gave me a reality check. To get decent accuracy, I aimed to train the object detection model on 100 images per pokemon, but there are 1,025 pokemon in total. That would mean more than 100k images to collect and label, and it would take months if not years for me to do it alone. That’s why I quickly narrowed it down to only generation 1: 151 pokemon and 15,100 images, which is still a lot to collect and label manually.

Instead of collecting images manually, I wanted to automate the process using the icrawler Python framework. I initially tried the built-in Google image crawler but it failed again and again: it returned no images because it wasn’t able to detect the image tags. After some research I found out what was happening: Google frequently changes their CSS class names to prevent bots from scraping, and icrawler relied on hard-coded class names.

So I switched to the built-in Bing image crawler and ran a test batch of 5 pokemon. It worked, and I was happy to see images of pokemon being stored on my computer. But there was another roadblock: I had aimed for a total of 500 images (100 per pokemon), but it only managed to scrape around 30 images per pokemon. That’s only 30% of my target.

To get past this I am planning to build a better scraper using selenium and implement a dynamic scrolling loop to trigger lazy-loaded images. I have attached today’s results: I got 156 images in total today and I hope for 500 tomorrow.

Stay tuned for the next update