Pokedex
- 3 Devlogs
- 15 Total hours
A real-time Pokedex app with a custom object detection model trained for Gen 1 Pokemon.
I have been working on making a real-life Pokedex, one in which you just take a photo and get the stats of that Pokemon on your screen. To achieve this I decided to train a yolov8n-cls model on images of Pokemon. But getting those images was a nightmare due to GOOGLE!!!
I built a custom scraper to scrape Pokemon images from Google Images. But from the very start of the project Google kept blocking me, and since I had never touched Selenium before, it took me a lot of time to bypass Google’s anti-bot systems. Since my laptop is very low end, I was running Selenium in headless mode, which triggered even more red flags for my bot. I switched from regular Selenium to undetected_chromedriver, but Google’s system was still flagging me with a big captcha that my little bot couldn’t solve. The main issue was that in headless mode Selenium runs with a very small window size, which leaves clear machine tracks. I fixed that by using a larger window size and adding a user agent string chosen from a list of 10 different UA strings.
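A minimal sketch of the headless fix described above, assuming undetected_chromedriver is installed. The 1920x1080 window size and the two user agent strings are placeholders (the post does not list the 10 strings that were actually used), and the helper names are made up for illustration.

```python
import random

# Hypothetical placeholder pool; the real list of 10 UA strings is not given in the post.
USER_AGENTS = [
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 "
    "(KHTML, like Gecko) Chrome/135.0.0.0 Safari/537.36",
    "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 "
    "(KHTML, like Gecko) Chrome/135.0.0.0 Safari/537.36",
]


def build_chrome_args(width=1920, height=1080, rng=random):
    """Return Chrome flags for a headless session with a desktop-sized window."""
    return [
        "--headless=new",
        f"--window-size={width},{height}",  # assumed size, larger than the headless default
        f"--user-agent={rng.choice(USER_AGENTS)}",
    ]


def make_driver():
    """Start undetected_chromedriver with the flags above."""
    import undetected_chromedriver as uc  # third-party, imported lazily

    options = uc.ChromeOptions()
    for arg in build_chrome_args():
        options.add_argument(arg)
    return uc.Chrome(options=options)
```

Keeping the flag list in its own function makes it easy to print and check what the browser is actually started with before pointing it at Google.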
Not being flagged by Google was easy in comparison to what came next. Targeting the images inside the search results was an absolute nightmare, because instead of fixed static classes or ids Google uses dynamic strings that change frequently. I wanted to write code that anyone can use anywhere, and that is why I couldn’t target those ever-changing class names. I tried a lot of things, from targeting images inside nested divs, to images inside divs with specific roles (e.g. listitem), to images that have /imgres in their href attribute. It took me an entire day, with 6+ hours of coding and trial and error and a lot more researching and inspecting the elements of the Google Images results, but I found out that by targeting the images inside the div with the role of main that have a src or data-deferred attribute, using the XPath //div[@role='main']//img[@data-deferred or @src] , I get the images which are inside the search results, and I can then just use infinite scrolling to load more of them.
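Here is a rough sketch of how that XPath could be used with a Selenium driver. The function name and the de-duplication step are assumptions for illustration, not the scraper's actual code; it only reads the src attribute and skips images that have not loaded one yet.

```python
THUMBNAIL_XPATH = "//div[@role='main']//img[@data-deferred or @src]"


def collect_image_urls(driver, limit):
    """Return up to `limit` unique src values of images matched by the XPath."""
    urls = []
    seen = set()
    # "xpath" is the value of selenium's By.XPATH constant.
    for img in driver.find_elements("xpath", THUMBNAIL_XPATH):
        src = img.get_attribute("src")
        if not src or src in seen:
            continue  # still lazy-loading, or already collected
        seen.add(src)
        urls.append(src)
        if len(urls) >= limit:
            break
    return urls
```

Because the selector depends only on the role attribute and on which attributes the img has, it does not break when the generated class names change.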
Knowing how to target those images, I absolutely destroyed Google in the fight by finishing the scraper and using it to scrape a total of 74,718 images in one night without getting blocked a single time. My goal was 500 images for each Pokemon in generation 1, that is 75,500 images in total. The scraper worked better than expected and I ended up only about 780 images short of the goal. But even for those missing images, the culprit was Google, not my little scraper: Google didn’t have that many images for a few unpopular Pokemon to fulfill my goal.
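The numbers above can be checked with a few lines of arithmetic. The variable names are only for illustration; the values are the ones quoted in the post.

```python
GEN1_POKEMON = 151
TARGET_PER_POKEMON = 500
SCRAPED = 74_718

goal = GEN1_POKEMON * TARGET_PER_POKEMON  # 75,500 images
shortfall = goal - SCRAPED                # 782 images short of the goal
print(goal, shortfall, f"{SCRAPED / goal:.1%}")
```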
After winning the battle against Google, I will use the images to train my model today; I have already split them. Since my laptop is slow I will use Google Colab for training, using their own tool to train the model on the images they were trying so hard to hide XD….. and I really hope that scraping that many images is not illegal.
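The post does not say how the images were split, so the following is only a sketch under assumptions: an 80/20 train/val split, one source folder per Pokemon, and the train/ and val/ folder layout that Ultralytics classification training reads. The epoch count and image size are placeholders too.

```python
import random
import shutil
from pathlib import Path


def split_dataset(src_root, dst_root, val_fraction=0.2, seed=0):
    """Copy src_root/<class>/* into dst_root/{train,val}/<class>/."""
    rng = random.Random(seed)
    src_root, dst_root = Path(src_root), Path(dst_root)
    for class_dir in sorted(p for p in src_root.iterdir() if p.is_dir()):
        files = sorted(f for f in class_dir.iterdir() if f.is_file())
        rng.shuffle(files)
        n_val = int(len(files) * val_fraction)
        for i, f in enumerate(files):
            subset = "val" if i < n_val else "train"
            target = dst_root / subset / class_dir.name
            target.mkdir(parents=True, exist_ok=True)
            shutil.copy2(f, target / f.name)


def train(dataset_root, epochs=20):
    """Fine-tune yolov8n-cls on the split folders (needs `pip install ultralytics`)."""
    from ultralytics import YOLO  # imported lazily; heavy dependency

    model = YOLO("yolov8n-cls.pt")
    return model.train(data=str(dataset_root), epochs=epochs, imgsz=224)
```

On Colab, something like `split_dataset("images", "dataset")` followed by `train("dataset")` would be the whole pipeline, assuming the scraped folders were uploaded as `images/`.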
Heyy Everyone
Pokedex Devlog 2 !!!
The Selenium scraper is finally complete and it is working perfectly, except that it sometimes hits a captcha, though that is rare.
I started building this scraper without any knowledge of Selenium. I started with a very basic bot that went to Google and typed pokedex, then slowly added more features like headless mode and WebDriverWait instead of hardcoded time delays. I once crashed my computer while printing the encrypted source code of the Google results page, XD. But I was also encountering captchas a lot, so I switched to undetected_chromedriver and gave it a user agent string so that it wouldn’t be flagged.
My first bot was able to scrape the headers of the Google search results. I then moved on to an image scraping bot. The first version was very simple: it went to Google Images and downloaded every img tag that it could find. However, regardless of the target, it scraped only 21 images, the first being the Google logo. This was because of lazy loading.
To fix this I implemented an infinite scrolling loop that scrolled until it reached the end of the page or hit the target. But it still didn’t get near the target. After running the script without headless mode, I found out that the user agent string I gave it was for Chrome 91, which was released in 2021. After switching to a newer version, Chrome 135, it was able to reach its target, but it was also downloading junk images like logos, icons or spacers that were present inside the search results.
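A scrolling loop of this kind could look roughly like the sketch below. The function name, the pause length and the round limit are assumptions; `count_images` stands for whatever callable counts the images loaded so far.

```python
import time


def scroll_until(driver, count_images, target, pause=1.5, max_rounds=200):
    """Scroll to the bottom repeatedly until `target` images are loaded
    or the page height stops growing. Returns the final image count."""
    last_height = driver.execute_script("return document.body.scrollHeight")
    for _ in range(max_rounds):
        if count_images(driver) >= target:
            break
        driver.execute_script("window.scrollTo(0, document.body.scrollHeight);")
        time.sleep(pause)  # give lazy-loaded thumbnails time to appear
        new_height = driver.execute_script("return document.body.scrollHeight")
        if new_height == last_height:
            break  # nothing new loaded: end of results
        last_height = new_height
    return count_images(driver)
```

Comparing the page height before and after each scroll is what detects the end of the page, so the loop stops on its own even when the target is never reached.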
I spent 2 days fixing this problem, trying to point at the img tags of the thumbnail images we see in Google Images. I wanted to write code that would work every time, which is why I couldn’t just point at the ever-changing class name of the img tag. I tried targeting imgs inside specific divs, like an img inside the div with the role of listitem, or img tags whose href attribute started with “/imgres”, but it didn’t work. After a lot of tries I was able to get it right by targeting the imgs inside the div with the role of main whose dimensions are more than 100*100px. It worked like a charm. I then quickly turned it into a function and gave it 10 Pokemon with a target of 100 imgs each, and boom, I got 1,000 images in total.
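The size check can be sketched like this, assuming a Selenium driver whose elements expose the usual `size` dictionary. The function names are made up for illustration and the 100px threshold is the one from the post.

```python
def is_thumbnail(img, min_side=100):
    """True if the element renders larger than min_side x min_side pixels."""
    size = img.size  # selenium WebElement.size -> {"width": ..., "height": ...}
    return size["width"] > min_side and size["height"] > min_side


def filter_thumbnails(driver, min_side=100):
    """Return img elements under div[role=main] that pass the size check."""
    # "xpath" is the value of selenium's By.XPATH constant.
    imgs = driver.find_elements("xpath", "//div[@role='main']//img")
    return [img for img in imgs if is_thumbnail(img, min_side)]
```

Logos, icons and spacers are usually much smaller than a result thumbnail, which is why a plain size threshold is enough to drop them.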
I have also decided that labelling 15,100 images would take ridiculously long and the resulting model would not be very accurate either. So instead I am building a classification model, and labelling for it is easy: I am downloading each Pokemon’s images into a folder with its name. So now that I don’t have to worry about spending hours downloading or labelling them, I will use 500 images per Pokemon and create a much more accurate model. Currently I am running the script over all Pokemon in gen 1, 151 in total. It has reached Rattata, Pokedex number 19, which means images of 18 out of 151 Pokemon are downloaded. I will keep the script running until all are complete and start training the classification model tomorrow.
Heyy everyone
I am currently building an Android-based Pokedex app. The goal is simple: point your phone’s camera at a Pokemon and it will use real-time object detection to identify it and use PokeAPI to fetch detailed stats about the Pokemon.
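To give an idea of the PokeAPI side, here is a small sketch of fetching base stats for one Pokemon. It is written in Python for illustration (the Android app itself would do this in its own language), and the function names and User-Agent label are assumptions.

```python
import json
import urllib.request

API_URL = "https://pokeapi.co/api/v2/pokemon/{name}"


def parse_stats(payload):
    """Map each base stat name to its value from a /pokemon/ response."""
    return {s["stat"]["name"]: s["base_stat"] for s in payload["stats"]}


def fetch_stats(name):
    """Download and parse the stats for one Pokemon (needs network access)."""
    req = urllib.request.Request(
        API_URL.format(name=name.lower()),
        headers={"User-Agent": "pokedex-sketch"},  # hypothetical UA label
    )
    with urllib.request.urlopen(req, timeout=10) as resp:
        return parse_stats(json.load(resp))
```

Calling `fetch_stats("pikachu")` would return a dictionary keyed by stat name, such as hp, attack and speed, which the app can then show next to the detected Pokemon.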
When I first thought of this project, I wanted to make a Pokedex for all Pokemon. However, some simple calculations gave me a reality check. To get decent accuracy, I aimed to train the object detection model on 100 images per Pokemon, but there are 1,025 Pokemon in total. That would mean more than 100k images to collect and label, which would take months if not years for me to do alone. That’s why I quickly narrowed it down to only generation 1: 151 Pokemon and 15,100 images, which is still a lot to collect and label manually.
Instead of collecting images manually, I wanted to automate the process using the icrawler Python framework. I initially tried the built-in Google image crawler but it failed again and again: it returned no images because it wasn’t able to detect the image tags. After some research I found out what was happening: Google frequently changes its CSS class names to prevent bots from scraping, and icrawler relied on hard-coded class names.
So I switched to the built-in Bing image crawler and ran a test batch of 5 Pokemon. It worked, and I was happy to see images of Pokemon being stored on my computer. But there was another roadblock: I had aimed for a total of 500 images (100 per Pokemon) but it only managed to scrape around 30 images per Pokemon. That’s only 30% of my target.
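A test batch like this one can be sketched with icrawler's built-in Bing crawler. The one-folder-per-Pokemon layout, the search keyword format and the helper names are assumptions, since the post does not show the original script.

```python
from pathlib import Path


def class_dir(root, pokemon):
    """Folder for one Pokemon's images, e.g. dataset/pikachu."""
    return Path(root) / pokemon.lower()


def crawl_bing(pokemon_names, root="dataset", per_pokemon=100):
    """Download up to per_pokemon Bing results for each name."""
    from icrawler.builtin import BingImageCrawler  # pip install icrawler

    for name in pokemon_names:
        crawler = BingImageCrawler(storage={"root_dir": str(class_dir(root, name))})
        # Assumed keyword format; max_num is an upper bound, not a guarantee.
        crawler.crawl(keyword=f"{name} pokemon", max_num=per_pokemon)
```

Since `max_num` is only an upper limit, a run can come back with far fewer files than requested, which matches the roughly 30 images per Pokemon seen here.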
To get past this I am planning to build a better scraper using Selenium and implement a dynamic scrolling loop to trigger lazy-loaded images. I have attached today’s results: I got 156 images in total today and I hope for 500 tomorrow.
Stay tuned for the next update