Where the f*#@ have I heard this voice actor? (2024)

No seriously, where?!

If you also ask yourself this question every time you go to the movies, this app is made for you. It collects data on Italian voice actors and matches it against the works you've already seen.

Frontend

Backend

Challenges

1. Data collection: Wikipedia lists about 900 voice actors, who have worked on about 50k different works.
2. Voice actor comparison: the main challenge is comparing multiple voice actors to find the works they have in common.
3. Speed: the application must respond quickly and keep waiting times to a minimum.
4. External integrations: users must be able to import their data from Letterboxd and Trakt.
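
To make challenge 2 concrete, here is a minimal in-memory sketch in TypeScript. It is not the app's actual code: the `Credits` shape, the `commonWorks` function, and the sample IDs are all made up for illustration, and the real comparison runs in the database rather than in memory.

```typescript
// Hypothetical shape: each voice actor ID maps to the IDs of the works they are credited on.
type Credits = Map<string, Set<string>>;

// Return the works on which every one of the given voice actors is credited.
function commonWorks(credits: Credits, actorIds: string[]): string[] {
  if (actorIds.length === 0) return [];
  // Unknown actors are treated as having no credits, so the result is empty.
  const sets = actorIds.map((id) => credits.get(id) ?? new Set<string>());
  // Start from the smallest set so that as few candidates as possible are checked.
  sets.sort((a, b) => a.size - b.size);
  const smallest = sets[0];
  const rest = sets.slice(1);
  return Array.from(smallest)
    .filter((work) => rest.every((s) => s.has(work)))
    .sort();
}

// Made-up sample data, for illustration only.
const sampleCredits: Credits = new Map([
  ["actor-a", new Set(["work-1", "work-2", "work-3"])],
  ["actor-b", new Set(["work-2", "work-3"])],
  ["actor-c", new Set(["work-3", "work-4"])],
]);

const shared = commonWorks(sampleCredits, ["actor-a", "actor-b"]);
// shared is ["work-2", "work-3"]
```

With roughly 900 actors and 50k works, an intersection like this is cheap for a single request; the hard part is doing it against stored data for many users at once.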

Solutions

1. Puppeteer-cluster: this library came to our rescue. It allows scraping large numbers of pages concurrently and in a controlled way.
2. Data structure: the data structure was fundamental, and with it the use of MongoDB aggregations, sometimes with very long pipelines.
3. Caching system: a caching system was implemented with Redis, and the pipelines went through several rounds of optimization and cleanup to ensure the best performance.
4. Web interface: data from Letterboxd and Trakt can be imported through the web interface.
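
A rough sketch of the caching idea from point 3, in TypeScript. This is not the project's real implementation: the `KeyValueCache` interface, `MemoryCache`, `comparisonKey`, `cachedComparison`, the `common-works:` key prefix, and the one-hour TTL are all assumptions, and an in-memory map stands in for Redis so the example runs on its own.

```typescript
// Minimal async key-value interface; in practice a Redis client would sit behind it.
interface KeyValueCache {
  get(key: string): Promise<string | null>;
  set(key: string, value: string, ttlSeconds: number): Promise<void>;
}

// In-memory stand-in (assumption) so the sketch runs without a Redis server.
class MemoryCache implements KeyValueCache {
  private entries = new Map<string, { value: string; expiresAt: number }>();

  async get(key: string): Promise<string | null> {
    const entry = this.entries.get(key);
    if (!entry) return null;
    if (entry.expiresAt <= Date.now()) {
      // Expired entries are dropped lazily on read.
      this.entries.delete(key);
      return null;
    }
    return entry.value;
  }

  async set(key: string, value: string, ttlSeconds: number): Promise<void> {
    this.entries.set(key, { value, expiresAt: Date.now() + ttlSeconds * 1000 });
  }
}

// Order-independent key: comparing [a, b] and [b, a] should hit the same entry.
// Assumes actor IDs never contain a comma.
function comparisonKey(actorIds: string[]): string {
  return "common-works:" + actorIds.slice().sort().join(",");
}

// Cache-aside: return the cached result if present, otherwise compute and store it.
async function cachedComparison(
  cache: KeyValueCache,
  actorIds: string[],
  compute: (actorIds: string[]) => Promise<string[]>,
  ttlSeconds: number = 3600,
): Promise<string[]> {
  const key = comparisonKey(actorIds);
  const hit = await cache.get(key);
  if (hit !== null) return JSON.parse(hit) as string[];
  const result = await compute(actorIds);
  await cache.set(key, JSON.stringify(result), ttlSeconds);
  return result;
}
```

Here `compute` would be whatever runs the slow aggregation; sorting the IDs before building the key means the same comparison is never computed twice just because the actors were picked in a different order.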

Project status

The project consists of three main parts: the frontend, the backend, and a background worker that handles scraping. To reduce infrastructure costs, the scraper is currently stopped and the database is not up to date for 2025. However, the app remains functional in all its parts.
The project is under reconstruction. I will be updating this page shortly with a video demo.
There's still a lot of work to do, as outlined in the repo's README. Above all, I'd like to improve the UI, and the Mongo pipelines that check for common voice actors between two works are still too slow for me to say I'm satisfied. I reserve the right to make further adjustments once I get my Mongo certification ✨
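
For reference, a check for common voice actors between two works can be expressed as a short aggregation pipeline. The TypeScript below is a sketch under an assumed schema (one document per credit, with `actorId` and `workId` fields), not the pipeline the app actually runs; `commonActorsPipeline`, the `Stage` type, and the field names are hypothetical.

```typescript
// A pipeline stage is just a plain object, so no driver is needed to build one.
type Stage = Record<string, unknown>;

// Build a pipeline that finds the voice actors credited on both works.
// Assumes workA and workB are different IDs.
function commonActorsPipeline(workA: string, workB: string): Stage[] {
  return [
    // Keep only the credits of the two works; an index on workId helps this stage.
    { $match: { workId: { $in: [workA, workB] } } },
    // One group per voice actor, collecting which of the two works they appear in.
    { $group: { _id: "$actorId", works: { $addToSet: "$workId" } } },
    // An actor is shared only if both works ended up in the set.
    { $match: { works: { $size: 2 } } },
    // Return just the actor ID.
    { $project: { _id: 0, actorId: "$_id" } },
  ];
}

const pipeline = commonActorsPipeline("work-1", "work-2");
```

With the official Node.js driver, an array like this would be passed to the collection's `aggregate` method. Filtering in the first stage keeps the grouping step small, which is usually where pipelines like this lose time.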