Groa, a greenfield project, uses machine learning to power an innovative recommendation engine that generates tailored movie suggestions to users based on their unique taste and personal preference.

Solely responsible for implementing the Google Analytics API to collect, measure, and report key metrics to stakeholders with data visualizations.

Problem

  • The distribution platforms (Netflix, Amazon, and Disney) all exclusively promote their proprietary contect.
  • Ratings aggregation sites (Rotten Tomatoes, Metacritic) have all been purchased by production companies.
  • Social media sites (Reddit, Twiiter, Facebook) are covered in astroturf campaigns to promote the movies made by the big studios.
  • Find movies which match your taste has become a chore.

Solution

  • Groa Engine uses Natural Language Processing and collaborative content filtering to create personalized movie recommendations based on the text of user reviews rather than a reductive numerical rating system, while still providing accurate recommendations even if that is the only user input.
  • Groa recommendation system will prioritize actual user preferences over irrelevant promoted content.
  • Highlight the input of the users instead of devaluing our service with skewed results.

Data Science Problems Addressed


1 | The API is slow and the user has to wait to get actual recommendations

Previously, when users utilized the recommendation model, it would take 60 to 93 seconds to load results.

Solution:

  • Refactored our recommendation algorithm to improve the speed of the API.
  • Reviewed the code related to the API.
  • Implemented FastAPI caching system.

Result: Recommendation model response time is significantly reduced to an average of 4.1 seconds.

Solution: Added movie descriptions with information relevant to the movie: watch time, cast, director, producers, etc.

Result:

  • Improved bounce rate by 66%.
  • Increased average session duration by 665%.

Groa API Architecture

The Groa API is engineered using the FastAPI web framework and serves recommendations using a Word2Vec model. The Postgres database is running on a AWS Relational Database Service. The Groa API utilizes ElastiCache (running Redis) for caching and Elasticsearch service from AWS for search.

DS API Architecture

Tech Stack


Word2Vec Model | AWS Elastic Beanstalk | Docker | Python | Google Analytics


Link to Website

Groa URL

Link to Code

My Code