Automated Sports Fan Video Generation with Facial Recognition

The client is a US-based mobile app developer. Its app is a sports video platform where fans sign up to receive videos of themselves from live events directly on their phones. The client wanted an automated way to create fan videos from large volumes of footage, because manually going through a 3-hour video, selecting shots with fans, identifying them, and preparing a montage takes a great deal of time. They reached out to BroutonLab to develop an engine for gender recognition, facial detection and recognition, shot quality measurement, and video captioning that automatically produces a ready-made video to send to fans.

The solution is a complex engine that can be separated into multiple models, including:

- Facial recognition from videos
- Gender recognition
- Detecting whether the video frame shows fans or not
- Emotion recognition
- Scene detection: detecting whether an image represents the beginning of a new scene
- Audio noise removal
- Background removal from video
- Automated image captioning

We organized the data collection and labeling process and built pipelines with Luigi and DVC to manage input and output data flows. Several state-of-the-art models were tested and employed during the project, including ResNet, Xception, EfficientNet, and custom architectures. The models were deployed as microservices using Flask in the Azure cloud.
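To make the scene detection component more concrete, here is a minimal sketch of the underlying idea: flag a frame as the start of a new scene when its intensity histogram differs sharply from the previous frame's. This is a classic histogram-difference heuristic, not the models used in the project, which are not described in detail here. The input format (each frame as a flat sequence of grayscale values from 0 to 255), the function names, the bin count, and the threshold of 0.5 are all assumptions chosen for illustration.

```python
from typing import List, Optional, Sequence

# Assumption: a frame is a flat sequence of grayscale pixel values in 0..255.
Frame = Sequence[int]


def gray_histogram(frame: Frame, bins: int = 16) -> List[float]:
    """Return a normalized intensity histogram with `bins` equal-width buckets."""
    if not frame:
        raise ValueError("frame must contain at least one pixel")
    counts = [0] * bins
    for pixel in frame:
        # Map 0..255 onto 0..bins-1; clamp in case a value exceeds 255.
        counts[min(pixel * bins // 256, bins - 1)] += 1
    total = len(frame)
    return [count / total for count in counts]


def histogram_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Total variation distance between two normalized histograms, in [0, 1]."""
    return 0.5 * sum(abs(x - y) for x, y in zip(a, b))


def detect_scene_starts(
    frames: Sequence[Frame], threshold: float = 0.5, bins: int = 16
) -> List[int]:
    """Return indices of frames that begin a new scene.

    The first frame always starts a scene. Any later frame starts one when its
    histogram distance to the previous frame exceeds `threshold`.
    """
    starts: List[int] = []
    previous: Optional[List[float]] = None
    for index, frame in enumerate(frames):
        current = gray_histogram(frame, bins)
        if previous is None or histogram_distance(previous, current) > threshold:
            starts.append(index)
        previous = current
    return starts


if __name__ == "__main__":
    dark = [10] * 64     # synthetic 8x8 dark frame
    bright = [240] * 64  # synthetic 8x8 bright frame
    print(detect_scene_starts([dark, dark, bright, bright, dark]))  # [0, 2, 4]
```

A heuristic like this is cheap enough to run on every frame of a 3-hour video, but it can be fooled by camera flashes or fast pans, which is one reason a learned classifier may be preferable in production.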

Our solution reduced the labor cost of video preparation by 90% and cut the time needed to process a single video from 3 hours to 30 minutes. Thanks to the faster turnaround on fan requests, 20 more contracts were signed with TV channels. With less work and cost tied up in video preparation, our client was able to shift its focus to other venues and other aspects of the platform.
