Gibran
Featured Data Analysis Project

Research on Applying Machine Learning to Improve Player Valuation For Scouting in a Football Team

A football player transfer prediction project that combines player performance data with real-time play analysis.

Players Analyzed

24.5K+

Time Period

2023-2024

Data Points

7.5M+

Dashboard Preview

Dashboard Built with Gradio
Unsupervised clustering of the VAE system

Main dashboard view of the transfer prediction system, built with Gradio

Tools & Technologies

WhoScored, SPADL, Gradio, SQL, Python

Problem

This project addresses the challenge of inefficient and subjective football scouting by developing a data-driven machine learning system to improve player valuation and recruitment decisions. It supplements manual, intuition-based evaluation with a scalable approach that analyses player performance data, identifies similar players, and evaluates team fit through chemistry metrics. The goal is to help clubs make more consistent, accurate, and efficient scouting decisions in an increasingly competitive and data-rich football environment.

Results

The proposed machine learning system improves player scouting by producing more accurate and meaningful player comparisons and recommendations. The VAE-Gamma model significantly outperforms traditional methods in clustering player data, leading to better similarity matching. The chemistry metrics, particularly Joint Offensive Impact (JOI), align well with real-world transfer decisions and benchmark rankings, achieving solid recommendation performance (e.g., consistent shortlist rankings and a Hit@10 rate of 0.45).
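For readers unfamiliar with the Hit@10 figure, here is a minimal sketch of how such a metric is commonly computed. It assumes the usual definition (a case counts as a hit when at least one relevant player appears in the top K recommendations), which may differ from the exact evaluation used in this project. The function names and the player IDs are hypothetical and for illustration only.

```python
from typing import Hashable, Sequence, Set, Tuple


def hit_at_k(ranked: Sequence[Hashable], relevant: Set[Hashable], k: int = 10) -> float:
    """Return 1.0 if any of the top-k ranked items is relevant, else 0.0."""
    return 1.0 if any(item in relevant for item in ranked[:k]) else 0.0


def mean_hit_at_k(
    cases: Sequence[Tuple[Sequence[Hashable], Set[Hashable]]], k: int = 10
) -> float:
    """Average Hit@K over (ranked_shortlist, relevant_players) pairs."""
    if not cases:
        raise ValueError("at least one evaluation case is required")
    return sum(hit_at_k(ranked, relevant, k) for ranked, relevant in cases) / len(cases)


# Hypothetical evaluation cases: each pairs a recommended shortlist
# (best first) with the set of players the club actually signed.
cases = [
    (["p3", "p7", "p1"], {"p7"}),  # actual signing is ranked 2nd
    (["p2", "p5", "p9"], {"p4"}),  # actual signing is not in the shortlist
]

score = mean_hit_at_k(cases, k=2)
print(score)
```

Under this definition, a Hit@10 of 0.45 would mean that in 45% of the evaluated cases at least one relevant player appeared among the top 10 recommendations.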

Key Insights

Youth crime decreased significantly during Covid-19 but rose again after 2021

Crime incidents spike in the afternoon, especially on school days

Location density shifts at the precinct level

Sex crimes, especially harassment, rank highest among youth crimes

Data source: NYPD Complaint Data Historic • Last updated: December 2023