Gibran
Featured Data Analysis Project

Research on Applying Machine Learning to Improve Player Valuation For Scouting in a Football Team

A football player transfer prediction project that uses player performance data and real-time play analysis.

Players Analyzed

24.5K+

Time Period

2023-2024

Data Points

7.5M+

Dashboard Preview

Main dashboard view of the transfer prediction system, built with Gradio

Unsupervised clustering from the VAE system

Tools & Technologies

WhoScored, SPADL, Gradio, SQL, Python

Problem

This project addresses the challenge of inefficient and subjective football scouting by developing a data-driven machine learning system to improve player valuation and recruitment decisions. It supplements manual, intuition-based evaluation with a scalable approach that analyses player performance data, identifies similar players, and evaluates team fit through chemistry metrics. The goal is to help clubs make more consistent, accurate, and efficient scouting decisions in an increasingly competitive and data-rich football environment.

Results

The proposed machine learning system improves player scouting by producing more accurate and meaningful player comparisons and recommendations. The VAE-Gamma model significantly outperforms traditional methods in clustering player data, leading to better similarity matching. The chemistry metrics, particularly Joint Offensive Impact (JOI), align well with real-world transfer decisions and benchmark rankings, achieving solid recommendation performance (e.g., consistent shortlist rankings and a Hit@10 rate of 0.45).
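
To make the Hit@10 figure concrete, here is a minimal sketch of how such a rate could be computed. It assumes Hit@K means the share of evaluation cases in which at least one player the club actually signed appears in the top K of the recommended shortlist; the project may define it differently. The function names and the player IDs are hypothetical and used only for illustration.

```python
from typing import Iterable, Sequence, Set, Tuple


def hit_at_k(ranked: Sequence[str], relevant: Set[str], k: int = 10) -> int:
    """Return 1 if any of the top-k ranked players is a relevant one, else 0."""
    return int(any(player in relevant for player in ranked[:k]))


def mean_hit_at_k(cases: Iterable[Tuple[Sequence[str], Set[str]]], k: int = 10) -> float:
    """Average Hit@K over (ranked shortlist, set of actual signings) pairs."""
    cases = list(cases)
    if not cases:
        raise ValueError("at least one evaluation case is required")
    return sum(hit_at_k(ranked, relevant, k) for ranked, relevant in cases) / len(cases)


# Hypothetical evaluation cases: a ranked shortlist and the players actually signed.
example_cases = [
    (["p1", "p2", "p3"], {"p2"}),  # the signing is ranked second: a hit
    (["p4", "p5"], {"p9"}),        # the signing is not on the shortlist: a miss
]
print(mean_hit_at_k(example_cases, k=10))
```

Under this reading, a Hit@10 rate of 0.45 would mean that in 45% of the evaluated transfer cases the shortlist's top ten contained a player who was actually signed.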

Key Insights

Machine learning can improve football scouting by identifying suitable players faster and more consistently than relying solely on human intuition.

VAE-Gamma representation learning enhances player similarity analysis by transforming high-dimensional football statistics into meaningful latent features.

VAEP-based chemistry metrics (JOI and JDI) help evaluate how well a player is likely to fit within a team's offensive and defensive structure.

The proposed scouting system combines similarity search, chemistry prediction, and an interactive dashboard to support data-driven recruitment decisions.
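
The combination of similarity search and chemistry scoring described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the project's implementation: it assumes each player already has a latent vector (for example from the VAE encoder), uses cosine similarity as the distance measure, and treats JOI as the summed offensive value of a pair's interactions normalised per 90 shared minutes. All function names, vectors, and values are hypothetical.

```python
import math
from typing import Dict, List, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def most_similar(
    target_id: str, latents: Dict[str, Sequence[float]], top_n: int = 5
) -> List[Tuple[str, float]]:
    """Rank all other players by latent-space similarity to the target player."""
    target = latents[target_id]
    scored = [
        (pid, cosine_similarity(target, vec))
        for pid, vec in latents.items()
        if pid != target_id
    ]
    return sorted(scored, key=lambda item: item[1], reverse=True)[:top_n]


def joi_per_90(interaction_values: Sequence[float], minutes_together: float) -> float:
    """Assumed JOI form: summed offensive interaction value per 90 shared minutes."""
    if minutes_together <= 0:
        return 0.0
    return sum(interaction_values) * 90.0 / minutes_together


def shortlist(
    target_id: str,
    latents: Dict[str, Sequence[float]],
    chemistry: Dict[str, float],
    top_n: int = 5,
) -> List[str]:
    """Keep the top_n most similar players, then order them by chemistry score."""
    candidates = [pid for pid, _ in most_similar(target_id, latents, top_n)]
    return sorted(candidates, key=lambda pid: chemistry.get(pid, 0.0), reverse=True)


# Hypothetical 2-D latent vectors; real latent spaces would have more dimensions.
latents = {
    "target": [1.0, 0.0],
    "a": [0.9, 0.1],
    "b": [0.0, 1.0],
    "c": [0.8, 0.3],
}
# Hypothetical chemistry of each candidate with a key player at the buying club.
chemistry = {
    "a": joi_per_90([0.02, 0.05], minutes_together=180.0),
    "c": joi_per_90([0.10, 0.06], minutes_together=90.0),
}
print(shortlist("target", latents, chemistry, top_n=2))
```

In this toy example, players "a" and "c" are the two closest to the target in latent space, and "c" is then ranked first because its assumed chemistry score is higher. Filtering by similarity before ranking by chemistry is one possible ordering; a weighted blend of the two scores would be an equally plausible design.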

Data source: WhoScored • Last updated: December 2023