Performance Labeling System
How our foundational model represents player performance
Our base model predicts five key performance statistics as a single unified label. Each label represents a unique combination of discrete performance bins.
For example:
A performance falling into the bin group “10–20 passes completed, 80–85% pass completion, 34–44 touches, >2 shots, >0.30 xG” corresponds to the label 1993.
There are 7,680 unique labels, one for every possible combination of bins across the five statistics, each describing a distinct category of performance. This discretization strategy, which turns continuous values into categorical bins, was a deliberate modeling choice that:
Transforms the problem into multi-class classification, which proved more stable and performant than regression.
Encourages structured, contextual reasoning about performance profiles instead of noisy continuous estimates.
Scales elegantly to additional performance metrics: if you want to build your own simulation engine with custom metrics, you can naturally extend the BALLER Transfer Portal architecture with your own binning scheme.
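To make the scaling argument concrete, here is a minimal sketch of how the size of the label space follows from the bin counts. The counts are taken from the bins listed under Stat Bin Definitions (six for passes completed); the names `BASE_BINS` and `label_space_size`, and the extra `tackles` metric, are hypothetical and only serve as illustration.

```python
from math import prod

# Bin counts per statistic in the base model (the N/A bin is included).
BASE_BINS = {
    "passes_completed": 6,
    "pass_completion_pct": 8,
    "touches": 8,
    "shots": 4,
    "xg": 5,
}


def label_space_size(bins_per_stat):
    """Number of distinct labels: the product of the per-statistic bin counts."""
    return prod(bins_per_stat.values())


base_size = label_space_size(BASE_BINS)

# Hypothetical extension: a custom "tackles" metric with 4 bins (3 ranges + N/A).
extended_size = label_space_size({**BASE_BINS, "tackles": 4})

print(base_size, extended_size)
```

Each added metric multiplies the number of classes by its bin count, so coarse bins keep the classification head manageable.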
Stat Bin Definitions
Below are the bins used in our base model.
Shots (4 bins)
| Bin | Range |
| --- | --- |
| 0 | 0 shots |
| 1 | 1–2 shots |
| 2 | >2 shots |
| 3 | N/A |
Expected goals: xG (5 bins)
| Bin | Range |
| --- | --- |
| 0 | 0.00 |
| 1 | 0.01–0.10 |
| 2 | 0.11–0.30 |
| 3 | >0.30 |
| 4 | N/A |
Pass Completion % (8 bins)
| Bin | Range |
| --- | --- |
| 0 | 0–60% |
| 1 | 60–70% |
| 2 | 70–75% |
| 3 | 75–80% |
| 4 | 80–85% |
| 5 | 85–90% |
| 6 | 90–100% |
| 7 | N/A |
Passes Completed (6 bins)
| Bin | Range |
| --- | --- |
| 0 | 0–10 |
| 1 | 10–20 |
| 2 | 20–30 |
| 3 | 30–50 |
| 4 | 50–200 |
| 5 | N/A |
Touches (8 bins)
| Bin | Range |
| --- | --- |
| 0 | 0–11 |
| 1 | 11–22 |
| 2 | 22–34 |
| 3 | 34–44 |
| 4 | 44–55 |
| 5 | 55–69 |
| 6 | 69–217 |
| 7 | N/A |
As the N/A bins show, the model explicitly handles missing or nonsensical values. This allows it to output “I don’t know” when the context does not provide enough information, rather than producing incorrect predictions. It also lets the model gracefully handle special cases, such as statistics that are irrelevant for goalkeepers, without forcing artificial or misleading outputs.
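As an illustration of how a raw value could be mapped to a bin with an explicit N/A fallback, here is a minimal sketch for the touches statistic. The edges come from the Touches table. The boundary convention (lower edge inclusive, upper edge exclusive, except for the last bin) and the rule that out-of-range values map to N/A are assumptions, since the tables do not specify them. The name `touches_bin` is hypothetical.

```python
import math
from bisect import bisect_right

# Inner edges separating touches bins 0..6, taken from the Touches table.
TOUCHES_INNER_EDGES = [11, 22, 34, 44, 55, 69]
TOUCHES_MAX = 217  # upper limit of the last regular bin
TOUCHES_NA = 7     # index of the N/A bin


def touches_bin(value):
    """Map a raw touch count to a bin index.

    Missing (None/NaN) or out-of-range values go to the N/A bin.
    Assumption: a value equal to an edge belongs to the higher bin.
    """
    if value is None or math.isnan(value) or not 0 <= value <= TOUCHES_MAX:
        return TOUCHES_NA
    return bisect_right(TOUCHES_INNER_EDGES, value)


print(touches_bin(40))    # falls in 34–44
print(touches_bin(None))  # missing value
```

The same pattern applies to the other statistics by swapping in their edges and N/A index.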
Convert bins to a unique label
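One way to sketch this conversion is a mixed-radix encoding, where each statistic acts as one digit whose base is its bin count. The digit order used here (passes completed, pass completion %, touches, shots, xG) is an assumption, chosen because it reproduces the documented example label 1993. The names `STATS` and `bins_to_label` are illustrative and not necessarily those of the actual implementation.

```python
# (statistic, bin count), most significant digit first.
# Assumption: this order is inferred from the documented example (label 1993).
STATS = [
    ("passes_completed", 6),
    ("pass_completion_pct", 8),
    ("touches", 8),
    ("shots", 4),
    ("xg", 5),
]


def bins_to_label(bins):
    """Encode a dict of bin indices into a single integer label."""
    label = 0
    for name, size in STATS:
        index = bins[name]
        if not 0 <= index < size:
            raise ValueError(f"{name} bin {index} outside 0..{size - 1}")
        label = label * size + index
    return label


# 10–20 passes, 80–85% completion, 34–44 touches, >2 shots, >0.30 xG
example = {
    "passes_completed": 1,
    "pass_completion_pct": 4,
    "touches": 3,
    "shots": 2,
    "xg": 3,
}
print(bins_to_label(example))  # 1993
```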
Convert label to bins
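The reverse direction can be sketched by peeling off one digit at a time with `divmod`, starting from the least significant statistic. As before, the digit order is an assumption that matches the documented example (label 1993), and the names `STATS` and `label_to_bins` are illustrative.

```python
from math import prod

# (statistic, bin count), most significant digit first.
# Assumption: this order is inferred from the documented example (label 1993).
STATS = [
    ("passes_completed", 6),
    ("pass_completion_pct", 8),
    ("touches", 8),
    ("shots", 4),
    ("xg", 5),
]
NUM_LABELS = prod(size for _, size in STATS)  # 7,680


def label_to_bins(label):
    """Decode an integer label into a dict of per-statistic bin indices."""
    if not 0 <= label < NUM_LABELS:
        raise ValueError(f"label {label} outside 0..{NUM_LABELS - 1}")
    decoded = {}
    for name, size in reversed(STATS):
        label, decoded[name] = divmod(label, size)
    # Return the statistics in their canonical order.
    return {name: decoded[name] for name, _ in STATS}


print(label_to_bins(1993))
```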
Convert label to human-readable stat bins
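For readable output, the decoded bin indices can be looked up in the bin tables. The sketch below hard-codes the ranges from Stat Bin Definitions and assumes the same digit order as the documented example (label 1993); `STAT_BINS` and `describe_label` are hypothetical names.

```python
from math import prod

# (statistic, bin descriptions by index), most significant digit first.
# Assumption: this order is inferred from the documented example (label 1993).
STAT_BINS = [
    ("Passes completed", ["0–10", "10–20", "20–30", "30–50", "50–200", "N/A"]),
    ("Pass completion %", ["0–60%", "60–70%", "70–75%", "75–80%",
                           "80–85%", "85–90%", "90–100%", "N/A"]),
    ("Touches", ["0–11", "11–22", "22–34", "34–44",
                 "44–55", "55–69", "69–217", "N/A"]),
    ("Shots", ["0 shots", "1–2 shots", ">2 shots", "N/A"]),
    ("xG", ["0.00", "0.01–0.10", "0.11–0.30", ">0.30", "N/A"]),
]
NUM_LABELS = prod(len(names) for _, names in STAT_BINS)  # 7,680


def describe_label(label):
    """Decode a label and return the human-readable range of each statistic."""
    if not 0 <= label < NUM_LABELS:
        raise ValueError(f"label {label} outside 0..{NUM_LABELS - 1}")
    described = {}
    for stat, names in reversed(STAT_BINS):
        label, index = divmod(label, len(names))
        described[stat] = names[index]
    # Return the statistics in their canonical order.
    return {stat: described[stat] for stat, _ in STAT_BINS}


for stat, value in describe_label(1993).items():
    print(f"{stat}: {value}")
```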