Last year, I wrote a blog post that modelled the effect that money has on outcomes in Europe’s five biggest football leagues. The key explanatory variable in my model was Transfermarkt’s squad market values. Squad value served as a proxy for a club’s financial clout. The assumption was that clubs generally invest as much money in the team as possible, but data capturing the full extent of squad spending is challenging to source. Research shows Transfermarkt’s player market values are strongly associated with transfer fees (Herm, Callsen-Bracker, and Kreis 2014; Müller, Simons, and Weinmann 2017; Coates and Parshakov 2022) and player salaries (Prockl and Frick 2018), so I figured squad values would be an adequate proxy for squad spending. But what if I was wrong?
A little over a month ago, I was made aware of the DFL’s (Deutsche Fussball Liga) financial reports detailing the annual accounts for every club in the Bundesliga (and 2. Bundesliga). The accounts include each club’s “staff costs”, which is the amount each club spends on wages for all employees (including but not limited to playing staff). I think this should be a close approximation of squad wage spending[1]. Never one to miss an opportunity to make content, I went to work and grabbed six seasons of data from 2017/18 to 2022/23. Now, we can determine whether Transfermarkt’s squad values are a reasonable proxy for the data I was too lazy to go and find myself. And more importantly, we can find out if I was wrong[2].
Exploring the Data
The data is pretty solid, but it comes with some caveats. While most clubs report their staff costs on a seasonal basis, some report staff costs for the financial year. In those cases, the costs don’t perfectly align with a single season, but I don’t think the change in costs from season to season is significant enough to be a huge concern. I also had to drop Frankfurt’s 2022/23 season[3] and Paderborn’s 2021/22 season[4] from the data.
I’m not too worried that these caveats will cause any significant problems. Although the sample size is relatively small, the data seem sufficient for the question I’m trying to answer here.
The Value-Cost Gap (Difference)
First, it’s probably worth comparing the squad values and staff costs directly to understand what we are working with. I will start by capturing the difference between the two and exploring how this figure varies across the dataset.
Plotting the difference between squad value and staff costs below, we see that the gap between the two is sizeable, with squad value usually (but not always) larger.
buli_resources |>
  mutate(diff = squad_value - staff_cost) |>
  ggplot(aes(x = diff)) +
  geom_histogram(color = "#343a40", bins = 25) +
  geom_hline(yintercept = 0, colour = "#343a40") +
  scale_euros(axes = "x") +
  labs(
    title = "Differences Between Squad Value & Staff Costs in the Bundesliga",
    subtitle = stringr::str_wrap(
      glue::glue(
        "The distribution of club value differences (squad value minus staff ",
        "costs) per season in the Bundesliga from 2017/18 to 2022/23."
      ),
      width = 95
    ),
    x = "Value Difference\n(Squad Value - Staff Costs)",
    y = NULL,
    caption = "Visualisation: Paul Johnson | Data: DFL & Transfermarkt"
  )
The difference between the two isn’t an issue. Squad value and staff costs are different things. If they generally vary together, that will suffice.
It’s also worth considering the team-level variance[5]. Table 1 shows the median difference between squad value and staff costs per season for clubs that have played three or more seasons in the data[6].
Table 1: Median Value Differences in the Bundesliga from 2017/18 to 2022/23

| Team | Value Difference (Squad Value - Staff Costs) |
| --- | --- |
| Bayern Munich | €463.3M |
| Borussia Dortmund | €400.1M |
| RB Leipzig | €344.1M |
| Bayer Leverkusen | €276.0M |
| Borussia Mönchengladbach | €145.0M |
| Hoffenheim | €139.1M |
| Frankfurt | €125.0M |
| Wolfsburg | €111.5M |
| Hertha BSC | €103.2M |
| Schalke | €97.5M |
| Freiburg | €81.2M |
| Mainz | €77.1M |
| Stuttgart | €67.9M |
| Augsburg | €66.7M |
| Union Berlin | €51.9M |
| Werder Bremen | €48.8M |
| Köln | €28.3M |

Source: DFL & Transfermarkt
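The calculation behind a table like Table 1 can be sketched in a few lines of base R. This is only an illustration: the `toy` data frame and its numbers are made up, and base R stands in for the tidyverse pipeline used elsewhere in the post.

```r
# Toy club-season data in € millions (made up, not taken from the DFL reports)
toy <- data.frame(
  team        = c("A", "A", "A", "B", "B", "B", "C", "C"),
  squad_value = c(500, 520, 560, 120, 100, 140, 80, 90),
  staff_cost  = c(200, 210, 220, 60, 55, 70, 40, 45)
)

# difference between squad value and staff costs for each club-season
toy$diff <- toy$squad_value - toy$staff_cost

# keep only clubs with three or more seasons in the data
n_seasons <- table(toy$team)
keep <- names(n_seasons)[n_seasons >= 3]
toy_filtered <- toy[toy$team %in% keep, ]

# median difference per club, sorted from largest to smallest
median_diff <- aggregate(diff ~ team, data = toy_filtered, FUN = median)
median_diff <- median_diff[order(-median_diff$diff), ]
median_diff
```

Club C only has two seasons in the toy data, so it drops out before the medians are taken.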
Looking at Table 1, it is obvious that the differences are at least partially a function of the team’s quality or the resources available to teams. This probably identifies an issue with Transfermarkt’s values. Squad values are updated regularly and will be responsive to player (and team) performances. Good teams perform better, so their squad values rise accordingly.
Another explanation could be that this is just a product of scales. The squad values are generally larger than the staff costs, but perhaps they are a certain percentage higher, on average, and for teams with higher staff costs, that’s inherently going to mean a bigger difference.
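A quick bit of arithmetic shows how the scale explanation would work. The multiplier of 2 and the staff costs below are assumptions chosen purely for illustration, not estimates from the data.

```r
# Hypothetical staff costs (in € millions) for a small, mid-sized, and large club
staff_cost <- c(40, 100, 350)

# Assumption for illustration only: squad value is always twice staff costs
multiplier <- 2
squad_value <- multiplier * staff_cost

# the absolute gap grows with club size...
value_diff <- squad_value - staff_cost

# ...while the relative gap is identical for every club
value_ratio <- squad_value / staff_cost

data.frame(staff_cost, squad_value, value_diff, value_ratio)
```

If something like this held in the real data, the ranking in Table 1 would mostly reflect club size rather than anything specific to how Transfermarkt values players.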
When Value Meets Reality (Correlation)
If the differences between squad value and staff costs are a mixed bag, the correlation between the two paints a clearer picture. There is a 0.94 correlation between squad value and staff costs, which is pretty remarkable.
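For reference, a correlation like this can be computed with base R's `cor()`. The sketch below uses made-up numbers and assumes the Pearson coefficient (the default for `cor()`); the post does not say which coefficient was used.

```r
# Toy club-season data in € millions (made up for illustration)
staff_cost  <- c(35, 50, 60, 90, 120, 150, 230, 340)
squad_value <- c(60, 95, 110, 200, 260, 300, 520, 800)

# Pearson correlation (the default method in cor())
r <- cor(staff_cost, squad_value)

# rank-based alternative that is less sensitive to the largest clubs
rho <- cor(staff_cost, squad_value, method = "spearman")

round(c(pearson = r, spearman = rho), 2)
```

Comparing the two is a cheap check on whether a handful of very rich clubs are driving the result.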
When we visualise the association between the two below, it further illustrates how closely tied they are.
Plot Code
buli_resources |>
  ggplot(aes(x = staff_cost, y = squad_value)) +
  geom_point(alpha = 0.5, size = 2, colour = "#343a40") +
  geom_smooth(
    method = lm,
    colour = "#026E99",
    linewidth = 1.2,
    alpha = 0.15
  ) +
  scale_euros() +
  labs(
    title = "Squad Value by Staff Costs in the Bundesliga",
    subtitle = stringr::str_wrap(
      glue::glue(
        "The correlation between staff costs and squad value in the Bundesliga ",
        "from 2017/18 to 2022/23."
      ),
      width = 95
    ),
    x = "Staff Costs",
    y = "Squad Value",
    caption = "Visualisation: Paul Johnson | Data: DFL & Transfermarkt"
  )
The relationship is incredibly clean. It is clearly linear, and any variance is minimal. Relative to the trend line, squad values are undervalued at the lower end of staff costs, but the reverse is true as costs increase. At the very top end of staff costs, the squad values are all slightly undervalued. Squad value seems to be an excellent approximation of staff costs from this perspective.
Comparing Predictive Performance
The final test of squad value’s validity as a proxy for staff costs is to compare how well both predict Bundesliga outcomes. To remain consistent with the original model, I will use the outcomes used in my original blog post—points, goal difference, and expected goal (xG) difference—all standardised using games played (so their value represents outcome value per game).
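Standardising by games played is just a division of each season total by the number of games. A minimal sketch with made-up totals follows; the `_total` column names are hypothetical, while `pts`, `gd`, and `xgd` match the names used in the plotting and modelling code.

```r
# Toy season totals (made up for illustration); column names ending in
# `_total` are hypothetical, and each Bundesliga club plays 34 games a season
toy <- data.frame(
  team      = c("A", "B"),
  games     = c(34, 34),
  pts_total = c(68, 34),
  gd_total  = c(34, -17),
  xgd_total = c(25.5, -8.5)
)

# divide each season total by games played to get per-game outcomes
toy$pts <- toy$pts_total / toy$games
toy$gd  <- toy$gd_total / toy$games
toy$xgd <- toy$xgd_total / toy$games

toy[, c("team", "pts", "gd", "xgd")]
```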
Squad value and staff costs are strongly correlated with all three league outcomes, as shown in Table 2.
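Table 2: Correlation Between Squad Value/Staff Costs & Bundesliga Outcomes

| Predictor | Outcome | Correlation |
| --- | --- | --- |
| Squad Value | Points | 0.83 |
| Squad Value | Goal Difference | 0.86 |
| Squad Value | xG Difference | 0.88 |
| Staff Costs | Points | 0.75 |
| Staff Costs | Goal Difference | 0.79 |
| Staff Costs | xG Difference | 0.81 |

Source: DFL & Transfermarkt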
All three outcomes have stronger correlations with squad value than staff costs. The differences are relatively small, but they are consistent. However, the differences are barely perceptible when we plot the relationship between our two predictors and the Bundesliga outcomes below.
Plot Code
buli_resources |>
  # reshape data to long format for outcomes and metrics
  pivot_longer(
    cols = c(pts, xgd, gd),
    names_to = "outcome",
    values_to = "outcome_value"
  ) |>
  pivot_longer(
    cols = c(squad_value, staff_cost),
    names_to = "metric",
    values_to = "metric_value"
  ) |>
  mutate(
    outcome = recode(
      outcome,
      pts = "Points",
      gd = "Goal Difference",
      xgd = "xG Difference",
      .default = outcome
    ),
    outcome = forcats::fct(
      outcome,
      levels = c("Points", "Goal Difference", "xG Difference")
    ),
    metric = if_else(metric == "squad_value", "Squad Value", "Staff Costs")
  ) |>
  ggplot(aes(x = metric_value, y = outcome_value)) +
  geom_point(alpha = 0.5, size = 1.2, color = "#343a40") +
  geom_smooth(
    # log-linear trend lines
    method = lm,
    formula = y ~ log(x),
    colour = "#026E99",
    se = TRUE,
    linewidth = 1,
    alpha = 0.2
  ) +
  facet_grid(
    # layout by outcome and metric
    rows = vars(outcome),
    cols = vars(metric),
    scales = "free",
    switch = "y"
  ) +
  scale_euros(axes = "x") +
  labs(
    title = "Bundesliga Outcomes by Squad Value & Staff Costs",
    subtitle = stringr::str_wrap(
      glue::glue(
        "Comparing squad value and staff costs' association with Bundesliga ",
        "outcomes (points, goal difference, and xG difference) from 2017/18 to ",
        "2022/23. Outcomes standardised by games."
      ),
      width = 93
    ),
    x = NULL,
    y = NULL,
    caption = "Visualisation: Paul Johnson | Data: DFL & Transfermarkt"
  ) +
  theme(
    panel.spacing = unit(1, "lines"),
    strip.placement = "outside",
    panel.spacing.x = unit(.3, "cm"),
    axis.text.x = element_text(angle = 30, vjust = 1, hjust = 0.8)
  )
The variance appears to be a little larger in the staff costs plots, particularly at around €100m, where the bulk of the observations are. Still, the difference is not obvious. I’m not sure I would have noticed the difference if I wasn’t already aware of the slightly weaker correlation between staff costs and the league outcomes. Perhaps this is a me problem. Maybe I’m confessing my lack of attention to detail.
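If eyeballing the plots isn't convincing, the spread around a log-linear trend can be put into numbers by fitting `lm(y ~ log(x))` for each predictor and comparing R². The sketch below uses made-up data and a deterministic "wiggle" in place of random noise, and it assumes both predictors follow the same underlying trend. It illustrates the comparison and is not a reproduction of the plots above.

```r
# Toy data (made up): one outcome and two candidate predictors in € millions
n <- 20
pts <- seq(0.8, 2.4, length.out = n)       # points per game
wiggle <- rep(c(-1, 1), length.out = n)    # deterministic stand-in for noise

# Assumption: both predictors follow the same log-linear trend,
# but staff_cost is built with more noise around it
squad_value <- exp(3 + 1.5 * pts + 0.05 * wiggle)
staff_cost  <- exp(2.5 + 1.5 * pts + 0.30 * wiggle)

# same model form as the trend lines: outcome regressed on log(predictor)
fit_value <- lm(pts ~ log(squad_value))
fit_cost  <- lm(pts ~ log(staff_cost))

rsq_value <- summary(fit_value)$r.squared
rsq_cost  <- summary(fit_cost)$r.squared

round(c(squad_value = rsq_value, staff_cost = rsq_cost), 3)
```

The noisier predictor ends up with the lower R², which is the sort of gap that is easy to miss in a scatter plot.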
# set seed for reproducibility
set.seed(42)

# set up parallel backend
future::plan(future::multisession, workers = parallel::detectCores() - 1)

# split train/test data and specify folds
splits <- initial_split(buli_resources, prop = 0.7)
train <- training(splits)
test <- testing(splits)
folds <- vfold_cv(train, v = 10, repeats = 5)

# generate recipes
recipes <- create_recipes(
  targets = c("pts", "gd", "xgd"),
  predictors = c("squad_value", "staff_cost")
)

# xgboost model spec (with tuning)
xgb_spec <-
  boost_tree(
    trees = 1200,
    learn_rate = 0.005,
    tree_depth = tune(),
    loss_reduction = tune(),
    sample_size = tune(),
    stop_iter = tune()
  ) |>
  set_engine("xgboost") |>
  set_mode("regression")

# workflow set from recipes and model
wf_sets <- workflow_set(
  preproc = recipes,
  models = list(xgb = xgb_spec)
)

# tuning control
ctrl <- finetune::control_sim_anneal(
  save_pred = TRUE,
  parallel_over = "everything",
  save_workflow = TRUE,
  verbose = TRUE
)

# tune all workflows
tuned_results <- workflow_map(
  wf_sets,
  fn = "tune_sim_anneal",
  control = ctrl,
  metrics = metric_set(rmse, rsq),
  resamples = folds,
  seed = 42
)

# stop future plan
future::plan(future::sequential)
The last step in this process is to fit some models that are doing far too much, considering they are ostensibly intended to compare the predictive power of two different features. I’ve fit and tuned six different XGBoost models, one each for squad value and staff costs across the three outcomes. Season and team have been included as additional features, similar to the original model[7].
I won’t bother walking through the models in detail. The goal isn’t to go to great lengths to fit perfect models. I’ve done a little tuning just for the hell of it[8], but the focus is on comparing the performance of the squad value and staff costs models.
Model performance, measured using root mean square error (RMSE) and R², is shown in Table 3.
Helper Functions Code
# select the best parameters
select_best_model <- function(results, model_id) {
  results |>
    extract_workflow_set_result(model_id) |>
    select_best(metric = "rmse")
}

# finalize and refit on full training data
finalize_and_fit_model <- function(results, model_id, data_split) {
  best_params <- select_best_model(results, model_id)

  results |>
    extract_workflow(model_id) |>
    finalize_workflow(best_params) |>
    last_fit(split = data_split, metrics = metric_set(rmse, rsq))
}

# extract test performance metrics
get_test_metrics <- function(results, model_id, data_split) {
  finalize_and_fit_model(results, model_id, data_split) |>
    collect_metrics()
}
Table 3: Performance of XGBoost Models Predicting Bundesliga Outcomes

| Predictor | Outcome | RMSE | R²* |
| --- | --- | --- | --- |
| Squad Value | Points† | 0.24 | 0.65 |
| Squad Value | Goal Difference† | 0.4 | 0.65 |
| Squad Value | xG Difference† | 0.31 | 0.58 |
| Staff Costs | Points† | 0.3 | 0.47 |
| Staff Costs | Goal Difference† | 0.49 | 0.47 |
| Staff Costs | xG Difference† | 0.36 | 0.44 |

Source: DFL & Transfermarkt
(*) Calculated using correlation (see documentation)
(†) Standardised using games played
Table 3 shows a similar pattern to the correlations in Table 2. The squad value models consistently outperform the staff costs models across both metrics, especially R². It’s worth noting that the models are tuned on RMSE, so the variance in R² across the two sets of models is possibly a function of this. However, there is a sizable increase in RMSE, too.
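For anyone unsure what the two metrics measure, here is a minimal base R sketch with made-up predictions. It assumes the R² in Table 3 is the squared correlation between observed and predicted values, as the table note suggests, and shows the "traditional" definition alongside for comparison.

```r
# Toy observed and predicted points per game (made up for illustration)
truth    <- c(1.0, 1.2, 1.5, 1.8, 2.3)
estimate <- c(1.1, 1.1, 1.6, 1.7, 2.1)

# root mean square error: typical size of a prediction error
rmse <- sqrt(mean((truth - estimate)^2))

# R-squared as the squared correlation between truth and prediction
rsq_cor <- cor(truth, estimate)^2

# "traditional" R-squared: 1 minus residual over total sum of squares
rsq_trad <- 1 - sum((truth - estimate)^2) / sum((truth - mean(truth))^2)

round(c(rmse = rmse, rsq_cor = rsq_cor, rsq_trad = rsq_trad), 3)
```

The correlation-based version ignores any systematic offset or scaling in the predictions, so it can never be lower than the traditional one. In yardstick, these correspond to `rsq()` and `rsq_trad()` respectively.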
The consistency of this finding, plus the evidence in Table 1 that the difference between squad value and staff costs is partially a function of team quality, suggests that squad value has some performance bias baked in. I don’t think this invalidates squad values as a proxy for investment in the squad, but it does demonstrate a limitation in this approach.
How Reliable are Squad Values?
There is plenty of research that demonstrates the reliability of Transfermarkt’s player market values (and by extension squad values) across a variety of contexts (Herm, Callsen-Bracker, and Kreis 2014; Müller, Simons, and Weinmann 2017; Prockl and Frick 2018; Smith 2021; Coates and Parshakov 2022; James 2022). They are a decent approximation of player values on an open market (Transfermarkt 2021), which makes them pretty handy for several different use cases. I think the evidence shown in this blog post suggests that a proxy for investment in squads is one of those use cases, but that conclusion does come with some caveats.
Given how Transfermarkt’s values are estimated, I think there will obviously be some performance bias. Historic squad values are a snapshot from previous seasons, and they appear to be an average across the season[9]. As a consequence, adjustments to player values due to performance will have some impact on squad values.
It’s also worth noting that I’m using a small sample of data limited only to the Bundesliga[10], which reduces how much we can infer from this. Evidence shows that player values vary in predictive value by league (Müller, Simons, and Weinmann 2017; Coates and Parshakov 2022), so findings based on a handful of seasons in a single league should be understood within their limited scope.
Still, I think the results support the idea that Transfermarkt’s squad values are a reasonable proxy for squad spending. Regarding my model specifically, I think the structure of the model should help negate some of these issues, particularly the effects of partial pooling. Generally, though, squad value and staff costs are almost perfectly correlated. While the squad value models are consistently better, I don’t think they are so different that it invalidates using squad value as a proxy.
Instead, I think it is something to be aware of if using squad values in this context. Every approach will have limitations, but if you can identify them, discuss them in your analysis, and even point to the specific ways they impact your model, you should be in a good place. With squad values, better the devil you know.
Acknowledgments
Many thanks to Ansgar Wolsing for pointing me to the staff costs data I used in this blog post.
Müller, Oliver, Alexander Simons, and Markus Weinmann. 2017. “Beyond Crowd Judgments: Data-Driven Estimation of Market Value in Association Football.” European Journal of Operational Research 263 (2): 611–24. https://www.sciencedirect.com/science/article/pii/S0377221717304332.
Prockl, Franziska, and Bernd Frick. 2018. “Information Precision in Online Communities: Player Valuations on www.transfermarkt.de.” International Journal of Sport Finance 13 (4): 319–35. https://d-nb.info/124512143X/34.
[1] Players and playing staff will make up the majority of staff costs for all clubs, and I think it is reasonable to assume that non-playing staff costs will be directly related to playing staff costs (bigger teams like Bayern Munich will have more non-playing employees).
[3] Frankfurt were one of the teams reporting their costs for the financial year, but in 2023, they switched to reporting costs over a season. To line their reports up, this meant the 2023 report is just the costs for the first six months of 2023 (so that the following season is on the correct schedule).
[4] The figures published in 2024 refer to the 2022/23 season, but the reports are organised by division, with all teams in the Bundesliga in the 2024/25 season in one report and the same for the 2. Bundesliga. For the most part, this was just an inconvenience. In 2022/23, however, Paderborn were relegated from the Bundesliga, followed by another relegation down to the 3. Liga in 2023/24. The trouble is, while the top two tiers in Germany are governed by the DFL, the third tier is governed by the DFB. Paderborn’s consecutive relegations mean that they aren’t included in these reports. It’s despicable. The real victim here is me.
[5] I also looked at differences over time. There was minimal variance from season to season, though 2017/18 was considerably lower than others, and I concluded that time isn’t relevant.
[6] My main reason for filtering the data this way was to reduce the list of clubs included and make the table a little smaller. It also removes the noisiest observations (though all are obviously small samples since the maximum seasons are just six).
[7] Some minor differences exist between these models and those I fit in my earlier blog post. These differences are primarily due to the flexibility of the original model’s multilevel structure, which allowed me to do a little more.
[8] I used this post as an excuse to play around with future and furrr, so I ended up going to greater lengths to tune the models, but only as a way to make better use of parallel processing.
[9] I’m unsure how historic squad values are derived, but they don’t appear to align with values at a specific point in the season, so I assume they are an average. A workaround would be to recreate preseason squad values by scraping Transfermarkt’s player values over time (shout out to John Muller for pointing this one out to me).
[10] And while I am confident that staff and squad costs will be roughly equivalent, this adds another layer of potential noise.