130% ROI by using AI to predict the outcome of AFL matches

During my tenure at a leading bank, my role involved developing algorithms to identify fraudulent transactions. The concept at a high level was straightforward: with millions of transactions processed daily, narrow them down to a few hundred potential fraud cases for our operations team to examine. However, the implementation was far from simple.

It involved a complex mix of coding, regression models, neural networks, and various other machine learning techniques. Nowadays, with the emergence of ChatGPT, AI has become the buzzword everyone loves to use.

This experience led me to an intriguing idea: if AI could be effectively employed to detect fraud, could it also be possible to predict the outcomes of AFL games using similar techniques?

 
  1. Predictive Analytics.

    If you’re familiar with the corporate world, you’ve likely encountered the phrase “garbage in, garbage out” in relation to data. The foundation of any predictive analytics is built on three critical components: 1) high-quality data, 2) high-quality data, and 3) high-quality data. To initiate this project, I gathered an extensive range of publicly available statistical data on AFL matches, covering goals, marks, handballs, kicks and game locations, and matched each game with historical weather data. While I was uncertain which data would prove most valuable, starting with a comprehensive dataset was crucial. This resulted in around 100 parameters for each game.

    However, the complexity increased when I began using this data to create parameters for my model. To illustrate, let’s consider one specific parameter: team handballs. From it I created a data point for handballs in the last game, average handballs over the last 3, 5, and 10 games, and the season average. I also calculated the average handballs over the last 1, 2, and 3 games against the specific opposing team, the average handballs at the same ground, the team’s average handballs in similar weather conditions, and so forth. As you can imagine, once all these parameters were created in the hopes of enhancing the model’s predictive accuracy, the total number of parameters for each game skyrocketed to over 5000, all of which were fed into the model to predict game outcomes.
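To make the handball example concrete, here is a minimal sketch of how those derived parameters could be built. The function name, the window sizes as a tuple, and the sample numbers are my own illustrative assumptions, not the original code.

```python
from statistics import mean

def rolling_features(values, windows=(3, 5, 10)):
    """Build derived features from a team's past per-game values (oldest first).

    Only games played BEFORE the match being predicted should be passed in,
    so the features never leak the result we are trying to predict.
    """
    if not values:
        return {}
    features = {"last_game": values[-1], "season_avg": mean(values)}
    for w in windows:
        # If fewer than w games have been played, average what is available.
        features[f"avg_last_{w}"] = mean(values[-w:])
    return features

# Made-up handball counts for one team's five most recent games.
handballs = [150, 160, 170, 140, 180]
feats = rolling_features(handballs)
print(feats)
```

Repeating this for every raw statistic, and again for each opponent, ground and weather grouping, is how roughly 100 raw parameters can balloon into thousands.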

    Feeding 5000 parameters into a neural network is not practical, as it would require an extensive amount of time to train. More crucially, many of these parameters might not contribute to predicting match outcomes, compromising the prediction’s accuracy. To address this, we employed a genetic algorithm, an approach inspired by the process of natural evolution. This algorithm began by selecting a random subset of parameters and processing them through our neural network. After evaluating the model’s performance, the algorithm would then alter the input parameters randomly and rerun the model. If the new set of parameters improved the model’s predictions, these changes were noted and integrated into subsequent iterations. Conversely, if the performance deteriorated, the algorithm recognised that those parameters were less predictive and eliminated them. Through numerous iterations of this ‘evolutionary process,’ the algorithm effectively identified the most predictive parameters from the original 5000. This refined model could then provide a probability score between 0 and 1, indicating the likelihood of the home team’s victory.
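The loop described above can be sketched in a few lines. This is a simplified, mutation-only version (one candidate at a time, no population or crossover), and the `toy_fitness` function is a hypothetical stand-in for "train the neural network and measure its accuracy", which is far too slow to show here.

```python
import random

def evolve_feature_subset(n_features, fitness, generations=200,
                          subset_size=5, seed=0):
    """Mutation-only evolutionary search over feature subsets.

    `fitness` maps a frozenset of feature indices to a score (higher is
    better). In the real system that score would come from training and
    validating the model on the chosen features.
    """
    rng = random.Random(seed)
    best = frozenset(rng.sample(range(n_features), subset_size))
    best_score = fitness(best)
    for _ in range(generations):
        # Mutate: swap one selected feature for one that is not selected.
        drop = rng.choice(sorted(best))
        add = rng.choice([i for i in range(n_features) if i not in best])
        candidate = (best - {drop}) | {add}
        score = fitness(candidate)
        # Keep the change only if the score improved; otherwise discard it.
        if score > best_score:
            best, best_score = candidate, score
    return best, best_score

# Toy assumption: pretend only features 0-4 carry any predictive signal.
INFORMATIVE = frozenset(range(5))

def toy_fitness(subset):
    return len(subset & INFORMATIVE)

selected, score = evolve_feature_subset(n_features=50, fitness=toy_fitness)
print(sorted(selected), score)
```

A fuller genetic algorithm would keep a whole population of candidate subsets and recombine the best ones, but the accept-if-better loop is the core idea.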

  2. Betting System.

    If our system predicts a 0.7 probability of the home team winning, then the probability of the away team winning is 0.3. To calculate the fair bookmaker odds, you divide 1 by the probability of winning. Hence the fair odds are $1.43 for a home win (1 / 0.7) and $3.33 for an away win (1 / 0.3). Whenever the bookmaker pays more than these fair odds, we have an edge (strictly speaking a value bet rather than a risk-free arbitrage).
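The arithmetic is simple enough to show directly. In this small sketch the function names and the bookmaker price of $1.60 are made up for illustration.

```python
def fair_odds(probability):
    """Decimal odds at which a bet with this win probability breaks even."""
    if not 0 < probability < 1:
        raise ValueError("probability must be strictly between 0 and 1")
    return 1 / probability

def expected_profit_per_dollar(probability, bookmaker_odds):
    """Expected profit per $1 staked at the bookmaker's decimal odds."""
    return probability * bookmaker_odds - 1

home_fair = fair_odds(0.7)   # about 1.43
away_fair = fair_odds(0.3)   # about 3.33

# Hypothetical bookmaker price of $1.60 on the home team:
edge = expected_profit_per_dollar(0.7, 1.60)  # about +0.12 per $1 staked
print(round(home_fair, 2), round(away_fair, 2), round(edge, 2))
```

A positive expected profit means the bookmaker is paying more than the fair odds, and a negative one means the bet should be skipped.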

    Next we use a fractional Kelly staking system to guide our betting decisions. This system involves placing a bet when the bookmaker’s odds are longer than the fair odds implied by our calculated probability, and sizing the stake in proportion to that edge. The intriguing aspect of this strategy is that sometimes, wagering on the team expected to lose can be more profitable in the long term. For example, if our analysis suggests that the underdog team has a 25% chance of winning, but the bookmaker’s odds imply they have only a 10% chance, placing a bet on this underdog is advantageous. This is because, over time, the payout when the underdog wins roughly one game in four more than covers the losses on the other three, making it a strategically sound bet.
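For the curious, the standard Kelly formula for decimal odds is f = (p × o - 1) / (o - 1), where p is our win probability and o is the bookmaker's odds. Here is a minimal sketch using the underdog example above. The half-Kelly multiplier of 0.5 and the $5,000 bankroll in the example are assumptions for illustration, as the fraction we actually used is not stated here.

```python
def kelly_fraction(probability, decimal_odds):
    """Full-Kelly share of bankroll to stake; 0 if the bet has no edge."""
    if decimal_odds <= 1:
        raise ValueError("decimal odds must be greater than 1")
    f = (probability * decimal_odds - 1) / (decimal_odds - 1)
    return max(f, 0.0)  # never bet when the edge is negative

def fractional_kelly_stake(bankroll, probability, decimal_odds, fraction=0.5):
    """Stake a fixed fraction of the full-Kelly amount to reduce volatility."""
    return bankroll * fraction * kelly_fraction(probability, decimal_odds)

# Underdog example: we estimate a 25% chance, while the bookmaker's price
# implies 10%, which corresponds to decimal odds of 1 / 0.10 = 10.0.
full = kelly_fraction(0.25, 10.0)                   # 1.5 / 9, about 0.167
stake = fractional_kelly_stake(5000, 0.25, 10.0)    # half Kelly on $5,000
print(round(full, 3), round(stake, 2))
```

Full Kelly maximises long-run growth only if the probabilities are exactly right, so staking a fraction of it is a common hedge against model error.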

  3. Our results.

    Throughout the season, we strictly adhered to the system’s guidance, placing bets precisely as recommended by the computer. We had a starting capital of $5,000 and, by the conclusion of the season, our funds had grown to approximately $11,500. That is a profit of $6,500 on $5,000, or a Return on Investment of 130%. At this point we thought we were one more season away from retiring on a tropical beach. However, we soon learned a hard truth: if you consistently win against a bookmaker, they are under no obligation to keep accepting your bets. This was the case with us, as we eventually found ourselves barred from further betting.

Luckily, in horse racing, there’s a concept known as parimutuel betting. In this system, all bets are pooled together. The operator takes a predetermined commission, and the rest of the money is distributed among the winners. Because the operator’s profit is unaffected by who wins, it has no reason to stop accepting your bets.
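As a rough illustration of how a parimutuel win pool pays out, here is a small sketch. The 15% commission and the pool sizes are made-up numbers, and real totes add details such as rounding rules that are ignored here.

```python
def parimutuel_dividend(pool_by_runner, winner, commission=0.15):
    """Payout per $1 staked on the winner in a simple win pool."""
    total = sum(pool_by_runner.values())
    net_pool = total * (1 - commission)  # what is left after the operator's cut
    return net_pool / pool_by_runner[winner]

# Hypothetical pool: dollars bet on each of three runners.
pool = {"A": 5000, "B": 3000, "C": 2000}
dividend = parimutuel_dividend(pool, "C")
print(round(dividend, 2))
```

Whichever runner wins, the operator keeps the same 15% of the pool, which is why a winning punter costs it nothing.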

As you probably guessed, we shifted our models to the track, and that is a story for another blog 🙂