Sprint Intelligence AG is founded on many decades of practical experience in AI.
The field of AI has been around since at least the 1940s. Over the following few decades much was promised but little delivered. The available computers were underpowered and the algorithms primitive. However, in more recent times the field has really turned itself around. Computers have become thousands of times faster and the algorithms have progressed enormously. Things that were once considered science fiction are rapidly becoming a reality. Techniques such as deep learning and convolutional neural networks have enabled accurate speech recognition, computer vision and self-driving cars. The entire field is currently progressing at a blistering pace.
AI systems can be used for a wide variety of tasks, many of which take the form of categorization: for example, an AI system is fed a digital image and the task is to identify what object is in the image. One real world example may be to output a car’s numberplate. A different kind of task is that of prediction, i.e. an AI system is fed a collection of data about things that have happened in the past and the task is to output the state of some future event. This could be a future share price based on previous share price movements or perhaps the outcome of an upcoming football match based on past results.
Sports prediction is a great challenge for AI systems. Modern AI systems require large amounts of training data, and in recent years the quantity, quality and ease of access to past performance data have become far greater than they used to be. This means that the time is ripe for well designed AI systems to produce predictions that are more accurate than ever before. Sprint Intelligence has plans to create prediction systems for a variety of sports, but we have started with horse racing for several reasons: 1) we have gained experience of processing horse related data through a previous horse genealogy project; 2) our founder has had an interest in horse racing going back many years; 3) horse racing has many characteristics that make it extra hard to predict, which makes it a great workout for our systems.
Horse race prediction is harder than prediction in other sports for many reasons:
• Many variations in conditions, e.g. different surface types, going and gradients
• Different pace profiles (e.g. run hard from start vs jog slowly then sprint at end)
• A mix of handicapped and non-handicapped past performance data
• Different jockeys
• Wide range of quantities of past data. E.g. some horses may only run half a dozen times in their whole career while others may run hundreds of times
• Erratic long breaks
• Frequent non-finishers (in jump races)
• Last but not least, significant quantities of cheating! For example, jockeys deliberately holding back fast horses so that their odds will be longer next time out.
One may wonder, with all these factors making horse race prediction so difficult, why even attempt it. The answer is that, with all these factors, the betting markets are liable to contain correspondingly large errors, so there is ample scope for punters to find value in the predictions.
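The idea of "value" can be made concrete with a little arithmetic. The sketch below is only an illustration: the function names and the example probability are hypothetical, decimal odds are assumed, and bookmaker margin and commission are ignored. A bet offers value when the predicted win probability is higher than the probability implied by the odds.

```python
def implied_probability(decimal_odds):
    """Win probability implied by decimal odds, ignoring the bookmaker's margin."""
    return 1.0 / decimal_odds


def expected_profit_per_unit(predicted_probability, decimal_odds):
    """Average profit per unit staked if the predicted probability is correct."""
    win_profit = decimal_odds - 1.0  # profit on a winning one-unit bet
    lose_loss = 1.0                  # the stake is lost otherwise
    return predicted_probability * win_profit - (1.0 - predicted_probability) * lose_loss


# Hypothetical example: the market offers 5.0 but the model predicts a 25% chance.
market_view = implied_probability(5.0)
edge = expected_profit_per_unit(0.25, 5.0)
is_value_bet = edge > 0.0
```

Here the market implies a 20% chance while the hypothetical model says 25%, so the expected profit per unit staked is positive and the bet would count as value.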
Most AI/neural network systems require “training”. You can think of an AI system as being a transformation of a set of inputs (i.e. a collection of source data) into a set of outputs, which would typically be some kind of “result” or “answer”. So for example, if the system were an image recognition system, then an example input could be a digital image of a cat and the output would be a label “CAT”. In the case of a sports prediction system, an example input might be a set of results for team A prior to some contest X and the output would be the predicted performance of A at contest X.
The training process consists of presenting the AI system with many instances of known inputs and letting the algorithm produce an output. Initially the output may be almost random, but as we already know what the correct output should be, we can make adjustments to the parameters of the AI system such that if the same input were to be given again then the output would be closer to the desired correct answer. This process is repeated over thousands (or more) of example input-output pairs and then, in theory, the system gradually adapts and so learns how to produce the correct results by itself. The success or failure of this process depends on many factors, but in general the more complex the required transform, the more training examples will be required in order to get a good result. Indeed some of the most spectacular recent advances in deep learning systems are the result of collecting simply colossal datasets. For example, the GPT-3 language model was trained on roughly 300 billion tokens (words and word fragments) of text!
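The training process described above can be sketched in a few lines. This is a deliberately tiny illustration, not our production code: it assumes a model with just two parameters (a hypothetical `w` and `b` in `output = w * x + b`) and uses plain gradient descent on made-up example pairs, whereas real systems have vastly more parameters.

```python
import random


def train_linear(examples, learning_rate=0.05, epochs=500, seed=0):
    """Fit output = w * x + b by nudging w and b towards the known answers."""
    rng = random.Random(seed)
    # Arbitrary starting parameters, so the first outputs are close to random.
    w = rng.uniform(-1.0, 1.0)
    b = rng.uniform(-1.0, 1.0)
    for _ in range(epochs):
        for x, target in examples:
            output = w * x + b
            error = output - target
            # Adjust each parameter so the same input would give a closer output.
            w -= learning_rate * error * x
            b -= learning_rate * error
    return w, b


# Made-up input-output pairs generated from the rule: target = 2 * x + 1.
examples = [(x / 10.0, 2.0 * (x / 10.0) + 1.0) for x in range(11)]
w, b = train_linear(examples)
```

After enough passes over the examples, `w` and `b` settle very close to 2 and 1, the rule that generated the data, even though the system was never told that rule directly.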
If the required complexity of the transform from input to output is too high then it may not be possible to obtain enough past data to enable sufficient training. This is precisely where the black art comes in. An experienced expert AI programmer can write code to transform the raw input data in ways that enable a simpler transform from input to output, thereby making an intractable problem tractable. Just as one trivial example: say that a network was required to predict the performance of some athlete in an upcoming event and amongst the many inputs we had two: “weight” and “strength”. Now let us say that a key factor in determining the performance is neither weight nor strength on their own but actually the strength to weight ratio. Left as it is, the AI system will have to learn to convert strength and weight into a ratio on top of all its other calculations. A good AI programmer may anticipate this and pre-process the data so that the weight and strength inputs are replaced by a single strength/weight ratio input. This will reduce the required complexity of the transform and will result in a system that can produce good results with fewer training examples.
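The strength to weight example might look like the following pre-processing step. It is a minimal sketch: the record layout, the field names and the function name are all hypothetical and chosen only for illustration.

```python
def replace_with_strength_to_weight(record):
    """Return a copy of the record with the two raw inputs replaced by their ratio."""
    # Keep every other input unchanged.
    features = {k: v for k, v in record.items() if k not in ("strength", "weight")}
    # One ratio input stands in for the two raw inputs.
    features["strength_to_weight"] = record["strength"] / record["weight"]
    return features


# Hypothetical athlete record before and after pre-processing.
athlete = {"age": 27, "strength": 180.0, "weight": 75.0}
processed = replace_with_strength_to_weight(athlete)
```

The network now receives the ratio directly and no longer has to spend training examples discovering that division is the relationship that matters.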
The algorithms we have developed for the horse racing system can easily be transferred to any other racing sport where you have multiple competitors in each event, for example greyhound racing and motor racing, and to a lesser extent to non-racing sports with multiple competitors in each event, like golf. Having said that, many of the principles and components of our system are directly applicable to sports where there are only two teams/players per event, like football or tennis.
Our systems include algorithms and principles such as:
• Use of XXXXX XXXXXX like the AvB system – perhaps the most crucial principle of all
• XXX XXXXXXXX as proof of good system – dramatically shortens the proofing process
• Sophisticated opinion blending
• Experience helps us avoid statistical traps like inadvertently training on test data and overfitting
• CPU-intensive iterative processes. Our systems are written in C, arguably the fastest programming language, allowing for processes that would otherwise be scarcely possible on large datasets in high-level languages like Python or when working with conventional spreadsheet or database software.
• Optimising shape of XXXXXXXX XXXXX.
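One of the statistical traps named in the list above, inadvertently training on test data, is commonly guarded against by splitting past results by date so that the system is only ever tested on events later than anything it was trained on. The sketch below illustrates that general practice rather than describing our own system; the record layout and the function name are hypothetical, and Python is used here purely for readability.

```python
from datetime import date


def chronological_split(races, cutoff):
    """Split race records so every test race is on or after the cutoff date."""
    train = [race for race in races if race["date"] < cutoff]
    test = [race for race in races if race["date"] >= cutoff]
    return train, test


# Made-up race records, used only to show the split.
races = [
    {"race_id": 1, "date": date(2019, 5, 4)},
    {"race_id": 2, "date": date(2020, 8, 15)},
    {"race_id": 3, "date": date(2021, 3, 20)},
    {"race_id": 4, "date": date(2021, 11, 6)},
]
train, test = chronological_split(races, cutoff=date(2021, 1, 1))
```

Because no test race predates the cutoff, a good score on the test set cannot be the result of the system having already seen those results during training.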