UK Horse Racing

 

Menu

Tools
  
Data Analysis
BF Accounts App   
CSV Documentation   
Dutching Calculator   
Arbitrage Calculator
Each-Way Calculator
Retirement Plan
Gruss Betfair Guide
Odds & Time Conv.
XML Reader
Group Calculator
 
Template One
Template Two
  
Systems
External Tools

UK Horse Racing Data Analyst Tool, version 2.12

Introduction

This updated version of the Data Analyst tool is for those who wish to explore the CSV Archive results with the aid of our tool.

This version of the tool, which is presented as an Excel spreadsheet, requires the user to be able to write simple VBA statements. This isn't as hard as it sounds as the structure of the tool has been set up and all the user has to do is to change some statements which works out the requirements.

Even if the user isn't au-fait with VBA it won't be beyond anyone with average computing abilities to use this tool. If anyone wishes to learn VBA then the internet is full of good sites explaining elementary Excel VBA.

Version 2.13

This version was released on the 14th April 2012.

There are countless updates in this version, as would be expected of something which has been worked on continually for the last few months.

One of the major inclusions has been the ability to now work on the Class and Weight Differentials. However, this will require the download of the new format (version 41, or higher) of the archived historical data to make it run.

The results files have been improved significantly and these will continue to do so.

Data Analyst Downloads

Download the Excel spreadsheet from here

Version 2.13
The required text steering files
version 41 of the Data Files (required)

Running The Data Analyst Tool

In the .zip file containing the two text files. One is DataFilePath.txt and the second is OutputPath.txt. These two files contain the locations of where the data files are read from and where the reports are posted. These two files need to be edited by the user.

The data files are the CSV Result files which we post each month and these should be unzipped and altogether within the same folder.

There are two check boxes. Clearing the 'Back' checkbox will make a laying report and clicking the 'Place Analysis' checkbox will have a second sweep through the run to look at the place backing.

The Report

When the analysis has finished, three files are created in the Output Folder, the ones definied in OutputPath.txt.

The largest file will be a large CSV file containing all the selections which fit the selection criteria. the smaller CSV file is a breakdown, by month, of the profit and loss per month; the idea of this is that users can see how smooth or volatile the system is and/or which monts are generally worth exploring.

The third file is the all important text file and this is what most people will want to look for in the report.

The following is an extract of a run looking at the top rated horse to value which has run within Great Britain..

Dates: 07/02/2008 to 31/03/2011



                                                 Runs      Wins      Profit      SR %       ROI %        LDA%   1.00% Stake

WIN:              All                           3,904       587      194.61     15.04        4.98       32.43        133.07

WIN:              All Value                     3,904       587      194.61     15.04        4.98       32.43        133.07
WIN:              All  Profitable JC              763       127       10.33     16.64        1.35       29.66         87.21
WIN:              All  Profitable TC              869       146      144.99     16.80       16.68       21.43        277.45
WIN:              All  Profitable TJ              783       122       17.03     15.58        2.17       35.09         87.13
WIN:              All  Profitable TJC             615       110       72.34     17.89       11.76       28.94        164.91

WIN:              GB                            3,904       587      194.61     15.04        4.98       32.43        133.07
WIN:              IRE                               0         0        0.00      0.00        0.00        0.00        100.00

WIN:              Handicaps                     2,800       352      112.14     12.57        4.00       38.89         80.04
WIN:              Stakes                        1,104       235       82.48     21.29        7.47       17.38        166.26

WIN:              AW                                0         0        0.00      0.00        0.00        0.00        100.00
WIN:              Flat                          2,068       295       23.69     14.26        1.15       46.37         57.40
WIN:              NH Flat                          49        10       15.13     20.41       30.87       10.47        113.36
WIN:              Hurdle                          909       138       39.70     15.18        4.37       27.50        101.49
WIN:              Chase                           878       144      116.10     16.40       13.22       23.00        201.52

WIN:              GB Handicap                   2,800       352      112.14     12.57        4.00       38.89         80.04
WIN:              GB Stakes                     1,104       235       82.48     21.29        7.47       17.38        166.26

WIN:              IRE Handicap                      0         0        0.00      0.00        0.00        0.00        100.00
WIN:              IRE Stakes                        0         0        0.00      0.00        0.00        0.00        100.00

WIN:              GB AW                             0         0        0.00      0.00        0.00        0.00        100.00
WIN:              GB Flat                       2,068       295       23.69     14.26        1.15       46.37         57.40
WIN:              GB NH Flat                       49        10       15.13     20.41       30.87       10.47        113.36
WIN:              GB Hurdle                       909       138       39.70     15.18        4.37       27.50        101.49
WIN:              GB Chase                        878       144      116.10     16.40       13.22       23.00        201.52

WIN:              IRE AW                            0         0        0.00      0.00        0.00        0.00        100.00
WIN:              IRE Flat                          0         0        0.00      0.00        0.00        0.00        100.00
WIN:              IRE NH Flat                       0         0        0.00      0.00        0.00        0.00        100.00
WIN:              IRE Hurdle                        0         0        0.00      0.00        0.00        0.00        100.00
WIN:              IRE Chase                         0         0        0.00      0.00        0.00        0.00        100.00

The left hand column will show something like WIN or PLACE which helps to identify what sort of bet it is. Also within the report there will be rows for, for example, Hurdle Win so that one can look into different race types. The second column is rather straight-forward; All means all bets, All Value is everything to value and, overall, these are self explanatory.

The next five columns are the regular Runs, Wins, Profit, Strike Rate and Return On Investment. Note that in this report all winnings include a 5% Betfair tax. All results are taken from the Betfair SP price, where applicable, and anything which don't have a Betfair SP are assumed to be 20% higher than industry SP.

The next two columns are soemthing which I have designed which helps to identify systems which are smoother than others. Imagine you have a £100 bank and each bet you place 1% of the value of the bank so the first stake will be £1. The last column will tell you what the final state of the bank will be at (again after 5% Betfair tax) at the end of the period.

There is one issue here; due to compounding issues some of the banks would exceed the US National Debt so this model caps all stakes at £100. This is because it is assumed that this amount can be staked on any race. The figure in the column before this, LDA%, tells us what is the largest percentage drop of the account at any point in the history of this system.

So, in this example the largest profit drop was 32.5%. This is to give an idea on how volatile a system is.

Report Updates

The report is being constantly updated all the time to add new features. For example if a run was engineered so that there were to be only Chases to be examined then only Chases would appear in the report to save time and space.

Since version 2.12 two more columns have been added. These are the largest Draw-Down to level stakes.

The Programming

Introduction

This version of the Data Analyst tool requires the user to get his hands slightly dirty and to write or update some VBA.

The code has been designed so that this isn't too traumatic so an elementary knowledge of VBA is required. If one can operate a spreadsheet then this won't be beyond the user at all.

Opening the IDE

Start the spreadsheet and then open the Integrated Development Environment (the code editing area known as the IDE) by pressing alt-F11.

Looking Around

in the top left hand corner of the IDE the object tree is visible and it shows all the classes in the spreadsheet.

In the Classes section all the classes which begin with the letters 'cls' are all analysis systems. The user will be encouraged to write his own or to edit any of the existing classes. It is expected that this list of classes will extend over time.

But which class is run? Well if the Modules sub-tree is expanded and then the modLink module is opened and then the following code can be seen.

Public Sub CreateClassLink()

  'Set oReport = New clsProfitableCourseTrainer
  'Set oReport = New clsProfitableCourseJockey
  'Set oReport = New clsTop1RatedTJCProfitable
  'Set oReport = New clsTop3AllTJC
  'Set oReport = New clsTJ_10_10
  'Set oReport = New clsUKHR1
  'Set oReport = New clsRAdjValue
  'Set oReport = New clsStephen_1Backs
  'Set oReport = New clsTopRatedValue
  'Set oReport = New clsTop2RatedValue
  'Set oReport = New clsTrForm120
  'Set oReport = New clsRegressionGoing
  'Set oReport = New clsRegressionDistance
  'Set oReport = New clsUnderscoreCls
  'Set oReport = New clsUnderscoreJockey
  Set oReport = New clsUnderscoreRaw
  'Set oReport = New clsUnderscoreSpd
  'Set oReport = New clsUnderscoreTrForm
  'Set oReport = New clsAWStallBias20
  'Set oReport = New clsStallBias
  'Set oReport = New clsTJC
  'Set oReport = New clsRAdjFlatGoing
  'Set oReport = New clsTop3FlatRAdjStakes
  'Set oReport = New clsTop5FlatRAdjTrainer
  'Set oReport = New clsTopRatedSystemHCP
  'Set oReport = New clsTopRatedLstFlat
  'Set oReport = New clsTopRatedFrmAW
  'Set oReport = New clsAlarm
  'Set oReport = New clsUnderscoreConn
  'Set oReport = New clsTrainer4Weeks
  'Set oReport = New clsProfitableTJ
  'Set oReport = New clsTrainer4Weeks
  'Set oReport = New clsTopRated
  
End Sub

Everything in green is commented out so we can see that there's one line of code in this routine.

Which is as it ought to be because there is only supposed to be one line here - that which links the relevant test class into the tool.

So, in this example we can see that the last run called up the class which clearly did some work on the Raw Underscores.

If the user creates a new class then it has to be linked in here else it won't run.

The Classes

As mentioned earlier all of the classes starting with 'cls' are the various system tests. These are what the users should be creating and editing.

There are six other classes which are system classes and are used for the processing; have a look, by all means, in there but please don't edit anything. These six start with the letters 'zzz' just so that they are placed at the end of the list. Two pairs of these are concerned with the linked lists of the Courses and Goings and the last two are for the main reporting and calculations.

If one of the classes is opened and the routine RaceAnalysis() is located then there's a section in here which is between the following two comments. This is an example of one such test routine:

      ' *********  Place the selection criteria after here
      
      nRawRanking = Val(asEntry(GetColumnNumber("DK") - 1))
      nRAdjRanking = Val(asEntry(GetColumnNumber("DL") - 1))
      nHCPRanking = Val(asEntry(GetColumnNumber("DV") - 1))
      nRanking = Val(asEntry(GetColumnNumber("AZ") - 1))
      nTJCProfit = Val(asEntry(GetColumnNumber("BM") - 1))
      sHandicap = Trim$("" & asEntry(GetColumnNumber("BB") - 1))
      sCountry = Trim$("" & asEntry(GetColumnNumber("EI") - 1))
      sRaceType = Trim$("" & asEntry(GetColumnNumber("C") - 1))
      nFurlongs = Val(asEntry(GetColumnNumber("G") - 1))
      nRunners = Val(asEntry(GetColumnNumber("BV") - 1))
      sMeeting = Trim$("" & asEntry(GetColumnNumber("A") - 1))
      sDate = Trim$("" & asEntry(GetColumnNumber("BG") - 1))
      nYear = Year(sDate)
      sStallPercentage = ("" & asEntry(GetColumnNumber("Q") - 1))
      If Len(sStallPercentage) > 0 Then
        sStallPercentage = Replace$(sStallPercentage, "%", "")
      End If
      nStallPercentage = Val(sStallPercentage)
      sGoing = Trim$("" & asEntry(GetColumnNumber("F") - 1))
      
      

      nPrice = Val(asEntry(GetColumnNumber("EO") - 1))
      If nPrice = 0 Then
        nPrice = Val(asEntry(GetColumnNumber("ER") - 1)) * 1.2
      End If
      
      If IsInPriceRange(asEntry(), 2, 999) Then
        If nRanking <= 5 Then 
          If sCountry = "GB" Then
            If nRAdjRanking <= 3 And nHCPRanking = 1 Then
              If PoundsFromTopRated(asEntry(), asRaces(), nIndexStart, nIndexEnd) <= 12 Then
                oNodeHandler.AddResult bIsCreateCSVReport, asEntry()
              End If
            End If
          End If
        End If
      End If
  
      ' *********  Place the selection criteria before here

The first section is concerned with grabbing all sorts of data from the CSV files and shoving them into various variables (please, please make sure that good variable naming is used - if you're going to use a variable called nTJCProfit then please ensure that it contains the profit/loss for the Trainer/Jockey/Course combination and nothing else).

The next section is a pile of If statements. If the horse passes the requirements then the oNodeHandler.AddResult() routine is called.

So in this above example we're looking for all horses which pass the following requirements:

  • The price of the horse is between 2 and 999. In other words, no horses which go off odds on.
  • The horse has to be in the top five rated of the ratings.
  • We're not interested in Irish racing for this analysis as we're asking for GB only.
  • The horse has to be in the top three RAdj Rankings and also the HCP Underscore.
  • Lastly, the horse has to be within twelve pounds of the top rated horse.

Now that you can see the various tests looking above at the first section one can see how the data gets into the variables. Note that how the Horse Ranking (nRanking) is derived from column AZ from the CSV files. The distance of the race is taken from column G and so on.

And that is basically it. It's that simple.

oNodeHandler Class

At the core of all of these If statements is this routine within the oNodeHandler class. What happens here is if a horse passes all of the requrements then the detaila are passed to this .AddResult() function and then the horse's details are then added to a linked list.

Then when everything has been added to the linked list this is then processed hundreds of times to make the report. The user need not worry about this; the code looks after this but as long as the user knows all this happens after the .AddResult() function is called then that's all that's needed.

Consider it, if you will, as a plug hole where data goes to be processed.

Useful Functions

There are a few useful functions written and placed within the module ModFunctions which can be called.

We have seen a couple used in the above example but the functions which can be used to help the data analysis include:

IsValue()

This returns True if the price of the run was value or False if it doesn't.

It is called as follows:
If IsValue(asEntry(), 1.0) then
The only bit which can be changed is the number 1.0; the rest of the call mustn't be changed. This is the modifier to the value; so if this is 1.0 then we're looking at the exact value price; if it's 1.3 then we're looking at 30% above value.

IsInPriceRange()

This returns True if the price of the run was between the two numberic values or False if it wasn't.

It is called as follows:
If IsValue(asEntry(), 2.0, 9.0) then
The only bit which can be changed are the numbers; the rest of the call mustn't be changed. The first number is the lowest allowed price and the second is the highest price permitted.

PoundsFromTopRated()

This returns the number of pounds which this horse was rated behind the top rated horse in the race. If this is the top rated horse then zero is returned.

It is called as follows:
If PoundsFromTopRated(asEntry(), asRaces(), nIndexStart, nIndexEnd) <= 12 Then
Nothing in the actual list of parameters may be altered though the value 12 at the end may be replaced with any other value.

IsTopJumpRated()

This function returns True or False. It returns True if this horse is the Top Rated Jump horse in the race (Chases only).

It is called as follows:
If IsTopJumpRated(asEntry(), asRaces(), nIndexStart, nIndexEnd) Then
Nothing in the actual list of parameters may be altered..

CountJumpRatedHorses()

This function returns the number of horses which have a certain jump rating (the last parameter). This applies to Chases only.

It is called as follows:
If CountJumpRatedHorses(asEntry(), asRaces(), nIndexStart, nIndexEnd, nChaseJumpAbility) = 1 Then
Only the last argument may be changed.

GetRaceGender()

This function returns the gender of horses competing in this race.

It is called as follows:
sGender = GetRaceGender(asRaces(), nRaceIndexStart, nRaceIndexEnd)
Nothing may be changed here. The result from the function call is one of three possible string values: "Male", "Female" or "Mixed".

CountRaceAlarms()

This function returns the number of any given alarm in the race. For example you may wish to search the race and cound the number of "p" alarms.

It is called as follows:
CountRaceAlarms(asRaces(), nIndexStart, nIndexEnd, "p")
Only the last parameter may be changed, for example it may be "+", "e" or any of the other alarms used. This may be useful, for example, if one wishes to look for races with, say, only one "+" alarm in.

IsInRangeAll()

This function returns True or False depending on whether all of the Columns (which should be numeric!) fall between the maximum and minimum values.

It is called as follows:
If IsInRangeAll(asEntry() As String, "DK,DL,DM,DN,DO,DP,DQ,DR,DS,DT,DU,DV", 1, 3) Then
Only the last three parameters may be changed. In this example above we are looking to see if the values of all of the fields DK to DV fall within the range of 1 to 3. These are the Underscore Ranking Fields, by the way. Of course, one needn't have any many fields as this but they all have to be comma seperated and be numeric fields.
This function will be available in version 13.

InRangeCount()

This function returns the number of Columns (which should be numeric!) fall between the maximum and minimum values.

It is called as follows:
If InRangeCount(asEntry() As String, "DK,DL,DM,DN,DO,DP,DQ,DR,DS,DT,DU,DV", 1, 3) > 3 Then
Only the last three parameters may be changed. In this example above we are looking to see if more of than three of the values of all of the fields DK to DV fall within the range of 1 to 3.
This function will be available in version 13.