UK Horse Racing's Data Analyst Tool

The Programming


This version of the Data Analyst tool requires the user to get his hands slightly dirty and to write or update some VBA.

The code has been designed so that this isn't too traumatic so an elementary knowledge of VBA is required. If one can operate a spreadsheet then this won't be beyond the user at all.

Opening the IDE

Start the spreadsheet and then open the Integrated Development Environment (the code editing area known as the IDE) by pressing alt-F11.

Looking Around

in the top left hand corner of the IDE the object tree is visible and it shows all the classes in the spreadsheet.

In the Classes section all the classes which begin with the letters 'cls' are all analysis systems. The user will be encouraged to write his own or to edit any of the existing classes. It is expected that this list of classes will extend over time.

But which class is run? Well if the Modules sub-tree is expanded and then the modLink module is opened and then the following code can be seen.

Public Sub CreateClassLink()

  'Set oReport = New clsProfitableCourseTrainer
  'Set oReport = New clsProfitableCourseJockey
  'Set oReport = New clsTop1RatedTJCProfitable
  'Set oReport = New clsTop3AllTJC
  'Set oReport = New clsTJ_10_10
  'Set oReport = New clsUKHR1
  'Set oReport = New clsRAdjValue
  'Set oReport = New clsStephen_1Backs
  'Set oReport = New clsTopRatedValue
  'Set oReport = New clsTop2RatedValue
  'Set oReport = New clsTrForm120
  'Set oReport = New clsRegressionGoing
  'Set oReport = New clsRegressionDistance
  'Set oReport = New clsUnderscoreCls
  'Set oReport = New clsUnderscoreJockey
  Set oReport = New clsUnderscoreRaw
  'Set oReport = New clsUnderscoreSpd
  'Set oReport = New clsUnderscoreTrForm
  'Set oReport = New clsAWStallBias20
  'Set oReport = New clsStallBias
  'Set oReport = New clsTJC
  'Set oReport = New clsRAdjFlatGoing
  'Set oReport = New clsTop3FlatRAdjStakes
  'Set oReport = New clsTop5FlatRAdjTrainer
  'Set oReport = New clsTopRatedSystemHCP
  'Set oReport = New clsTopRatedLstFlat
  'Set oReport = New clsTopRatedFrmAW
  'Set oReport = New clsAlarm
  'Set oReport = New clsUnderscoreConn
  'Set oReport = New clsTrainer4Weeks
  'Set oReport = New clsProfitableTJ
  'Set oReport = New clsTrainer4Weeks
  'Set oReport = New clsTopRated
End Sub

Everything in green is commented out so we can see that there's one line of code in this routine.

Which is as it ought to be because there is only supposed to be one line here - that which links the relevant test class into the tool.

So, in this example we can see that the last run called up the class which clearly did some work on the Raw Underscores.

If the user creates a new class then it has to be linked in here else it won't run.

The Classes

As mentioned earlier all of the classes starting with 'cls' are the various system tests. These are what the users should be creating and editing.

There are six other classes which are system classes and are used for the processing; have a look, by all means, in there but please don't edit anything. These six start with the letters 'zzz' just so that they are placed at the end of the list. Two pairs of these are concerned with the linked lists of the Courses and Goings and the last two are for the main reporting and calculations.

If one of the classes is opened and the routine RaceAnalysis() is located then there's a section in here which is between the following two comments. This is an example of one such test routine:

      ' *********  Place the selection criteria after here
      nRawRanking = Val(asEntry(GetColumnNumber("DK") - 1))
      nRAdjRanking = Val(asEntry(GetColumnNumber("DL") - 1))
      nHCPRanking = Val(asEntry(GetColumnNumber("DV") - 1))
      nRanking = Val(asEntry(GetColumnNumber("AZ") - 1))
      nTJCProfit = Val(asEntry(GetColumnNumber("BM") - 1))
      sHandicap = Trim$("" & asEntry(GetColumnNumber("BB") - 1))
      sCountry = Trim$("" & asEntry(GetColumnNumber("EI") - 1))
      sRaceType = Trim$("" & asEntry(GetColumnNumber("C") - 1))
      nFurlongs = Val(asEntry(GetColumnNumber("G") - 1))
      nRunners = Val(asEntry(GetColumnNumber("BV") - 1))
      sMeeting = Trim$("" & asEntry(GetColumnNumber("A") - 1))
      sDate = Trim$("" & asEntry(GetColumnNumber("BG") - 1))
      nYear = Year(sDate)
      sStallPercentage = ("" & asEntry(GetColumnNumber("Q") - 1))
      If Len(sStallPercentage) > 0 Then
        sStallPercentage = Replace$(sStallPercentage, "%", "")
      End If
      nStallPercentage = Val(sStallPercentage)
      sGoing = Trim$("" & asEntry(GetColumnNumber("F") - 1))

      nPrice = Val(asEntry(GetColumnNumber("EO") - 1))
      If nPrice = 0 Then
        nPrice = Val(asEntry(GetColumnNumber("ER") - 1)) * 1.2
      End If
      If IsInPriceRange(asEntry(), 2, 999) Then
        If nRanking <= 5 Then 
          If sCountry = "GB" Then
            If nRAdjRanking <= 3 And nHCPRanking = 1 Then
              If PoundsFromTopRated(asEntry(), asRaces(), nIndexStart, nIndexEnd) <= 12 Then
                oNodeHandler.AddResult bIsCreateCSVReport, asEntry()
              End If
            End If
          End If
        End If
      End If
      ' *********  Place the selection criteria before here

The first section is concerned with grabbing all sorts of data from the CSV files and shoving them into various variables (please, please make sure that good variable naming is used - if you're going to use a variable called nTJCProfit then please ensure that it contains the profit/loss for the Trainer/Jockey/Course combination and nothing else).

The next section is a pile of If statements. If the horse passes the requirements then the oNodeHandler.AddResult() routine is called.

So in this above example we're looking for all horses which pass the following requirements:

  • The price of the horse is between 2 and 999. In other words, no horses which go off odds on.
  • The horse has to be in the top five rated of the ratings.
  • We're not interested in Irish racing for this analysis as we're asking for GB only.
  • The horse has to be in the top three RAdj Rankings and also the HCP Underscore.
  • Lastly, the horse has to be within twelve pounds of the top rated horse.

Now that you can see the various tests looking above at the first section one can see how the data gets into the variables. Note that how the Horse Ranking (nRanking) is derived from column AZ from the CSV files. The distance of the race is taken from column G and so on.

And that is basically it. It's that simple.

oNodeHandler Class

At the core of all of these If statements is this routine within the oNodeHandler class. What happens here is if a horse passes all of the requrements then the detaila are passed to this .AddResult() function and then the horse's details are then added to a linked list.

Then when everything has been added to the linked list this is then processed hundreds of times to make the report. The user need not worry about this; the code looks after this but as long as the user knows all this happens after the .AddResult() function is called then that's all that's needed.

Consider it, if you will, as a plug hole where data goes to be processed.