Introduction to Iterative Development

Whether it is a software application, web development, data analytics, or data science project, each can benefit from an iterative development methodology. This tutorial discusses in depth an approach to programmatically retrieving and extracting lottery game data. It is not an introduction to computer programming, but the intention is that people without any programming experience should still be able to follow along.

A case study of Atlantic Lottery Corporation provides context. The corporation, headquartered in Moncton, New Brunswick, Canada, operates a variety of lottery games, with revenue split amongst prize winners, the provincial governments of Atlantic Canada, and the corporation itself.

There is no application programming interface (API) nor a convenient way to download this data in bulk for subsequent analysis. By iterating multiple times we transform a minimum viable solution into a fully automated system suitable for running as a scheduled process. A software development methodology called iterative development is appropriate for this project. In total, four variations on the minimum viable solution produced a highly polished professional implementation ready to be handed off to the quality assurance team before deployment in the production environment.

By the end of the tutorial a fully automated data collection programme, capable of retrieving historical and future Lotto 649 outcomes (date, jackpot amount, and jackpot winning numbers including bonus number), will have been created. This data can be used to perform various types of analysis:

  • frequency distribution of individual winning numbers including bonus number
  • jackpot amount trends over time

This case study presents a fictionalised scenario about a real lottery organisation. Any resemblance to an actual project, the management structure of the organisation, or legislative regulatory compliance is purely coincidental.

Scenario

The internal gaming compliance and audit team has been tasked with producing a quarterly report showing trends associated with the Lotto 649 jackpot in particular. This report must be generated on an on-going basis each quarter. Senior management at Atlantic Lottery Corporation, in response to public complaints questioning the distribution of jackpot winners seemingly favouring the central and western regions of the country, are concerned ticket sales might decline in the eastern region of the country. Ticket sales primarily occur at retail outlets partnering with Atlantic Lottery Corporation but online sales make up an increasing percentage of overall sales further impacting retail outlets. Secondarily, a trend analysis of the jackpot winning number distribution has been requested. Data about each Lotto 649 draw, historic and into the future, must be collected.

Complicating the situation from the corporate information technology perspective, the lottery data systems are not accessible to the internal compliance and audit team. The team is required to maintain an arm’s length relationship with the other parts of the corporation. Therefore, lottery game data can only be retrieved via the public facing website.

For Lotto 649, twice each week on Wednesday and Saturday, a new set of winning numbers is randomly selected for prizes of various amounts including the jackpot. If the jackpot is not won, it increases for the next draw date.

Historical data about each lottery is not readily available in a form suitable for analysis; there is no application programming interface (API) nor an option to easily download this data in bulk. The lottery data is embedded as JSON within a dynamically-generated Hypertext Markup Language (HTML) webpage.

Senior management expects the data retrieval utility to be production-ready in 5 business days. Ideally a business analyst and a computer programmer, or a systems analyst, should be brought onto this project. You interviewed last week and will start next Monday. During your interview this project was discussed but you still have lots of questions. The supervisor to whom you will report is confident you can deliver on time and on budget.

Upon arriving at the Atlantic Lottery Corporation office in Halifax, Nova Scotia, your supervisor asks you to retrieve the 2018 Lotto 649 winning numbers as a warm-up. Two days have been allocated but first there is a mandatory presentation by Human Resources which you must attend. Your supervisor tells you to call the Help Desk for guidance on logging into your workstation before attending the computer-based training session created by Human Resources.

Good luck!

Step-by-Step Walk-through

Approach

While it is tempting to read the scenario quickly and jump into writing code, take a few minutes to consider how to approach the task. Specifically, the task involves historical data collection for the Lotto 649 lottery. At this stage a high-level understanding of both the problem and the requirements is sufficient.

Problem Statement

A quarterly report showing trends associated with the Lotto 649 jackpot has been requested to investigate whether winners are disproportionately in certain regions of the country. As a precursor a preliminary collection and extraction strategy for data retrieval must be formulated to facilitate constructing a suitable dataset.

Requirements

  • implement a process to retrieve historic lottery data
  • construct a dataset from selected data attributes retrieved from lottery data
  • keep the dataset up to date

Algorithm

With the problem statement and business & operational requirements understood, it is time to translate these requirements into a conceptual model (algorithm).

This problem-solving approach is commonly referred to as algorithmic thinking or computational thinking; an essential skill regardless of the context in which you might find yourself solving problems.

Initial

This algorithmic representation of the minimum viable version (alc-0) reflects the process derived from thinking about how to manually perform the task.

  1. Choose draw date
  2. If data for draw date has already been retrieved, display message and terminate
  3. Fetch lottery results for specific draw date
  4. Download webpage containing lottery results
  5. Extract embedded JSON data structure from webpage
  6. Extract jackpot winning numbers, bonus number, and amount
  7. Display jackpot winning numbers, bonus number, and amount
  8. Add jackpot winning numbers, bonus number, and amount to dataset

By iterating multiple times we can expand the level of detail for each step of the algorithm, eventually arriving at a point at which the algorithm can be implemented in a programming language. As an exercise refer back to the initial algorithm and try expanding the level of detail for each version (alc-0 through alc-2) of the programme.

Final

The fully automated version (alc-4) is described by the following first-cut algorithm.

  1. Validate lottery game designator, starting year, and ending year
  2. Read current record count from dataset
  3. For each draw date in the year(s) and each month within the year
    1. For each day (Wednesday, Saturday) of month
      1. If computed date is not less than current date, skip
      2. If results for draw date have already been retrieved, skip
      3. Fetch lottery results for specific draw date
      4. Download webpage containing lottery results
      5. Confirm webpage was saved and increment error count if missing
      6. Extract embedded JSON data structure from webpage file
      7. Extract jackpot winning numbers, bonus number, and amount from data file
      8. Save jackpot winning numbers, bonus number, and amount to dataset
  4. Read current record count from dataset
  5. Display status message (success or errors encountered)
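
The date loop in step 3 can be sketched in shell as follows. This is an illustration under stated assumptions rather than the alc-4 code: the function name list_draw_dates and its arguments are hypothetical, and the -d option assumes GNU date.

```shell
#!/usr/bin/env sh
# Sketch only, not the alc-4 implementation. Assumes GNU date (-d option).
# list_draw_dates START_YEAR END_YEAR [CUTOFF]
# Prints every Wednesday and Saturday from 1 January of START_YEAR to
# 31 December of END_YEAR that falls before CUTOFF (default: today).
list_draw_dates() {
  start_year=$1
  end_year=$2
  cutoff=${3:-$(date +%F)}

  # Compare dates as integers of the form YYYYMMDD.
  cutoff_num=$(printf '%s' "${cutoff}" | tr -d '-')
  last_num="${end_year}1231"

  day="${start_year}-01-01"
  while :
  do
    day_num=$(printf '%s' "${day}" | tr -d '-')
    [ "${day_num}" -lt "${cutoff_num}" ] || break   # draw has not happened yet
    [ "${day_num}" -le "${last_num}" ] || break     # past the ending year

    # %u prints the day of the week: 1 = Monday ... 7 = Sunday.
    case "$(TZ=UTC date -d "${day}" +%u)" in
      3|6) printf '%s\n' "${day}" ;;
    esac

    day=$(TZ=UTC date -d "${day} + 1 day" +%F)
  done
}

list_draw_dates 2018 2018 2018-01-15
```

With a cut-off of 2018-01-15 the call prints 2018-01-03, 2018-01-06, 2018-01-10, and 2018-01-13, one per line. Each date could then be checked against the dataset and passed to the retrieval step.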

To this end, successive iterations of the algorithm (initial to final) demonstrate varying degrees of automated web scraping to facilitate data analysis projects. The following subsections break out a subset of the major steps (3, 5, 6) from the initial algorithm.

Fetch Lottery Draw Results

$ BROWSER="firefox"
$ PROTOCOL="https"
$ DOMAIN="www.alc.ca"
$ ASSET="/content/alc/en/winning-numbers.html"
$ QUERY="date=2018-10-03&game=Lotto649"
$ ${BROWSER} ${PROTOCOL}://${DOMAIN}${ASSET}#${QUERY}

Retrieval of lottery game data via the corporation’s Winning Numbers page always produces the game data regardless of the on-page displayed lottery results for a given draw date. Upon examining the saved HTML file a peculiar comment provides a clue about the discrepancy between the on-screen “view source” and “save file” versions of the game data.

// Protect server-side variables with JS array to prevent blank value from producing JS syntax error

Digging deeper into the source code of the web page uncovers the secondary URI which can display the lottery results for a specific game on a specific draw date and from which relevant data can be extracted after saving the webpage to disk.

$ BROWSER="wget -O ./data/page.html"
$ PROTOCOL="https"
$ DOMAIN="www.alc.ca"
$ ASSET="/content/alc/en/our-games/lotto/lotto-6-49.html"
$ QUERY="?date=2018-10-03"
$ ${BROWSER} ${PROTOCOL}://${DOMAIN}${ASSET}${QUERY}

The draw date window is limited to only a few years, not the complete lottery timeline. Manually verify, using a GUI web browser and the first URL, the earliest lottery draw date currently available for a specific lottery game, Lotto 649 for instance.
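
Because wget -O creates the output file even when the download fails, checking only that the file exists can miss errors. A minimal sketch of a stricter check follows; fetch_page and the FETCH override are hypothetical names introduced for illustration and are not part of the tutorial's scripts.

```shell
#!/usr/bin/env sh
# Sketch only. fetch_page URL FILE
# Downloads URL to FILE and succeeds only if the download command exits with
# status 0 and FILE is not empty. FETCH is a hypothetical override that lets
# another command (for example a test stub) stand in for wget.
fetch_page() {
  url=$1
  file=$2
  ${FETCH:-wget -q -O} "${file}" "${url}" || return 1
  [ -s "${file}" ]
}

# Example (requires network access, so it is left commented out):
# fetch_page \
#   "https://www.alc.ca/content/alc/en/our-games/lotto/lotto-6-49.html?date=2018-10-03" \
#   ./data/page.html || echo "Download failed" >&2
```

Testing the exit status and the file size separately distinguishes a failed request from a request that succeeded but returned nothing.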

Within the saved webpage data about each Lotto 649 game is represented by a JavaScript Object Notation (JSON) data structure of the form:

  (function($){
  var componentRoot = document.getElementById("game-details");
  new ALC.components.GameDetailComponent({
  rootEl: componentRoot,
  gameId: "",
  gameData: [],
  drawDatesData: [],
  legendText: "",
  });
  })(jQuery);

Save the webpage to a file named page.html. The elements labeled gameId and gameData contain the data of interest.

Extract Embedded JSON Data Structure

Save the code, except the first and last lines, to a file named extract_json.awk.

awk '
  /ALC.components.GameDetailComponent/ { flag = 1; next; }
  /jQuery/ { flag = 0; } flag
' ./data/page.html \
| awk '
    BEGIN { print("["); }
    /gameId/ {
      sub("gameId", "\"gameId\"", $0);
      printf("{%s", $0);
    }
    /gameData/ {
      sub("gameData", "\"gameData\"", $0);
      sub(/,$/, "},", $0);
      printf("%s\n", $0);
    }
    END {
      printf("{}\n]");
    }
' > ./data/page.json
$ cd code
$ awk -f extract_json.awk > ../data/page.json
$ cd ..
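
To see what the two awk stages do without downloading anything, the pipeline can be run against a small hand-made stand-in for the saved page. The sample below is invented for illustration and keeps only the fields the pipeline looks for; extract_json is a hypothetical wrapper name.

```shell
#!/usr/bin/env sh
# Sketch only: the same two-stage awk pipeline wrapped in a function that
# takes the saved page as its argument. extract_json is a hypothetical name.
extract_json() {
  # Stage 1: keep the lines between the constructor call and the jQuery line.
  awk '
    /ALC.components.GameDetailComponent/ { flag = 1; next; }
    /jQuery/ { flag = 0; } flag
  ' "$1" \
  | awk '
      # Stage 2: quote the two keys of interest and wrap them in an array.
      BEGIN { print("["); }
      /gameId/ {
        sub("gameId", "\"gameId\"", $0);
        printf("{%s", $0);
      }
      /gameData/ {
        sub("gameData", "\"gameData\"", $0);
        sub(/,$/, "},", $0);
        printf("%s\n", $0);
      }
      END { printf("{}\n]"); }
  '
}

# Invented, heavily trimmed sample of the script element in the saved page.
sample=$(mktemp)
cat > "${sample}" <<'EOF'
(function($){
var componentRoot = document.getElementById("game-details");
new ALC.components.GameDetailComponent({
rootEl: componentRoot,
gameId: "Lotto649",
gameData: [{"draw":{"bonus_number":"02"}}],
drawDatesData: [],
legendText: "",
});
})(jQuery);
EOF

result=$(extract_json "${sample}")
printf '%s\n' "${result}"
rm -f "${sample}"
```

The output is a JSON array holding the game object followed by an empty object. The trailing {} is there so that the comma left after the game object does not make the array invalid.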

The extracted JSON data structure serves as an aid while implementing the programmes later in this tutorial.

[{
  "gameId": "Lotto649",
  "gameData": [
    {
      "draw": {
        "providerdrawId": "42001003621",
        "bonus_number": "02",
        "prize_payouts": [
          {
            "atlantic_breakdowns": null,
            "guaranteed_prize_english": null,
            "guaranteed_prize_french": null,
            "guaranteed_prize_type": null,
            "number_of_prizes": 1,
            "prize_value": 40000000,
            "region_breakdowns": [
              {
                "region": "Quebec",
                "number_of_prizes": 1
              }
            ],
            "type": "Lotto649_6of6"
          },
          {
            "atlantic_breakdowns": null,
            "guaranteed_prize_english": null,
            "guaranteed_prize_french": null,
            "guaranteed_prize_type": null,
            "number_of_prizes": 4,
            "prize_value": 105957.1,
            "region_breakdowns": [
              {
                "region": "West",
                "number_of_prizes": 2
              },
              {
                "region": "Ontario",
                "number_of_prizes": 1
              },
              {
                "region": "Quebec",
                "number_of_prizes": 1
              }
            ],
            "type": "Lotto649_5of6Bonus"
          },
          {
            "atlantic_breakdowns": null,
            "guaranteed_prize_english": null,
            "guaranteed_prize_french": null,
            "guaranteed_prize_type": null,
            "number_of_prizes": 111,
            "prize_value": 3181.9,
            "region_breakdowns": [
              {
                "region": "BritishColumbia",
                "number_of_prizes": 11
              },
              {
                "region": "West",
                "number_of_prizes": 28
              },
              {
                "region": "Ontario",
                "number_of_prizes": 39
              },
              {
                "region": "Quebec",
                "number_of_prizes": 26
              },
              {
                "region": "Atlantic",
                "number_of_prizes": 7
              }
            ],
            "type": "Lotto649_5of6"
          },
          {
            "atlantic_breakdowns": null,
            "guaranteed_prize_english": null,
            "guaranteed_prize_french": null,
            "guaranteed_prize_type": null,
            "number_of_prizes": 7204,
            "prize_value": 93.2,
            "region_breakdowns": [
              {
                "region": "BritishColumbia",
                "number_of_prizes": 847
              },
              {
                "region": "West",
                "number_of_prizes": 1363
              },
              {
                "region": "Ontario",
                "number_of_prizes": 3281
              },
              {
                "region": "Quebec",
                "number_of_prizes": 1324
              },
              {
                "region": "Atlantic",
                "number_of_prizes": 389
              }
            ],
            "type": "Lotto649_4of6"
          },
          {
            "atlantic_breakdowns": null,
            "guaranteed_prize_english": null,
            "guaranteed_prize_french": null,
            "guaranteed_prize_type": null,
            "number_of_prizes": 133858,
            "prize_value": 10,
            "region_breakdowns": [
              {
                "region": "BritishColumbia",
                "number_of_prizes": 16093
              },
              {
                "region": "West",
                "number_of_prizes": 25177
              },
              {
                "region": "Ontario",
                "number_of_prizes": 60314
              },
              {
                "region": "Quebec",
                "number_of_prizes": 25280
              },
              {
                "region": "Atlantic",
                "number_of_prizes": 6994
              }
            ],
            "type": "Lotto649_3of6"
          },
          {
            "atlantic_breakdowns": null,
            "guaranteed_prize_english": null,
            "guaranteed_prize_french": null,
            "guaranteed_prize_type": null,
            "number_of_prizes": 96364,
            "prize_value": 5,
            "region_breakdowns": [
              {
                "region": "BritishColumbia",
                "number_of_prizes": 11540
              },
              {
                "region": "West",
                "number_of_prizes": 17839
              },
              {
                "region": "Ontario",
                "number_of_prizes": 43390
              },
              {
                "region": "Quebec",
                "number_of_prizes": 18657
              },
              {
                "region": "Atlantic",
                "number_of_prizes": 4938
              }
            ],
            "type": "Lotto649_2of6Bonus"
          },
          {
            "atlantic_breakdowns": null,
            "guaranteed_prize_english": null,
            "guaranteed_prize_french": null,
            "guaranteed_prize_type": null,
            "number_of_prizes": 937458,
            "prize_value": 3,
            "region_breakdowns": [
              {
                "region": "BritishColumbia",
                "number_of_prizes": 113428
              },
              {
                "region": "West",
                "number_of_prizes": 174438
              },
              {
                "region": "Ontario",
                "number_of_prizes": 424142
              },
              {
                "region": "Quebec",
                "number_of_prizes": 177412
              },
              {
                "region": "Atlantic",
                "number_of_prizes": 48038
              }
            ],
            "type": "Lotto649_2of6"
          }
        ],
        "tag": "430291",
        "tag_prize_payouts": [
          {
            "atlantic_breakdowns": [
              {
                "city": "PICTOU Co.",
                "province": "NS",
                "online": false,
                "number_of_prizes": 1
              }
            ],
            "guaranteed_prize_english": null,
            "guaranteed_prize_french": null,
            "guaranteed_prize_type": null,
            "number_of_prizes": 1,
            "prize_value": 100000,
            "region_breakdowns": null,
            "type": "TAG_All6"
          },
          {
            "atlantic_breakdowns": [
              {
                "city": "CHARLOTTETOWN",
                "province": "PE",
                "online": false,
                "number_of_prizes": 1
              },
              {
                "city": "CAMPBELLTON",
                "province": "NB",
                "online": false,
                "number_of_prizes": 1
              },
              {
                "city": "GRAND FALLS",
                "province": "NB",
                "online": false,
                "number_of_prizes": 1
              }
            ],
            "guaranteed_prize_english": null,
            "guaranteed_prize_french": null,
            "guaranteed_prize_type": null,
            "number_of_prizes": 3,
            "prize_value": 1000,
            "region_breakdowns": null,
            "type": "TAG_Last5"
          },
          {
            "atlantic_breakdowns": null,
            "guaranteed_prize_english": null,
            "guaranteed_prize_french": null,
            "guaranteed_prize_type": null,
            "number_of_prizes": 20,
            "prize_value": 100,
            "region_breakdowns": null,
            "type": "TAG_Last4"
          },
          {
            "atlantic_breakdowns": null,
            "guaranteed_prize_english": null,
            "guaranteed_prize_french": null,
            "guaranteed_prize_type": null,
            "number_of_prizes": 264,
            "prize_value": 20,
            "region_breakdowns": null,
            "type": "TAG_Last3"
          },
          {
            "atlantic_breakdowns": null,
            "guaranteed_prize_english": null,
            "guaranteed_prize_french": null,
            "guaranteed_prize_type": null,
            "number_of_prizes": 2656,
            "prize_value": 10,
            "region_breakdowns": null,
            "type": "TAG_Last2"
          },
          {
            "atlantic_breakdowns": null,
            "guaranteed_prize_english": null,
            "guaranteed_prize_french": null,
            "guaranteed_prize_type": null,
            "number_of_prizes": 26148,
            "prize_value": 2,
            "region_breakdowns": null,
            "type": "TAG_Last1"
          }
        ],
        "winning_numbers": [
          "15",
          "32",
          "38",
          "40",
          "44",
          "45"
        ]
      },
      "draw_date": "/Date(1538620199000-0300)/",
      "game": "Lotto649",
      "guaranteed_draws": [
        {
          "prize_payouts": [
            {
              "atlantic_breakdowns": null,
              "guaranteed_prize_english": null,
              "guaranteed_prize_french": null,
              "guaranteed_prize_type": "Lotto649_Cash",
              "number_of_prizes": 1,
              "prize_value": 1000000,
              "region_breakdowns": [
                {
                  "region": "Ontario",
                  "number_of_prizes": 1
                }
              ],
              "type": "Lotto649_Guaranteed"
            }
          ],
          "winning_number": "4025410203"
        }
      ],
      "last_edit_date": "/Date(1468521063000-0300)/",
      "next_draw": {
        "providerDrawId": "42001003627",
        "draw_date": "/Date(1540434599000-0300)/",
        "jackpot": 5000000,
        "estimated_number_of_promotional_draws": null,
        "guaranteed_prize_english": null,
        "guaranteed_prize_french": null,
        "guaranteed_prize_type": "Lotto649_Cash"
      },
      "promotional_draws": null
    }
  ]
}]

Extract Winning Numbers, Bonus Number, and Jackpot Amount

Extract the gameId element and bonus_number, winning_numbers, and jackpot elements within gameData using the JSON Query (jq) command-line utility.

$ jq --raw-output '.[] | select(.gameId=="Lotto649") | .gameData
  | .[].draw.winning_numbers | @csv' ./data/page.json | sed 's/\"//g'
$ jq '.[] | select(.gameId=="Lotto649") | .gameData
  | .[].draw.bonus_number' ./data/page.json | sed 's/\"//g'
$ jq '.[] | select(.gameId=="Lotto649") | .gameData
  | .[].draw.prize_payouts | .[] | select(.type=="Lotto649_6of6")
  | .prize_value' ./data/page.json

Note: Line breaks inside the single-quoted jq filter are harmless because jq ignores whitespace between filter elements. A line break anywhere else in the command, such as before | sed, must be preceded by a backslash; otherwise the shell treats the next line as a separate command.
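
The three queries can also be combined into a single jq invocation that emits one comma-separated record. This is a sketch rather than the approach used by the scripts in this tutorial; the sample file is a trimmed, hand-made copy of the structure shown above.

```shell
#!/usr/bin/env sh
# Sketch only: build one CSV record (numbers, bonus, jackpot) with a single
# jq call. The sample data is a trimmed copy of the structure shown above.
sample=$(mktemp)
cat > "${sample}" <<'EOF'
[{
  "gameId": "Lotto649",
  "gameData": [
    {
      "draw": {
        "bonus_number": "02",
        "prize_payouts": [
          { "prize_value": 40000000, "type": "Lotto649_6of6" },
          { "prize_value": 105957.1, "type": "Lotto649_5of6Bonus" }
        ],
        "winning_numbers": ["15", "32", "38", "40", "44", "45"]
      }
    }
  ]
},
{}
]
EOF

# The filter sits inside single quotes, so it may span several lines.
record=$(jq --raw-output '
  .[] | select(.gameId=="Lotto649") | .gameData[].draw
  | [ .winning_numbers[],
      .bonus_number,
      (.prize_payouts[] | select(.type=="Lotto649_6of6") | .prize_value)
    ]
  | map(tostring) | join(",")
' "${sample}")

printf '%s\n' "${record}"
rm -f "${sample}"
```

For the sample data this prints 15,32,38,40,44,45,02,40000000. Converting every element with tostring before joining avoids the separate sed step that strips quotation marks.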

Implementation

It is recommended the programming code in this section be obtained from the repository rather than attempting to copy-and-paste it. This is due to potential and actual formatting modifications made to the source code to avoid horizontal scrolling.

Minimum Viable Solution

The preceding code snippets can be combined into a single programme to collect the Lotto 649 draw date, winning numbers including bonus number, and jackpot. The complete source code is available in the repository. A safeguard avoids retrieving lottery results for a particular draw date multiple times.

$ ./code/alc-0.sh 
Usage:    alc.sh YYYY-MM-DD
Example:  alc.sh 2018-10-03
$
$ ./code/alc-0.sh 2018-10-03
Lottery results for 2018-10-03: 15,32,38,40,44,45,02,40000000
$
$ ./code/alc-0.sh 2018-10-03
Lottery results for 2018-10-03 were previously retrieved
2018-10-03,15,32,38,40,44,45,02,40000000
#!/usr/bin/env bash

# Lottery Data Retrieval and Extraction for Atlantic Lottery Corporation
# Author: Gregory D. Horne < greg at gregoryhorne dot ca >
# Original source:
#   https://gitlab.com/gregorydhorne/alc-lotto-649-data-retrieval
# Copyright (c) 2017-2018 Gregory D. Horne
# License: BSD 3-Clause License (http://opensource.org/licenses/BSD-3-Clause)

# Set draw date to the calendar date passed as the first argument to the script.
if [[ ! -z ${1} ]]
then
  draw_date=${1}
else
  echo "Usage:    alc.sh YYYY-MM-DD"
  echo "Example:  alc.sh 2018-10-03"
  exit 1
fi

# If the results for the draw date have already been retrieved, do not fetch
# them again.
if [[ -e ./data/lotto649.csv ]] \
   && [[ ! -z `grep "${draw_date}" ./data/lotto649.csv` ]]
then
  echo "Lottery results for ${draw_date} were previously retrieved"
  grep ${draw_date} ./data/lotto649.csv
  exit 1
fi

# Launch the web browser and retrieve the Lotto649 winning numbers for the
# specified draw date. The date is not validated to ensure it is an actual
# draw date; if the date is incorrect, the lottery results returned are for
# the draw date prior to the calendar date.
BROWSER="wget -O ./data/page.html"
PROTOCOL="https"
DOMAIN="www.alc.ca"
ASSET="/content/alc/en/our-games/lotto/lotto-6-49.html"
QUERY="?date=${draw_date}"
${BROWSER} ${PROTOCOL}://${DOMAIN}${ASSET}${QUERY}

# Check the web page was saved to a file named page.html.
if [[ ! -e ./data/page.html ]]
then
  echo "Save the web page to a file named page.html"
  exit 1
fi

# Extract the gameId and gameData elements from the JSON data structure within
# the webpage. Create a well-formed JSON data structure.
awk '
  /ALC.components.GameDetailComponent/ { flag = 1; next; }
  /jQuery/ { flag = 0; } flag
' ./data/page.html \
| awk '
    BEGIN { print("["); }
    /gameId/ {
      sub("gameId", "\"gameId\"", $0);
      printf("{%s", $0);
    }
    /gameData/ {
      sub("gameData", "\"gameData\"", $0);
      sub(/,$/, "},", $0);
      printf("%s\n", $0);
    }
    END { printf("{}\n]"); }
' > ./data/page.json

# Extract the winning numbers and the bonus number.
numbers=$(jq --raw-output '.[] | select(.gameId=="Lotto649") | .gameData
          | .[].draw.winning_numbers | @csv' ./data/page.json |  sed 's/\"//g')
bonus_number=$(jq '.[] | select(.gameId=="Lotto649") | .gameData
               | .[].draw.bonus_number' ./data/page.json | sed 's/\"//g')

# Extract the jackpot amount.
jackpot=$(jq '.[] | select(.gameId=="Lotto649") | .gameData
          | .[].draw.prize_payouts | .[] | select(.type=="Lotto649_6of6")
          | .prize_value' ./data/page.json)
jackpot=$(printf "%0.2f" ${jackpot})

# Display the lottery results on the console. 
printf "Lottery results for ${draw_date}: "
printf "${numbers},${bonus_number},${jackpot}\n"

# Save the lottery results to a file named lotto649.csv.
printf "${draw_date},${numbers},${bonus_number},${jackpot}\n" >> ./data/lotto649.csv

# Clean-up any intermediary files.
rm -f ./data/page.html ./data/page.json

exit 0

By iterating multiple times we transform a minimum viable solution into a fully automated system suitable for running as a scheduled process.
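
The safeguard in alc-0 searches for the draw date anywhere in each line of the dataset. A slightly stricter variant, sketched below with the hypothetical function name already_retrieved, anchors the match to the first CSV field and relies on grep's exit status instead of capturing its output.

```shell
#!/usr/bin/env sh
# Sketch only. already_retrieved DRAW_DATE [DATASET]
# Succeeds if DATASET exists and has a record whose first field is DRAW_DATE.
already_retrieved() {
  draw_date=$1
  dataset=${2:-./data/lotto649.csv}
  [ -e "${dataset}" ] && grep -q "^${draw_date}," "${dataset}"
}

# Demonstration against a throwaway dataset.
csv=$(mktemp)
printf '2018-10-03,15,32,38,40,44,45,02,40000000\n' > "${csv}"

if already_retrieved 2018-10-03 "${csv}"
then
  echo "Lottery results for 2018-10-03 were previously retrieved"
fi
rm -f "${csv}"
```

Anchoring the pattern with ^ and a trailing comma means a date can only match in the draw date column, not by coincidence elsewhere in a record.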

Refactoring the Script using Functions

The complete source code which collects the Lotto 649 draw date, winning numbers including bonus number, and jackpot is available in the repository. A safeguard avoids retrieving lottery results multiple times.

$ ./code/alc-1.sh 
Usage:    alc.sh YYYY-MM-DD
Example:  alc.sh 2018-10-03
$
$ ./code/alc-1.sh 2018-10-03
Lottery results for 2018-10-03: 15,32,38,40,44,45,02,40000000
$
$ ./code/alc-1.sh 2018-10-03
Lottery results for 2018-10-03 were previously retrieved
2018-10-03,15,32,38,40,44,45,02,40000000
#!/usr/bin/env bash

# Lottery Data Retrieval and Extraction for Atlantic Lottery Corporation
# Author: Gregory D. Horne < greg at gregoryhorne dot ca >
# Original source:
#   https://gitlab.com/gregorydhorne/alc-lotto-649-data-retrieval
# Copyright (c) 2017-2018 Gregory D. Horne
# License: BSD 3-Clause License (http://opensource.org/licenses/BSD-3-Clause)

# Configuration Settings
BROWSER="wget -O ./data/page.html"
PROTOCOL="https"
DOMAIN="www.alc.ca"
ASSET="/content/alc/en/our-games/lotto/lotto-6-49.html"

# Retrieve lottery results for the draw date.
retrieve_lottery_results()
{
  # If the results for the draw date have already been retrieved, do not fetch
  # them again.
  if [[ -e ./data/lotto649.csv ]] \
  && [[ ! -z `grep "${draw_date}" ./data/lotto649.csv` ]]
  then
    echo "Lottery results for ${draw_date} were previously retrieved"
    grep ${draw_date} ./data/lotto649.csv
    exit 1
  fi

  # Launch the web browser and retrieve the Lotto 649 winning numbers for the
  # specified draw date. The date is not validated to ensure it is an actual
  # draw date; if the date is incorrect, the lottery results returned are for
  # the draw date prior to the calendar date.
  QUERY="?date=${draw_date}"
  ${BROWSER} ${PROTOCOL}://${DOMAIN}${ASSET}${QUERY}

  # Check the web page was saved to a file named page.html.
  if [[ ! -e ./data/page.html ]]
  then
    echo "Save the web page to a file named page.html"
    exit 1
  fi
}

# Extract the gameId and gameData elements from the JSON data structure within
# the webpage. Create a well-formed JSON data structure.
extract_lottery_details()
{
  awk '
    /ALC.components.GameDetailComponent/ { flag = 1; next; }
    /jQuery/ { flag = 0; } flag
  ' ./data/page.html \
  | awk '
      BEGIN { print("["); }
      /gameId/ {
        sub("gameId", "\"gameId\"", $0);
        printf("{%s", $0);
      }
      /gameData/ {
        sub("gameData", "\"gameData\"", $0);
        sub(/,$/, "},", $0);
        printf("%s\n", $0);
      }
      END { printf("{}\n]"); }
  ' > ./data/page.json

  # Extract the winning numbers and the bonus number.
  numbers=$(jq --raw-output '.[] | select(.gameId=="Lotto649") | .gameData
            | .[].draw.winning_numbers | @csv' ./data/page.json |  sed 's/\"//g')
  bonus_number=$(jq '.[] | select(.gameId=="Lotto649") | .gameData
                 | .[].draw.bonus_number' ./data/page.json | sed 's/\"//g')

  # Extract the jackpot amount.
  jackpot=$(jq '.[] | select(.gameId=="Lotto649") | .gameData
            | .[].draw.prize_payouts | .[] | select(.type=="Lotto649_6of6")
            | .prize_value' ./data/page.json)
  jackpot=$(printf "%0.2f" ${jackpot})
}

# Display the lottery results on the console.
write_details_to_console()
{
  printf "Lottery results for ${draw_date}: "
  printf "${numbers},${bonus_number},${jackpot}\n"
}

# Save the lottery results to a file named lotto649.csv.
write_details_to_file()
{
  printf "${draw_date},${numbers},${bonus_number},${jackpot}\n" \
  >> ./data/lotto649.csv
}

# Clean-up any intermediary files.
clean_up()
{
  rm -f ./data/page.html ./data/page.json
}

###############################################################################

# Set draw date to the calendar date passed as the first argument to the script.
# If a calendar date is present, set the draw date. Otherwise, display usage
# information on the console.
if [[ ! -z ${1} ]]
then
  draw_date=${1}
else
  echo "Usage:    alc.sh YYYY-MM-DD"
  echo "Example:  alc.sh 2018-10-03"
  exit 1
fi

retrieve_lottery_results
extract_lottery_details
write_details_to_console
write_details_to_file
clean_up

exit 0

Refactoring the Script using Function Parameters and Global & Local Variables

The complete source code which collects the Lotto 649 draw date, winning numbers including bonus number, and jackpot is available in the repository. A safeguard avoids retrieving lottery results multiple times.

$ ./code/alc-2.sh 
Usage:    alc.sh YYYY-MM-DD
Example:  alc.sh 2018-10-03
$
$ ./code/alc-2.sh 20181003
Usage:    alc.sh YYYY-MM-DD
Example:  alc.sh 2018-10-03
$
$ ./code/alc-2.sh 2018-10-03
Lottery results for 2018-10-03: 15,32,38,40,44,45,02,40000000
$
$ ./code/alc-2.sh 2018-10-03
Lottery results for 2018-10-03 were previously retrieved
2018-10-03,15,32,38,40,44,45,02,40000000
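
The session above shows a date without hyphens being rejected. The validate function in alc-2 checks only that the argument is ten characters long; a stricter check, sketched here under the assumption that GNU date is available, also verifies the YYYY-MM-DD pattern and that the calendar date exists. The function name is_valid_date is hypothetical.

```shell
#!/usr/bin/env sh
# Sketch only. is_valid_date STRING
# Succeeds if STRING has the form YYYY-MM-DD and names a real calendar date.
# Assumes GNU date, whose -d option rejects dates such as 2018-02-30.
is_valid_date() {
  case "$1" in
    [0-9][0-9][0-9][0-9]-[0-9][0-9]-[0-9][0-9]) ;;
    *) return 1 ;;
  esac
  [ "$(TZ=UTC date -d "$1" +%F 2>/dev/null)" = "$1" ]
}

for candidate in 2018-10-03 20181003 2018-02-30
do
  if is_valid_date "${candidate}"
  then
    echo "${candidate}: valid"
  else
    echo "${candidate}: invalid"
  fi
done
```

Only the first candidate is reported as valid. This check still does not confirm that the date is a Wednesday or Saturday draw date.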
#!/usr/bin/env bash

# Lottery Data Retrieval and Extraction for Atlantic Lottery Corporation
# Author: Gregory D. Horne < greg at gregoryhorne dot ca >
# Original source:
#   https://gitlab.com/gregorydhorne/alc-lotto-649-data-retrieval
# Copyright (c) 2017-2018 Gregory D. Horne
# License: BSD 3-Clause License (http://opensource.org/licenses/BSD-3-Clause)

# Configuration Settings
BROWSER="wget -O ./data/page.html"
PROTOCOL="https"
DOMAIN="www.alc.ca"
ASSET="/content/alc/en/our-games/lotto/lotto-6-49.html"

# Global Variables
draw_date=""

# Validate the specified calendar date. 
validate()
{
  # If a calendar date has not been passed as the first argument, display usage
  # message.
  if [[ -z "${1}" ]] || [[ ${#1} -ne 10 ]]
  then
    echo "Usage:    alc.sh YYYY-MM-DD"
    echo "Example:  alc.sh 2018-10-03"
    exit 1
  fi

  draw_date=${1}
}

# Retrieve lottery draw results for the specified draw date.
retrieve_lottery_results()
{
  # If the results for the draw date have already been retrieved, do not fetch
  # them again.
  if [[ -e ./data/lotto649.csv ]] \
     && [[ ! -z `grep "${draw_date}" ./data/lotto649.csv` ]]
  then
    echo "Lottery results for ${draw_date} were previously retrieved"
    grep ${draw_date} ./data/lotto649.csv
    exit 1
  fi

  # Launch the web browser and retrieve the Lotto649 winning numbers for the
  # specified draw date. The date is not validated to ensure it is an actual
  # draw date; if the date is incorrect, the lottery results returned are for
  # the draw date prior to the calendar date.
  QUERY="?date=${draw_date}"
  ${BROWSER} ${PROTOCOL}://${DOMAIN}${ASSET}${QUERY}

  # Check the web page was saved to a file named page.html.
  if [[ ! -e ./data/page.html ]]
  then
    echo "Save the web page to a file named page.html"
    exit 1
  fi
}

# Extract the gameId and gameData elements from the JSON data structure within
# the webpage. Create a well-formed JSON data structure.  
extract_lottery_details()
{
  local numbers
  local bonus_number
  local jackpot

  awk '
    /ALC.components.GameDetailComponent/ { flag = 1; next; }
    /jQuery/ { flag = 0; } flag
  ' ./data/page.html \
  | awk '
      BEGIN { print("["); }
      /gameId/ {
        sub("gameId", "\"gameId\"", $0);
        printf("{%s", $0);
      }
      /gameData/ {
        sub("gameData", "\"gameData\"", $0);
        sub(/,$/, "},", $0);
        printf("%s\n", $0);
      }
      END { printf("{}\n]"); }
  ' > ./data/page.json

  # Extract the winning numbers and the bonus number.
  numbers=$(jq --raw-output '.[] | select(.gameId=="Lotto649")
            | .gameData | .[].draw.winning_numbers | @csv' ./data/page.json
            | sed 's/\"//g')
  bonus_number=$(jq '.[] | select(.gameId=="Lotto649") | .gameData
                 | .[].draw.bonus_number' ./data/page.json | sed 's/\"//g')

  # Extract the jackpot amount.
  jackpot=$(jq '.[] | select(.gameId=="Lotto649") | .gameData
            | .[].draw.prize_payouts | .[] | select(.type=="Lotto649_6of6")
            | .prize_value' ./data/page.json)
  jackpot=$(printf "%0.2f" ${jackpot})

  echo "${numbers},${bonus_number},${jackpot}"
}

# Display the lottery results on the console.
write_details_to_console()
{
  local results=${1}

  # Pass the values as arguments so they are never interpreted as format strings.
  printf "Lottery results for %s: " "${draw_date}"
  printf "%s\n" "${results}"
}

# Save the lottery results to a file named lotto649.csv.
write_details_to_file()
{
  local results=${1}

  printf "${draw_date},${results}\n" >> ./data/lotto649.csv
}

# Clean-up any intermediary files.
clean_up()
{
  rm -f ./data/page.html ./data/page.json
}

###############################################################################

validate ${1}
retrieve_lottery_results
results=$(extract_lottery_details)
write_details_to_console ${results}
write_details_to_file ${results}
clean_up

exit 0
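
The refactoring above relies on one pattern: a function receives its input as positional parameters, keeps its working variables local, and hands its result back on standard output, where the caller captures it with command substitution. A minimal sketch of that pattern in isolation follows. The function name format_record and the sample values are hypothetical and do not appear in the repository.

```shell
#!/usr/bin/env sh

# Hypothetical helper: build one CSV record from a draw date and its results.
# Both inputs arrive as positional parameters; nothing is read from globals.
format_record()
{
  local draw_date="${1}"
  local results="${2}"

  # The "return value" is whatever the function writes to standard output.
  printf '%s,%s\n' "${draw_date}" "${results}"
}

# Capture the function's output with command substitution.
record=$(format_record "2018-10-03" "15,32,38,40,44,45,02")
echo "${record}"
```

Because draw_date and results are declared local, they exist only while format_record is running, so the caller's variables of the same name are left untouched.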

Refactoring the Script to Handle Multiple Draw Dates

User Specified Draw Dates

The complete source code, which collects the Lotto 649 draw date, the winning numbers including the bonus number, and the jackpot, is available in the repository. A sample draw dates file containing both past and future dates (future, at least, at the time of writing) is provided to test the script. A safeguard prevents lottery results from being retrieved more than once and skips any draw date which is still in the future.

$ ./code/alc-3.sh 
Usage:    alc.sh lottery
Example:  alc.sh lotto-6-49
$
$ ./code/alc-3.sh lotto-6-49
Status: Record count: 0 (pre-count) : 2 (post-count)
        Error count: 0
$
$ ./code/alc-3.sh lotto-6-49
Status: Record count: 2 (pre-count) : 2 (post-count)
        Error count: 0

Ordinarily the error count should be zero. However, network latency or a failure to save the webpage can leave some lottery draw data unretrieved. Rerun the script to retrieve the missing data; it skips dates for which data has already been successfully retrieved.
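
The rerun behaviour rests on a simple check: a date is fetched only when it is absent from the results file. A minimal sketch of that check follows. The function name already_retrieved is hypothetical, and anchoring the pattern to the start of the line with grep -q is a choice made for this sketch rather than what the script itself does.

```shell
#!/usr/bin/env sh

# Hypothetical helper: succeed (exit status 0) when the CSV file already holds
# a record whose first field is the given draw date.
already_retrieved()
{
  local csv_file="${1}"
  local draw_date="${2}"

  [ -e "${csv_file}" ] && grep -q "^${draw_date}," "${csv_file}"
}

# Usage: create a throwaway results file and probe it for two dates.
csv_file=$(mktemp)
echo "2018-10-03,15,32,38,40,44,45,02,40000000" > "${csv_file}"

for candidate in 2018-10-03 2018-10-06
do
  if already_retrieved "${csv_file}" "${candidate}"
  then
    echo "${candidate}: skip"
  else
    echo "${candidate}: fetch"
  fi
done

rm -f "${csv_file}"
```

Matching on the leading date field, rather than anywhere in the line, guards against a date-like string elsewhere in a record causing a false skip.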

#!/usr/bin/env sh

# Lottery Data Retrieval and Extraction for Atlantic Lottery Corporation
# Author: Gregory D. Horne < greg at gregoryhorne dot ca >
# Original source:
#   https://gitlab.com/gregorydhorne/alc-lotto-649-data-retrieval
# Copyright (c) 2017-2018 Gregory D. Horne
# License: BSD 3-Clause License (http://opensource.org/licenses/BSD-3-Clause)

# Configuration Settings
BROWSER="wget -O ./data/page.html"
PROTOCOL="https"
DOMAIN="www.alc.ca"

# Validate the lottery game designator. 
validate()
{
  # If a lottery name has not been passed as the first argument, display usage
  # message.
  if [[ "${1}" != "lotto-6-49" ]]
  then
    echo "Usage:    alc.sh lottery"
    echo "Example:  alc.sh lotto-6-49"
    exit 1
  fi
}

# Retrieve results for the specified draw dates contained in the file
# named draw-dates.
lotto_game()
{
  local game=${1}

  local error_count=0
  local file_name=$(echo ${game} | sed 's/-//g')
  local today=$(date -I | sed 's/-//g')

  local draw_date
  local line

  while IFS='' read -r line || [[ -n "${line}" ]]
  do
    draw_date=$(echo ${line} | sed 's/-//g')
    if [[ "${draw_date}" -lt "${today}" ]]
    then
      draw_date=${line}
      error_count=$(retrieve_lottery_results ${game} ${draw_date} ${error_count})
      if [[ -e ./data/page.html ]]
      then
        results=$(extract_lottery_details ${game} ${draw_date})
        write_details_to_file "${results}"
      fi
      clean_up
    fi
  done < ./data/draw-dates

  temp_file=$(mktemp)
  sort ./data/${file_name}.csv > ${temp_file}
  cp ${temp_file} ./data/${file_name}.csv

  echo ${error_count}
}

# Retrieve lottery results for the specified draw date.
retrieve_lottery_results()
{
  local lottery=${1}
  local draw_date=${2}
  local error_count=${3}

  local file_name=$(echo ${lottery} | sed 's/-//g')

  # If the results for the draw date have already been retrieved, do not fetch
  # them again.
  if [[ ! -e ./data/${file_name}.csv ]] \
     || [[ -z $(grep "${draw_date}" ./data/${file_name}.csv) ]]
  then
    # Launch the web browser and retrieve the Lotto 649 winning numbers for the
    # specified draw date. The date is not validated to ensure it is an actual
    # draw date; if the date is incorrect, the lottery results returned are for
    # the draw date prior to the calendar date.
    ASSET="/content/alc/en/our-games/lotto/${lottery}.html"
    QUERY="?date=${draw_date}"
    ${BROWSER} ${PROTOCOL}://${DOMAIN}${ASSET}${QUERY}

    # Check the web page was saved to a file named page.html.
    if [[ ! -e ./data/page.html ]]
    then
      error_count=$(expr ${error_count} + 1)
    fi
  fi

  echo ${error_count}
}

# Extract the JSON formatted data about the lottery from the HTML file.
# Create a well-formed JSON data structure and extract lottery details (gameId
# and gameData).
extract_lottery_details()
{
  local lottery=${1}
  local draw_date=${2}

  local file_name=$(echo ${lottery} | sed 's/-//g')

  local bonus_number
  local jackpot
  local numbers

  awk '
    /ALC.components.GameDetailComponent/ { flag = 1; next; }
    /jQuery/ { flag = 0; } flag
  ' ./data/page.html \
  | awk '
      BEGIN { print("["); }
      /gameId/ {
        sub("gameId", "\"gameId\"", $0);
        printf("{%s", $0);
      }
      /gameData/ {
        sub("gameData", "\"gameData\"", $0);
        sub(/,$/, "},", $0);
        printf("%s\n", $0);
      }
      END { printf("{}\n]"); }
  ' > ./data/page.json

  # Extract the winning numbers and the bonus number.
  numbers=$(jq --raw-output '.[] | select(.gameId=="Lotto649") | .gameData
            | .[].draw.winning_numbers | @csv' ./data/page.json
            | sed 's/\"//g')
  bonus_number=$(jq '.[] | select(.gameId=="Lotto649")
                 | .gameData | .[].draw.bonus_number' ./data/page.json
                 | sed 's/\"//g')

  # Extract the jackpot amount.
  jackpot=$(jq '.[] | select(.gameId=="Lotto649") | .gameData
            | .[].draw.prize_payouts | .[] | select(.type=="Lotto649_6of6")
            | .prize_value' ./data/page.json)
  jackpot=$(printf "%0.2f" ${jackpot})

  echo "${lottery} ${draw_date},${numbers},${bonus_number},${jackpot}"
}

# Save the lottery results to a file named lotto649.csv.
write_details_to_file()
{
  local file_name=$(echo ${1} | cut -d \  -f 1 | sed 's/-//g')
  local results=$(echo ${1} | cut -d \  -f 2)

  printf "${results}\n" >> ./data/${file_name}.csv
}

# Get the current record count, that is the number of lottery draws stored in
# the lottery details file.
record_count()
{
  local file_name=$(echo ${1} | sed 's/-//g')

  if [[ -e ./data/${file_name}.csv ]]
  then
    echo $(wc -l ./data/${file_name}.csv | cut -d \  -f 1)
  else
    echo 0
  fi
}

# Display the number of lottery draws stored in the lottery details file.
status()
{
  local pre_update_count=${1}
  local post_update_count=${2}
  local error_count=${3}

  printf "Status:"
  printf "\tRecord count: ${pre_update_count} (pre-count) : "
  printf "${post_update_count} (post-count)\n"
  printf "\tError count: ${error_count}\n"
}

# Clean-up any intermediary files.
clean_up()
{
  rm -f ./data/page.html ./data/page.json
}

###############################################################################

validate ${1}
pre_update_count=$(record_count ${1})
error_count=$(lotto_game ${1})
post_update_count=$(record_count ${1})
status ${pre_update_count} ${post_update_count} ${error_count}

exit 0

A final iteration eliminates the draw dates file, replacing it with two values, start_year and end_year, passed as arguments to the programme. A separate directory is created automatically to store the lottery data, and a compressed archive is created and kept up to date alongside the uncompressed file.
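
Keeping the archive in step with the uncompressed file amounts to a three-way decision based on the record counts before and after retrieval and on whether an archive already exists. The sketch below isolates that decision without invoking zip. The function name archive_action and its yes/no third parameter are assumptions made for illustration; the script tests for the archive file directly.

```shell
#!/usr/bin/env sh

# Hypothetical helper: report which archive action a run calls for.
#   ${1}  record count before retrieval
#   ${2}  record count after retrieval
#   ${3}  "yes" if the archive already exists, otherwise "no"
archive_action()
{
  local prefetch_count="${1}"
  local postfetch_count="${2}"
  local archive_exists="${3}"

  if [ "${prefetch_count}" -lt "${postfetch_count}" ]
  then
    # New records arrived: refresh an existing archive or create one.
    if [ "${archive_exists}" = "yes" ]
    then
      echo "update"
    else
      echo "create"
    fi
  elif [ "${archive_exists}" = "no" ] && [ "${postfetch_count}" -gt 0 ]
  then
    # Nothing new, but data exists and has never been archived.
    echo "create"
  else
    echo "none"
  fi
}

archive_action 0 103 no
archive_action 103 104 yes
archive_action 104 104 yes
```

The three calls correspond to a first run, a rerun which picks up a missing record, and a rerun which finds nothing new.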

Calendar-Driven Draw Dates

The complete source code, which collects the Lotto 649 draw date, the winning numbers including the bonus number, and the jackpot, is available in the repository. A safeguard prevents lottery results from being retrieved more than once and skips any draw date which is still in the future.

$ ./code/alc-4.sh lotto-6-49 2018 2018
Data retrieval starting...with errors
Status:
  Record count: 0 (pre-count) : 103 (post-count)
  Error count: 1
  Archive created
$ ./code/alc-4.sh lotto-6-49 2018 2018
Data retrieval starting...completed successfully
  Archive updated
$ ./code/alc-4.sh lotto-6-49 2018 2018
Data retrieval starting...completed successfully
  No updates to archive
$ ./code/alc-4.sh lotto-6-49 2018 2019
Data retrieval starting...completed successfully
  Archive updated

#!/usr/bin/env sh

# Lottery Data Retrieval and Extraction for Atlantic Lottery Corporation
# Author: Gregory D. Horne < greg at gregoryhorne dot ca >
# Original source:
#   https://gitlab.com/gregorydhorne/alc-lotto-649-data-retrieval
# Copyright (c) 2017-2018 Gregory D. Horne
# License: BSD 3-Clause License (http://opensource.org/licenses/BSD-3-Clause)

# Configuration parameters.
BROWSER="wget -O ./data/page.html"
PROTOCOL="https"
DOMAIN="www.alc.ca"

# Archive current data after retrieving the latest updates.
archive_data()
{
  local prefetch_count=${1}
  local postfetch_count=${2}
  local file_name=$(echo ${3} | sed 's/-//g')

  if [[ ${prefetch_count} -lt ${postfetch_count} ]]
  then
    if [[ -e ./data/${file_name}.zip ]]
    then
      zip -f ./data/${file_name}.zip ./data/${file_name}.csv > /dev/null 2>&1
      echo -e "\tArchive updated"
    else
      zip ./data/${file_name}.zip ./data/${file_name}.csv > /dev/null 2>&1
      echo -e "\tArchive created"
    fi
  else if [[ ! -e ./data/${file_name}.zip ]] \
          && [[ ${prefetch_count} -ne 0 ]] \
          && [[ ${postfetch_count} -ne 0 ]]
       then
         zip ./data/${file_name}.zip ./data/${file_name}.csv > /dev/null 2>&1
         echo -e "\tArchive created"
       else
         echo -e "\tNo updates to archive"
       fi
  fi

  return
}

# Extract the JSON formatted data about the lottery from the HTML file.
# Create a well-formed JSON data structure and extract lottery details (gameId
# and gameData).
extract_lottery_details()
{
  local game_designator=${1}

  local file_name=$(echo ${game_designator} | sed 's/-//g')

  local bonus_number
  local jackpot
  local numbers

  awk '
    /ALC.components.GameDetailComponent/ { flag = 1; next; }
    /jQuery/ { flag = 0; } flag
  ' ./data/page.html \
  | awk '
      BEGIN { print("["); }
      /gameId/ {
        sub("gameId", "\"gameId\"", $0);
        printf("{%s", $0);
      }
      /gameData/ {
        sub("gameData", "\"gameData\"", $0);
        sub(/,$/, "},", $0);
        printf("%s\n", $0);
      }
      END { printf("{}\n]"); }
  ' > ./data/page.json

  # Extract the winning numbers and the bonus number.
  numbers=$(jq --raw-output '.[] | select(.gameId=="Lotto649") | .gameData
            | .[].draw.winning_numbers | @csv' ./data/page.json
            | sed 's/\"//g')
  bonus_number=$(jq '.[] | select(.gameId=="Lotto649") | .gameData
                 | .[].draw.bonus_number' ./data/page.json | sed 's/\"//g')

  # Extract the jackpot amount.
  jackpot=$(jq '.[] | select(.gameId=="Lotto649") | .gameData
            | .[].draw.prize_payouts | .[] | select(.type=="Lotto649_6of6")
            | .prize_value' ./data/page.json)
  jackpot=$(printf "%0.2f" ${jackpot})

  # Delete temporary lottery data files
  rm -f ./data/page.html ./data/page.json

  echo "${numbers},${bonus_number},${jackpot}"
}

# Initialise datastore.
initialise()
{
  local file_name=$(echo ${1} | sed 's/-//g')

  if [[ ! -e ./data ]]
  then
    mkdir ./data
  fi

  if [[ ! -e ./data/${file_name}.csv ]]
  then
    touch ./data/${file_name}.csv
  fi

  return
}

# Report current record count.
record_count()
{
  local file_name=$(echo ${1} | sed 's/-//g')

  echo $(wc -l ./data/${file_name}.csv | cut -d \  -f 1)
}

# Retrieve lottery results for the specified draw date.
retrieve_lottery_results()
{
  local game_designator=${1}
  local draw_date=${2}
  local error_count=${3}

  local file_name=$(echo ${game_designator} | sed 's/-//g')

  # If the results for the draw date have already been retrieved, do not fetch
  # them again.
  if [[ ! -e ./data/${file_name}.csv ]] \
     || [[ -z $(grep "${draw_date}" ./data/${file_name}.csv) ]]
  then
    # Launch the web browser and retrieve the Lotto 649 winning numbers for the
    # specified draw date. The date is not validated to ensure it is an actual
    # draw date; if the date is incorrect, the lottery results returned are for
    # the draw date prior to the calendar date.
    ASSET="/content/alc/en/our-games/lotto/${game_designator}.html"
    QUERY="?date=${draw_date}"
    ${BROWSER} ${PROTOCOL}://${DOMAIN}${ASSET}${QUERY} > /dev/null 2>&1
    # Check the web page was saved to a file named page.html. 
    if [[ ! -e ./data/page.html ]]
    then
      error_count=$(expr ${error_count} + 1)
    fi
  fi

  echo ${error_count}
}

# Display the record counts before and after retrieval and the error count.
status()
{
  local start_count=${1}
  local end_count=${2}
  local error_count=${3}

  echo "Status:"
  echo -e "\tRecord count: ${start_count} (pre-count) : ${end_count} (post-count)"
  echo -e "\tError count: ${error_count}"

  return
}

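# Generate the Wednesday and Saturday dates for every month in the range of
# years and retrieve the results for each date earlier than today.
lotto649()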
{
  local start_year=${1}
  local end_year=${2}
  local game_designator=${3}

  local file_name=$(echo ${game_designator} | sed 's/-//g')
  local today=$(date -I)

  local error_count=0

  local days_of_month
  local draw_date
  local line

  for year in $(seq ${start_year} ${end_year})
  do
    for month in $(seq 12)
    do
      cal ${month} ${year} | tail -n +2 | sed 's/   / 0 /g' \
      | awk -v cols=We,Sa '
          BEGIN { split(cols, out, ","); }
          NR == 1 { for (i = 1; i <= NF; i++) { ix[$i] = i; } }
          NR > 1 {
            for (i in out) { printf("%s%s", $ix[out[i]], OFS); }
            print("");
          }' \
      | head -n -1 > ./data/draw-dates

      while IFS='' read -r days_of_week || [[ -n "${days_of_week}" ]]
      do
        for day in $(echo ${days_of_week})
        do
          draw_date=$(printf "%4d-%02d-%02d" ${year} ${month} ${day})
          # Check for a valid day of the month (01-31) inclusive
          if [[ ${draw_date:8:2} -eq 0 ]]; then continue; fi
          historic_date=$(echo ${draw_date} | sed 's/-//g')
          current_date=$(echo ${today} | sed 's/-//g')
          if [[ "${historic_date}" -lt "${current_date}" ]]
          then
            error_count=\
              $(retrieve_lottery_results ${game_designator} \
              ${draw_date} ${error_count})
            if [[ -e ./data/page.html ]]
            then
              results=$(extract_lottery_details ${game_designator})
              write_details_to_file ${file_name}.csv ${draw_date} "${results}"
            fi
          fi
        done
      done < ./data/draw-dates
      rm -f ./data/draw-dates
    done
  done

  temp_file=$(mktemp)
  sort ./data/${file_name}.csv > ${temp_file}
  cp -f ${temp_file} ./data/${file_name}.csv

  echo ${error_count}
}

# Validate the lottery game designator as well as starting and ending years. 
validate()
{
  local game_designator=${1}
  local start_year=${2}
  local end_year=${3}

  # If a supported lottery name, a starting year, and an ending year have not
  # all been passed as arguments, display usage message.
  if [[ "${game_designator}" != "lotto-6-49" ]] \
     || [[ -z ${start_year} ]] \
     || [[ -z ${end_year} ]]
  then
    echo "Usage:    alc.sh lottery YYYY YYYY"
    echo "Example:  alc.sh lotto-6-49 2018 2018"
    exit 1
  fi
}

# Save the lottery results to a file named lotto649.csv.
write_details_to_file()
{
  local file_name=${1}
  local draw_date=${2}
  local results=${3}

  printf "${draw_date},${results}\n" >> ./data/${file_name}
}

# Retrieve lottery data.
main()
{
  local game_designator=${1}
  local start_year=${2}
  local end_year=${3}

  local prefetch_count=0
  local postfetch_count=0
  local error_count=0

  validate ${game_designator} ${start_year} ${end_year}

  initialise ${game_designator}

  printf "Data retrieval starting..."
  prefetch_count=$(record_count ${game_designator})
  error_count=$(lotto649 ${start_year} ${end_year} ${game_designator})
  printf "completed "
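  # Count the records after every run, not only after errors, so archive_data
  # can tell whether new data was retrieved.
  postfetch_count=$(record_count ${game_designator})
  if [[ "${error_count}" == "0" ]]
  then
    printf "successfully\n"
  else
    printf "with errors\n"
    status ${prefetch_count} ${postfetch_count} ${error_count}
  fi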
  
  archive_data ${prefetch_count} ${postfetch_count} ${game_designator}
}

############################################################################### 

# Pass the lottery game designator, starting year, and ending year to retrieve.
main ${1} ${2} ${3}
exit 0
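
The script derives candidate draw dates by picking the Wednesday and Saturday columns out of the output of cal. As an alternative which does not depend on the layout of that output, the same dates can be computed arithmetically. The sketch below is not the author's method: the function names day_of_week, days_in_month, and draw_dates are hypothetical, and the weekday calculation uses Sakamoto's method.

```shell
#!/usr/bin/env sh

# Day of the week for a Gregorian date: 0 = Sunday ... 6 = Saturday.
# Arguments are plain integers without leading zeros (year, month, day).
day_of_week()
{
  local year="${1}"
  local month="${2}"
  local day="${3}"
  local offset

  # Per-month offsets used by Sakamoto's method.
  case "${month}" in
    1|5)  offset=0 ;;
    2|6)  offset=3 ;;
    3|11) offset=2 ;;
    4|7)  offset=5 ;;
    8)    offset=1 ;;
    9|12) offset=4 ;;
    10)   offset=6 ;;
  esac

  # January and February are treated as belonging to the previous year.
  if [ "${month}" -lt 3 ]
  then
    year=$((year - 1))
  fi

  echo $(( (year + year / 4 - year / 100 + year / 400 + offset + day) % 7 ))
}

# Number of days in the given month, allowing for leap years.
days_in_month()
{
  local year="${1}"
  local month="${2}"

  case "${month}" in
    4|6|9|11) echo 30 ;;
    2)
      if [ $((year % 4)) -eq 0 ] && [ $((year % 100)) -ne 0 ] \
         || [ $((year % 400)) -eq 0 ]
      then
        echo 29
      else
        echo 28
      fi
      ;;
    *) echo 31 ;;
  esac
}

# Print every Wednesday and Saturday of the given month as YYYY-MM-DD.
draw_dates()
{
  local year="${1}"
  local month="${2}"
  local last_day
  local weekday
  local day=1

  last_day=$(days_in_month "${year}" "${month}")
  while [ "${day}" -le "${last_day}" ]
  do
    weekday=$(day_of_week "${year}" "${month}" "${day}")
    if [ "${weekday}" -eq 3 ] || [ "${weekday}" -eq 6 ]
    then
      printf '%04d-%02d-%02d\n' "${year}" "${month}" "${day}"
    fi
    day=$((day + 1))
  done
}

# Usage: list the candidate draw dates for October 2018.
draw_dates 2018 10
```

This approach only generates real calendar days, so the zero-padding and the check for day 00 needed when parsing cal output are not required.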

Closing Thoughts

Imagine wanting to retrieve all Lotto 649 draws for a particular month, a particular year, or several years. Doing so with a typical graphical web browser such as Mozilla Firefox would be tedious and potentially error-prone. Fortunately, a scripting language such as ash, combined with the JSON query (jq), calendar generator (cal), and non-interactive network retriever (wget) command-line utilities, makes automating the task manageable and time-efficient, especially in a scheduled batch-processing environment.

Rapidly prototyping a minimum viable programme to retrieve historic lottery data (draw date, jackpot winning numbers, and jackpot amount) for a specific lottery game was just the beginning. Afterwards, a succession of code reorganisations and feature improvements led to a fully automated, production-ready data retrieval programme suitable for deployment in a scheduled or batch processing environment. Furthermore, a model now exists and can easily be applied to fetch other data of interest about Lotto 649 or any of the other lottery games operated by Atlantic Lottery Corporation. With a model at the ready it is relatively easy to adapt the fully-automated data retrieval programme to extract regional data.

The minimum viable solution has 31 lines of executable code, whereas the production-ready, fully automated version contains 205. A one-to-one ratio of logical statements to physical lines of executable code should not be assumed; a single logical statement might span multiple physical lines for readability, to fit within 80 columns (barring restrictions which do not permit splitting a statement), or by convention. On these figures the production-ready implementation has roughly 6.6 times (205 / 31) as many executable lines of code as the minimum viable implementation.
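
A rough count of executable lines can be obtained by excluding blank lines and comment-only lines. The sketch below shows one way to do so. How the figures quoted above were obtained is not stated, so this method may not reproduce them exactly, and the function name count_executable_lines is hypothetical.

```shell
#!/usr/bin/env sh

# Hypothetical helper: count the lines of a script which are neither blank nor
# comment-only. A line with code followed by a trailing comment still counts.
count_executable_lines()
{
  grep -c -v -E '^[[:space:]]*(#.*)?$' "${1}"
}

# Usage: count the executable lines of a small throwaway script.
sample=$(mktemp)
cat > "${sample}" <<'EOF'
#!/usr/bin/env sh

# A comment-only line.
echo "hello"   # a trailing comment does not disqualify the line
exit 0
EOF

count_executable_lines "${sample}"
rm -f "${sample}"
```

Note that the shebang line begins with a hash and is therefore treated as a comment by this count.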

The final version (alc-4) was originally created in support of the Analysis of Lottery Draws Between 2009 and 2017 project. At the time, historical lottery data from 2009 onwards was available from Atlantic Lottery Corporation. Extraction of the jackpot amount has been added to the latest revision.

Legalities

The source code and programmes developed in this tutorial are copyrighted and licensed under the terms of the BSD 3-Clause License.

Data Ownership

In accordance with the Atlantic Lottery Corporation Terms of Service, specifically subsection 15.3 “Intellectual Property”, as it pertains to any data retrieved from the Atlantic Lottery Corporation website:

“You will not copy, transmit or make otherwise available any content or material made available on or through ALC.ca.”