Loto: fun with lottery numbers

x1: statistics

Run the Dieharder and ENT statistical test suites.

Should be run from the CLI (depending on how you installed it), e.g.:

$ python loto/core.py x1

But can also be run as a script:

$ python loto/x1_statistics.py
loto.x1_statistics.build_stats_tests_sets(db_path, db_name)

Creates test set files from the DB.

A test set file is in the format: one number per line.

Parameters
  • db_path (str) – path to the directory where the DB is stored.

  • db_name (str) – name of the DB.

Returns

list containing the paths to the files just created.

Return type

list
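
As an illustration of the "one number per line" format described above, here is a minimal sketch. The helper name write_test_set and its arguments are hypothetical and are not part of the loto package; how the real function names and groups its files is not shown in this documentation.

```python
import os
import tempfile


def write_test_set(numbers, path):
    """Write one number per line to ``path`` and return the path (hypothetical helper)."""
    with open(path, "w", encoding="utf-8") as handle:
        for number in numbers:
            handle.write(f"{number}\n")
    return path


if __name__ == "__main__":
    # Example: dump a few numbers to a temporary file and show where it went.
    target = os.path.join(tempfile.mkdtemp(), "test_set.txt")
    print(write_test_set([15, 31, 40, 44, 48], target))
```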

loto.x1_statistics.check_installed_tools(desired_tools)

Prints whether any or all of the required tools are installed on the user’s machine.

Parameters

desired_tools (list) – list of the tools to run, e.g. [‘pracrand’, ‘ent’, ‘dieharder’].

Returns

list of actually installed tools, e.g. [‘pracrand’, ‘dieharder’].

Return type

list
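
One common way to perform this kind of check is to look each tool up on the PATH. The sketch below assumes that approach; the function name find_installed_tools is hypothetical, and the real implementation may detect tools differently.

```python
import shutil


def find_installed_tools(desired_tools):
    """Return the subset of ``desired_tools`` found on the PATH (hypothetical helper)."""
    installed = []
    for tool in desired_tools:
        # shutil.which returns None when no executable of that name is found.
        if shutil.which(tool) is not None:
            installed.append(tool)
        else:
            print(f"{tool} does not appear to be installed")
    return installed


if __name__ == "__main__":
    print(find_installed_tools(["pracrand", "ent", "dieharder"]))
```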

loto.x1_statistics.main()
loto.x1_statistics.run_tools(installed_tools, list_paths)

Starts a subprocess to run each requested tool, sequentially.

Each tool is run with sensible defaults.

Parameters

installed_tools (list) – list of the tools to be run, e.g. [‘ent’, ‘dieharder’].
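
Running tools one after the other can be sketched with the standard subprocess module. The function name run_commands is hypothetical, and the commands below are stand-ins: the actual options passed to each tool are not documented here.

```python
import subprocess
import sys


def run_commands(commands):
    """Run each command to completion before starting the next one.

    ``commands`` is a list of argument lists; the exit codes are returned in order.
    """
    exit_codes = []
    for args in commands:
        # subprocess.run blocks until the child process exits.
        completed = subprocess.run(args, capture_output=True, text=True)
        exit_codes.append(completed.returncode)
    return exit_codes


if __name__ == "__main__":
    # Stand-in commands using the current Python interpreter.
    print(run_commands([
        [sys.executable, "-c", "print('first')"],
        [sys.executable, "-c", "print('second')"],
    ]))
```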

x2: plots

Visualize the lottery numbers distribution with matplotlib.

Many different combinations of parameters can be tried here. Most graphs generated here are trying to be visually pleasing if not very helpful :-)

Should be run from the CLI (depending on how you installed it), e.g.:

$ python loto/core.py x2

But can also be run as a script:

$ python loto/x2_plots.py
loto.x2_plots.gen_area_plot(df)

Generates an area plot.

Parameters

df (dataframe) – pandas dataframe containing all the draw results for the lottery (dates/balls/stars).

Returns

a string containing either a success or failure message.

Return type

string

loto.x2_plots.gen_heatmap(df)

Generates a heatmap image.

The y-axis shows the draw dates and the x-axis shows the balls and stars. Generates a nice image of the distribution of the numbers more than anything else. The image is written to the images dir.

Parameters

df (dataframe) – pandas dataframe containing all the draw results for the lottery (dates/balls/stars).

Returns

a string containing either a success or failure message.

Return type

string

loto.x2_plots.gen_line_plot(df)

Generates a line plot.

Parameters

df (dataframe) – pandas dataframe containing all the draw results for the lottery (dates/balls/stars).

Returns

a string containing either a success or failure message.

Return type

string

loto.x2_plots.gen_pie_plot(df)

Generates pie plots.

Shows the distribution of numbers drawn per ball (e.g. how many times the number 42 was drawn for ball n°3).

Parameters

df (dataframe) – pandas dataframe containing all the draw results for the lottery (dates/balls/stars).

Returns

a string containing either a success or failure message.

Return type

string

loto.x2_plots.load_plots_dataframe(db_path, db_name)

Returns a pandas dataframe containing all the rows & columns of the numbers table.

Parameters
  • db_path (str) – path to the directory where the DB is stored.

  • db_name (str) – name of the DB.

Returns

one to one extraction of the table.

Return type

pandas dataframe

loto.x2_plots.main()

x3: OEIS

Query the On-line Encyclopedia of Integer Sequences.

For now we query with the numbers of the 10 latest draws (5 balls + 2 stars each). But many combinations can be tried: longer sequences, randomly picked sequences, etc.

Should be run from the CLI (depending on how you installed it), e.g.:

$ python loto/core.py x3

But can also be run as a script:

$ python loto/x3_oeis.py
loto.x3_oeis.balls_by_draws(draws)

Generates URLs used to query OEIS’ database.

Concatenates sequences of numbers in the format expected by the OEIS’ API.

Returns

list of URLs.

Return type

list
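
A minimal sketch of building such query URLs follows. It assumes the public OEIS search endpoint with a comma-separated q parameter and fmt=json; the helper name build_oeis_urls is hypothetical, and the exact URL format used by the package may differ.

```python
from urllib.parse import urlencode

# Assumed endpoint: the public OEIS search page, asked for JSON output.
OEIS_SEARCH = "https://oeis.org/search"


def build_oeis_urls(draws):
    """Build one search URL per draw (hypothetical helper)."""
    urls = []
    for draw in draws:
        # Join the numbers of one draw into a comma-separated query string.
        query = ",".join(str(number) for number in draw)
        urls.append(f"{OEIS_SEARCH}?{urlencode({'q': query, 'fmt': 'json'})}")
    return urls


if __name__ == "__main__":
    for url in build_oeis_urls([(15, 31, 40, 44, 48, 1, 12)]):
        print(url)
```

Note that urlencode percent-encodes the commas, which a standards-compliant server decodes back to the original query.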

loto.x3_oeis.check_sequences_oeis(urls, seconds)

Checks whether the given sequences return something from the OEIS.

Parameters
  • urls (list) – list of prepared urls to try.

  • seconds (float) – number of seconds to wait between requests.

loto.x3_oeis.get_latest_draws(db_path, db_name, nb_draws)

Creates sequences from the latest n draws.

Parameters
  • db_path (str) – path to the directory where the DB is stored.

  • db_name (str) – name of the DB.

Returns

list of tuples of ints [(15, 31, 40, 44, 48, 1, 12), (16, 1, 2, 7, 48, 1, 12), etc.].

Return type

list

loto.x3_oeis.main()

x4: CPT+

CPT+ - Use a Compact Prediction Tree Plus to predict the next winning numbers.

So many combinations are possible here. I made choices to limit the scope (what to put in and in what form/shape/order).

Should be run from the CLI (depending on how you installed it), e.g.:

$ python loto/core.py x4

But can also be run as a script:

$ python loto/x4_cpt_plus.py
loto.x4_cpt_plus.build_cptp_data(db_path, db_name)

Builds training sets files for the SPMF library.

The training set files are created from the DB. A training set file is in the format expected by the Java SPMF library: ‘ -1 ‘ between each number within a sequence and ‘ -2’ to end a sequence (see SPMF/CPT+ docs). The files are stored in the data/files folder.

We keep the numbers from the last two draws out of the training sets; instead we return them as lists to build the prediction set (balls_draw_m2) and the verification set (balls_draw_m1). Same for the star numbers.

Parameters
  • db_path (str) – path to the directory where the DB is stored.

  • db_name (str) – name of the DB.

Returns

dictionary of lists {‘paths’: [strings], ‘balls_draw_m1’: [ints], ‘balls_draw_m2’: [ints], ‘stars_draw_m1’: [ints], ‘stars_draw_m2’: [ints]}.

Return type

dict
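
The file format and the hold-out split described above can be sketched as follows. Both helper names are hypothetical. The sketch follows the wording of this page literally (separator between numbers, then the end marker), so check the SPMF documentation for the exact layout it expects. It also assumes that draws are ordered oldest first, that "m1" is the most recent draw and "m2" the one before it.

```python
def to_spmf_line(sequence):
    """Format one sequence with ' -1 ' between numbers and ' -2' at the end."""
    return " -1 ".join(str(number) for number in sequence) + " -2"


def split_draws(draws):
    """Keep the two most recent draws out of the training data.

    Assumes ``draws`` is ordered oldest first and holds at least two draws.
    Returns (training, draw_m2, draw_m1).
    """
    training, draw_m2, draw_m1 = draws[:-2], draws[-2], draws[-1]
    return training, list(draw_m2), list(draw_m1)


if __name__ == "__main__":
    history = [(1, 2, 3, 4, 5), (6, 7, 8, 9, 10), (15, 31, 40, 44, 48)]
    training, m2, m1 = split_draws(history)
    for sequence in training:
        print(to_spmf_line(sequence))
```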

loto.x4_cpt_plus.load_training_set(pkg, path, numbers=[])

Loads a file of training sequences.

Either simply load a training set file (in the tree) as is, or load a training set file appended with an additional sequence.

Parameters
  • pkg (jpype object) – common root of the java package from which we load the classes we need.

  • path (str) – filepath to the training file to load.

  • numbers (list) – optional, number sequence(s) to append to training file.

Returns

training set object.

Return type

jpype object

loto.x4_cpt_plus.main()
loto.x4_cpt_plus.make_predictions(pkg, prediction_model, previous_numbers)

Iterates over predictions until we get enough of them.

Make predictions for a given sequence.

Parameters
  • pkg (jpype object) – common root of the java package from which we load the classes we need.

  • prediction_model (jpype object) – CPT+ java object trained, used here to make predictions.

  • previous_numbers (list) – number sequence for which we want a prediction.

Returns

the predictions. A list of 5 numbers for the balls or a list of 2 for the stars.

Return type

list

loto.x4_cpt_plus.prepare_prediction_request(pkg, numbers)

Turns a python list into a java sequence, in order to query for a prediction.

Parameters
  • pkg (jpype object) – common root of the java package from which we load the classes we need.

  • numbers (list) – the sequence of numbers to use as a query/arg for the prediction.

Returns

sequence of numbers in their java format.

Return type

jpype object

loto.x4_cpt_plus.print_report(dict_nbs)

Print DIY mini-table with predictions.

x5: prophet

Prophet: ask FB’s Prophet to predict the next winning numbers.

Feed the balls/stars numbers as time series (one dataframe each) to Prophet. See what comes out.

Disappointing results for now, of course. Prophet seems to show me what I already knew: the gambler’s fallacy is real and on average a number between 1 and 50 is around 25 :-) Heavy computing to get there… Of course I’m joking here: there must be parameters to play with to get more satisfying results. So next step: play with the params and read the docs more deeply.

Should be run from the CLI (depending on how you installed it), e.g.:

$ python loto/core.py x5

But can also be run as a script:

$ python loto/x5_prophet.py
loto.x5_prophet.generate_plots(prophet, forecast, field)

Generates plots of Prophet’s forecast data with matplotlib.

Parameters
  • prophet (obj) – the prophet object instance.

  • forecast (dataframe) – dataframe containing prophet’s prediction.

  • field (str) – name of the field (ball_1, star_2,…).

loto.x5_prophet.load_prophet_dataframe(db_path, db_name, y_field)

Returns a pandas dataframe containing a time series with two columns: a date and a number.

Parameters
  • db_path (str) – path to the directory where the DB is stored.

  • db_name (str) – name of the DB.

  • y_field (str) – name of the number field to extract (ball_1, star_2,…).

Returns

dataframe: time series of dates[‘ds’] + numbers[‘y’].

loto.x5_prophet.main()
loto.x5_prophet.print_report(dict_nbs)

Print DIY mini-table with predictions.

class loto.x5_prophet.suppress_stdout_stderr

Bases: object

A context manager for doing a “deep suppression” of stdout and stderr in Python, i.e. it will suppress all printing, even if the print originates in a compiled C/Fortran sub-function. This will not suppress raised exceptions, since exceptions are printed to stderr just before a script exits, and after the context manager has exited (at least, I think that is why it lets exceptions through). See the corresponding issue on Prophet’s GitHub and the corresponding Stack Overflow question.
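
A widely used way to get this kind of deep suppression is to redirect the process-level file descriptors 1 and 2 to the null device. The sketch below assumes that technique; the class name SuppressOutput is hypothetical and the package's own class may differ in detail.

```python
import os
import sys


class SuppressOutput:
    """Redirect file descriptors 1 and 2 to os.devnull inside a ``with`` block."""

    def __enter__(self):
        # Flush Python-level buffers so earlier output is not lost or delayed.
        sys.stdout.flush()
        sys.stderr.flush()
        # Open the null device and keep copies of the real stdout/stderr.
        self._null_fd = os.open(os.devnull, os.O_RDWR)
        self._saved_fds = (os.dup(1), os.dup(2))
        os.dup2(self._null_fd, 1)
        os.dup2(self._null_fd, 2)
        return self

    def __exit__(self, exc_type, exc_value, traceback):
        # Flush anything written inside the block while it still goes to devnull.
        sys.stdout.flush()
        sys.stderr.flush()
        # Restore the original descriptors, then close the temporary ones.
        os.dup2(self._saved_fds[0], 1)
        os.dup2(self._saved_fds[1], 2)
        for fd in (self._null_fd, *self._saved_fds):
            os.close(fd)
        return False  # never swallow exceptions


if __name__ == "__main__":
    with SuppressOutput():
        os.write(1, b"this line is written to fd 1 and discarded\n")
    print("output is visible again")
```

Because the redirection happens at the file descriptor level, it also covers output written by compiled extensions that bypass sys.stdout.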

x6: serendipity

Try different kinds of randomness.

If nothing else works, going back to different sources of randomness to predict the next winning numbers. The processes are run simultaneously to add bullshit (via multiprocessing/manager).

Should be run from the CLI (depending on how you installed it), e.g.:

$ python loto/core.py x6

But can also be run as a script:

$ python loto/x6_serendipity.py
loto.x6_serendipity.anu_quantum_rng(url, queue)

Gets random numbers in hex format from ANU’s Quantum RNG.

Seven 128-bit hex numbers are requested. Each of them is used to seed the random function, one for each ball (5) and one for each star (2).

Parameters
  • url (str) – ANU’s Quantum RNG preformatted URL.

  • queue (obj) – multiprocessing manager queue.

Returns

the manager queue object appended with a dict containing the prediction.

5 balls (1, 50) and 2 stars (1, 12) {‘balls’: [12, 13, 16, 41, 11], ‘stars’: [12, 1]}

Return type

obj
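
The seeding step can be sketched without any network call, given seven hex strings that are assumed to have been fetched already. The helper name numbers_from_hex_seeds is hypothetical, and redrawing on duplicates is an assumption made so that the numbers of one draw are distinct.

```python
import random


def numbers_from_hex_seeds(hex_seeds):
    """Turn seven hex strings into 5 balls (1 to 50) and 2 stars (1 to 12)."""
    if len(hex_seeds) != 7:
        raise ValueError("expected seven hex strings")

    def draw(seeds, upper):
        picked = []
        for seed in seeds:
            # One independent generator per number, seeded with the hex value.
            rng = random.Random(int(seed, 16))
            number = rng.randint(1, upper)
            while number in picked:  # assumption: redraw on duplicates
                number = rng.randint(1, upper)
            picked.append(number)
        return picked

    return {"balls": draw(hex_seeds[:5], 50), "stars": draw(hex_seeds[5:], 12)}


if __name__ == "__main__":
    # Made-up 128-bit values standing in for the service's response.
    fake_seeds = [f"{index:02x}" * 16 for index in range(1, 8)]
    print(numbers_from_hex_seeds(fake_seeds))
```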

loto.x6_serendipity.main()
loto.x6_serendipity.nist_beacon(url, queue)

Gets latest pulse of entropy from NIST Beacon service.

The last pulse given by the beacon is used as a seed to generate random numbers for a “prediction”.

Parameters
  • url (str) – NIST Beacon service URL for last pulse.

  • queue (obj) – multiprocessing manager queue.

Returns

the manager queue object appended with a dict containing the prediction.

5 balls (1, 50) and 2 stars (1, 12) {‘balls’: [12, 13, 16, 41, 11], ‘stars’: [12, 1]}

Return type

obj

loto.x6_serendipity.os_random(queue)

Generates a prediction with /dev/urandom.

Parameters

queue (obj) – multiprocessing manager queue.

Returns

the manager queue object appended with a dict containing the prediction.

5 balls (1, 50) and 2 stars (1, 12) {‘balls’: [12, 13, 16, 41, 11], ‘stars’: [12, 1]}

Return type

obj
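
In Python, operating system randomness is available through random.SystemRandom, which is backed by os.urandom. The sketch below assumes that route and leaves the manager queue out; the function name os_random_prediction is hypothetical.

```python
import random


def os_random_prediction():
    """Pick 5 distinct balls (1 to 50) and 2 distinct stars (1 to 12)."""
    rng = random.SystemRandom()  # draws from the OS entropy source
    return {
        "balls": rng.sample(range(1, 51), 5),
        "stars": rng.sample(range(1, 13), 2),
    }


if __name__ == "__main__":
    print(os_random_prediction())
```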

loto.x6_serendipity.print_report(dict_nbs)

Print DIY mini-table with predictions.

loto.x6_serendipity.random_org(urls, queue)

Gets random sequences of numbers from random.org service.

The service is called twice: once for the balls (5 ints between 1 and 50) and once for the stars (2 ints between 1 and 12).

Parameters
  • urls (dict) – Random.org preformatted URLs.

  • queue (obj) – multiprocessing manager queue.

Returns

the manager queue object appended with a dict containing the prediction.

5 balls (1, 50) and 2 stars (1, 12) {‘balls’: [12, 13, 16, 41, 11], ‘stars’: [12, 1]}

Return type

obj

CLI wrapper

Command line interface wrapper to make running the different experiments and creating (first run) or refreshing the database a little friendlier for the user.

Should be run as a standalone script, depending on how you installed it, e.g.:

$ python loto/core.py
loto.core.lazy_load(slow_module)

Speed up CLI’s perceived responsiveness by lazy loading modules.

https://bugs.python.org/msg214954

Parameters

slow_module (str) – the module to load.
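
For reference, the standard library offers importlib.util.LazyLoader for this purpose. The sketch below follows the lazy import recipe from the importlib documentation; whether loto.core.lazy_load is implemented this way is an assumption, and the name lazy_import is hypothetical.

```python
import importlib.util
import sys


def lazy_import(name):
    """Return a module whose code only runs on first attribute access."""
    spec = importlib.util.find_spec(name)
    if spec is None:
        raise ImportError(f"no module named {name!r}")
    loader = importlib.util.LazyLoader(spec.loader)
    spec.loader = loader
    module = importlib.util.module_from_spec(spec)
    sys.modules[name] = module
    # With LazyLoader, exec_module only sets up the deferred load.
    loader.exec_module(module)
    return module


if __name__ == "__main__":
    colorsys = lazy_import("colorsys")  # nothing is executed yet
    print(colorsys.rgb_to_hsv(1.0, 0.0, 0.0))  # the real import happens here
```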

helpers

Common helper/utility functions.

Helpers for downloading files, listing files in a directory, decompressing zipped folders, reshaping data from a csv file, so it can be loaded in a sqlite3 DB.

loto.helpers.create_necessary_directories(path)

Given a path, creates all necessary folders to satisfy it.

If the path/folder already exists, nothing is done.

Parameters

path (str) – path to the directory(ies) to be created.
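
The behaviour described above matches what os.makedirs does with exist_ok=True. A minimal sketch, with a hypothetical function name:

```python
import os
import tempfile


def ensure_directories(path):
    """Create ``path`` and any missing parents; do nothing if it already exists."""
    os.makedirs(path, exist_ok=True)
    return path


if __name__ == "__main__":
    base = tempfile.mkdtemp()
    print(ensure_directories(os.path.join(base, "data", "files")))
```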

loto.helpers.decompress_files(path)

Decompresses all zip files in a folder.

Parameters

path (str) – path to the folder containing the zip files.
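
A minimal sketch using the standard zipfile module follows. It assumes the archives sit directly in the folder and are extracted in place; the function name unzip_all is hypothetical.

```python
import os
import tempfile
import zipfile


def unzip_all(path):
    """Extract every .zip archive found directly in ``path`` into ``path``."""
    extracted = []
    for name in sorted(os.listdir(path)):
        if name.lower().endswith(".zip"):
            with zipfile.ZipFile(os.path.join(path, name)) as archive:
                archive.extractall(path)
                extracted.extend(archive.namelist())
    return extracted


if __name__ == "__main__":
    # Build a small archive in a temporary folder, then extract it.
    folder = tempfile.mkdtemp()
    with zipfile.ZipFile(os.path.join(folder, "draws.zip"), "w") as archive:
        archive.writestr("draws.csv", "date;ball_1\n")
    print(unzip_all(folder))
```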

loto.helpers.download_files(urls, path)

Handles the downloading of lottery files.

Destroys and recreates the download folder. Downloads files from a given list of urls into that folder.

Parameters
  • urls (list) – list of urls of the files to be downloaded.

  • path (str) – path to the download directory.

loto.helpers.get_next_lottery_date()

Returns the date of the next lottery day.

Euromillions is drawn every Tuesday and Friday. So depending on the current day, it will return either next Tuesday or next Friday.

Returns

next lottery day, returned as a pandas timestamp.

Return type

timestamp
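
The Tuesday/Friday rule can be sketched with the standard datetime module. The function name next_draw_date is hypothetical, it returns a plain date rather than a pandas timestamp, and it assumes that a draw day itself does not count as the "next" one.

```python
import datetime

DRAW_WEEKDAYS = (1, 4)  # Monday is 0, so Tuesday is 1 and Friday is 4


def next_draw_date(today):
    """Return the first Tuesday or Friday strictly after ``today``."""
    candidate = today + datetime.timedelta(days=1)
    while candidate.weekday() not in DRAW_WEEKDAYS:
        candidate += datetime.timedelta(days=1)
    return candidate


if __name__ == "__main__":
    print(next_draw_date(datetime.date.today()))
```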

loto.helpers.get_numbers_as_columns(db_path, db_name)

Selects lottery numbers as columns of numbers.

For each column containing either a ball number or a star number, extracts all numbers to a list, sorted by date. The lists are returned as a dict of lists (of tuples).

Parameters
  • db_name (str) – name of the DB.

  • db_path (str) – path to the directory where the DB is stored.

Returns

dictionary of lists of tuples {‘column_name’: [(number, number,…)]}.

Return type

dict

loto.helpers.get_numbers_as_sequences(db_path, db_name)

Selects lottery numbers as sequences of numbers (in the order drawn).

Extracts all ball numbers in lists of five (one for each draw) and all star numbers in lists of two. Returns both lists in a dict.

Parameters
  • db_name (str) – name of the DB.

  • db_path (str) – path to the directory where the DB is stored.

Returns

dictionary of lists {‘balls’: [numbers], ‘stars’: [numbers]}.

Return type

dict

loto.helpers.list_files(path, ext)

Lists files in a folder for a specific extension.

Walks all files in a folder and returns a list containing the paths of the files in that folder for a specific file extension.

Parameters
  • path (str) – path to the directory to explore.

  • ext (str) – extension of the files to list.

Returns

A list containing paths.

Return type

list
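
Walking a folder and filtering by extension can be sketched with os.walk. The function name list_files_with_ext is hypothetical, and sorting the result is a choice made here for predictable output.

```python
import os
import tempfile


def list_files_with_ext(path, ext):
    """Return the paths of all files under ``path`` whose name ends with ``ext``."""
    matches = []
    for root, _dirs, files in os.walk(path):
        for name in files:
            if name.endswith(ext):
                matches.append(os.path.join(root, name))
    return sorted(matches)


if __name__ == "__main__":
    folder = tempfile.mkdtemp()
    for name in ("a.csv", "b.zip"):
        open(os.path.join(folder, name), "w").close()
    print(list_files_with_ext(folder, ".csv"))
```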

loto.helpers.load_db(numbers, db_path, db_name)

Loads the numbers in the database.

Destroys and recreates the DB folder. (Re)creates the DB. Loads given data in the DB.

Parameters
  • numbers (list) – euromillions winning numbers list.

  • db_name (str) – name given to the DB.

  • db_path (str) – path to the directory where the DB is stored.

loto.helpers.prepare_data(path)

Prepares data to be loaded in the database.

Extracts from the CSV files the data to be loaded in the DB. Returns the data in the shape we expect (as a Python list).

Parameters

path (str) – path to the folder containing the CSV files to prepare.

Returns

a list containing columns corresponding to DB expectations.

Return type

list

loto.helpers.spinning_cursor(seconds)

Shows a spinning cursor for a number of seconds.
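
A spinning cursor is typically drawn by cycling through a few characters and overwriting the previous one. The sketch below assumes that approach; the function name spin and its interval parameter are hypothetical.

```python
import itertools
import sys
import time


def spin(seconds, interval=0.1):
    """Animate a text spinner on stdout for roughly ``seconds`` seconds."""
    frames = itertools.cycle("|/-\\")
    deadline = time.monotonic() + seconds
    while time.monotonic() < deadline:
        sys.stdout.write(next(frames))
        sys.stdout.flush()
        time.sleep(interval)
        sys.stdout.write("\b")  # move back so the next frame overwrites this one


if __name__ == "__main__":
    spin(1)
```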