Deeplearning Module

This module contains the functions used for training, validating and testing the neural networks.

If the user intends to load pickles of saved DeepLearning objects or model pth files, it is important to remember that the models must be loaded in the same computational environment in which they were initialised, both in terms of parallelisation and of the processing units they are loaded on.

For example, a model that was trained on 16 GPUs in parallel must be loaded on 16 GPUs in parallel. This is a prerequisite of PyTorch's serialization routines.


Author: Oliver Boom
GitHub alias: OliverJBoom

class Foresight.deeplearning.DeepLearning(model, data_X, data_y, optimiser, batch_size=128, n_epochs=100, loss_function=<sphinx.ext.autodoc.importer._MockObject object>, device='cpu', seed=42, debug=True, disp_freq=20, fig_disp_freq=50, early_stop=True, early_verbose=False, patience=50, rel_tol=0, scaler_data_X=None, scaler_data_y=None)

Class to perform training and validation for a given model.

Parameters:

model (nn.Module) – The neural network model
data_X (np.array) – The training dataset
data_y (np.array) – The target dataset
n_epochs (int) – The number of training epochs
optimiser (torch.optim) – The type of optimiser used
batch_size (int) – The batch size
loss_function (torch.nn.modules.loss) – The loss function used
device (string) – The device to run on (CPU or CUDA)
seed (int) – The number used to set the random seeds
debug (bool) – Whether to print some parameters for checking
disp_freq (int) – The epoch frequency at which training/validation metrics are printed
fig_disp_freq (int) – The frequency at which training/validation prediction figures are produced
early_stop (bool) – Whether early stopping is used
early_verbose (bool) – Whether to print the early stopping counter
patience (int) – The number of epochs without improvement before stopping
rel_tol (float) – The relative improvement percentage that must be achieved
scaler_data_X (sklearn.preprocessing.data.MinMaxScaler) – The data X scaler object for inverse scaling
scaler_data_y (sklearn.preprocessing.data.MinMaxScaler) – The data y scaler object for inverse scaling

create_data_loaders(): Forms the iterators that feed in the data and labels.
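
As a rough illustration of what "iterators that feed in the data and labels" means, the sketch below yields aligned mini-batches from two sequences. It uses plain Python only; the method itself presumably builds torch DataLoader objects (the type named for train_loader and val_loader in this module), and the name batch_iterator is hypothetical.

```python
def batch_iterator(features, targets, batch_size=128):
    """Yield (features, targets) mini-batches in order.

    Hypothetical helper for illustration only; it is not part of Foresight.
    """
    if len(features) != len(targets):
        raise ValueError("features and targets must have the same length")
    for start in range(0, len(features), batch_size):
        end = start + batch_size
        # The final batch may be shorter than batch_size.
        yield features[start:end], targets[start:end]


# Five samples with a batch size of two give batches of 2, 2 and 1.
batches = list(batch_iterator([10, 11, 12, 13, 14], [0, 1, 0, 1, 0], batch_size=2))
```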

evaluate(model, test_loader)

Evaluates the performance of a given model on the given data.

Much of this code overlaps with validation. The two are kept separate only because doing so makes it easier to inspect attributes when running simulations.

Parameters:

model (nn.Module) – The model to evaluate
test_loader (torch.utils.data.dataloader.DataLoader) – The iterator that feeds in the data of choice
Returns:	The error metric for that dataset
Return type:	float

live_pred_plot(): Plots the training predictions, the validation predictions and the training/validation losses as they are produced.

size_check(): Checks the size of the datasets.

train(train_loader)

Performs a single training epoch and returns the loss metric for the training dataset.

Parameters:	train_loader (torch.utils.data.dataloader.DataLoader) – The iterator that feeds in the training data
Returns:	The error metric for that epoch
Return type:	float

train_val_test(): Splits the DataFrames into training, validation and test sets and creates torch tensors from the underlying numpy arrays.
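
The three-way split can be pictured with the minimal sketch below. It assumes an ordered (unshuffled) split and 60/20/20 proportions; neither assumption comes from this documentation, and the name ordered_split is hypothetical. The conversion to torch tensors is left out so that the example runs with the standard library alone.

```python
def ordered_split(data, train_frac=0.6, val_frac=0.2):
    """Split a sequence into train, validation and test parts, in order.

    The fractions are illustrative assumptions, not Foresight defaults.
    """
    n = len(data)
    train_end = int(n * train_frac)
    val_end = train_end + int(n * val_frac)
    # Whatever is left after the training and validation slices is the test set.
    return data[:train_end], data[train_end:val_end], data[val_end:]


train_part, val_part, test_part = ordered_split(list(range(10)))
```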

training_wrapper(): The wrapper that performs the training and validation.

validate(val_loader)

Evaluates the performance of the network on unseen validation data.

Parameters:	val_loader (torch.utils.data.dataloader.DataLoader) – The iterator that feeds in the validation data
Returns:	The error metric for that epoch
Return type:	float

class Foresight.deeplearning.EarlyStopping(patience, rel_tol, verbose=True)

Used to facilitate early stopping during the training of neural networks.

When called, if the validation metric has not improved by at least the relative tolerance set by the user, a counter is incremented. If the counter passes the patience value, the stop attribute is set to True. This should be used as a break condition in the training loop.

If rel_tol is set to 0, the metric only needs to improve on its existing value.

Parameters:

patience (int) – The number of epochs without improvement before stopping
rel_tol (float) – The relative improvement % that must be achieved
verbose (bool) – Whether to print the count number
best (float) – The best score achieved so far
counter (int) – The number of epochs without improvement so far
stop (bool) – Whether stopping criteria is achieved
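
The behaviour described above can be sketched as follows. This is an illustrative stand-in rather than the Foresight implementation. It assumes that a lower score is better (a loss), that rel_tol is expressed as a percentage, and that stopping triggers once the counter reaches patience; the class name EarlyStoppingSketch is hypothetical.

```python
class EarlyStoppingSketch:
    """Minimal early-stopping helper (hypothetical, for illustration only)."""

    def __init__(self, patience, rel_tol, verbose=True):
        self.patience = patience
        self.rel_tol = rel_tol  # assumed to be a percentage, e.g. 1.0 means 1 %
        self.verbose = verbose
        self.best = None        # best (lowest) score seen so far
        self.counter = 0        # epochs without sufficient improvement
        self.stop = False

    def __call__(self, score):
        if self.best is None:
            improved = True
        else:
            # The score must fall below the best by at least rel_tol percent.
            improved = score < self.best * (1 - self.rel_tol / 100)
        if improved:
            self.best = score
            self.counter = 0
        else:
            self.counter += 1
            if self.verbose:
                print(f"Early stopping counter: {self.counter}/{self.patience}")
            if self.counter >= self.patience:
                self.stop = True


# Use the stop attribute as the break condition of the training loop.
val_losses = [1.0, 0.9, 0.95, 0.93, 0.5]
stopper = EarlyStoppingSketch(patience=2, rel_tol=0, verbose=False)
epochs_run = 0
for val_loss in val_losses:
    epochs_run += 1
    stopper(val_loss)
    if stopper.stop:
        break
```

With a patience of 2, the loop ends after the fourth epoch and never sees the final, better score. That is the trade-off a small patience value makes.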

Foresight.deeplearning.full_save(model, model_name, optimiser, num_epoch, learning_rate, momentum, weight_decay, use_lg_returns, PCA_used, data_X, train_loss, val_loss, test_loss, train_time, hidden_dim, mse, mae, mde, path)

Saves the model's run details and hyper-parameters to a CSV file.

Parameters:

model (nn.Module) – The model run
model_name (string) – The name the model is saved under
optimiser (torch.optim) – The optimiser type used
num_epoch (int) – The number of epochs run for
learning_rate (float) – The learning rate learning hyper-parameter
momentum (float) – The momentum learning hyper-parameter
weight_decay (float) – The weight decay learning hyper-parameter
use_lg_returns (bool) – Whether log returns were used
PCA_used (bool) – Whether PCA was used
data_X (np.array) – The training dataset (used to save the shape)
train_loss (float) – The loss on the training dataset
val_loss (float) – The loss on the validation dataset
test_loss (float) – The loss on the test dataset
train_time (float) – The amount of time to train
hidden_dim (int) – The number of neurons in the hidden layers
mse (float) – The mean squared error metric
mae (float) – The mean absolute error metric
mde (float) – The mean direction error metric
path (string) – The directory path to save in
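
One way to log a run to a CSV file is sketched below using the standard csv module. The column subset, the file name and the function name append_run_details are assumptions made for illustration; the layout that full_save actually writes is not documented here.

```python
import csv
import os


def append_run_details(row, csv_path):
    """Append one run (a dict of column name to value) to a CSV file.

    Hypothetical helper: the header is written only when the file is new or empty.
    """
    write_header = not os.path.exists(csv_path) or os.path.getsize(csv_path) == 0
    with open(csv_path, "a", newline="") as handle:
        writer = csv.DictWriter(handle, fieldnames=list(row))
        if write_header:
            writer.writeheader()
        writer.writerow(row)
```

Calling it once per run, for example append_run_details({"model_name": "lstm_a", "num_epoch": 100, "val_loss": 0.12}, "runs.csv"), builds up one row per experiment under a single header.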

Foresight.deeplearning.model_load(model_name, device, path='../Results/Pths/')

Loading function for the models.

Parameters:

model_name (string) – The name of the model to load
device (string) – The device to run on (CPU or CUDA)
path (string) – The directory path to load the model from

Foresight.deeplearning.model_save(model, name, path='../Results/Pths/')

Saving function for the model.

Parameters:

model (nn.Module) – The model to save
name (string) – The name to save the model under
path (string) – The directory path to save the model in

Foresight.deeplearning.param_strip(param)

Strips the key text info out of certain parameters. Used to save the text info of which model/optimiser objects are used.

Parameters:	param (object) – The parameter object to find the name of
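
A plausible reading of "strips the key text info" is taking the leading name from an object's printed form. The sketch below does that by cutting the string at the first opening parenthesis. This is an assumption about the approach, not the documented behaviour, and both strip_name and FakeOptimiser are hypothetical.

```python
def strip_name(param):
    """Return the text before the first '(' in an object's string form."""
    return str(param).split("(", 1)[0].strip()


class FakeOptimiser:
    """Stand-in object whose printed form starts with a name and a parenthesis."""

    def __repr__(self):
        return "Adam (\nParameter Group 0\n    lr: 0.001\n)"


optimiser_name = strip_name(FakeOptimiser())
```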

Foresight.deeplearning.set_seed(seed)

Sets the random seeds to ensure deterministic behaviour.

Parameters:	seed (int) – The number that is set for the random seeds
Returns:	Confirmation that seeds have been set
Return type:	bool
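
A seeding helper of this kind typically looks like the sketch below. Which generators the real set_seed touches is not stated in this documentation, so the choice of Python's random module plus NumPy and PyTorch (each only if installed) is an assumption, as is the name set_seed_sketch.

```python
import random


def set_seed_sketch(seed):
    """Seed the common random number generators and return True.

    Illustrative only; the generators seeded here are an assumption.
    """
    random.seed(seed)
    try:
        import numpy as np
        np.random.seed(seed)
    except ImportError:
        pass  # NumPy not installed; skip it
    try:
        import torch
        torch.manual_seed(seed)
        torch.cuda.manual_seed_all(seed)  # ignored when CUDA is unavailable
        # Ask cuDNN for deterministic kernels and disable its autotuner.
        torch.backends.cudnn.deterministic = True
        torch.backends.cudnn.benchmark = False
    except ImportError:
        pass  # PyTorch not installed; skip it
    return True


# Seeding twice with the same value reproduces the same draw.
set_seed_sketch(42)
first_draw = random.random()
set_seed_sketch(42)
second_draw = random.random()
```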