seq_models



class EncoderDecoderVectorRegressionModel(cuda: bool, history_sequence_column_name: str, history_sequence_vectoriser: SequenceVectoriser, history_sequence_variable_length: bool, target_sequence_column_name: str, target_sequence_vectoriser: SequenceVectoriser, latent_dim: int, encoder_factory: EncoderFactory, decoder_factory: DecoderFactory, nn_optimiser_params: Optional[NNOptimiserParams] = None)

Bases: TorchVectorRegressionModel

A highly general encoder-decoder sequence model, which encodes a sequence of history items and uses the encoding to make predictions for one or more target sequence items. History and target sequences are converted to vectors via vectorisers.
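
The data flow described above can be sketched in plain Python. This is only a toy illustration under stated assumptions, not the library's implementation: the functions `encode` and `decode` are hypothetical, the inputs are assumed to be already vectorised, and the real encoder and decoder are torch modules created by the configured factories.

```python
from typing import List

Vector = List[float]


def encode(history: List[Vector], latent_dim: int) -> Vector:
    """Toy encoder (hypothetical): mean-pool the history vectors, then
    truncate or zero-pad the result to latent_dim entries."""
    if not history:
        return [0.0] * latent_dim
    dim = len(history[0])
    pooled = [sum(item[i] for item in history) / len(history) for i in range(dim)]
    return (pooled + [0.0] * latent_dim)[:latent_dim]


def decode(latent: Vector, target_features: List[Vector]) -> List[float]:
    """Toy decoder (hypothetical): one prediction per target item, computed
    from the latent vector and that item's feature vector."""
    return [sum(latent) + sum(features) for features in target_features]


history = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]  # three vectorised history items
targets = [[0.5], [1.5]]                        # two vectorised target items
latent = encode(history, latent_dim=4)
predictions = decode(latent, targets)
```

The point of the sketch is the shape of the computation: a history of any length is reduced to a single vector of size `latent_dim`, and the decoder then yields one prediction per target item.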

Parameters:
  • cuda – whether to use a CUDA device

  • history_sequence_column_name – the name of the data frame input column which contains the history sequences to be encoded. Each entry in the column must be a sequence of items that can be converted to vectors via history_sequence_vectoriser

  • history_sequence_vectoriser – a vectoriser which converts history sequence items to vectors

  • history_sequence_variable_length – whether history sequences can be of variable length

  • target_sequence_column_name – the name of the column containing the target item sequences. Note that the column must contain sequences even if predictions are to be made for only a single target item; in that case, use a column whose entries are single-item lists

  • target_sequence_vectoriser – the vectoriser which generates feature vectors for the target items

  • latent_dim – the number of latent dimensions to be used by the encoder

  • encoder_factory – a factory for the creation of the encoder, which takes sequence items from the history and encodes them into vectors of dimension latent_dim

  • decoder_factory – a factory for the creation of the decoder component, which takes a latent vector produced by the encoder and (a sequence of) target features to make predictions

  • nn_optimiser_params – the optimiser parameters
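
The requirement that the target column hold sequences even for a single target item can be met with a small preprocessing step. The following is a minimal sketch using plain Python rows; the helper `wrap_single_targets` and the column names `history` and `target` are hypothetical, and in actual use the data would live in columns of a pandas data frame.

```python
from typing import Any, Dict, List


def wrap_single_targets(rows: List[Dict[str, Any]], target_column: str) -> List[Dict[str, Any]]:
    """Return copies of the rows in which a scalar target value is wrapped
    in a one-element list (hypothetical helper, not part of the library)."""
    wrapped = []
    for row in rows:
        new_row = dict(row)
        value = new_row[target_column]
        # Values that are already lists or tuples are left untouched.
        if not isinstance(value, (list, tuple)):
            new_row[target_column] = [value]
        wrapped.append(new_row)
    return wrapped


# Hypothetical rows: "history" holds variable-length sequences,
# "target" holds a single item per row.
rows = [
    {"history": [0.1, 0.4, 0.3], "target": 0.5},
    {"history": [0.7, 0.2], "target": 0.9},
]
prepared = wrap_single_targets(rows, "target")
```

After this step every entry of the target column is a list, so the same column layout works for one target item and for several.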

class InputTensoriser(history_sequence_column_name: str, history_sequence_vectoriser: SequenceVectoriser, target_sequence_column_name: str, target_sequence_vectoriser: SequenceVectoriser)

Bases: Tensoriser

fit(df: pandas.DataFrame, model=None)
Parameters:
  • df – the data frame with which to fit this tensoriser

  • model – the model in the context of which the fitting takes place (if any). The fitting process may set parameters within the model that can only be determined from the (pre-tensorised) data.
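
To illustrate how fitting can set a model parameter that is only known from the data, here is a minimal sketch. `ToyModel` and `ToyPaddingTensoriser` are hypothetical stand-ins, not the library's `Tensoriser` interface, and the choice of the maximum history length as the data-dependent parameter is an assumption made for the example.

```python
from typing import List, Optional, Sequence, Tuple


class ToyModel:
    """Hypothetical stand-in for a model holding a data-dependent parameter."""

    def __init__(self) -> None:
        self.max_history_length: Optional[int] = None


class ToyPaddingTensoriser:
    """Hypothetical tensoriser-like helper that pads sequences to a common length."""

    def __init__(self) -> None:
        self.max_length = 0

    def fit(self, histories: Sequence[Sequence[float]], model: Optional[ToyModel] = None) -> None:
        # The maximum length can only be determined once the data has been seen.
        self.max_length = max((len(h) for h in histories), default=0)
        if model is not None:
            model.max_history_length = self.max_length

    def transform(self, histories: Sequence[Sequence[float]]) -> Tuple[List[List[float]], List[int]]:
        # Truncate to max_length, right-pad with zeros, and report the true lengths.
        padded, lengths = [], []
        for h in histories:
            items = list(h)[: self.max_length]
            lengths.append(len(items))
            padded.append(items + [0.0] * (self.max_length - len(items)))
        return padded, lengths


histories = [[1.0, 2.0, 3.0], [4.0]]
model = ToyModel()
tensoriser = ToyPaddingTensoriser()
tensoriser.fit(histories, model=model)
padded, lengths = tensoriser.transform(histories)
```

Returning the original lengths alongside the padded data is a common way to let a downstream module ignore the padding when histories vary in length.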

class EncoderDecoderModel(parent: EncoderDecoderVectorRegressionModel)

Bases: TorchModel

create_torch_module() → torch.nn.Module

class EncoderDecoderVectorClassificationModel(output_mode: ClassificationOutputMode, cuda: bool, history_sequence_column_name: str, history_sequence_vectoriser: SequenceVectoriser, history_sequence_variable_length: bool, latent_dim: int, encoder_factory: EncoderFactory, decoder_factory: DecoderFactory, nn_optimiser_params: Optional[NNOptimiserParams] = None, class_weights: Optional[Dict[Hashable, float]] = None)

Bases: TorchVectorClassificationModel

A highly general encoder-decoder sequence model, which encodes a sequence of history items and uses the encoding to make predictions for one or more target sequence items. History input sequences are converted to vectors via a vectoriser.

Parameters:
  • output_mode – the output mode defining the semantics of the outputs produced by the decoder

  • cuda – whether to use a CUDA device

  • history_sequence_column_name – the name of the data frame input column which contains the history sequences to be encoded. Each entry in the column must be a sequence of items that can be converted to vectors via history_sequence_vectoriser

  • history_sequence_vectoriser – a vectoriser which converts history sequence items to vectors

  • history_sequence_variable_length – whether history sequences can be of variable length

  • latent_dim – the number of latent dimensions to be used by the encoder

  • encoder_factory – a factory for the creation of the encoder, which takes sequence items from the history and encodes them into vectors of dimension latent_dim

  • decoder_factory – a factory for the creation of the decoder component, which takes a latent vector produced by the encoder and (a sequence of) target features to make predictions

  • nn_optimiser_params – the optimiser parameters

  • class_weights – a mapping from class labels to weights (which will be applied to the default loss evaluator, provided that it is not overridden in nn_optimiser_params)
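
A class_weights mapping can be derived from the training labels in many ways. The sketch below uses inverse-frequency weighting, n_samples / (n_classes * class_count), which is the formula scikit-learn uses for its "balanced" mode. The helper name is hypothetical and this heuristic is one option among several, not something the library prescribes.

```python
from collections import Counter
from typing import Dict, Hashable, Iterable


def inverse_frequency_class_weights(labels: Iterable[Hashable]) -> Dict[Hashable, float]:
    """Weight each class by n_samples / (n_classes * class_count), so that
    rare classes receive larger weights (hypothetical helper)."""
    counts = Counter(labels)
    n_samples = sum(counts.values())
    n_classes = len(counts)
    return {label: n_samples / (n_classes * count) for label, count in counts.items()}


labels = ["a", "a", "a", "b"]
class_weights = inverse_frequency_class_weights(labels)
```

The resulting dictionary has the shape expected by the class_weights parameter, mapping each class label to a float.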

class InputTensoriser(history_sequence_column_name: str, history_sequence_vectoriser: SequenceVectoriser)

Bases: Tensoriser

fit(df: pandas.DataFrame, model=None)
Parameters:
  • df – the data frame with which to fit this tensoriser

  • model – the model in the context of which the fitting takes place (if any). The fitting process may set parameters within the model that can only be determined from the (pre-tensorised) data.

class EncoderDecoderModel(parent: EncoderDecoderVectorClassificationModel)

Bases: VectorTorchModel

create_torch_module_for_dims(input_dim: int, output_dim: int) → EncoderDecoderModule
Parameters:
  • input_dim – the number of input dimensions as reported by the data set provider (number of columns in input data frame for default providers)

  • output_dim – the number of output dimensions as reported by the data set provider (for default providers, this will be the number of columns in the output data frame or, for classification, the number of classes)

Returns:

the torch module