Dataset
Classification_problem
- class gaggle.problem.dataset.classification_problem.ClassificationProblem(problem_args: ProblemArgs = None, sys_args: SysArgs = None)[source]
Bases:
Problem
A Problem that represents a standard machine learning classification problem. It stores the associated training and validation datasets. Population evaluation is optimized for GPU by default to speed up training. To create a classification problem with a custom dataset, register that dataset in the DatasetFactory.
- evaluate(individual: Individual, train: bool = True, *args, **kwargs) float [source]
Evaluates an individual on the current batch of data.
- Parameters:
individual – the individual to evaluate
train – whether we are currently training or performing inference
*args –
**kwargs –
Returns:
- evaluate_population(population_manager: PopulationManager, use_freshness: bool = True, update_manager: bool = True, train: bool = True, *args, **kwargs) dict[int, float] [source]
Evaluates the whole population. This method is optimized for GPU by default to speed up training, so modifying it is usually not recommended; override it only if specific custom behavior is required.
- Parameters:
population_manager –
use_freshness –
update_manager –
train – whether we are currently training or performing inference
*args –
**kwargs –
- Returns:
The dictionary of individual fitnesses
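As an illustration of consuming the returned dictionary, the sketch below assumes that the keys are integer individual indices, the values are float fitnesses, and a higher fitness is better. None of these assumptions is confirmed by the description above, and the fitness values are invented for the example.

```python
from typing import Dict, Tuple

# Hypothetical result shaped like the dictionary returned by
# evaluate_population: individual index -> fitness (values invented here).
fitnesses: Dict[int, float] = {0: 0.61, 1: 0.87, 2: 0.74}


def best_individual(fitnesses: Dict[int, float]) -> Tuple[int, float]:
    """Return (index, fitness) of the highest-fitness individual.

    Assumes higher fitness is better; use min() instead if fitness is a loss.
    """
    index = max(fitnesses, key=fitnesses.get)
    return index, fitnesses[index]


best_index, best_fitness = best_individual(fitnesses)
```

If the fitness is a loss rather than a score, the same helper works with `min` in place of `max`.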
Dataset
- class gaggle.problem.dataset.dataset.DataWrapper(data: Tensor = None, targets: Tensor = None)[source]
Bases:
Dataset
Wrapper that sets the .data and .targets attributes, which the Dataset class then accesses in its get_data_and_targets method.
See also
This class creates attributes that are used by Dataset.get_data_and_targets.
- class gaggle.problem.dataset.dataset.Dataset(problem_args: ProblemArgs = None, train: bool = True, sys_args: SysArgs = None)[source]
Bases:
Dataset, ABC
Dataset class that allows for more flexible custom indexing and other custom behavior.
- get_data_and_targets()[source]
Gets the data and the targets for the current dataset stored in the self.data object. The self.data object should have .data and .targets attributes to be returned.
- Returns:
A tuple containing (data, targets) or (None, None) if the dataset is not initialized.
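The contract described above can be sketched with plain Python stand-ins. `WrapperSketch` and `DatasetSketch` are hypothetical names that only mimic the documented behavior; they are not the library's classes, and plain lists stand in for tensors so the example runs without extra dependencies.

```python
from typing import Any, Optional, Tuple


class WrapperSketch:
    """Stand-in for DataWrapper: only stores .data and .targets."""

    def __init__(self, data: Any = None, targets: Any = None):
        self.data = data
        self.targets = targets


class DatasetSketch:
    """Stand-in for Dataset, showing only the get_data_and_targets contract."""

    def __init__(self, wrapped: Optional[WrapperSketch] = None):
        # self.data holds an object exposing .data and .targets, or None.
        self.data = wrapped

    def get_data_and_targets(self) -> Tuple[Any, Any]:
        if self.data is None:
            return None, None  # dataset is not initialized
        return self.data.data, self.data.targets


# Lists are used here in place of tensors (an assumption for portability).
initialized = DatasetSketch(WrapperSketch(data=[[0.0, 1.0], [1.0, 0.0]], targets=[1, 0]))
empty = DatasetSketch()
```

Callers can unpack the result directly, for example `data, targets = initialized.get_data_and_targets()`, and check for `None` before use.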
Dataset_factory
- class gaggle.problem.dataset.dataset_factory.DatasetFactory[source]
Bases:
object
Factory that generates pre-existing available datasets. DatasetFactory.datasets stores said datasets as a dictionary with their name as key and the uninitialized Dataset object as value.
See also
Dataset Class
- datasets = {'CIFAR10': <class 'gaggle.problem.dataset.base_datasets.cifar10.CIFAR10'>, 'MNIST': <class 'gaggle.problem.dataset.base_datasets.mnist.MNIST'>}
- static from_data(data: Tensor, targets: Tensor, train: bool = True, seed: int = 1337) Dataset [source]
Creates a basic dataset object from given data and targets with basic arguments.
- Parameters:
data – data tensor
targets – target/label tensor
train – whether it is a training or evaluation dataset
seed – seed for the randomness of the batch sampling
- Returns:
A Dataset object.
- classmethod from_problem_args(problem_args: ProblemArgs = None, train: bool = True, sys_args: SysArgs = None) Dataset [source]
Initializes the requested dataset from the dictionary of available datasets.
This is done by using the attribute problem_args.dataset_name as the lookup key to DatasetFactory.datasets.
- Parameters:
problem_args – problem args that will be used to build the Dataset
train – whether we should return the training or evaluation dataset
sys_args – system args
- Returns:
A Dataset object.
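The name-based lookup described above can be sketched as follows. `FactorySketch` and `ToyDataset` are hypothetical stand-ins, not the library's classes, and registering a custom dataset by plain dictionary assignment is an assumption based on DatasetFactory.datasets being a dictionary; the real registration mechanism may differ.

```python
from types import SimpleNamespace


class ToyDataset:
    """Hypothetical custom dataset; a real one would subclass the Dataset class."""

    def __init__(self, problem_args=None, train=True, sys_args=None):
        self.problem_args = problem_args
        self.train = train
        self.sys_args = sys_args


class FactorySketch:
    """Stand-in for DatasetFactory: maps names to uninitialized dataset classes."""

    datasets = {}

    @classmethod
    def from_problem_args(cls, problem_args=None, train=True, sys_args=None):
        # Use problem_args.dataset_name as the lookup key, then instantiate.
        dataset_cls = cls.datasets[problem_args.dataset_name]
        return dataset_cls(problem_args=problem_args, train=train, sys_args=sys_args)


# Registration is assumed to be a plain dictionary assignment.
FactorySketch.datasets["TOY"] = ToyDataset

args = SimpleNamespace(dataset_name="TOY")  # stand-in for ProblemArgs
train_set = FactorySketch.from_problem_args(args, train=True)
eval_set = FactorySketch.from_problem_args(args, train=False)
```

In this sketch an unregistered name raises `KeyError`; how the real factory reports an unknown dataset name is not stated here.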