data.StratifiedSampler
- class data.StratifiedSampler(*args: Any, **kwargs: Any)
A custom sampler that performs stratified sampling based on a partition criterion.
Note: Choose num_bins small enough that few bins end up empty.
- Parameters:
data_source – The data source to be sampled from.
partition_criterion – A callable that takes the data source and returns a list of values used for partitioning.
num_samples – The total number of samples to be drawn from the data source.
num_bins – The number of bins to divide the partitioned values into. Defaults to 10.
replacement – Whether to sample with replacement. Defaults to True.
verbose – Whether to print verbose output during sampling. Defaults to True.
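Example: the following is a minimal, standard-library sketch of how the parameters above could interact. It is not the implementation of data.StratifiedSampler. The helper name stratified_indices, the seed argument, the use of equal-width bins, and the choice to weight every non-empty bin equally are all assumptions made for illustration, and only sampling with replacement is shown.

```python
import random
from typing import Callable, List, Sequence


def stratified_indices(
    data_source: Sequence,
    partition_criterion: Callable[[Sequence], List[float]],
    num_samples: int,
    num_bins: int = 10,
    seed: int = 0,  # hypothetical argument, added only to make the sketch reproducible
) -> List[int]:
    """Hypothetical helper: draw indices so every non-empty bin is equally likely."""
    values = partition_criterion(data_source)
    lo, hi = min(values), max(values)
    # Equal-width bins (an assumption); fall back to width 1.0 if all values are equal.
    width = (hi - lo) / num_bins or 1.0
    bins: List[List[int]] = [[] for _ in range(num_bins)]
    for idx, value in enumerate(values):
        # Clamp so the maximum value lands in the last bin.
        b = min(int((value - lo) / width), num_bins - 1)
        bins[b].append(idx)
    # Empty bins are skipped, which is why a very large num_bins adds little.
    non_empty = [b for b in bins if b]
    rng = random.Random(seed)
    # With replacement: pick a bin uniformly, then an index uniformly within it.
    return [rng.choice(rng.choice(non_empty)) for _ in range(num_samples)]


# Imbalanced toy data: 90 small values and 10 large ones.
data = [1.0] * 90 + [100.0] * 10
indices = stratified_indices(data, lambda d: list(d), num_samples=1000)
rare_fraction = sum(1 for i in indices if data[i] == 100.0) / len(indices)
```

In this toy data only two of the ten bins are populated, so the rare values, which make up 10% of the data source, are drawn far more often than uniform sampling would draw them.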