armed.crossvalidation.grouped_cv

Custom scikit-learn KFolds for grouped AND stratified splitting, created by hermidalc: https://github.com/scikit-learn/scikit-learn/issues/13621#issuecomment-656094573 (12/28/2020)

Classes

StratifiedGroupKFold(*args, **kwargs)

Stratified K-Folds iterator variant with non-overlapping groups.

StratifiedGroupShuffleSplit(*args, **kwargs)

Stratified GroupShuffleSplit cross-validator.

Provides randomized train/test indices to split data according to a third-party provided group. This group information can be used to encode arbitrary domain-specific stratifications of the samples as integers.

This cross-validation object is a merge of GroupShuffleSplit and StratifiedShuffleSplit, which returns randomized folds stratified by group class. The folds are made by preserving the percentage of groups for each class.

Note: like the StratifiedShuffleSplit strategy, stratified random group splits do not guarantee that all folds will be different, although this is still very likely for sizeable datasets.

Read more in the User Guide.

:param n_splits: Number of re-shuffling and splitting iterations.
:type n_splits: int, default=5
:param test_size: If float, should be between 0.0 and 1.0 and represent the proportion of groups to include in the test split (rounded up). If int, represents the absolute number of test groups. If None, the value is set to the complement of the train size. By default, the value is set to 0.1.
:type test_size: float, int, None, default=None
:param train_size: If float, should be between 0.0 and 1.0 and represent the proportion of the groups to include in the train split. If int, represents the absolute number of train groups. If None, the value is automatically set to the complement of the test size.
:type train_size: float, int, or None, default=None
:param random_state: If int, random_state is the seed used by the random number generator; if RandomState instance, random_state is the random number generator; if None, the random number generator is the RandomState instance used by np.random.
:type random_state: int, RandomState instance or None, default=None
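The idea of folds "stratified by group class" can be illustrated with a simplified, standard-library-only sketch. It is not the module's implementation: the helper name `stratified_group_shuffle_split` and its `seed` argument are hypothetical, it assumes every sample in a group shares one class label, it only handles a float `test_size`, and it rounds the test count up per class rather than over all groups.

```python
import math
import random
from collections import defaultdict


def stratified_group_shuffle_split(y, groups, n_splits=5, test_size=0.1, seed=None):
    """Hypothetical helper: yield (train_idx, test_idx) lists of sample indices."""
    # Assumption: each group maps to exactly one class label.
    group_class = {}
    for label, group in zip(y, groups):
        if group_class.setdefault(group, label) != label:
            raise ValueError(f"group {group!r} mixes class labels")

    # Collect the groups belonging to each class.
    by_class = defaultdict(list)
    for group, label in group_class.items():
        by_class[label].append(group)

    rng = random.Random(seed)
    for _ in range(n_splits):
        test_groups = set()
        for members in by_class.values():
            # Proportion of this class's groups sent to the test split, rounded up.
            n_test = math.ceil(test_size * len(members))
            test_groups.update(rng.sample(members, n_test))
        # Whole groups go to one side, so train and test never share a group.
        train_idx = [i for i, g in enumerate(groups) if g not in test_groups]
        test_idx = [i for i, g in enumerate(groups) if g in test_groups]
        yield train_idx, test_idx


# Toy data: 20 groups of 2 samples each, classes alternating by group.
groups = [f"g{i // 2}" for i in range(40)]
y = [int(g[1:]) % 2 for g in groups]
splits = list(stratified_group_shuffle_split(y, groups, n_splits=3, test_size=0.2, seed=0))
```

Because each iteration reshuffles independently, two iterations may select the same test groups, which matches the note above that the folds are not guaranteed to differ.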