Cross-validators with a high number of folds don't return predicted training data with all folds equally represented.
The issue is best explained with an example. Given the following cross-validator setup (Tr = training, Te = testing, Va = validation):
slice #:| 1. | 2. | 3. | 4. | 5. |
fold 0: | Tr | Tr | Tr | Te | Va |
fold 1: | Tr | Tr | Te | Va | Tr |
fold 2: | Tr | Te | Va | Tr | Tr |
fold 3: | Te | Va | Tr | Tr | Tr |
fold 4: | Va | Tr | Tr | Tr | Te |
HepNet.predict(cv='train') will return predicted train data in which slice 1 of the dataset was passed through the network trained in fold 2, slices 2-4 through the network from fold 4, and slice 5 through the network from fold 3, because the predictions are overwritten while looping over the folds. As a result, the networks trained in fold 0 and fold 1 are not represented at all. They are represented when predicting the validation or testing set, however, which prevents a proper comparison between the network outputs on the train/test/val sets over the full dataset.
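A minimal sketch of the overwrite behaviour (not HepNet's actual code; the slice/fold layout below just mirrors the table above, with 0-indexed slices):

```python
import numpy as np

# Hypothetical 5-slice dataset; train_slices[fold] lists which slices
# belong to the training set in that fold (from the table above).
n_slices = 5
train_slices = {
    0: [0, 1, 2],
    1: [0, 1, 4],
    2: [0, 3, 4],
    3: [2, 3, 4],
    4: [1, 2, 3],
}

# One prediction per slice, shared across all folds.
predictions = np.full(n_slices, np.nan)
source_fold = np.full(n_slices, -1)

for fold, slices in train_slices.items():
    for s in slices:
        # Every fold writes into the same array, so later folds
        # overwrite whatever earlier folds predicted for that slice.
        predictions[s] = fold  # stand-in for the fold's network output
        source_fold[s] = fold

# source_fold ends up as [2, 4, 4, 4, 3]:
# the networks from folds 0 and 1 never survive the loop.
print(source_fold)
```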