
WIP: Select training folds so they are equally represented in the predicted data

Open Frank Sauerburger requested to merge improve-predict-step-in-cv into master
1 unresolved thread

Closes #68

Edited by Frank Sauerburger

Merge request reports


Approval is optional
Ready to merge by members who can write to the target branch.
  • The source branch is 60 commits behind the target branch.
  • 6 commits and 1 merge commit will be added to master.
  • Source branch will be deleted.

Activity

        # if we select the slices for training, we are done
        if not for_predicting:
            return all_slices_for_folds[fold_i]

        # all_slices_for_folds looks e.g. like:
        # [[0, 1, 2], [0, 1, 4], [0, 3, 4], [2, 3, 4], [1, 2, 3]]
        # need to select an array with unique entries:
        # [0, 1, 2, 4, 3]
        uniq_el = lambda ar: set(x for l in ar for x in l)
        exclusive_slices = []
        for i, slices in enumerate(all_slices_for_folds):
            for sl in slices:
                if sl not in exclusive_slices and sl in uniq_el(all_slices_for_folds[i:]):
                    exclusive_slices.append(sl)
        return exclusive_slices[fold_i]

    def select_training(self, df, fold_i, for_predicting=False):
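
For reference, the exclusive-slice assignment above can be exercised on the example from the code comments. The function name below is illustrative and not part of the merge request; only the selection logic is copied from the diff.

# Standalone sketch of the exclusive-slice assignment from the diff above.
# The function name is illustrative; the selection logic mirrors the diff.
def assign_exclusive_slices(all_slices_for_folds):
    # Flatten the per-fold slice lists into one ordered list in which every
    # slice index appears at most once; entry fold_i is the slice that the
    # model of fold fold_i predicts on.
    uniq_el = lambda ar: set(x for l in ar for x in l)
    exclusive_slices = []
    for i, slices in enumerate(all_slices_for_folds):
        for sl in slices:
            if sl not in exclusive_slices and sl in uniq_el(all_slices_for_folds[i:]):
                exclusive_slices.append(sl)
    return exclusive_slices

# Example from the code comments: five folds, five slices.
folds = [[0, 1, 2], [0, 1, 4], [0, 3, 4], [2, 3, 4], [1, 2, 3]]
print(assign_exclusive_slices(folds))  # prints [0, 1, 2, 4, 3]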
  • This no longer depends on the type of CV and could be moved to the abstract parent class. I wonder if there will ever be cases where something similar needs to be done for the validation or test set. Then one could abstract the select_training function and define select_training/validation/test_slices functions for the individual CVs.

    By Benjamin Paul Jaeger on 2020-11-07T01:01:44 (imported from GitLab)
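
A hypothetical sketch of the refactoring proposed in the comment above: the CV-independent selection logic lives in the abstract parent class, while each concrete CV only defines which slices belong to the training, validation, or test set. All class and method names here are illustrative and not taken from the repository, and the data frame is assumed to carry a 'slice' column.

# Hypothetical sketch of the suggested abstraction. Class names, method
# names, and the 'slice' column are assumptions, not repository code.
class CrossValidatorBase:
    """Abstract parent class holding the CV-independent selection logic."""

    def select_training_slices(self, fold_i, for_predicting=False):
        # Concrete CV classes return the slice indices of fold fold_i's
        # training set (or its single exclusive slice when predicting).
        raise NotImplementedError

    def select_validation_slices(self, fold_i):
        raise NotImplementedError

    def select_test_slices(self, fold_i):
        raise NotImplementedError

    def select_training(self, df, fold_i, for_predicting=False):
        # Shared implementation: filter the data frame (assumed to be a
        # pandas DataFrame with a 'slice' column) to the requested slices.
        slices = self.select_training_slices(fold_i, for_predicting=for_predicting)
        if not isinstance(slices, (list, set, tuple)):
            slices = [slices]  # a single exclusive slice for predicting
        return df[df["slice"].isin(slices)]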
