Commit e4131c4f, authored 3 years ago by Frank Sauerburger
Add optional column creators to Cut
Parent: a51669be
Branch: 75-add-support-for-scale-factors
Merge request: !67 Resolve "Add support for scale factors"
Pipeline: #12696 passed 3 years ago
Showing 4 changed files with 46 additions and 4 deletions:

.gitlab-ci.yml                   +1  -1
ci/doctest.sh                    +3  -0
freeforestml/cut.py              +19 -3
freeforestml/tests/test_cut.py   +23 -0
.gitlab-ci.yml (+1 -1)

@@ -9,7 +9,7 @@ doctest:
   image: python:3.7
   script:
   - pip install -r requirements.txt
-  - "python -m doctest -v $(ls freeforestml/*.py | grep -v '__init__.py')"
+  - ci/doctest.sh
 
 unittest:
   stage: test
ci/doctest.sh (new file, mode 0 → 100755, +3 -0)

+#!/bin/bash
+
+python3 -m doctest -v $(ls freeforestml/*.py | grep -v '__init__.py')
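As a rough illustration of what the new script collects, the same file selection can be sketched in Python. The helper name `doctest_targets` is hypothetical and not part of the repository; the sketch compares file names exactly, whereas `grep -v` drops any path matching the pattern, so the two only agree for ordinary module names.

```python
import subprocess
import sys
from pathlib import Path


def doctest_targets(package_dir):
    """Return the *.py files directly inside package_dir, except __init__.py.

    Hypothetical helper: approximates
    `ls freeforestml/*.py | grep -v '__init__.py'`.
    """
    directory = Path(package_dir)
    if not directory.is_dir():
        return []
    return sorted(
        str(path) for path in directory.glob("*.py")
        if path.name != "__init__.py"
    )


if __name__ == "__main__":
    targets = doctest_targets("freeforestml")
    if targets:
        # Same invocation as the shell script, using the current interpreter.
        subprocess.run([sys.executable, "-m", "doctest", "-v", *targets])
```

Moving the command into a script keeps the quoting out of the YAML file, which is presumably the motivation for the CI change.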
freeforestml/cut.py (+19 -3)

@@ -8,7 +8,7 @@ class Cut:
     quantities.
 
     Cuts store the condition to be applied to a dataframe. New cut objects
-    accept all event by default. The selection can be limited by passing a
+    accept all events by default. The selection can be limited by passing a
     lambda to the constructor.
 
     >>> sel_all = Cut()

@@ -65,9 +65,20 @@ class Cut:
     >>> sel_sr = Cut(lambda df: df.is_sr == 1, label="Signal Region")
     >>> sel_sr.label
     'Signal Region'
+
+    If the application of a cut requires to change the event weights by a so
+    called scale factors, you can pass additional optional keyword arguments
+    that specify how the new weight should be computed.
+
+    >>> sel_sample = Cut(lambda df: df.value % 2 == 0, \
+                         weight=lambda df: df.weight * 2)
+
+    The argument name 'weight' in this example is arbitrary. It is even
+    possible to add new columns to the returned dataframe in this way,
+    however, this is not recommended.
     """
 
-    def __init__(self, func=None, label=None):
+    def __init__(self, func=None, label=None, **columns):
         """
         Creates a new cut. The optional func argument is called with the
         dataframe upon evaluation. The function must return an index array. If

@@ -77,16 +88,21 @@ class Cut:
         if isinstance(func, Cut):
             self.func = func.func
             self.label = label or func.label
+            self.columns = columns or func.columns
         else:
             self.func = func
             self.label = label
+            self.columns = columns
 
     def __call__(self, dataframe):
         """
         Applies the internally stored cut to the given dataframe and returns a
         new dataframe containing only entries passing the event selection.
         """
-        return dataframe[self.idx_array(dataframe)]
+        new_df = dataframe[self.idx_array(dataframe)]
+        if self.columns:
+            new_df = new_df.assign(**self.columns)
+        return new_df
 
     def idx_array(self, dataframe):
         """
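The new `__call__` body relies on pandas `DataFrame.assign`, which accepts callables as keyword values and evaluates them on the frame it is called on. The following is a minimal sketch of that mechanism, assuming pandas is installed; `apply_cut` and the sample data are hypothetical stand-ins, not code from FreeForestML.

```python
import pandas as pd


def apply_cut(dataframe, func, **columns):
    """Hypothetical helper mirroring the new Cut.__call__ body."""
    # Boolean indexing keeps only the rows passing the selection.
    new_df = dataframe[func(dataframe)]
    if columns:
        # assign() calls each callable with the frame it is invoked on,
        # here the already filtered frame, and returns a new DataFrame.
        new_df = new_df.assign(**columns)
    return new_df


# Made-up example data, not taken from the test suite.
df = pd.DataFrame({"value": [1, 2, 3, 4],
                   "weight": [1.0, 1.0, 0.5, 0.5]})

selected = apply_cut(df,
                     lambda d: d["value"] % 2 == 0,
                     weight=lambda d: d["weight"] * 2)

print(list(selected["value"]))   # [2, 4]
print(list(selected["weight"]))  # [2.0, 1.0]
print(list(df["weight"]))        # [1.0, 1.0, 0.5, 0.5], input is unchanged
```

Because the callables see only the selected rows, a scale factor computed from the dataframe (for example a normalisation) would be based on the events that pass the cut, not on the full input.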
freeforestml/tests/test_cut.py (+23 -0)

@@ -311,3 +311,26 @@ class CutTestCase(unittest.TestCase):
         high_sale = Cut(lambda df: df.sale > 10)
         self.assertEqual(list(high_sale(self.df).year), [])
+
+    def test_assign_columns(self):
+        """
+        Check that passing a keyword argument overwrites an existing column.
+        """
+        alternate = Cut(lambda df: df.year % 2 == 0,
+                        sale=lambda df: df.sale * 2)
+        df_alt = alternate(self.df)
+
+        self.assertEqual(list(df_alt.year), [2010, 2012, 2014, 2016])
+        self.assertEqual(list(df_alt.sale), [7.8, 9.4, 15.0, 4.6])
+
+    def test_assign_new_columns(self):
+        """
+        Check that passing a keyword argument creates a new columns
+        """
+        alternate = Cut(lambda df: df.year % 2 == 0,
+                        weight=lambda df: df.year * 0 + 2)
+        df_alt = alternate(self.df)
+
+        self.assertEqual(list(df_alt.year), [2010, 2012, 2014, 2016])
+        self.assertEqual(list(df_alt.sale), [3.9, 4.7, 7.5, 2.3])
+        self.assertEqual(list(df_alt.weight), [2, 2, 2, 2])
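The two tests above exercise the plain constructor path. The copy path (`self.columns = columns or func.columns`) behaves slightly differently: keyword arguments passed when wrapping an existing cut replace the inherited column creators as a whole rather than being merged with them. A minimal sketch of that rule, using a hypothetical `MiniCut` class that copies only the constructor logic from the diff and needs no pandas:

```python
class MiniCut:
    """Hypothetical stand-in reproducing only the constructor of Cut."""

    def __init__(self, func=None, label=None, **columns):
        if isinstance(func, MiniCut):
            self.func = func.func
            self.label = label or func.label
            # An empty dict is falsy, so omitting keyword arguments
            # inherits the column creators of the wrapped cut.
            self.columns = columns or func.columns
        else:
            self.func = func
            self.label = label
            self.columns = columns


def double_weight(df):
    return df["weight"] * 2


def unit_weight(df):
    return 1.0


base = MiniCut(lambda df: df["value"] > 0, label="Positive",
               weight=double_weight)

inherited = MiniCut(base)                   # no kwargs: label and columns copied
replaced = MiniCut(base, other=unit_weight)  # kwargs given: old dict is dropped

print(sorted(inherited.columns))  # ['weight']
print(sorted(replaced.columns))   # ['other']
```

In the inherited case both objects refer to the same dictionary, so mutating `columns` on one cut would also affect the other.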