This repository is a collection of python examples intended as an
introduction to the use of python in data analysis, especially for the
advanced laboratories in physics at the University of Freiburg. In previous
years, code examples for [ROOT](https://root.cern.ch/) were provided, but
material on the use of python was missing. The code examples in this
repository follow the examples shown in the ROOT introductions.

# Installation
To get started with python for data analysis in the advanced laboratories, you
need the python interpreter. This document uses `python3`. The
additional packages `numpy`, `scipy` and `matplotlib` are useful for data
analysis and data presentation. To install all the packages on Ubuntu, you
can run the following command.


```bash
sudo apt-get install python3 python3-numpy python3-scipy python3-matplotlib
```
<!--
Doxec in a docker container needs a slightly different command, please keep them
in-sync.
-->
<!-- console
```bash
$ apt-get update
$ apt-get install -y python3 python3-numpy python3-scipy python3-matplotlib
```
-->

# Prerequisites




# 'Hello World' Example
The first example is basically a 'Hello World' script, to check whether python
is running correctly. Create a file named `hello_world.py` and add the following
content.

<!-- write hello_world.py -->
```python
# load math library with sqrt function
import math

print("Example 1:")

# Strings can be formatted with the % operator. The placeholder %g prints a
# floating point number as decimal or with exponent depending on its
# magnitude.
print("  Square root of 2 = %g" % math.sqrt(2))
```

To run the example, open a terminal and tell the python interpreter to run your
code.
<!-- console_output -->
```sh
$ python3 hello_world.py
Example 1:
  Square root of 2 = 1.41421
```

Have you seen the expected output? Congratulations, you can move on to real-life
examples.
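
As a side note on the `%g` placeholder used in `hello_world.py`, the following
sketch shows how the same placeholder switches between decimal and exponent
notation. The sample values and the names `samples` and `formatted` are chosen
only for illustration and are not part of the example script.

```python
# Illustrative values only; they are not part of hello_world.py.
samples = [0.0001, 0.00001, 123456.0, 1234567.0]

# %g keeps up to six significant digits and switches to exponent notation
# for very small or very large magnitudes.
formatted = ["%g" % value for value in samples]

print(formatted)  # ['0.0001', '1e-05', '123456', '1.23457e+06']
```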

# Numpy Arrays 
The standard data structure for storing numerical data is the numpy array. Numpy
arrays are defined in the numpy package and are implemented in a very
efficient way.

To get started with numpy arrays, create a file `np_arrays.py` and add all lines
listed in this chapter. The first line should be an import statement.
<!-- write np_arrays.py -->
```python
import numpy as np
```
In this example we create a numpy array `numbers` containing my favorite numbers from
a python list.
<!-- append np_arrays.py -->
```python
numbers = np.array([4, 9, 16, 36, 49])
```
Having all these numbers in a numpy array makes bulk computations very efficient.
Assume we want to calculate the square root of all these numbers. We can
simply use numpy's `sqrt` function to perform the same operation on all elements
of the array at the same time.
<!-- append np_arrays.py -->
```python
roots = np.sqrt(numbers)
```

Since the resulting variable `roots` is also a numpy array, we can perform
similar operations on this variable.
<!-- append np_arrays.py -->
```python
something_else = 1.5 * roots - 4
```
Numpy arrays overload the typical arithmetic operations, so the above
statement benefits from numpy's efficient, vectorized (i.e. performing the same
operation on many values) implementation. You should always look for a way to
use such vectorized statements, and try to avoid manually looping over all the
values. Using a python loop to run over $`10^3`$ values is probably fine, but you
don't want to wait for a python loop iterating over $`10^6`$ or $`10^9`$ values.
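
To make the difference concrete, the following sketch computes the same
expression once with an explicit python loop and once with a vectorized
statement. The array size of `10**5` and all variable names are arbitrary
choices for illustration. Actual timings depend on your machine and are
therefore not shown here.

```python
import math
import time

import numpy as np

# Illustrative input: 10**5 values (size chosen arbitrarily).
values = np.arange(1, 10**5 + 1, dtype=float)

# Explicit python loop: one interpreter step per element.
start = time.perf_counter()
looped = np.empty_like(values)
for i in range(len(values)):
    looped[i] = 1.5 * math.sqrt(values[i]) - 4
loop_time = time.perf_counter() - start

# Vectorized statement: the same operation on all elements at once.
start = time.perf_counter()
vectorized = 1.5 * np.sqrt(values) - 4
vector_time = time.perf_counter() - start

# Both approaches give the same numbers; only the run time differs.
print("Results agree:", np.allclose(looped, vectorized))
print("Loop: %g s, vectorized: %g s" % (loop_time, vector_time))
```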

Finally, add a print statement to check that all the calculations are as expected.
<!-- append np_arrays.py -->
```python
print("The result is", something_else)
```

When executed you should get the following printout.
<!-- console_output -->
```bash
$ python3 np_arrays.py
The result is [-1.   0.5  2.   5.   6.5]
```

Numpy offers many other features which are beyond the scope of this basic
introduction. It is definitely worth glancing at the
[documentation](https://docs.scipy.org/doc/numpy/index.html).

# Plotting Functions
One major aspect of data analysis is data presentation. This includes the
generation of diagrams and plots. You can use the powerful library matplotlib to
create publication-quality plots from python. The goal of this example is to plot
the cropped parabola f(x), which is capped at y=4 for x>=2.

```math
    f(x) = \left\{\begin{array}{lr}
              x^2, & \text{for } x < 2\\
              4, & \text{for } 2 \leq x
            \end{array}\right.
```

Create the file `func_plot.py` and add the following lines.

<!-- Add additional files for non-X11 environment in CI -->
<!-- write func_plot.py
```python
import matplotlib
matplotlib.use('Agg')
```
-->
<!-- append func_plot.py -->
```python
import numpy as np
import matplotlib.pyplot as plt
```

Plotting a function with matplotlib means plotting many points connected by a
line. First we create an array with 200 equidistant values in the interval
[-2.5, 3]. This array functions as a grid of x-values, for which we calculate
the y values. 
<!-- append func_plot.py -->
```python
x = np.linspace(-2.5, 3, 200)
```
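
As a quick check of what `np.linspace` returns, the sketch below builds a much
smaller grid. The name `x_small` and the choice of 5 points are only for
illustration and should not be added to `func_plot.py`.

```python
import numpy as np

# Small illustrative grid: 5 equidistant points, both end points included.
x_small = np.linspace(-2.5, 3, 5)
print(x_small)

# The spacing between neighbouring points is (3 - (-2.5)) / (5 - 1) = 1.375.
print(np.diff(x_small))
```

The grid `x` in `func_plot.py` works the same way, just with 200 points.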

We can easily calculate the square of all these values with `x**2`. Cropping the
right part is a bit more complex. First we create a boolean index array, which
indicates for each element whether x >= 2. This index array can be used to
select a subset of y-values. Finally we assign the value 4 to this subset,
effectively cropping the parabola. The full example reads:
<!-- append func_plot.py -->
```python
y = x**2
idx = (x >= 2)
y[idx] = 4
```
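
The index-array step above can be tried out on a handful of values. The sketch
below uses a hypothetical small array `x_demo` and also shows `np.where` as an
alternative way to write the same cropping. Neither is part of `func_plot.py`.

```python
import numpy as np

# Hypothetical sample points around the cropping threshold x = 2.
x_demo = np.array([-1.0, 0.0, 1.5, 2.0, 3.0])

# Boolean index array (mask): True where the condition holds.
mask = (x_demo >= 2)
print(mask)

# Same cropping as in the main example: square, then overwrite the subset.
y_demo = x_demo**2
y_demo[mask] = 4
print(y_demo)

# Alternative in a single statement: pick 4 where x >= 2, else x**2.
y_where = np.where(x_demo >= 2, 4, x_demo**2)
print(np.array_equal(y_demo, y_where))
```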

The final step of this example is to plot the points and connect them with a
line by using matplotlib's `plot` function. We can also add axis labels and save the
resulting figure.
<!-- append func_plot.py -->
```python
plt.plot(x, y)
plt.xlabel("$x$")  # latex syntax can be used
plt.ylabel("cropped parabola")
plt.savefig("cropped_parabola.eps")
```

Run your script and check that the file `cropped_parabola.eps` is created.
<!-- console -->
```bash
$ python3 func_plot.py
```

<!-- console_output
```
$ ls cropped_parabola.eps
cropped_parabola.eps
```
-->