Frank Sauerburger authored
Create also a png version of the cropped parabola plot and use this version in
the README.
b7dd962f

This repository is a collection of python examples intended as an introduction to using python for data analysis, especially in the advanced physics laboratories at the University of Freiburg. In previous years, code examples were provided for ROOT, but material on using python was missing. The code examples in this repository follow the examples shown in the ROOT introductions.

Installation

To get started with python for data analysis in the advanced laboratories, you need the python interpreter. In this document we will use python3. The additional packages numpy, scipy and matplotlib are useful for data analysis and data presentation. To install all the packages on Ubuntu, you can run the following command.

sudo apt-get install python3 python3-numpy python3-scipy python3-matplotlib

Prerequisites

'Hello World' Example

The first example is basically a 'Hello World' script, to check whether python is running correctly. Create a file named hello_world.py and add the following content.

# load math library with sqrt function
import math

print("Example 1:")

# Strings can be formatted with the % operator. The placeholder %g prints a
# floating point number as decimal or with exponent depending on its
# magnitude.
print("  Square root of 2 = %g" % math.sqrt(2))

To run the example, open a terminal and tell the python interpreter to run your code.

$ python3 hello_world.py
Example 1:
  Square root of 2 = 1.41421

Have you seen the expected output? Congratulations, you can move on to real-life examples.
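
As a small aside on the %g placeholder used above, the following sketch shows how it switches between decimal and exponent notation. The sample values are arbitrary and were chosen only for illustration.

```python
# Sample values (arbitrary, for illustration only)
values = [0.0001, 0.00001, 1234.5, 1234567.0]

# %g uses six significant digits by default and switches to exponent
# notation for very small or very large magnitudes.
formatted = ["%g" % v for v in values]

for v, text in zip(values, formatted):
    print("  %r is printed as %s" % (v, text))
```

The last value is rounded to six significant digits, so %g is convenient for quick printouts but not for quoting results with a defined precision.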

Numpy Arrays

The standard data structure for storing numerical data is the numpy array. Numpy arrays are defined in the numpy package and are implemented in a very efficient way.

To get started with numpy arrays, create a file np_arrays.py and add all lines listed in this chapter. The first line should be an import statement.

import numpy as np

In this example we create a numpy array numbers containing my favorite numbers from a python list.

numbers = np.array([4, 9, 16, 36, 49])

Having all these numbers in a numpy array makes bulk computations very efficient. Assume we want to calculate the square root of all these numbers. We can simply use numpy's sqrt function to perform the same operation on all elements of the array at the same time.

roots = np.sqrt(numbers)

Since the resulting variable roots is also a numpy array, we can perform similar operations on this variable.

something_else = 1.5 * roots - 4

Numpy arrays overload the typical arithmetic operations, such that the above statement benefits from numpy's efficient, vectorized (i.e. performing the same operation on many values) implementation. You should always think about a way to use such vectorized statements, and try to avoid manually looping over all the values. Using a python loop to run over 10^3 values is probably fine, but you don't want to wait for a python loop iterating over 10^6 or 10^9 values.
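
To get a feeling for the difference, the following sketch times an explicit python loop against the equivalent vectorized statement. The array size and the variable names are assumptions made for this illustration, and the measured times will depend on your machine.

```python
import time
import numpy as np

# Illustrative array size (assumption); adjust to your machine
n = 10**6
values = np.arange(n, dtype=float)

# Variant 1: explicit python loop over all elements
start = time.perf_counter()
loop_result = np.empty(n)
for i in range(n):
    loop_result[i] = 1.5 * values[i] ** 0.5 - 4
loop_time = time.perf_counter() - start

# Variant 2: vectorized numpy statement
start = time.perf_counter()
vec_result = 1.5 * np.sqrt(values) - 4
vec_time = time.perf_counter() - start

print("loop:       %g s" % loop_time)
print("vectorized: %g s" % vec_time)
print("same result:", np.allclose(loop_result, vec_result))
```

Both variants compute the same numbers. Only the time needed to get them differs.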

Finally, add a print statement to check that all the calculations are as expected.

print("The result is", something_else)

When executed you should get the following printout.

$ python3 np_arrays.py
The result is [-1.   0.5  2.   5.   6.5]

Numpy offers many other functionalities which are beyond the scope of this basic introduction. It is definitely worth glancing at the documentation.
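
As one small taste of these other functionalities, the sketch below computes a few summary statistics of the array from this chapter. The choice of functions is an assumption about what might be useful in a laboratory analysis and is not part of the original example.

```python
import numpy as np

numbers = np.array([4, 9, 16, 36, 49])

# Arithmetic mean of all elements
mean = np.mean(numbers)

# Sample standard deviation; ddof=1 divides by (N - 1) instead of N
std = np.std(numbers, ddof=1)

print("N    =", numbers.size)
print("mean = %g" % mean)
print("std  = %g" % std)
```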

Plotting Functions

One major aspect of data analysis is data presentation. This includes the generation of diagrams and plots. You can use the powerful library matplotlib to create publication-quality plots from python. The goal of this example is to plot the cropped parabola f(x), which is limited to y=4 for x>=2.

$$
f(x) = \left\{ \begin{array}{ll} x^2 & \text{for } x < 2 \\ 4 & \text{for } x \geq 2 \end{array} \right.
$$

The final plot should look like this.

[Figure: Plot of cropped parabola]

Create the file func_plot.py and add the following lines.

import numpy as np
import matplotlib.pyplot as plt

Plotting a function with matplotlib means plotting many points connected by a line. First we create an array with 200 equidistant values in the interval [-2.5, 3]. This array functions as a grid of x-values, for which we calculate the y values.

x = np.linspace(-2.5, 3, 200)
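
If you want to convince yourself of what linspace returns, a quick check like the following may help. It is only a sketch for inspecting the grid and is not needed in func_plot.py.

```python
import numpy as np

x = np.linspace(-2.5, 3, 200)

print("number of points:", x.size)
print("first and last value:", x[0], x[-1])
# Spacing between neighboring grid points: (3 - (-2.5)) / (200 - 1)
print("step size: %g" % (x[1] - x[0]))
```

Note that both end points of the interval are included, which is why the step size is the interval length divided by 199 and not by 200.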

We can easily calculate the square of all these values with x**2. Cropping the right part is a bit more complex. First we create a boolean index array of True and False values, which indicate whether x >= 2. This index array can be used to select a subset of y-values. Finally we can assign the value 4 to this subset, effectively cropping the parabola. The full example reads:

y = x**2
idx = (x >= 2)
y[idx] = 4
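
To see what the index array does, it can help to repeat the same steps on a grid that is small enough to inspect by eye. The four sample values below are an assumption for illustration. The sketch also shows np.where as one possible alternative to the assignment with an index array.

```python
import numpy as np

# Small grid so that the arrays can be inspected by eye
x = np.array([0.0, 1.0, 2.0, 3.0])

y = x**2
idx = (x >= 2)  # boolean array, True where the condition holds
y[idx] = 4      # assign 4 only to the selected elements

# Alternative with np.where: take 4 where x >= 2, otherwise x**2
y_where = np.where(x >= 2, 4, x**2)

print("idx     =", idx)
print("y       =", y)
print("y_where =", y_where)
```

Both approaches give the same array. The index array modifies y in place, while np.where builds a new array.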

The final step of this example is to plot the points and connect them with a line by using matplotlib's plot function. We can also add axis labels and save the resulting figure.

plt.plot(x, y)
plt.xlabel("$x$")  # latex syntax can be used
plt.ylabel("cropped parabola")
plt.savefig("cropped_parabola.eps")

Run your script and check that the file cropped_parabola.eps has been created.

$ python3 func_plot.py
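
If you prefer a png file, for example to embed the plot in a web page, matplotlib picks the output format from the file extension. The following is a minimal sketch under a few assumptions: the file name cropped_parabola.png, the dpi value and the Agg backend (which avoids the need for a display) are choices made for this illustration.

```python
import os
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend, no display needed
import matplotlib.pyplot as plt
import numpy as np

x = np.linspace(-2.5, 3, 200)
y = x**2
y[x >= 2] = 4

plt.plot(x, y)
plt.xlabel("$x$")
plt.ylabel("cropped parabola")
# The output format is inferred from the file extension
plt.savefig("cropped_parabola.png", dpi=150)

print("file created:", os.path.exists("cropped_parabola.png"))
```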