Jupyter Handbook

3. Bar charts and colors

In this lecture we demonstrate:

  1. another way to move cells around a Jupyter notebook;
  2. how to visualize data using bar charts; and
  3. how to use colors in charts.

3.1. A bit more about Jupyter notebooks

We have already said on a few occasions that each Jupyter notebook is a sequence of cells, and that each cell can contain some text, an expression, or a Python program. The buttons at the top of the page let you manipulate cells. We have already used the following buttons:

  • Run which runs a cell,
  • the diskette which saves the notebook,
  • + which adds a new cell below the active cell, and
  • up and down arrows which move the active cell.

The following three buttons:

[Image: three more buttons]

perform the usual actions: cut (remove the cell from the notebook and memorize it; the button looks like scissors), copy (leave the cell in the notebook but memorize a copy of it; the button looks like two sheets of paper) and paste (insert the memorized cell; the button looks like a sheet of paper being placed onto a clipboard).

If you wish to move a cell:

  1. click on the cell,
  2. click the cut button ("scissors") to remove the cell from the notebook and memorize it,
  3. click somewhere else in the Jupyter notebook, and finally
  4. click the paste button ("placing a sheet of paper onto a clipboard") to insert the memorized cell.

This operation is called cut/paste for short.

If you wish to make a copy of a cell:

  1. click on the cell,
  2. click the copy button ("two sheets of paper") to memorize the cell,
  3. click somewhere else in the Jupyter notebook, and finally
  4. click the paste button ("placing a sheet of paper onto a clipboard") to insert the memorized cell.

This operation is called copy/paste for short.

3.2. Bar charts

Quite often it is more convenient to represent data by a sequence of bars instead of by a continuous line. Such charts are called bar charts (for obvious reasons).

Let us first import the library:

In [1]:
import matplotlib.pyplot as plt

After the import, the library is available in the notebook and there is no need to import it again. However, the import pertains to this notebook only.

Assume that these are the marks of a student:

Subject     Mark
Maths       2
English     4
Arts        5
History     3
PE          5
Music       4
Technology  5

We'll represent the data as two lists like this:

In [2]:
subjects = ["Maths", "Eng", "Arts", "Hist", "PE", "Music", "Tech"]
marks    = [2,        4,     5,      3,      5,    4,       5    ]
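
The two lists belong together position by position: the first mark goes with the first subject, the second with the second, and so on. The following is a minimal sketch of how one might check this before drawing anything. The lists are repeated so that the cell runs on its own, and the name pairs is introduced here only for illustration:

```python
# The two lists from the cell above, repeated so that this sketch runs on its own.
subjects = ["Maths", "Eng", "Arts", "Hist", "PE", "Music", "Tech"]
marks    = [2,        4,     5,      3,      5,    4,       5    ]

# Both lists must have the same length, otherwise the subjects and
# the marks cannot be matched up one to one.
assert len(subjects) == len(marks)

# zip pairs the first subject with the first mark, the second subject
# with the second mark, and so on.
pairs = list(zip(subjects, marks))
for subject, mark in pairs:
    print(subject, mark)
```

If a mark were missing from the second list, the assert line would stop the program with an error instead of silently producing a wrong chart.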

The function bar can be invoked to represent these data in the form of a bar chart:

In [3]:
plt.bar(subjects, marks)
plt.title("Marks of a student")
plt.show()
plt.close()

If you wish to resize the chart you can invoke the function figure with its parameter figsize like this:

In [4]:
plt.figure(figsize=(10,5))
plt.bar(subjects, marks)
plt.title("Marks of a student")
plt.show()
plt.close()

The pyplot library assigns colors to charts the way it finds appropriate. If we wish to change the color of a chart we can simply request another color by throwing in the color parameter as follows:

In [5]:
plt.figure(figsize=(10,5))
plt.bar(subjects, marks, color="g")
plt.title("Marks of a student")
plt.show()
plt.close()

The chart is now green ("g" = green). We have the following colors at our disposal:

Letter Color
"b" blue
"g" green
"r" red
"c" cyan
"m" magenta
"y" yellow
"k" black
"w" white

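The color parameter can also be given a list with one color letter per bar, in which case each bar gets its own color. The sketch below colors the marks below 3 red and all the others green. The threshold 3 and the name bar_colors are assumptions made for this illustration only:

```python
import matplotlib.pyplot as plt

# The data from section 3.2, repeated so that this sketch runs on its own.
subjects = ["Maths", "Eng", "Arts", "Hist", "PE", "Music", "Tech"]
marks    = [2,        4,     5,      3,      5,    4,       5    ]

# Build one color letter per bar: "r" (red) for marks below 3,
# "g" (green) for all other marks. The threshold is an arbitrary choice.
bar_colors = []
for mark in marks:
    if mark < 3:
        bar_colors.append("r")
    else:
        bar_colors.append("g")

plt.figure(figsize=(10,5))
plt.bar(subjects, marks, color=bar_colors)
plt.title("Marks of a student")
plt.show()
plt.close()
```

With these data only the bar for Maths is red.
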
3.3. Displaying two sets of data on the same chart

"Normal body temperature" is actually an interval of temperatures that changes with the age of the person. When measured in the armpit, the interval of temperatures that is considered normal for each age is given in this table:

Age            Temperature ($^\circ$C)
0--2 years     34.7--37.3
3--10 years    35.9--36.7
11--65 years   35.2--36.9
over 65 years  35.6--36.2

The data can be represented as three lists:

In [6]:
age        = ["0-2", "3-10", "11-65", "65+"]
normalT_lo = [34.7,  35.9,   35.2,    35.6]
normalT_hi = [37.3,  36.7,   36.9,    36.2]

We shall visualize this situation on the same chart by invoking bar twice:

In [7]:
plt.bar(age, normalT_hi)
plt.bar(age, normalT_lo)
plt.title("Normal body temperature by age")
plt.xlabel("Age (years)")
plt.ylabel("Temperature (C)")
plt.show()
plt.close()

The functions xlabel and ylabel add labels to the $x$-axis and the $y$-axis.

Unfortunately, this chart is not very informative because the intervals we are trying to depict are small compared with the full height of the bars. Since we would like to focus on the intervals of temperatures, we can limit the range of values shown on the $y$-axis. In this case, we use the ylim ($y$-limits) function to limit the displayed temperatures to the interval from $34^\circ$C to $39^\circ$C.

In [8]:
plt.ylim(34,39)
plt.bar(age, normalT_hi)
plt.bar(age, normalT_lo)
plt.title("Normal body temperature by age")
plt.xlabel("Age (years)")
plt.ylabel("Temperature (C)")
plt.show()
plt.close()
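
The limits 34 and 39 were chosen by hand. As a sketch of one possible alternative, the limits can be computed from the data by rounding the smallest lower limit down and the largest upper limit up to whole degrees. The names y_min and y_max and the rounding rule are assumptions made for this example, not part of the lecture:

```python
import math
import matplotlib.pyplot as plt

# The data from the cell In [6], repeated so that this sketch runs on its own.
age        = ["0-2", "3-10", "11-65", "65+"]
normalT_lo = [34.7,  35.9,   35.2,    35.6]
normalT_hi = [37.3,  36.7,   36.9,    36.2]

# Round the smallest value down and the largest value up to whole degrees,
# so that every bar fits between the two limits.
y_min = math.floor(min(normalT_lo))
y_max = math.ceil(max(normalT_hi))

plt.ylim(y_min, y_max)
plt.bar(age, normalT_hi)
plt.bar(age, normalT_lo)
plt.title("Normal body temperature by age")
plt.xlabel("Age (years)")
plt.ylabel("Temperature (C)")
plt.show()
plt.close()
```

For these data the computed limits are 34 and 38, which is close to the interval chosen by hand above.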

Note also that the order of the two bar calls matters! The library draws the bars in the order in which the calls appear in the Python code. Since the values in normalT_hi are greater than the values in normalT_lo, the other ordering of the bar calls produces a chart in which the higher bars are painted over the lower ones and hide them completely, which is not what we had in mind:

In [9]:
plt.ylim(34,39)
plt.bar(age, normalT_lo)
plt.bar(age, normalT_hi)
plt.title("Normal body temperature by age")
plt.xlabel("Age (years)")
plt.ylabel("Temperature (C)")
plt.show()
plt.close()

Therefore, we draw the higher values first, and then paint the lower values over them:

In [10]:
plt.ylim(34,39)
plt.bar(age, normalT_hi)
plt.bar(age, normalT_lo)
plt.title("Normal body temperature by age")
plt.xlabel("Age (years)")
plt.ylabel("Temperature (C)")
plt.show()
plt.close()
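
A different approach, not used in the rest of this lecture, avoids painting one set of bars over another. The function bar has a parameter bottom which says where each bar starts, so we can draw only the normal interval itself: each bar starts at the lower limit and its height is the difference between the two limits. This is only a sketch, and the name heights is introduced here for illustration:

```python
import matplotlib.pyplot as plt

# The data from the cell In [6], repeated so that this sketch runs on its own.
age        = ["0-2", "3-10", "11-65", "65+"]
normalT_lo = [34.7,  35.9,   35.2,    35.6]
normalT_hi = [37.3,  36.7,   36.9,    36.2]

# The height of each bar is the width of the normal interval,
# rounded to one decimal place to hide tiny rounding errors.
heights = []
for lo, hi in zip(normalT_lo, normalT_hi):
    heights.append(round(hi - lo, 1))

plt.ylim(34,39)
# bottom tells bar where each bar starts (here: at the lower limit).
plt.bar(age, heights, bottom=normalT_lo)
plt.title("Normal body temperature by age")
plt.xlabel("Age (years)")
plt.ylabel("Temperature (C)")
plt.show()
plt.close()
```

Here the order of the calls no longer matters because there is only one call to bar, but the chart no longer shows the two limits in two different colors.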

For those who did not spend all this time producing the diagram it may be unclear which values are represented by which color. This is why it is possible to add a legend to the chart. To do so, each bar command gets an extra parameter of the form label="explanation" which provides a short explanation of the data it draws. The function legend at the end puts a legend in one of the corners of the chart:

In [11]:
plt.ylim(34,39)
plt.bar(age, normalT_hi, label="upper limit")
plt.bar(age, normalT_lo, label="lower limit")
plt.title("Normal body temperature by age")
plt.xlabel("Age (years)")
plt.ylabel("Temperature (C)")
plt.legend()
plt.show()
plt.close()
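
By default the library chooses the corner for the legend by itself. If we prefer a particular corner, the function legend accepts a parameter loc, for example loc="upper right". The sketch below repeats the previous cell with that one change. The name legend_box is introduced here only so that the legend can be inspected afterwards:

```python
import matplotlib.pyplot as plt

# The data from the cell In [6], repeated so that this sketch runs on its own.
age        = ["0-2", "3-10", "11-65", "65+"]
normalT_lo = [34.7,  35.9,   35.2,    35.6]
normalT_hi = [37.3,  36.7,   36.9,    36.2]

plt.ylim(34,39)
plt.bar(age, normalT_hi, label="upper limit")
plt.bar(age, normalT_lo, label="lower limit")
plt.title("Normal body temperature by age")
plt.xlabel("Age (years)")
plt.ylabel("Temperature (C)")
# loc selects the corner; other possible values include "upper left",
# "lower left" and "lower right".
legend_box = plt.legend(loc="upper right")
plt.show()
plt.close()
```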

3.4. Exercises

Exercise 1. Look at the code carefully and then answer the questions:

In [12]:
import matplotlib.pyplot as plt
plt.ylim(34,39)
plt.bar(age, normalT_hi, label="upper limit")
plt.bar(age, normalT_lo, label="lower limit")
plt.title("Normal body temperature by age")
plt.xlabel("Age (years)")
plt.ylabel("Temperature (C)")
plt.legend()
plt.show()
plt.close()
  1. What does the function bar do?
  2. What happens if we swap the two lines of code containing the bar functions?
  3. What does the function xlabel do?
  4. What do the functions ylim and legend do?
  5. How would you change the size of this chart?
  6. How would you change the color of bars to green and yellow?

Exercise 2. The first ten places on the ATP list on July 21st, 2019 look like this:

In [13]:
tennis_players = ["Đoković", "Nadal", "Federer", "Thiem", "Zverev", "Tsitsipas", "Nishikori", "Khachanov", "Fognini", "Medvedev"]
ATP_points     = [12415,      7945,    7460,      4595,   4325,     4045,         4040,        2890,        2785,      2625]

Visualize this by a bar chart.

Exercise 3. Biologists have so far classified more than 2,000,000 species of living beings. They are all divided into five kingdoms and the approximate number of species per kingdom is given in this table:

Kingdom      Number of species
Animalia     1,400,000
Plantae      290,000
Fungi        100,000
Protoctista  200,000
Prokaryotae  10,000

Visualize this data by a bar chart.

Exercise 4. The following table summarizes the highest and the lowest recorded temperatures (in $^\circ$C) on each of the continents:

Continent:              Europe  Asia   Africa  North America  South America  Australia  Antarctica
Highest recorded temp:  48      54     55      56.7           48.9           50.7       19.8
Lowest recorded temp:   -58.1   -67.8  -23.9   -63            -32.8          -23        -89.2

Visualize the data on the same chart. Use red bars to display highest recorded temperatures, and blue bars for the lowest ones.

Exercise 5.

(a) Search the Internet to find out what the function barh from the library matplotlib does.

(b) Solve Exercise 4 using the barh function.

In [ ]:
 

Exercise 6*. It is estimated that on July 1st, 2019 the population of China was 1,420,062,022 and the population of India was 1,368,737,513. It is also estimated that the population of China increases by 0.35% per year, while the population of India increases by 1.08% per year.

(a) Assuming that the rate of increase of the population of both countries is not going to change in the near future, visualize the population of China and India in the following ten years on the same chart using the plot function.

(b) Read from the chart in which year India is going to overtake China as the most populated country on Earth.

In [ ]:
 
© 2019 Petlja.org Creative Commons License