In this post I am going to introduce you to one of the most popular Python libraries for plotting graphs. Yes, you guessed it: Matplotlib.
A graph is a way to represent information visually, and a good one makes data much faster to take in. Graphs play a major role in evaluation and analysis, in business and elsewhere, and they are especially useful for comparing data. Broadly, graphs fall into two categories, 2-dimensional and 3-dimensional, each with many types: line graphs, bar charts, pie charts and scatter plots, to name a few. In Python, Matplotlib is a library that lets us plot both 2-D and 3-D graphs with very little code. This post gives only a gentle introduction to how the library can be used.
NOTE: This post assumes that you already have Matplotlib installed on your machine. If not, go through the installation instructions for your system (Windows, Linux, Mac, etc.) first.
Let’s start with a 2-D line graph.
import matplotlib.pyplot as plt
x = [1, 2, 3]
y = [4, 5, 6]
x1 = [10, 11, 12]
y1 = [13, 14, 15]
x2 = [20, 21, 22]
y2 = [23, 24, 25]
plt.xlabel('X_LABEL')
plt.ylabel('Y_LABEL')
plt.plot(x, y, label = 'First')
plt.plot(x1, y1, label = 'Second')
plt.plot(x2, y2, label = 'Third')
plt.title('FIRST EVER MATPLOTLIB GRAPH')
plt.legend()
plt.show()
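The same line graph can also be written with Matplotlib's object-oriented interface, where labels and the title are set on an Axes object rather than through plt. This is only a sketch: the names series, fig and ax and the off-screen Agg backend are my own choices for illustration, not part of the example above.

```python
import matplotlib
matplotlib.use("Agg")  # off-screen backend (assumption); remove to use plt.show()
import matplotlib.pyplot as plt

# Same three data series as the pyplot example
series = {
    "First": ([1, 2, 3], [4, 5, 6]),
    "Second": ([10, 11, 12], [13, 14, 15]),
    "Third": ([20, 21, 22], [23, 24, 25]),
}

fig, ax = plt.subplots()
for name, (xs, ys) in series.items():
    ax.plot(xs, ys, label=name)  # one line per series

ax.set_xlabel("X_LABEL")
ax.set_ylabel("Y_LABEL")
ax.set_title("FIRST EVER MATPLOTLIB GRAPH")
ax.legend()

fig.canvas.draw()  # render in memory; call plt.show() instead when running interactively
```

Both styles produce the same picture. The Axes-based style tends to be easier to manage once a figure holds more than one plot.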
Below are the code snippets for a basic 2-D bar chart, histogram, stack plot, scatter graph and pie chart, respectively.
# Bar chart
import matplotlib.pyplot as plt
x = [2, 4, 6, 8, 10]
y = [7, 2, 7, 9, 16]
x1 = [1,3,5,7,9]
y1 = [7,8,2,4,6]
plt.bar(x, y, label = 'FIRST', color = 'c')
plt.bar(x1, y1, label = 'SECOND', color = 'r')
plt.xlabel('X_LABEL')
plt.ylabel('Y_LABEL')
plt.title('BARCHART PLOTTING')
plt.legend()
plt.show()
--------------------------------------------------------------------
# Histogram
import matplotlib.pyplot as plt
population_ages = [22, 55, 62, 55, 88, 5, 12, 4, 57, 9, 105, 6, 32, 4, 72, 94, 26, 59, 16, 53, 86, 10, 45, 34, 65, 85, 25, 35, 25, 14, 16, 36, 17, 27, 38, 49, 56, 10, 64]
bins = [0, 10, 20, 30, 40, 50, 60, 70, 80, 90, 100, 110]
plt.hist(population_ages, bins, histtype = 'bar', rwidth = .2)
plt.xlabel('X_LABEL')
plt.ylabel('Y_LABEL')
plt.title('HISTOGRAM PLOTTING')
#plt.legend()
plt.show()
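To see what the histogram is actually drawing, it can help to count the values per bin directly. The sketch below uses np.histogram on the same ages and bin edges; the variable names counts and edges are mine, and NumPy is assumed to be installed.

```python
import numpy as np

population_ages = [22, 55, 62, 55, 88, 5, 12, 4, 57, 9, 105, 6, 32, 4, 72,
                   94, 26, 59, 16, 53, 86, 10, 45, 34, 65, 85, 25, 35, 25,
                   14, 16, 36, 17, 27, 38, 49, 56, 10, 64]
bins = [0, 10, 20, 30, 40, 50, 60, 70, 80, 90, 100, 110]

# counts[i] is the number of ages that fall between edges[i] and edges[i + 1]
counts, edges = np.histogram(population_ages, bins=bins)

for low, high, count in zip(edges[:-1], edges[1:], counts):
    print(f"{low:>3} to {high:<3}: {count}")
```

Each bar in the histogram has the height of one of these counts, so there is always one bar fewer than there are bin edges.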
--------------------------------------------------------------------
# Stack
import matplotlib.pyplot as plt
days = [1,2,3,4,5]
sleeping = [7,8,6,11,7]
eating = [2,3,4,3,2]
working = [7,8,7,2,2]
playing = [8,5,7,8,13]
plt.plot([], [], color = 'm', label = 'Sleeping', linewidth = 7)
plt.plot([], [], color = 'c', label = 'Eating', linewidth = 7)
plt.plot([], [], color = 'r', label = 'Working', linewidth = 7)
plt.plot([], [], color = 'k', label = 'Playing', linewidth = 7)
plt.stackplot(days, sleeping, eating, working, playing, colors = ['m','c','r','k'])
plt.xlabel('X_LABEL')
plt.ylabel('Y_LABEL')
plt.legend()
plt.show()
--------------------------------------------------------------------
# Scatter
import matplotlib.pyplot as plt
x = [1,2,3,4,5,6]
y = [7,8,9,1,2,1]
plt.title("SCATTER PLOTTING")
plt.xlabel('X_LABEL')
plt.ylabel('Y_LABEL')
plt.scatter(x, y, s = 100, color = 'cyan', label = "SCATTER PLOT")
plt.legend()
plt.show()
--------------------------------------------------------------------
# Pie Chart
import matplotlib.pyplot as plt
sleeping = [7,8,6,11,7]
eating = [2,3,4,3,2]
working = [7,8,7,2,2]
playing = [8,5,7,8,13]
slices = [sleeping[0], eating[0], working[0], playing[0]] # hours on day 1: [7, 2, 7, 8]
activities = ['sleeping','eating','working','playing']
colors = ['m','c','r','b']
plt.pie(
slices,
labels = activities,
colors = colors,
startangle =180,
shadow = True,
counterclock = True,
explode = (0,0.1,0,0),
autopct = '%1.1f%%'
)
plt.xlabel('X_LABEL')
plt.ylabel('Y_LABEL')
plt.show()
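The autopct = '%1.1f%%' argument tells the pie chart to print each slice's share of the total with one decimal place. The sketch below reproduces roughly that arithmetic in plain Python; the names total and labels are mine.

```python
slices = [7, 2, 7, 8]
activities = ['sleeping', 'eating', 'working', 'playing']

total = sum(slices)
# Apply the same format string that autopct receives to each slice's percentage
labels = ['%1.1f%%' % (100 * s / total) for s in slices]

for activity, label in zip(activities, labels):
    print(activity, label)
```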
In all of the above examples the data was typed in by hand as lists. In real-world scenarios, data is rarely entered manually. Let’s look at an example where we read the data from a CSV file.
import matplotlib.pyplot as plt
import csv
x = []
y = []
# Each row of example.txt holds one "x,y" pair
with open('example.txt', 'r') as csvfile:
    plots = csv.reader(csvfile, delimiter = ',')
    for row in plots:
        x.append(int(row[0]))
        y.append(int(row[1]))
plt.plot(x, y, label = 'Loaded from file')
plt.legend()
plt.show()
In the above snippet, example.txt is a simple text file in which each line holds two integers separated by a comma (','), and the file is kept in the same directory as the script.
The same can be done more compactly using NumPy, another handy Python library for number crunching.
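If you do not have such a file yet, a minimal sketch like the one below writes one and reads it back. The six number pairs are made up purely for illustration.

```python
import csv
from pathlib import Path

# Made-up sample data: one (x, y) pair per line
rows = [(1, 5), (2, 3), (3, 4), (4, 7), (5, 4), (6, 3)]

path = Path("example.txt")
with path.open("w", newline="") as f:
    csv.writer(f).writerows(rows)  # writes lines such as "1,5"

# Read the file back the same way the plotting snippet does
with path.open(newline="") as f:
    loaded = [(int(a), int(b)) for a, b in csv.reader(f)]

print(loaded)
```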
import matplotlib.pyplot as plt
import numpy as np
# unpack = True returns the columns as separate arrays, one for x and one for y
x, y = np.loadtxt('example.txt', delimiter = ',', unpack = True)
plt.plot(x, y, label = 'Loaded from file')
plt.xlabel('X_LABEL')
plt.ylabel('Y_LABEL')
plt.legend()
plt.show()
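To make the effect of unpack = True more concrete, here is a small sketch that loads the same text twice, once without it and once with it. It reads from an in-memory string instead of a file, and the three sample rows are made up.

```python
import io
import numpy as np

text = "1,5\n2,3\n3,4\n"  # made-up sample data, same layout as example.txt

# Without unpack: one 2-D array with a row per line of the file
table = np.loadtxt(io.StringIO(text), delimiter=",")

# With unpack=True: the array is transposed, so each column can be assigned to its own name
x, y = np.loadtxt(io.StringIO(text), delimiter=",", unpack=True)

print(table.shape)
print(x, y)
```

Note that loadtxt returns floating-point values by default, unlike the csv version, which converted each field with int().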
Finally, we move on to plotting a 3-D graph. The code for a 3-D line graph is much like the 2-D line graph seen above.
from mpl_toolkits.mplot3d import axes3d
import matplotlib.pyplot as plt
fig = plt.figure()
ax1 = fig.add_subplot(111, projection = '3d')
x = [1,2,3,4,5,6,7,8,9]
y = [3,4,5,6,7,8,9,0,1]
z = [6,7,8,9,0,1,2,3,4]
ax1.plot(x, y, z) # 3-D line; plot_wireframe expects 2-D arrays in current Matplotlib
ax1.set_xlabel('X_AXIS')
ax1.set_ylabel('Y_AXIS')
ax1.set_zlabel('Z_AXIS')
plt.show()
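A wireframe is a closely related 3-D plot that draws a surface as a grid of lines. In current Matplotlib releases, plot_wireframe expects 2-D coordinate arrays, which are usually built with np.meshgrid. The sketch below assumes NumPy is installed; the ripple function used for Z is simply an arbitrary surface chosen for illustration.

```python
import matplotlib
matplotlib.use("Agg")  # off-screen backend (assumption); remove to use plt.show()
import matplotlib.pyplot as plt
import numpy as np
from mpl_toolkits.mplot3d import axes3d  # noqa: F401, registers '3d' on older Matplotlib

# Build a 20 x 20 grid of (X, Y) points and a height Z for each of them
xs = np.linspace(-3, 3, 20)
ys = np.linspace(-3, 3, 20)
X, Y = np.meshgrid(xs, ys)
Z = np.sin(np.sqrt(X**2 + Y**2))  # arbitrary example surface

fig = plt.figure()
ax = fig.add_subplot(111, projection='3d')
ax.plot_wireframe(X, Y, Z)
ax.set_xlabel('X_AXIS')
ax.set_ylabel('Y_AXIS')
ax.set_zlabel('Z_AXIS')

fig.canvas.draw()  # render in memory; call plt.show() instead when running interactively
```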
Similar code can be applied to plot a 3-D Scatter graph and a 3-D Bar graph respectively.
from mpl_toolkits.mplot3d import axes3d
import matplotlib.pyplot as plt
fig = plt.figure()
ax1 = fig.add_subplot(111, projection = '3d')
x = [1,2,3,4,5,6,7,8,9]
y = [3,4,5,6,7,8,9,0,1]
z = [6,7,8,9,0,1,2,3,4]
ax1.scatter(x, y, z, c = 'g', marker = 'o')
ax1.set_xlabel('X_AXIS')
ax1.set_ylabel('Y_AXIS')
ax1.set_zlabel('Z_AXIS')
plt.show()
--------------------------------------------------------------------
from mpl_toolkits.mplot3d import axes3d
import matplotlib.pyplot as plt
import numpy as np
from matplotlib import style
style.use('ggplot')
fig = plt.figure()
ax1 = fig.add_subplot(111, projection='3d')
x = [1,2,3,4,5,6,7,8,9,10]
y = [1,2,3,4,5,6,7,8,9,10]
z = np.zeros(10)
x1 = np.ones(10)
y1 = np.ones(10)
z1 = [1,2,3,4,5,6,7,8,9,10]
ax1.bar3d(x, y, z, x1, y1, z1)
ax1.set_xlabel('x axis')
ax1.set_ylabel('y axis')
ax1.set_zlabel('z axis')
plt.show()
3-D bar graphs are a little different. Each bar has a starting point (x, y and z in the code above), and then a width, a depth and a height (x1, y1 and z1), each measured from that starting point. An ordinary bar chart starts with a flat bar on an axis; here we have added one more dimension, depth.
Here we conclude this post. For further references to explore Matplotlib, you can find well-commented, structured code along with reference notes on my GitHub. Make sure to check it out.
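To make the role of each bar3d argument easier to see, here is a small sketch with three bars and descriptive variable names. The names and values are mine, chosen only for illustration.

```python
import matplotlib
matplotlib.use("Agg")  # off-screen backend (assumption); remove to use plt.show()
import matplotlib.pyplot as plt
from mpl_toolkits.mplot3d import axes3d  # noqa: F401, registers '3d' on older Matplotlib

fig = plt.figure()
ax = fig.add_subplot(111, projection='3d')

# Starting point (one corner) of each bar
x_start = [1, 2, 3]
y_start = [1, 2, 3]
z_start = [0, 0, 0]  # every bar sits on the floor

# Size of each bar, measured from its starting point
width = [0.5, 0.5, 0.5]   # along the x axis
depth = [0.5, 0.5, 0.5]   # along the y axis
height = [2, 4, 6]        # along the z axis

bars = ax.bar3d(x_start, y_start, z_start, width, depth, height)

fig.canvas.draw()  # render in memory; call plt.show() instead when running interactively
```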
Stay tuned. Until next time…!