CBSERanker

Data Visualization with Matplotlib in Python

Introduction to Data Visualization

Data visualization represents data in graphical formats such as charts and graphs, making it easier to identify patterns, trends, and correlations. Matplotlib is a widely used Python library for creating 2D visualizations, and its pyplot submodule provides a simple, function-based interface for plotting.

Installation and Setup

Installation Methods:

• Anaconda: Matplotlib comes pre-installed.

• Standard Installation:


python -m pip install -U pip
python -m pip install -U matplotlib
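One way to confirm that the installation worked is to import the package and print its version. This is only a quick sanity check; the version number printed depends on your environment.

```python
import matplotlib

# Prints the installed Matplotlib version string
print(matplotlib.__version__)
```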

Importing Pyplot:


import matplotlib.pyplot as plt  # Common convention

Types of Charts

• Line Chart: Displays data points connected by lines.

• Bar Chart: Represents data with rectangular bars.

• Pie Chart: Shows proportional data as slices of a circle.

• Histogram: Displays frequency distribution.

• Scatter Plot: Shows relationships between two variables.

• Box Plot: Visualizes data distribution through quartiles.

Line Chart

Basic Line Chart:


import matplotlib.pyplot as plt
x = [1, 2, 3]
y = [2, 3.5, 5]
plt.plot(x, y)
plt.show()

Customizing Line Charts:

Labels and Title:


plt.xlabel('Overs')
plt.ylabel('Runs Scored')
plt.title('Over wise Runs Scored')

Line Style:


plt.plot(x, y, 'r', linewidth=4, linestyle='dashed')  # Red, thick, dashed line

Markers:


plt.plot(x, y, marker='+', markersize=10, markeredgecolor='red')
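The labels, line style, and marker options above can be combined in a single script. The following is a minimal sketch; the run totals are made-up sample values, not data from this article.

```python
import matplotlib.pyplot as plt

# Hypothetical sample data: runs scored in each of five overs
overs = [1, 2, 3, 4, 5]
runs = [4, 9, 6, 12, 7]

# One call sets the colour, width, dash style, and marker appearance
(line,) = plt.plot(
    overs, runs,
    color='red', linewidth=2, linestyle='dashed',
    marker='+', markersize=10, markeredgecolor='blue',
)
plt.xlabel('Overs')
plt.ylabel('Runs Scored')
plt.title('Over wise Runs Scored')
```

Calling plt.show() afterwards displays the chart.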

Bar Chart

Vertical Bar Chart:


categories = ['1-10', '11-20', '21-30']
values = [65, 55, 70]
plt.bar(categories, values, width=0.3, color=['r', 'g', 'b'])
plt.xlabel('Over Interval')
plt.ylabel('Runs Scored')
plt.title('Scoring Chart')
plt.show()

Horizontal Bar Chart:


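# Illustrative sample values; replace with your own data
cities = ['Delhi', 'Mumbai', 'Chennai']
temperatures = [30, 32, 34]
plt.barh(cities, temperatures)
plt.xlabel('Temperature')
plt.ylabel('Cities')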

Multiple Bar Chart:


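import numpy as np
team_a = [65, 55, 70, 60, 80]  # illustrative sample values
team_b = [50, 62, 58, 75, 66]  # illustrative sample values
x = np.linspace(1, 5, 5)  # bar positions 1.0, 2.0, ..., 5.0
plt.bar(x, team_a, width=0.3, label='Team A')
plt.bar(x + 0.3, team_b, width=0.3, label='Team B')  # shifted right by one bar width
plt.legend()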
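A common variation on the multiple bar chart is to centre each pair of bars on its tick and show category names instead of numeric positions. The sketch below assumes made-up run totals and interval labels; only plt.bar, plt.xticks, and plt.legend are used.

```python
import numpy as np
import matplotlib.pyplot as plt

# Hypothetical sample data: runs by two teams across five over intervals
intervals = ['1-10', '11-20', '21-30', '31-40', '41-50']
team_a = [65, 55, 70, 60, 80]
team_b = [50, 62, 58, 75, 66]

width = 0.3
x = np.arange(len(intervals))  # bar positions 0, 1, 2, 3, 4

# Shift each series by half a bar width so each pair is centred on its tick
bars_a = plt.bar(x - width / 2, team_a, width=width, label='Team A')
bars_b = plt.bar(x + width / 2, team_b, width=width, label='Team B')

plt.xticks(x, intervals)  # replace numeric positions with interval labels
plt.legend()
```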

Pie Chart

Basic Pie Chart:


slices = [50, 20, 15, 10]
departments = ['Sales', 'HR', 'Finance', 'Production']
plt.pie(slices, labels=departments, autopct='%1.1f%%', shadow=True)
plt.title('Department Distribution')
plt.show()

Exploded Pie Chart:


explode = [0, 0.2, 0, 0]  # Pull out the 'HR' slice
plt.pie(slices, explode=explode, labels=departments)
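Note that the slices in this example sum to 95, not 100. By default plt.pie scales each value by the total, so the percentages printed by autopct are shares of that total rather than the raw numbers. The arithmetic can be sketched in plain Python, reusing the same format string:

```python
slices = [50, 20, 15, 10]
departments = ['Sales', 'HR', 'Finance', 'Production']

total = sum(slices)  # 95

# Same format string as autopct='%1.1f%%', applied to each share of the total
labels = ['%1.1f%%' % (100 * s / total) for s in slices]

for name, label in zip(departments, labels):
    print(name, label)
# Sales 52.6%
# HR 21.1%
# Finance 15.8%
# Production 10.5%
```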

Histogram

Frequency Distribution:


ages = [22, 32, 35, 45, 55, 14, 26]
bins = [0, 10, 20, 30, 40, 50, 60]
plt.hist(ages, bins=bins, color='magenta', edgecolor='black')
plt.xlabel('Employee Age')
plt.ylabel('Number of Employees')
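To check what the histogram actually counts, it helps to look at the values plt.hist returns: the count in each bin, the bin edges, and the drawn bars. A short sketch using the same ages and bins:

```python
import matplotlib.pyplot as plt

ages = [22, 32, 35, 45, 55, 14, 26]
bins = [0, 10, 20, 30, 40, 50, 60]

# plt.hist returns the bar heights, the bin edges, and the drawn patches
counts, edges, patches = plt.hist(ages, bins=bins, color='magenta', edgecolor='black')

# Print one line per bin, e.g. "20-30: 2"
for left, right, count in zip(edges[:-1], edges[1:], counts):
    print(f'{left:.0f}-{right:.0f}: {int(count)}')
```

Here the 0-10 bin is empty, the 20-30 and 30-40 bins hold two employees each, and every other bin holds one.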

Frequency Polygon:


plt.hist(ages, bins=bins, histtype='step')  # Unfilled stepped outline, approximating a frequency polygon
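In the textbook sense, a frequency polygon joins the midpoint of each bin top with straight lines, whereas histtype='step' draws the stepped outline of the bars. A minimal sketch of the midpoint version, assuming NumPy is available and reusing the same ages and bins:

```python
import numpy as np
import matplotlib.pyplot as plt

ages = [22, 32, 35, 45, 55, 14, 26]
bins = [0, 10, 20, 30, 40, 50, 60]

# Count the values in each bin without drawing anything
counts, edges = np.histogram(ages, bins=bins)
midpoints = (edges[:-1] + edges[1:]) / 2  # 5, 15, ..., 55

# Join the (midpoint, count) pairs with straight line segments
plt.plot(midpoints, counts, marker='o')
plt.xlabel('Employee Age')
plt.ylabel('Number of Employees')
```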

Box Plot

Basic Box Plot:


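# val1, val2, val3 are each a sequence of numbers (one box per sequence)
data = [val1, val2, val3]
# In Matplotlib 3.9+, this parameter is named tick_labels; labels is deprecated
plt.boxplot(data, labels=['Series1', 'Series2', 'Series3'])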

Customized Box Plot:


plt.boxplot(data, patch_artist=True, notch=True)
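The box in a box plot spans the first to third quartile, with a line at the median. The sketch below draws three boxes from made-up sample data and computes the quartiles of the first series with np.percentile for comparison. The series names and values are assumptions for illustration, and the tick labels are set with plt.xticks rather than a boxplot argument.

```python
import numpy as np
import matplotlib.pyplot as plt

# Hypothetical sample data: marks for three groups of students
series1 = [35, 42, 50, 55, 61, 68, 90]
series2 = [20, 38, 45, 47, 52, 60, 64]
series3 = [48, 55, 58, 62, 70, 75, 81]
data = [series1, series2, series3]

# The box spans Q1 to Q3 and the inner line marks the median (Q2)
q1, median, q3 = np.percentile(series1, [25, 50, 75])
print(q1, median, q3)  # 46.0 55.0 64.5

# patch_artist=True draws filled boxes; boxes sit at x = 1, 2, 3 by default
result = plt.boxplot(data, patch_artist=True)
plt.xticks([1, 2, 3], ['Series1', 'Series2', 'Series3'])
```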

Scatter Plot

Basic Scatter Plot:


plt.scatter(x, y, color='red', marker='x')
plt.xlabel('Age')
plt.ylabel('Number of Employees')
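A scatter plot is most useful when both axes are measured quantities. The following sketch uses made-up experience and salary figures (not data from this article) and adds a Pearson correlation from np.corrcoef as one possible way to quantify the relationship the plot shows.

```python
import numpy as np
import matplotlib.pyplot as plt

# Hypothetical sample data: years of experience and monthly salary (thousands)
experience = [1, 2, 3, 4, 5, 6]
salary = [20, 24, 27, 33, 36, 41]

points = plt.scatter(experience, salary, color='red', marker='x')
plt.xlabel('Experience (years)')
plt.ylabel('Salary (thousands)')

# Pearson correlation: values near +1 or -1 indicate a strong linear relationship
r = np.corrcoef(experience, salary)[0, 1]
print(round(r, 3))
```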

Saving Plots


plt.savefig('path/to/file.pdf')  # Supports formats like PNG, PDF, SVG
plt.show()
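The snippet above calls savefig before show, which matters because the figure is typically released once its window is closed. The sketch below saves the same chart in two formats; the temporary directory and file names are assumptions chosen so that nothing on disk is overwritten.

```python
import os
import tempfile
import matplotlib.pyplot as plt

plt.plot([1, 2, 3], [2, 3.5, 5])
plt.title('Saved figure')

# Hypothetical output location: a fresh temporary directory
out_dir = tempfile.mkdtemp()
png_path = os.path.join(out_dir, 'chart.png')
svg_path = os.path.join(out_dir, 'chart.svg')

# The file extension selects the format; dpi and bbox_inches are optional
plt.savefig(png_path, dpi=150, bbox_inches='tight')
plt.savefig(svg_path)
```

A higher dpi gives a sharper raster image, and bbox_inches='tight' trims surplus whitespace around the chart.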

Key Takeaways

• Use plt.plot() for line charts.

• Customize charts with labels (xlabel, ylabel), titles (title), and legends (legend).

• Bar charts (bar, barh) are ideal for comparisons.

• Pie charts (pie) show proportions.

• Histograms (hist) display distributions.

• Box plots (boxplot) summarize data statistics.

• Scatter plots (scatter) reveal relationships between variables.
