Learn Histograms in Python with Matplotlib: Step-by-Step with Code

A histogram is like a bar chart's nerdy cousin: it doesn't just compare things, it shows how your numbers are spread out. Instead of labels like "Apples" or "Oranges," you get ranges like 0–10, 10–20, and so on. It's perfect for spotting patterns in things like test scores, survey results, or the ages of your favorite superheroes.

When Should You Use a Histogram?

Use a histogram when you want to understand how your data is spread out, like figuring out what range most test scores fall into, or how often certain temperatures occur. If your data is continuous (like age, marks, height, or time), a histogram is way more helpful than a pie chart or line graph.

Step-by-Step Code Example

Here's a simple example of how to create a histogram in Matplotlib:

Python
import matplotlib.pyplot as plt

# Sample data: Ages of 200 people
ages = [22, 25, 29, 30, 32, 33, 35, 36, 38, 40,
        42, 44, 45, 48, 50, 52, 55, 58, 60, 65] * 10

# Create the histogram
plt.hist(ages, bins=8, color='skyblue', edgecolor='black')

# Add titles and labels
plt.title("Age Distribution of Survey Participants")
plt.xlabel("Age Groups")
plt.ylabel("Number of People")

plt.show()

Output of the above code:

Histogram showing age distribution of survey participants

Understanding the Code Step-by-Step

Let's break down what's happening, one line at a time: no boring jargon, just clear logic.

import matplotlib.pyplot as plt

Grabs Matplotlib's plotting tools. Think of this like opening your paintbox before making a graph.

ages = [...] * 10

Our sample data: a list of 20 ages repeated 10 times to mimic a group of 200 people for a more realistic visualization.

plt.hist(ages, bins=8, color='skyblue', edgecolor='black')

This is where the histogram is actually made. bins=8 splits the span from the youngest to the oldest age into 8 equal-width ranges and counts how many ages fall into each one. color and edgecolor keep it neat and readable.

plt.title() / plt.xlabel() / plt.ylabel()

Adds context: a title and axis labels so the chart makes sense at a glance without anyone having to guess what the numbers mean.

plt.show()

The final line that says: all set, now render and display the chart.
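If you're curious which age ranges bins=8 actually produced, here is a small sketch that reuses the same sample data. It relies on the fact that plt.hist() returns the counts, the bin edges, and the drawn bars; the variable names (counts, bin_edges, patches) are just illustrative choices.

```python
import matplotlib.pyplot as plt

# Same sample data as the main example: 20 ages repeated 10 times
ages = [22, 25, 29, 30, 32, 33, 35, 36, 38, 40,
        42, 44, 45, 48, 50, 52, 55, 58, 60, 65] * 10

# plt.hist() returns (counts per bin, bin edges, the bar objects)
counts, bin_edges, patches = plt.hist(ages, bins=8, color='skyblue', edgecolor='black')

# Print each bin's range next to how many ages landed in it
for left, right, count in zip(bin_edges[:-1], bin_edges[1:], counts):
    print(f"{left:.1f} to {right:.1f}: {int(count)} people")
```

With 8 bins there are 9 edges, because every bar needs a left and a right boundary.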

Customize Your Histogram

Histograms don't have to be boring. You can tweak colors, borders, transparency, and bar style to make your data look clear and polished:

Python
import matplotlib.pyplot as plt

ages = [21, 23, 25, 22, 26, 29, 21, 25, 27, 30, 24, 23, 22, 28, 26]

plt.hist(
    ages,
    bins=8,             # Number of bars
    color='#4db6ac',    # Soft teal fill color
    edgecolor='white',  # White borders between bars
    linewidth=1.2,      # Border thickness
    alpha=0.85,         # Slight transparency
    histtype='bar'      # Try 'step' or 'barstacked' too!
)

plt.title("Age Distribution of Classmates")
plt.xlabel("Age")
plt.ylabel("Frequency")
plt.grid(True, linestyle='--', alpha=0.4)
plt.show()

Output of the above code:

Customized histogram with teal bars and grid lines
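To see what the histtype option changes, one possible sketch is to draw the same data three times side by side. The three-panel layout and the choice of 'bar', 'step', and 'stepfilled' are illustrative assumptions, not part of the example above.

```python
import matplotlib.pyplot as plt

ages = [21, 23, 25, 22, 26, 29, 21, 25, 27, 30, 24, 23, 22, 28, 26]

# One small chart per histtype, sharing the y-axis so heights are comparable
fig, axes = plt.subplots(1, 3, figsize=(12, 3.5), sharey=True)

for ax, style in zip(axes, ['bar', 'step', 'stepfilled']):
    ax.hist(ages, bins=8, color='#4db6ac', histtype=style)
    ax.set_title(f"histtype='{style}'")
    ax.set_xlabel("Age")

axes[0].set_ylabel("Frequency")
plt.tight_layout()
plt.show()
```

'step' draws only the outline, which tends to be handy when you overlay more than one dataset on the same axes.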

Mini Project: Visualize Exam Score Distribution

Let's say you've got the exam scores of 50 students. Build a histogram to visualize how those scores are spread out: who's topping, who's in the middle, and how many are lagging behind.

Bonus: try grouping scores into ranges like 0–20, 21–40, and so on, and label them smartly. Perfect for teachers, students analyzing trends, or just solid Python practice.
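Here is one way the grouping idea could look, as a rough starting point rather than the official solution. The scores are made up with a fixed random seed, and the edge list is an assumption: Matplotlib bins include their left edge but not their right one (except the last bin), so edges at 0, 21, 41, ... give whole-number groups of 0–20, 21–40, and so on.

```python
import random
import matplotlib.pyplot as plt

# Hypothetical data: 50 made-up whole-number scores between 0 and 100
random.seed(7)  # fixed seed so the same scores come out every run
scores = [random.randint(0, 100) for _ in range(50)]

# Custom bin edges: [0, 21) holds 0-20, [21, 41) holds 21-40, and so on
edges = [0, 21, 41, 61, 81, 101]
counts, bin_edges, _ = plt.hist(scores, bins=edges, color='skyblue', edgecolor='black')

# Put a readable range label under the middle of each bar
centers = [(lo + hi) / 2 for lo, hi in zip(edges[:-1], edges[1:])]
labels = [f"{lo}-{hi - 1}" for lo, hi in zip(edges[:-1], edges[1:])]
plt.xticks(centers, labels)

plt.title("Exam Score Distribution")
plt.xlabel("Score Range")
plt.ylabel("Number of Students")
plt.show()
```

Passing a list to bins tells Matplotlib exactly where each bar starts and ends, instead of letting it pick equal-width ranges for you.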

Bonus challenge: highlight the highest and lowest scoring bins with different colors using plt.bar() for custom segments, or add a vertical mean line with plt.axvline(x=mean_score, color='red').
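A possible sketch of the bonus challenge follows. Note the swap: instead of rebuilding the bars with plt.bar(), it recolors the bars that plt.hist() returns, which is a shorter route to the same look. The scores are hypothetical, and "lowest" and "highest" are read here as the first and last score ranges.

```python
import random
import statistics
import matplotlib.pyplot as plt

# Hypothetical data: 50 made-up scores between 0 and 100
random.seed(7)
scores = [random.randint(0, 100) for _ in range(50)]

# Five equal bins across the full 0-100 range; keep the bar objects
counts, bin_edges, patches = plt.hist(
    scores, bins=5, range=(0, 100), color='lightgray', edgecolor='black'
)

# Recolor the lowest-scoring and highest-scoring bins
patches[0].set_facecolor('tomato')
patches[-1].set_facecolor('seagreen')

# Vertical dashed line at the mean score
mean_score = statistics.mean(scores)
plt.axvline(x=mean_score, color='red', linestyle='--', label=f"Mean: {mean_score:.1f}")

plt.title("Exam Score Distribution")
plt.xlabel("Score")
plt.ylabel("Number of Students")
plt.legend()
plt.show()
```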

Common Histogram Mistakes to Avoid

Too few bins

Hides important patterns: clusters, peaks, and outliers all get swallowed into a handful of fat bars.

Too many bins

Makes the chart noisy: random fluctuations start looking like meaningful trends when they're not.

Using it for categorical data

Histograms are for numerical data only. If your data has labels like "Apple" or "Banana", use a bar chart instead. A histogram won't make sense here.

Forgetting axis labels

No labels means the reader has to guess what the numbers mean. Always add a title, x-axis label, and y-axis label. It takes 3 lines of code.

Comparing datasets with different bin sizes

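Inconsistent bins across plots lead to misleading comparisons. Use the same bin edges, not just the same bin count, whenever you're comparing two histograms side by side: the same count stretched over two different data ranges still gives bars of different widths.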

Ignoring outliers

Extreme values stretch the histogram and squash everything else. Always check for outliers before reading the distribution shape.
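For the side-by-side comparison mistake above, one way to keep bins consistent is to work out a single set of bin edges and hand it to both plots. This is a minimal sketch on made-up data: the two "classes" are randomly generated, and np.histogram_bin_edges is used here as one convenient way to get edges that cover both datasets.

```python
import numpy as np
import matplotlib.pyplot as plt

# Hypothetical scores for two classes (randomly generated for illustration)
rng = np.random.default_rng(0)
class_a = rng.normal(loc=60, scale=10, size=100)
class_b = rng.normal(loc=70, scale=15, size=100)

# One set of 10 bins spanning the combined range of both datasets
shared_edges = np.histogram_bin_edges(np.concatenate([class_a, class_b]), bins=10)

fig, (ax_a, ax_b) = plt.subplots(1, 2, figsize=(10, 4), sharex=True, sharey=True)

# Both histograms use the exact same edges, so bars line up one to one
counts_a, edges_a, _ = ax_a.hist(class_a, bins=shared_edges, color='skyblue', edgecolor='black')
counts_b, edges_b, _ = ax_b.hist(class_b, bins=shared_edges, color='#4db6ac', edgecolor='black')

ax_a.set_title("Class A")
ax_b.set_title("Class B")
ax_a.set_xlabel("Score")
ax_b.set_xlabel("Score")
ax_a.set_ylabel("Number of Students")
plt.tight_layout()
plt.show()
```

Sharing the x and y axes as well keeps the scales identical, so a taller bar really does mean more students.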

Frequently Asked Questions

A histogram shows how numerical data is distributed across ranges or intervals. In Matplotlib, you create one with plt.hist(data, bins=10): pass your data and it handles the grouping automatically.
A histogram displays continuous numerical data grouped into ranges, and the bars touch each other to show continuity. A bar graph compares separate categories with gaps between bars. Use a histogram for distributions, a bar graph for comparisons.
Use plt.hist(data, bins=10) and pass your data list plus the number of bins you want. Add plt.title(), plt.xlabel(), and plt.ylabel() for labels, then call plt.show() to display it.
bins controls how many bars your histogram has. Each bin is a range of values. Too few and you lose detail; too many and it gets noisy. Start with 8–15 bins and adjust from there based on how the chart looks.
Histograms are for continuous numerical data: ages, scores, temperatures, prices, heights. They're not suitable for categorical data like city names or product types. For those, use a bar chart instead.
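The bins trade-off mentioned above is easiest to judge by eye. As a rough sketch, the snippet below plots the survey ages from the first example with three different bin counts; the specific values 3, 8, and 40 are arbitrary picks for illustration.

```python
import matplotlib.pyplot as plt

# Same sample data as the first example
ages = [22, 25, 29, 30, 32, 33, 35, 36, 38, 40,
        42, 44, 45, 48, 50, 52, 55, 58, 60, 65] * 10

fig, axes = plt.subplots(1, 3, figsize=(12, 3.5))
results = {}  # bin count -> counts per bin, kept for inspection

for ax, n_bins in zip(axes, [3, 8, 40]):
    counts, _, _ = ax.hist(ages, bins=n_bins, color='skyblue', edgecolor='black')
    results[n_bins] = counts
    ax.set_title(f"bins={n_bins}")
    ax.set_xlabel("Age")

axes[0].set_ylabel("Number of People")
plt.tight_layout()
plt.show()
```

The left chart lumps everything into three fat bars, while the right one leaves gaps between thin bars, which is the "too few" versus "too many" problem in action.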
Palak Mishra · Backend developer · Building beginner-friendly Python docs at Kiddo