Statistics Made Easy: Mean, Median, Mode, and More for Data Science Beginners
Statistics is the backbone of data science. Without it, analyzing data would feel like trying to read a book in a language you don’t understand. Whether you’re just starting your journey into data science or simply refreshing your basics, understanding a few key statistical concepts can make things much clearer.
In this guide, we’ll walk through the most important topics every beginner should know: mean, median, mode, standard deviation, correlation, and probability basics.
The mean is what most people think of as the “average.”
How it works:
Add up all the values and then divide by how many values there are.
Example:
If five students score 70, 80, 90, 85, and 75:
Mean = (70 + 80 + 90 + 85 + 75) ÷ 5 = 80
Why it matters:
The mean summarizes a whole dataset in a single number. For example, if you want to know how much time users spend on an app, the mean tells you the typical time per user. Keep in mind that a few extreme values can pull the mean up or down.
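If you want to try this yourself, here is a minimal Python sketch that reproduces the five-student example. The variable names are just illustrative choices, and `statistics.mean` from the standard library is shown only as a cross-check on the manual calculation.

```python
from statistics import mean

# The five exam scores from the example above
scores = [70, 80, 90, 85, 75]

# Manual calculation: add up all the values, then divide by how many there are
manual_mean = sum(scores) / len(scores)

# The standard library does the same thing in one call
library_mean = mean(scores)

print(manual_mean)  # 80.0
```

Both approaches should agree, so in practice you would usually just call `mean`.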
The median is the middle number in a sorted list of values.
- If there’s an odd number of values, the middle one is the median.
- If there’s an even number, the median is the average of the two middle numbers.
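The two rules above can be sketched in Python as follows. The odd-length list reuses the five scores from the mean example. The sixth score (95) in the even-length list is a made-up value added purely for illustration, and `manual_median` is a hypothetical helper name.

```python
from statistics import median

def manual_median(values):
    """Return the middle value of a list, following the two rules above."""
    ordered = sorted(values)          # the median needs sorted data
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]           # odd count: take the middle value
    # even count: average the two middle values
    return (ordered[mid - 1] + ordered[mid]) / 2

odd_scores = [70, 80, 90, 85, 75]        # sorted: 70, 75, 80, 85, 90
even_scores = [70, 80, 90, 85, 75, 95]   # 95 is an assumed extra score

print(manual_median(odd_scores))   # 80
print(manual_median(even_scores))  # 82.5
```

The standard library's `median` function applies the same rules, so it is a handy way to check the result.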
Standard deviation (SD) shows how spread out the data is around the mean.
- A low SD means the values sit close to the mean (less variation).
- A high SD means the values are spread far from the mean (more variation).
Example:
- Class A scores: 80, 82, 81, 79, 83 → Low SD (everyone scored similarly).
- Class B scores: 50, 60, 70, 90, 100 → High SD (scores vary a lot).
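To put numbers on the Class A and Class B comparison, here is a small Python sketch. It assumes each class is treated as the whole population, which is why it uses `statistics.pstdev`. If the scores were only a sample from a larger group, `statistics.stdev` would be the usual choice and would give slightly larger values.

```python
from statistics import pstdev

class_a = [80, 82, 81, 79, 83]
class_b = [50, 60, 70, 90, 100]

# Population standard deviation: the square root of the average
# squared distance from the mean
sd_a = pstdev(class_a)
sd_b = pstdev(class_b)

print(round(sd_a, 2))  # 1.41
print(round(sd_b, 2))  # 18.55
```

Class B's SD comes out roughly thirteen times larger than Class A's, which matches the intuition that its scores vary far more.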
Correlation describes how two variables move in relation to each other.
- Positive correlation: As one variable goes up, the other also goes up (e.g., hours studied and exam scores).
- Negative correlation: As one variable goes up, the other goes down (e.g., product price and number of buyers).
- No correlation: The two variables show no relationship (e.g., shoe size and intelligence).
Examples:
- Height and weight → positive correlation.
- Age of a car and resale value → negative correlation.
- Coffee consumption and favourite colour → no correlation.
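One common way to measure correlation is the Pearson correlation coefficient, which ranges from -1 (perfect negative) through 0 (none) to +1 (perfect positive). The sketch below computes it by hand. Every number in it is invented for illustration, not real data, and `pearson_r` is a hypothetical helper name.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation coefficient for two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # How the two variables vary together
    co_variation = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # How much each variable varies on its own
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return co_variation / sqrt(var_x * var_y)

# Made-up data: more hours studied, higher exam scores
hours_studied = [1, 2, 3, 4, 5]
exam_scores = [52, 60, 68, 75, 85]

# Made-up data: higher price, fewer buyers
price = [10, 20, 30, 40, 50]
buyers = [95, 80, 62, 45, 30]

r_positive = pearson_r(hours_studied, exam_scores)
r_negative = pearson_r(price, buyers)

print(r_positive)  # close to +1
print(r_negative)  # close to -1
```

A value near zero would suggest no linear relationship. Remember that correlation alone does not show that one variable causes the other.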
Key takeaways:
- Mean, median, and mode help summarize data.
- Standard deviation shows how spread out the data is.
- Correlation highlights relationships between variables.
- Probability helps us measure uncertainty.
Written by shreyashri
Last updated 5 September 2025
