Course: MA121: Introduction to Statistics

Course Introduction

If you invest in financial markets, you may want to predict the price of a stock six months from now based on company performance measures and other economic factors. As a college student, you may be interested in how the mean starting salary of a college graduate depends on GPA. These are just two examples of how statistics is used in modern society. To obtain the desired information in each example, you need data to analyze.
The purpose of this course is to introduce you to the subject of statistics as a science of data. Data abounds in this information age; extracting useful knowledge from complex data sets and gaining a sound understanding of them is the greater challenge. In this course, we will focus on the fundamentals of statistics, broadly described as the techniques to collect, classify, summarize, organize, analyze, and interpret numerical information.
This course will begin with a brief overview of the discipline of statistics and will then quickly turn to descriptive statistics, introducing graphical methods of describing data. You will learn about combinatorial probability and probability distributions, which are the foundation for statistical inference. Within inference, we will focus on estimation and hypothesis testing. We will also examine regression, a set of techniques for studying the relationship between two or more variables.
By the end of this course, you should understand what statistics represent, how to use statistics to organize and display data, and how to draw valid inferences based on data by using appropriate statistical tools.
- Course Syllabus (Page)
Unit 1: Statistics and Data
Today, we have access to large volumes of data. The first step of data analysis is to summarize this data accurately, both graphically and numerically, so that we can understand what it reveals. Using and interpreting data correctly is essential to making informed decisions. For instance, when you see an opinion survey about a certain TV program, you may be interested in the proportion of respondents who like the program. In this unit, you will learn about the descriptive statistics used to summarize and display data. After completing this unit, you will know how to present your findings once you have collected data. For example, suppose you want to buy a new mobile phone with a particular type of camera. You are unsure about the prices of phones with this feature, so you visit a website that offers a sample data set of prices for phones with your desired features. Looking at every price in a sample can be confusing. A better way to compare the data might be to look at the median price and the variation in prices. The median and the variation are two of several ways to describe data.
You can also graph the data so that the price distribution is easier to see. This is precisely what you will study in this unit: numerical and graphical ways to describe and display your data. You will learn the essentials of calculating common descriptive statistics that measure center, variability, and skewness in data, and you will learn to interpret these measurements and graphs. Descriptive statistics are, as their name suggests, descriptive: they illustrate what the data shows and do not generalize beyond the data considered. Numerical descriptive measures computed from sample data are called statistics, while numerical descriptive measures of a population are called parameters. Inferential statistics, in contrast, generalizes the findings from sample data to a broader population.
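As a minimal sketch of the phone-price example, the Python snippet below computes a few common descriptive statistics with the standard library. The price list is hypothetical and chosen only for illustration; it is not course data.

```python
import statistics

# Hypothetical sample of phone prices in dollars (illustrative, not course data).
prices = [299, 349, 349, 399, 449, 499, 899]

mean_price = statistics.mean(prices)      # arithmetic average
median_price = statistics.median(prices)  # middle value of the sorted prices
stdev_price = statistics.stdev(prices)    # sample standard deviation (divides by n - 1)
price_range = max(prices) - min(prices)   # simplest measure of spread

print(f"mean   = {mean_price:.2f}")
print(f"median = {median_price}")
print(f"stdev  = {stdev_price:.2f}")
print(f"range  = {price_range}")
```

Because one phone in this made-up sample is much more expensive than the rest, the mean is pulled above the median, which is one reason the median can be the better summary of a typical price when data is skewed.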
Completing this unit should take you approximately 3 hours.
- Unit 1 Learning Outcomes (Page)
- Final Exam (Quiz)
1.1: The Science of Statistics
- What is Statistics? (Book)
1.2: Descriptive and Inferential Statistics
- Descriptive and Inferential Statistics (Book)
- Basic Definitions and Concepts (Book)
1.3: Types of Data and Their Collection
- Variables and Data Collection (Book)
- Presenting Data (Page)
1.4: Graphical Methods for Describing Data
- Graphing (Book)
- Three Popular Data Displays (Book)
1.5: Measures of Central Tendency and Variability
- Numerical Measures of Central Tendency and Variability (Book)
- Measures of Central Location (Book)
- Mean, Median, Mode, and Variance (Page)
1.6: Methods for Describing Relative Standing
- Percentiles (Book)
1.7: Methods for Describing Bivariate Relationships
- Scatterplots and Bivariate Data (Page)
- Pearson's r (Book)
Unit 1 Assessment
- Unit 1 Assessment (Quiz)
Unit 2: Elements of Probability and Random Variables
Probabilities affect our everyday lives. In this unit, you will learn about probability and its properties, how probability behaves, and how to calculate and use it. You will study the fundamentals of probability and work through examples covering different types of probability questions. These basic probability concepts will provide a foundation for understanding more advanced statistical concepts, such as interpreting polling results. Though you may have already encountered concepts of probability, after this unit you will be able to calculate, formally and precisely, the likelihood of an event occurring given certain constraints. Probability theory is a discipline that was created to deal with chance phenomena. For instance, before undergoing surgery, a patient wants to know the chances that the surgery might fail; before taking medication, you want to know the chances of side effects; before leaving your house, you want to know the chances that it will rain today.

Probability is a measure of likelihood that takes on values between 0 and 1, inclusive, with 0 representing impossible events and 1 representing certainty. The chances of events occurring fall between these two values. The skill of calculating probability allows us to make better decisions. Whether you are evaluating how likely it is to get more than 50% of the questions correct on a quiz if you guess randomly; predicting the chance that the next storm will arrive by the end of the week; or exploring the relationship between the number of hours students spend at the gym and their performance on an exam, an understanding of the fundamentals of probability is crucial.
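
The quiz-guessing question above can be sketched in a few lines of Python. The quiz size (10 questions) and the number of answer choices (4 per question) are assumptions made for illustration, and the calculation treats each guess as an independent trial, which is the binomial model covered later in this unit.

```python
from math import comb

def binomial_pmf(k: int, n: int, p: float) -> float:
    """Probability of exactly k successes in n independent trials."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

# Assumed quiz: 10 questions, 4 answer choices each, pure guessing.
n_questions = 10
p_correct = 0.25

# "More than 50%" means at least 6 of the 10 questions correct.
min_correct = n_questions // 2 + 1
prob_pass = sum(
    binomial_pmf(k, n_questions, p_correct)
    for k in range(min_correct, n_questions + 1)
)

print(f"P(more than half correct) = {prob_pass:.4f}")
```

Under these assumptions the probability is about 0.02, so random guessing is very unlikely to produce more than half of the answers correct.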

We will also talk about random variables. A random variable describes the outcomes of a random experiment, and its values can vary with each repetition of the experiment. Intuitively, a random variable summarizes a chance phenomenon and takes on values with certain probabilities. A frequency distribution, by comparison, describes the number of times each possible outcome occurs in a sample. A random variable can be classified as discrete or continuous, depending on the values it assumes. Suppose you count the number of people who go to a coffee shop between 4 p.m. and 5 p.m. and record how long each of them waits. In this case, the number of people is an example of a discrete random variable, and the waiting time is an example of a continuous random variable.
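
The coffee-shop example can be illustrated with a small simulation. This is only a sketch: the arrival rate (20 customers per hour on average), the mean waiting time (3 minutes), and the use of exponential distributions are all assumptions chosen for illustration.

```python
import random

random.seed(0)  # fixed seed so the sketch is repeatable

# Assumed model: customers arrive at random, on average 20 per hour,
# with exponentially distributed gaps between arrivals (time in hours).
rate_per_hour = 20.0

arrival_times = []
t = random.expovariate(rate_per_hour)
while t < 1.0:                      # keep arrivals within the one-hour window
    arrival_times.append(t)
    t += random.expovariate(rate_per_hour)

# Discrete random variable: a count, so it can only be 0, 1, 2, ...
num_customers = len(arrival_times)

# Continuous random variable: a waiting time in minutes can be any
# non-negative real number (assumed mean of 3 minutes).
waiting_minutes = [random.expovariate(1 / 3.0) for _ in range(num_customers)]

print("customers in the hour:", num_customers)
print("first waiting times  :", [round(w, 2) for w in waiting_minutes[:3]])
```

Rerunning the simulation with a different seed gives a different count and different waiting times, which is what it means for these quantities to be random variables.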

Completing this unit should take you approximately 3 hours.
- Unit 2 Learning Outcomes (Page)
2.1: Events, Sample Spaces, and Probability
- Introduction to Probability (Book)
- Basic Concepts of Probability (Book)
2.2: Counting Rules
- Permutations and Combinations (Book)
- The Addition Rule for Probability with a Venn Diagram Example (Page)
2.3: Common Discrete Random Variables
- Random Variables and Probability Distributions (Book)
- Binomial Distributions (Page)
- Binomial, Poisson, and Multinomial Distributions (Book)
2.4: Normal Distribution
- The Standard Normal Distribution (Book)
- More on Normal Distributions (Book)
- Introduction to the Normal Distribution (Page)
Unit 2 Assessment
- Unit 2 Assessment (Quiz)
Unit 3: Sampling Distributions
The concept of a sampling distribution lies at the very foundation of statistical inference, and it is best introduced with an example. Suppose you want to estimate a population parameter, say the population mean. There are two natural estimators: (1) the sample mean, which is the average value of the data set; and (2) the sample median, which is the middle number when the measurements are arranged in ascending (or descending) order. In particular, for a sample of even size n, the median is the mean of the middle two numbers. But which estimator is better, and in what sense? Answering this question involves repeated sampling: you want to choose the estimator that would do better on average.
Different samples may give different sample means and medians; some may be closer to the truth than others. Consequently, we cannot compare these two sample statistics, or any others, based on their performance with a single sample. Instead, you should recognize that sample statistics are random variables; each therefore has a probability distribution, called its sampling distribution, which is obtained by considering all possible samples. In this unit, you will study the sampling distributions of several sample statistics. This unit will also show you how the central limit theorem can help to approximate sampling distributions in general.
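The idea of comparing estimators over repeated samples can be sketched with a simulation. The population (normal, mean 50, standard deviation 10), the sample size, and the number of repetitions below are assumptions chosen for illustration; a simulation only approximates the sampling distribution, since it cannot enumerate all possible samples.

```python
import random
import statistics

random.seed(1)  # fixed seed so the sketch is repeatable

# Assumed population: normal with mean 50 and standard deviation 10.
POP_MEAN, POP_SD = 50.0, 10.0
SAMPLE_SIZE = 25
NUM_SAMPLES = 2000

sample_means = []
sample_medians = []
for _ in range(NUM_SAMPLES):
    sample = [random.gauss(POP_MEAN, POP_SD) for _ in range(SAMPLE_SIZE)]
    sample_means.append(statistics.mean(sample))
    sample_medians.append(statistics.median(sample))

# Spread of each estimator across the repeated samples.
sd_of_means = statistics.stdev(sample_means)
sd_of_medians = statistics.stdev(sample_medians)

print(f"average of sample means   = {statistics.mean(sample_means):.2f}")
print(f"average of sample medians = {statistics.mean(sample_medians):.2f}")
print(f"spread of sample means    = {sd_of_means:.2f}")
print(f"spread of sample medians  = {sd_of_medians:.2f}")
```

Both estimators center near the population mean, but for this normal population the sample means vary less from sample to sample than the sample medians do. That smaller spread is one sense in which the sample mean does better on average here; the comparison can come out differently for other populations.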
Completing this unit should take you approximately 2 hours.
- Unit 3 Learning Outcomes (Page)
3.1: Sampling Distributions
- Continuous Random Variables (Book)
- Introduction to Sampling Distributions (Book)
3.2: The Sampling Distribution of Sample Mean
- The Sampling Distribution of a Sample Mean (Book)
- The Mean, Standard Deviation, and Sampling Distribution of the Sample Mean (Book)
- Sampling Distribution (Page)
3.3: The Sampling Distribution of Pearson's r
- Sampling Distribution of r (Book)
3.4: The Sampling Distribution of the Sample Proportion
- Sampling Distribution of p (Book)
- Standard Deviation (Page)
Unit 3 Assessment
- Unit 3 Assessment (Quiz)
Unit 4: Estimation with Confidence Intervals
In this unit, you will learn how to use the central limit theorem and confidence intervals, which enable you to estimate unknown population parameters. The central limit theorem allows us to make inferences from samples of non-normal populations. This theorem states that, for any population with a finite variance, the sampling distribution of the sample mean approaches a normal distribution as the sample size increases. This powerful theorem allows us to assume that, given a large enough sample, the sampling distribution of the mean will be approximately normal. You will also learn about confidence intervals, which provide a way to estimate a population parameter. Instead of giving a single-number estimate of a parameter, a confidence interval gives a range of plausible values for it. This is useful because point estimates vary from sample to sample, so an interval with a stated confidence level is more informative than a single point estimate. After completing this unit, you will know how to construct such confidence intervals and interpret their level of confidence.
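A large-sample confidence interval for a mean can be sketched as follows. The data here is simulated from an assumed skewed population purely for illustration, and the interval uses the normal approximation that the central limit theorem justifies for large samples; for small samples the t distribution covered in this unit would be the usual choice.

```python
import math
import random
import statistics

random.seed(2)  # fixed seed so the sketch is repeatable

# Assumed data: 100 draws from a skewed (exponential) population whose
# true mean is 2.0. In practice the sample would come from real observations.
sample = [random.expovariate(0.5) for _ in range(100)]

n = len(sample)
x_bar = statistics.mean(sample)   # point estimate of the population mean
s = statistics.stdev(sample)      # sample standard deviation

confidence = 0.95
# Critical value z such that P(-z < Z < z) = 0.95 for a standard normal Z.
z = statistics.NormalDist().inv_cdf(1 - (1 - confidence) / 2)

margin = z * s / math.sqrt(n)     # margin of error
lower, upper = x_bar - margin, x_bar + margin

print(f"point estimate = {x_bar:.3f}")
print(f"95% interval   = ({lower:.3f}, {upper:.3f})")
```

The 95% refers to the procedure: if many samples were drawn and an interval built from each in this way, roughly 95% of those intervals would be expected to contain the true mean.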
Completing this unit should take you approximately 1 hour.
- Unit 4 Learning Outcomes (Page)
4.1: Point Estimators and Their Characteristics
- Basic Sample Statistics and Parameters (Book)
- Characteristics of Estimators (Book)
4.2: Confidence Intervals
- Confidence Intervals for the Mean (Book)
- Demonstration: Confidence Intervals for a Mean (Page)
- Confidence Interval Simulation (Book)
- t Distribution Demonstration (Page)
- Comparing Normal and Student's t-Distributions (Page)
4.3: Confidence Intervals for Correlation and Proportion
- Confidence Intervals for Correlation and Proportion (Book)
- Confidence Intervals (Page)
Unit 4 Assessment
- Unit 4 Assessment (Quiz)
Unit 5: Hypothesis Test
A hypothesis test involves collecting and evaluating data from a sample. The data is then used to decide whether or not it supports a claim about the population. This unit will teach you how to conduct hypothesis tests and how to identify and differentiate between the errors associated with them. Often, you need answers to questions in order to make sound decisions. For example, a restaurant owner might claim that his restaurant's food costs 30% less than food at other restaurants in the area, or a phone company might claim that its phones last at least one year longer than phones from other companies. To decide whether it would be more affordable to eat at the restaurant that "costs 30% less" or at another restaurant in the area, or to choose a phone company based on durability, you must collect data to evaluate these claims.
The process of hypothesis testing is a way of making decisions. In this unit, you will learn to state assumptions as null and alternative hypotheses. The null hypothesis is the hypothesis that is assumed to be true at the outset and that you hope to nullify. In contrast, the alternative hypothesis is the research hypothesis you claim to be true. You therefore need to conduct the appropriate test to decide whether to reject or fail to reject the null hypothesis. You will learn how to compare sample characteristics to see whether there is enough evidence to reject the null hypothesis.
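The decision process can be sketched with a large-sample test of a single mean. All numbers below are hypothetical summary statistics chosen for illustration, and the test uses a two-sided normal (z) approximation; other situations call for the other tests covered in this unit.

```python
import math
from statistics import NormalDist

# Hypothetical summary statistics (illustrative, not course data).
mu_0 = 30.0        # mean claimed under the null hypothesis H0
x_bar = 28.5       # observed sample mean
s = 4.0            # observed sample standard deviation
n = 64             # sample size, large enough for a normal approximation
alpha = 0.05       # significance level chosen before looking at the data

# Test statistic: how many standard errors the sample mean is from mu_0.
z = (x_bar - mu_0) / (s / math.sqrt(n))

# Two-sided p-value: probability of a result at least this extreme if H0 is true.
p_value = 2 * (1 - NormalDist().cdf(abs(z)))

reject_null = p_value < alpha
print(f"z = {z:.2f}, p-value = {p_value:.4f}")
print("reject H0" if reject_null else "fail to reject H0")
```

With these made-up numbers the sample mean lies three standard errors below the claimed mean, so the p-value falls well under 0.05 and the null hypothesis is rejected. A p-value above the significance level would mean failing to reject the null hypothesis, which is not the same as proving it true.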
Completing this unit should take you approximately 3 hours.
- Unit 5 Learning Outcomes (Page)
5.1: Setting Up Hypotheses
- Setting Up Hypotheses (Book)
5.2: Interpreting Hypothesis Testing Results
- The Observed Significance of a Test (Book)
- Results (Book)
- Hypothesis Testing with One Sample (Book)
- More on Hypothesis Testing (Page)
5.3: Hypothesis Testing and Confidence Intervals
- Steps and Confidence Intervals in Hypothesis Testing (Book)
- More on Hypothesis Testing - II (Page)
5.4: Testing Single Mean
- Single Mean (Book)
- Sample Tests for a Population Mean (Book)
5.5: Testing the Difference between Two Means
- The Difference between Two Means (Book)
- Difference of Means (Page)
5.6: Chi-Square Distribution
- Contingency Tables (Book)
- Chi-Square Distributions and Goodness of Fit (Book)
- More on Chi-square Distributions (Page)
5.7: Comparing the Proportions of Populations
- Comparing Population Proportions (Page)
Unit 5 Assessment
- Unit 5 Assessment (Quiz)
Unit 6: Linear Regression
In this unit, we will discuss situations where the mean of a population, treated as a variable, depends on the value of another variable. One of the main reasons we conduct such analyses is to understand how two variables are related. The most common type of relationship is a linear relationship. For example, you may want to know what happens to one variable when you increase or decrease the other, and to answer questions such as, "Does one variable increase as the other increases, or does it decrease?" You may, for instance, want to determine how the mean reaction time of rats depends on the amount of drug in the bloodstream.
You will also learn to measure the degree of a relationship between two or more variables. Both correlation and regression are tools for studying how variables are related. Correlation quantifies the strength and direction of the linear relationship between two variables in existing data. Regression, on the other hand, models the linear relationship between an independent variable and a dependent variable, and it can be used to predict the value of the dependent variable when the value of the independent variable is known.
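The difference between the two tools can be sketched on a small data set. The paired values below are hypothetical and chosen only for illustration; the snippet computes Pearson's correlation coefficient and the least-squares regression line directly from their definitions.

```python
import math

# Hypothetical paired observations (illustrative, not course data):
# x = drug dose, y = reaction time.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]

n = len(x)
x_mean = sum(x) / n
y_mean = sum(y) / n

# Sums of squares and cross-products about the means.
s_xx = sum((xi - x_mean) ** 2 for xi in x)
s_yy = sum((yi - y_mean) ** 2 for yi in y)
s_xy = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))

r = s_xy / math.sqrt(s_xx * s_yy)      # Pearson correlation coefficient
slope = s_xy / s_xx                    # least-squares slope
intercept = y_mean - slope * x_mean    # least-squares intercept

def predict(x_new: float) -> float:
    """Predicted y for a given x from the fitted line."""
    return intercept + slope * x_new

print(f"r = {r:.4f}")
print(f"fitted line: y = {intercept:.2f} + {slope:.2f} x")
print(f"prediction at x = 6: {predict(6.0):.2f}")
```

The correlation coefficient summarizes how tightly the existing points follow a line, while the fitted slope and intercept give a rule for prediction. Predicting at x = 6 extrapolates beyond the observed doses, so such a value should be treated with caution.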
Completing this unit should take you approximately 3 hours.
- Unit 6 Learning Outcomes (Page)
6.1: Scatter Plot of Two Variables and Regression Line
- Introduction to Linear Regression (Book)
- Linear Regression (Book)
6.2: Correlation Coefficient
- Correlation (Book)
- The Linear Correlation Coefficient (Book)
6.3: Sums of Squares
- Partitioning Sums of Squares (Book)
- Regression Lines (Page)
6.4: Fitting the Model
- Standard Error of the Estimate (Book)
- Inferential Statistics for b and r (Book)
- Statistical Inference about Slope (Book)
6.5: Influential Observations
- Influential Observations (Book)
- A Complete Example (Book)
6.6: ANOVA
This optional subunit will teach you about analysis of variance (abbreviated ANOVA), which is used for hypothesis tests that compare more than two means. ANOVA examines the amount of variability in the y variable and tries to determine where that variability comes from. You will study the simplest form of ANOVA, called single-factor or one-way ANOVA. Finally, you will briefly study the F distribution, which is used for ANOVA and for the test of two variances.
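The way one-way ANOVA splits variability into parts can be sketched by computing the F statistic by hand. The three groups below are hypothetical and chosen so that the arithmetic is easy to follow.

```python
# Hypothetical measurements for three groups (illustrative, not course data).
groups = {
    "A": [4.0, 5.0, 6.0],
    "B": [6.0, 7.0, 8.0],
    "C": [8.0, 9.0, 10.0],
}

all_values = [v for values in groups.values() for v in values]
grand_mean = sum(all_values) / len(all_values)

# Between-group sum of squares: variability of group means around the grand mean.
ss_between = sum(
    len(values) * (sum(values) / len(values) - grand_mean) ** 2
    for values in groups.values()
)

# Within-group sum of squares: variability of observations around their group mean.
ss_within = sum(
    (v - sum(values) / len(values)) ** 2
    for values in groups.values()
    for v in values
)

df_between = len(groups) - 1
df_within = len(all_values) - len(groups)

# F compares the between-group mean square with the within-group mean square.
f_statistic = (ss_between / df_between) / (ss_within / df_within)
print(f"F = {f_statistic:.2f} with ({df_between}, {df_within}) degrees of freedom")
```

A large F value indicates that the group means differ by more than the variation within groups would suggest. The Python standard library does not include the F distribution, so turning this statistic into a p-value would require an F table or a library such as SciPy.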
- ANOVA (Page)
- More on ANOVA (Book)
Unit 6 Assessment
- Unit 6 Assessment (Quiz)
Study Guide
This study guide will help you get ready for the final exam. It discusses the key topics in each unit, walks through the learning outcomes, and lists important vocabulary terms. It is not meant to replace the course materials!
- MA121 Study Guide (Book)
Certificate Final Exam
Take this exam if you want to earn a free Course Completion Certificate.

To receive a free Course Completion Certificate, you will need to earn a grade of 70% or higher on this final exam. Your grade for the exam will be calculated as soon as you complete it. If you do not pass the exam on your first try, you can take it again as many times as you want, with a 7-day waiting period between each attempt. Once you pass this final exam, you will be awarded a free Course Completion Certificate.
- MA121: Certificate Final Exam (Quiz)
Course Feedback Survey
Please take a few minutes to give us feedback about this course. We appreciate your feedback, whether you completed the whole course or even just a few resources. Your feedback will help us make our courses better, and we use your feedback each time we make updates to our courses. If you come across any urgent problems, email contact@saylor.org.
- Course Feedback Survey (URL)