• Unit 4: Applied Statistics in Python

    Because data science often involves making statistical inferences from data, many of the upcoming units apply calculations rooted in probability and statistics. This unit is foundational: it covers ways of generating random data, computing basic statistical measures, and performing statistical analyses in Python. When you finish this unit, you will be able to implement and apply Python methods from the scipy.stats module.

    You have already seen that the random module can generate scalar random numbers and that numpy can generate arrays of random numbers. We will also find that many numpy methods extend quite naturally to the pandas module we will introduce in the next unit. Additionally, the scipy.stats module allows for statistical modeling and parameter calculations. These Python implementations will serve as a foundation for the more sophisticated methods we will use later in the course.
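
    As a rough preview of how these three modules relate, the following minimal sketch draws one scalar with random, an array with numpy, and then estimates distribution parameters with scipy.stats. The seed, the normal distribution, and its parameters (mean 5, standard deviation 2, 1000 draws) are arbitrary assumptions chosen for illustration and are not taken from the course materials.

```python
import random

import numpy as np
from scipy import stats

# Seed both generators so the sketch is reproducible (the seed value is arbitrary).
random.seed(0)
rng = np.random.default_rng(0)

# The random module returns one scalar per call.
x = random.gauss(0.0, 1.0)

# numpy returns a whole array of draws in a single call.
# Assumed parameters: mean 5, standard deviation 2, 1000 draws.
samples = rng.normal(loc=5.0, scale=2.0, size=1000)

# Basic statistical measures of the sample.
sample_mean = samples.mean()
sample_std = samples.std(ddof=1)  # ddof=1 gives the unbiased sample estimate

# scipy.stats estimates the parameters of a normal distribution from the data.
fitted_loc, fitted_scale = stats.norm.fit(samples)

print("scalar draw:", x)
print("array shape:", samples.shape)
print("sample mean and std:", sample_mean, sample_std)
print("fitted loc and scale:", fitted_loc, fitted_scale)
```

    The fitted location should match the sample mean, while the fitted scale is slightly smaller than the ddof=1 standard deviation because the maximum-likelihood fit divides by n rather than n - 1.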

    Completing this unit should take you approximately 13 hours.

    • 4.1: Basic Statistical Measures and Distributions

    • 4.2: Random Numbers in numpy

    • 4.3: The scipy.stats Module

    • 4.4: Data Science Applications

    • Unit 4 Assessment

      • Receive a grade