• Unit 5: The pandas Module

    This unit introduces the pandas module, which extends numpy-style array operations to dataframes: labeled, tabular structures that can hold data such as the contents of a spreadsheet. When you finish this unit, you will be able to process pandas dataframes and perform the calculations outlined in the earlier units.

    The pandas module contains many methods that greatly simplify processing the data files encountered in data science. Data collected from real-world situations is often messy and can contain observations you might want to discard, such as rows with missing values. The pandas module offers methods for handling these cases. We will also discuss methods for visualizing data using pandas.
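
    As a preview of what this looks like in practice, here is a minimal sketch. The column names and values are made up for illustration and do not come from the course materials. It builds a small dataframe, discards a row with a missing value, and applies a numpy-style calculation to a whole column.

```python
import numpy as np
import pandas as pd

# Hypothetical spreadsheet-like data; names and values are illustrative only.
df = pd.DataFrame({
    "city": ["Austin", "Boston", "Chicago", "Denver"],
    "temp_c": [31.0, np.nan, 24.5, 19.5],
})

# Discard observations (rows) that contain a missing value.
clean = df.dropna()

# Vectorized arithmetic works on an entire column, as it does on a numpy array.
clean = clean.assign(temp_f=clean["temp_c"] * 9 / 5 + 32)

# Summary statistics are available as methods on a column.
mean_temp_c = clean["temp_c"].mean()
print(clean)
print("Mean temperature (C):", mean_temp_c)
```

    Each of these steps is covered in more detail in the subunits listed below. If matplotlib is installed, a call such as `clean.plot(x="city", y="temp_c", kind="bar")` would draw a simple chart from the same dataframe.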

    Completing this unit should take you approximately 4 hours.

    • 5.1: Dataframes

    • 5.2: Data Cleaning

    • 5.3: pandas Operations: Merge, Join, and Concatenate

    • 5.4: Data Input and Output

    • 5.5: Visualization Using the pandas Module

    • Unit 5 Assessment
