5.2: Data Cleaning
Data cleaning is one of the initial steps in the data science pipeline. In practical applications, data is rarely collected in pristine form, so the resulting dataframe can contain anomalies: missing cells, cells with nonsensical values, and so on. The pandas module offers several methods for detecting and handling these cases.
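As a minimal sketch of what this looks like in practice, the example below builds a small dataframe with one missing age, one impossible age, and one missing city, then cleans it with `isna`, `where`, `dropna`, and `fillna`. The data, the column names, and the 0 to 120 age range are assumptions made up for illustration. Whether to drop or fill missing cells depends on the dataset.

```python
import numpy as np
import pandas as pd

# Hypothetical data: one missing age, one impossible age, one missing city.
df = pd.DataFrame({
    "name": ["Ana", "Ben", "Cal", "Dee"],
    "age": [34.0, np.nan, -5.0, 41.0],
    "city": ["Lima", "Oslo", None, "Pune"],
})

# Count the missing cells in each column.
missing_counts = df.isna().sum()
print(missing_counts)

# Treat nonsensical ages as missing. The 0-120 range is an assumed rule;
# where() keeps values that satisfy the condition and sets the rest to NaN.
cleaned = df.copy()
cleaned["age"] = cleaned["age"].where(cleaned["age"].between(0, 120))

# Option 1: drop every row that still has a missing cell.
dropped = cleaned.dropna()
print(dropped)

# Option 2: keep all rows and fill the gaps column by column,
# using the median age and a placeholder label for the city.
filled = cleaned.fillna({"age": cleaned["age"].median(), "city": "unknown"})
print(filled)
```

Dropping rows is the simpler choice but discards the valid cells in those rows. Filling keeps every row at the cost of introducing estimated values.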
The accompanying videos give a few more key examples of applying data cleaning methods. They are meant to serve as a summary and review of all the pandas concepts we have discussed in this unit.