
Learn data science using the Python programming language by looking at data processing, data analysis, visualization, data mining, and statistical models. By the end of this course, you will be able to implement Python code for these data science topics.
This course balances a survey of the many methods used in data science with the Python programming techniques needed to implement them. Problem-solving and programming implementation are emphasized throughout, and every technique is introduced through a real-world programming example. A major goal is that, by the time you finish, you will have the programming and conceptual expertise you need to join the field of data science.
The course introduces several Python modules that are useful for data analysis, data visualization, and data mining, including pandas, scikit-learn, scipy.stats, and statsmodels. It moves gradually from introductory topics, such as a review of Python, matrix operations, and statistics, to applications and programs involving data mining, visualization, statistical models, and time series analysis.
The course is organized into the following units:
- Unit 1: What is Data Science?
- Unit 2: Python for Data Science
- Unit 3: The numpy Module
- Unit 4: Applied Statistics in Python
- Unit 5: The pandas Module
- Unit 6: Visualization
- Unit 7: Data Mining I – Supervised Learning
- Unit 8: Data Mining II – Clustering Techniques
- Unit 9: Data Mining III – Statistical Modeling
- Unit 10: Time Series Analysis
Upon successful completion of this course, you will be able to:
- Use Google Colaboratory notebooks to implement and test Python programs;
- Explain how Python programming is relevant to data science;
- Construct and operate on arrays using the numpy module;
- Apply Python modules for basic statistical computation;
- Construct and operate on dataframes using the pandas module;
- Apply the pandas module to interact with spreadsheet software;
- Implement Python scripts for visualization using arrays and dataframes;
- Apply the scikit-learn module to perform data mining;
- Explain techniques for supervised and unsupervised learning;
- Apply supervised learning techniques;
- Apply unsupervised learning techniques;
- Apply the scikit-learn module to build statistical models;
- Implement Python scripts to perform regression analyses;
- Apply the statsmodels module to build and analyze models for time series analysis; and
- Explain similarities and differences between AR, MA, and ARIMA models.
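
As a rough illustration of the last outcome, the sketch below contrasts the three model families on simulated data. It is a minimal sketch that uses only the Python standard library rather than statsmodels; the helper names (`simulate_ar1`, `simulate_ma1`, `difference`, `autocorr`) and the coefficient 0.8 are assumptions chosen for illustration, not material taken from the course.

```python
import random


def simulate_ar1(phi, n, rng):
    """AR(1): x[t] = phi * x[t-1] + e[t] (depends on the previous value)."""
    series, prev = [], 0.0
    for _ in range(n):
        prev = phi * prev + rng.gauss(0.0, 1.0)
        series.append(prev)
    return series


def simulate_ma1(theta, n, rng):
    """MA(1): x[t] = e[t] + theta * e[t-1] (depends on the previous shock)."""
    series, prev_shock = [], rng.gauss(0.0, 1.0)
    for _ in range(n):
        shock = rng.gauss(0.0, 1.0)
        series.append(shock + theta * prev_shock)
        prev_shock = shock
    return series


def difference(series):
    """First difference, the 'I' (integrated) step of ARIMA with d = 1."""
    return [b - a for a, b in zip(series, series[1:])]


def autocorr(series, lag):
    """Sample autocorrelation of a series at the given lag."""
    n = len(series)
    mean = sum(series) / n
    var = sum((v - mean) ** 2 for v in series)
    cov = sum((series[t] - mean) * (series[t - lag] - mean)
              for t in range(lag, n))
    return cov / var


rng = random.Random(0)  # fixed seed so repeated runs give the same series
n = 5000

ar = simulate_ar1(0.8, n, rng)
ma = simulate_ma1(0.8, n, rng)

# A random walk is the running sum of white noise, i.e. ARIMA(0, 1, 0).
walk, total = [], 0.0
for _ in range(n):
    total += rng.gauss(0.0, 1.0)
    walk.append(total)
diffed = difference(walk)

lags = (1, 2, 3)
acf_ar = [autocorr(ar, k) for k in lags]
acf_ma = [autocorr(ma, k) for k in lags]
acf_walk = [autocorr(walk, k) for k in lags]
acf_diffed = [autocorr(diffed, k) for k in lags]

for name, acf in [("AR(1)", acf_ar), ("MA(1)", acf_ma),
                  ("random walk", acf_walk), ("differenced walk", acf_diffed)]:
    print(name, [round(v, 2) for v in acf])
```

In theory, the autocorrelation of an AR(1) series decays gradually with the lag, while that of an MA(1) series cuts off after lag 1. The random walk stays strongly correlated at every lag shown until it is differenced once, which is the step that ARIMA adds on top of the AR and MA terms.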