CS250: Python for Data Science
Course Feedback Survey
Course Introduction
Course Syllabus
Unit 1: What is Data Science?
1.1: Introduction to Data Science
A History of Data Science
Understanding Data Science
1.2: How Data Science Works
How Data Science Works
The Data Science Pipeline
The Data Science Lifecycle
1.3: Important Facets of Data Science
Data Scientist Archetypes
What is the Field of Data Science?
Thinking about the World
Unit 1 Assessment
Unit 1 Assessment
Unit 2: Python for Data Science
2.1: Google Colaboratory
Introduction to Google Colab
2.2: Datatypes, Operators, and the math Module
Data Types in Python
Operators and the math Module
2.3: Control Statements, Loops, and Functions
Functions, Loops, and Logic
Functions and Control Structures
2.4: Lists, Tuples, Sets, and Dictionaries
Data Structures in Python
Sets, Tuples, and Dictionaries
Examples of Sets, Tuples, and Dictionaries
2.5: The random Module
Python's random Module
2.6: The matplotlib Module
Visualization and matplotlib
Precision Data Plotting with matplotlib
Unit 2 Assessment
Unit 2 Assessment
Unit 3: The numpy Module
3.1: Constructing Arrays
Using Matrices
Creating numpy Arrays
numpy Fundamentals
numpy for Numerical and Scientific Computing
3.2: Indexing
numpy Arrays and Vectorized Programming
Advanced Indexing with numpy
3.3: Array Operations
A Visual Intro to numpy and Data Representation
Mathematical Operations with numpy
numpy with matplotlib
3.4: Saving and Loading Data
Storing Data in Files
Load Compressed Data using numpy.load
Saving a Compressed File with numpy
".npy" versus ".npz" Files
Unit 3 Assessment
Unit 3 Assessment
Unit 4: Applied Statistics in Python
4.1: Basic Statistical Measures and Distributions
Applying Statistics
Key Statistical Terms
Descriptive Statistics
Basic Probability
Distribution and Standard Deviation
Continuous Probability Functions and the Uniform Distribution
The Normal Distribution
Confidence Intervals
Hypothesis Testing
Linear Regression
4.2: Random Numbers in numpy
Using numpy
Random Number Generation
Using np.random.normal
A Data Science Example
4.3: The scipy.stats Module
Descriptive Statistics in Python
Statistical Modeling with scipy
Probability Distributions and their Stories
4.4: Data Science Applications
Statistics and Random Numbers
Statistics in Python
Probabilistic and Statistical Risk Modeling
Unit 4 Assessment
Unit 4 Assessment
Unit 5: The pandas Module
5.1: Dataframes
pandas Dataframes
How pandas Dataframes Work
5.2: Data Cleaning
Data Cleaning
More on Data Cleaning
5.3: pandas Operations: Merge, Join, and Concatenate
pandas Data Structures
pandas Dataframe Operations
5.4: Data Input and Output
Importing and Exporting
Loading Data into pandas Dataframes
5.5: Visualization Using the pandas Module
Using pandas to Plot Data
Plotting with pandas
Unit 5 Assessment
Unit 5 Assessment
Unit 6: Visualization
6.1: The seaborn Module
Visualization with seaborn
matplotlib and seaborn
Easy Data Visualization
6.2: Advanced Data Visualization Techniques
Data Visualization in Python
How to Create a seaborn Boxplot
Practicing Data Visualization
6.3: Data Science Applications
Visualization Examples
Using Jupyter
Visualizing with seaborn
Unit 6 Assessment
Unit 6 Assessment
Unit 7: Data Mining I – Supervised Learning
7.1: Data Mining Overview
Introduction to Data Mining
Introduction to Machine Learning
Bayes' Theorem
Bayes' Theorem and Conditional Probability
Methods for Pattern Classification
7.2: Supervised Learning
Supervised Learning
Feature Selection
Model Inspection and Feature Selection
scikit-learn
7.3: Principal Component Analysis
Dimensionality Reduction
Principal Component Analysis
PCA in Python
7.4: k-Nearest Neighbors
The k-Nearest Neighbors Algorithm
Using the k-NN Algorithm
Nearest Neighbors
7.5: Decision Trees
Dealing with Uncertainty
Classification, Decision Trees, and k-Nearest-Neighbors
Decision Trees
7.6: Logistic Regression
Logistic Regression
More on Logistic Regression
Implementing Logistic Regression
7.7: Training and Testing
Supervised Learning and Model Validation
Training and Tuning a Model
Unit 7 Assessment
Unit 7 Assessment
Unit 8: Data Mining II – Clustering Techniques
8.1: Unsupervised Learning
Unsupervised Learning
More on Unsupervised Learning
8.2: K-means Clustering
K-means Clustering
More on K-means Clustering
Implementing K-means Clustering
Interpreting the Results of Clustering
PCA and Clustering
8.3: Hierarchical Clustering
Hierarchical Clustering
Hierarchical Clustering Using Trees
Agglomerative Clustering
Applying Clustering
Comparing Agglomerative and K-means Clustering
8.4: Training and Testing
Clustering with scikit-learn
Putting It All Together
Unit 8 Assessment
Unit 8 Assessment
Unit 9: Data Mining III – Statistical Modeling
9.1: Linear Regression
Simple Linear Regression
Implementing Simple Linear Regression with scikit-learn
Practicing Linear Regression
Multiple Linear Regression
Multiple Regression in scikit-learn
9.2: Residuals
The Assumptions of Simple Linear Regression
Residual Plots and Regression
Simple Linear Regression Project
9.3: Overfitting
Overfitting
Overfitting in a Learning Model
9.4: Cross-Validation
What is Cross-Validation?
More on Cross-Validation
Cross-Validation in Machine Learning
Statistical Modeling Project
Unit 9 Assessment
Unit 9 Assessment
Unit 10: Time Series Analysis
10.1: The statsmodels Module
Introduction to statsmodels
Regression Using statsmodels
Using scikit-learn with statsmodels
10.2: Autoregressive (AR) Models
Time Series Basics
Autoregressive Models
Time Series and Forecasting
10.3: Moving Average (MA) Models
Moving-Average Models
MA Model Examples
AR and MA Models
10.4: Autoregressive Integrated Moving Average (ARIMA) Models
ARIMA Models
ARIMA in Python
ARIMA and Seasonal ARIMA Models
ARIMA(p,d,q)
Time Series Forecasting with ARIMA
Unit 10 Assessment
Unit 10 Assessment
Study Guide
CS250 Study Guide
Course Feedback Survey
Course Feedback Survey
Certificate Final Exam
CS250: Certificate Final Exam
Course Feedback Survey
Click on Course Feedback Survey to open the resource.