CS250: Python for Data Science
Course Feedback Survey
Click the link to open the resource: https://saylordotorg.typeform.com/to/i5d5GmAo?utm_source=CS250&utm_medium=coursepage&utm_campaign=compsurvey&typeform-source=learn.saylor.org