Welcome to CS250: Python for Data Science
Specific information about this course and its requirements can be found below. For more general information about taking Saylor Academy courses, including information about Community and Academic Codes of Conduct, please read the Student Handbook.
Learn data science using the Python programming language by looking at data processing, data analysis, visualization, data mining, and statistical models. By the end of this course, you will be able to implement Python code for these data science topics.
This course attempts to strike a balance between presenting the vast set of methods within the field of data science and the Python programming techniques for implementing them. Problem-solving and programming implementation are emphasized throughout, and all techniques are introduced using real-world programming examples. A major goal is to ensure that, when you finish the course, you have the programming and conceptual expertise you need to join the field of data science.
The course introduces several Python modules, such as pandas, scikit-learn, scipy.stats, and statsmodels, that are useful for data analysis, data visualization, and data mining. It gradually shifts from introductory topics, such as a review of Python, matrix operations, and statistics, to applications and programs involving data mining, visualization, statistical models, and time series analysis.
This course includes the following units:
- Unit 1: What is Data Science?
- Unit 2: Python for Data Science
- Unit 3: The numpy Module
- Unit 4: Applied Statistics in Python
- Unit 5: The pandas Module
- Unit 6: Visualization
- Unit 7: Data Mining I – Supervised Learning
- Unit 8: Data Mining II – Clustering Techniques
- Unit 9: Data Mining III – Statistical Modeling
- Unit 10: Time Series Analysis
Course Learning Outcomes
Upon successful completion of this course, you will be able to:
- use Google Colaboratory notebooks to implement and test Python programs;
- explain how Python programming is relevant to data science;
- construct and operate on arrays using the numpy module;
- apply Python modules for basic statistical computation;
- construct and operate on dataframes using the pandas module;
- apply the pandas module to interact with spreadsheet software;
- implement Python scripts for visualization using arrays and dataframes;
- apply the scikit-learn module to perform data mining;
- explain techniques for supervised and unsupervised learning;
- apply supervised learning techniques;
- apply unsupervised learning techniques;
- apply the scikit-learn module to build statistical models;
- implement Python scripts to perform regression analyses;
- apply the statsmodels module to build and analyze models for time series analysis; and
- explain similarities and differences between AR, MA, and ARIMA models.
Throughout this course, you will also see learning outcomes in each unit. You can use those learning outcomes to help organize your studies and gauge your progress.
The primary learning materials for this course are readings, lectures, and videos.
All course materials are free to access and can be found in each unit of the course. Pay close attention to the notes that accompany these materials: they tell you what to focus on in each resource and help you understand how the learning materials fit into the course as a whole. You can also see a list of all the learning materials in this course by clicking on Resources in the navigation bar.
Evaluation and Minimum Passing Score
Only the final exam is considered when awarding you a grade for this course. To pass this course, you need to earn a score of 70% or higher on the final exam. Your score will be calculated as soon as you complete the exam. If you do not pass on your first try, you may retake the exam as many times as you want, with a 7-day waiting period between attempts. Once you have passed the final exam, you will be awarded a free Course Completion Certificate.
There are also end-of-unit assessments in this course. These are designed to help you study, and do not factor into your final course grade. You can take these as many times as you want to, until you understand the concepts and material covered. You can see all of these assessments by clicking on Quizzes in the course's navigation bar.
Tips for Success
CS250: Python for Data Science is a self-paced course, which means that you can decide when you will start and when you will complete the course. There is no instructor or set schedule to follow. We estimate that the "average" student will take 67 hours to complete this course. We recommend that you work through the course at a pace that is comfortable for you and allows you to make regular progress. It's also a good idea to schedule your study time in advance and do your best to stick to that schedule.
Learning new material can be challenging, so we've compiled a few study strategies to help you succeed:
- Take notes on the various terms, practices, and theories that you come across. This can help you put each concept into context, and will create a refresher that you can use as you study later on.
- As you work through the materials, take some time to test yourself on what you remember and how well you understand the concepts. Reflecting on what you've learned is important for your long-term memory, and will make you more likely to retain information over time.
In order to take this course, you should have a basic knowledge of statistics (such as computing the mean of a set of numbers).
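As a rough gauge of that background, the mean of a set of numbers is the sum of the values divided by how many values there are. The following is a minimal sketch using only Python's standard library; the sample values and variable names are hypothetical and chosen only for illustration, not taken from the course materials.

```python
from statistics import mean

# Hypothetical sample values, chosen only for illustration.
values = [2, 4, 6, 8]

# Mean computed by hand: sum of the values divided by their count.
manual_mean = sum(values) / len(values)

# The same quantity computed with the standard library's statistics module.
library_mean = mean(values)

print(manual_mean, library_mean)
```

If both lines of arithmetic make sense to you, you likely have the statistics background this course assumes.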
This course is delivered entirely online. You will be required to have access to a computer or web-capable mobile device and have consistent access to the internet to either view or download the necessary course resources and to attempt any auto-graded course assessments and the final exam.
- To access the full course including assessments and the final exam, you will need to be logged into your Saylor Academy account and enrolled in the course. If you do not already have an account, you may create one for free here. Although you can access some of the course without logging in to your account, you should log in to maximize your course experience. For example, you cannot take assessments or track your progress unless you are logged in.
For additional guidance, check out Saylor Academy's FAQ.
This course is entirely free to enroll in and to access. Everything linked in the course, including textbooks, videos, webpages, and activities, is available at no charge. This course also includes a free final exam and course completion certificate.