Fundamentals of Data Science II

Harvard Extension School

CSCI E-83

Section 1

CRN 16768

This course builds on CSCI E-101, giving students a solid foundation for advanced data modeling, machine learning, and artificial intelligence (AI). The focus is on data exploration and inference methods for understanding and interpreting complex relationships in modern datasets. Datasets are becoming more complex, and data analysis tools and methods are evolving rapidly with the introduction of generative AI. As a result, there is an increasing need to understand and interpret complex results in order to make confident inferences. The course emphasizes twenty-first century methods for exploratory data analysis (EDA), inference, and modeling that arise from the analysis of the increasingly complex datasets encountered in today's world. Specific areas explored include large-scale computational statistical inference for exploring and understanding complex data; graphical methods for exploring complex data and presenting results; modern resampling methods and Bayesian models for inference and exploration; and properties and models for the analysis of complex time series. The course combines theory with hands-on experience using Python programming tools, concentrating on foundational computational statistical algorithms, statistical inference methods, and effective visualization and exploration methods. The hands-on component uses the Python packages NumPy, Pandas, Seaborn, Statsmodels, and PyMC3, along with selected other open-source packages. An independent project is required for all students registering for graduate credit.

Instructor Info

Stephen Elston, PhD

Principal Data Scientist


Meeting Info

Th 6:00pm - 8:00pm (8/31 - 12/19)

Participation Option: Online Asynchronous or Online Synchronous

In online asynchronous courses, you are not required to attend class at a particular time. Instead, you can complete the coursework on your own schedule each week.

Deadlines

Last day to register:

Additional Time Commitments

Optional sections Tuesdays, 6-8 pm.

Prerequisites

Some exposure to basic machine learning and data science methods, equivalent to CSCI E-101. Experience programming in Python, equivalent to CSCI E-7 or CSCI E-29. For students with limited Python experience, programming experience in another language, such as R, MATLAB, or C++, is essential. Knowledge of linear algebra, including eigenvalue-eigenvector decomposition, and some differential and integral calculus is essential.

Notes

This course meets via web conference. Students may attend at the scheduled meeting time or watch recorded sessions asynchronously. Recorded sessions are typically available within a few hours of the end of class and no later than the following business day. See minimum technology requirements.

All Sections of this Course

CRN | Section # | Participation Option(s) | Instructor | Section Status | Meets | Term Dates
16768 | 1 | Online Asynchronous, Online Synchronous | Stephen Elston | Open | Th 6:00pm - 8:00pm | Aug 31 to Dec 19