Fundamentals of Data Science
Harvard Extension School
CSCI E-83
Section 1
CRN 16768
This course builds on CSCI E-101, giving students a solid foundation for advanced data modeling, machine learning, and artificial intelligence (AI). It focuses on the methods of modern computational statistics that underpin advanced data science. In the twenty-first century, these powerful, computationally intensive algorithms have become both practical and widely used, enabling us to explore and model the large, complex datasets commonly encountered in the real world. The course emphasizes methods that address the exploration, inference, and modeling challenges arising from the analysis of increasingly complex datasets. Approaches to large-scale computational statistical inference are discussed, including maximum likelihood, modern resampling methods, and Bayesian models. The properties and behavior of the rich family of linear models and Bayesian models, which are foundational to many statistical, machine learning, and AI algorithms, are surveyed. Time series models are also explored. The course combines theory with hands-on experience using Python programming tools, covering foundational computational statistical algorithms, statistical inference methods, and effective visualization methods. The hands-on component uses the Python packages NumPy, Pandas, Seaborn, Statsmodels, and PyMC3, along with selected other open-source packages.
Registration Closes: August 28, 2025
Credits: 4
Term: Fall Term 2025
Part of Term: Full Term
Format: Flexible Attendance Web Conference
Credit Status: Graduate, Noncredit, Undergraduate
Section Status: Open