Fundamentals of Data Science
Harvard Extension School
CSCI E-83
Section 1
CRN 16768
This course builds on CSCI E-101, giving students a solid foundation for advanced data modeling, machine learning, and artificial intelligence (AI). The course focuses on the modern computational statistical methods underpinning advanced data science. In the twenty-first century, these powerful, computationally intensive models are both practical and widely used. Such models enable us to explore and model the complex datasets commonly encountered in the real world. The course combines theory with hands-on experience using Python programming tools. The focus is on foundational computational statistical algorithms, statistical inference methods, and effective visualization methods. The hands-on component of the course uses the Python packages NumPy, Pandas, Seaborn, Statsmodels, and PyMC3, along with selected other open-source packages. The course emphasizes methods that address the exploration, inference, and modeling challenges arising from the analysis of increasingly complex datasets. Three approaches to large-scale computational statistical inference are addressed: maximum likelihood, modern resampling methods, and Bayesian models. The properties and behavior of the rich family of linear models and Bayesian models, foundational to many statistical, machine learning, and AI algorithms, are surveyed. Additionally, time series models are explored.
Registration Closes: August 29, 2024
Credits: 4
Term
Fall Term 2024
Part of Term
Full Term
Format
Flexible Attendance Web Conference
Credit Status
Graduate, Noncredit, Undergraduate
Section Status
Open