Data Mining, Discovery, and Exploration

Harvard Extension School

CSCI E-108

Section 1

CRN 17304

Extracting actionable insights and relationships from massive, complex data sets is the domain of data mining. Data mining has wide-ranging applications in science and technology, including web search, understanding interactions in social networks, recommender systems, analyzing data from large internet-of-things (IoT) sensor networks, image search, genetic analysis, and discovery of interactions between drugs. This course surveys a range of unsupervised learning algorithms for data mining, with an emphasis on graph algorithms and on scaling to massive data sets. The course comprises readings and lectures on theory along with hands-on exercises and projects in which students apply the theory through Python coding and interpretation of results. The hands-on component of the course uses a variety of Python libraries, including Scikit-Learn, NetworkX, Scikit-Learn-Extra, Mlxtend, Surprise, and TensorFlow. Students may not take both CSCI E-96 and CSCI E-108 for degree or certificate credit.

Instructor Info

Stephen Elston, PhD

Principal Consultant, Quantia Analytics LLC


Meeting Info

W 6:00pm - 8:00pm (9/2 - 12/20)

Participation Option: Online Asynchronous or Online Synchronous

In online asynchronous courses, you are not required to attend class at a particular time. Instead, you can complete the coursework on your own schedule each week.

Deadlines

Last day to register: August 28, 2025

Additional Time Commitments

Optional sections Mondays, 6-8 pm.

Prerequisites

Students enrolling in this course are expected to have some background in Python programming equivalent to CSCI E-7 or CSCI E-50 and statistical modeling equivalent to CSCI E-63c, CSCI E-101, CSCI E-106, or STAT E-109. Knowledge of basic linear algebra, equivalent to MATH E-21a, is essential.

Notes

This course meets via web conference. Students may attend at the scheduled meeting time or watch recorded sessions asynchronously. Recorded sessions are typically available within a few hours of the end of class and no later than the following business day. See minimum technology requirements.

All Sections of this Course

CRN | Section # | Participation Option(s) | Instructor | Section Status | Meets | Term Dates
35899 | 1 | Online Asynchronous, Online Synchronous | Stephen Elston | Open | MW 6:30pm - 9:30pm | Jun 23 to Aug 8
17304 | 1 | Online Asynchronous, Online Synchronous | Stephen Elston | Open | W 6:00pm - 8:00pm | Sep 2 to Dec 20