Modern Data Analytics

Harvard Extension School

CSCI E-192

Section 1

CRN 26646

Data is the new gold of the modern age. It affects all aspects of business and everyday life: social media, communication, financial and health data, web and application logs, security, and threat mitigation all rely on the ability to collect, process, and analyze terabytes and petabytes of data from numerous sources. Modern cloud-based frameworks and infrastructure serve as a foundation and an enabler for most of these services. In this course, students learn how to navigate this extraordinarily diverse and fast-changing field using popular tools and frameworks for processing and analyzing data, such as Spark 3 and its related application programming interfaces (APIs) and libraries (Spark Core, Spark SQL, Spark MLlib, and GraphX). We cover the basics of machine learning and deploying models to the cloud; how to design and organize data using modern distributed data storage options (such as Redshift and BigQuery); elements of data lake and data warehouse design and the evolution to data mesh architectures; trends in unified data analytics and modern data stack frameworks; and integration with business intelligence (BI) tools for data visualization (Looker or Amazon Web Services [AWS] QuickSight). We work hands-on with many of these frameworks on AWS and Google Cloud Platform (GCP). We primarily use Python for assignments that require programming.

Instructor Info

Edward S Sumitra, MS

Director, Software Engineering, Curriculum Associates


Marina Yu Popova, ALM

Engineer, TechTarget


Meeting Info

T 6:00pm - 8:00pm (1/27 - 5/17)

Participation Option: Online Asynchronous or Online Synchronous

In online asynchronous courses, you are not required to attend class at a particular time. Instead, you can complete the coursework on your own schedule each week.

Deadlines

Last day to register: January 23, 2025

Additional Time Commitments

Optional sections to be arranged.

Prerequisites

CSCI E-88, CSCI E-88a, or CSCI E-90, and intermediate Python skills. Some familiarity with Docker and cloud environments. CSCI E-88c is recommended.

Notes

This course meets via web conference. Students may attend at the scheduled meeting time or watch recorded sessions asynchronously. Recorded sessions are typically available within a few hours of the end of class and no later than the following business day.

All Sections of this Course

CRN: 26646
Section #: 1
Participation Option(s): Online Asynchronous, Online Synchronous
Instructor: Team Taught
Section Status: Open
Meets: T 6:00pm - 8:00pm
Term Dates: Jan 27 to May 17