Introduction to Natural Language Processing
Harvard Extension School
CSCI E-89B
Section 1
CRN 17133
Students are introduced to modern techniques of natural language processing (NLP) and learn the foundations of text classification, named entity recognition, parsing, language modeling including text generation, topic modeling, and machine translation. Methods for representing text as data studied in the course are tokenization, n-grams, bag of words, term frequency-inverse document frequency (TF-IDF) weighting, word embeddings like Word2Vec and GloVe, autoencoders, t-SNE, character embeddings, and topic modeling. The machine learning algorithms for NLP covered in the course are recurrent neural networks (RNNs) including long short-term memory (LSTM), conditional random fields (CRFs), bidirectional LSTM with a CRF (BiLSTM-CRF), generative adversarial networks (GANs), attention models, transformers, bidirectional encoder representations from transformers (BERT), latent Dirichlet allocation (LDA), non-negative matrix factorization (NMF), and structural topic modeling (STM). Students get hands-on experience using both Python and R.
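To give a flavor of the text-representation methods listed above, here is a minimal sketch of TF-IDF weighting in plain Python. This is an illustrative example only, not course material; the `tfidf` helper and the sample documents are hypothetical, and it uses the textbook formula tf-idf(t, d) = tf(t, d) × log(N / df(t)) rather than any particular library's variant.

```python
import math
from collections import Counter

def tfidf(docs):
    """Return one {term: weight} dict per tokenized document.

    Uses the basic formula: (term count / doc length) * log(N / doc frequency).
    This is a teaching sketch; real libraries apply smoothing and normalization.
    """
    n = len(docs)
    # Document frequency: in how many documents each term appears.
    df = Counter(t for doc in docs for t in set(doc))
    weights = []
    for doc in docs:
        tf = Counter(doc)
        weights.append({t: (c / len(doc)) * math.log(n / df[t])
                        for t, c in tf.items()})
    return weights

# Hypothetical pre-tokenized corpus of three tiny documents.
docs = [["natural", "language", "processing"],
        ["language", "modeling"],
        ["topic", "modeling"]]
w = tfidf(docs)
```

Because "language" appears in two of the three documents while "processing" appears in only one, "processing" receives a higher weight in the first document, illustrating how TF-IDF downweights common terms.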
Credits: 4
Term
Fall Term 2025
Part of Term
Full Term
Format
Flexible Attendance Web Conference
Credit Status
Graduate, Noncredit, Undergraduate
Section Status
Open