Lecture: Word embeddings. Distributional semantics. Count-based (pre-neural) methods. Word2Vec: learn vectors. GloVe: count, then learn. Evaluation: intrinsic vs extrinsic. Analysis and Interpretability. Interactive lecture materials and more.
Seminar: Playing with word and sentence embeddings.
Homework: Embedding-based machine translation system.
Lecture: Text classification: introduction and datasets. General framework: feature extractor + classifier. Classical approaches: Naive Bayes, MaxEnt (Logistic Regression), SVM. Neural Networks: General View, Convolutional Models, Recurrent Models. Practical Tips: Data Augmentation. Analysis and Interpretability. Interactive lecture materials and more.
Seminar: Text classification with convolutional NNs.
Homework: Statistical & neural text classification.
Lecture: Language Modeling: what does it mean? Left-to-right framework. N-gram language models. Neural Language Models: General View, Recurrent Models, Convolutional Models. Evaluation. Practical Tips: Weight Tying. Analysis and Interpretability. Interactive lecture materials and more.
Seminar: Build an N-gram language model from scratch.
Homework: Neural LMs & smoothing in count-based models.
Lecture: What is Transfer Learning? Great idea 1: From Words to Words-in-Context (CoVe, ELMo). Great idea 2: From Replacing Embeddings to Replacing Models (GPT, BERT). (A Bit of) Adaptors. Analysis and Interpretability. Interactive lecture materials and more.
Invited Lecture by Arthur Bražinskas, University of Edinburgh. Intro: different views on summarization. Extractive vs abstractive summarization, evaluation. Overview of the two main domains: news summarization and opinion summarization. Abstractive summarization: pointer-generator network and modern approaches (BertSum, BART, MeanSum, Copycat). Few-shot learning for opinion summarization.
More TBA
Contributors & course staff
Course materials and teaching by:
Elena Voita - course admin, lectures, seminars, homeworks