COURSE SYLLABUS
Språkteknologi och textutvinning (Natural Language Processing and Text Mining), 7.5 credits
Course syllabus for students, autumn 2023
Course code: TSTS22
Approved by: Managing Director (VD), 2022-03-01
Valid from: 2022-08-01
Version: 1
Education level: Advanced level
Education area: Technology
Subject group: DT1
Specialisation: A1F
Main field of study: Computer Science

Intended learning outcomes

After successfully completing the course, the student shall

Knowledge and understanding

- display knowledge of basic operations for manipulating text
- display knowledge of state-of-the-art NLP algorithms
- demonstrate comprehension of classifying and clustering text
- demonstrate comprehension of syntactic and semantic analysis, language modelling, vector semantics and the distributional hypothesis, sequential and sequence-to-sequence processing tasks, and evaluation metrics for various NLP tasks and applications
- show familiarity with common NLP applications

Skills and abilities

- demonstrate the ability to apply basic text manipulation operations
- demonstrate the ability to implement NLP algorithms and apply them to common tasks
- demonstrate the ability to construct and evaluate NLP applications

Judgement and approach

- demonstrate an understanding of data bias, ethics and fairness in NLP

Contents

This is an introductory course in Natural Language Processing (NLP) and Text Mining. The course covers basic and state-of-the-art techniques for the analysis and interpretation of natural language, focusing on methods that apply machine learning to text, and alternates theory with practice. The course includes assignments in which the student implements algorithms for various NLP tasks and applications. After completing the course, the student shall have acquired a thorough theoretical understanding of, and practical experience with, modern algorithms for common NLP tasks and applications. Specifically, the student should understand and be able to apply all theoretical concepts covered.
The course includes the following elements:
- Regular Expressions and Text Normalization
- Data Annotation, Data Bias, Ethics and Fairness in NLP
- Word Embeddings and Word Senses
- Syntactic and Semantic Analysis
- Encoder-Decoder Models (Seq2Seq), Attention and Transformers
- State-of-the-art neural NLP models
- Analysis, Interpretation and Evaluation of NLP Models
- NLP Tasks, such as Language Modelling, Text Classification and Clustering, Information Extraction, Named Entity Recognition, Semantic Role Labelling, Part-of-Speech Tagging, Coreference Resolution and Discourse Coherence
- NLP Applications, such as Machine Translation, Information Retrieval, Text Generation, Summarization, Question Answering, Dialogue Systems, Chatbots, Automatic Speech Recognition (ASR) and Text-to-Speech Synthesis (TTS)

Type of instruction

The teaching in the course consists of lectures, quizzes, workshops, tutoring and a seminar in connection with a final project.

The teaching is conducted in English.

Prerequisites

Passed courses of at least 90 credits within the major subject Computer Engineering, Electrical Engineering (with relevant courses in Computer Engineering), or equivalent, or passed courses of at least 150 credits from the programme Computer Science and Engineering; and completed courses Artificial Intelligence, 7.5 credits, Mathematics for Intelligent Systems, 7.5 credits, Machine Learning, 7.5 credits, Data Science Programming, 7.5 credits, and Deep Learning, 7.5 credits, or equivalent. Proof of English proficiency is required.

Examination and grades

The course is graded 5, 4, 3 or Fail.

Credits for the course examination are registered according to the following system:
Examination component    Scope          Grade
Final exam¹              4.5 credits    5/4/3/U
Assignments              3 credits      U/G
¹ Determines the final grade of the course, which is issued only once all components have been passed.
Grade codes: U = Fail (Underkänd), G = Pass (Godkänd).

Course literature

The literature list for the course will be provided 8 weeks before the course starts.

Principal texts:

Title: Speech and Language Processing: An Introduction to Natural Language Processing, Computational Linguistics and Speech Recognition, 2nd ed, 2008.
Authors: Jurafsky, D. and Martin, J.H.
Publisher: Prentice Hall.
ISBN: 978-0131873216
https://www.cs.colorado.edu/~martin/slp2.html
https://web.stanford.edu/~jurafsky/slp3

Title: Introduction to Natural Language Processing (Adaptive Computation and Machine Learning series), 1st ed, 2019.
Authors: Eisenstein, J.
Publisher: MIT Press.
ISBN: 978-0262042840
https://github.com/jacobeisenstein/gt-nlp-class/blob/master/notes/eisenstein-nlp-notes.pdf

Title: Neural Network Methods for Natural Language Processing (Synthesis Lectures on Human Language Technologies), 1st ed, 2017.
Authors: Goldberg, Y.
Publisher: Morgan and Claypool.
ISBN: 978-1627052986