Subject description
The syllabus is based on a selection of modern deep-learning-based natural language processing techniques and their practical use. The lectures introduce the main tasks and techniques and explain their operation and theoretical background. During practical sessions and seminars, the gained knowledge is applied to practical language tasks using open-source tools. Students investigate and solve assignments based on real-world research and commercial problems from the English and Slovene languages.
- Introduction to natural language processing: motivation, language understanding, ambiguity, traditional, statistical, and neural approaches.
- Text preprocessing and normalization: regular expressions, grammars, string similarity, advanced normalization techniques, lemmatization.
- Language resources: corpora, dictionaries, thesauri, networks and semantic databases, WordNet.
- Text similarity: measures, clustering approaches, cosine distance, language networks and graphs.
- Text representation: sparse and dense embeddings; language models; word, sentence, and document embeddings.
- Deep neural networks for text: recurrent neural networks, convolutional networks for text, transformers.
- Neural embeddings: word2vec, fastText, ELMo, BERT, cross-lingual embeddings.
- Large language models: BERT, GPT, and T5; multimodal models.
- Shallow computational and lexical semantics: part-of-speech tagging, dependency parsing, named entity recognition, semantic role labelling.
- Word senses and their disambiguation.
- Affective computing: sentiment, emotions.
- Text summarization, question answering, and reading comprehension: methods and evaluation.
- Machine translation: methods and evaluation.
Objectives and competences
Upon completion of the course, students shall be able to explain and apply fundamental algorithms and techniques in the area of natural language processing. In particular, students will:
- understand approaches to syntax and semantics in NLP,
- understand approaches to summarization and question answering,
- understand statistical and neural approaches to machine translation,
- understand deep learning techniques used in NLP,
- know how to apply standard natural language processing tools.
Teaching and learning methods
Lectures, lab work, work in small groups, public presentations of projects.
Expected study results
Upon completion of the course, students will:
- understand approaches to syntax and semantics in NLP,
- evaluate approaches to summarization,
- distinguish between approaches to machine translation,
- use and adapt machine learning techniques for NLP,
- apply and critically evaluate natural language processing tools,
- know the existing language resources and be able to design new ones,
- use different text representations and adapt them to new contexts.
Basic sources and literature
Jurafsky, David and Martin, James H. Speech and Language Processing: An Introduction to Natural Language Processing, Computational Linguistics, and Speech Recognition. 3rd edition draft, 2023.
Eisenstein, Jacob. Natural Language Processing. MIT Press, 2019.