Natural language processing
Basic information
Course description
The syllabus is based on a selection of modern deep-learning-based natural language processing techniques and their practical use. The lectures introduce the main tasks and techniques and explain their operation and theoretical background. During practical sessions and seminars, the gained knowledge is applied to practical language tasks using open-source tools. Students investigate and solve assignments based on real-world research and commercial problems from the English and Slovene languages.
- Introduction to natural language processing: motivation, language understanding, ambiguity; traditional, statistical, and neural approaches.
- Text preprocessing and normalization: regular expressions, grammars, string similarity, advanced normalization techniques, lemmatization.
- Language resources: corpora, dictionaries, thesauri, networks and semantic databases, WordNet.
- Text similarity: measures, clustering approaches, cosine distance, language networks and graphs.
- Text representation: sparse and dense embeddings; language models; word, sentence, and document embeddings.
- Deep neural networks for text: recurrent neural networks, convolutional networks for text, transformers.
- Neural embeddings: word2vec, fastText, ELMo, BERT, cross-lingual embeddings.
- Large language models: BERT, GPT, and T5; multimodal models.
- Shallow computational and lexical semantics: part-of-speech tagging, dependency parsing, named entity recognition, semantic role labelling.
- Word senses and their disambiguation.
- Affective computing: sentiment, emotions.
- Text summarization, question answering and reading comprehension: methods and evaluation.
- Machine translation: methods and evaluation.
Objectives
Upon completion of the course, students shall be able to explain and apply fundamental algorithms and techniques in the area of natural language processing. In particular, students will:
- understand approaches to syntax and semantics in NLP,
- understand approaches to summarization and question answering,
- understand statistical and neural approaches to machine translation,
- understand deep learning techniques used in NLP,
- know how to apply standard natural language processing tools.
Teaching and learning methods
Lectures, lab work, work in small groups, public presentations of projects.