Natural language processing

Subject description

The syllabus covers a selection of modern deep-learning-based natural language processing techniques and their practical use. The lectures introduce the main tasks and techniques and explain their operation and theoretical background. During practical sessions and seminars, students apply this knowledge to practical language tasks using open-source tools. Students investigate and solve assignments based on real-world research and commercial problems from the English and Slovene languages.

  1. Introduction to natural language processing: motivation, language understanding, ambiguity, traditional, statistical, and neural approaches.
  2. Text preprocessing and normalization: regular expressions, grammars, string similarity, advanced normalization techniques, lemmatization.
  3. Language resources: corpora, dictionaries, thesauri, networks and semantic databases, WordNet.
  4. Text similarity: measures, clustering approaches, cosine distance, language networks and graphs.
  5. Text representation: sparse and dense embeddings; language models; word, sentence, and document embeddings.
  6. Deep neural networks for text: recurrent neural networks, convolutional networks for text, transformers.
  7. Neural embeddings: word2vec, fastText, ELMo, BERT, cross-lingual embeddings.
  8. Large language models: BERT, GPT, T5, and multimodal models.
  9. Shallow computational and lexical semantics: part-of-speech tagging, dependency parsing, named entity recognition, semantic role labelling.
  10. Word senses and their disambiguation.
  11. Affective computing: sentiment, emotions.
  12. Text summarization, question answering and reading comprehension: methods and evaluation.
  13. Machine translation: methods and evaluation.
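
As an informal illustration of topics 4 and 5 above (text similarity and sparse text representation), the following is a minimal Python sketch, not taken from the course materials. It builds bag-of-words count vectors and compares them with cosine similarity. The function names (tokenize, bag_of_words, cosine_similarity, cosine_distance) are hypothetical, and the tokenizer is a deliberately simple assumption: it lowercases the text and keeps runs of word characters, with no lemmatization or stop-word handling.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and return runs of Unicode word characters."""
    # \w is Unicode-aware in Python 3, so Slovene letters such as č, š, ž are kept.
    return re.findall(r"\w+", text.lower())


def bag_of_words(text):
    """Sparse representation: map each token to its number of occurrences."""
    return Counter(tokenize(text))


def cosine_similarity(a, b):
    """Cosine similarity of two sparse count vectors; 0.0 if either is empty."""
    # Counter returns 0 for missing tokens, so only shared tokens contribute.
    dot = sum(count * b[token] for token, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def cosine_distance(a, b):
    """Cosine distance, defined here as 1 minus cosine similarity."""
    return 1.0 - cosine_similarity(a, b)
```

For example, cosine_similarity(bag_of_words("a b"), bag_of_words("a c")) shares one of two tokens per text and evaluates to approximately 0.5. Dense embeddings (topic 7) can be compared with the same measure, applied to real-valued vectors instead of counts.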

Objectives and competences

Upon completion of the course, students shall be able to explain and apply fundamental algorithms and techniques in the area of natural language processing. In particular, students will:

understand approaches to syntax and semantics in NLP,

understand approaches to summarization and question answering,

understand statistical and neural approaches to machine translation,

understand deep learning techniques used in NLP,

know how to apply standard natural language processing tools.

Teaching and learning methods

Lectures, lab work, work in small groups, public presentations of projects.

Expected study results

Upon completion of the course, students will:

understand approaches to syntax and semantics in NLP,

evaluate approaches to summarization,

differentiate between approaches to machine translation,

use and adapt machine learning techniques for NLP,

apply and critically evaluate natural language processing tools,

know the existing language resources and be able to design new ones,

use different text representations and adapt them to new contexts.

Basic sources and literature

Jurafsky, David and Martin, James H. Speech and Language Processing: An Introduction to Natural Language Processing, Computational Linguistics and Speech Recognition, 3rd edition draft. 2023.

Eisenstein, Jacob. Natural Language Processing. MIT Press, 2019.

University of Ljubljana, Faculty of Electrical Engineering, Tržaška cesta 25, 1000 Ljubljana

T: 01 4768 411