Web information extraction and retrieval

Basic information

Course coordinator

Course type: professional elective course

Number of ECTS credits: 6

Semester: 2nd semester

Course code: 63551

Subject description

Content of the course:

This course will cover the following topics:

Information Retrieval and Web Search

Basic Concepts of Information Retrieval
Information Retrieval Models
Relevance Feedback
Evaluation Measures
Text and Web Page Pre-Processing
Inverted Index and Its Compression
Latent Semantic Indexing
Web Search
Meta-Search: Combining Multiple Rankings

Web Crawling

A Basic Crawler Algorithm
Implementation Issues
Universal Crawlers
Focused Crawlers
Topical Crawlers

Structured Data Extraction

Wrapper Induction
Instance-Based Wrapper Learning
Automatic Wrapper Generation
String Matching and Tree Matching
Multiple Alignment
Building DOM Trees
Extraction Based on a Single List Page or Multiple Pages

Information Integration

Schema-Level Matching
Domain and Instance-Level Matching
Combining Similarities
1:m Match
Integration of Web Query Interfaces
Constructing a Unified Global Query Interface

Opinion Mining and Sentiment Analysis

Document Sentiment Classification
Sentence Subjectivity and Sentiment Classification
Opinion Lexicon Expansion
Aspect-Based Opinion Mining
Opinion Search and Retrieval

Objectives

The main objective of this course is to teach students how to develop programs for web search (including surface web and deep web search) and for the extraction of structured data from both static and dynamic web pages. Besides the basic concepts of web search and retrieval, students will learn about relevant techniques and approaches. After successfully completing the course, students will be able to develop programs for automatic web search and structured data extraction from web pages (including search and extraction from online social media).

Teaching and learning methods

Lectures, seminars, homework, oral presentations, project work.
