Statistical Methods for High-dimensional Data

Subject description

  1. Statistical peculiarities of research using high-dimensional data. Design of experiments with high-dimensional data. Graphical representation of data.
  2. Identification of variables associated with an outcome (recurrence of disease, survival time, etc.). Multiple testing: family-wise error rate and false discovery rate; parametric and non-parametric approaches.
  3. Evaluation of multivariable functions for outcome prediction. Methods for variable selection, estimation of the classification function and of predictive accuracy.
  4. Interpretation of results.
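
As an illustration of the multiple-testing topic in item 2, the following is a minimal sketch of two standard p-value adjustments: Bonferroni, which controls the family-wise error rate, and Benjamini-Hochberg, which controls the false discovery rate. It is written in Python purely for illustration (the course itself uses R, where `p.adjust` with `method = "bonferroni"` or `method = "BH"` provides the same adjustments). The function names and the p-values are hypothetical and not taken from the course material.

```python
def bonferroni(pvalues):
    """Bonferroni-adjusted p-values (family-wise error rate control)."""
    m = len(pvalues)
    return [min(1.0, p * m) for p in pvalues]


def benjamini_hochberg(pvalues):
    """Benjamini-Hochberg adjusted p-values (false discovery rate control)."""
    m = len(pvalues)
    # Indices of the p-values, ordered from the largest p-value to the smallest.
    order = sorted(range(m), key=lambda i: pvalues[i], reverse=True)
    adjusted = [0.0] * m
    running_min = 1.0
    for position, i in enumerate(order):
        rank = m - position  # rank of this p-value in ascending order
        # Step-up rule: scale by m / rank, then enforce monotonicity.
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical p-values for five variables (illustrative only, not real data).
pvals = [0.001, 0.008, 0.012, 0.030, 0.60]
alpha = 0.05

fwer_hits = [p <= alpha for p in bonferroni(pvals)]
fdr_hits = [p <= alpha for p in benjamini_hochberg(pvals)]

print("Bonferroni:", sum(fwer_hits), "variables selected")          # 2
print("Benjamini-Hochberg:", sum(fdr_hits), "variables selected")   # 4
```

On these made-up values the false discovery rate procedure selects more variables than the family-wise procedure, which is the usual trade-off: a less strict error criterion in exchange for more power.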

 

Use of the R statistical software and of Bioconductor.

Objectives and competences

High-dimensional experiments are very common in practice. The defining feature of this type of experiment is that the number of measured variables greatly exceeds the number of samples included in the study. It is therefore important to use statistical methods that properly account for the high dimensionality of the data.
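
A short numerical sketch of why this matters, under the simplifying assumption that all tests are independent and that no variable is truly associated with the outcome. With m tests at level alpha, the expected number of false positives is m * alpha, and the probability of at least one false positive is 1 - (1 - alpha)^m. The function names and the value m = 10000 are hypothetical choices for illustration; Python is used only as a convenient notation.

```python
def expected_false_positives(m, alpha):
    """Expected number of false positives among m true null hypotheses."""
    return m * alpha


def prob_at_least_one_false_positive(m, alpha):
    """Family-wise error rate for m independent tests of true nulls."""
    return 1.0 - (1.0 - alpha) ** m


alpha = 0.05
m_large = 10000  # hypothetical number of measured variables
m_small = 20     # a conventional low-dimensional study, for comparison

print(expected_false_positives(m_large, alpha))          # about 500
print(prob_at_least_one_false_positive(m_small, alpha))  # about 0.64
print(prob_at_least_one_false_positive(m_large, alpha))  # practically 1
```

Even with no real signal, unadjusted testing of thousands of variables is expected to flag hundreds of them, which is why methods designed for high-dimensional data are needed.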

The aim of this course is to enable students to work independently with this type of data. The emphasis is on the design and analysis of high-dimensional studies.

Teaching and learning methods

Lectures and computer-based laboratory practicals. Project development and presentation in small groups. Home assignments must receive a passing evaluation for admission to the exam.

Part of the teaching will be delivered using information and communication technologies (ICT) and the opportunities they offer.

Expected study results

Knowledge and understanding:

The student knows how to design a study with high-dimensional data and can select appropriate methods for analysing the data. The student interprets the results correctly and can prepare a report presenting them.

Basic sources and literature

Sandrine Dudoit, Mark J. van der Laan. Multiple Testing Procedures with Applications to Genomics (2005). Springer Series in Statistics.

Richard M. Simon, Edward L. Korn, Lisa M. McShane, Michael D. Radmacher, et al. Design and Analysis of DNA Microarray Investigations (2004). Springer.

Richard O. Duda, Peter E. Hart, David G. Stork. Pattern Classification (2000). Wiley-Interscience.

University of Ljubljana, Faculty of Electrical Engineering, Tržaška cesta 25, 1000 Ljubljana

E:  dekanat@fe.uni-lj.si T:  01 4768 411