The course featured three guest lectures.
Tuesday 20 January
On Tuesday, Peter Grünwald of the CWI will talk on sequential prediction problems:
Consider a set of experts that sequentially predict the future given the past and given some side information. For example, each expert may be a weather(wo)man who, at each day, predicts the probability that it will rain the next day.
We describe a method for combining the experts' predictions that performs well, in some sense, on every possible sequence of data, something which is often thought to be impossible. In marked contrast, classical statistical and learning theory methods only work well under stochastic assumptions ("the data are drawn from some distribution P") that are often violated in practice.
Nevertheless, we get analogous asymptotic results as in uniform-convergence based learning theory, to which we will make a comparison. The individual-sequence based theory is closely related to Bayesian statistics, but there is a twist, because Bayes' theorem has to be supplied with a 'learning rate', which makes the prior more and the data less important.
The main ideas have been re-discovered several times in various guises, within game theory (Hannan '57), statistics (Dawid '84), information theory (Rissanen's Minimum Description Length principle) and machine learning (Littlestone and Warmuth '89, Vovk '90), but the field only really took off with the publication of Prediction, Learning, and Games by Cesa-Bianchi and Lugosi (2006).
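The combination method sketched above can be illustrated with a minimal example. The snippet below is a sketch of one standard instance of the idea, the exponentially weighted average forecaster, in which each expert's weight is multiplied by exp(-eta * loss) every round; the learning rate eta and the loss values are hypothetical choices for illustration, not part of the lecture.

```python
import math

def exp_weights_update(weights, losses, eta):
    """One round of the exponentially weighted average forecaster:
    multiply each expert's weight by exp(-eta * loss), then renormalize."""
    new = [w * math.exp(-eta * l) for w, l in zip(weights, losses)]
    total = sum(new)
    return [w / total for w in new]

# Two hypothetical rain forecasters, starting from the uniform prior.
weights = [0.5, 0.5]
# Suppose expert 0 incurs loss 0.1 today and expert 1 incurs loss 0.9.
weights = exp_weights_update(weights, [0.1, 0.9], eta=1.0)
# The expert who predicted better now carries more weight.
print(weights)
```

Note how eta plays exactly the role described above: with eta = 0 the update ignores the data and keeps the prior, while a large eta makes the data dominate the prior.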
Wednesday 21 January
On Wednesday, Ronald Meester of the Vrije Universiteit Amsterdam will talk about the use of statistics in the courtroom:
There are several ways probability and statistics play a role in legal court cases. First of all, there is the issue of reasoning under uncertainty; how does that work? Secondly, legal representatives use phrases like "posterior probability of guilt" and the like.
Do these probabilities mean anything and if so, can they be computed? What is the relation between assumptions and conclusion? I will discuss the general issues, but also pay attention to a real, rather dramatic case, in which probability and statistics played a crucial role.
For background material, see Ronald Meester, Marieke Collins, Richard Gill, and Michiel van Lambalgen: "On the (ab)use of statistics in the legal case against the nurse Lucia de B." (available at arXiv.org).
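A "posterior probability of guilt", if one accepts the modelling assumptions, is computed by Bayes' rule. The sketch below uses purely hypothetical numbers (a likelihood ratio of 100 in favour of guilt) to show how strongly the answer depends on the prior, which is one reason such figures are contested in court.

```python
def posterior_guilt(prior, p_evidence_given_guilt, p_evidence_given_innocence):
    """Bayes' rule: P(guilt | evidence)."""
    num = prior * p_evidence_given_guilt
    den = num + (1 - prior) * p_evidence_given_innocence
    return num / den

# The same evidence, three different priors: the posterior swings
# from near-certain guilt to near-certain innocence.
for prior in (0.5, 0.01, 0.0001):
    print(prior, posterior_guilt(prior, 1.0, 0.01))
```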
Friday 23 January
On Friday, Nina Gierasimczuk of the ILLC will talk about Gold's theorems. She writes:
In the lecture I will discuss the basic setting of formal learning theory, as proposed by Gold in the 1960s. I will focus on identifiability in the limit from positive data, Gold's theorems, the notions of locking sequences and tell-tales, and characterizations of identifiability in the limit.
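Identification in the limit can be sketched with a toy learner. The code below is an illustrative assumption, not part of the lecture: it implements Gold-style identification by enumeration over a small finite class of languages, conjecturing the first language consistent with all positive data seen so far. On a text for a target language in such a class, the conjecture eventually stabilizes on the target.

```python
def learner_by_enumeration(hypotheses, text_so_far):
    """Gold-style identification by enumeration: conjecture the first
    hypothesis (a named set of strings) containing every datum seen so far."""
    data = set(text_so_far)
    for name, language in hypotheses:
        if data <= language:
            return name
    return None

# A toy class of three nested languages over {a, b, c}.
hypotheses = [("L1", {"a"}), ("L2", {"a", "b"}), ("L3", {"a", "b", "c"})]

# Feeding a text for L2 datum by datum: the conjecture changes from L1
# to L2 and then never changes again, i.e. L2 is identified in the limit.
print(learner_by_enumeration(hypotheses, ["a"]))             # L1
print(learner_by_enumeration(hypotheses, ["a", "b"]))        # L2
print(learner_by_enumeration(hypotheses, ["a", "b", "a"]))   # L2
```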