DM828 - Introduction to Artificial Intelligence Exercise Sheet 5, Autumn 2011

Exercises

Given the following data set X₁=(3,8), X₂=(4,7), X₃=(5,5), X₄=(6,3), X₅=(7,2), estimate the parameters µ and Σ of the multivariate Gaussian distribution that generated the points.
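As a way to check a hand calculation, the following minimal sketch computes the sample mean and the covariance matrix for the five points using only the Python standard library. It assumes the maximum-likelihood estimator (dividing by N) is the one intended; the unbiased variant (dividing by N - 1) is shown next to it for comparison. The helper name `covariance` is made up for this illustration.

```python
# Data points from the exercise.
points = [(3, 8), (4, 7), (5, 5), (6, 3), (7, 2)]
n = len(points)
dim = len(points[0])

# Sample mean: component-wise average of the points.
mu = [sum(p[j] for p in points) / n for j in range(dim)]


def covariance(points, mu, denominator):
    """Sum of outer products of the centered points, divided by `denominator`."""
    return [
        [sum((p[i] - mu[i]) * (p[j] - mu[j]) for p in points) / denominator
         for j in range(dim)]
        for i in range(dim)
    ]


sigma_mle = covariance(points, mu, n)           # maximum likelihood: divide by N
sigma_unbiased = covariance(points, mu, n - 1)  # unbiased: divide by N - 1

print("mu =", mu)
print("Sigma (MLE) =", sigma_mle)
print("Sigma (unbiased) =", sigma_unbiased)
```

The strongly negative off-diagonal entry reflects that the second coordinate decreases as the first one increases.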

A naive Bayes network for spam prediction is depicted in Figure 1. The variable Y is 1 if the email is spam and 0 otherwise. Each variable X_i, given Y, is a Bernoulli random variable that is 1 if the word it represents is present in the email and 0 otherwise. Given a large training set {(x₁,y₁),(x₂,y₂),…,(x_N,y_N)}, estimate the parameters θ, θ_1|Y=1, θ_1|Y=0 of Y, X₁|Y=1, X₁|Y=0, respectively. Given these estimates and a new email with feature vector x, how will the email be classified?

[Figure: node Y at the top, with a directed edge from Y to each of the nodes X₁, X₂, …, X_n.]

Figure 1: A naive Bayes network for spam prediction
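One possible way to make the estimation and prediction steps concrete is the sketch below. It assumes plain maximum-likelihood estimates (relative frequencies, no smoothing) and classifies by comparing the joint probabilities P(Y=c)·∏ P(X_i=x_i|Y=c) for c = 1 and c = 0. The function names and the tiny training set with two word indicators are hypothetical and only serve as an illustration.

```python
def fit_bernoulli_nb(X, y):
    """Maximum-likelihood estimates for a Bernoulli naive Bayes model.

    Returns theta = P(Y=1) and, for each class c, a list with P(X_i=1 | Y=c).
    """
    n_features = len(X[0])
    theta = sum(y) / len(y)
    theta_given = {}
    for c in (0, 1):
        rows = [x for x, label in zip(X, y) if label == c]
        theta_given[c] = [sum(r[i] for r in rows) / len(rows)
                          for i in range(n_features)]
    return theta, theta_given


def joint(x, c, theta, theta_given):
    """P(Y=c) * prod_i P(X_i = x_i | Y=c)."""
    p = theta if c == 1 else 1 - theta
    for xi, t in zip(x, theta_given[c]):
        p *= t if xi == 1 else 1 - t
    return p


def predict(x, theta, theta_given):
    """Return 1 (spam) when the joint probability is larger for Y=1 than for Y=0."""
    return int(joint(x, 1, theta, theta_given) > joint(x, 0, theta, theta_given))


# Hypothetical toy training set: two word indicators per email.
X_train = [(1, 1), (1, 0), (1, 1), (0, 1), (0, 0), (0, 1), (1, 1), (0, 0)]
y_train = [1, 1, 1, 1, 0, 0, 0, 0]

theta, theta_given = fit_bernoulli_nb(X_train, y_train)
print("theta =", theta)
print("theta_i|Y=1 =", theta_given[1])
print("theta_i|Y=0 =", theta_given[0])
print("prediction for x=(1, 0):", predict((1, 0), theta, theta_given))
```

With many features the product of probabilities underflows, so a practical version would sum logarithms instead and would typically add smoothing to avoid zero estimates.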

Design a Naive Bayes Network approach for Digit Recognition.

Consider Figure 2. To which of the points A, B, C, D will k-means move the centers? And to which points will the EM algorithm move the centers? Justify your answer.

[Figure: four shaded ellipses at (0,0), (4,0), (4,3) and (0,3); points A at (0,1.5), B at (0.5,1.5), C at (3.5,1.5) and D at (4,1.5); current centers C₁ at (1.7,1.5) and C₂ at (2.3,1.5).]

Figure 2: The situation in the exercise on EM vs. k-means.
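The difference between the two update rules can be explored numerically with the rough sketch below. It rests on several assumptions that are not part of the exercise: each shaded region of Figure 2 is replaced by a single data point at (0,0), (4,0), (4,3) and (0,3), and the EM step uses two equally weighted spherical Gaussians with a fixed variance of 1. The function names are illustrative. How far the EM centers move depends on the assumed variance, so only the qualitative behavior should be read from the output.

```python
import math

# Assumption: one data point per shaded region of Figure 2.
data = [(0.0, 0.0), (4.0, 0.0), (4.0, 3.0), (0.0, 3.0)]
centers = [(1.7, 1.5), (2.3, 1.5)]  # current centers C1 and C2


def sq_dist(p, q):
    return (p[0] - q[0]) ** 2 + (p[1] - q[1]) ** 2


def weighted_mean(points, weights):
    total = sum(weights)
    return tuple(sum(w * p[j] for p, w in zip(points, weights)) / total
                 for j in range(2))


def kmeans_step(data, centers):
    """Hard assignment: each point counts fully for its nearest center."""
    nearest = [min(range(len(centers)), key=lambda k: sq_dist(p, centers[k]))
               for p in data]
    return [weighted_mean(data, [1.0 if n == k else 0.0 for n in nearest])
            for k in range(len(centers))]


def em_step(data, centers, variance=1.0):
    """Soft assignment under equal-weight spherical Gaussians with fixed variance."""
    resp = []
    for p in data:
        likes = [math.exp(-sq_dist(p, c) / (2 * variance)) for c in centers]
        resp.append([like / sum(likes) for like in likes])
    return [weighted_mean(data, [r[k] for r in resp])
            for k in range(len(centers))]


km_centers = kmeans_step(data, centers)
em_centers = em_step(data, centers)
print("k-means:", km_centers)
print("EM:     ", em_centers)
```

Under these assumptions the hard assignment moves each center all the way to the mean of its own two points, while the soft assignment lets the far points pull each center back toward the middle.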

Exercise 17.1 from the textbook.

Consider the GridWorld in which each cell has a reward of -3 and the two terminal states have rewards +100 and -100. Compare the utility values of the cells in a deterministic environment with those in a stochastic environment in which the noise for an action is 0.2 (i.e., with probability 0.8 the chosen action is executed, and with probability 0.1 each one of the two adjacent actions is executed instead). In the value iteration algorithm, perform as many iterations as needed to change the utility value of every cell at least once. Use γ=1.
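A sketch of value iteration for this comparison is given below. The sheet does not specify the grid, so the 4x3 layout with one blocked cell, the terminal positions, and the rule that bumping into a wall or border leaves the agent in place are all assumptions made for illustration. The update used is U(s) = R(s) + γ·max_a Σ P(s'|s,a)·U(s'), with the terminal utilities held fixed at +100 and -100. The code runs a fixed number of sweeps rather than stopping as soon as every cell has changed once.

```python
# Hypothetical 4x3 layout (x = 0..3, y = 0..2); the sheet does not fix one.
WIDTH, HEIGHT = 4, 3
WALL = {(1, 1)}
TERMINALS = {(3, 2): 100.0, (3, 1): -100.0}
STEP_REWARD = -3.0
GAMMA = 1.0
MOVES = {"N": (0, 1), "E": (1, 0), "S": (0, -1), "W": (-1, 0)}
SIDEWAYS = {"N": ("W", "E"), "S": ("E", "W"), "E": ("N", "S"), "W": ("S", "N")}


def step(state, action):
    """Move one cell; bumping into the wall or the border leaves the state unchanged."""
    x, y = state
    dx, dy = MOVES[action]
    nxt = (x + dx, y + dy)
    inside = 0 <= nxt[0] < WIDTH and 0 <= nxt[1] < HEIGHT
    return nxt if inside and nxt not in WALL else state


def value_iteration(noise, sweeps):
    """Synchronous value iteration; `noise` is split evenly over the two adjacent actions."""
    states = [(x, y) for x in range(WIDTH) for y in range(HEIGHT)
              if (x, y) not in WALL]
    utility = {s: TERMINALS.get(s, 0.0) for s in states}
    for _ in range(sweeps):
        updated = dict(utility)
        for s in states:
            if s in TERMINALS:
                continue  # terminal utilities stay fixed
            best = max(
                (1 - noise) * utility[step(s, a)]
                + sum(noise / 2 * utility[step(s, side)] for side in SIDEWAYS[a])
                for a in MOVES
            )
            updated[s] = STEP_REWARD + GAMMA * best
        utility = updated
    return utility


deterministic = value_iteration(noise=0.0, sweeps=50)
stochastic = value_iteration(noise=0.2, sweeps=50)

for y in reversed(range(HEIGHT)):
    print([round(deterministic[(x, y)], 1) if (x, y) in deterministic else None
           for x in range(WIDTH)],
          [round(stochastic[(x, y)], 1) if (x, y) in stochastic else None
           for x in range(WIDTH)])
```

Each printed row shows the deterministic utilities on the left and the stochastic ones on the right; `None` marks the blocked cell.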

Consider the following tasks and classify them as supervised, unsupervised or reinforcement learning:

speech recognition (in the training set we are given the signal as input and the word as output)
star data (from the spectrum of a star to a classification of the star)
lever pressing in animal testing (a device releases food when a lever is pulled)
elevator controller (in the training set we are given a sequence of button presses and the waiting time of the passengers, which we wish to minimize)