DM825 - Introduction to Machine Learning
Sheet 3, Spring 2011
Exercise 1 Bayesian prediction.
- (a) Let θ ∼ Dir(α). Consider multinomial random variables (X1, X2, …, XN),
  where Xn ∼ Mult(θ) for each n, and where the Xn are assumed conditionally
  independent given θ. Now consider a random variable Xnew ∼ Mult(θ) that is
  assumed conditionally independent of (X1, X2, …, XN) given θ. Compute the
  predictive distribution

      p(xnew | x1, x2, …, xN, α)

  by integrating over θ.
- (b) Redo the problem in part (a), replacing the multinomial distribution
  with an arbitrary exponential family distribution, and the Dirichlet
  distribution with the corresponding exponential family conjugate
  distribution. You are to show that in general the predictive probability
  p(xnew | x1, x2, …, xN) is a ratio of normalization constants.
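As a sanity check on part (a), the conjugacy argument gives the closed form
p(xnew = k | x1, …, xN, α) = (αk + Nk) / (Σj αj + N), where Nk counts the
observations in category k. The sketch below (an illustration, not part of the
exercise; the parameter values are arbitrary assumptions) verifies this for
K = 2 by integrating over θ numerically:

```python
# Numerical check (K = 2, i.e. the Beta-binomial case) of the predictive
# p(x_new = k | data, alpha) = (alpha_k + N_k) / (sum_j alpha_j + N).
import numpy as np

alpha = np.array([2.0, 3.0])   # Dirichlet (here Beta) parameters, assumed values
counts = np.array([5, 8])      # N_k: observed counts per category, assumed values
N = counts.sum()

# Closed-form predictive from conjugacy
closed_form = (alpha + counts) / (alpha.sum() + N)

# Direct integration over theta: the posterior is Beta(alpha_1 + N_1,
# alpha_2 + N_2), and the predictive is the posterior mean of (theta, 1-theta).
theta = np.linspace(1e-6, 1 - 1e-6, 200001)
post = theta**(alpha[0] + counts[0] - 1) * (1 - theta)**(alpha[1] + counts[1] - 1)
post /= np.trapz(post, theta)  # normalize the unnormalized posterior density
numeric = np.array([np.trapz(theta * post, theta),
                    np.trapz((1 - theta) * post, theta)])

print(closed_form)  # analytic predictive probabilities
print(numeric)      # should agree with closed_form to several decimals
```

The two printed vectors agreeing is exactly the statement of part (b) in
miniature: the integral over θ collapses to a ratio of normalizers.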
Exercise 2 Classification. The course website
contains a data set
of (xn, yn ) pairs, where the xn are 2-dimensional vectors
and yn is a binary label.
- (a) Plot the data, using O's and X's for the two classes. The plots in the
  following parts should be overlaid on this plot.
- (b) Write a program to fit a logistic regression model using stochastic
  gradient ascent. Plot the line where the logistic function is equal to 0.5.
  Compare this outcome with the result obtained using the glm function in R
  (check the example in …).
- (c) Fit a linear regression to the problem, treating the class labels as
  real values 0 and 1. (You can solve the linear regression in any way you
  like, including solving the normal equations, using the LMS algorithm, or
  calling the built-in lm routine in R.) Plot the line where the linear
  regression function is equal to 0.5.
- (d) The second data set is a separate set generated from the same source.
  Test your fits from parts (b) and (c) on these data and compare the
  results.
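For part (b), one possible shape of the stochastic gradient ascent fit is
sketched below. The course data set is not reproduced here, so the Gaussian
blobs are a stand-in assumption; the update rule is the standard per-sample
gradient of the logistic log-likelihood:

```python
# Sketch: logistic regression fit by stochastic gradient ascent on synthetic
# 2-D data (the two Gaussian blobs are an assumed stand-in for the course data).
import numpy as np

rng = np.random.default_rng(0)
n = 100
X0 = rng.normal([-1.0, -1.0], 1.0, size=(n, 2))  # class 0 samples
X1 = rng.normal([+1.0, +1.0], 1.0, size=(n, 2))  # class 1 samples
X = np.vstack([X0, X1])
y = np.r_[np.zeros(n), np.ones(n)]

Xb = np.c_[np.ones(len(X)), X]  # prepend an intercept column
w = np.zeros(3)
eta = 0.05                      # step size, an assumed value
for epoch in range(200):
    for i in rng.permutation(len(Xb)):         # one sample per update
        p = 1.0 / (1.0 + np.exp(-Xb[i] @ w))   # logistic function
        w += eta * (y[i] - p) * Xb[i]          # per-sample log-likelihood gradient

# The logistic function equals 0.5 on the line w0 + w1*x1 + w2*x2 = 0,
# i.e. x2 = -(w0 + w1*x1) / w2, which is the line to overlay on the plot of (a).
acc = np.mean((1.0 / (1.0 + np.exp(-Xb @ w)) > 0.5) == y)
print("training accuracy:", acc)
```

Plotting the boundary line x2 = -(w0 + w1·x1)/w2 over the scatter from part
(a), and the analogous 0.5-level line of the linear fit from part (c), makes
the comparison in part (d) straightforward.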