Generative Learning Algorithms
Gaussian Discriminant Analysis
If we want to classify between cats and dogs, discriminative learning algorithms try to learn a hyperplane that separates the two classes directly. Generative learning algorithms instead learn a model of what cats look like and a separate model of what dogs look like.
After modelling $p(x \mid y)$ for each class and the class prior $p(y)$, we can classify a new example by computing the posterior $p(y \mid x)$ for each class and picking the class with the highest posterior probability:
$$\arg\max_y p(y \mid x) = \arg\max_y \frac{p(x \mid y)\, p(y)}{p(x)} = \arg\max_y p(x \mid y)\, p(y)$$

The second equality holds because $p(x)$ does not depend on $y$, so it can be dropped from the maximization. In Gaussian Discriminant Analysis (GDA), we assume that $p(x \mid y)$ for each class follows a multivariate Gaussian distribution with its own mean but a shared covariance matrix:
$$p(y) = \phi^y (1 - \phi)^{1 - y}$$

$$p(x \mid y = 0) = \frac{1}{(2\pi)^{d/2} |\Sigma|^{1/2}} \exp\left(-\frac{1}{2}(x - \mu_0)^T \Sigma^{-1} (x - \mu_0)\right)$$

$$p(x \mid y = 1) = \frac{1}{(2\pi)^{d/2} |\Sigma|^{1/2}} \exp\left(-\frac{1}{2}(x - \mu_1)^T \Sigma^{-1} (x - \mu_1)\right)$$
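As a concrete illustration, here is a minimal NumPy sketch of evaluating these densities and applying the arg max rule above. The parameter values and the helper name `gaussian_density` are made up for the example, not taken from the text:

```python
import numpy as np

def gaussian_density(x, mu, Sigma):
    """Evaluate the multivariate Gaussian density at x, per the formula above."""
    d = x.shape[0]
    diff = x - mu
    norm_const = 1.0 / ((2 * np.pi) ** (d / 2) * np.sqrt(np.linalg.det(Sigma)))
    return norm_const * np.exp(-0.5 * diff @ np.linalg.inv(Sigma) @ diff)

# Illustrative (made-up) parameters for a 2-dimensional problem
mu_0 = np.array([0.0, 0.0])    # class-0 mean
mu_1 = np.array([2.0, 2.0])    # class-1 mean
Sigma = np.eye(2)              # shared covariance matrix
phi = 0.5                      # class prior p(y = 1)

x = np.array([1.5, 1.8])
# Joint scores p(x | y) p(y) used by the arg max rule
score_0 = gaussian_density(x, mu_0, Sigma) * (1 - phi)
score_1 = gaussian_density(x, mu_1, Sigma) * phi
print(1 if score_1 > score_0 else 0)   # predicted class
```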
ℓ(ϕ,μ0,μ1,Σ)=logi=1∏np(x(i),y(i);ϕ,μ0,μ1,Σ) By maximizing
ℓ with respect to the parameters, we find the maximum likelihood estimate of the parameters to be:
ϕ=n1i=1∑n1{y(i)=1} μ0=∑i=1n1{y(i)=0}∑i=1n1{y(i)=0}x(i) μ1=∑i=1n1{y(i)=1}∑i=1n1{y(i)=1}x(i) Σ=n1i=1∑n(x(i)−μy(i))(x(i)−μy(i))T