This project deals with estimation of the parameters of the Polya’s distribution. The PMF of the observations x1,,xm is given by: p(x1,x2,,xm)=i=1m(n(xi)!knk(xi)!Γ(kαk)Γ(n(xi)+kαk)kΓ(nk(xi)+αk)Γ(αk))

For our problem, we focus on the beta binomial model, with k1,2.

Defining parameter vector θ.

In general form: θ=[α1,α2,,αk]T

Beta binomial form: θ=[α1,α2]T

Compute the FIM and CRLB.

For our problem, we are given the following assumptions:

  • m=20
  • n(xi)=20 for 1im
  • k1,2

Note: since k1,2, then kαk=α1+α2

p(x1,,xm)=i=1m(n(xi)!knk(xi)!Γ(kαk)Γ(n(xi)+kαk)kΓ(nk(xi)+αk)Γ(αk))log(p(x1,,xm))=log(i=1m(n(xi)!knk(xi)!Γ(kαk)Γ(n(xi)+kαk)kΓ(nk(xi)+αk)Γ(αk)))=i=1mlog(n(xi)!knk(xi)!Γ(kαk)Γ(n(xi)+kαk)kΓ(nk(xi)+αk)Γ(αk))=i=1mlog(n(xi)!knk(xi)!)+log(Γ(kαk)Γ(n(xi)+kαk))+log(kΓ(nk(xi)+αk)Γ(αk))=i=1mlog(n(xi)!knk(xi)!)+log(Γ(α1+α2)Γ(20+α1+α2))+log(Γ(n1(xi)+α1)Γ(n2(xi)+α2)Γ(α1)Γ(α2))Note: we can drop the first term since it does not depend on α, equivalent to logpθ=0=i=1mlog(Γ(α1+α2)Γ(20+α1+α2))+log(Γ(n1(xi)+α1)Γ(n2(xi)+α2)Γ(α1)Γ(α2))=mlog(Γ(α1+α2)Γ(20+α1+α2))+i=1mlog(Γ(n1(xi)+α1)Γ(n2(xi)+α2))log(Γ(α1)Γ(α2))=m[log(Γ(α1+α2))log(Γ(20+α1+α2))]+i=1mlog(Γ(n1(xi)+α1))+log(Γ(n2(xi)+α2))log(Γ(α1))log(Γ(α2))

We now find logpθ=[logpα1logpα2] from the above log likelihood function.

logpα1=m(ψ(α1+α2)ψ(20+α1+α2))+i=1mψ(n1(xi)+α1)ψ(α1)=m(ψ(α1+α2)ψ(20+α1+α2)ψ(α1))+i=1mψ(n1(xi)+α1)logpα2=m(ψ(α1+α2)ψ(20+α1+α2))+i=1mψ(n1(xi)+α1)ψ(α2)=m(ψ(α1+α2)ψ(20+α1+α2)ψ(α2))+i=1mψ(n1(xi)+α1)

We now find 2logpθθT=[2logpα1α12logpα1α22logpα2α12logpα2α2] from the above first-order partial derivatives.

2logpα1α1=m(ψ(α1+α2)ψ(20+α1+α2)ψ(α1))+i=1mψ(n1(xi)+α1)2logpα1α2=2logpα2α1=m(ψ(α1+α2)ψ(20+α1+α2))2logpα2α2=m(ψ(α1+α2)ψ(20+α1+α2)ψ(α2))+i=1mψ(n2(xi)+α2)

We now find FIM=E[2logpθθT] and CRLB=FIM1 from the above second-order partial derivatives.

For simplicity and spacing, let v=m(ψ(α1+α2)ψ(20+α1+α2)) since it occurs in each partial derivative. FIM=E[2logpα1α12logpα1α22logpα2α12logpα2α2]=E[vmψ(α1)+i=1mψ(n1(xi)+α1)vvvmψ(α2)+i=1mψ(n1(xi)+α2)]=[E[vmψ(α1)+i=1mψ(n1(xi)+α1)]E[v]E[v]E[vmψ(α2)+i=1mψ(n2(xi)+α2)]]CRLB=FIM1=[E[vmψ(α1)+i=1mψ(n1(xi)+α1)]E[v]E[v]E[vmψ(α2)+i=1mψ(n2(xi)+α2)]]1

Maximum Likelihood Estimation

Recall from the previous section that we derived the partial derivatives of the log likelihood function as follows: logpα1=m(ψ(α1+α2)ψ(20+α1+α2))+i=1mψ(n1(xi)+α1)ψ(α1)=m(ψ(α1+α2)ψ(20+α1+α2)ψ(α1))+i=1mψ(n1(xi)+α1)logpα2=m(ψ(α1+α2)ψ(20+α1+α2))+i=1mψ(n1(xi)+α1)ψ(α2)=m(ψ(α1+α2)ψ(20+α1+α2)ψ(α2))+i=1mψ(n1(xi)+α1) We can then set each of the derivatives equal to 0 and solve for our αk parameter in order to find the Maximum Likelihood Estimator for each. logpα1=m(ψ(α1+α2)ψ(20+α1+α2)ψ(α1))+i=1mψ(n1(xi)+α1)=0logpα2=m(ψ(α1+α2)ψ(20+α1+α2)ψ(α2))+i=1mψ(n1(xi)+α1)=0

However, no closed form solution exists for this equation. As proposed in the project guidelines, we used the method of fixed point iteration proposed by Minka et al. in order to update each of the αk values: αk(t)=αk(t1)iψ(nik+αk(t1))ψ(αk(t1))iψ(ni+kαk(t1))ψ(kαk(t1))

Specifically, for our problem of k1,2, we implement: α1(t)=α1(t1)iψ(ni1+α1(t1))ψ(α1(t1))iψ(ni+α1(t1)+α2(t1))ψ(α1(t1)+α2(t1))α2(t)=α2(t1)iψ(ni2+α2(t1))ψ(α2(t1))iψ(ni+α1(t1)+α2(t1))ψ(α1(t1)+α2(t1))

Method of Moments

For the beta-binomial distribution, via Wikipedia, the first moment (μ1) and second moment (μ2) based on n1 are given as follows: μ1=n1α1α1+α2 μ2=n1α1(n1(1+α1)+α2)(α1+α2)(α1+α2+1)

For solving the first and second order moments, we set the following: x=1Ni=1Nxi x2=1Ni=1Nxi2

x=E[x]=μ1=n1α1α1+α2x(α1+α2)=n1α1xα1+xα2=n1α1xα2=n1α1xα1α2=n1α1xα1x=α1(n1x1)

Let v=n1x1, and v+1=n1x, such that α2=α1v.

x2=E[x2]=μ2=n1α1(n1(1+α1)+α2)(α1+α2)(α1+α2+1)=n1α1(n1(1+α1)+α1v)(α1+α1v)(α1+α1v+1)=n1α1(α1(n1+v)+n1α1(1+v)(α1(1+v)+1)=n1(α1(n1+v)+n1α1(1+v)2+(1+v))=n1(α1(n1+n1x1)+n1)α1(n1x)2+(n1x)=n1(α1(n1+n1x1)+n1)n1(α1n1x2)+(1x)=α1(n1x+n1x)+n1xα1n1x+1x2(α1n1x+1)=α1(n1x+n1x)+n1xα1x2n1x+x=α1(n1x+n1x)+n1xα1(x2n1xn1xn1+x)=n1xx2α1^mom=n1xx2n1(x2xx1)+x

We can now plug this back into our first order moment to solve for β. α2=α1(n1x1)=(n1xx2n1(x2xx1)+x)(n1x1)=n12n1xn1x2x+x2n1(x2xx1)+xα2^mom=(n1x)(n1x2x)n1(x2xx1)+x