Problem Set 4
CS 6347
Due: 4/25/2024 by 11:59pm
Note: all answers should be accompanied by explanations for full credit. Late homeworks
cannot be accepted. All submitted code MUST compile/run.
Problem 1: Expectation Maximization for Colorings (40 pts)
For this problem, we will use the same factorization as we have in past assignments. As on the
previous assignment, the weights will now be considered parameters of the model that need to be
learned from samples.
Suppose that some of the vertices, L ⊆ V, are latent variables in the model. Given m samples
of the observed variables in V \ L, what is the log-likelihood as a function of the weights? Perform
MLE using the EM algorithm. Your solution should be written as a MATLAB function that takes
as input an n × n matrix A corresponding to the adjacency matrix of a graph G, an n-dimensional
binary vector L whose non-zero entries correspond to the latent variables, and samples which is an
n × m k-ary matrix where samples_{i,t} corresponds to the observed color for vertex i in the
t-th sample
(you should discard any inputs related to the latent variables). The output should be the vector of
weights w corresponding to the MLE parameters for each color from the EM algorithm. Note that
you should use belief propagation to approximate the counting problem in the E-step.
function w = colorem(A, L, samples)
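As a starting point for the log-likelihood question, the following sketch writes out the marginal log-likelihood under one assumed form of the model. The factorization itself is not restated in this problem set, so the form of p(x; w) below (one weight per color, hard constraints on edges) is an assumption and should be checked against the earlier assignments. The symbols O, Z(w) and Z(w; x_O) are introduced here for illustration only.

```latex
% Assumed factorization (not given in this problem set):
p(x; w) = \frac{1}{Z(w)} \prod_{i \in V} \exp(w_{x_i})
          \prod_{(i,j) \in E} \mathbf{1}[x_i \neq x_j]

% Observed vertices: O = V \setminus L. Clamped partition function:
Z(w; x_O) = \sum_{x_L} \prod_{i \in V} \exp(w_{x_i})
            \prod_{(i,j) \in E} \mathbf{1}[x_i \neq x_j]

% Log-likelihood of the m observed samples x_O^{(1)}, \dots, x_O^{(m)}:
\ell(w) = \sum_{t=1}^{m} \log \sum_{x_L} p\big(x_O^{(t)}, x_L; w\big)
        = \sum_{t=1}^{m} \log Z\big(w; x_O^{(t)}\big) - m \log Z(w)
```

Under this assumption, both Z(w) and each Z(w; x_O^{(t)}) are weighted counting problems over colorings, which is where belief propagation would enter as an approximation.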
Problem 2: EM for Bayesian Networks (60 pts)
For this problem, you will use the house-votes-84.data data set provided with this problem set.
Each row of the provided data file corresponds to a single observation of a voting record for a
congressperson: the first entry is party affiliation and the remaining entries correspond to votes on
different legislation with question marks denoting missing data.
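The row layout described above can be turned into a small loading routine. The following is a minimal Python sketch, not a required implementation. It assumes the fields in each row are comma-separated and that "first three features" means the first three columns (party affiliation plus the first two votes). The function names parse_records and split_train_test are hypothetical, and the toy rows are made up for illustration rather than taken from the data file.

```python
from typing import Iterable, List, Optional, Tuple

Record = List[Optional[str]]


def parse_records(lines: Iterable[str], n_features: int = 3) -> List[Record]:
    """Keep the first n_features fields of each row; '?' becomes None (missing).

    Assumption: fields are comma-separated, with party affiliation first.
    """
    records: List[Record] = []
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        fields = line.split(",")[:n_features]
        records.append([None if f == "?" else f for f in fields])
    return records


def split_train_test(
    records: List[Record], n_train: int = 300
) -> Tuple[List[Record], List[Record]]:
    """First n_train rows for fitting, the remaining rows for evaluation."""
    return records[:n_train], records[n_train:]


# Toy rows in the assumed format (illustrative only, not real data).
rows = ["republican,n,y,n", "democrat,?,y,y", "democrat,y,?,n"]
records = parse_records(rows)
train, test = split_train_test(records, n_train=2)
```

To load the actual file, one could pass an open file handle, for example `parse_records(open("house-votes-84.data"))`. Missing entries are kept as None rather than dropped, so that EM can treat them as unobserved values.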
1. Using the first three features and the first 300 data observations only, fit a Bayesian network
to this data using the EM algorithm for each of the eight possible complete DAGs over three
variables.
2. Do different runs of the EM algorithm produce different models?
3. Evaluate your eight models, on the data that was not used for training, for the task of
predicting party affiliation given the values of the other two features. Is the prediction highly