Adrian Weller, PhD Candidacy & Proposal

Adrian Weller, PhD Candidacy & Proposal: Graphical Models
Sep 26, 2013



			Committee Prof Alfred Aho, CS Prof Maria Chudnovsky, IEOR Prof Tony Jebara, CS Location & Times Both the candidacy and proposal will take place on Thursday Sep 26 between 10am and approx. 1:30pm in the Machine Learning Lab (6LE5) on the 6th floor of CEPSR. Expected timeline (subject to variance depending on questions etc.): Candidacy 10:00am start Presentation 10:00-10:40 (30-40 mins) Questions 10:40-11:30 (up to max 90 minutes) Short break to get lunch (11:30-12:00), fine to bring food back to the lab to finish during the proposal Proposal 12:00 noon start Presentation 12:00-12:45 (45 mins) Questions/Discussion 12:45-1:30pm (45 mins) Candidacy & Proposal Document Here is my proposal, which includes the candidacy material in the introductory sections. The candidacy reading list is in section 7, just before the proposal proper begins in section 8, but it is also provided below with links for ease of reference. Candidacy Reading List General background on graphical models The introductory sections 1-6 of the proposal. Chapter 2 Background pp. 7-36, pp. 55-60 on distribution representations and polytopes of [WJ08] M. Wainwright and M. Jordan. Graphical models, exponential families and variational inference. Foundations and Trends in Machine Learning, 1(1-2):1–305, 2008. Chapter 2 Background pp. 15-29, Chapter 3 Cycle Relaxation pp. 30-34 of [Son10] D. Sontag. Approximate Inference in Graphical Models using LP Relaxations. PhD thesis, Mas- sachusetts Institute of Technology, Department of Electrical Engineering and Computer Science, 2010. Complexity foundations [Shi94] S.E. Shimony. Finding MAPs for belief networks is NP-hard. Artificial Intelligence, 68(2):399–410, 1994. Graph theory background High level overview only of: [CCL+05] M. Chudnovsky, G. Cornuéjols, X. Liu, P. Seymour, and K. Vusković. Recognizing Berge graphs. Combinatorica, 25:143–186, 2005. [FOS11] Y. Faenza, G. Oriolo, and G. Stauffer. An algorithmic decomposition of claw-free graphs leading to an O(n^3)-algorithm for the weighted stable set problem. In SODA, pages 630–646, 2011. With description of quasi-line graphs in [CS05] M. Chudnovsky and P. Seymour. The structure of claw-free graphs. In London Mathematical Society Lecture Note Series, Cambridge University Press, volume 324. 2005. Efficient max flow/min cut for inference and experimental comparison The O(n^3) push-relabel algorithm of [GT88] A. Goldberg and R. Tarjan. A new approach to the maximum flow problem. Journal of the ACM, 35:921–940, 1988. [GPS89] D. Greig, B. Porteous, and A. Seheult. Exact maximum a posteriori estimation for binary images. J. Royal Statistical Soc., Series B, 51(2):271–279, 1989. [BK04] Y. Boykov and V. Kolmogorov. An experimental comparison of min-cut/max-flow algorithms for energy minimization in vision. IEEE Trans. Pattern Anal. Mach. Intell., 26(9):1124–1137, 2004. [SZS+08] R. Szeliski, R. Zabih, D. Scharstein, O. Veksler, V. Kolmogorov, A. Agarwala, M. Tappen, and C. Rother. A comparative study of energy minimization methods for Markov random fields with smoothness-based priors. IEEE Trans. Pattern Anal. Mach. Intell., 30(6):1068–1080, 2008. [KAH+13] J. Kappes, B. Andres, F. Hamprecht, C. Schnörr, S. Nowozin, D. Batra, S. Kim, B. Kausler, J. Lellmann, N. Komodakis, and C. Rother. A comparative study of modern inference techniques for discrete energy minimization problems. 2013. Efficient transformation to a binary problem [SF06] D. Schlesinger and B. Flach. Transforming an arbitrary minsum problem into a binary one. Technical report, Dresden University of Technology, 2006. Reduction of MAP inference to MWSS, perfect graphs [Jeb13] T. Jebara. Tractability: Practical Approaches to Hard Problems, chapter Perfect graphs and graphical modeling. Cambridge Press, 2013. [Jeb09] T. Jebara. MAP estimation, message passing, and perfect graphs. In Uncertainty in Artificial Intelligence, 2009. [FNSI11] J. Foulds, N. Navaroli, P. Smyth, and A. Ihler. Revisiting MAP estimation, message passing, and perfect graphs. In Artificial Intelligence and Statistics, 2011. Bethe approximation. [YFW01] J. Yedidia, W. Freeman, and Y. Weiss. Understanding belief propagation and its generalizations. In International Joint Conference on Artificial Intelligence, Distinguished Lecture Track, 2001. [WT01] M. Welling and Y. Teh. Belief optimization for binary networks: A stable alternative to loopy belief propagation. In Uncertainty in Artificial Intelligence, 2001. pp. 75-93 of [WJ08] M. Wainwright and M. Jordan. Graphical models, exponential families and variational inference. Foundations and Trends in Machine Learning, 1(1-2):1–305, 2008. Overview of [MK07] J. Mooij and H. Kappen. Sufficient conditions for convergence of the sum-product algorithm. IEEE Transactions on Information Theory, 53(12):4422–4437, December 2007. [Shi12] J. Shin. Complexity of Bethe approximation. In Artificial Intelligence and Statistics, 2012. [Ruo12] N. Ruozzi. The Bethe partition function of log-supermodular graphical models. In Neural Information Processing Systems, 2012. Other entropy approximations. pp. 165-175 of [WJ08] M. Wainwright and M. Jordan. Graphical models, exponential families and variational inference. Foundations and Trends in Machine Learning, 1(1-2):1–305, 2008. More on message passing, relationship to the dual of the LP. [GJ07] A. Globerson and T. Jaakkola. Fixing max-product: Convergent message passing algorithms for MAP LP-relaxations. NIPS 2007. [SJ09] D. Sontag and T. Jaakkola. Tree block coordinate descent for MAP in graphical models. In Pro- ceedings of the Twelfth International Conference on Artificial Intelligence and Statistics (AISTATS), volume 8, pages 544–551. JMLR: W&CP, 2009. Dual decomposition approach with projected subgradient for constrained optimization to consider tighter relaxations. [SJ07] D. Sontag and T. Jaakkola. New outer bounds on the marginal polytope. In NIPS, 2007. [BM07] S. Boyd and A. Mutapcic. Subgradient Methods, notes for EE364b, Jan 2007. My Publications to Date [WJ13a] A. Weller and T. Jebara. Bethe bounds and approximating the global optimum. In Artificial Intelligence and Statistics, 2013. [WJ13b] A. Weller and T. Jebara. On MAP inference by MWSS on perfect graphs. In Twenty Ninth Conference on Uncertainty in Artificial Intelligence (UAI), 2013. Earlier work A. Weller, D. Ellis and T. Jebara. Structured Prediction Models for Chord Transcription of Music Audio . International Conference on Machine Learning and Applications (ICMLA), December 2009. These methods were used to provide a slight improvement to Dan Ellis' existing, powerful approach to chord transcription, which led to us submitting the best entry to the MIREX open competition that year, see results here. If interested, a brief description of the overall 2010 LabROSA chord recognition system is given here.