Data Intelligence, Vol. 3 (2): 228-260, 2021
Probabilistic Tractable Models in Mixed Discrete-Continuous Domains
Abstract & Keywords
Abstract: We study the problem of unsupervised learning of graphical models in mixed discrete-continuous domains. Unsupervised learning of such models in discrete domains alone is notoriously challenging, compounded by the fact that inference is computationally demanding. The situation is generally believed to be significantly worse in discrete-continuous domains: estimating the unknown probability distribution from samples is often limited in practice to a handful of parametric forms, and in addition, computing conditional queries needs to carefully handle low-probability regions in safety-critical applications. In recent years, the regime of tractable learning has emerged, which attempts to learn a graphical model that permits efficient inference. Most of the results in this regime are based on arithmetic circuits, for which inference is linear in the size of the obtained circuit. In this work, we show how, with minimal modifications, such regimes can be generalized by leveraging efficient density estimation schemes based on piecewise polynomial approximations. Our framework is realized on a recent computational abstraction that permits efficient inference for a range of queries in the underlying language. Our empirical results show that our approach is effective and allows a study of the trade-off between the granularity of the learned model and its predictive power.
Keywords: Graphical models; Tractable inference; Hybrid domains; Weighted model integration
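To make the central idea of the abstract concrete, the following is a minimal, self-contained Python sketch (assuming only NumPy) of piecewise polynomial density estimation in one dimension: the data range is split into a few pieces, a low-degree polynomial is fitted to the empirical density on each piece, and interval-probability queries are then answered by exact polynomial integration. The function names, number of pieces, polynomial degree, and fitting procedure are illustrative assumptions only; this is not the learner described in the paper, which handles mixed discrete-continuous models and realizes inference on a circuit-based computational abstraction.

import numpy as np

def fit_piecewise_density(samples, n_pieces=4, degree=3, n_fine=64):
    # Estimate a fine-grained empirical density, then fit one low-degree
    # polynomial per piece to the density values by least squares.
    density, edges = np.histogram(samples, bins=n_fine, density=True)
    centers = 0.5 * (edges[:-1] + edges[1:])
    piece_edges = np.linspace(edges[0], edges[-1], n_pieces + 1)
    pieces = []
    for lo, hi in zip(piece_edges[:-1], piece_edges[1:]):
        mask = (centers >= lo) & (centers <= hi)
        coeffs = np.polyfit(centers[mask], density[mask], deg=degree)
        pieces.append((lo, hi, np.poly1d(coeffs)))
    return pieces

def interval_probability(pieces, a, b):
    # P(a <= X <= b): integrate each polynomial piece exactly over its
    # overlap with [a, b]. The result is approximate, since the fitted
    # pieces only approximately integrate to one over the full support.
    total = 0.0
    for lo, hi, poly in pieces:
        left, right = max(lo, a), min(hi, b)
        if left < right:
            antiderivative = poly.integ()
            total += antiderivative(right) - antiderivative(left)
    return total

rng = np.random.default_rng(0)
data = rng.normal(size=5000)
pieces = fit_piecewise_density(data)
print("approx. P(-1 <= X <= 1):", interval_probability(pieces, -1.0, 1.0))

On the standard normal sample above, the query returns a value close to the true interval mass of about 0.68; the point of the sketch is that once densities are represented as polynomial pieces, marginal and interval queries reduce to exact, cheap polynomial integration, which is what makes such representations attractive as components of tractable models.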
Acknowledgements
Andreas Bueff was partly supported by EPSRC Platform Grant EP/N014758/1. Vaishak Belle was partly supported by the EPSRC grant Towards Explainable and Robust Statistical AI: A Symbolic Approach, and partly supported by a Royal Society University Research Fellowship.
Article and author information
Cite As
Bueff, A., Speichert, S., Belle, V.: Probabilistic tractable models in mixed discrete-continuous domains. Data Intelligence 3(2), 228-260 (2021). doi: 10.1162/dint_a_00064
Andreas Bueff
A. Bueff (andreas.bueff@ed.ac.uk), S. Speichert (s.speichert@ed.ac.uk) and V. Belle (vaishak@ed.ac.uk) were all responsible for the design of the research, the implementation of the approach, and the analysis of the results. All authors contributed to the manuscript writing, and they edited and reviewed the final version of the article.
Andreas Bueff is a postgraduate research student in the Artificial Intelligence and its Applications Institute at the School of Informatics, University of Edinburgh. He is interested in explainable AI as a platform for exploring unsupervised and reinforcement learning methods, research into tractable learning of probabilistic graphical models across continuous and discrete data domains, and the interpretability of relational reinforcement learning agents in model-free learning environments, including the use of counterfactuals for better agent understanding. As part of his research, he has collaborated with the University of Edinburgh Business School, in conjunction with Nationwide, in stress scenario testing of credit risk models.
0000-0002-7538-0699
Stefanie Speichert
Stefanie Speichert is a final year PhD student at the Institute for Perception, Action and Behavior at the School of Informatics, University of Edinburgh. She is interested in interpretable machine learning methods with a particular focus on bridging the gap between traditional probabilistic planners and neural networks. Her previous work on probabilistic logic programming in hybrid discrete-continuous domains received the best student paper award (long paper track) at the International Conference on Inductive Logic Programming (ILP) in 2019. She was further able to apply her knowledge to the medical and self-driving domains through two internships at IBM Research and Lyft's Self-Driving Research lab, respectively. When she is not working, she loves to engage in science communication, which has led her to co-found and lead two of the largest Edinburgh-based AI public outreach groups to democratize AI knowledge for students and professionals alike.
0000-0003-1122-0685
Vaishak Belle
Vaishak Belle is a Chancellor's Fellow and Faculty at the School of Informatics, University of Edinburgh, an Alan Turing Institute Faculty Fellow, a Royal Society University Research Fellow, and a member of the RSE (Royal Society of Edinburgh) Young Academy of Scotland. At the University of Edinburgh, he directs a research lab on artificial intelligence, specializing in the unification of symbolic systems and machine learning, with a recent emphasis on explainability and ethics. He has given research seminars at numerous academic institutions, tutorials at AI conferences, and talks at venues such as Ars Electronica and the Samsung AI Forum. He has co-authored over 50 scientific articles on AI, at venues such as IJCAI, UAI, AAAI, MLJ, AIJ, JAIR, and AAMAS, and along with his co-authors, he has won the Microsoft best paper award at UAI, the Machine Learning journal best student paper award at ECML-PKDD, and the Machine Learning journal best student paper award at ILP. In 2014, he received a silver medal from the Kurt Gödel Society. Recently, he has consulted with major banks on explainable AI and its impact in financial institutions.
0000-0001-5573-8465
Publication records
Published: July 6, 2021