
# High-Dimensional Bayesian Optimization via Nested Riemannian Manifolds

NeurIPS 2020


Abstract

Despite the recent success of Bayesian optimization (BO) in a variety of applications where sample efficiency is imperative, its performance may be seriously compromised in settings characterized by high-dimensional parameter spaces. A solution to preserve the sample efficiency of BO in such problems is to introduce domain knowledge int…

Introduction

- Bayesian optimization (BO) is a powerful machine-learning-based optimization method for globally maximizing or minimizing expensive black-box functions [54].
- A common assumption in high-dimensional BO approaches is that the objective function depends on a limited set of features, i.e., that it varies along an underlying low-dimensional latent space.
- Following this hypothesis, various solutions based either on random embeddings [61, 45, 9] or on latent space learning [15, 25, 44, 64] have been proposed.
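The BO loop the introduction refers to — fit a Gaussian-process surrogate, maximize an acquisition function, evaluate the objective at the chosen point — can be illustrated on a toy 1-D Euclidean problem. This is a minimal sketch with a made-up objective, not the paper's HD-GaBO:

```python
import numpy as np
from math import erf

def f(x):
    # Hypothetical cheap stand-in for an expensive black-box objective.
    return -np.sin(3 * x) - x**2 + 0.7 * x

def se_kernel(a, b, ls=0.5):
    # Squared-exponential (SE) kernel on Euclidean inputs.
    return np.exp(-0.5 * ((a[:, None] - b[None, :]) / ls) ** 2)

def gp_posterior(X, y, Xs, noise=1e-6):
    # Exact GP posterior mean and standard deviation at test points Xs.
    K = se_kernel(X, X) + noise * np.eye(len(X))
    L = np.linalg.cholesky(K)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y))
    Ks = se_kernel(X, Xs)
    mu = Ks.T @ alpha
    v = np.linalg.solve(L, Ks)
    var = 1.0 - np.sum(v**2, axis=0)      # k(x, x) = 1 for the SE kernel
    return mu, np.sqrt(np.maximum(var, 1e-12))

def expected_improvement(mu, sigma, best):
    # EI acquisition for maximization: E[max(f - best, 0)] under the GP.
    z = (mu - best) / sigma
    cdf = 0.5 * (1.0 + np.vectorize(erf)(z / np.sqrt(2.0)))
    pdf = np.exp(-0.5 * z**2) / np.sqrt(2.0 * np.pi)
    return (mu - best) * cdf + sigma * pdf

rng = np.random.default_rng(0)
X = rng.uniform(-1.0, 2.0, size=5)        # 5 random initial samples
y = f(X)
grid = np.linspace(-1.0, 2.0, 200)        # candidate set for the acquisition
for _ in range(10):                       # BO iterations
    mu, sigma = gp_posterior(X, y, grid)
    x_next = grid[np.argmax(expected_improvement(mu, sigma, y.max()))]
    X, y = np.append(X, x_next), np.append(y, f(x_next))
print(X[np.argmax(y)], y.max())
```

In the high-dimensional setting of the paper this loop is preserved, but the surrogate, kernel, and acquisition optimizer are all made geometry-aware.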

Highlights

- Bayesian optimization (BO) is a powerful machine-learning-based optimization method for globally maximizing or minimizing expensive black-box functions [54]
- We proposed HD-GaBO, a high-dimensional geometry-aware Bayesian optimization framework that exploited geometric prior knowledge on the parameter space to optimize high-dimensional functions lying on low-dimensional latent spaces
- We used a geometry-aware GP that jointly learned a nested structure-preserving mapping and a representation of the objective function in the latent space. We considered the geometry of the latent space while optimizing the acquisition function and took advantage of the nested mappings to express the query point in the high-dimensional parameter space
- We showed that high-dimensional geometry-aware BO (HD-GaBO) outperformed other BO approaches in several settings, and consistently performed well while optimizing various objective functions, unlike geometry-unaware state-of-the-art methods
- To avoid suboptimal solutions where the optimum of the function may not be included in the estimated latent space, we hypothesize that the latent dimension d should be selected slightly larger than the assumed intrinsic dimension when its value is uncertain [37]
- A limitation of HD-GaBO is that it depends on nested mappings that are specific to each Riemannian manifold
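The geometry-aware GP mentioned above replaces the Euclidean distance in the SE kernel with the manifold's geodesic distance. A minimal sketch on the unit sphere follows (illustrative only; note that geodesic exponential kernels are not positive definite for every lengthscale on curved manifolds [20, 21], which is why the kernel hyperparameters must be constrained during fitting):

```python
import numpy as np

def sphere_geodesic_dist(x, y):
    # Great-circle (geodesic) distance between unit vectors on S^{d-1}.
    return np.arccos(np.clip(np.dot(x, y), -1.0, 1.0))

def geodesic_se_kernel(X, Y, beta=1.0):
    # Geodesic generalization of the SE kernel:
    #   k(x, y) = exp(-beta * d_geo(x, y)^2)
    # Caveat: not PD for arbitrary beta on curved manifolds (Feragen et al.),
    # so beta is restricted to an admissible range in practice.
    D = np.array([[sphere_geodesic_dist(x, y) for y in Y] for x in X])
    return np.exp(-beta * D**2)

x = np.array([1.0, 0.0, 0.0])
y = np.array([0.0, 1.0, 0.0])
K = geodesic_se_kernel([x], [y])   # geodesic distance between x and y is pi/2
print(K)
```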

Methods

- The authors evaluate the proposed HD-GaBO framework to optimize high-dimensional functions that lie on an intrinsic low-dimensional space.
- The authors carry out the optimization by running 30 trials with random initialization.
- Both GaBO and HD-GaBO use the geodesic generalization of the SE kernel and their acquisition functions are optimized using trust region on Riemannian manifolds [1].
- All the tested methods use EI as the acquisition function and are initialized with 5 random samples.
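The trust-region optimization of the acquisition function on a Riemannian manifold [1] moves candidates along geodesics, i.e., each update retracts a tangent-space step through the exponential map. A minimal sketch of the exponential and logarithmic maps on the unit sphere (assuming a sphere manifold; the paper's implementation covers other manifolds as well):

```python
import numpy as np

def sphere_exp(z, eta):
    # Exponential map on the unit sphere: follows the geodesic from z in the
    # direction of the tangent vector eta for arc length |eta|.
    n = np.linalg.norm(eta)
    if n < 1e-12:
        return z
    return np.cos(n) * z + np.sin(n) * (eta / n)

def sphere_log(z, x):
    # Logarithmic map: inverse of the exponential map; returns the tangent
    # vector at z pointing toward x with norm equal to the geodesic distance.
    c = np.clip(np.dot(z, x), -1.0, 1.0)
    theta = np.arccos(c)
    if theta < 1e-12:
        return np.zeros_like(z)
    v = x - c * z
    return theta * v / np.linalg.norm(v)

z = np.array([0.0, 0.0, 1.0])             # base point (north pole)
eta = np.array([np.pi / 2, 0.0, 0.0])     # tangent vector at z
x = sphere_exp(z, eta)                    # quarter-circle geodesic step
print(x)
```

A trust-region iteration would compute a tangent step, retract it with `sphere_exp`, and accept or shrink the region based on the observed decrease.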

Conclusion

- The authors proposed HD-GaBO, a high-dimensional geometry-aware Bayesian optimization framework that exploited geometric prior knowledge on the parameter space to optimize high-dimensional functions lying on low-dimensional latent spaces.
- A limitation of HD-GaBO is that it depends on nested mappings that are specific to each Riemannian manifold.
- The inverse map does not necessarily exist if the manifold contains self-intersections.
- In this case, a non-parametric reconstruction mapping may be learned.
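As a concrete instance of a nested structure-preserving mapping, the projection onto a subsphere from the principal-nested-spheres construction (Jung et al., 2012) can be sketched as follows; this is an illustrative special case, while the paper's learned mappings are more general:

```python
import numpy as np

def project_to_subsphere(x, v, r):
    # Project a unit vector x in S^{D-1} onto the subsphere
    #   A(v, r) = {p : d_geo(p, v) = r},
    # where v is the subsphere axis and r its geodesic radius
    # (nested-spheres construction of Jung et al., 2012).
    rho = np.arccos(np.clip(np.dot(v, x), -1.0, 1.0))  # distance from x to v
    return (np.sin(r) * x + np.sin(rho - r) * v) / np.sin(rho)

v = np.array([0.0, 0.0, 1.0])                     # subsphere axis
x = np.array([np.sqrt(0.5), 0.0, np.sqrt(0.5)])   # point 45 degrees from v
p = project_to_subsphere(x, v, r=np.pi / 2)       # project onto the equator
print(p, np.linalg.norm(p))
```

Composing such projections dimension by dimension yields the nested mapping from the high-dimensional sphere to a low-dimensional latent sphere; when the inverse exists, it is used to lift latent query points back to the parameter space.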

Study subjects and analysis

random samples: 5

The other state-of-the-art approaches use the classical SE kernel, and the constrained acquisition functions are optimized using sequential least squares programming [36]. All the tested methods use EI as the acquisition function and are initialized with 5 random samples. The GP parameters are estimated using MLE.
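The MLE step fits the GP hyperparameters by minimizing the negative log marginal likelihood. A minimal Euclidean sketch with an SE kernel and a coarse grid search (hypothetical data; practical implementations such as GPyTorch use gradient-based optimizers instead):

```python
import numpy as np

def neg_log_marginal_likelihood(theta, X, y):
    # Negative log marginal likelihood of a zero-mean GP with SE kernel;
    # theta = (log lengthscale, log noise variance).
    ls, noise = np.exp(theta)
    D = X[:, None] - X[None, :]
    K = np.exp(-0.5 * (D / ls) ** 2) + noise * np.eye(len(X))
    L = np.linalg.cholesky(K)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y))
    return (0.5 * y @ alpha
            + np.log(np.diag(L)).sum()
            + 0.5 * len(y) * np.log(2 * np.pi))

rng = np.random.default_rng(1)
X = rng.uniform(0.0, 1.0, 8)              # made-up training inputs
y = np.sin(4 * X)                         # made-up observations
# Coarse grid search over log-hyperparameters (a stand-in for a real optimizer).
grids = [(l, n) for l in np.linspace(-2, 1, 10) for n in np.linspace(-6, -2, 5)]
best = min(grids, key=lambda t: neg_log_marginal_likelihood(np.array(t), X, y))
print(np.exp(best))                       # fitted lengthscale and noise
```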


Reference

- P. A. Absil, C. G. Baker, and K. A. Gallivan. Trust-region methods on Riemannian manifolds. Foundations of Computational Mathematics, 7:303–330, 2007.
- P. A. Absil, R. Mahony, and R. Sepulchre. Optimization Algorithms on Matrix Manifolds. Princeton University Press, 2007.
- R. Antonova, A. Rai, and C. Atkeson. Deep kernels for optimizing locomotion controllers. In Conference on Robot Learning (CoRL), pages 47–56, 2017.
- R. Antonova, A. Rai, T. Li, and D. Kragic. Bayesian optimization in variational latent spaces with dynamic compression. In Conference on Robot Learning (CoRL), 2019.
- V. Arsigny, P. Fillard, X. Pennec, and N. Ayache. Log-Euclidean metrics for fast and simple calculus on diffusion tensors. Magnetic Resonance in Medicine, 56(2):411–421, 2006.
- R. Bhatia. Positive Definite Matrices. Princeton University Press, 2007.
- M. Balandat, B. Karrer, D. R. Jiang, S. Daulton, B. Letham, A. G. Wilson, and E. Bakshy. BoTorch: Programmable Bayesian optimization in PyTorch. arXiv preprint 1910.06403, 2019.
- A. Barachant, S. Bonnet, M. Congedo, and C. Jutten. Multiclass brain-computer interface classification by Riemannian geometry. IEEE Trans. on Biomedical Engineering, 59(4):920– 928, 2012.
- M. Binois, D. Ginsbourger, and O. Roustant. On the choice of the low-dimensional domain for global optimization via random embeddings. Journal of Global Optimization, 76(1):69–90, 2020.
- N. Boumal. Riemannian trust regions with finite-difference hessian approximations are globally convergent. In Geometric Science of Information (GSI), pages 467–475, 2015.
- R. H. Byrd, R. B. Schnabel, and G. A. Shultz. A trust region algorithm for nonlinearly constrained optimization. SIAM Journal on Numerical Analysis, 24(5):1152–1170, 1987.
- R. Calandra, J. Peters, C. E. Rasmussen, and M. P. Deisenroth. Manifold Gaussian processes for regression. In Proc. IEEE Intl Joint Conf. on Neural Networks (IJCNN), 2016.
- A. Cully, J. Clune, D. Tarapore, and J. B. Mouret. Robots that can adapt like animals. Nature, 521:503–507, 2015.
- T. R. Davidson, L. Falorsi, N. De Cao, T. Kipf, and J. M. Tomczak. Hyperspherical variational auto-encoders. In Conference on Uncertainty in Artificial Intelligence (UAI), 2018.
- J. Djolonga, A. Krause, and V. Cevher. High-dimensional Gaussian process bandits. In Neural Information Processing Systems (NeurIPS), 2013.
- D. K. Duvenaud. Automatic Model Construction with Gaussian Processes. PhD thesis, University of Cambridge, 2014.
- A. Edelman, T. A. Arias, and S. Smith. The geometry of algorithms with orthogonality constraints. SIAM Journal of Matrix Analysis and Applications, 20(2):303–351, 1998.
- P. Englert and M. Toussaint. Combined optimization and reinforcement learning for manipulation skills. In Robotics: Science and Systems (R:SS), 2016.
- A. Feragen and S. Hauberg. Open problem: Kernel methods on manifolds and metric spaces. What is the probability of a positive definite geodesic exponential kernel? In 29th Annual Conference on Learning Theory, pages 1647–1650, 2016.
- A. Feragen, F. Lauze, and S. Hauberg. Geodesic exponential kernels: When curvature and linearity conflict. In IEEE Conf. on Computer Vision and Pattern Recognition (CVPR), 2015.
- N. I. Fisher, T. Lewis, and B. J. J. Embleton. Statistical analysis of spherical data. Cambridge University Press, 1987.
- P. T. Fletcher and S. C. Joshi. Principal geodesic analysis on symmetric spaces: Statistics of diffusion tensors. In In Proc. of CVAMIA and MMBIA Worshops, pages 87–98, 2004.
- J. R. Gardner, C. Guo, K. Q. Weinberger, R. Garnett, and R. Grosse. Discovering and exploiting additive structure for Bayesian optimization. In Proc. of the Intl Conf. on Artificial Intelligence and Statistics (AISTATS), pages 1311–1319, 2017.
- J. R. Gardner, G. Pleiss, D. Bindel, K. Q. Weinberger, and A. G. Wilson. GPyTorch: Blackbox matrix-matrix Gaussian process inference with GPU acceleration. In Neural Information Processing Systems (NeurIPS), 2018.
- R. Garnett, M. A. Osborne, and P. Hennig. Active learning of linear embeddings for Gaussian processes. In Conference of Uncertainty in Artificial Intelligence (UAI), pages 230–239, 2014.
- D. Gaudrie, R. Le Riche, V. Picheny, B. Enaux, and V. Herbert. Modeling and optimization with Gaussian processes in reduced eigenbases. Structural and Multidisciplinary Optimization, 61(6):2343–2361, 2020.
- B. Gong, Y. Shi, F. Sha, and K. Grauman. Geodesic flow kernel for unsupervised domain adaptation. In IEEE Conf. on Computer Vision and Pattern Recognition (CVPR), pages 2066– 2073, 2012.
- M. Harandi, M. Salzmann, and R. Hartley. From manifold to manifold: Geometry-aware dimensionality reduction for spd matrices. In Proc. European Conf. on Computer Vision (ECCV), 2014.
- M. Harandi, M. Salzmann, and R. Hartley. Dimensionality reduction on spd manifolds: The emergence of geometry-aware methods. IEEE Transactions on Pattern Analysis and Machine Intelligence, 40(1):48–62, 2018.
- S. Hauberg. Principal curves on Riemannian manifolds. IEEE Transactions on Pattern Analysis and Machine Intelligence, 38(9):1915–1921, 2016.
- J. Hu, X. Liu, Z. Wen, and Y. Yuan. A brief introduction to manifold optimization. arXiv preprint 1906.05450, 2019.
- N. Jaquier, L. Rozo, S. Calinon, and M. Bürger. Bayesian optimization meets Riemannian manifolds in robot learning. In Conference on Robot Learning (CoRL), 2019.
- S. Jayasumana, R. Hartley, M. Salzmann, H. Li, and M. Harandi. Kernel methods on Riemannian manifolds with Gaussian RBF kernels. IEEE Trans. on Pattern Analysis and Machine Intelligence, 37(12):2464–2477, 2015.
- S. Jung, I. L. Dryden, and J. S. Marron. Analysis of principal nested spheres. Biometrika, 99(3): 551–568, 2012.
- K. Kandasamy, J. Schneider, and B. Poczos. High dimensional Bayesian optimisation and bandits via additive models. In Intl. Conf. on Machine Learning (ICML), 2015.
- D. Kraft. A software package for sequential quadratic programming. Technical report, Technical Report DFVLR-FB 88-28, Institut für Dynamik der Flugsysteme, Oberpfaffenhofen, 1988.
- B. Letham, R. Calandra, A. Rai, and E. Bakshy. Re-examining linear embeddings for highdimensional Bayesian optimization. In Neural Information Processing Systems (NeurIPS), 2020.
- C. Li, S. Gupta, S. Rana, V. Nguyen, S. Venkatesh, and A. Shilton. High dimensional Bayesian optimization using dropout. In Intl. Joint Conf. on Artificial Intelligence (IJCAI), pages 2096– 2102, 2017.
- C.-L. Li, K. Kandasamy, B. Póczos, and J. Schneider. High dimensional Bayesian optimization via restricted projection pursuit models. In Proc. of the Intl Conf. on Artificial Intelligence and Statistics (AISTATS), 2016.
- C. Liu and N. Boumal. Simple algorithms for optimization on Riemannian manifolds with constraints. Applied Mathematics & Optimization, pages 1–33, 2019.
- A. Mallasto and A. Feragen. Wrapped Gaussian process regression on Riemannian manifolds. In IEEE Conf. on Computer Vision and Pattern Recognition (CVPR), pages 5580–5588, 2018.
- A. Marco, P. Hennig, J. Bohg, S. Schaal, and S. Trimpe. Automatic LQR tuning based on Gaussian process global optimization. In IEEE Intl. Conf. on Robotics and Automation (ICRA), pages 270–277, 2016.
- A. Marco, P. Hennig, S. Schaal, and S. Trimpe. On the design of LQR kernels for efficient controller learning. In IEEE Conference on Decision and Control (CDC), pages 5193–5200, 2017.
- R. Moriconi, M. P. Deisenroth, and K. S. Sesh Kumar. High-dimensional Bayesian optimization using low-dimensional feature spaces. Machine Learning, 109:1925–1943, 2020.
- A. Munteanu, A. Nayebi, and M. Poloczek. A framework for Bayesian optimization in embedded subspaces. In Intl. Conf. on Machine Learning (ICML), volume 97, pages 4752–4761, 2019.
- M. Mutný and A. Krause. Efficient high dimensional Bayesian optimization with additivity and quadrature fourier features. In Neural Information Processing Systems (NeurIPS), 2018.
- C. Oh, E. Gavves, and M. Welling. BOCK: Bayesian optimization with cylindrical kernels. In Intl. Conf. on Machine Learning (ICML), pages 3868–3877, 2018.
- X. Pennec. Barycentric subspace analysis on manifolds. Annals of Statistics, 46(6A):2711–2746, 2018.
- X. Pennec, P. Fillard, and N. Ayache. A Riemannian framework for tensor computing. Intl. Journal on Computer Vision, 66(1):41–66, 2006.
- X. Pennec, S. Sommer, and T. Fletcher. Riemannian Geometric Statistics in Medical Image Analysis. Elsevier, 2019.
- A. Pewsey and E. García-Portugués. Recent advances in directional statistics. arXiv preprint 2005.06889, 2020.
- S. M. Pizer, S. Jung, D. Goswami, J. Vicory, X. Zhao, R. Chaudhuri, J. N. Damon, S. Huckemann, and J. S. Marron. Nested sphere statistics of skeletal models. Innovations for Shape Analysis, pages 93–115, 2012.
- A. Rai, R. Antonova, S. Song, W. Martin, H. Geyer, and C. Atkeson. Bayesian optimization using domain knowledge on the ATRIAS biped. In IEEE Intl. Conf. on Robotics and Automation (ICRA), pages 1771–1778, 2018.
- B. Shahriari, K. Swersky, Z. Wang, R. P. Adams, and N. de Freitas. Taking the human out of the loop: A review of Bayesian optimization. Proceedings of the IEEE, 104(1):148–175, 2016.
- J. Snoek, H. Larochelle, and R. P. Adams. Practical Bayesian optimization of machine learning algorithms. In Neural Information Processing Systems (NeurIPS), page 2951–2959, 2012.
- S. Sommer, F. Lauze, S. Hauberg, and M. Nielsen. Manifold valued statistics, exact principal geodesic analysis and the effect of linear approximations. In European Conf. On Computer Vision, pages 43–56, 2010.
- S. Sommer, F. Lauze, and M. Nielsen. Optimization over geodesics for exact principal geodesic analysis. Advances in Computational Mathematics, 40(2):283–313, 2014.
- S. Sra. Directional statistics in machine learning: a brief review. In C. Ley and T. Verdebout, editors, Applied Directional Statistics, Chapman & Hall/CRC Interdisciplinary Statistics Series, pages 259–276. CRC Press, Boca Raton, 2018.
- J. Townsend, N. Koep, and S. Weichwald. Pymanopt: A python toolbox for optimization on manifolds using automatic differentiation. Journal of Machine Learning Research, 17(137): 1–5, 2016.
- O. Tuzel, F. Porikli, and P. Meer. Region covariance: A fast descriptor for detection and classification. In European Conference on Computer Vision (ECCV), pages 589–600, 2006.
- Z. Wang, M. Zoghi, F. Hutter, D. Matheson, and N. de Freitas. Bayesian optimization in high dimensions via random embeddings. In Intl. Joint Conf. on Artificial Intelligence (IJCAI), pages 1778–1784, 2013.
- J. Xu and G. Durrett. Spherical latent spaces for stable variational autoencoders. In In Proc. of Conf. on Empirical Methods in Natural Language Processing (EMNLP), 2018.
- Y. Yuan. A review of trust region algorithms for optimization. In In Proc. of the Intl Congress on Industrial & Applied Mathematics (ICIAM), pages 271–282, 1999.
- M. Zhang, H. Li, and S. Su. High dimensional Bayesian optimization via supervised dimension reduction. In Proc. of Intl Joint Conf. on Artificial Intelligence (IJCAI), 2019.
- S. Zhu, D. Surovik, K. Bekris, and A. Boularias. Efficient model identification for tensegrity locomotion. In IEEE/RSJ Intl. Conf. on Intelligent Robots and Systems (IROS), pages 2985– 2990, 2018.
