
Convex and concave envelopes of artificial neural network activation functions for deterministic global optimization

Journal of Global Optimization

Abstract

In this work, we present general methods to construct convex/concave relaxations of the activation functions commonly used in artificial neural networks (ANNs). The choice of these functions is often informed by broader modeling considerations balanced against the need for high computational performance. Directly applying factorable programming techniques to compute bounds and convex/concave relaxations of such functions often leads to weak enclosures due to the dependency problem. Moreover, the piecewise formulations that define several popular activation functions violate the factorable function requirement and thus prevent the computation of convex/concave relaxations altogether. To improve the performance of relaxations of ANNs in deterministic global optimization applications, this study develops a library of envelopes of the thoroughly studied rectifier-type and sigmoid activation functions, in addition to the novel self-gated sigmoid-weighted linear unit (SiLU) and Gaussian error linear unit (GELU) activation functions. We demonstrate that the envelopes of activation functions directly lead to tighter relaxations of ANNs on their input domains. In turn, these improvements translate to a dramatic reduction in the CPU runtime required to solve optimization problems involving ANN models to ε-global optimality. We further demonstrate that the factorable programming approach offers superior computational performance over alternative state-of-the-art approaches.
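To make the notion of an envelope concrete, consider the ReLU activation on a box domain: because ReLU is convex, its convex envelope on an interval [xL, xU] is the function itself, and its concave envelope is the secant line through the interval endpoints. The following minimal Julia sketch illustrates this envelope pair only; it is not the authors' library, and names such as relu_cv and relu_cc are placeholders introduced here for illustration.

```julia
# Illustrative sketch (not the authors' library): convex/concave envelope of
# the ReLU activation x -> max(x, 0) on a box domain [xL, xU].
# ReLU is convex, so its convex envelope is ReLU itself; its concave envelope
# is the secant line through (xL, relu(xL)) and (xU, relu(xU)).

relu(x) = max(x, 0.0)

# Convex envelope (tightest convex underestimator) of ReLU on [xL, xU].
relu_cv(x, xL, xU) = relu(x)

# Concave envelope (tightest concave overestimator) of ReLU on [xL, xU].
function relu_cc(x, xL, xU)
    xL >= xU && return relu(xL)                       # degenerate interval
    slope = (relu(xU) - relu(xL)) / (xU - xL)
    return relu(xL) + slope * (x - xL)
end

# Example: on [-1, 2] the pair (relu_cv, relu_cc) encloses ReLU exactly,
# avoiding the weakening that factor-by-factor composition can introduce.
xL, xU = -1.0, 2.0
for x in range(xL, xU; length = 5)
    println((x, relu_cv(x, xL, xU), relu(x), relu_cc(x, xL, xU)))
end
```

Nonconvex activations such as sigmoid, SiLU, and GELU require case analysis (e.g., locating tangent points) to obtain their envelopes, which is the substance of the library developed in the paper.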


Data Availability

The datasets generated and analysed during the current study are available in the GitHub repository https://github.com/PSORLab/RSActivationFunctions.


Acknowledgements

This material is based upon work supported by the National Science Foundation under Grant No. 1932723. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the authors and do not necessarily reflect the views of the National Science Foundation.

Author information

Corresponding author

Correspondence to Matthew D. Stuber.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.


About this article


Cite this article

Wilhelm, M.E., Wang, C. & Stuber, M.D. Convex and concave envelopes of artificial neural network activation functions for deterministic global optimization. J Glob Optim 85, 569–594 (2023). https://doi.org/10.1007/s10898-022-01228-x

