Abstract
In this work, we present general methods to construct convex/concave relaxations of the activation functions commonly used in artificial neural networks (ANNs). The choice of these functions is often informed by broader modeling considerations balanced against the need for high computational performance. Directly applying factorable programming techniques to compute bounds and convex/concave relaxations of such functions often leads to weak enclosures due to the dependency problem. Moreover, the piecewise formulations that define several popular activation functions prevent the computation of convex/concave relaxations because they violate the factorable function requirement. To improve the performance of relaxations of ANNs for deterministic global optimization applications, this study develops a library of envelopes of the thoroughly studied rectifier-type and sigmoid activation functions, as well as the novel self-gated sigmoid-weighted linear unit (SiLU) and Gaussian error linear unit (GELU) activation functions. We demonstrate that envelopes of activation functions directly lead to tighter relaxations of ANNs on their input domain. In turn, these improvements translate to a dramatic reduction in the CPU runtime required to solve optimization problems involving ANN models to ε-global optimality. We further demonstrate that the factorable programming approach delivers superior computational performance over alternative state-of-the-art approaches.
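For intuition on why envelopes help: the naive factorable approach relaxes an activation term by term, so repeated occurrences of the same variable weaken the enclosure (the dependency problem), whereas an envelope is by definition the tightest convex/concave pair on the domain. For example, ReLU is itself convex, so its convex envelope on an interval is the function itself and its concave envelope is the secant through the interval endpoints. The sketch below, a minimal illustration rather than the paper's envelope construction, uses the open-source McCormick.jl package (see Data Availability) to propagate McCormick relaxations through a term-by-term decomposition of SiLU, silu(x) = x·σ(x); the variable bounds, reference point, and the commented activation-name call are illustrative assumptions, and exported names may vary across package versions.

```julia
using McCormick  # relaxation arithmetic from the PSORLab ecosystem

# Relax f(x) = silu(x) = x * sigmoid(x) on x ∈ [-4, 4] at the
# reference point x = 1. The MC constructor's last argument is the
# index of x in the (here, one-dimensional) decision-variable vector;
# NS selects the nonsmooth McCormick relaxation scheme.
x = MC{1,NS}(1.0, Interval{Float64}(-4.0, 4.0), 1)

# Naive factorable composition: each intrinsic operation (exp, +, /, *)
# is relaxed separately, so the dependency between the two occurrences
# of x weakens the resulting enclosure.
f = x * (1.0 / (1.0 + exp(-x)))

@show f.cv    # convex relaxation value at the reference point
@show f.cc    # concave relaxation value at the reference point
@show f.Intv  # interval enclosure of f over [-4, 4]

# With the envelope results of this paper (shipped in recent
# McCormick.jl releases), a single call such as `swish(x)` would
# return a tighter enclosure directly; the exact exported name is
# version-dependent and is given here only as an assumption.
```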
Data Availability
The datasets generated and analysed during the current study are available in the GitHub repository, https://github.com/PSORLab/RSActivationFunctions.
Acknowledgements
This material is based upon work supported by the National Science Foundation under Grant No. 1932723. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the authors and do not necessarily reflect the views of the National Science Foundation.