Activation functions for deep neural networks
Abstract
Deep Neural Networks are connectionist models composed of multiple layers of neuron-like computational units that aim to reproduce two fundamental aspects of human intelligence: i) learning from examples, and ii) generalizing the learned knowledge and skills to new, unseen examples. Designing such a model for any real-world application requires several steps, including the choice of an adequate architecture in terms of the number of layers, as well as the size, the type, and the activation function of each layer. This paper investigates the impact of activation functions on the overall performance of the network, an aspect that remained underestimated until the early 2010s. The paper provides a brief historical review of recent literature, along with practical recommendations for selecting the most appropriate activation function for each situation. To illustrate the problem under study, we also present experimental results showing that a shallow neural network with appropriate activation functions can outperform a deeper neural network using traditional but inappropriate ones.
Commun. Math. Biol. Neurosci.
ISSN 2052-2541
Editorial Office: [email protected]
Copyright ©2025 CMBN
Communications in Mathematical Biology and Neuroscience
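To make the activation functions discussed in the abstract concrete, the following is a minimal illustrative sketch (not code from the paper itself) of three standard choices, the classic sigmoid and tanh alongside the ReLU popularized in the early 2010s, implemented with NumPy:

```python
import numpy as np

# Illustrative definitions of three standard activation functions.
# These are textbook formulas, not an implementation from the paper.

def sigmoid(x):
    """Logistic sigmoid: maps inputs to (0, 1); the historical default."""
    return 1.0 / (1.0 + np.exp(-x))

def tanh(x):
    """Hyperbolic tangent: a zero-centered alternative to the sigmoid."""
    return np.tanh(x)

def relu(x):
    """Rectified Linear Unit: zero for negative inputs, identity otherwise."""
    return np.maximum(0.0, x)

x = np.array([-2.0, 0.0, 3.0])
print(sigmoid(x))  # values in (0, 1), with sigmoid(0) = 0.5
print(tanh(x))     # values in (-1, 1), with tanh(0) = 0
print(relu(x))     # [0. 0. 3.]
```

ReLU's popularity stems in part from its cheap computation and non-saturating positive region, which mitigates the vanishing-gradient problem that affects sigmoid and tanh in deep networks.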