# The (Artificial) Neural Networks Framework; More Interesting Than You Think

Author: Oluwole Oyetoke (12th May, 2017)

## Background

When people hear terms like 'Artificial Neural Networks', 'Machine Learning', 'Deep Learning' or 'Artificial Intelligence', these often come across as obscure concepts from a field requiring a lot of expertise, mathematics and statistical knowledge. While that is true, it is also true that we can start learning in this area in a simpler manner: no rush into the maths and calculations, but rather a little more verbal explanation of the philosophy behind some of these systems. This post focuses on Neural Networks in general.

Neural networks, both in humans and in their artificial replicas, are made up of interconnected neurons which pass data between each other and act on these data. The ANN is an abstraction of the neural network of the human body: a biologically inspired family of computation architectures built as extremely simplified models of the human brain. The sections that follow describe both the Biological Neural Network (BNN) and the Artificial Neural Network (ANN), juxtaposing their characteristics while also highlighting their fundamental similarities and operating principles.

## Biological Neural Networks (BNN)

The human nervous system can be broadly divided into two main parts: the Central Nervous System (CNS) and the Peripheral Nervous System (PNS). The CNS consists of the brain and the spinal cord, where all the analysis of information takes place. The brain contains both large-scale and small-scale anatomical structures carrying out different functions at higher and lower levels. These structures include molecules and ions, synapses, neuronal microcircuits, dendritic trees, neurons, local circuits and inter-regional circuits. The PNS, on the other hand, consists of neurons such as the sensory neurons and motor neurons. The sensory neurons bring signals into the CNS while the motor neurons carry signals out of the CNS. Based on their different roles and functionality, neurons can be classified under three main classes as explained below.

1. Sensory Neurons: Sensory neurons get information on activities taking place both within and outside the human body and bring this information into the CNS so it can be processed.
2. Motor Neurons: Motor neurons get information from other neurons and convey commands to the human muscles, organs and glands.
3. Interneurons: Interneurons, which are found only in the CNS, connect one neuron to another. They receive information from other neurons and transmit it to other neurons.

Based on these three classes of neurons, we can summarise their functions as receiving signals (or information), integrating incoming signals (to determine whether the information should be passed along) and communicating signals to target cells.

Largely, the human nervous system can be broken down into three stages as represented in Diagram 1 below. The receptors collect information from the environment (e.g. photons on the retina). Between the receptors and the effectors sits the neural network itself (the brain), which processes the incoming information. The effectors generate interactions with the environment (e.g. activate muscles).

Diagram 1: Three Stage Human Neural Network Representation

The atomic structure of the BNN exploited in the realization of Artificial Neural Networks is the neuron. Diagram 2 below shows the schematic diagram of a biological neuron, followed by a breakdown of the functions of the various elements that make up the neuron.

Diagram 2: Schematic Diagram of a Typical Biological Neuron

The functions of the four main parts of the biological neuron are listed below.

• Soma: The soma is the Neuron’s cell body. The nucleus of the Neuron resides here and various other extensions such as the Axon and Dendrites are attached to this part of the Neuron.
• Dendrites: Branches that receive chemical messages from other neurons. They receive and process incoming information. The interpreted incoming signal can be either excitatory or inhibitory. Whether a neuron is excited into firing an impulse depends on the sum of all the excitatory and inhibitory signals it receives.
• Axon: It is basically the trunk of the neuron. If the neuron does end up firing, the nerve impulse, or action potential, is conducted down the axon. Towards its end, the axon splits up into many branches and develops bulbous swellings known as axon terminals (or nerve terminals). These axon terminals make connections on target cells.
• Synapse: These are small gaps that exist between the axon of one neuron and the dendrites of the other. Neuron-to-neuron connections are made onto the dendrites and cell bodies of other neurons. These connections, known as synapses, are the sites at which information is carried from the first neuron to the target neuron.

## Signal Processing in BNNs

The axon-dendrite connections between neurons are not, on their own, sufficient for the transmission of information in the BNN. For adequate transmission of information, the BNN has a biological method of signal processing based on the steps listed below. This method is highly dependent on timing: input signals must arrive together, and stronger inputs translate to more action potentials being generated per unit time.

1. Signals from connected neurons are collected by the dendrites.
2. The soma sums the incoming signals (spatially and temporally).
3. When sufficient input is received (threshold is exceeded), the neuron generates an action potential.
4. That action potential is transmitted along the axon to other neurons, or to structures outside the nervous system (e.g., muscles).
5. If sufficient input is not received (threshold not exceeded), the inputs quickly decay and no action potential is generated.
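The five steps above can be sketched as a toy "leaky integrator". This is only an illustration of the summing, threshold and decay behaviour, not a physiological model: the function name `simulate_neuron`, the threshold of 1.0, the decay factor of 0.5 and the input values are all assumptions made for the example.

```python
def simulate_neuron(inputs_per_step, threshold=1.0, decay=0.5):
    """Toy leaky integrator. Returns the time steps at which the neuron fires.

    inputs_per_step: one list of incoming signal strengths per time step
    (positive = excitatory, negative = inhibitory). All values are illustrative.
    """
    potential = 0.0
    spike_times = []
    for t, incoming in enumerate(inputs_per_step):
        # Step 5: whatever was left over from earlier steps decays.
        potential *= decay
        # Steps 1-2: the dendrites collect the signals and the soma sums them.
        potential += sum(incoming)
        # Steps 3-4: if the threshold is reached, an action potential is fired.
        if potential >= threshold:
            spike_times.append(t)
            potential = 0.0  # assumed reset after firing
    return spike_times


# The same three inputs, arriving spread out versus arriving together.
spread_out = simulate_neuron([[0.3], [0.4], [0.4]])
together = simulate_neuron([[0.3, 0.4, 0.4], [], []])
print(spread_out, together)
```

With these assumed numbers, the inputs that arrive together push the neuron over its threshold, while the same inputs spread over three time steps decay away before they can add up, which is the timing dependence described above.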

## Artificial Neural Networks (ANN)

Artificial Neural Networks (ANNs) are computing systems designed to simulate the way the human brain analyses and processes information. They operate on principles similar to those used by biological neural networks for learning and for solving classification challenges. Their major elements are the processing unit, the topology and the learning algorithm. It is important to note that the human brain, which the ANN is modelled after, has roughly 86 billion interconnected neurons, which together form an incredibly complex interacting network enabling humans to see, hear, move, communicate, remember, analyse and understand. Since the human brain works in parallel, ANNs are also designed to consist of many (artificial) neurons that act together in parallel to produce classified outputs, even from previously unseen data. ANNs were developed as a generalization of mathematical models of neural biology, based on the following assumptions.

1. Neurons are simple units in a nervous system at which information processing occurs.
2. Information takes the form of signals that are passed between neurons over connection links.
3. Each connection link has a corresponding weight which multiplies the transmitted signal.
4. Each neuron applies an activation function to its net input (the sum of its weighted input signals) to determine its output signal.

The first ANN model was proposed by Warren McCulloch and Walter Pitts in 1943. Their efforts were complemented in 1949 by Donald Hebb, the psychologist at McGill University who developed the first learning law for ANNs. Today, different types of ANN exist, all falling broadly under the single-layer feedforward, multilayer feedforward and feedback (recurrent) categories.

The application areas for ANN include:

1. Pattern Recognition/Classification: Optical Character Recognition (OCR).
2. Biometrics: Speech Recognition.
3. Signal Processing.
4. Control Systems.
5. Stock Market Prediction.

## Fundamental Unit of the ANN (The Neuron)

Artificial neurons are the constitutive units of an artificial neural network. Depending on the specific model used, they may be called a semi-linear unit, Nv neuron, binary neuron, linear threshold function, or McCulloch–Pitts (MCP) neuron. The neuron performs two functions: collection of inputs and generation of an output. Each node's output depends only on the information that is locally available to it, either stored internally or arriving via the weighted connections. Each unit receives inputs from many other nodes and transmits its output to yet another set of nodes. By itself, however, a single processing element is not very powerful; it generates a scalar output, a single numerical value, which is a simple non-linear function of its inputs. The power of the system emerges when layers of these fundamental units (neurons) are cascaded together into a network, generally called the Artificial Neural Network. Diagram 3 below shows a schematic representation of the artificial neuron and its mode of operation.

Diagram 3: An ANN Neuron

$$\text{Output} = \text{Activation Function}\left( \sum_{i=0}^{n} x_i \, w_i + \text{Bias} \right) \dots (1)$$

$$x_i = \text{Inputs}$$

$$w_i = \text{Weights}$$

An initial choice of random weights is given to the connections between the various layers of neurons in the network. Inputs coming in to each neuron are multiplied by the respective weights of their connection paths; the neuron then sums these weighted inputs with its own bias and passes the net input through a specified activation function. With a threshold-type (step) activation function, an output is triggered if the net input reaches the threshold value; otherwise, no output is triggered. This is analogous to the model of signal processing used by the BNN. It is also important to note that these weight values change as the network gets tuned over time.
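As a rough sketch of equation (1), the computation performed by a single neuron fits in a few lines of Python. The function names, the binary step activation and the input, weight and bias values are assumptions chosen purely for illustration.

```python
def binary_step(net_input, threshold=0.0):
    """Assumed activation: fire (1) when the net input reaches the threshold."""
    return 1 if net_input >= threshold else 0


def neuron_output(inputs, weights, bias, activation=binary_step):
    """Equation (1): activation(sum of x_i * w_i + bias)."""
    net_input = sum(x * w for x, w in zip(inputs, weights)) + bias
    return activation(net_input)


# Illustrative values only: two inputs, two weights and two different biases.
fires = neuron_output([1.0, 0.5], [0.4, -0.2], bias=0.1)
stays_silent = neuron_output([1.0, 0.5], [0.4, -0.2], bias=-0.5)
print(fires, stays_silent)
```

Keeping the activation as a parameter means the same neuron can be reused with any of the activation functions discussed later in the post.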

Diagram 4: Abbreviated (Matrix) Notation for Neuron with Multiple-Input

$$m = \begin{bmatrix} w_0 & w_1 & w_2 & w_3 & \dots & w_n \end{bmatrix} \begin{bmatrix} x_0 \\ x_1 \\ x_2 \\ x_3 \\ \vdots \\ x_n \end{bmatrix} + \text{Bias} = \sum_{i=0}^{n} x_i \, w_i + \text{Bias} \dots (2)$$

$$f(m) = a \dots (3)$$

## ANN Activation Functions

In biologically inspired neural networks, the activation function is usually an abstraction representing the rate of action potential firing in the cell. In the ANN, it serves as the decision maker which finally maps the set of inputs to a particular region of the output domain. The activation function (or transfer function), when applied to the net input of the ANN neuron, produces a specified output. In most cases it limits the amplitude of the neuron's output to some finite range, usually between 0 and 1, or between -1 and 1. Many activation functions have been tested for artificial neurons, but only a few have found practical application. In biological neurons the timing of individual spikes is often important, which makes “spike time coding” the most biologically realistic representation for artificial neural networks.

Table 1: List of Commonly Used ANN Activation Functions

| S/N | Function Name | Mathematical Description |
|-----|---------------|--------------------------|
| 1 | Identity Function | $$f(z) = z$$ |
| 2 | Binary Step Function | $$f(z) = \begin{cases} 1, & \text{if } z \geq T \\ 0, & \text{if } z < T \end{cases}$$ |
| 3 | Bipolar Step Function | $$f(z) = \begin{cases} 1, & \text{if } z \geq T \\ -1, & \text{if } z < T \end{cases}$$ |
| 4 | Binary Sigmoid Function | $$f(z) = \frac{1}{1 + \exp(-\alpha z)}$$ |
| 5 | Bipolar Sigmoid Function | $$f(z) = \frac{1 - \exp(-\alpha z)}{1 + \exp(-\alpha z)}$$ |
| 6 | Hyperbolic Tangent Function | $$f(z) = \tanh(z) = \frac{2}{1 + e^{-2z}} - 1$$ |
| 7 | Rectified Linear Unit (ReLU) | $$f(z) = \begin{cases} 0, & \text{if } z < 0 \\ z, & \text{if } z \geq 0 \end{cases}$$ |

Here $$z$$ is the sum of weighted inputs, $$\alpha$$ is the steepness parameter and $$T$$ is the threshold value.
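For readers who prefer code to formulas, the functions in Table 1 can be written out directly. This is a plain transcription using Python's standard `math` module; the function names and the default values for the threshold `T` and the steepness `alpha` are assumptions, and no care is taken over numerical overflow for very large negative inputs.

```python
import math


def identity(z):
    return z


def binary_step(z, T=0.0):
    return 1 if z >= T else 0


def bipolar_step(z, T=0.0):
    return 1 if z >= T else -1


def binary_sigmoid(z, alpha=1.0):
    # Output lies between 0 and 1.
    return 1.0 / (1.0 + math.exp(-alpha * z))


def bipolar_sigmoid(z, alpha=1.0):
    # Output lies between -1 and 1.
    e = math.exp(-alpha * z)
    return (1.0 - e) / (1.0 + e)


def hyperbolic_tan(z):
    # Same formula as in the table; equivalent to math.tanh(z).
    return 2.0 / (1.0 + math.exp(-2.0 * z)) - 1.0


def relu(z):
    return z if z >= 0 else 0.0


# Evaluate every function at the same illustrative net input.
for f in (identity, binary_step, bipolar_step, binary_sigmoid,
          bipolar_sigmoid, hyperbolic_tan, relu):
    print(f.__name__, f(0.5))
```

Writing them side by side also makes one relationship easy to check: with a steepness of 1, the bipolar sigmoid of `z` is the hyperbolic tangent of `z / 2`.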

## ANN Training Mechanisms Overview

The ANN neurons are connected to form a computational network. Once the desired network has been structured for an application, it proceeds to the training stage, which starts with an initial choice of random weights for each neuron's weighted input connections. These initial random weights are adapted over the training process to produce suitable final weight values which, on interaction with future input data, will most likely produce the desired results, even for previously unseen data. In other words, during training the network is stimulated by its environment (through sets of input data). Due to this stimulation, the network undergoes changes in its internal parameters (weights), as regulated by a set of predefined rules. These changes to its internal structure make the network respond to its surroundings in a different way on future interactions. The predefined rules that govern the learning process of the ANN are categorized under the following classes:

• Supervised Learning (Error Based)
• Unsupervised Learning
• Reinforcement Learning

Diagram 5: Hierarchical Model of the ANN Learning Class and Algorithms

## Supervised Learning

The focus here is on neural networks developed through a supervised learning process. As the name indicates, supervised learning is a method of training artificially intelligent systems by feeding them data alongside the correct answers for that data. In other words, the training data includes both the inputs and the desired results (the supervisory signal). Essentially, an effective training process leaves the system in a state where it can confidently classify further inputs based on its previous training experience. The network processes the training inputs and compares its resulting outputs against the desired outputs; the errors are then propagated back through the system (back propagation), causing the system to adjust the weights which control the network. This process occurs over and over, with the weights continually adjusted, until weight values are reached that are capable of producing near-correct classifications when fed with raw data in the future. There are two patterns of implementing supervised learning on an ANN, as listed below.

• Stochastic or Online Training: the weights are updated after each individual training example.
• Batch or Offline Training: the weights are updated once per pass, after the errors from all the training examples have been accumulated.

In either pattern, training by back propagation involves two phases:

• Phase 1 (Propagation Phase): The training inputs are propagated forward across the neural network in order to generate the network's output value(s). After this, the error is propagated back through the network for error correction.
• Phase 2 (Weight Update): The contribution of each weight to the network’s total error is calculated, and the result is used to determine how to adjust each weight so as to minimize the total error of the system's output.

Phases 1 and 2 are repeated until the performance of the network is satisfactory. It is important to note that the speed at which the neural network learns is highly dependent on the learning rate chosen for the system.
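To make the two phases concrete, here is a minimal sketch that trains a single sigmoid neuron by gradient descent on a squared error. Everything specific in it is an assumption for illustration: the AND-gate training data, the learning rate of 0.5, the 5000 passes, and the names `train` and `predict`. A real network has many neurons and layers, but the repeat-until-satisfactory loop has the same shape.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def predict(inputs, weights, bias):
    return sigmoid(sum(x * w for x, w in zip(inputs, weights)) + bias)


def train(samples, epochs=5000, learning_rate=0.5, batch=False, seed=0):
    """Train one sigmoid neuron; samples is a list of (inputs, target) pairs."""
    rng = random.Random(seed)
    n_inputs = len(samples[0][0])
    # Training starts from an initial choice of random weights.
    weights = [rng.uniform(-0.5, 0.5) for _ in range(n_inputs)]
    bias = rng.uniform(-0.5, 0.5)
    for _ in range(epochs):
        grad_w = [0.0] * n_inputs
        grad_b = 0.0
        for inputs, target in samples:
            # Phase 1: forward propagation, then the error is sent back.
            output = predict(inputs, weights, bias)
            delta = (output - target) * output * (1.0 - output)
            if batch:
                # Batch training: only accumulate each weight's contribution.
                grad_w = [g + delta * x for g, x in zip(grad_w, inputs)]
                grad_b += delta
            else:
                # Phase 2 (stochastic): update right after this example.
                weights = [w - learning_rate * delta * x
                           for w, x in zip(weights, inputs)]
                bias -= learning_rate * delta
        if batch:
            # Phase 2 (batch): one update per pass over the training data.
            weights = [w - learning_rate * g for w, g in zip(weights, grad_w)]
            bias -= learning_rate * grad_b
    return weights, bias


# Assumed training data: the logical AND of two binary inputs.
AND_GATE = [([0.0, 0.0], 0), ([0.0, 1.0], 0), ([1.0, 0.0], 0), ([1.0, 1.0], 1)]

weights, bias = train(AND_GATE)
for inputs, target in AND_GATE:
    print(inputs, target, round(predict(inputs, weights, bias)))
```

Passing `batch=True` switches from stochastic to batch updates without changing anything else, and lowering `learning_rate` makes the same network take more passes to reach a comparable result.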

Diagram 7: Supervised Learning

## Unsupervised Learning

ANNs are also applicable to unsupervised operation, where the network must make sense of the inputs without outside help (self-organization or adaptation). In this type of learning, the network learns the patterns from the data itself, without a priori knowledge, and basically performs a clustering of the input space. One of the most important features of neural networks is their learning ability, which makes them generally suitable for computational applications whose structures are relatively unknown. In unsupervised learning, neural networks are widely used to learn better representations of the input. Among neural network models, the self-organizing map (SOM) and adaptive resonance theory (ART) are commonly used unsupervised learning algorithms.

The SOM is a topographic organization in which nearby locations in the map represent inputs with similar properties. SOMs are a type of NN in which a set of neurons is connected to form a topological grid (usually rectangular). When a pattern is presented to the SOM, the neuron with the closest weight vector is considered the winner, and its weights, together with those of the neurons in its neighbourhood, are adapted towards the pattern. This way, the SOM naturally finds data clusters.

The ART model allows the number of clusters to vary with problem size and lets the user control the degree of similarity between members of the same cluster by means of a user-defined constant called the vigilance parameter. ART networks are also used for many pattern recognition tasks, such as automatic target recognition and seismic signal processing.
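The SOM's winner-and-neighbourhood step is easier to see in code. The sketch below makes several simplifying assumptions: the map is a one-dimensional line of units rather than a rectangular grid, the neighbourhood is the winner plus the units directly beside it, and the learning rate and radius stay fixed (practical SOMs usually shrink both over time). The function names and data values are made up for the example.

```python
import random


def best_matching_unit(units, pattern):
    """Index of the unit whose weight vector is closest to the pattern."""
    def squared_distance(unit):
        return sum((w - p) ** 2 for w, p in zip(unit, pattern))
    return min(range(len(units)), key=lambda i: squared_distance(units[i]))


def som_update(units, pattern, learning_rate=0.5, radius=1):
    """Present one pattern: move the winner and its neighbours towards it."""
    winner = best_matching_unit(units, pattern)
    first = max(0, winner - radius)
    last = min(len(units) - 1, winner + radius)
    for i in range(first, last + 1):
        units[i] = [w + learning_rate * (p - w)
                    for w, p in zip(units[i], pattern)]
    return winner


def train_som(data, n_units=4, epochs=50, seed=0):
    """Repeatedly present every pattern to a line of randomly initialised units."""
    rng = random.Random(seed)
    dim = len(data[0])
    units = [[rng.random() for _ in range(dim)] for _ in range(n_units)]
    for _ in range(epochs):
        for pattern in data:
            som_update(units, pattern)
    return units


# A single update on hand-picked units: unit 0 wins, unit 1 is its neighbour.
units = [[0.0, 0.0], [1.0, 1.0], [2.0, 2.0], [9.0, 9.0]]
winner = som_update(units, [0.0, 0.0])
print(winner, units)
```

In the single update shown, only the winner and its immediate neighbour move towards the pattern; the units further along the line are left untouched, which is what lets different regions of the map settle on different clusters.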

## Reinforcement Learning

The reinforcement learning class is a bit more interesting: although a teacher is present, it does not provide the expected answer but only indicates whether the computed output is correct or incorrect. The information provided by the “teacher” helps the network in the learning process.

One other introductory area you might also want to learn about is how back propagation happens during learning in Neural Networks. I will talk about this in a separate post, but note that that post will delve a bit more into the maths involved.