The Simulated Brain: Artificial Neural Networks

Artificial Neural Networks

Artificial Neural Networks (ANNs) are a subset of machine learning and the core of deep learning technology. Their structure is inspired by the human brain, mimicking the way biological neurons excite and send signals to one another.

ANNs are composed of densely connected processing nodes arranged in layers. Data moves through the nodes in only one direction, from input to output. An individual node receives data from several nodes in the layer below it and sends data to nodes in the layer above it. Each connection carries a weight, and each node has a threshold that acts as a gate for passing data along. When the network is active, a node receives a number from each of its connections, multiplies each by the associated weight, and sums the results. If the weighted sum is greater than the node’s threshold, the gate opens and data is passed on to the next layer of the network.
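
To make the arithmetic concrete, here is a rough sketch in Python of what a single node does; the input values, weights, and threshold are made up purely for illustration:

import numpy as np

def node_output(inputs, weights, threshold):
    # Multiply each incoming value by its connection weight and sum the results.
    weighted_sum = np.dot(inputs, weights)
    # The node only "fires" (passes data forward) if the sum clears its threshold.
    return 1 if weighted_sum > threshold else 0

# Example: three values arriving from the previous layer.
print(node_output(np.array([0.5, 0.3, 0.9]), np.array([0.4, -0.2, 0.7]), threshold=0.5))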

Like other machine learning models, neural networks rely on training data to learn and improve their accuracy over time. These algorithms are powerful tools in computer science and technology because they allow us to classify and group data quickly. One of the most notable applications of neural networks is Google’s search engine.

 

Types of Neural Networks

There are several types of neural networks, classified by their structure and intended use. Let’s review the most common types below.

1. Perceptron

The Perceptron is the simplest neural network structure, created by the Cornell University psychologist Frank Rosenblatt in 1957. This model is known as a single-layer neural network, consisting of only two layers: the input layer and output layer. There are no hidden layers.

The Perceptron receives data and calculates a weighted sum of its inputs. This weighted sum is passed through an activation function (originally a simple step function) to produce the output. Due to this simple architecture, carrying out complex tasks with a single Perceptron is impractical and time consuming. Instead, Perceptrons are most useful as the building blocks of Multilayer Perceptrons, which we’ll discuss next.
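
As an illustration, here is a minimal sketch of a Perceptron trained with the classic Perceptron update rule to learn the AND function; the learning rate and number of passes are arbitrary choices:

import numpy as np

# A minimal single-layer Perceptron learning the AND function (illustrative only).
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])
y = np.array([0, 0, 0, 1])

weights = np.zeros(2)
bias = 0.0
learning_rate = 0.1

for _ in range(10):                                          # a few passes over the data
    for xi, target in zip(X, y):
        prediction = int(np.dot(xi, weights) + bias > 0)     # step activation
        error = target - prediction
        weights += learning_rate * error * xi                # Perceptron update rule
        bias += learning_rate * error

print([int(np.dot(xi, weights) + bias > 0) for xi in X])     # expected: [0, 0, 0, 1]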

2. Multilayer Perceptrons

Multilayer Perceptrons (MLPs), or feed-forward neural networks, are the most commonly known neural networks. They are composed of an input layer, hidden layer(s), and an output layer that are fully connected to one another. That is, each neuron in one layer is connected to every neuron in the adjacent layers. As a result, MLPs have higher processing capabilities than the Perceptron. MLPs are commonly used in data compression and encryption.
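
For a sense of what ‘fully connected’ means in practice, here is a rough sketch of a single forward pass through a tiny MLP; the layer sizes and random values are purely illustrative:

import numpy as np

rng = np.random.default_rng(0)

x = rng.random(4)                 # input layer: 4 features
W1 = rng.random((4, 3))           # weights from the input layer to a 3-node hidden layer
W2 = rng.random((3, 2))           # weights from the hidden layer to a 2-node output layer

hidden = np.tanh(x @ W1)          # every input node feeds every hidden node
output = np.tanh(hidden @ W2)     # every hidden node feeds every output node
print(output)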

3. Convolutional Neural Networks

Convolutional neural networks (CNNs) are inspired by the visual cortex of the brain. The convolution layer is the core of a CNN and is what sets it apart from other neural networks. (The term ‘convolution’ simply refers to a type of mathematical operation.) Within the convolution layer lie the input data, a filter (or feature detector), and a feature map. Initially, the filters are randomized and do not generate useful results. Over time, however, the filters are adjusted and fine-tuned to pick out increasingly meaningful features of the image. The output produced by sliding a filter across the input is called the feature map.
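
Here is a rough sketch of the convolution operation itself: a small filter slides across a toy image, and the resulting grid of sums is the feature map (the image and filter values are made up for illustration):

import numpy as np

image = np.arange(25, dtype=float).reshape(5, 5)   # toy 5x5 "image"
kernel = np.array([[1., 0., -1.],                  # a simple vertical-edge filter
                   [1., 0., -1.],
                   [1., 0., -1.]])

h = image.shape[0] - kernel.shape[0] + 1
w = image.shape[1] - kernel.shape[1] + 1
feature_map = np.zeros((h, w))
for i in range(h):
    for j in range(w):
        patch = image[i:i + 3, j:j + 3]
        feature_map[i, j] = np.sum(patch * kernel)  # element-wise product, then sum

print(feature_map)   # the 3x3 feature map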

Though they often require extensive training data, CNNs are applicable to a wide range of image and language tasks. They are used in image and pattern recognition applications, such as facial recognition and the detection of tumors in medical diagnosis.

4. Recurrent Neural Networks

Recurrent neural networks (RNNs) are adapted to data that comes in sequences. Because the data is sequential, each data point depends on the previous data point. For example, time-series data is a type of sequential data. RNNs have a concept of ‘memory’ that stores information from previous inputs to produce the next output of the sequence. The information cycles through a loop in the hidden layer in the middle of the network. RNNs are commonly used in speech recognition and natural language processing.
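
The sketch below illustrates that loop: a hidden state (the ‘memory’) is updated at each step of the sequence from both the current input and the previous hidden state; the dimensions and values are illustrative:

import numpy as np

rng = np.random.default_rng(1)

Wxh = rng.random((3, 4))          # input -> hidden weights
Whh = rng.random((4, 4))          # hidden -> hidden weights (the loop)
hidden = np.zeros(4)              # the memory starts empty

sequence = rng.random((5, 3))     # 5 time steps, 3 features each
for x_t in sequence:
    # The new hidden state mixes the current input with the previous hidden state.
    hidden = np.tanh(x_t @ Wxh + hidden @ Whh)

print(hidden)                     # a summary of the whole sequence seen so far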

5. Long Short-Term Memory Networks

In practice, standard RNNs can only retain information from the most recent steps. For more complicated problems that require longer-range context, we need more retention. This is where Long Short-Term Memory (LSTM) Networks come into play.

LSTM Networks have the same loop-like structure as RNNs, but with a different repeating module structure. This structure allows the network to preserve larger amounts of previous output data. LSTM networks are useful for more involved applications like language translation systems, which may require persistent information retention for context.
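
As a rough sketch of that repeating module, the code below steps through one illustrative LSTM cell: gates decide what to forget from, add to, and read out of a separate cell state, which is what preserves information over longer spans (the dimensions and weights are arbitrary):

import numpy as np

rng = np.random.default_rng(2)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

n_in, n_hidden = 3, 4
W = {gate: rng.random((n_in + n_hidden, n_hidden)) for gate in "fiog"}

def lstm_step(x, h_prev, c_prev):
    z = np.concatenate([x, h_prev])
    f = sigmoid(z @ W["f"])            # forget gate: what to drop from memory
    i = sigmoid(z @ W["i"])            # input gate: what new information to store
    o = sigmoid(z @ W["o"])            # output gate: what to expose
    g = np.tanh(z @ W["g"])            # candidate values to add to memory
    c = f * c_prev + i * g             # updated cell state (long-term memory)
    h = o * np.tanh(c)                 # updated hidden state (short-term output)
    return h, c

h, c = np.zeros(n_hidden), np.zeros(n_hidden)
for x_t in rng.random((5, n_in)):      # run over a short sequence
    h, c = lstm_step(x_t, h, c)
print(h)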

6. Generative Adversarial Networks

Generative Adversarial Networks (GANs) learn to generate new data that resemble the training data. For example, GANs can create images that look like photographs of human faces, even though the faces don’t belong to any real person.

GANs consist of two neural networks, the generator and the discriminator, which play against one another. The generator is trained to fabricate data, and the discriminator is trained to distinguish it from the real training data. Over time, the generator’s output becomes more realistic and harder to tell apart from the training data. GAN applications include creating new colors, editing photos, generating 3D objects, producing synthetic art, and more.
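
To show the adversarial setup, here is a minimal sketch of a GAN training loop, written with PyTorch for brevity; the networks, the target data distribution, and the hyperparameters are all illustrative rather than anything prescribed:

import torch
from torch import nn, optim

# Generator maps noise to a 1D sample; discriminator scores samples as real (1) or fake (0).
generator = nn.Sequential(nn.Linear(8, 16), nn.ReLU(), nn.Linear(16, 1))
discriminator = nn.Sequential(nn.Linear(1, 16), nn.ReLU(), nn.Linear(16, 1), nn.Sigmoid())

loss_fn = nn.BCELoss()
g_opt = optim.Adam(generator.parameters(), lr=1e-3)
d_opt = optim.Adam(discriminator.parameters(), lr=1e-3)

for step in range(1000):
    real = torch.randn(32, 1) * 2.0 + 5.0          # "training data": samples near 5
    noise = torch.randn(32, 8)
    fake = generator(noise)

    # Train the discriminator to label real data 1 and generated data 0.
    d_loss = loss_fn(discriminator(real), torch.ones(32, 1)) + \
             loss_fn(discriminator(fake.detach()), torch.zeros(32, 1))
    d_opt.zero_grad(); d_loss.backward(); d_opt.step()

    # Train the generator to fool the discriminator into outputting 1.
    g_loss = loss_fn(discriminator(fake), torch.ones(32, 1))
    g_opt.zero_grad(); g_loss.backward(); g_opt.step()

print(generator(torch.randn(500, 8)).mean().item())   # should drift toward 5.0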

*  *  *

Neural networks are integral to the growth of AI applications. Artificial Neural Networks can learn and model relationships between inputs and outputs that are nonlinear and complex. They can make inferences, recognize patterns, predict events, and construct new data. As a result, ANNs can improve decision processes across many different fields, including medicine, security, arts and entertainment, agriculture, and marketing.

 

Special thanks to Jenna Malone and Angela Chong for the help on this post.
