A recurrent neural network (RNN) is a class of neural network in which the connections between units form a directed cycle. The cycle gives the network an internal state, which allows it to exhibit dynamic temporal behavior.
Recurrent neural networks must be approached differently from feedforward neural networks, both when analyzing their behavior and when training them. They can behave chaotically, and dynamical systems theory is usually used to model and analyze them. While a feedforward network propagates data in one direction only, from input to output, a recurrent network also propagates data from later processing stages back to earlier stages.
Architectures
Some of the most common recurrent neural network architectures are described here. The Elman and Jordan networks are also known as "simple recurrent networks" (SRN).
Elman network
This variation on the multilayer perceptron was invented by Jeff Elman. It is a three-layer network with an additional set of "context units" in the input layer. The middle (hidden) layer is connected to these context units by connections with a fixed weight of one[1]. At each time step, the input is propagated in standard feed-forward fashion, and then a learning rule is applied. Because the fixed back connections are propagated before the learning rule is applied, the context units always hold a copy of the previous values of the hidden units. The network can thus maintain a form of state, which lets it perform tasks such as sequence prediction that are beyond the power of a standard multilayer perceptron.
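The forward pass described above can be sketched as follows. This is a minimal illustration only: the layer sizes, the tanh activation, the random weights and the names `elman_step` and `run` are assumptions made for the example, and no learning rule is applied.

```python
import math
import random

def elman_step(x, context, w_in, w_ctx, w_out):
    """One forward step of a small Elman network (no learning rule applied)."""
    # Hidden units see the current input plus the context units.
    hidden = [
        math.tanh(sum(w * xi for w, xi in zip(w_in[j], x))
                  + sum(w * ci for w, ci in zip(w_ctx[j], context)))
        for j in range(len(w_in))
    ]
    output = [sum(w * hj for w, hj in zip(row, hidden)) for row in w_out]
    # Fixed weight-one back connections: the context becomes a copy of hidden.
    new_context = list(hidden)
    return output, new_context

def rand_matrix(rows, cols, rng):
    return [[rng.uniform(-1.0, 1.0) for _ in range(cols)] for _ in range(rows)]

# Hypothetical sizes and random weights, chosen only for illustration.
rng = random.Random(0)
n_in, n_hidden, n_out = 2, 3, 1
w_in = rand_matrix(n_hidden, n_in, rng)
w_ctx = rand_matrix(n_hidden, n_hidden, rng)
w_out = rand_matrix(n_out, n_hidden, rng)

def run(sequence):
    """Feed a sequence through the network, starting from an empty context."""
    context = [0.0] * n_hidden
    outputs = []
    for x in sequence:
        y, context = elman_step(x, context, w_in, w_ctx, w_out)
        outputs.append(y)
    return outputs

# Same final input, different history: the context units carry the difference.
out_a = run([[1.0, 0.0], [0.5, 0.5]])
out_b = run([[0.0, 1.0], [0.5, 0.5]])
```

The two runs end on the same input vector but start from different first inputs, so their final outputs differ only because of what the context units remembered.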
Jordan network
This network architecture is similar to the Elman network, except that the context units are fed from the output layer instead of the hidden layer.
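A minimal sketch of one step of such a network is given below. The weights, the tanh hidden activation, the linear output and the name `jordan_step` are assumptions made for illustration; the only point being shown is that the context is copied from the output layer.

```python
import math

def jordan_step(x, context, w_in, w_ctx, w_out):
    """One forward step; the context units are fed from the output layer."""
    hidden = [
        math.tanh(sum(w * xi for w, xi in zip(w_in[j], x))
                  + sum(w * ci for w, ci in zip(w_ctx[j], context)))
        for j in range(len(w_in))
    ]
    output = [sum(w * hj for w, hj in zip(row, hidden)) for row in w_out]
    # The new context is a copy of the output, not of the hidden layer.
    return output, list(output)

# Hypothetical fixed weights: 2 inputs, 2 hidden units, 1 output (1 context unit).
w_in = [[0.5, -0.3], [0.8, 0.2]]
w_ctx = [[0.7], [-0.4]]
w_out = [[1.0, -1.0]]

def run(sequence):
    context = [0.0]
    outputs = []
    for x in sequence:
        y, context = jordan_step(x, context, w_in, w_ctx, w_out)
        outputs.append(y)
    return outputs, context

outputs_a, final_context_a = run([[1.0, 0.0], [0.5, 0.5]])
outputs_b, final_context_b = run([[0.0, 1.0], [0.5, 0.5]])
```

Note that the number of context units here equals the number of output units, whereas in the Elman network it equals the number of hidden units.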
MMC network
See [2]
Hopfield network
The Hopfield network is a recurrent neural network in which all connections are symmetric. It was invented by John Hopfield in 1982, and its dynamics are guaranteed to converge to a stable state. If the connections are trained using Hebbian learning, the Hopfield network can serve as a robust content-addressable memory that is resistant to connection alteration.
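As an illustration of content-addressable recall, the sketch below stores two patterns with a Hebbian rule and recovers one of them from a corrupted copy. The bipolar (+1/-1) coding, the asynchronous update order, the example patterns and the function names are assumptions made for this example.

```python
def train_hebbian(patterns):
    """Build a symmetric weight matrix with zero diagonal from +1/-1 patterns."""
    n = len(patterns[0])
    w = [[0.0] * n for _ in range(n)]
    for p in patterns:
        for i in range(n):
            for j in range(n):
                if i != j:
                    w[i][j] += p[i] * p[j] / n
    return w

def recall(w, state, max_sweeps=10):
    """Update units one at a time until a full sweep changes nothing."""
    state = list(state)
    for _ in range(max_sweeps):
        changed = False
        for i in range(len(state)):
            field = sum(w[i][j] * state[j] for j in range(len(state)))
            new_value = 1 if field >= 0 else -1
            if new_value != state[i]:
                state[i] = new_value
                changed = True
        if not changed:
            break
    return state

# Two orthogonal example patterns (illustrative data).
stored = [
    [1, -1, 1, -1, 1, -1, 1, -1],
    [1, 1, 1, 1, -1, -1, -1, -1],
]
weights = train_hebbian(stored)

# Corrupt the first unit of the first pattern, then let the network settle.
noisy = list(stored[0])
noisy[0] = -noisy[0]
recovered = recall(weights, noisy)
```

With these two patterns the network settles back onto the first stored pattern, which is the sense in which the memory is addressed by content rather than by location.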
Echo state network
The echo state network (ESN) is a recurrent neural network with a sparsely connected random hidden layer. Only the weights of the output neurons can change and be trained. ESNs are good at (re)producing temporal patterns.
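The idea can be sketched in a few lines: the random hidden layer is generated once and left untouched, and only a linear readout is fitted. Everything specific here is an assumption made for illustration: the reservoir size, the 20% connection density, the row-sum scaling used to keep the state update contractive, the ridge-regression readout and the sine-wave prediction task.

```python
import math
import random

rng = random.Random(1)
N = 20  # reservoir size (arbitrary choice for this sketch)

# Sparse random hidden layer: about 20% of the connections are nonzero.
W = [[rng.uniform(-1, 1) if rng.random() < 0.2 else 0.0 for _ in range(N)]
     for _ in range(N)]
# Scale so every absolute row sum is at most 0.9. With tanh units this makes
# the state update a contraction, a simple sufficient condition for fading memory.
max_row = max(sum(abs(v) for v in row) for row in W) or 1.0
W = [[0.9 * v / max_row for v in row] for row in W]
w_in = [rng.uniform(-0.5, 0.5) for _ in range(N)]

def run_reservoir(inputs):
    """Drive the fixed reservoir with a scalar input sequence."""
    state = [0.0] * N
    states = []
    for u in inputs:
        state = [math.tanh(w_in[i] * u + sum(W[i][j] * state[j] for j in range(N)))
                 for i in range(N)]
        states.append(state)
    return states

def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x

def train_readout(states, targets, ridge=1e-4):
    """Ridge regression for the output weights: the only trained parameters."""
    a = [[sum(s[i] * s[j] for s in states) + (ridge if i == j else 0.0)
          for j in range(N)] for i in range(N)]
    b = [sum(s[i] * y for s, y in zip(states, targets)) for i in range(N)]
    return solve(a, b)

# Task: predict the next sample of a sine wave.
signal = [math.sin(0.2 * t) for t in range(301)]
inputs, targets = signal[:-1], signal[1:]
states = run_reservoir(inputs)
w_out = train_readout(states[50:], targets[50:])  # discard the initial transient
preds = [sum(w * s for w, s in zip(w_out, st)) for st in states[50:]]
mse = sum((p - y) ** 2 for p, y in zip(preds, targets[50:])) / len(preds)
baseline = sum(y * y for y in targets[50:]) / len(preds)  # error of predicting 0
```

Because `W` and `w_in` are never updated, training reduces to a single linear least-squares problem, which is the main practical attraction of this architecture.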
Long short term memory network
The long short term memory (LSTM) network is an artificial neural network structure that, unlike traditional RNNs, does not suffer from the vanishing gradient problem. It can therefore learn across long delays and can handle signals that mix low- and high-frequency components.
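A rough sketch of the reasoning follows. The notation is assumed for illustration: a simple RNN with hidden state h_t, recurrent weights W and input weights U, and an LSTM memory cell c_t in its original form without a forget gate, with input gate i_t and candidate input g_t. In the simple RNN the error signal is multiplied by one Jacobian per time step, so it shrinks geometrically whenever the norm of W is below one; in the LSTM cell the state is updated additively, so the direct path from c_{t-1} to c_t passes the error on unchanged.

```latex
% Simple RNN (assumed form): h_t = \tanh(W h_{t-1} + U x_t)
J_t = \frac{\partial h_t}{\partial h_{t-1}}
    = \operatorname{diag}\!\left(1 - h_t^{2}\right) W,
\qquad
\frac{\partial h_T}{\partial h_k} = J_T \, J_{T-1} \cdots J_{k+1},
\qquad
\left\lVert \frac{\partial h_T}{\partial h_k} \right\rVert
  \le \lVert W \rVert^{\,T-k}

% LSTM memory cell, original form without a forget gate (assumed notation):
c_t = c_{t-1} + i_t \odot g_t
\quad\Longrightarrow\quad
\frac{\partial c_t}{\partial c_{t-1}} = I
\ \text{ along the direct cell-to-cell path}
```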
RNN with parametric bias
In this setup the recurrent neural network with parametric bias (RNNPB) is trained to reproduce a sequence while receiving a constant bias input. The network can learn different sequences, each associated with a different parametric bias. With a trained network it is also possible to find the associated parameter for an observed sequence: the sequence is backpropagated through the network to recover the bias that would produce it.
Continuous-time RNN
(CTRNN)
Hierarchical RNN
Here RNNs are sparsely connected together through bottlenecks, with the idea of isolating different hierarchical functions in different parts of the composite network. [3] [4] [5]
Recurrent Multilayer Perceptron
(RMLP)[6]
Pollack’s Sequential Cascaded Networks
Training
Training in recurrent neural networks is generally very slow.
Backpropagation through time (BPTT)
In this approach the simple recurrent network is unfolded in time for a number of iterations and then trained through backpropagation, as if it were a feedforward network.
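The unfolding can be illustrated on the smallest possible case. The sketch below assumes a single recurrent unit with h_t = tanh(w·h_{t-1} + u·x_t), h_0 = 0, and a squared error on the final state only; the function names and the example values are hypothetical.

```python
import math

def forward(w, u, xs):
    """Unfold the unit over the whole input sequence; hs[0] is the initial state."""
    hs = [0.0]
    for x in xs:
        hs.append(math.tanh(w * hs[-1] + u * x))
    return hs

def loss(w, u, xs, target):
    return 0.5 * (forward(w, u, xs)[-1] - target) ** 2

def bptt_gradients(w, u, xs, target):
    """Backpropagate through the unfolded copies, from the last step to the first."""
    hs = forward(w, u, xs)
    grad_w = grad_u = 0.0
    delta = hs[-1] - target  # dL/dh_T
    for t in range(len(xs), 0, -1):
        dpre = delta * (1.0 - hs[t] ** 2)  # back through tanh at step t
        grad_w += dpre * hs[t - 1]         # shared weight: sum over all copies
        grad_u += dpre * xs[t - 1]
        delta = dpre * w                   # error passed back to h_{t-1}
    return grad_w, grad_u

# Illustrative values.
w, u = 0.8, 0.5
xs = [1.0, -0.5, 0.25, 0.75]
target = 0.3
grad_w, grad_u = bptt_gradients(w, u, xs, target)
```

Each unfolded copy shares the same two weights, which is why the per-step contributions are summed rather than stored separately. The whole sequence must be kept in memory before the backward pass can start.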
Real-time recurrent learning (RTRL)
Unlike BPTT, this algorithm is local in time but not local in space[7].
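A minimal sketch of the idea, assuming a single recurrent unit with h_t = tanh(w·h_{t-1} + u·x_t) and a squared error at every time step (names and values are hypothetical): the sensitivities of the state with respect to each weight are carried forward alongside the state, so the gradient is available at each step without storing the past.

```python
import math

def rtrl_gradients(w, u, xs, ys):
    """Accumulate the gradient online, using only quantities from the current step."""
    h = 0.0
    p_w = p_u = 0.0  # sensitivities dh/dw and dh/du, carried forward in time
    grad_w = grad_u = 0.0
    for x, y in zip(xs, ys):
        h_new = math.tanh(w * h + u * x)
        d = 1.0 - h_new ** 2
        p_w = d * (h + w * p_w)  # direct effect of w plus its effect via h_{t-1}
        p_u = d * (x + w * p_u)
        h = h_new
        err = h - y
        grad_w += err * p_w
        grad_u += err * p_u
    return grad_w, grad_u

def total_loss(w, u, xs, ys):
    h, total = 0.0, 0.0
    for x, y in zip(xs, ys):
        h = math.tanh(w * h + u * x)
        total += 0.5 * (h - y) ** 2
    return total

# Illustrative values.
w, u = 0.6, -0.4
xs = [0.5, 1.0, -1.0, 0.2, 0.7]
ys = [0.1, -0.2, 0.3, 0.0, -0.1]
grad_w, grad_u = rtrl_gradients(w, u, xs, ys)
```

With one unit there are only two sensitivities to track. In a network of n fully connected units, every unit needs a sensitivity for every weight, so the update at one unit depends on quantities from the whole network, which is the sense in which the method is not local in space.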
Genetic algorithms
Since RNN learning is very slow, genetic algorithms are a feasible alternative for weight optimization, especially in unstructured networks[8].
Genetic algorithms suit neural network training because the goal of training is to find an optimal set of weights, so training can be treated as an optimization problem. The network weights are first encoded in the genetic algorithm in a predefined manner: one gene in the chromosome represents one weight link, so the whole network is represented as a single chromosome. The population consists of many chromosomes, so many different neural networks are evolved until a stopping criterion is satisfied. Common stopping criteria are: 1) the neural network has learnt a certain percentage of the training data, 2) the mean-squared-error has fallen to a target minimum value, or 3) the maximum number of training generations has been reached. The stopping criterion is evaluated through the fitness function, which takes the reciprocal of the mean-squared-error of each neural network during training. The goal of the genetic algorithm is therefore to maximize the fitness function and hence reduce the mean-squared-error. The fitness function is evaluated as follows: each weight encoded in the chromosome is assigned to the corresponding weight link of the network; the training set of examples is then presented to the network, which propagates the input signals forward; and the resulting mean-squared-error is returned to the fitness function, which influences the genetic selection process. An overview of the entire process is shown in the algorithm below, which assumes two parents for each child chromosome.
BEGIN Algorithm Genetic algorithm for neural network training
    Initialise Population(P) of individuals (weights) representing N neural networks
    while !termination
        while i < P.size()
            1) Evaluate fitness of each neural network
            2) Selection of Parents
            3) Crossover
            4) Mutation
            5) Increment i
        end while
        Update population
    end while
    Get the best individual and copy into Neural network
    Load data for testing Neural network
END Algorithm
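The algorithm above might be sketched in runnable form as follows. Several details are assumptions made for the example and are not specified by the pseudocode: the network is a single recurrent unit with four weights, parents are chosen by tournament selection, crossover is single-point, mutation adds Gaussian noise, and the best individual is copied unchanged into the next generation (elitism). The training task, reproducing the previous input, is also illustrative.

```python
import math
import random

def network_mse(chrom, xs, targets):
    """Load the genes into a one-unit recurrent network and return its MSE."""
    w_rec, w_in, bias, w_out = chrom  # one gene per weight link
    h, err = 0.0, 0.0
    for x, y in zip(xs, targets):
        h = math.tanh(w_rec * h + w_in * x + bias)
        err += (w_out * h - y) ** 2
    return err / len(xs)

def fitness(chrom, xs, targets):
    # Reciprocal of the mean-squared-error; the small constant avoids division by zero.
    return 1.0 / (network_mse(chrom, xs, targets) + 1e-12)

def tournament(pop, fits, rng, k=3):
    picks = rng.sample(range(len(pop)), k)
    return pop[max(picks, key=lambda i: fits[i])]

def evolve(xs, targets, pop_size=30, generations=40, target_mse=1e-4, seed=0):
    rng = random.Random(seed)
    pop = [[rng.uniform(-1, 1) for _ in range(4)] for _ in range(pop_size)]
    history = []  # best fitness seen in each generation
    for _ in range(generations):              # stop: maximum generations reached
        fits = [fitness(c, xs, targets) for c in pop]
        best = pop[max(range(pop_size), key=lambda i: fits[i])]
        history.append(max(fits))
        if 1.0 / max(fits) <= target_mse:     # stop: error target reached
            break
        children = [best[:]]                  # elitism (an assumption of this sketch)
        while len(children) < pop_size:
            a = tournament(pop, fits, rng)    # selection of two parents
            b = tournament(pop, fits, rng)
            cut = rng.randrange(1, 4)         # single-point crossover
            child = a[:cut] + b[cut:]
            child = [g + rng.gauss(0, 0.1) if rng.random() < 0.2 else g
                     for g in child]          # mutation
            children.append(child)
        pop = children                        # update population
    fits = [fitness(c, xs, targets) for c in pop]
    best = pop[max(range(pop_size), key=lambda i: fits[i])]
    return best, history

# Illustrative task: output the previous input of a sine sequence.
xs = [math.sin(0.5 * t) for t in range(30)]
targets = [0.0] + xs[:-1]
best, history = evolve(xs, targets)
```

Because the best chromosome survives each generation unchanged, the best fitness recorded in `history` can never decrease. Without elitism the pseudocode gives no such guarantee.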
Simulated Annealing
Simulated annealing is a global optimization technique that is often used to seek a good set of weights.
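A minimal sketch of weight search by simulated annealing is shown below. The one-unit recurrent network, the geometric cooling schedule, the Gaussian proposal step and all parameter values are assumptions made for illustration.

```python
import math
import random

def sequence_mse(weights, xs, targets):
    """Error of a one-unit recurrent network with weights (w_rec, w_in, bias, w_out)."""
    w_rec, w_in, bias, w_out = weights
    h, err = 0.0, 0.0
    for x, y in zip(xs, targets):
        h = math.tanh(w_rec * h + w_in * x + bias)
        err += (w_out * h - y) ** 2
    return err / len(xs)

def anneal(error_fn, start, steps=2000, t_start=1.0, t_end=1e-3,
           step_size=0.2, seed=0):
    rng = random.Random(seed)
    current, current_err = list(start), error_fn(start)
    best, best_err = list(current), current_err
    for k in range(steps):
        # Geometric cooling from t_start down to t_end.
        temp = t_start * (t_end / t_start) ** (k / (steps - 1))
        cand = [w + rng.gauss(0, step_size) for w in current]
        cand_err = error_fn(cand)
        # Metropolis rule: always accept improvements, and accept a worse
        # candidate with a probability that shrinks as the temperature falls.
        if (cand_err <= current_err
                or rng.random() < math.exp((current_err - cand_err) / temp)):
            current, current_err = cand, cand_err
            if current_err < best_err:
                best, best_err = list(current), current_err
    return best, best_err

# Illustrative task: output the previous input of a sine sequence.
xs = [math.sin(0.5 * t) for t in range(30)]
targets = [0.0] + xs[:-1]
start = [0.0, 0.0, 0.0, 0.0]
start_err = sequence_mse(start, xs, targets)
best, best_err = anneal(lambda w: sequence_mse(w, xs, targets), start)
```

The search keeps track of the best weights seen so far, so the returned error is never worse than that of the starting point even though individual moves may go uphill.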
Particle Swarm Optimization
Particle swarm optimization is a global optimization technique that is often used to seek a good set of weights.
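The sketch below shows one common form of the particle swarm update applied to the weights of a one-unit recurrent network. The inertia and acceleration coefficients, the swarm size, the network and the training task are assumptions chosen for illustration.

```python
import math
import random

def sequence_mse(weights, xs, targets):
    """Error of a one-unit recurrent network with weights (w_rec, w_in, bias, w_out)."""
    w_rec, w_in, bias, w_out = weights
    h, err = 0.0, 0.0
    for x, y in zip(xs, targets):
        h = math.tanh(w_rec * h + w_in * x + bias)
        diff = w_out * h - y
        err += diff * diff
    return err / len(xs)

def pso(error_fn, dim, n_particles=20, iters=100,
        inertia=0.7, c1=1.5, c2=1.5, seed=0):
    rng = random.Random(seed)
    pos = [[rng.uniform(-1, 1) for _ in range(dim)] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    pbest = [p[:] for p in pos]                  # each particle's best position
    pbest_err = [error_fn(p) for p in pos]
    g = min(range(n_particles), key=lambda i: pbest_err[i])
    gbest, gbest_err = pbest[g][:], pbest_err[g]  # best position of the swarm
    initial_err = gbest_err
    for _ in range(iters):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                # Velocity: momentum + pull to own best + pull to swarm best.
                vel[i][d] = (inertia * vel[i][d]
                             + c1 * r1 * (pbest[i][d] - pos[i][d])
                             + c2 * r2 * (gbest[d] - pos[i][d]))
                pos[i][d] += vel[i][d]
            err = error_fn(pos[i])
            if err < pbest_err[i]:
                pbest[i], pbest_err[i] = pos[i][:], err
                if err < gbest_err:
                    gbest, gbest_err = pos[i][:], err
    return gbest, gbest_err, initial_err

# Illustrative task: output the previous input of a sine sequence.
xs = [math.sin(0.5 * t) for t in range(30)]
targets = [0.0] + xs[:-1]
gbest, gbest_err, initial_err = pso(lambda w: sequence_mse(w, xs, targets), dim=4)
```

Each particle is one candidate weight vector, so the swarm plays the same role as the population of chromosomes in a genetic algorithm, with velocity updates in place of crossover and mutation.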
References
- [1] Neural Networks as Cybernetic Systems, 2nd and revised edition. Holk Cruse.
- [2] http://www.tech.plym.ac.uk/socce/ncpw9/Kuehn.pdf
- [3] Dynamic Representation of Movement Primitives in an Evolved Recurrent Neural Network.
- [4] doi:10.1016/j.neunet.2004.08.005
- [5] http://adb.sagepub.com/cgi/reprint/13/3/211.pdf
- [6] http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.45.3527
- [7] Neural and Adaptive Systems: Fundamentals through Simulation. J.C. Principe, N.R. Euliano, W.C. Lefebvre.
- [8] Applying Genetic Algorithms to Recurrent Neural Networks for Learning Network Parameters and Architecture. O. Syed, Y. Takefuji.
- Mandic, D. & Chambers, J. (2001). Recurrent Neural Networks for Prediction: Learning Algorithms, Architectures and Stability. Wiley.
- Elman, J.L. (1990). "Finding Structure in Time". Cognitive Science 14: 179–211.
This entry is from Wikipedia, the leading user-contributed encyclopedia. It may not have been reviewed by professional editors.




