Methods
From aHuman Wiki
Revision as of 20:02, 21 June 2015 by Admin (Automated page entry using MWPush.pl)
Algorithms and Methods
General Methods
- back-propagation (Werbos, 1974)
- backward modeling
- batch training
  - batch back-propagation
  - training by epoch
- boosting (Schapire, 2001)
- cascade algorithm
- constructive training (of neural network)
- convex optimization
- covariance training
- direct error minimization
- discriminative training methods
- dynamic algorithms
- ensemble learning (Krogh and Vedelsby, 1995; Dietterich, 2000)
- explorative algorithms
- exploitive algorithms
- forward modeling
- genetic algorithms (Goldberg, 1989)
- global optimization techniques
- gradient methods
- greedy learning
- incremental training (of neural network)
  - incremental back-propagation
  - online training
  - training by pattern
- kernel methods
- learning
- learning-by-example
- mini-batch training (Wilson and Martinez, 2003)
- off-line learning
- off-policy learning
- on-policy learning
- online learning
- policy iteration algorithm (Kaelbling et al., 1996)
- reinforcement learning
  - episodic reinforcement learning
  - model-based reinforcement learning
  - model-free reinforcement learning
  - non-episodic reinforcement learning
  - off-policy learning
  - on-policy learning
  - tabular reinforcement learning
- supervised training
- temporal difference learning algorithms
- training
- unsupervised training
- value iteration algorithm
- wake-sleep algorithm
  - contrastive wake-sleep
- weight update algorithm
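
The list above names batch training (training by epoch) and incremental training (training by pattern) as separate methods. The following minimal Python sketch shows one common reading of that difference on a one-weight linear model with squared error. The function names, the toy data, and the step size of 0.05 are illustrative assumptions and do not come from this page.

```python
def batch_epoch(w, data, step_size=0.05):
    """Batch training (training by epoch): average the gradient over all patterns, then update once."""
    grad = sum(2.0 * (w * x - y) * x for x, y in data) / len(data)
    return w - step_size * grad


def incremental_epoch(w, data, step_size=0.05):
    """Incremental training (training by pattern): update the weight after every single pattern."""
    for x, y in data:
        w -= step_size * 2.0 * (w * x - y) * x
    return w


def train(epoch_fn, data, epochs=50, w0=0.0):
    """Run the chosen per-epoch rule repeatedly, starting from the weight w0."""
    w = w0
    for _ in range(epochs):
        w = epoch_fn(w, data)
    return w


# Hypothetical toy data generated by y = 2 * x
TOY_DATA = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0)]
```

Calling `train(batch_epoch, TOY_DATA)` and `train(incremental_epoch, TOY_DATA)` should both approach the weight 2 on this data. Mini-batch training would sit between the two, updating after each small group of patterns.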
Common Parameters of Algorithms
- patience parameter
- momentum
- steepness parameter
- step size
  - fixed step-size
  - dynamic step-size
- variational bound
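
As a rough illustration of how two of the parameters listed above, step size and momentum, can enter a gradient-based weight update, here is a minimal Python sketch. The function names, the quadratic objective, and the values 0.1 and 0.9 are illustrative assumptions, not settings taken from this page.

```python
def momentum_step(w, velocity, grad, step_size=0.1, momentum=0.9):
    """One weight update: keep a fraction of the previous update and add the new gradient step."""
    velocity = momentum * velocity - step_size * grad
    return w + velocity, velocity


def minimize_quadratic(w0=5.0, steps=200):
    """Minimize f(w) = w**2 (gradient 2*w) using a fixed step size with momentum."""
    w, velocity = w0, 0.0
    for _ in range(steps):
        w, velocity = momentum_step(w, velocity, grad=2.0 * w)
    return w
```

A dynamic step-size scheme would change `step_size` between calls instead of holding it fixed, for example by shrinking it over time.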
Named Algorithms
- Backprop
- Bayesian techniques (Neal, 1996)
- Cascade 2
  - Cascade 2 with caching
- Cascade Correlation (Prechelt, 1997)
- Casper algorithm (Treadgold and Gedeon, 1997)
- Cerebellar Model Articulation Controller (CMAC) (Albus, 1975; Glanz et al., 1991; Sutton and Barto, 1998)
- Contrastive Divergence Learning
- Dyna-Q (Sutton and Barto, 1998)
- Explicit Explore or Exploit (Kearns and Singh, 1998)
- Gibbs Sampling
  - Alternating Gibbs Sampling
- K-Nearest Neighbor
- Learning Vector Quantization (LVQ)
- Levenberg-Marquardt (Moré, 1977)
- Locality-Sensitive Hashing (LSH)
- Maximum Likelihood (ML) Learning
- Model-Based Interval Estimation (MBIE) (Strehl and Littman, 2004)
- Model-Based Policy Gradient methods (MBPG) (Wang and Dietterich, 2003)
- Monte Carlo Algorithms
- N-Step Return Algorithm
- Neural Fitted Q Iteration (Riedmiller, 2005)
  - NFQ-SARSA(L)
- Optimal Brain Damage (LeCun et al., 1990)
- Orthogonal Least Squares (OLS)
- Particle Swarm (Kennedy and Eberhart, 1995)
- Prioritized Sweeping (Sutton and Barto, 1998)
- Q-Learning
  - Delayed Q-learning (Strehl et al., 2006)
  - Generalized Policy Iteration (GPI) (Sutton and Barto, 1998)
  - Naive Q(λ) (Sutton and Barto, 1998)
  - One-Step Q-Learning
  - Peng’s Q(λ) (Peng and Williams, 1994)
  - Q(λ)
  - Watkins’s Q(λ) (Watkins, 1989)
- Q-SARSA(λ)
  - One-Step Q-SARSA Algorithm
- Quickprop (Fahlman, 1988)
- R-Max (Brafman and Tennenholtz, 2002)
- RPROP (Riedmiller and Braun, 1993)
  - iRPROP- (Igel and Hüsken, 2000)
- SARSA(λ)
  - On-Policy SARSA-Learning
  - SARSA with Linear Function Approximators (Gordon, 2000)
- Simulated Annealing (Kirkpatrick et al., 1987)
- Sliding Window Cache
- SONN (Tenorio and Lee, 1989)
- Temporal Difference (TD) Learning
  - Monte Carlo Prediction
  - TD(0) Algorithm (Sutton, 1988)
  - TD(λ) Algorithm
  - TD(λ) Algorithm With Linear Function Approximators (Tsitsiklis and Van Roy, 1996)
  - Temporal Difference (TD) Prediction Algorithm
  - λ-Return Approach
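
To make one entry from the list above concrete, the following Python sketch shows the standard tabular form of One-Step Q-Learning with an epsilon-greedy behaviour policy. The chain environment, the function names, and the values of `alpha`, `gamma`, and `epsilon` are hypothetical choices made for illustration and are not taken from this page.

```python
import random


def q_learning_update(q, state, action, reward, next_state, alpha=0.5, gamma=0.9):
    """One-step Q-learning: move Q(s, a) toward r + gamma * max over a' of Q(s', a')."""
    target = reward + gamma * max(q[next_state])
    q[state][action] += alpha * (target - q[state][action])


def run_chain(episodes=200, n_states=4, epsilon=0.2, seed=0):
    """Hypothetical toy task: states 0..n_states-1 in a row, action 0 = left, 1 = right.

    Entering the last state gives reward 1 and ends the episode.
    """
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(n_states)]
    for _ in range(episodes):
        state = 0
        while state != n_states - 1:
            # Epsilon-greedy behaviour policy; ties are broken in favour of "right".
            if rng.random() < epsilon:
                action = rng.randrange(2)
            else:
                action = 0 if q[state][0] > q[state][1] else 1
            next_state = max(state - 1, 0) if action == 0 else state + 1
            reward = 1.0 if next_state == n_states - 1 else 0.0
            q_learning_update(q, state, action, reward, next_state)
            state = next_state
    return q
```

After training, `run_chain()` returns a table in which the "right" action has the higher value in every non-terminal state. The update is off-policy in the sense used on this page: the target uses the greedy maximum over next actions, regardless of which action the epsilon-greedy policy takes next.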