Methods
From aHuman Wiki
Revision as of 19:07, 28 November 2018 by Admin
Algorithms and Methods
General Methods
- back-propagation (Werbos, 1974)
- backward modeling
- batch training
*# batch back-propagation
*# training by epoch
- boosting (Schapire, 2001)
- cascade algorithm
- constructive training (of neural network)
- convex optimization
- covariance training
- direct error minimization
- discriminative training methods
- dynamic algorithms
- ensemble learning (Krogh and Vedelsby, 1995; Dietterich, 2000)
- explorative algorithms
- exploitive algorithms
- forward modeling
- genetic algorithms (Goldberg, 1989)
- global optimization techniques
- gradient methods
- greedy learning
- incremental training (of neural network)
*# incremental back-propagation
*# online training
*# training by pattern
- kernel methods
- learning
*# learning-by-example
- mini-batch training (Wilson and Martinez, 2003)
- off-line learning
- off-policy learning
- on-policy learning
- online learning
- policy iteration algorithm (Kaelbling et al., 1996)
- reinforcement learning
*# episodic reinforcement learning
*# model-based reinforcement learning
*# model-free reinforcement learning
*# non-episodic reinforcement learning
*# off-policy learning
*# on-policy learning
*# tabular reinforcement learning
- supervised training
- temporal difference learning algorithms
- training
- unsupervised training
- value iteration algorithm
- wake-sleep algorithm
*# contrastive wake-sleep
- weight update algorithm
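Several of the entries above (batch training, mini-batch training, incremental training) differ only in how many patterns contribute to each weight update. The following is a minimal sketch of that distinction, assuming a one-weight linear model `y = w * x` trained by gradient descent on squared error; the function name `train`, its parameters, and the toy data are hypothetical and chosen only for illustration.

```python
def train(samples, batch_size, step_size=0.05, epochs=200):
    """Fit y = w * x by gradient descent on squared error.

    batch_size == len(samples) -> batch training (one update per epoch)
    batch_size == 1            -> incremental / online training (one update per pattern)
    anything in between        -> mini-batch training
    """
    w = 0.0
    for _ in range(epochs):
        for start in range(0, len(samples), batch_size):
            batch = samples[start:start + batch_size]
            # Average gradient of 0.5 * (w * x - y)^2 over the patterns in this batch.
            grad = sum((w * x - y) * x for x, y in batch) / len(batch)
            w -= step_size * grad
    return w


# Toy data generated by y = 2 * x (an assumption made for this example).
samples = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0), (4.0, 8.0)]
w_batch = train(samples, batch_size=len(samples))
w_mini = train(samples, batch_size=2)
w_online = train(samples, batch_size=1)
```

On this noise-free data all three variants approach the same weight; on real data they typically trade update cost against gradient noise.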
Common Parameters of Algorithms
- fixed step-size
- dynamic step-size
- patience parameter
- momentum
- steepness parameter
- step size
*# fixed step-size
*# dynamic step-size
- variational bound
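As an illustration of how the step size and momentum parameters listed above enter a weight update, here is a minimal sketch of gradient descent with momentum on a one-dimensional quadratic. The function name `minimize`, the `decay` schedule used for the dynamic step size, and the test function are assumptions made for this example and are not taken from any particular library.

```python
def minimize(grad, x0, step_size=0.1, momentum=0.9, decay=0.0, steps=500):
    """Gradient descent with momentum.

    decay == 0.0 gives a fixed step size; decay > 0.0 gives a simple
    dynamic step size that shrinks as step_size / (1 + decay * t).
    """
    x, velocity = x0, 0.0
    for t in range(steps):
        eta = step_size / (1.0 + decay * t)
        # Momentum keeps a fraction of the previous update and adds the new gradient step.
        velocity = momentum * velocity - eta * grad(x)
        x += velocity
    return x


def quadratic_grad(x):
    # Gradient of f(x) = (x - 3)^2, which has its minimum at x = 3.
    return 2.0 * (x - 3.0)


x_fixed = minimize(quadratic_grad, x0=0.0)
x_dynamic = minimize(quadratic_grad, x0=0.0, decay=0.01)
```

Setting `momentum=0.0` reduces the loop to plain gradient descent, which makes it easy to compare the two behaviours on the same function.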
Named Algorithms
- Backprop
- Bayesian techniques (Neal, 1996)
- Cascade 2
*# Cascade 2 with caching
- Cascade Correlation (Prechelt, 1997)
- Casper algorithm (Treadgold and Gedeon, 1997)
- Cerebellar Model Articulation Controller (CMAC) (Albus, 1975; Glanz et al., 1991; Sutton and Barto, 1998)
- Contrastive Divergence Learning
- Dyna-Q (Sutton and Barto, 1998)
- Explicit Explore or Exploit (Kearns and Singh, 1998)
- Gibbs Sampling
*# Alternating Gibbs Sampling
- K-Nearest Neighbor
- Learning Vector Quantization (LVQ)
- Levenberg-Marquardt (Moré, 1977)
- Locality-Sensitive Hashing (LSH)
- Maximum Likelihood (ML) Learning
- Model-Based Interval Estimation (MBIE) (Strehl and Littman, 2004)
- Model-Based Policy Gradient methods (MBPG) (Wang and Dietterich, 2003)
- Monte Carlo Algorithms
- N-Step Return Algorithm
- Neural Fitted Q Iteration (Riedmiller, 2005)
*# NFQ-SARSA(L)
- Optimal Brain Damage (LeCun et al., 1990)
- Orthogonal Least Squares (OLS)
- Particle Swarm (Kennedy and Eberhart, 1995)
- Prioritized Sweeping (Sutton and Barto, 1998)
- Q-Learning
*# Delayed Q-learning (Strehl et al., 2006)
*# Generalized Policy Iteration (GPI) (Sutton and Barto, 1998)
*# Naive Q(λ) (Sutton and Barto, 1998)
*# One-Step Q-Learning
*# Peng’s Q(λ) (Peng and Williams, 1994)
*# Q(λ)
*# Watkins’s Q(λ) (Watkins, 1989)
- Q-SARSA(λ)
*# One-Step Q-SARSA Algorithm
- Quickprop (Fahlman, 1988)
- R-Max (Brafman and Tennenholtz, 2002)
- RPROP (Riedmiller and Braun, 1993)
*# iRPROP- (Igel and Hüsken, 2000)
- SARSA(λ)
*# On-Policy SARSA-Learning
*# SARSA with Linear Function Approximators (Gordon, 2000)
- Simulated Annealing (Kirkpatrick et al., 1987)
- Sliding Window Cache
- SONN (Tenorio and Lee, 1989)
- Temporal Difference (TD) Learning
*# Monte Carlo Prediction
*# TD(0) Algorithm (Sutton, 1988)
*# TD(λ) Algorithm
*# TD(λ) Algorithm With Linear Function Approximators (Tsitsiklis and Van Roy, 1996)
*# Temporal Difference (TD) Prediction Algorithm
- λ-Return Approach
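To make one of the named algorithms above concrete, the following is a minimal sketch of tabular one-step Q-learning. The environment is a hypothetical five-state chain invented for this example: the agent starts in state 0, moves left or right, and receives a reward of 1 on reaching the last state. Because Q-learning is off-policy, the sketch assumes a uniformly random behaviour policy; the function name `q_learning` and all parameter values are illustrative.

```python
import random


def q_learning(n_states=5, episodes=500, alpha=0.5, gamma=0.9, seed=0):
    """Tabular one-step Q-learning on a deterministic chain (actions: 0 = left, 1 = right)."""
    rng = random.Random(seed)
    goal = n_states - 1
    q = [[0.0, 0.0] for _ in range(n_states)]  # q[state][action]
    for _ in range(episodes):
        s = 0
        while s != goal:
            a = rng.randrange(2)  # behaviour policy: uniformly random
            s_next = max(s - 1, 0) if a == 0 else s + 1
            reward = 1.0 if s_next == goal else 0.0
            # The target bootstraps from the greedy value of the next state.
            # The goal state is never updated, so its value stays 0.
            target = reward + gamma * max(q[s_next])
            q[s][a] += alpha * (target - q[s][a])
            s = s_next
    return q


q = q_learning()
# Greedy action in each non-terminal state after learning.
greedy = [max((0, 1), key=lambda a: q[s][a]) for s in range(4)]
```

Replacing `max(q[s_next])` with the value of the action actually taken next would turn the update into the on-policy SARSA form listed above.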