Methods

Algorithms and Methods

General Methods

back-propagation (Werbos, 1974)
backward modeling
batch training

  *# batch back-propagation
  *# training by epoch

boosting (Schapire, 2001)
cascade algorithm
constructive training (of neural network)
convex optimization
covariance training
direct error minimization
discriminative training methods
dynamic algorithms
ensemble learning (Krogh and Vedelsby, 1995; Dietterich, 2000)
explorative algorithms
exploitive algorithms
forward modeling
genetic algorithms (Goldberg, 1989)
global optimization techniques
gradient methods
greedy learning
incremental training (of neural network)

  *# incremental back-propagation
  *# online training
  *# training by pattern

kernel methods
learning

  *# learning-by-example

mini-batch training (Wilson and Martinez, 2003)
off-line learning
off-policy learning
on-policy learning
online learning
policy iteration algorithm (Kaelbling et al., 1996)
reinforcement learning

  *# episodic reinforcement learning
  *# model-based reinforcement learning
  *# model-free reinforcement learning
  *# non-episodic reinforcement learning
  *# off-policy learning
  *# on-policy learning
  *# tabular reinforcement learning

supervised training
temporal difference learning algorithms
training
unsupervised training
value iteration algorithm
wake-sleep algorithm

  *# contrastive wake-sleep

weight update algorithm

Common Parameters of Algorithms

patience parameter
momentum
steepness parameter
step size

  *# fixed step-size
  *# dynamic step-size

variational bound

Named Algorithms

Backprop
Bayesian techniques (Neal, 1996)
Cascade 2

  *# Cascade 2 with caching

Cascade Correlation (Prechelt, 1997)
Casper algorithm (Treadgold and Gedeon, 1997)
Cerebellar Model Articulation Controller (CMAC) (Albus, 1975; Glanz et al., 1991; Sutton and Barto, 1998)
Contrastive Divergence Learning
Dyna-Q (Sutton and Barto, 1998)
Explicit Explore or Exploit (Kearns and Singh, 1998)
Gibbs Sampling

  *# Alternating Gibbs Sampling

K-Nearest Neighbor
Learning Vector Quantization (LVQ)
Levenberg-Marquardt (Moré, 1977)
Locality-Sensitive Hashing (LSH)
Maximum Likelihood (ML) Learning
Model-Based Interval Estimation (MBIE) (Strehl and Littman, 2004)
Model-Based Policy Gradient methods (MBPG) (Wang and Dietterich, 2003)
Monte Carlo Algorithms
N-Step Return Algorithm
Neural Fitted Q Iteration (Riedmiller, 2005)

  *# NFQ-SARSA(λ)

Optimal Brain Damage (LeCun et al., 1990)
Orthogonal Least Squares (OLS)
Particle Swarm (Kennedy and Eberhart, 1995)
Prioritized Sweeping (Sutton and Barto, 1998)
Q-Learning

  *# Delayed Q-learning (Strehl et al., 2006)
  *# Generalized Policy Iteration (GPI) (Sutton and Barto, 1998)
  *# Naive Q(λ) (Sutton and Barto, 1998)
  *# One-Step Q-Learning
  *# Peng’s Q(λ) (Peng and Williams, 1994)
  *# Q(λ)
  *# Watkins’s Q(λ) (Watkins, 1989)

Q-SARSA(λ)

  *# One-Step Q-SARSA Algorithm

Quickprop (Fahlman, 1988)
R-Max (Brafman and Tennenholtz, 2002)
RPROP (Riedmiller and Braun, 1993)

  *# iRPROP- (Igel and Hüsken, 2000)

SARSA(λ)

  *# On-Policy SARSA-Learning
  *# SARSA with Linear Function Approximators (Gordon, 2000)

Simulated Annealing (Kirkpatrick et al., 1987)
Sliding Window Cache
SONN (Tenorio and Lee, 1989)
Temporal Difference (TD) Learning

  *# Monte Carlo Prediction
  *# TD(0) Algorithm (Sutton, 1988)
  *# TD(λ) Algorithm
  *# TD(λ) Algorithm With Linear Function Approximators (Tsitsiklis and Van Roy, 1996)
  *# Temporal Difference (TD) Prediction Algorithm

λ-Return Approach