Algorithms and Methods
General Methods
- back-propagation (Werbos, 1974)
- backward modeling
- batch training (see the sketch after this list)
  1. batch back-propagation
  2. training by epoch
- boosting (Schapire, 2001)
- cascade algorithm
- constructive training (of neural network)
- convex optimization
- covariance training
- direct error minimization
- discriminative training methods
- dynamic algorithms
- ensemble learning (Krogh and Vedelsby, 1995; Dietterich, 2000)
- explorative algorithms
- exploitive algorithms
- forward modeling
- genetic algorithms (Goldberg, 1989)
- global optimization techniques
- gradient methods
- greedy learning
- incremental training (of neural network; see the sketch after this list)
  1. incremental back-propagation
  2. online training
  3. training by pattern
- kernel methods
- learning
  1. learning-by-example
- mini-batch training (Wilson and Martinez, 2003)
- off-line learning
- off-policy learning
- on-policy learning
- online learning
- policy iteration algorithm (Kaelbling et al., 1996)
- reinforcement learning
  1. episodic reinforcement learning
  2. model-based reinforcement learning
  3. model-free reinforcement learning
  4. non-episodic reinforcement learning
  5. off-policy learning
  6. on-policy learning
  7. tabular reinforcement learning
- supervised training
- temporal difference learning algorithms
- training
- unsupervised training
- value iteration algorithm
- wake-sleep algorithm
  1. contrastive wake-sleep
- weight update algorithm
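The distinction between batch training (one weight update per epoch) and incremental training (one update per pattern) recurs throughout this list. Below is a minimal Python sketch of the two regimes on an assumed toy linear-regression problem; the data, model, and step size are illustrative inventions, not from this page.

<pre>
import random

# Toy data for y = 2*x + 1 with a little noise (assumed example problem).
random.seed(0)
DATA = [(x, 2.0 * x + 1.0 + random.gauss(0, 0.1)) for x in [i / 10 for i in range(20)]]

def grad(w, b, x, y):
    """Gradient of the squared error (w*x + b - y)^2 w.r.t. w and b."""
    err = w * x + b - y
    return 2 * err * x, 2 * err

def batch_training(epochs=200, step=0.05):
    """Training by epoch: one weight update per pass over the whole set."""
    w = b = 0.0
    for _ in range(epochs):
        gw = gb = 0.0
        for x, y in DATA:               # accumulate gradient over all patterns
            dw, db = grad(w, b, x, y)
            gw += dw
            gb += db
        w -= step * gw / len(DATA)      # single batch update per epoch
        b -= step * gb / len(DATA)
    return w, b

def incremental_training(epochs=200, step=0.05):
    """Training by pattern (online): update weights after every example."""
    w = b = 0.0
    for _ in range(epochs):
        for x, y in DATA:
            dw, db = grad(w, b, x, y)
            w -= step * dw              # immediate per-pattern update
            b -= step * db
    return w, b

print("batch:      ", batch_training())
print("incremental:", incremental_training())
</pre>

Both runs should recover weights near (2.0, 1.0); the difference is only in when the update is applied.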
Common Parameters of Algorithms
- patience parameter
- momentum
- steepness parameter
- step size (see the sketch after this list)
  1. fixed step-size
  2. dynamic step-size
- variational bound
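A minimal sketch of how two of these parameters enter a weight update: the step size scales the gradient, and the momentum term re-applies a decayed fraction of the previous update. The exponential-decay schedule standing in for a dynamic step size is an assumption for illustration; the page does not prescribe a particular rule.

<pre>
def momentum_update(w, grad_w, velocity, step_size=0.01, momentum=0.9):
    # One gradient-descent weight update using the step-size and momentum
    # parameters listed above; returns the new weight and new velocity.
    velocity = momentum * velocity - step_size * grad_w
    return w + velocity, velocity

def dynamic_step_size(initial=0.1, decay=0.99):
    # Illustrative dynamic step-size schedule (assumed exponential decay;
    # a fixed step-size would simply yield the same value forever).
    step = initial
    while True:
        yield step
        step *= decay

# Usage: minimize f(w) = (w - 3)^2, whose gradient is 2*(w - 3).
w, velocity = 0.0, 0.0
schedule = dynamic_step_size()
for _ in range(500):
    w, velocity = momentum_update(w, 2 * (w - 3.0), velocity, step_size=next(schedule))
print(round(w, 3))  # converges near 3.0
</pre>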
Named Algorithms
- Backprop
- Bayesian techniques (Neal, 1996)
- Cascade 2
  1. Cascade 2 with caching
- Cascade Correlation (Prechelt, 1997)
- Casper algorithm (Treadgold and Gedeon, 1997)
- Cerebellar Model Articulation Controller (CMAC) (Albus, 1975; Glanz et al., 1991; Sutton and Barto, 1998)
- Contrastive Divergence Learning
- Dyna-Q (Sutton and Barto, 1998)
- Explicit Explore or Exploit (Kearns and Singh, 1998)
- Gibbs Sampling
  1. Alternating Gibbs Sampling
- K-Nearest Neighbor
- Learning Vector Quantization (LVQ)
- Levenberg-Marquardt (Moré, 1977)
- Locality-Sensitive Hashing (LSH)
- Maximum Likelihood (ML) Learning
- Model-Based Interval Estimation (MBIE) (Strehl and Littman, 2004)
- Model-Based Policy Gradient methods (MBPG) (Wang and Dietterich, 2003)
- Monte Carlo Algorithms
- N-Step Return Algorithm
- Neural Fitted Q Iteration (Riedmiller, 2005)
  1. NFQ-SARSA(λ)
- Optimal Brain Damage (LeCun et al., 1990)
- Orthogonal Least Squares (OLS)
- Particle Swarm (Kennedy and Eberhart, 1995)
- Prioritized Sweeping (Sutton and Barto, 1998)
- Q-Learning (see the sketch after this list)
  1. Delayed Q-learning (Strehl et al., 2006)
  2. Generalized Policy Iteration (GPI) (Sutton and Barto, 1998)
  3. Naive Q(λ) (Sutton and Barto, 1998)
  4. One-Step Q-Learning
  5. Peng’s Q(λ) (Peng and Williams, 1994)
  6. Q(λ)
  7. Watkins’s Q(λ) (Watkins, 1989)
- Q-SARSA(λ)
  1. One-Step Q-SARSA Algorithm
- Quickprop (Fahlman, 1988)
- R-Max (Brafman and Tennenholtz, 2002)
- RPROP (Riedmiller and Braun, 1993)
  1. iRPROP- (Igel and Hüsken, 2000)
- SARSA(λ)
  1. On-Policy SARSA-Learning
  2. SARSA with Linear Function Approximators (Gordon, 2000)
- Simulated Annealing (Kirkpatrick et al., 1983)
- Sliding Window Cache
- SONN (Tenorio and Lee, 1989)
- Temporal Difference (TD) Learning
  1. Monte Carlo Prediction
  2. TD(0) Algorithm (Sutton, 1988)
  3. TD(λ) Algorithm
  4. TD(λ) Algorithm with Linear Function Approximators (Tsitsiklis and Van Roy, 1996)
  5. Temporal Difference (TD) Prediction Algorithm
- λ-Return Approach
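Several of the entries above (Q-Learning, SARSA(λ), Temporal Difference Learning) share the tabular temporal-difference update. Below is a minimal Python sketch of tabular one-step Q-learning; the ChainEnv class and its reset/step interface are illustrative inventions for this sketch, not part of any library named on this page.

<pre>
import random

class ChainEnv:
    # Toy 5-state chain (assumed example): actions move left (-1) or
    # right (+1); reaching state 4 ends the episode with reward 1.
    actions = (-1, +1)
    def reset(self):
        self.s = 0
        return self.s
    def step(self, a):
        self.s = max(0, min(4, self.s + a))
        done = self.s == 4
        return self.s, (1.0 if done else 0.0), done

def one_step_q_learning(env, episodes=500, alpha=0.1, gamma=0.95, epsilon=0.1):
    # Tabular one-step Q-learning:
    #   Q(s,a) += alpha * (r + gamma * max_a' Q(s',a') - Q(s,a))
    # The max over a' makes the target greedy, hence off-policy.
    Q = {}                                   # (state, action) -> value, default 0
    q = lambda s, a: Q.get((s, a), 0.0)
    for _ in range(episodes):
        s, done = env.reset(), False
        while not done:
            if random.random() < epsilon:    # explorative choice
                a = random.choice(env.actions)
            else:                            # exploitive (greedy) choice
                a = max(env.actions, key=lambda act: q(s, act))
            s2, r, done = env.step(a)
            target = r if done else r + gamma * max(q(s2, act) for act in env.actions)
            Q[(s, a)] = q(s, a) + alpha * (target - q(s, a))
            s = s2
    return Q

Q = one_step_q_learning(ChainEnv())
for s in range(4):                           # greedy policy per non-terminal state
    print(s, max((-1, +1), key=lambda act: Q.get((s, act), 0.0)))  # should print +1
</pre>

Replacing the greedy max in the target with the value of the action actually taken next would turn this into on-policy SARSA-learning, which is the essential difference between the two families listed above.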