Difference between revisions of "Methods"

From aHuman Wiki

Revision as of 20:02, 21 June 2015

Algorithms and Methods

General Methods

  • back-propagation (Werbos, 1974)
  • backward modeling
  • batch training
      1. batch back-propagation
      2. training by epoch
  • boosting (Schapire, 2001)
  • cascade algorithm
  • constructive training (of neural network)
  • convex optimization
  • covariance training
  • direct error minimization
  • discriminative training methods
  • dynamic algorithms
  • ensemble learning (Krogh and Vedelsby, 1995; Dietterich, 2000)
  • explorative algorithms
  • exploitive algorithms
  • forward modeling
  • genetic algorithms (Goldberg, 1989)
  • global optimization techniques
  • gradient methods
  • greedy learning
  • incremental training (of neural network)
      1. incremental back-propagation
      2. online training
      3. training by pattern
  • kernel methods
  • learning
      1. learning-by-example
  • mini-batch training (Wilson and Martinez, 2003)
  • off-line learning
  • off-policy learning
  • on-policy learning
  • online learning
  • policy iteration algorithm (Kaelbling et al., 1996)
  • reinforcement learning
      1. episodic reinforcement learning
      2. model-based reinforcement learning
      3. model-free reinforcement learning
      4. non-episodic reinforcement learning
      5. off-policy learning
      6. on-policy learning
      7. tabular reinforcement learning
  • supervised training
  • temporal difference learning algorithms
  • training
  • unsupervised training
  • value iteration algorithm
  • wake-sleep algorithm
      1. contrastive wake-sleep
  • weight update algorithm

Common Parameters of Algorithms

  • patience parameter
  • momentum
  • steepness parameter
  • step size
      1. fixed step-size
      2. dynamic step-size
  • variational bound

Named Algorithms

  • Backprop
  • Bayesian techniques (Neal, 1996)
  • Cascade 2
      1. Cascade 2 with caching
  • Cascade Correlation (Prechelt, 1997)
  • Casper algorithm (Treadgold and Gedeon, 1997)
  • Cerebellar Model Articulation Controller (CMAC) (Albus, 1975; Glanz et al., 1991; Sutton and Barto, 1998)
  • Contrastive Divergence Learning
  • Dyna-Q (Sutton and Barto, 1998)
  • Explicit Explore or Exploit (Kearns and Singh, 1998)
  • Gibbs Sampling
      1. Alternating Gibbs Sampling
  • K-Nearest Neighbor
  • Learning Vector Quantization (LVQ)
  • Levenberg-Marquardt (Moré, 1977)
  • Locality-Sensitive Hashing (LSH)
  • Maximum Likelihood (ML) Learning
  • Model-Based Interval Estimation (MBIE) (Strehl and Littman, 2004)
  • Model-Based Policy Gradient methods (MBPG) (Wang and Dietterich, 2003)
  • Monte Carlo Algorithms
  • N-Step Return Algorithm
  • Neural Fitted Q Iteration (Riedmiller, 2005)
      1. NFQ-SARSA(λ)
  • Optimal Brain Damage (LeCun et al., 1990)
  • Orthogonal Least Squares (OLS)
  • Particle Swarm (Kennedy and Eberhart, 1995)
  • Prioritized Sweeping (Sutton and Barto, 1998)
  • Q-Learning
      1. Delayed Q-learning (Strehl et al., 2006)
      2. Generalized Policy Iteration (GPI) (Sutton and Barto, 1998)
      3. Naive Q(λ) (Sutton and Barto, 1998)
      4. One-Step Q-Learning
      5. Peng’s Q(λ) (Peng and Williams, 1994)
      6. Q(λ)
      7. Watkins’s Q(λ) (Watkins, 1989)
  • Q-SARSA(λ)
      1. One-Step Q-SARSA Algorithm
  • Quickprop (Fahlman, 1988)
  • R-Max (Brafman and Tennenholtz, 2002)
  • RPROP (Riedmiller and Braun, 1993)
      1. iRPROP- (Igel and Hüsken, 2000)
  • SARSA(λ)
      1. On-Policy SARSA-Learning
      2. SARSA with Linear Function Approximators (Gordon, 2000)
  • Simulated Annealing (Kirkpatrick et al., 1987)
  • Sliding Window Cache
  • SONN (Tenorio and Lee, 1989)
  • Temporal Difference (TD) Learning
      1. Monte Carlo Prediction
      2. TD(0) Algorithm (Sutton, 1988)
      3. TD(λ) Algorithm
      4. TD(λ) Algorithm With Linear Function Approximators (Tsitsiklis and Van Roy, 1996)
      5. Temporal Difference (TD) Prediction Algorithm
  • λ-Return Approach