Value iteration
Optimization Foundations; Machine Learning
noun phrase
Definition: A dynamic-programming method, widely used in reinforcement learning and optimal control, that repeatedly applies Bellman optimality backups to a value estimate until the updates fall below a chosen tolerance, yielding an optimal or near-optimal value function [Wang et al. 2025].
Example in context: “Another similar approach is value iteration (VI), or Q-iteration, which iterates over the value function instead of policy.” [Wang et al. 2025]
Synonyms: VI; Q-iteration
Related terms: policy iteration; Bellman update; dynamic programming; reinforcement learning; value function
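Illustration: The backup loop in the definition can be sketched as follows. This is a minimal sketch and is not taken from Wang et al. 2025. It assumes a small tabular MDP with known transitions, and the names `value_iteration`, `TOY_MDP`, `gamma`, and `tol` are hypothetical choices made for this example.

```python
from typing import List, Tuple

# Assumed encoding: transitions[s][a] is a list of
# (probability, next_state, reward) triples for taking action a in state s.
Transitions = List[List[List[Tuple[float, int, float]]]]


def value_iteration(
    transitions: Transitions,
    gamma: float = 0.9,
    tol: float = 1e-8,
    max_iters: int = 10_000,
) -> Tuple[List[float], List[int]]:
    """Return (values, greedy_policy) for a tabular MDP."""
    n_states = len(transitions)
    values = [0.0] * n_states

    def backup(s: int, a: int, v: List[float]) -> float:
        # Expected one-step return of action a in state s under estimate v.
        return sum(p * (r + gamma * v[s2]) for p, s2, r in transitions[s][a])

    for _ in range(max_iters):
        # Bellman optimality backup: best action value in every state.
        new_values = [
            max(backup(s, a, values) for a in range(len(transitions[s])))
            for s in range(n_states)
        ]
        # Largest change across states; stop once it is below the tolerance.
        delta = max(abs(n - o) for n, o in zip(new_values, values))
        values = new_values
        if delta < tol:
            break

    # Read off a greedy policy from the final value estimate.
    policy = [
        max(range(len(transitions[s])), key=lambda a: backup(s, a, values))
        for s in range(n_states)
    ]
    return values, policy


# Hypothetical two-state MDP: action 0 stays, action 1 moves to the other state.
# Staying in state 1 pays reward 1; every other transition pays 0.
TOY_MDP: Transitions = [
    [[(1.0, 0, 0.0)], [(1.0, 1, 0.0)]],  # state 0
    [[(1.0, 1, 1.0)], [(1.0, 0, 0.0)]],  # state 1
]

values, policy = value_iteration(TOY_MDP, gamma=0.9)
```

With a discount of 0.9, the optimal values for this toy problem are 9 for state 0 and 10 for state 1, so the estimates should approach those numbers and the greedy policy should move out of state 0 and stay in state 1.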