Beam Search
Beam search is a search algorithm that explores a graph by expanding the most promising node in a limited set. Beam search is a modification of best-first search that reduces its memory requirement, where only a predetermined number of best partial solutions are kept as candidates.
At each level of the tree, it generates all successors of the states at the current level, sorting them in increasing order of heuristic cost. However, it only stores a predetermined number
With an infinite beam width, no states are pruned and beam search is identical to best-first search. Conversely, a beam width of 1 corresponds to a hill-climbing algorithm.
Analysis
The beam width bounds the memory required to perform search. Since a goal state could potentially be pruned, beam search sacrifices completeness. As such, beam search is not optimal.
Applications
Beam search is commonly used when the search tree is too large to explore exhaustively, but the successor states can be scored cheaply enough to rank partial solutions. Typical examples include:
- sequence decoding in language models, machine translation, speech recognition, and image captioning;
- approximate planning when a heuristic or learned value function can score partial plans;
- combinatorial optimization problems where the goal is to maintain a small set of diverse high-quality candidates rather than prove optimality.
In language-model decoding, each node is a partial token sequence and each edge appends one token. The model gives a probability distribution over the next token, so a partial sequence can be scored by its accumulated log-probability (often with length normalization). Beam search keeps the
Beam search is a natural fit for text generation because the branching factor is enormous (the vocabulary size), but the model already supplies a direct local ranking over next-token actions. Algorithms such as MCTS are more useful when rollouts or value estimates can reveal delayed rewards that are not captured by the immediate policy. For standard autoregressive generation, there is usually no cheap, reliable rollout reward beyond the model likelihood itself, so MCTS adds substantial computation without a clear signal to back up through the tree. Beam search instead exploits the model’s own token probabilities and gives a simple compute-quality tradeoff through the beam width.
Beam Search vs Policy Tree Search for Language Models
At first glance, policy-guided tree search methods such as Levin Tree Search seem like natural alternatives for language-model decoding. An autoregressive language model already defines a policy over next-token actions, so one might try to search the token tree using a policy-aware objective and return the best completed sequence found within a fixed compute budget.
The main difficulty is that ordinary text generation usually does not have a crisp goal predicate. In classical policy tree search, the policy is a guide toward some externally checkable solution: solve a puzzle, prove a theorem, synthesize a program, or reach a goal state. In standard language-model decoding, however, the terminal condition is often just sequence completion, and many completed sequences are valid. If the only score is model likelihood, then policy tree search largely reduces to another way of prioritizing high-probability continuations, which beam search already does directly.
Beam search also matches the hardware structure of language-model inference. All hypotheses in the beam have the same depth, so their next-token distributions can be computed in a single batched forward pass. More general policy tree search methods may expand nodes at different depths or revisit irregular parts of the tree, which can be less efficient on accelerators even when the search is theoretically well motivated.
Policy tree search methods become more attractive when generation has an external verifier or reward signal that is not captured by local token likelihood. Examples include code generation with unit tests, theorem proving, tool-use planning, constrained decoding, and mathematical reasoning with checkable answers. In those settings, the language model policy proposes plausible actions, while the search algorithm allocates compute toward paths that are likely to produce a verified solution. This is closer to the setting where methods like Levin Tree Search are designed to be useful.
Implementation
// Concept for environment state
template <typename T>
concept IsSearchEnv = std::equality_comparable<T> && requires (T t, const T ct) {
// apply action and return step cost
{ t.apply_action(std::declval<int>()) } -> std::same_as<double>;
{ ct.is_solution() } -> std::same_as<bool>;
{ ct.is_terminal() } -> std::same_as<bool>;
{ std::hash<T>{}(ct) } -> std::convertible_to<std::size_t>;
{ auto(T::num_actions) } -> std::same_as<int>;
};
// Concept for search node
template <typename T, IsSearchEnv Env>
concept IsSearchNode = requires (T t, const T ct) {
{ ct.generate_child(std::declval<int>()) } -> std::same_as<T>;
{ auto(t.parent) } -> std::same_as<T*>;
{ auto(t.state) } -> std::same_as<Env>;
{ auto(t.cost) } -> std::same_as<double>;
{ auto(t.generated_step) } -> std::same_as<int>;
{ auto(t.expanded_step) } -> std::same_as<int>;
};
// Node hashing
template <IsSearchEnv Env, typename Node>
requires IsSearchNode<Node, Env>
struct NodeHasher {
auto operator()(const Node *node) const -> std::size_t{
assert(node);
using State = std::remove_cvref_t<decltype(node->state)>;
return std::hash<State>{}(node->state);
}
};
// Node equality
template <IsSearchEnv Env, typename Node>
requires IsSearchNode<Node, Env>
struct NodeEquality {
auto operator()(const Node *lhs, const Node *rhs) const -> bool {
assert(lhs && rhs);
return lhs->state == rhs->state;
}
};
template <IsSearchEnv Env, typename Node>
requires IsSearchNode<Node, Env>
auto beam_search(
Node root,
std::size_t beam_width,
std::function<double(const Node&)> eval_func
) -> bool
{
assert(beam_width > 0);
// Search data structures
std::vector<std::unique_ptr<Node>> nodes;
std::vector<Node*> beam;
auto node_compare = [&](const Node *lhs, const Node *rhs) {
assert(lhs && rhs);
if (lhs->cost != rhs->cost) {
return lhs->cost < rhs->cost;
}
return lhs->generated_step < rhs->generated_step;
};
std::unordered_multiset<
Node *,
NodeHasher<Env, Node>,
NodeEquality<Env, Node>
> closed;
root.generated_step = 0;
root.expanded_step = 0;
int num_expanded = 0;
int num_generated = 0;
root.cost = eval_func(root);
nodes.emplace_back(std::make_unique<Node>(std::move(root)));
beam.push_back(nodes.back().get());
while (!beam.empty()) {
std::vector<Node*> candidates;
for (auto *current : beam) {
closed.insert(current);
current->expanded_step = ++num_expanded;
// Check for solution and terminal states
if (current->state.is_solution()) {
return true;
}
if (current->state.is_terminal()) {
continue;
}
// Generate children
for (auto a : std::views::indices(Env::num_actions)) {
auto child = std::make_unique<Node>(current->generate_child(a));
if (closed.contains(child.get())) {
continue;
}
child->generated_step = ++num_generated;
child->cost = eval_func(*child);
nodes.emplace_back(std::move(child));
candidates.push_back(nodes.back().get());
}
}
std::sort(candidates.begin(), candidates.end(), node_compare);
if (candidates.size() > beam_width) {
candidates.resize(beam_width);
}
beam = std::move(candidates);
}
return false;
}