Levin Tree Search (LTS)

Levin Tree Search (LTS) is a best-first search policy tree search method that uses a policy to guide the search. Rather than ordering nodes solely by the policy , it also incorporates the node depth to account for nodes which increases in depth but remain constant in path probability.

Definition

Levin Tree Search is obtained from best-first search by using

where is the path probability from the root to node , and is the depth of node .

Analysis

The novelty of policy tree search algorithms is that they come with a bound on the number of node expansions before expanding a goal node.

Theorem

The number of node expansions LTS requires before reaching the goal node with minimal Levin cost is bounded by

The Levin cost is not merely a priority rule; for a goal node , the quantity upper-bounds the amount of search effort required to read that node. In this sense, the evaluation function serves as a proxy for first-solution search effort rather than an unrelated priority score.

Implementation

Numerical Stability

The path probabilities may decrease exponentially with depth, which can lead to numerical underflow. Since applying a monotone transformation preserves the expansion order induced by the evaluation function, it is often preferable to compute Levin-style costs in log-space:

Moreover, if is a child of , then the log-path probability can be updated incrementally:

Avoiding Zero Probabilities

To avoid zero probabilities being assigned to various paths, one can mix the policy with a uniform policy:

Code

template <typename Env>
struct LTSNode {    
    auto compute_log_policy(const Node ) -> std::vector<double>;
    
    auto generate_child(int action) -> BFSNode {
        auto child = *this;
        child.parent = this;
        child.state.apply_action(action);
        ++child.depth;
        child.log_p += child.parent.log_policy[action];
        child.log_policy = compute_log_policy(child);
        return child;
    }
 
    Env state;
    BFSNode *parent = nullptr;
    double depth = 0;
    double log_p = 0;
    std::vector<double> log_policy;
    double cost = 0;
    int generated_step = 0;
    int expanded_step = 0;
};
 
template <typename Env>
auto lts_eval_func(const AStarNode<Env> &node) -> double {
    return std::log(depth) - log_p;
}
 
template <typename Env>
auto lts(Env initial_state) {
    LTSNode root_node = { .state=initial_state }
    return best_first_search(root_node, lts_eval_func);
}

Sources