So I've been trying to come up with something along those lines. I don't have anything I'm super happy with, but here are some thoughts. First, here's a theorem saying that the conditional SSP regret is bounded by the conditional cost-sensitive regret summed across the subproblems encountered on the SSP regret-minimizing path.
Theorem (Optimal Path Regret Bound)
For fixed $x$, let $p^*(x)$ denote the set of $(k, v)$ pairs encountered on the SSP regret-minimizing path of length $n$ from source vertex $s$ to target vertex $t$. Then for all SSP distributions and all cost-sensitive classifiers $\Psi$, \[ r_{spath}(\Psi) \leq E_{x \sim D_x} \left[ \sum_{(k, v) \in p^*(x)} q_{kv}(\Psi | x) \right], \] with the definition $q_{n-1,v}(\Psi | x) = q_{nv}(\Psi | x) = 0$ for all $v$.
Proof: Let $\Upsilon^*_{kv}(x)$ be the set of $(k, v)$ pairs encountered on the SSP regret-minimizing path of length $n - k$ from source vertex $v$ to target vertex $t$. The proof is by induction on the property $r_{kv}(\Psi | x) \leq \sum_{(k', v') \in \Upsilon^*_{kv}(x)} q_{k'v'}(\Psi | x)$.
When $k = n - 2$, it is easily verified directly that $r_{n-2,v}(\Psi | x) = q_{n-2,v}(\Psi | x)$.
For $k \in [1, n - 2)$, \[ r_{kv}(\Psi | x) \leq q_{kv}(\Psi | x) + r_{k+1,w^*}(\Psi | x) \leq q_{kv}(\Psi | x) + \sum_{(k', v') \in \Upsilon^*_{k+1,w^*}(x)} q_{k'v'}(\Psi | x) = \sum_{(k', v') \in \Upsilon^*_{kv}(x)} q_{k'v'}(\Psi | x), \] where the first step follows from the next-step bound lemma, the second step leverages the induction hypothesis, and the third step follows from the definition of $\Upsilon^*_{kv}(x)$. Noting that $\Upsilon^*_{1s}(x) = p^*(x)$ yields
\[ r_{spath}(\Psi | x) = r_{1s}(\Psi | x) \leq \sum_{(k, v) \in p^*(x)} q_{kv}(\Psi | x). \]
Taking the expectation with respect to $D_x$ completes the proof.
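As a sanity check, here's a small Python sketch of the theorem on a toy layered SSP instance with $x$ held fixed. It builds random edge costs and an arbitrary (generally suboptimal) classifier, then verifies that the path regret from the source is bounded by the sum of the per-node cost-sensitive regrets along the optimal path. This is only my reading of the setup, not the definitive construction: in particular I'm assuming $q_{kv}$ is the cost-sensitive regret at $(k, v)$ with cost vector given by the edge cost plus $\Psi$'s downstream cost-to-go, which is the definition under which the next-step bound lemma goes through, and all names here are made up for illustration.

```python
import random

random.seed(0)
n, V = 6, 4  # horizon (path length) and vertices per layer
# c[k][v][w]: cost of moving from vertex v at step k to vertex w at step k+1;
# for simplicity the entire last layer acts as a zero-cost target t
c = [[[random.random() for _ in range(V)] for _ in range(V)]
     for _ in range(n - 1)]
# an arbitrary deterministic classifier Psi: psi[k][v] is the chosen successor
psi = [[random.randrange(V) for _ in range(V)] for _ in range(n - 1)]

# J_star[k][v]: optimal cost-to-go; J_psi[k][v]: Psi's cost-to-go
J_star = [[0.0] * V for _ in range(n)]
J_psi = [[0.0] * V for _ in range(n)]
for k in reversed(range(n - 1)):
    for v in range(V):
        J_star[k][v] = min(c[k][v][w] + J_star[k + 1][w] for w in range(V))
        J_psi[k][v] = c[k][v][psi[k][v]] + J_psi[k + 1][psi[k][v]]

def q(k, v):
    # assumed per-subproblem cost-sensitive regret at (k, v): costs are the
    # edge cost plus Psi's downstream cost-to-go; regret of Psi's choice
    costs = [c[k][v][w] + J_psi[k + 1][w] for w in range(V)]
    return costs[psi[k][v]] - min(costs)

# accumulate q along the optimal path starting at the source vertex s = 0
s, v, bound = 0, 0, 0.0
for k in range(n - 1):
    bound += q(k, v)
    nxt = min(range(V), key=lambda w: c[k][v][w] + J_star[k + 1][w])
    v = nxt

path_regret = J_psi[0][s] - J_star[0][s]
print(f"r_1s = {path_regret:.4f} <= sum of q on optimal path = {bound:.4f}")
assert path_regret <= bound + 1e-9
```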
Returning to the question of a better bound, it's a short hop from the above theorem to the statement \[ r_{spath}(\Psi) \leq (n - 2)\, E_{x \sim D_x} \left[ \max_{k,v}\, q_{kv}(\Psi | x) \right], \] which has the right flavor. For instance, if I reduce the individual CSMC subproblems to regression, I'd pick up an extra $\sqrt{2n}$ leading factor and a $\sqrt{\epsilon_{kv}}$ dependence on the regression regret for each subproblem. That would give \[ r_{spath}(\Psi) \leq (n - 2)\, E_{x \sim D_x} \left[ \max_{k,v}\, q_{kv}(\Psi | x) \right] \leq (n - 2) \sqrt{2n}\, E_{x \sim D_x} \left[ \max_{k,v} \sqrt{\epsilon_{kv}(\Psi | x)} \right] \leq (n - 2) \sqrt{2n} \sqrt{E_{x \sim D_x} \left[ \max_{k,v}\, \epsilon_{kv} \right]}, \] where $\max_a \sqrt{f_a} = \sqrt{\max_a f_a}$ and Jensen's inequality combine in the last step. This looks something like the $O(n^{3/2}) \sqrt{\epsilon}$ I get from a direct reduction to regression. However, I'm not satisfied, because I don't have a good intuition about how the expectation of a single regression subproblem relates to an ``$L_\infty$ style'' expectation over the maximum of a set of regression subproblems.
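Just to double-check that last step numerically (again with made-up names): pointwise, $\max_a \sqrt{f_a} = \sqrt{\max_a f_a}$ because the square root is monotone, and Jensen's inequality for the concave square root gives $E[\sqrt{Z}] \leq \sqrt{E[Z]}$ with $Z = \max_{k,v} \epsilon_{kv}$.

```python
import math
import random

random.seed(1)
# simulate regression regrets epsilon_{kv}: one row of 5 subproblems per "x"
samples = [[random.random() for _ in range(5)] for _ in range(10000)]

# pointwise identity: max of square roots equals square root of the max
for eps in samples[:100]:
    assert abs(max(math.sqrt(e) for e in eps) - math.sqrt(max(eps))) < 1e-12

# Jensen for the concave sqrt: E[sqrt(max eps)] <= sqrt(E[max eps])
lhs = sum(math.sqrt(max(eps)) for eps in samples) / len(samples)
rhs = math.sqrt(sum(max(eps) for eps in samples) / len(samples))
print(f"E[max sqrt(eps)] = {lhs:.4f} <= sqrt(E[max eps]) = {rhs:.4f}")
assert lhs <= rhs
```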