Result: Average CSMC Regression Regret
For all average constrained CSMC distributions $D$, and all cost-sensitive classifiers $h^g$ derived from a regressor $g$, \[ r_{av}(h^g) \leq \sqrt{2 |K| \epsilon_{L^2}(g)}, \] where $\epsilon_{L^2}(g)$ is the $L^2$ regret of $g$ on the underlying regression problem.
Proof: See Average Analysis below.
Result: Minimax CSMC Regression Regret
For all minimax constrained CSMC distributions $D$, and all cost-sensitive classifiers $h^g$ derived from a regressor $g$, \[ r_{mm}(h^g) \leq \sqrt{2 |K| \epsilon_{L^2}(g)}, \] where $\epsilon_{L^2}(g)$ is the $L^2$ regret of $g$ on the underlying regression problem.
Proof: See Minimax Analysis below.
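For a sense of scale, both bounds have the same square-root form: with, say, $|K| = 16$ classes and a regression regret of $\epsilon_{L^2}(g) = 10^{-3}$ (illustrative numbers only), \[ \sqrt{2 |K| \epsilon_{L^2}(g)} = \sqrt{2 \cdot 16 \cdot 10^{-3}} \approx 0.18, \] so the classification regret degrades only as the square root of the regression regret, at the price of a $\sqrt{|K|}$ dependence on the number of classes.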
Average Analysis
In this case, there is a distribution $D = D_x \times D_{\omega|x} \times D_{c|\omega,x}$, where $c: K \to \overline{\mathbf{R}}$ takes values in the extended reals $\overline{\mathbf{R}} = \mathbf{R} \cup \{ \infty \}$, and the components of $c$ which are $\infty$-valued for a particular instance are revealed as part of the problem instance via $\omega \in \mathcal{P}(K)$ (i.e., $\omega$ is a subset of $K$). The regret of a particular classifier $h: X \times \mathcal{P}(K) \to K$ is given by \[ r_{av}(h) = E_{(x, \omega) \sim D_x \times D_{\omega|x}} \left[ E_{c \sim D_{c|\omega,x}} \left[ c(h(x, \omega)) \right] - \min_{k \in K} E_{c \sim D_{c|\omega,x}} \left[ c(k) \right] \right]. \]

An argmin regression strategy for solving cost-sensitive multiclass classification is a function $g: X \times \mathcal{P}(K) \times K \to \overline{\mathbf{R}}$ which defines an associated cost-sensitive multiclass classifier $h^g: X \times \mathcal{P}(K) \to K$ according to $h^g(x, \omega) = \operatorname*{arg\,min}_{k \in K} g(x, \omega, k)$. I would like to bound $r_{av}(h^g)$ in terms of the regret of $g$ on the regression problem, $\epsilon_L(g) = q_L(g) - \min_g q_L(g)$, where $q_L(g)$ is the error on the regression problem, \[ q_L(g) = E_{(x, \omega, c) \sim D} \left[ \frac{1}{|K|} \sum_{k \in K} L \bigl( g(x, \omega, k), c(k) \bigr) \right], \] and $L$ is a loss function for the regression problem (defined on the extended reals). I'll focus on $L^2$ loss for the regressor, defined on the extended reals via $L^2(\infty, \infty) = 0$, $L^2(\infty, \cdot) = \infty$, and $L^2(\cdot, \infty) = \infty$.

Consider a single instance $(x, \omega)$ with associated conditional per-instance cost-vector distribution $D_{c|\omega,x}$, and suppose our regressor has cost estimates which differ from a minimum error regressor's estimate $c^*(x, \omega, k)$ by $\delta(x, \omega, k)$, \[ g(x, \omega, k) = c^*(x, \omega, k) + \delta(x, \omega, k). \] For $k \in \omega$, $\delta(x, \omega, k) = 0$, since both $c^*(x, \omega, k)$ and our regressor $g(x, \omega, k)$ will be $\infty$ (any finite estimate for such a component incurs infinite regression regret, so this case can be ignored). The associated classifier $h^g$ is \[ h^g(x, \omega) = \operatorname*{arg\,min}_{k \in K} \bigl( c^*(x, \omega, k) + \delta(x, \omega, k) \bigr). \]

Imagine an adversary which attempts to create a certain amount of CSMC regret on this instance while minimizing the amount of regression regret on this instance. This adversary is faced with the following family of problems indexed by $k^{**} \in K \setminus \omega$: \[ \begin{aligned} & \min_{\delta} \sum_{k \in K} \delta(x, \omega, k)^2 \\ & \text{s.t. } \forall k \neq k^{**}, \quad c^*(x, \omega, k^{**}) + \delta(x, \omega, k^{**}) \leq c^*(x, \omega, k) + \delta(x, \omega, k). \end{aligned} \] This is the same as the unconstrained CSMC reduction to regression, but with $k^{**}$ restricted to the set $K \setminus \omega$. When $|K \setminus \omega| \leq 1$, the CSMC regret is zero; otherwise the adversary's strategy is unchanged: perturb the $k^*$ and $k^{**}$ estimates (where $k^*$ is the true minimum-cost class) and leave the others alone. Thus leveraging the previous analysis yields \[ r_{av}(h^g) \leq \sqrt{2 |K| \epsilon_{L^2}(g)}. \] It should also be noted that the regression regret will be at most the regression regret in the unconstrained case, since the additional information contained in $\omega$ allows the regressor to have a perfect estimate for some values of $k$.
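To spell out the calculation being leveraged, here is a sketch of the per-instance argument for squared loss (reconstructed from the unconstrained case; the per-instance regret notation $r(x, \omega)$ is introduced just for this sketch). Let $k^* \in \operatorname*{arg\,min}_{k \in K \setminus \omega} c^*(x, \omega, k)$ and suppose $h^g(x, \omega) = k^{**} \neq k^*$, so the per-instance CSMC regret is $r(x, \omega) = c^*(x, \omega, k^{**}) - c^*(x, \omega, k^*)$. For $k^{**}$ to win the argmin the perturbations must satisfy $\delta(x, \omega, k^*) - \delta(x, \omega, k^{**}) \geq r(x, \omega)$, and the cheapest way to achieve this in squared perturbation is to split the difference, \[ \delta(x, \omega, k^*) = \frac{r(x, \omega)}{2}, \qquad \delta(x, \omega, k^{**}) = -\frac{r(x, \omega)}{2}, \qquad \sum_{k \in K} \delta(x, \omega, k)^2 = \frac{r(x, \omega)^2}{2}. \] Since for squared loss the per-instance regression regret is $\frac{1}{|K|} \sum_{k \in K} \delta(x, \omega, k)^2$, the per-instance CSMC regret is at most $\sqrt{2 |K|}$ times the square root of the per-instance regression regret; taking expectations and applying Jensen's inequality to the concave square root yields \[ r_{av}(h^g) \leq \sqrt{2 |K| \epsilon_{L^2}(g)}. \]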
Minimax Analysis
In this case, there is a distribution $D = D_x \times D_{\tilde c|x}$, where $\tilde c: K \to \mathbf{R}$ takes values in the regular reals $\mathbf{R}$. Then, an adversary comes in and manufactures a cost vector $c$ in the extended reals $\overline{\mathbf{R}}$ by setting some of the components to $\infty$; these choices are revealed via $\omega$ prior to a decision being elicited. In this case the regret of a particular classifier is given by \[ r_{mm}(h) = E_{x \sim D_x} \left[ E_{\tilde c \sim D_{\tilde c|x}} \left[ \max_{\omega \in \mathcal{P}(K)} \left\{ c(h(x, \omega)) - \min_{k \in K} E_{\tilde c \sim D_{\tilde c|x}} \left[ c(k) \right] \right\} \right] \right]. \]

Consider a single instance $x$ with associated conditional per-instance cost-vector distribution $D_{\tilde c|x}$; in addition, the adversary can pick $\omega$ to construct the complete problem instance $(x, \omega)$. As above, the regressor has cost estimates which differ from a minimum error regressor's estimate $c^*(x, \omega, k)$ by $\delta(x, \omega, k)$, and when $k \in \omega$ the estimates are perfect, i.e., $\delta(x, \omega, k) = 0$.

Imagine an adversary which attempts to create a certain amount of CSMC regret on this instance while minimizing the amount of regression regret on this instance. This adversary is faced with the following family of problems indexed by $\omega$ and $k^{**} \in K \setminus \omega$: \[ \begin{aligned} & \min_{\delta} \sum_{k \in K} \delta(x, \omega, k)^2 \\ & \text{s.t. } \forall k \neq k^{**}, \quad c^*(x, \omega, k^{**}) + \delta(x, \omega, k^{**}) \leq c^*(x, \omega, k) + \delta(x, \omega, k). \end{aligned} \] Again, when $|K \setminus \omega| \leq 1$, the CSMC regret is zero; otherwise the adversary's strategy is the same for each $\omega$, and leveraging the previous analysis yields \[ r_{mm}(h^g) \leq \sqrt{2 |K| \epsilon_{L^2}(g)}. \]
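To make the reduction itself concrete, here is a minimal illustrative sketch in Python (none of this code comes from the analysis above; the regressor interface `g(x, omega, k)`, the function names, and the toy costs are all hypothetical). It implements the induced classifier $h^g$ and brute-forces the adversary's choice of $\omega$ for a single instance, using the conditional expected costs directly as the comparator, i.e., it evaluates $\max_{\omega} \{ E[c(h^g(x, \omega))] - \min_{k} E[c(k)] \}$ for a small $K$.

```python
from itertools import combinations

K = [0, 1, 2, 3]  # class labels

def h_g(g, x, omega):
    """Classifier induced by the regressor g: pick the feasible class
    (i.e., not in omega) with the smallest estimated cost."""
    feasible = [k for k in K if k not in omega]
    return min(feasible, key=lambda k: g(x, omega, k))

def per_instance_minimax_regret(g, x, expected_cost):
    """Brute-force the adversary's choice of omega for one instance x,
    leaving at least one class feasible; expected_cost(k) plays the role
    of the conditional expected cost of class k given x."""
    worst = 0.0
    for size in range(len(K)):  # |omega| ranges from 0 to |K| - 1
        for omega in map(frozenset, combinations(K, size)):
            feasible = [k for k in K if k not in omega]
            best = min(expected_cost(k) for k in feasible)
            worst = max(worst, expected_cost(h_g(g, x, omega)) - best)
    return worst

# Toy usage: a regressor that overestimates the cost of class 0 by 0.15.
true_cost = {0: 0.1, 1: 0.2, 2: 0.6, 3: 0.9}

def g(x, omega, k):
    return true_cost[k] + (0.15 if k == 0 else 0.0)

print(per_instance_minimax_regret(g, x=None, expected_cost=true_cost.get))
# ~0.1: the worst omega is the empty set, where class 1 wrongly beats class 0
```

Enumerating all $2^{|K|}$ subsets is only for illustration; the point is simply that $h^g$ never considers classes ruled out by $\omega$, so per-$\omega$ the situation reduces to the unconstrained argument.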