Beta. Content is under active construction and has not been peer-reviewed. Report errors on GitHub.Disclaimer

Comparison

Weak Duality vs. Strong Duality

Weak duality always holds: dual optimal is at most primal optimal. Strong duality says they are equal, but requires constraint qualifications like Slater's condition. KKT conditions become necessary and sufficient under strong duality.

What Each States

Both weak and strong duality relate the primal optimization problem to its Lagrangian dual. Consider the primal problem:

minxf0(x)s.t.fi(x)0,  i=1,,m;hj(x)=0,  j=1,,p\min_{x} f_0(x) \quad \text{s.t.} \quad f_i(x) \leq 0, \; i = 1, \ldots, m; \quad h_j(x) = 0, \; j = 1, \ldots, p

and its Lagrangian dual:

maxλ,νg(λ,ν)=maxλ0,νinfx[f0(x)+iλifi(x)+jνjhj(x)]\max_{\lambda, \nu} g(\lambda, \nu) = \max_{\lambda \geq 0, \nu} \inf_{x} \left[ f_0(x) + \sum_i \lambda_i f_i(x) + \sum_j \nu_j h_j(x) \right]

Weak duality says the dual optimal value never exceeds the primal optimal value.

Strong duality says they are equal.

Side-by-Side Statement

Definition

Weak Duality

Let pp^* be the primal optimal value and dd^* the dual optimal value. Then:

dpd^* \leq p^*

This holds for any optimization problem (convex or not), with no additional assumptions. The quantity pd0p^* - d^* \geq 0 is called the duality gap.

Definition

Strong Duality

Strong duality holds when:

d=pd^* = p^*

The duality gap is zero. This requires additional conditions beyond convexity.

Why Weak Duality Always Holds

Theorem

Weak Duality

Statement

For any optimization problem with Lagrangian L(x,λ,ν)L(x, \lambda, \nu) and dual function g(λ,ν)=infxL(x,λ,ν)g(\lambda, \nu) = \inf_x L(x, \lambda, \nu), we have dpd^* \leq p^*.

Intuition

For any feasible primal point xx and any dual-feasible (λ0,ν)(\lambda \geq 0, \nu):

g(λ,ν)=infx~L(x~,λ,ν)L(x,λ,ν)f0(x)g(\lambda, \nu) = \inf_{\tilde{x}} L(\tilde{x}, \lambda, \nu) \leq L(x, \lambda, \nu) \leq f_0(x)

The first inequality is because the infimum is at most any particular value. The second inequality holds because at a feasible xx: fi(x)0f_i(x) \leq 0 and λi0\lambda_i \geq 0, so λifi(x)0\lambda_i f_i(x) \leq 0; and hj(x)=0h_j(x) = 0, so νjhj(x)=0\nu_j h_j(x) = 0. Maximizing over (λ,ν)(\lambda, \nu) and minimizing over xx preserves the inequality.

Failure Mode

Weak duality never fails. It is a universal property of the Lagrangian framework. The only question is whether the gap is zero (strong duality) or positive.

When Strong Duality Holds

Theorem

Strong Duality via Slater's Condition

Statement

If the primal problem is convex (f0,f1,,fmf_0, f_1, \ldots, f_m convex, hjh_j affine) and there exists a point x~\tilde{x} in the relative interior of the domain with fi(x~)<0f_i(\tilde{x}) < 0 for all ii (strict inequality), then strong duality holds: d=pd^* = p^*.

Intuition

Slater's condition requires a strictly feasible point: one that satisfies all inequality constraints with slack. This ensures the feasible set has nonempty interior, which prevents pathological geometric configurations where the constraint surface is "tangent" to the objective level set in a way that creates a gap.

Failure Mode

Slater's condition is sufficient, not necessary. Strong duality can hold without it (e.g., for linear programs, strong duality holds whenever the primal or dual is feasible and bounded). Strong duality fails when the feasible region has no relative interior point satisfying strict inequalities, which can happen with non-convex problems or degenerate convex problems.

When Strong Duality Fails

Strong duality can fail even for convex problems when constraint qualifications are violated.

Example

A convex problem with a duality gap

Consider:

minx1,x2x1s.t.x12/(x2)0,  x20\min_{x_1, x_2} x_1 \quad \text{s.t.} \quad x_1^2/(x_2) \leq 0, \; x_2 \geq 0

where the constraint function f1(x)=x12/x2f_1(x) = x_1^2/x_2 is convex on x2>0x_2 > 0. The feasible set is {(0,x2):x20}\{(0, x_2) : x_2 \geq 0\}, so p=0p^* = 0. The Lagrangian is L(x,λ)=x1+λx12/x2L(x, \lambda) = x_1 + \lambda x_1^2 / x_2, and infx2>0,x1L=\inf_{x_2 > 0, x_1} L = -\infty for every λ0\lambda \geq 0 (taking x1x_1 \to -\infty with x2x_2 large), giving d=d^* = -\infty. Slater's condition fails because there is no point with f1(x)<0f_1(x) < 0 and x2>0x_2 > 0. See Boyd and Vandenberghe (2004), Section 5.2.4.

For non-convex problems, duality gaps are the norm, not the exception. Integer programming problems routinely have large duality gaps. The LP relaxation dual gives a lower bound on the integer optimum, and the gap measures how far the relaxation is from exact.

KKT Conditions Under Strong Duality

When strong duality holds, the KKT conditions become necessary for optimality (for convex problems, they are also sufficient):

  1. Stationarity: f0(x)+iλifi(x)+jνjhj(x)=0\nabla f_0(x^*) + \sum_i \lambda_i^* \nabla f_i(x^*) + \sum_j \nu_j^* \nabla h_j(x^*) = 0
  2. Primal feasibility: fi(x)0f_i(x^*) \leq 0, hj(x)=0h_j(x^*) = 0
  3. Dual feasibility: λi0\lambda_i^* \geq 0
  4. Complementary slackness: λifi(x)=0\lambda_i^* f_i(x^*) = 0 for all ii

Without strong duality, a primal optimum may not correspond to any dual optimum, and the KKT conditions may have no solution.

Key Assumptions That Differ

Weak DualityStrong Duality
Statementdpd^* \leq p^*d=pd^* = p^*
AssumptionsNone (always holds)Constraint qualification (e.g., Slater)
Convexity required?NoTypically yes (for Slater)
Practical useLower bounds on optimal valueSolve dual instead of primal
KKTNot guaranteedKKT necessary and sufficient

Practical Consequences

When strong duality holds, you can solve the dual instead of the primal. This is useful when:

  1. The dual is easier. The SVM dual has nn variables (one per training point) with a simple constraint set, while the primal has dd variables (one per feature). When n<dn < d, the dual is cheaper.
  2. The dual reveals structure. The SVM dual shows that only support vectors matter (λi>0\lambda_i > 0 only for points on or inside the margin).
  3. You need a lower bound. Even if you cannot solve the primal exactly, any dual-feasible point gives a guaranteed lower bound on the primal optimum.

Common Confusions

Watch Out

Slater is sufficient, not necessary

Many textbooks present Slater's condition as the criterion for strong duality. But strong duality can hold without Slater. Linear programs always satisfy strong duality (when feasible and bounded) without needing a strictly feasible point. Slater is the most commonly used sufficient condition, not the only path to strong duality.

Watch Out

Strong duality does not mean the dual is easy to solve

Having d=pd^* = p^* means the dual has the same optimal value. It does not mean the dual problem is computationally easier. For some problems the dual is harder. The advantage of duality is structural, not automatically computational.

Watch Out

Complementary slackness requires strong duality

Complementary slackness (λifi(x)=0\lambda_i^* f_i(x^*) = 0) is a consequence of zero duality gap. If the gap is positive, there is no pair (x,λ,ν)(x^*, \lambda^*, \nu^*) satisfying all KKT conditions simultaneously.

References

Canonical:

Current: