
Algorithms Foundations

Knapsack Problem

The canonical constrained optimization problem: 0/1 knapsack (NP-hard, pseudo-polynomial DP), fractional knapsack (greedy), FPTAS, and connections to Lagrangian relaxation in ML.


Why This Matters

The knapsack problem is the prototype of constrained optimization: maximize value subject to a resource constraint. This pattern is everywhere in machine learning. Selecting features subject to a computation budget, allocating GPU memory across model components, choosing which data points to label given an annotation budget: these are all knapsack-type problems.

Understanding knapsack gives you three fundamental algorithm design techniques in one problem: dynamic programming (exact solution), greedy (fast approximation for the fractional case), and FPTAS (near-optimal in polynomial time).

Mental Model

You have a backpack with a weight limit $W$. There are $n$ items, each with a weight and a value. You want to pack the most valuable subset that fits. The catch: either you cannot split items (0/1 knapsack) or you can (fractional knapsack). This simple change (can you take fractions?) flips the problem from NP-hard to polynomial.

Problem Definitions

Definition

0/1 Knapsack Problem

Given $n$ items with weights $w_1, \ldots, w_n$ and values $v_1, \ldots, v_n$, and a capacity $W$, find a subset $S \subseteq \{1, \ldots, n\}$ that maximizes:

$$\sum_{i \in S} v_i \quad \text{subject to} \quad \sum_{i \in S} w_i \leq W$$

Each item is either included entirely or not at all.

Definition

Fractional Knapsack Problem

Same setup, but you may take any fraction $x_i \in [0, 1]$ of each item. Maximize:

$$\sum_{i=1}^n x_i v_i \quad \text{subject to} \quad \sum_{i=1}^n x_i w_i \leq W$$

The 0/1 Knapsack: Dynamic Programming

Theorem

0/1 Knapsack DP Solution

Statement

Define $\text{dp}[i][c]$ as the maximum value achievable using items $1, \ldots, i$ with capacity $c$. The recurrence is:

$$\text{dp}[i][c] = \begin{cases} \text{dp}[i-1][c] & \text{if } w_i > c \\ \max(\text{dp}[i-1][c], \; \text{dp}[i-1][c - w_i] + v_i) & \text{if } w_i \leq c \end{cases}$$

with base case $\text{dp}[0][c] = 0$ for all $c$. The optimal value is $\text{dp}[n][W]$. This algorithm runs in $O(nW)$ time and $O(nW)$ space (reducible to $O(W)$ space).
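As a minimal sketch (not from the source; the function name is mine), the recurrence translates into a few lines of Python. Iterating capacities downward reuses a single row, which gives the $O(W)$-space version mentioned above:

```python
def knapsack_01(weights, values, capacity):
    """Maximum total value for the 0/1 knapsack, O(n*W) time, O(W) space."""
    dp = [0] * (capacity + 1)  # dp[c] = best value achievable with capacity c
    for w, v in zip(weights, values):
        # Iterate capacities downward so each item is used at most once:
        # dp[c - w] still refers to the previous item's row.
        for c in range(capacity, w - 1, -1):
            dp[c] = max(dp[c], dp[c - w] + v)
    return dp[capacity]
```

For example, `knapsack_01([1, 2, 3], [6, 10, 12], 5)` returns 22, matching the counterexample instance discussed in the greedy section.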

Intuition

For each item, you have two choices: skip it or take it. If you skip item $i$, the best value with capacity $c$ is the same as with items $1, \ldots, i-1$. If you take item $i$, you gain $v_i$ but lose $w_i$ of capacity. The recurrence considers both choices and picks the better one.

Proof Sketch

By induction on $i$. Base case: with zero items, the value is 0 for any capacity. Inductive step: assume $\text{dp}[i-1][\cdot]$ is correct. Any optimal solution for items $1, \ldots, i$ either includes item $i$ or not. If it includes $i$, the remaining items form an optimal solution for items $1, \ldots, i-1$ with capacity $c - w_i$ (by a cut-and-paste argument). If not, it is an optimal solution for items $1, \ldots, i-1$ with capacity $c$. The max of these two cases gives the optimal value.

Why It Matters

This is the textbook example of pseudo-polynomial time: $O(nW)$ looks polynomial, but $W$ can be exponential in the input size (which is $O(n \log W)$ bits). This distinction between polynomial and pseudo-polynomial is essential for understanding NP-hardness and approximation algorithms.

Failure Mode

The algorithm requires integer weights. For real-valued weights, you must discretize (which introduces approximation error) or use a different approach. The $O(nW)$ runtime is impractical when $W$ is very large (e.g., $W = 10^{18}$).

The Fractional Knapsack: Greedy

Theorem

Greedy Optimality for Fractional Knapsack

Statement

Sort items by value-to-weight ratio $v_i / w_i$ in decreasing order. Greedily take as much as possible of each item in this order until the knapsack is full. This greedy algorithm produces an optimal solution to the fractional knapsack problem in $O(n \log n)$ time.

Intuition

Each "unit of weight" of item $i$ is worth $v_i / w_i$. To maximize total value, you should fill the knapsack with the most valuable units first. Since you can take fractions, you can perfectly fill the knapsack by taking the highest-density items and splitting the last one.
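A short sketch of the greedy procedure (the function name is mine; a minimal illustration, not a hardened implementation):

```python
def fractional_knapsack(weights, values, capacity):
    """Optimal fractional-knapsack value: greedy by value-to-weight ratio."""
    items = sorted(zip(weights, values), key=lambda wv: wv[1] / wv[0], reverse=True)
    total = 0.0
    for w, v in items:
        if capacity <= 0:
            break
        take = min(w, capacity)   # whole item if it fits, else the fraction that does
        total += v * (take / w)   # value is proportional to the fraction taken
        capacity -= take
    return total
```

On the instance $(w, v) = (1, 6), (2, 10), (3, 12)$ with $W = 5$, this takes items 1 and 2 whole plus two-thirds of item 3, for a total value of 24.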

Proof Sketch

Suppose the greedy solution is not optimal, and consider an optimal solution that takes less of some high-ratio item $i$ and more of some low-ratio item $j$. Swap: replace some weight of item $j$ with the same weight of item $i$. Since $v_i / w_i > v_j / w_j$, the swap strictly increases the total value, contradicting the optimality of that solution.

Why It Matters

This is a clean example of the greedy method working. The key insight: the fractional relaxation removes the combinatorial structure that makes 0/1 knapsack hard. This connection between fractional relaxation and greedy optimality appears throughout combinatorial optimization.

Failure Mode

The greedy algorithm is not optimal for 0/1 knapsack. Classic counterexample: items with $(w, v) = (1, 6), (2, 10), (3, 12)$ and $W = 5$. Greedy by ratio takes items 1 and 2 (value 16), but the optimum takes items 2 and 3 (value 22).
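The counterexample is easy to check numerically. Here is a small sketch (function names are mine) comparing ratio-greedy on whole items against a brute-force optimum:

```python
from itertools import combinations

def greedy_01(weights, values, capacity):
    """Ratio greedy applied (suboptimally) to 0/1 knapsack: whole items only."""
    items = sorted(zip(weights, values), key=lambda wv: wv[1] / wv[0], reverse=True)
    total = 0
    for w, v in items:
        if w <= capacity:   # take the item whole if it fits, else skip it
            total += v
            capacity -= w
    return total

def best_01(weights, values, capacity):
    """Exact optimum by enumerating all 2^n subsets (fine for tiny n)."""
    items = list(zip(weights, values))
    best = 0
    for r in range(len(items) + 1):
        for subset in combinations(items, r):
            if sum(w for w, _ in subset) <= capacity:
                best = max(best, sum(v for _, v in subset))
    return best
```

Here `greedy_01([1, 2, 3], [6, 10, 12], 5)` returns 16 while `best_01([1, 2, 3], [6, 10, 12], 5)` returns 22, confirming the gap.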

The FPTAS: Near-Optimal in Polynomial Time

Theorem

FPTAS for 0/1 Knapsack

Statement

There exists a fully polynomial-time approximation scheme (FPTAS) for 0/1 knapsack that, for any $\epsilon > 0$, produces a solution with value at least $(1 - \epsilon) \cdot \text{OPT}$ in time $O(n^2 / \epsilon)$.

Intuition

The DP runs in $O(nW)$ time. Instead of DP over weights, do DP over values: $\text{dp}[i][v]$ is the minimum weight to achieve value $v$ using items $1, \ldots, i$. The issue is that values can be large. The FPTAS fixes this by rounding all values down to multiples of a small number $K = \epsilon \cdot v_{\max} / n$, reducing the number of distinct value levels to $O(n / \epsilon)$.

Proof Sketch

Rounding each value down by at most $K$ changes the optimal value by at most $nK = \epsilon \cdot v_{\max} \leq \epsilon \cdot \text{OPT}$. So the rounded solution has value $\geq (1 - \epsilon) \cdot \text{OPT}$. The DP over rounded values has $O(n / \epsilon)$ value levels and $n$ items, giving $O(n^2 / \epsilon)$ time. Since this is polynomial in both $n$ and $1/\epsilon$, it is an FPTAS.
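As a sketch (the function name is mine), the rounding idea can be implemented directly as a min-weight DP over rounded value levels. This straightforward version enumerates every reachable level, so it is somewhat slower than the stated bound, but it exhibits the same $(1-\epsilon)$ guarantee:

```python
def knapsack_fptas(weights, values, capacity, eps):
    """(1 - eps)-approximate 0/1 knapsack via value rounding."""
    n = len(values)
    K = eps * max(values) / n
    scaled = [int(v / K) for v in values]   # round each value down to a multiple of K
    top = sum(scaled)                       # largest reachable rounded level
    INF = float("inf")
    min_w = [0] + [INF] * top   # min_w[t]: least weight achieving rounded value t
    true_v = [0] * (top + 1)    # true (unrounded) value of the set behind min_w[t]
    for w, v, sv in zip(weights, values, scaled):
        for t in range(top, sv - 1, -1):    # downward: each item used at most once
            if min_w[t - sv] + w < min_w[t]:
                min_w[t] = min_w[t - sv] + w
                true_v[t] = true_v[t - sv] + v
    # Best true value among rounded levels reachable within capacity.
    return max(v for wt, v in zip(min_w, true_v) if wt <= capacity)
```

On the counterexample instance with `eps = 0.1` this happens to recover the exact optimum, 22; in general the guarantee only promises at least $(1 - \epsilon) \cdot \text{OPT}$.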

Why It Matters

The FPTAS shows that while 0/1 knapsack is NP-hard exactly, it is easy to approximate. You can get within any desired fraction of optimal in polynomial time. This is the gold standard for NP-hard problems. Not all NP-hard problems admit an FPTAS (unless P = NP).

Failure Mode

The $O(n^2 / \epsilon)$ runtime can still be large. For $\epsilon = 0.01$ and $n = 10^6$, you need $10^{14}$ operations. In practice, branch-and-bound methods are often faster for real-world instances.

Connection to Lagrangian Relaxation

The constrained optimization structure of knapsack connects directly to Lagrangian methods used throughout ML. The Lagrangian relaxation of the 0/1 knapsack is:

$$L(\lambda) = \max_{x \in \{0,1\}^n} \sum_{i=1}^n v_i x_i - \lambda\left(\sum_{i=1}^n w_i x_i - W\right)$$

For a given multiplier $\lambda \geq 0$, this decomposes into $n$ independent decisions: include item $i$ if $v_i - \lambda w_i > 0$, i.e., if the "value minus cost" is positive. The optimal $\lambda$ is found by solving the dual problem $\min_{\lambda \geq 0} L(\lambda)$.
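A small numerical sketch (the function names and the grid search over $\lambda$ are mine, not from the source): evaluate $L(\lambda)$ via the per-item decisions above, then minimize over a grid to obtain a dual upper bound on OPT.

```python
def lagrangian_value(weights, values, capacity, lam):
    """L(lambda): the relaxation decouples, so include item i iff v_i - lam*w_i > 0."""
    gain = sum(v - lam * w for w, v in zip(weights, values) if v - lam * w > 0)
    return gain + lam * capacity            # the -lambda*(sum w_i x_i - W) term

def dual_bound(weights, values, capacity, grid=1000):
    """Approximate min over lambda >= 0 of L(lambda) on a grid: an upper bound on OPT."""
    lam_hi = max(v / w for w, v in zip(weights, values))  # beyond this, no item is kept
    return min(lagrangian_value(weights, values, capacity, k * lam_hi / grid)
               for k in range(grid + 1))
```

On the instance $(w, v) = (1, 6), (2, 10), (3, 12)$ with $W = 5$, the dual bound is about 24 while the integer optimum is 22. The gap is an integrality gap (the fractional optimum is exactly 24): the dual bound is valid, by weak duality, but not tight.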

This Lagrangian perspective appears whenever you have a constrained optimization in ML: the multiplier $\lambda$ plays the same role as the regularization parameter in regularized ERM, the KL penalty coefficient in RLHF, or the constraint threshold in constrained optimization.

Common Confusions

Watch Out

Pseudo-polynomial is not polynomial

The $O(nW)$ DP looks polynomial, but $W$ is an integer whose representation requires only $O(\log W)$ bits. The runtime is therefore exponential in the size of the input encoding. If someone gives you a knapsack instance with $W = 2^{100}$, the DP is utterly infeasible despite being "$O(nW)$".

Watch Out

Greedy works for fractional but not 0/1

The greedy algorithm (sort by value-to-weight ratio) is optimal for fractional knapsack but can be arbitrarily bad for 0/1 knapsack without modification. The integrality constraint changes the problem from continuous optimization (easy) to combinatorial optimization (hard).

Watch Out

NP-hard does not mean practically impossible

The FPTAS gives near-optimal solutions in polynomial time. Branch-and-bound solves most practical instances quickly. NP-hardness is a worst-case complexity statement, not a practical impossibility statement. Real-world knapsack instances with millions of items are routinely solved.

Summary

  • 0/1 knapsack: NP-hard, solved by DP in $O(nW)$ pseudo-polynomial time
  • Fractional knapsack: greedy (sort by $v_i/w_i$) is optimal, $O(n \log n)$
  • FPTAS: $(1-\epsilon)$-approximate solution in $O(n^2/\epsilon)$ time
  • Pseudo-polynomial is not polynomial: $O(nW)$ depends on the magnitude of $W$
  • Lagrangian relaxation connects knapsack to regularized optimization in ML
  • The gap between fractional and 0/1 knapsack illustrates how integrality makes problems hard

Exercises

ExerciseCore

Problem

Solve the 0/1 knapsack instance: items $(w, v) = (2, 3), (3, 4), (4, 5), (5, 6)$ with capacity $W = 7$. Fill in the DP table and trace back the optimal solution.

ExerciseAdvanced

Problem

Prove that the greedy algorithm for 0/1 knapsack (take items in decreasing order of $v_i/w_i$ until one does not fit, then skip it) can achieve value as low as $\text{OPT}/2$ in the worst case. Give a concrete example.

ExerciseAdvanced

Problem

Explain why the knapsack FPTAS does not violate the fact that 0/1 knapsack is NP-hard. What is the distinction between exact and approximate solutions?

References

Canonical:

  • Korte & Vygen, Combinatorial Optimization, Chapter 17
  • Garey & Johnson, Computers and Intractability (1979)

Current:

  • Kellerer, Pferschy, Pisinger, Knapsack Problems (2004)
  • Williamson & Shmoys, The Design of Approximation Algorithms (2011), Chapter 3

Next Topics

The natural next steps from knapsack:

  • Lagrangian relaxation: the general framework for constrained optimization
  • Integer programming: the broader class of problems knapsack belongs to
  • Approximation algorithms: systematic techniques for NP-hard problems

Last reviewed: April 2026
