Interior-point Method#

Inequality Constrained Minimization Problems#

Interior-point methods are typically used to solve convex optimization problems that include inequality constraints:

(1)#\[\begin{align} \min & f_0(x) \\ \text{s.t.} & f_i(x) \le 0, i = 1, ..., m \\ & Ax = b \end{align}\]

where \(f_0, f_i\) are convex and twice continuously differentiable.

If the problem is strictly feasible, then by Slater's constraint qualification there exist dual optimal points \(\lambda^*, \nu^*\) that, together with \(x^*\), satisfy the KKT conditions:

\[\begin{gather*} Ax^* = b\\ f_i(x^*) \le 0, i = 1, ..., m\\ \lambda^* \ge 0 \\ \nabla f_0(x^*) + \sum_{i=1}^{m} \lambda_i^* \nabla f_i(x^*) + A^T\nu^* = 0 \\ \lambda_i^* f_i(x^*) = 0, i = 1, ..., m\\ \end{gather*}\]
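As a quick sanity check, the conditions can be verified numerically on a toy problem (chosen here for illustration, not taken from the text above): \(\min x^2\) subject to \(1 - x \le 0\), whose optimum is \(x^* = 1\) with multiplier \(\lambda^* = 2\):

```python
# Toy check of the KKT conditions for  min x^2  s.t.  1 - x <= 0
# (no equality constraint, so the A x = b block is vacuous).
x_star, lam_star = 1.0, 2.0

primal = 1.0 - x_star                        # f_1(x*) <= 0 (primal feasibility)
dual = lam_star                              # lambda* >= 0 (dual feasibility)
stationarity = 2*x_star + lam_star*(-1.0)    # grad f0 + lambda* grad f1 = 0
slackness = lam_star * primal                # lambda* f_1(x*) = 0
```

All four quantities come out as required, confirming \((x^*, \lambda^*)\) is primal-dual optimal for this toy instance.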

Solving the original optimization problem is equivalent to solving its KKT conditions. Interior-point methods apply Newton's method either to a modified version of the KKT conditions or to a sequence of equality constrained problems derived from the original inequality constrained problem.

Two types of methods can be used: the primal-dual interior-point method or the barrier method.

Barrier Function#

A straightforward way to solve the original problem is to approximately reformulate the inequality constrained problem as an equality constrained problem, to which Newton's method can be applied directly. The original problem can be rewritten as:

\[\begin{gather*} \min & f_0(x) + \sum_{i=1}^m I_{-}(f_i(x))\\ \text{s.t.} & Ax = b\\ \end{gather*}\]

where \(I_{-}\) is the indicator function of the nonpositive reals,

\[\begin{equation*} I_{-}(u) = \begin{cases} 0, & u \le 0\\ \infty, & u > 0 \end{cases} \end{equation*}\]

Logarithmic Barrier Function#

The basic idea of the barrier method is to approximate the indicator function by the logarithmic function:

(2)#\[\begin{equation} \hat I_{-}(u) = -(1/t) \log(-u) \end{equation}\]

where \(t>0\) is a parameter that sets the accuracy of the approximation. Unlike \(I_{-}\), \(\hat I_{-}\) is convex, nondecreasing, and differentiable. Substituting it yields the approximate problem:

\[\begin{gather*} \min & f_0(x) + \sum_{i=1}^m -(1/t)\log(-f_i(x))\\ \text{s.t.} & Ax = b\\ \end{gather*}\]
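A minimal sketch of the approximation quality (the helper names below are my own, chosen for illustration): at any strictly feasible point \(u < 0\), \(\hat I_{-}(u)\) converges to \(I_{-}(u) = 0\) as \(t\) grows.

```python
import math

def I_minus(u):
    # Indicator of the nonpositive reals: 0 when feasible, +inf otherwise.
    return 0.0 if u <= 0 else math.inf

def I_hat(u, t):
    # Log-barrier approximation -(1/t)*log(-u); only defined for u < 0.
    return -math.log(-u) / t if u < 0 else math.inf

# At the strictly feasible point u = -0.5, the error shrinks like 1/t.
errors = [abs(I_hat(-0.5, t) - I_minus(-0.5)) for t in (1, 10, 100, 1000)]
```

Note that for \(u \ge 0\) both functions are \(+\infty\); the barrier only softens the constraint from inside the feasible region.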

The function

(3)#\[\begin{equation} \phi(x) = -\sum_{i=1}^m \log(-f_i(x)) \end{equation}\]

is called the logarithmic barrier, whose gradient and Hessian are given by:

\[\begin{gather*} \nabla\phi(x) = \sum_{i=1}^m \frac{1}{-f_i(x)} \nabla f_i(x) \\ \nabla^2\phi(x) = \sum_{i=1}^m \frac{1}{f_i(x)^2} \nabla f_i(x) \nabla f_i(x)^T + \sum_{i=1}^m \frac{1}{-f_i(x)} \nabla^2 f_i(x) \\ \end{gather*}\]
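These formulas can be checked numerically. The sketch below assumes linear constraints \(f_i(x) = a_i^Tx - b_i\) (so each \(\nabla^2 f_i = 0\)) with random data chosen purely for illustration, and compares the analytic gradient against central finite differences:

```python
import numpy as np

def phi(x, A, b):
    s = b - A @ x                          # slacks -f_i(x), must be > 0
    return -np.sum(np.log(s))

def grad_phi(x, A, b):
    s = b - A @ x
    return A.T @ (1.0 / s)                 # sum_i (1/(-f_i)) * a_i

def hess_phi(x, A, b):
    s = b - A @ x
    return A.T @ np.diag(1.0 / s**2) @ A   # sum_i (1/f_i^2) a_i a_i^T

rng = np.random.default_rng(0)
A = rng.standard_normal((5, 3))
b = np.ones(5)                             # x0 = 0 is strictly feasible
x0 = np.zeros(3)

eps = 1e-6
fd = np.array([(phi(x0 + eps*e, A, b) - phi(x0 - eps*e, A, b)) / (2*eps)
               for e in np.eye(3)])        # finite-difference gradient
g_exact = grad_phi(x0, A, b)
H = hess_phi(x0, A, b)
```

The Hessian built this way is symmetric positive semidefinite, which is what makes Newton's method on the barrier objective well behaved.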

Central Path#

Multiplying the objective by \(t\) gives the following equivalent problem, which has the same minimizers:

\[\begin{gather*} \min & tf_0(x) + \phi(x)\\ \text{s.t.} & Ax = b\\ \end{gather*}\]

If the problem has a unique solution for each \(t>0\), then the solution depends on the choice of \(t\) and can be denoted by \(x^*(t)\). The points \(x^*(t)\) are called central points, and they trace out the central path.
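As an illustrative sketch (toy problem chosen here, not from the text): for \(\min x\) subject to \(-x \le 0\), the centering problem is \(\min\; t x - \log x\), whose central point is \(x^*(t) = 1/t\). A damped Newton iteration with a positivity safeguard recovers it and traces the central path toward the optimum \(x^\star = 0\):

```python
# Centering problem for  min x  s.t.  -x <= 0:  minimize  t*x - log(x).
def central_point(t, x=1.0, iters=100):
    for _ in range(iters):
        g = t - 1.0 / x            # gradient of t*x - log(x)
        h = 1.0 / (x * x)          # second derivative (always > 0)
        step = g / h
        s = 1.0
        while x - s * step <= 0:   # damp the step to stay strictly feasible
            s *= 0.5
        x -= s * step
    return x

# Central points for increasing t march toward the optimum x = 0.
path = [central_point(t) for t in (1.0, 10.0, 100.0, 1000.0)]
```

Here the path can be written down analytically (\(x^*(t) = 1/t\)), which makes it easy to confirm the numerical central points.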

Modified KKT Conditions#

The modified KKT conditions for the original problem with the barrier function approximation are as follows; they are also called the centrality conditions in this context:

\[\begin{gather*} Ax^* = b\\ f_i(x^*) \le 0, i = 1, ..., m\\ \lambda^* \ge 0 \\ \nabla f_0(x^*) + \sum_{i=1}^{m} \lambda_i^* \nabla f_i(x^*) + A^T\nu^* = 0 \\ -\lambda_i^* f_i(x^*) = 1/t, i = 1, ..., m\\ \end{gather*}\]

The difference between the KKT conditions and the modified KKT conditions is that the complementary slackness condition \(-\lambda_i^* f_i(x^*) = 0\) is replaced by the condition \(-\lambda_i^* f_i(x^*) = 1/t\). In particular, for large \(t\), \(x^*(t)\) and the associated dual points \(\lambda^*(t), \nu^*(t)\) almost satisfy the KKT optimality conditions.
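A worked consequence, on the same style of toy problem (\(\min x\) subject to \(-x \le 0\), so \(p^\star = 0\) and \(m = 1\)): the modified complementarity condition pins down \(\lambda^*(t)\), and \(x^*(t)\) turns out to be exactly \(m/t\)-suboptimal, which quantifies "almost satisfy":

```python
# Central point of  min x  s.t.  -x <= 0  is x*(t) = 1/t (analytic).
# Modified complementarity -lam * f_1(x*) = 1/t with f_1(x) = -x
# forces lam*(t) = 1 for every t, which is dual feasible.
gaps = []
for t in (10.0, 100.0, 1000.0):
    x_t = 1.0 / t                      # central point
    lam_t = (1.0 / t) / x_t            # solves -lam * (-x_t) = 1/t
    gaps.append(x_t - 0.0)             # f0(x*(t)) - p*, equals m/t here
```

So each central point comes with a certificate of suboptimality: following the central path with growing \(t\) drives the gap \(m/t\) to zero.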

Primal-dual Interior-point Method#

The primal-dual interior-point method solves the modified KKT conditions using Newton's method, finding a solution triple \((x^*, \lambda^*, \nu^*)\).

Newton's method updates the solution at each iteration by a Newton step.

We define a residual function from the modified KKT conditions as:

(4)#\[\begin{gather} r_t(x, \lambda, \nu) = \begin{bmatrix} \nabla f_0(x) + Df(x)^T\lambda + A^T \nu \\ -\operatorname{diag}(\lambda)f(x) - (1/t)\mathbf{1} \\ Ax-b \end{bmatrix} \end{gather}\]

where

(5)#\[\begin{gather} f(x) = \begin{bmatrix} f_1(x) \\ \vdots \\ f_m(x) \end{bmatrix}, \quad Df(x) = \begin{bmatrix} \nabla f_1(x)^T \\ \vdots \\ \nabla f_m(x)^T \end{bmatrix} \end{gather}\]

The first block of the residual function is the dual residual, the middle block is the centrality residual, and the last block is the primal residual.

Newton's method can be used to find \((x^*, \lambda^*, \nu^*)\) such that the residual function is 0.

Writing \(y = (x, \lambda, \nu)\), the Newton step \(\Delta y\) is obtained by linearizing \(r_t(y + \Delta y) \approx r_t(y) + Dr_t(y)\Delta y = 0\), which gives:

(6)#\[\begin{equation} \Delta y = -Dr_t(y)^{-1}\, r_t(y) \end{equation}\]
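Putting the pieces together, here is a minimal sketch of the primal-dual Newton iteration for a fixed \(t\), on a toy QP chosen for illustration (a full implementation would also increase \(t\) between centering steps and backtrack on the residual norm):

```python
import numpy as np

# Toy QP: min x1^2 + x2^2  s.t.  x1 + x2 = 1,  -x1 <= 0,  -x2 <= 0.
# Here f(x) = -x, so Df(x) = -I and each f_i is linear (zero Hessian).
A = np.array([[1.0, 1.0]])
b = np.array([1.0])
n, m, p = 2, 2, 1
t = 100.0

def residual(x, lam, nu):
    r_dual = 2 * x - lam + A.T @ nu        # grad f0 + Df(x)^T lam + A^T nu
    r_cent = lam * x - np.ones(m) / t      # -diag(lam) f(x) - (1/t) 1
    r_pri = A @ x - b
    return np.concatenate([r_dual, r_cent, r_pri])

def jacobian(x, lam):
    # Block Jacobian Dr_t(y), differentiating the three residual blocks
    # with respect to (x, lam, nu).
    J = np.zeros((n + m + p, n + m + p))
    J[:n, :n] = 2 * np.eye(n)              # Hessian of the Lagrangian
    J[:n, n:n + m] = -np.eye(m)            # Df(x)^T
    J[:n, n + m:] = A.T
    J[n:n + m, :n] = np.diag(lam)          # -diag(lam) Df(x)
    J[n:n + m, n:n + m] = np.diag(x)       # -diag(f(x))
    J[n + m:, :n] = A
    return J

# Strictly feasible start; damped Newton keeps x > 0 and lam > 0.
x, lam, nu = np.array([0.3, 0.7]), np.ones(m), np.zeros(p)
for _ in range(50):
    dy = np.linalg.solve(jacobian(x, lam), -residual(x, lam, nu))
    dx, dlam, dnu = dy[:n], dy[n:n + m], dy[n + m:]
    s = 1.0
    while np.any(x + s * dx <= 0) or np.any(lam + s * dlam <= 0):
        s *= 0.5
    x, lam, nu = x + s * dx, lam + s * dlam, nu + s * dnu
```

For this problem the modified KKT conditions can be solved by hand (by symmetry, \(x^*(t) = (0.5, 0.5)\) and \(\lambda_i^*(t) = 2/t\)), which gives an easy check on the iterates.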