KKT systems

KKT systems are linear systems with a special KKT structure. MadNLP uses the abstraction AbstractKKTSystem to represent the KKT system internally. The AbstractKKTSystem fulfills two goals:

  1. Store the values of the Hessian of the Lagrangian and of the Jacobian.
  2. Assemble the corresponding KKT matrix $K$.

A brief look at the math

We have seen previously that MadNLP reformulates the inequality constraints as equality constraints, by introducing slack variables. In what follows, we denote by $w = (x, s)$ the primal iterate concatenating the original decision variable $x$ and the slack variable $s$. Once reformulated, the problem has the following structure:

\[ \begin{aligned} \min_{w} \; & f(w) \; , \\ \text{subject to} \quad & c(w) = 0 \; , \quad w \geq 0 \; . \end{aligned}\]

At each iteration, MadNLP aims to solve the following system of nonlinear equations, parameterized by a positive barrier parameter $\mu$:

\[\begin{aligned} & \nabla f(w) + \nabla c(w)^\top y - z = 0 \; , \\ & c(w) = 0 \; , \\ & WZe = \mu e \;, \; (w, z) > 0 \; , \end{aligned}\]

where $W = \operatorname{diag}(w)$, $Z = \operatorname{diag}(z)$, and $e$ denotes the vector of all ones.

MadNLP solves this system using Newton's method, and computes a descent direction $(\Delta w, \Delta y, \Delta z)$ as the solution of

\[\begin{bmatrix} H_k +\delta_x I & J_k^\top & -I \\ J_k & -\delta_y I & 0 \\ Z_k & 0 & W_k \end{bmatrix} \begin{bmatrix} \Delta w \\ \Delta y \\ \Delta z \end{bmatrix} = - \begin{bmatrix} \nabla f(w_k) + \nabla c(w_k)^\top y_k - z_k \\ c(w_k) \\ W_k Z_k e - \mu e \end{bmatrix} \; ,\]

with $\delta_x$ and $\delta_y$ appropriate primal and dual regularization terms, chosen so that $(\Delta w, \Delta y, \Delta z)$ is a descent direction with respect to the current filter. Assembling the system requires evaluating the Jacobian of the constraints $J_k = \nabla c(w_k)$ and the Hessian of the Lagrangian $H_k = \nabla_{ww}^2 L(w_k, y_k, z_k)$.
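For reference, the Lagrangian of the reformulated problem, whose second derivative $H_k$ appears above, is (with the sign convention used here)

\[ L(w, y, z) = f(w) + y^\top c(w) - z^\top w \; , \]

so that the stationarity residual $\nabla f(w_k) + \nabla c(w_k)^\top y_k - z_k$ in the first block of the right-hand side is exactly $\nabla_w L(w_k, y_k, z_k)$.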

We denote by $(r_1, r_2, r_3)$ the right-hand side of this system. The primal-dual KKT system can be symmetrized as

\[\begin{bmatrix} H_k +\delta_x I & J_k^\top & Z_k^{1/2} \\ J_k & -\delta_y I & 0 \\ Z_k^{1/2} & 0 & -W_k \end{bmatrix} \begin{bmatrix} \Delta w \\ \Delta y \\ -Z_k^{-1/2} \Delta z \end{bmatrix} = \begin{bmatrix} r_1 \\ r_2 \\ Z_k^{-1/2} r_3 \end{bmatrix}\]

This system is implemented as an AbstractUnreducedKKTSystem in MadNLP.

We can obtain a smaller augmented KKT system by eliminating the last block row, which gives $\Delta z = W_k^{-1} (r_3 - Z_k \Delta w)$:

\[\begin{bmatrix} H_k + \Sigma_k + \delta_x I & J_k^\top \\ J_k & -\delta_y I \end{bmatrix} \begin{bmatrix} \Delta w \\ \Delta y \end{bmatrix} = \begin{bmatrix} r_1 + W_k^{-1} r_3 \\ r_2 \end{bmatrix}\]

with $\Sigma_k = W_k^{-1} Z_k$. This system is implemented as an AbstractReducedKKTSystem in MadNLP.
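For completeness, the elimination proceeds as follows: the third block row of the primal-dual system gives

\[ Z_k \Delta w + W_k \Delta z = r_3 \quad \Longrightarrow \quad \Delta z = W_k^{-1} (r_3 - Z_k \Delta w) \; , \]

and substituting this expression into the first block row yields

\[ (H_k + \delta_x I) \Delta w + J_k^\top \Delta y - \Delta z = r_1 \quad \Longrightarrow \quad (H_k + \Sigma_k + \delta_x I) \Delta w + J_k^\top \Delta y = r_1 + W_k^{-1} r_3 \; . \]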

Assembling a KKT system, step by step

The primal-dual KKT systems depend on the Hessian of the Lagrangian $H_k$ and the Jacobian $J_k$. Hence, we have to update the values in the KKT system at each iteration of the interior-point algorithm.

By default, MadNLP stores the KKT system as a SparseKKTSystem. The KKT system takes as input a SparseCallback wrapping a given NLPModel nlp. We instantiate the callback cb with the function create_callback:

cb = MadNLP.create_callback(
    MadNLP.SparseCallback,
    nlp,
)
MadNLP.SparseCallback{Float64, Vector{Float64}, Vector{Int64}, MadNLPTests.HS15Model{Float64}, MadNLP.NoFixedVariables, MadNLP.EnforceEquality}(MadNLPTests.HS15Model{Float64}
  Problem name: Generic
   All variables: ████████████████████ 2      All constraints: ████████████████████ 2     
            free: ██████████⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅ 1                 free: ⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅ 0     
           lower: ⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅ 0                lower: ████████████████████ 2     
           upper: ██████████⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅ 1                upper: ⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅ 0     
         low/upp: ⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅ 0              low/upp: ⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅ 0     
           fixed: ⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅ 0                fixed: ⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅ 0     
          infeas: ⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅ 0               infeas: ⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅ 0     
            nnzh: (  0.00% sparsity)   3               linear: ⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅ 0     
                                                    nonlinear: ████████████████████ 2     
                                                         nnzj: (  0.00% sparsity)   4     
                                                     lin_nnzj: (------% sparsity)         
                                                     nln_nnzj: (  0.00% sparsity)   4     

  Counters:
             obj: ⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅ 0                 grad: ⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅ 0                 cons: ⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅ 0     
        cons_lin: ⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅ 0             cons_nln: ⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅ 0                 jcon: ⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅ 0     
           jgrad: ⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅ 0                  jac: ⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅ 0              jac_lin: ⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅ 0     
         jac_nln: ⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅ 0                jprod: ⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅ 0            jprod_lin: ⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅ 0     
       jprod_nln: ⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅ 0               jtprod: ⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅ 0           jtprod_lin: ⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅ 0     
      jtprod_nln: ⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅ 0                 hess: ⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅ 0                hprod: ⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅ 0     
           jhess: ⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅ 0               jhprod: ⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅⋅ 0     
, 2, 2, 4, 3, [0.0, 0.0], [0.0, 0.0, 0.0, 0.0], [0.0, 0.0], [0.0, 0.0, 0.0], [1, 1, 2, 2], [1, 2, 1, 2], [1, 2, 2], [1, 1, 2], Base.RefValue{Float64}(1.0), 1.0, [1.0, 1.0], [1.0, 1.0, 1.0, 1.0], MadNLP.NoFixedVariables(), MadNLP.EnforceEquality(), Int64[], [1, 2], Int64[], [3, 4], [1], Int64[], [1])

Initializing a KKT system

The size of the KKT system depends directly on the problem's characteristics (number of variables, number of equality and inequality constraints). A SparseKKTSystem stores the Hessian and the Jacobian in sparse (COO) format. The KKT matrix can be factorized using either a dense or a sparse linear solver. Here we use the solver provided by LAPACK:

linear_solver = LapackCPUSolver
LapackCPUSolver

We can instantiate a SparseKKTSystem using the function create_kkt_system:

kkt = MadNLP.create_kkt_system(
    MadNLP.SparseKKTSystem,
    cb,
    linear_solver,
)
MadNLP.SparseKKTSystem{Float64, Vector{Float64}, SparseArrays.SparseMatrixCSC{Float64, Int32}, MadNLP.ExactHessian{Float64, Vector{Float64}}, LapackCPUSolver{Float64, SparseArrays.SparseMatrixCSC{Float64, Int32}}, Vector{Int64}, Vector{Int32}}([0.0, 0.0, 0.0], [0.0, 0.0, 0.0, 0.0], [0.0, 0.0, 0.0, 0.0, 0.0, 0.0], MadNLP.ExactHessian{Float64, Vector{Float64}}(), [2.0e-323, 6.34436211249205e-310, 6.3443652045335e-310, 6.3443652045999e-310], [0.0, 0.0, 0.0, 0.0], [0.0, 0.0], [6.34437148191976e-310, 6.34437148191976e-310], [6.3442752171701e-310], [6.34427528331247e-310, 6.34430136495017e-310], [6.34427522052977e-310], [0.0 0.0 … 0.0 0.0; 0.0 0.0 … 0.0 0.0; … ; 0.0 0.0 … 0.0 0.0; 0.0 0.0 … 0.0 0.0], sparse(Int32[1, 2, 5, 6, 2, 5, 6, 3, 5, 4, 6, 5, 6], Int32[1, 1, 1, 1, 2, 2, 2, 3, 3, 4, 4, 5, 6], [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0, 10.0, 11.0, 12.0, 13.0], 6, 6), [1, 5, 8, 10, 1, 2, 5, 3, 6, 4, 7, 9, 11, 12, 13], [0.0 0.0 0.0 0.0; 0.0 0.0 0.0 0.0; 0.0 0.0 0.0 0.0; 0.0 0.0 0.0 0.0], sparse(Int32[1, 2, 2], Int32[1, 1, 2], [1.0, 2.0, 3.0], 4, 4), [1, 2, 3], [0.0 0.0 0.0 0.0; 0.0 0.0 0.0 0.0], sparse(Int32[1, 2, 1, 2, 1, 2], Int32[1, 1, 2, 2, 3, 4], [1.0, 2.0, 3.0, 4.0, 5.0, 6.0], 2, 4), [1, 3, 2, 4, 5, 6], LapackCPUSolver{Float64, SparseArrays.SparseMatrixCSC{Float64, Int32}}(sparse(Int32[1, 2, 5, 6, 2, 5, 6, 3, 5, 4, 6, 5, 6], Int32[1, 1, 1, 1, 2, 2, 2, 3, 3, 4, 4, 5, 6], [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0, 10.0, 11.0, 12.0, 13.0], 6, 6), [6.3443751916674e-310 6.34430040933314e-310 … 6.3443651934443e-310 NaN; 6.3443713372534e-310 6.34437195755003e-310 … 2.5e-323 6.34437325290557e-310; … ; 6.3443004186113e-310 6.34436900948325e-310 … 6.3443651934443e-310 6.34436900948325e-310; 5.0e-324 6.3443732524518e-310 … 2.5e-323 6.3443721236952e-310], 6, Float64[], Float64[], Float64[], [384.0, 0.0, 3.466909703e-315, 6.3443794058355e-310, 0.0, 1.51e-321, 5.0e-324, 0.0, 0.0, 3.466909703e-315  …  0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 8.0e-323, 3.88e-321], 384, 
[128411514206256], -1, Base.RefValue{Int64}(0), [1, 1, 0, 0, 128411514810656, 128411589880704], MadNLP.LapackOptions(MadNLP.BUNCHKAUFMAN), MadNLP.MadNLPLogger(MadNLP.INFO, MadNLP.INFO, nothing)), [1, 2], [3, 4], [1])

Once the KKT system is built, it has to be initialized before it can be used inside the interior-point algorithm:

MadNLP.initialize!(kkt);
3-element Vector{Float64}:
 0.0
 0.0
 0.0

The user can query the KKT matrix inside kkt, simply as

kkt_matrix = MadNLP.get_kkt(kkt)
6×6 SparseArrays.SparseMatrixCSC{Float64, Int32} with 13 stored entries:
 1.0   ⋅    ⋅     ⋅     ⋅     ⋅ 
 2.0  5.0   ⋅     ⋅     ⋅     ⋅ 
  ⋅    ⋅   8.0    ⋅     ⋅     ⋅ 
  ⋅    ⋅    ⋅   10.0    ⋅     ⋅ 
 3.0  6.0  9.0    ⋅   12.0    ⋅ 
 4.0  7.0   ⋅   11.0    ⋅   13.0

This returns a reference to the KKT matrix stored internally inside kkt. Each time the matrix is assembled inside kkt, kkt_matrix is updated automatically.

Updating a KKT system

Suppose now that we want to refresh the values stored in the KKT system.

Updating the values of the Hessian

The Hessian part of the KKT system can be queried as

hess_values = MadNLP.get_hessian(kkt)
3-element Vector{Float64}:
 0.0
 0.0
 0.0

For a SparseKKTSystem, hess_values is a Vector{Float64} storing the nonzero values of the Hessian. Then, one can update the vector hess_values by using NLPModels.jl:

n = NLPModels.get_nvar(nlp)
m = NLPModels.get_ncon(nlp)
x = NLPModels.get_x0(nlp) # primal variables
l = zeros(m) # dual variables

NLPModels.hess_coord!(nlp, x, l, hess_values)
3-element Vector{Float64}:
   2.0
   0.0
 200.0

Finally, a post-processing step is applied to refresh all the values stored internally:

MadNLP.compress_hessian!(kkt)
Note

By default, the function compress_hessian! does nothing. But it can be required for very specific use cases, for instance when a Schur complement matrix is built internally.

Updating the values of the Jacobian

We proceed in exactly the same way to update the values in the Jacobian. One queries the Jacobian values stored in the KKT system as

jac_values = MadNLP.get_jacobian(kkt)
4-element Vector{Float64}:
 0.0
 0.0
 0.0
 0.0

We can refresh the values with NLPModels as

NLPModels.jac_coord!(nlp, x, jac_values)
4-element Vector{Float64}:
 0.0
 0.0
 1.0
 0.0

We then apply a post-processing step:

MadNLP.compress_jacobian!(kkt)

Updating the values of the diagonal matrices

Once the Hessian and the Jacobian have been updated, the algorithm can apply primal and dual regularization terms to the diagonal of the KKT system, to improve the numerical behavior of the linear solver. This operation is implemented by the regularize_diagonal! function:

pr_value = 1.0
du_value = 0.0

MadNLP.regularize_diagonal!(kkt, pr_value, du_value)
2-element Vector{Float64}:
 0.0
 0.0

Assembling the KKT matrix

Once the values have been updated, one can assemble the resulting KKT matrix. This translates to

MadNLP.build_kkt!(kkt)

By doing so, the values stored inside kkt will be transferred to the KKT matrix kkt_matrix (as returned by the function get_kkt):

kkt_matrix
6×6 SparseArrays.SparseMatrixCSC{Float64, Int32} with 13 stored entries:
 4.0     ⋅     ⋅     ⋅    ⋅    ⋅ 
 0.0  202.0    ⋅     ⋅    ⋅    ⋅ 
  ⋅      ⋅    2.0    ⋅    ⋅    ⋅ 
  ⋅      ⋅     ⋅    2.0   ⋅    ⋅ 
 0.0    0.0  -1.0    ⋅   0.0   ⋅ 
 1.0    0.0    ⋅   -1.0   ⋅   0.0

Internally, a SparseKKTSystem stores the KKT matrix in sparse COO format. When build_kkt! is called, the COO matrix is transferred to a SparseMatrixCSC if the linear solver is sparse, or to a dense Matrix if the linear solver is dense.
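As a rough sketch of what this transfer involves (illustrative only, not MadNLP's actual internals), a COO matrix is three parallel arrays of row indices, column indices, and values, and the SparseArrays standard library can assemble them into CSC form:

```julia
using SparseArrays

# Illustrative COO representation: three parallel arrays.
Is = [1, 2, 5, 6, 2]           # row indices
Js = [1, 1, 1, 1, 2]           # column indices
Vs = [1.0, 2.0, 3.0, 4.0, 5.0] # nonzero values

# `sparse` assembles the triplets into a SparseMatrixCSC;
# duplicate (i, j) pairs, if any, are summed.
K = sparse(Is, Js, Vs, 6, 6)
```

MadNLP additionally keeps index mappings between the COO entries and the assembled nonzeros (visible as the integer vectors in the kkt output shown earlier), so subsequent assemblies presumably reduce to copying values rather than re-sorting triplets.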

Note

As the KKT matrix is symmetric, the KKT system stores only its lower-triangular part.
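If a full symmetric operator is needed, the stored lower triangle can be wrapped without copying. A small self-contained sketch on a toy matrix (not kkt_matrix itself):

```julia
using LinearAlgebra, SparseArrays

# Toy lower-triangular storage of a symmetric matrix.
L = sparse([1, 2, 2, 3, 3], [1, 1, 2, 1, 3],
           [4.0, 2.0, 5.0, 1.0, 8.0], 3, 3)

# A Symmetric view mirrors the lower triangle: S[i, j] == S[j, i].
S = Symmetric(L, :L)
```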

Solution of the KKT system

Now that the KKT system is assembled in a matrix $K$ (here stored in kkt_matrix), we want to solve a linear system $K x = b$, for instance to compute the next descent direction. To do so, we use the linear solver stored internally inside kkt (here an instance of LapackCPUSolver).

We start by factorizing the KKT matrix $K$:

MadNLP.factorize!(kkt.linear_solver)
LapackCPUSolver{Float64, SparseArrays.SparseMatrixCSC{Float64, Int32}}(sparse(Int32[1, 2, 5, 6, 2, 5, 6, 3, 5, 4, 6, 5, 6], Int32[1, 1, 1, 1, 2, 2, 2, 3, 3, 4, 4, 5, 6], [4.0, 0.0, 0.0, 1.0, 202.0, 0.0, 0.0, 2.0, -1.0, 2.0, -1.0, 0.0, 0.0], 6, 6), [4.0 0.0 … 0.0 0.0; 0.0 202.0 … 0.0 0.0; … ; 0.0 0.0 … -0.5 0.0; 0.25 0.0 … -0.0 -0.75], 6, Float64[], Float64[], Float64[], [384.0, 0.0, 3.466909703e-315, 6.3443794058355e-310, 0.0, 1.51e-321, 5.0e-324, 0.0, 0.0, 3.466909703e-315  …  0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 8.0e-323, 3.88e-321], 384, [128411514206256], -1, Base.RefValue{Int64}(0), [1, 2, 3, 4, 5, 6], MadNLP.LapackOptions(MadNLP.BUNCHKAUFMAN), MadNLP.MadNLPLogger(MadNLP.INFO, MadNLP.INFO, nothing))

By default, MadNLP uses an LBLᵀ (Bunch-Kaufman) factorization to decompose the symmetric indefinite KKT matrix.
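The same factorize-then-backsolve pattern can be illustrated with the Bunch-Kaufman factorization from Julia's LinearAlgebra standard library, here on a toy symmetric indefinite matrix (not MadNLP's own solver):

```julia
using LinearAlgebra

# Toy symmetric indefinite matrix, with the saddle-point structure
# typical of an augmented KKT matrix.
K = [4.0  0.0  1.0;
     0.0  2.0 -1.0;
     1.0 -1.0  0.0]
b = ones(3)

# LBL^T (Bunch-Kaufman) factorization of the lower triangle,
# followed by a backsolve.
F = bunchkaufman(Symmetric(K, :L))
x = F \ b

# The residual should vanish up to rounding errors.
residual = norm(K * x - b)
```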

Once the KKT matrix has been factorized, we can compute the solution of the linear system with a backsolve. The solve function takes as input an AbstractKKTVector, an object used to perform algebraic manipulations with an AbstractKKTSystem. We start by instantiating two UnreducedKKTVector objects (encoding respectively the right-hand side and the solution):

b = MadNLP.UnreducedKKTVector(kkt)
fill!(MadNLP.full(b), 1.0)
x = copy(b)
MadNLP.UnreducedKKTVector{Float64, Vector{Float64}, Vector{Int64}}([1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0], [1.0, 1.0, 1.0, 1.0, 1.0, 1.0], [1.0, 1.0, 1.0, 1.0], [1.0, 1.0], [1.0], [1.0, 1.0], [1.0, 1.0], [1.0])

The right-hand side is a vector of ones:

MadNLP.full(b)
9-element Vector{Float64}:
 1.0
 1.0
 1.0
 1.0
 1.0
 1.0
 1.0
 1.0
 1.0

We solve the system $K x = b$ using the solve_kkt! function:

MadNLP.solve_kkt!(kkt, x)
MadNLP.full(x)
9-element Vector{Float64}:
  0.3333333333333333
  0.0049504950495049506
 -1.0
 -0.6666666666666666
 -2.0
 -1.3333333333333333
 -1.0
 -1.0
  1.0

We verify that the solution is correct by multiplying it on the left by the KKT matrix, using mul!:

mul!(b, kkt, x) # overwrite b!
MadNLP.full(b)
9-element Vector{Float64}:
 1.0
 1.0
 1.0
 1.0
 1.0
 1.0
 1.0
 1.0
 1.0

We recover a vector filled with ones, which was the initial right-hand side.