In a previous post, we introduced the method of Lagrange multipliers to find local minima or local maxima of a function with equality constraints. The same method can be applied to those with inequality constraints as well.
In this tutorial, you will discover the method of Lagrange multipliers applied to find the local minimum or maximum of a function when inequality constraints are present, optionally together with equality constraints.
After completing this tutorial, you will know:
- How to find points of local maximum or minimum of a function with inequality constraints
- Method of Lagrange multipliers with inequality constraints
Let’s get started.
Prerequisites
For this tutorial, we assume that you already have reviewed:
- Derivative of functions
- Function of several variables, partial derivatives and gradient vectors
- A gentle introduction to optimization
- Gradient descent
You can review these concepts by clicking on the links above.
Constrained Optimization and Lagrangians
Extending from our previous post, a constrained optimization problem can be generally stated as:

min f(X) subject to g_i(X) = 0 for i = 1, …, m, and h_j(X) ≤ 0 for j = 1, …, n

where X is the vector of decision variables, f(X) is the objective function to minimize, the g_i(X) are the equality constraints, and the h_j(X) are the inequality constraints.
The equality constraints are easy to handle but the inequality constraints are not. Therefore, one way to make the problem easier to tackle is to convert the inequalities into equalities by introducing slack variables:

h_j(X) + s_j² = 0

When h_j(X) is negative, adding a certain positive quantity to it will make it equal to zero, and vice versa. That quantity is the slack variable; writing it as the square s_j² guarantees that the added quantity is non-negative.
With the slack variables introduced, we can use the Lagrange multipliers approach to solve the problem, in which the Lagrangian is defined as:

L(X, λ, θ) = f(X) + Σ_i λ_i g_i(X) + Σ_j θ_j (h_j(X) + s_j²)

where λ_i and θ_j are the Lagrange multipliers.
It is useful to know that, for the optimal solution, θ_j = 0 whenever the inequality constraint is not active (h_j(X) < 0), and θ_j may be non-zero only when the constraint is active (h_j(X) = 0).
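As a minimal sketch of this setup, we can let sympy solve the stationarity conditions of such a Lagrangian on a toy problem of our own choosing (not one from this post): minimize x² + y² subject to x + y = 1 and x ≥ 0.6, with the inequality rewritten as −x + 0.6 + s² = 0 via a slack variable:

```python
import sympy as sp

# Symbols are declared real so complex slack-variable solutions are discarded
x, y, lam, th, s = sp.symbols("x y lam theta s", real=True)

# Toy problem: minimize x^2 + y^2 subject to x + y = 1 and x >= 0.6.
# The inequality x >= 0.6 becomes -x + 3/5 + s^2 = 0 with slack variable s.
f = x**2 + y**2
L = f + lam * (x + y - 1) + th * (-x + sp.Rational(3, 5) + s**2)

# Stationarity: every partial derivative of the Lagrangian is zero
eqs = [sp.diff(L, v) for v in (x, y, lam, th, s)]
sols = sp.solve(eqs, (x, y, lam, th, s), dict=True)
for sol in sols:
    print(sol[x], sol[y], sol[th])
```

Here the unconstrained minimum (x = y = 1/2) violates x ≥ 0.6, so the only real solution has s = 0: the constraint is active, x = 3/5, and the multiplier θ = 2/5 is non-zero, exactly as the active/inactive rule above predicts.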
The Complementary Slackness Condition
The reason we need to know whether a constraint is active or not is because of the Karush-Kuhn-Tucker (KKT) conditions. Precisely, the KKT conditions describe what must hold at an optimal solution:
- The gradient of the Lagrangian function is zero
- All constraints are satisfied
- The inequality constraints satisfy the complementary slackness condition
The most important of them is the complementary slackness condition. While we learned that an optimization problem with equality constraints can be solved using Lagrange multipliers, with the gradient of the Lagrangian being zero at the optimal solution, the complementary slackness condition extends this to the case of inequality constraints by saying that, at the optimal solution, either the multiplier θ_j is zero or the corresponding constraint is active (h_j(X) = 0); in other words, θ_j h_j(X) = 0 for every j.
The complementary slackness condition helps us explore the different cases in solving the optimization problem. It is best explained with an example.
Example 1: Mean-variance portfolio optimization
This is an example from finance. Suppose we have 1 dollar to put into two different investments, whose returns are modeled as a bivariate Gaussian distribution. How much should we invest in each to minimize the overall variance of the return?
This optimization problem, also known as Markowitz mean-variance portfolio optimization, is formulated as:

min f(w1, w2) = w1²·σ1² + w2²·σ2² + 2·w1·w2·σ12
subject to w1 + w2 = 1, w1 ≥ 0, w2 ≥ 0
in which the last two constraints bound the weight of each investment to between 0 and 1 dollar. Writing the Lagrangian with slack variables s1 and s2 for the two inequality constraints,

L(w1, w2, λ, θ1, θ2) = w1²·σ1² + w2²·σ2² + 2·w1·w2·σ12 + λ(w1 + w2 − 1) + θ1(−w1 + s1²) + θ2(−w2 + s2²)

we have the gradients:

∂L/∂w1 = 2·w1·σ1² + 2·w2·σ12 + λ − θ1
∂L/∂w2 = 2·w2·σ2² + 2·w1·σ12 + λ − θ2
∂L/∂λ = w1 + w2 − 1
∂L/∂θ1 = −w1 + s1²
∂L/∂θ2 = −w2 + s2²
∂L/∂s1 = 2·θ1·s1
∂L/∂s2 = 2·θ2·s2
From this point onward, the complementary slackness condition has to be considered. We have two slack variables, s1 and s2, with corresponding multipliers θ1 and θ2, which gives four cases:

- Case 1: θ1 = 0 and θ2 = 0 (both inequality constraints inactive)
- Case 2: θ1 ≠ 0 but θ2 = 0 (so s1 = 0, i.e., w1 = 0)
- Case 3: θ1 = 0 but θ2 ≠ 0 (so s2 = 0, i.e., w2 = 0)
- Case 4: θ1 ≠ 0 and θ2 ≠ 0 (so w1 = w2 = 0)
For case 1, using θ1 = θ2 = 0 and setting the gradients with respect to w1, w2, and λ to zero, we solve the resulting system, from which we get w1 = (σ2² − σ12)/(σ1² + σ2² − 2·σ12) and w2 = 1 − w1, provided this w1 lies within [0, 1].
For case 2, with θ1 ≠ 0 we have s1 = 0 and hence w1 = 0, so the equality constraint gives w2 = 1.
For case 3, with θ2 ≠ 0 we similarly get w2 = 0 and w1 = 1.
For case 4, we get w1 = w2 = 0, which violates the equality constraint w1 + w2 = 1, so this case is infeasible.
Comparing the objective function from case 2 and case 3, we see that the value from case 2 is lower. Hence that is taken as our solution to the optimization problem, with the optimal solution attained at w1 = 0, w2 = 1.
As an exercise, you can retry the above with different values for the variances and covariance and see how the solution changes.
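As a cross-check, we can hand the same constrained problem to scipy.optimize.minimize, which supports bounds and equality constraints directly. The covariance numbers below are our own illustrative choice, not values from the text:

```python
from scipy.optimize import minimize

# Illustrative covariance parameters (our own assumption):
# variances of the two investments and their covariance
s1sq, s2sq, s12 = 0.25, 0.10, 0.15

def variance(w):
    """Portfolio variance f(w1, w2) = w1^2 s1^2 + w2^2 s2^2 + 2 w1 w2 s12."""
    w1, w2 = w
    return w1**2 * s1sq + w2**2 * s2sq + 2 * w1 * w2 * s12

res = minimize(
    variance,
    x0=[0.5, 0.5],                # start from an even split
    method="SLSQP",               # handles bounds + equality constraints
    bounds=[(0, 1), (0, 1)],      # 0 <= w_i <= 1
    constraints=[{"type": "eq", "fun": lambda w: w[0] + w[1] - 1}],
)
print(res.x, res.fun)
```

For these particular numbers the minimizer sits on the boundary w1 = 0, i.e., one of the inequality constraints is active, which is exactly the situation the complementary slackness case analysis is designed to uncover.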
Example 2: Water-filling algorithm
This is an example from communication engineering. Suppose we have a set of channels (say, wireless bandwidths) in which the noise power of channel i is n_i; when we put signal power p_i into channel i, its capacity is log2(1 + p_i/n_i).
Assume we are using a battery that can give only 1 watt of power, and this power has to be distributed across the channels, with power p_i going to channel i. The objective is to maximize the total capacity, Σ_i log2(1 + p_i/n_i).
For convenience of differentiation, we notice that log2 x = log x / log 2, and dividing the objective by the constant log 2 does not change where its maximum is attained; hence we can work with natural logarithms instead.
Assume we have 3 channels, with noise powers n1, n2, and n3.
We have three inequality constraints here, p1 ≥ 0, p2 ≥ 0, and p3 ≥ 0, together with the equality constraint p1 + p2 + p3 = 1. The Lagrangian function is defined as:

L(p1, p2, p3, λ, θ1, θ2, θ3) = Σ_i log(1 + p_i/n_i) + λ(p1 + p2 + p3 − 1) + Σ_i θ_i(−p_i + s_i²)

The gradient is therefore:

∂L/∂p_i = 1/(n_i + p_i) + λ − θ_i
∂L/∂λ = p1 + p2 + p3 − 1
∂L/∂θ_i = −p_i + s_i²
∂L/∂s_i = 2·θ_i·s_i
But now we have 3 slack variables, and we have to consider 8 cases:

- Case 1: θ1 = θ2 = θ3 = 0, hence none of p1, p2, p3 is forced to zero
- Case 2: θ1 ≠ 0 but θ2 = θ3 = 0, hence only p1 = 0
- Case 3: θ2 ≠ 0 but θ1 = θ3 = 0, hence only p2 = 0
- Case 4: θ3 ≠ 0 but θ1 = θ2 = 0, hence only p3 = 0
- Case 5: θ2 and θ3 are non-zero but θ1 = 0, hence only p1 is non-zero
- Case 6: θ1 and θ3 are non-zero but θ2 = 0, hence only p2 is non-zero
- Case 7: θ1 and θ2 are non-zero but θ3 = 0, hence only p3 is non-zero
- Case 8: all of θ1, θ2, θ3 are non-zero, hence p1 = p2 = p3 = 0
Immediately we can tell case 8 is infeasible, since p1 = p2 = p3 = 0 contradicts the equality constraint p1 + p2 + p3 = 1.
For case 1, we have θ1 = θ2 = θ3 = 0, and from ∂L/∂p_i = 0 we get 1/(n1 + p1) = 1/(n2 + p2) = 1/(n3 + p3) = −λ; that is, each n_i + p_i reaches the same "water level" −1/λ, which we can solve for using p1 + p2 + p3 = 1.
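Solving these case-1 conditions by hand is mechanical, so as a sketch we can let sympy do it. The noise levels below are our own assumed values, chosen only for illustration:

```python
import sympy as sp

p1, p2, p3, lam = sp.symbols("p1 p2 p3 lam", real=True)
n = [sp.Rational(1, 10), sp.Rational(1, 5), sp.Rational(3, 10)]  # assumed noise levels

# Case 1: theta_1 = theta_2 = theta_3 = 0, so stationarity reduces to
# 1/(n_i + p_i) + lam = 0 for each channel, plus the total power budget.
eqs = [1 / (ni + pi) + lam for ni, pi in zip(n, (p1, p2, p3))]
eqs.append(p1 + p2 + p3 - 1)

sol = sp.solve(eqs, (p1, p2, p3, lam), dict=True)[0]
print(sol[p1], sol[p2], sol[p3])  # each n_i + p_i equals the same water level
```

With these noise levels the common level −1/λ works out to 8/15, and every channel gets a positive power, so case 1 is feasible here.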
For case 2, we have p1 = 0, and from ∂L/∂p2 = ∂L/∂p3 = 0 we get 1/(n2 + p2) = 1/(n3 + p3) = −λ, which together with p2 + p3 = 1 gives the split between channels 2 and 3.
Similarly in case 3, p2 = 0 and we split the power between channels 1 and 3.
In case 4, we have p3 = 0 and split the power between channels 1 and 2.
In case 5, we have p2 = p3 = 0, hence p1 = 1.
Similarly in case 6 and case 7, we have p2 = 1 and p3 = 1, respectively.
Comparing all these cases, we find that the maximum value of the objective function is attained in case 1. Hence the solution to this optimization problem is the allocation from case 1, in which every channel receives a positive share of the power.
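The case-1 solution, in which the budget is spread so that every n_i + p_i reaches a common level, is exactly the water-filling algorithm of this section's title. A minimal numeric sketch (again with assumed noise levels, not values from the text) finds that level by bisection:

```python
import numpy as np

# Assumed noise levels for the three channels (illustrative values)
noise = np.array([0.1, 0.2, 0.3])
total_power = 1.0

def water_fill(noise, total_power):
    """Find the water level L with p_i = max(L - n_i, 0) and sum(p_i) = budget."""
    lo, hi = noise.min(), noise.max() + total_power
    for _ in range(100):  # bisection on the water level
        level = (lo + hi) / 2
        if np.maximum(level - noise, 0.0).sum() > total_power:
            hi = level
        else:
            lo = level
    return np.maximum((lo + hi) / 2 - noise, 0.0)

p = water_fill(noise, total_power)
print(p, p.sum())
```

The max(·, 0) in the allocation is the complementary slackness condition at work: a channel whose noise sits above the water level gets p_i = 0, which corresponds to one of the boundary cases above being active.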
Extensions and Further Reading
While in the above examples we introduced the slack variables into the Lagrangian function, some books may prefer not to add the slack variables but to constrain the Lagrange multipliers of the inequality constraints to be non-negative. In that case you may see the Lagrangian function written as

L(X, λ, θ) = f(X) + Σ_i λ_i g_i(X) + Σ_j θ_j h_j(X)

but with the requirement that θ_j ≥ 0 for every j.
The Lagrangian function is also useful in the primal-dual approach for finding the maximum or minimum. This is particularly helpful if the objective or constraints are non-linear, in which case the solution may not be easily found otherwise.
Some books that cover this topic are:
- Convex Optimization by Stephen Boyd and Lieven Vandenberghe, 2004
- Chapter 4 of Deep Learning by Ian Goodfellow et al., 2016
Summary
In this tutorial, you discovered how the method of Lagrange multipliers can be applied to inequality constraints. Specifically, you learned:
- Lagrange multipliers and the Lagrange function in the presence of inequality constraints
- How to use KKT conditions to solve an optimization problem when inequality constraints are given