Lecture 9 Constrained Optimization
Text References: Course notes pp. 37-52 & Rogawski 14.8, 15.1-15.2
9.1 Recap
Last time, we learned how to find and classify the critical points of a function \(f(x,y)\).
Exercise 9.1 Find and classify the critical points of \(f(x,y)=(x^2+y^2)e^{-x}\).
Solution. To find the critical points, we calculate the gradient of the function and set it equal to zero: \[\begin{align*} \nabla f (x,y) &= (e^{-x}(2x-x^2-y^2), 2ye^{-x})=(0,0) \\ & \implies e^{-x}(2x-x^2-y^2)=0 \quad\mbox{(1)}\quad \mbox{and} \quad 2ye^{-x}=0\quad\mbox{(2)} \end{align*}\] From \((2)\), we get \(y=0\). Substituting into \((1)\), we find \(e^{-x}(2x-x^2)=0\) which has solutions \(x=0\) and \(x=2\).
Therefore, the critical points are \((0,0)\) and \((2,0)\).
Next, we use the Second Derivative Test to classify the points. We have \[Hf(x,y)=\begin{bmatrix}e^{-x}(2-4x+x^2+y^2) & -2ye^{-x}\\-2ye^{-x} & 2e^{-x}\end{bmatrix}\]
Evaluating at \((0,0)\) and taking the determinant, we find \((2)(2)-0^2=4 >0\). Since \(f_{xx}=2>0\), we have a local minimum.
Evaluating at \((2,0)\) and taking the determinant, we find \((-2e^{-2})(2e^{-2})-0^2=-4e^{-4}<0\), so we have a saddle point.
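The classification above can be double-checked numerically. The following is a minimal sketch using only the Python standard library; the helper names (`grad_f`, `hessian_f`, `fd_hessian`, `classify`) are illustrative choices, not part of the course notes. It evaluates the analytic second derivatives \(f_{xx}=e^{-x}(2-4x+x^2+y^2)\), \(f_{xy}=-2ye^{-x}\), \(f_{yy}=2e^{-x}\) and compares them against central finite differences.

```python
import math

def f(x, y):
    return (x**2 + y**2) * math.exp(-x)

def grad_f(x, y):
    # Analytic gradient from the solution to Exercise 9.1.
    e = math.exp(-x)
    return (e * (2 * x - x**2 - y**2), 2 * y * e)

def hessian_f(x, y):
    # Analytic second partials, returned as (f_xx, f_xy, f_yy).
    e = math.exp(-x)
    return (e * (2 - 4 * x + x**2 + y**2), -2 * y * e, 2 * e)

def fd_hessian(x, y, h=1e-4):
    # Central finite-difference estimates of (f_xx, f_xy, f_yy),
    # used only as an independent check of the formulas above.
    fxx = (f(x + h, y) - 2 * f(x, y) + f(x - h, y)) / h**2
    fyy = (f(x, y + h) - 2 * f(x, y) + f(x, y - h)) / h**2
    fxy = (f(x + h, y + h) - f(x + h, y - h)
           - f(x - h, y + h) + f(x - h, y - h)) / (4 * h**2)
    return (fxx, fxy, fyy)

def classify(x, y):
    # Second Derivative Test: sign of det(Hf), then sign of f_xx.
    fxx, fxy, fyy = hessian_f(x, y)
    det = fxx * fyy - fxy**2
    if det > 0:
        return "local minimum" if fxx > 0 else "local maximum"
    if det < 0:
        return "saddle point"
    return "inconclusive"

for point in [(0.0, 0.0), (2.0, 0.0)]:
    print(point, "gradient:", grad_f(*point), "->", classify(*point))
```

The finite-difference step `h=1e-4` is an arbitrary choice that is accurate enough for a sanity check; it is not meant as a general-purpose differentiation routine.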
9.2 Learning Objectives
- Given a function of several variables, use the Method of Lagrange to find its local extrema subject to a constraint \(g(x,y)=K\).
9.3 Method of Lagrange
It’s common to want to find the local extrema of a function \(f(x,y)\) subject to a constraint. The most general way to express such a constraint is by defining a curve \(g(x,y)=K\). Let’s think for a moment about what we’re after. When we were finding the extrema of a function without constraints, we were looking for points \((a,b)\) where \(\nabla f (a,b)=(0,0)\). Now that we have constraints, it’s possible that these points \((a,b)\) are not along the constraint curve, so we need to come up with a different idea.
What we’re looking for are points along the constraint curve where infinitesimal movements along the curve don’t result in either an increase or a decrease in the value of the function. This is the type of behaviour we expect from the extrema of a function: at a max, for example, there’s no direction along the curve in which we can move to increase the function.
Another way to say this is that we want points \((c,d)\) at which the directional derivative of \(f\) in the direction \(\vec{u}\) tangent to the constraint curve is equal to zero, i.e. \(D_{\vec{u}}f(c,d)=\nabla f(c,d)\cdot\vec{u} =0\). Since this dot product is zero, the tangent vector \(\vec{u}\) is orthogonal to \(\nabla f(c,d)\).
Let’s make one last observation: if we think of the constraint curve as being a particular level curve of some function \(g(x,y)\), then \(\nabla g\) will always be orthogonal to the constraint curve, and hence to the tangent direction \(\vec{u}\). This means that when \(\vec{u}\) is orthogonal to \(\nabla f\), both \(\nabla f\) and \(\nabla g\) are orthogonal to the same vector \(\vec{u}\) in the plane, so \(\nabla f\) and \(\nabla g\) are parallel, that is, \(\nabla f = \lambda \nabla g\) for some \(\lambda \in\mathbb{R}\) (provided \(\nabla g\neq\vec{0}\)).
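As a quick illustration (a hypothetical example chosen here, not one taken from the course notes), take \(f(x,y)=x+y\) with the constraint \(g(x,y)=x^2+y^2=1\). At the point \(\left(\tfrac{1}{\sqrt{2}},\tfrac{1}{\sqrt{2}}\right)\), a unit tangent to the circle is \(\vec{u}=\left(-\tfrac{1}{\sqrt{2}},\tfrac{1}{\sqrt{2}}\right)\), and we can check both orthogonality conditions directly: \[\begin{align*} \nabla f &= (1,1), & \nabla f\cdot\vec{u} &= -\tfrac{1}{\sqrt{2}}+\tfrac{1}{\sqrt{2}}=0,\\ \nabla g &= (2x,2y)=(\sqrt{2},\sqrt{2}), & \nabla g\cdot\vec{u} &= -1+1=0. \end{align*}\] Both gradients are orthogonal to \(\vec{u}\), and indeed \(\nabla f=\lambda\nabla g\) with \(\lambda=\tfrac{1}{\sqrt{2}}\).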
An interactive example is shown in this applet. The level curves of a function are given along with a constraint curve. Use the checkboxes to display \(\nabla f\) and \(\nabla g\). Observe the points at which the direction of the tangent at \(A\) is orthogonal to \(\nabla f\): how is \(\nabla g\) behaving at these points?
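We have one final thing to consider: What if \(\nabla g=\vec{0}\) at some point on the constraint curve? This means that the function \(g(x,y)\) has a critical point on the constraint curve, and the argument above breaks down there because \(\nabla g\) no longer determines a direction orthogonal to the curve. We need to check the value of \(f\) at any such point separately, since we can’t be sure whether it is an extremum.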
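To see why this case can’t be skipped, here is a small example (an illustrative assumption, not from the course notes): minimize \(f(x,y)=x\) subject to \(g(x,y)=y^2-x^3=0\). The condition \(\nabla f=\lambda\nabla g\) together with the constraint gives \[\begin{align*} 1 &= -3\lambda x^2 \quad \mbox{(1)}\\ 0 &= 2\lambda y \quad \mbox{(2)}\\ y^2 &= x^3 \quad \mbox{(3)} \end{align*}\] Equation (1) forces \(\lambda\neq 0\), so (2) gives \(y=0\), and then (3) gives \(x=0\), which contradicts (1). The system has no solutions. However, \(\nabla g(x,y)=(-3x^2,2y)\) vanishes at \((0,0)\), which lies on the curve, and since \(x^3=y^2\geq 0\) forces \(x\geq 0\) along the curve, the minimum value \(f=0\) is attained exactly at that point.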
Putting all of what we’ve gathered together, we have the Method of Lagrange:
Theorem 9.1 To find the critical points of \(f(x,y)\) subject to a constraint \(g(x,y)=K\) for some constant \(K\), we must find the values of \(x\) and \(y\) for which:
- \(\nabla f(x,y)=\lambda \nabla g(x,y)\) and \(g(x,y)=K\) for some constant \(\lambda\); or
- \(\nabla g(x,y)=(0,0)\) and \(g(x,y)=K\)
A few notes on applying the Method of Lagrange:
- If the constraint is not given in the form \(g(x,y)=K\), we must rewrite it in that form
- The proportionality constant \(\lambda\) is called a Lagrange multiplier and can be interpreted as the rate of change of \(f\) with respect to changes in the constraint value \(K\). In other words, \(\lambda\) is approximately the additional value of \(f\) that is obtained by relaxing the constraint by \(1\) unit.
- We can extend this method to functions of more than two variables and for problems with multiple constraints.
- This method is used to locate critical points, and not to classify them. In order to classify the critical points, we must evaluate the function at those points and compare values. Depending on the type of constraint curve, we distinguish two cases:
- If the constraint curve is unbounded (i.e. it extends to infinity and has no endpoints), then we also need to consider the limits of \(f(x,y)\) in the two directions along the curve to determine whether we have any extrema.
- If the constraint curve has endpoints, we must also calculate the values of \(f(x,y)\) at those endpoints.
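The interpretation of \(\lambda\) as a rate of change, mentioned in the notes above, can be justified with a short sketch. Assume (this is not stated in the notes) that the constrained extremum \((x^*(K),y^*(K))\) varies differentiably with \(K\), and write \(f^*(K)=f(x^*(K),y^*(K))\), a notation introduced here for convenience. By the chain rule and the condition \(\nabla f=\lambda\nabla g\), \[\begin{align*} \frac{df^*}{dK} &= \nabla f(x^*,y^*)\cdot\left(\frac{dx^*}{dK},\frac{dy^*}{dK}\right)\\ &= \lambda\,\nabla g(x^*,y^*)\cdot\left(\frac{dx^*}{dK},\frac{dy^*}{dK}\right)\\ &= \lambda\,\frac{d}{dK}\,g(x^*(K),y^*(K)) = \lambda\,\frac{dK}{dK}=\lambda. \end{align*}\] The last line uses the fact that \(g(x^*(K),y^*(K))=K\) for every \(K\).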
Exercise 9.2 Find the maximum value of \(6x+4y-7\) subject to \(3x^2+y^2=28\).
Solution. We have \(f(x,y)=6x+4y-7\) and so \(\nabla f (x,y)= (6, 4)\). We also have \(g(x,y)=3x^2+y^2\) and so \(\nabla g(x,y) = (6x, 2y)\).
- We find the points at which \(\nabla f(x,y)=\lambda \nabla g(x,y)\) and \(g(x,y)=28\). This gives us the following system: \[\begin{align*} 6 &= 6 \lambda x \quad \mbox{(1)}\\ 4 &= 2 \lambda y \quad \mbox{(2)}\\ 3x^2+y^2 &= 28 \quad \mbox{(3)}\\ \end{align*}\]
Equation (1) tells us that \(x\neq 0\), so we can solve for \(\lambda\) to get \(\lambda=\frac{1}{x}\).
Substituting \(\lambda=\frac{1}{x}\) into equation (2), we find \(y=2x\); substituting that into equation (3) and solving for \(x\) gives \(x=\pm2\). When \(x=2\), we get \(y=4\); when \(x=-2\), we get \(y=-4\).
Therefore the critical points are \((2,4)\) and \((-2,-4)\).
- We check the points at which \(\nabla g = (0,0)\) and \(g(x,y)=28\).
We have \(\nabla g(x,y)=(6x, 2y)\), which is equal to \((0,0)\) when \(x=y=0\); however, \(g(0,0)\neq 28\) so this point does not satisfy the constraint. Therefore, there are no critical points in this step.
Finally, we must evaluate \(f(x,y)\) at the critical points found above: \[f(2,4)=21 \quad \mbox{and} \quad f(-2,-4)=-35\] Therefore, the maximum value of \(f(x,y)\) on the constraint curve is \(21\) and occurs at the point \((2,4)\).
This interactive applet shows the level curves of \(f(x,y)\) along with the constraint curve. Click and drag the point along the constraint curve to determine at which points \(f\) reaches its maximum or its minimum on the curve. Observe what happens to the gradients of \(f\) and \(g\) at these points.
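The solution to Exercise 9.2 can also be checked numerically. The sketch below uses only the Python standard library; the function names and the parametrization \(x=\sqrt{K/3}\cos t\), \(y=\sqrt{K}\sin t\) of the ellipse are choices made here for illustration, not part of the course notes. It confirms that both candidate points satisfy the Lagrange system, scans the ellipse for the largest value of \(f\), and estimates how the maximum changes with \(K\), to compare with the multiplier \(\lambda=\frac{1}{x}=\frac{1}{2}\) at \((2,4)\).

```python
import math

K = 28.0  # constraint value from Exercise 9.2

def f(x, y):
    return 6 * x + 4 * y - 7

def g(x, y):
    return 3 * x**2 + y**2

def lagrange_residuals(x, y, lam, k=K):
    # Residuals of grad f = lam * grad g and g = k.
    # All three are zero at a solution of the Lagrange system.
    return (6 - lam * 6 * x, 4 - lam * 2 * y, g(x, y) - k)

def max_on_ellipse(k, samples=100_000):
    # Brute-force maximum of f over sample points on 3x^2 + y^2 = k,
    # using x = sqrt(k/3) cos t, y = sqrt(k) sin t (an assumed parametrization).
    a, b = math.sqrt(k / 3), math.sqrt(k)
    best = -math.inf
    for i in range(samples):
        t = 2 * math.pi * i / samples
        best = max(best, f(a * math.cos(t), b * math.sin(t)))
    return best

# Candidate points from the solution, with lambda = 1/x from equation (1).
for x, y in [(2.0, 4.0), (-2.0, -4.0)]:
    lam = 1 / x
    print((x, y), "lambda =", lam,
          "residuals =", lagrange_residuals(x, y, lam), "f =", f(x, y))

print("scanned maximum on the ellipse:", max_on_ellipse(K))

# Central-difference estimate of d(max f)/dK, to compare with lambda at (2, 4).
h = 0.01
sensitivity = (max_on_ellipse(K + h) - max_on_ellipse(K - h)) / (2 * h)
print("estimated d(max f)/dK:", sensitivity, "lambda at (2, 4):", 1 / 2.0)
```

The scan is only a sanity check on a bounded curve; the sample count and the step `h` are arbitrary and would need adjusting for other constraints.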