$\begingroup$

Why is gradient the direction of steepest ascent?

In this question and its many answers, people argue that the gradient is the direction of steepest ascent by taking an arbitrary unit vector and then saying that its dot product with the gradient is maximized when the two point in the same direction. But how does this prove that the gradient is the direction of steepest ascent?

And it's not just me; the comments there also ask the person who answered how the heck his argument proves it. So, why is it so?

Further, in the book Electricity and Magnetism by Purcell (end of page 64 to start of page 65), he talks about the gradient of a function that depends only on the distance from the origin, i.e. a radial function $f(r)$, and argues that the shortest step we can make to change $f(r)$ to $f(r+dr)$ is to move in the radial direction. OK, this somewhat makes sense to me, but how do I extend this intuition to regular derivatives, and then use it to understand the answers I mentioned above?

$\endgroup$

2 Answers

$\begingroup$

We can first ask what is meant by the direction of "steepest ascent" for a function $f(\vec{x})$ around $\vec{x}_0$. What we usually mean is the direction of a vector $\vec{\delta}$ such that a minute (infinitesimal) step in that direction gives the greatest increase of the function, with the length of the step held fixed. You can think of the following process as one that converges to the direction of steepest ascent:

  1. First start with all unit vectors $\vec{\delta}$, and ask: of all the possible $\vec{\delta}$, which one maximizes $f(\vec{x}_0 + \vec{\delta})$? Let the answer be $\vec{\delta}_1$.

  2. Then reduce the length of $\vec{\delta}$ to, say, $0.1$ and ask the same question: of all such possible $\vec{\delta}$, which one maximizes $f(\vec{x}_0 + \vec{\delta})$? Let the answer be $\vec{\delta}_2$.

  3. Continue reducing the length of $\vec{\delta}$, fixing it at numbers arbitrarily close (but not equal) to zero, and ask the same question each time, thereby constructing an infinite sequence $$\vec{\delta}_1, \vec{\delta}_2, \vec{\delta}_3, \cdots$$ For differentiable functions with a nonzero gradient at $\vec{x}_0$, the direction in which $\vec{\delta}_i$ points approaches a well-defined limit as $i \to \infty$!

The limit of this infinite process can then be thought of as the direction of steepest ascent. It maximizes the increase of the function per unit distance traveled from the point $\vec{x}_0$, in the limit where the distance traveled is very small.
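
As a rough numerical illustration of the process above, here is a minimal sketch in Python. The sample function `f`, the base point `(1.0, 0.5)`, the step lengths, and the helper `best_direction` are all assumptions chosen for illustration; none of them come from the question. For each step length it searches a grid of directions for the one that maximizes $f(\vec{x}_0 + \vec{\delta})$ and reports the angle between that direction and the gradient direction.

```python
import math

def f(x, y):
    # Hypothetical sample function, chosen only for illustration.
    return x**2 + 3.0 * y + math.sin(x * y)

def grad_f(x, y):
    # Analytic gradient of the sample function above.
    return (2.0 * x + y * math.cos(x * y), 3.0 + x * math.cos(x * y))

def best_direction(x0, y0, step, n_angles=3600):
    """Unit vector (cos t, sin t) that maximizes f at distance `step` from (x0, y0),
    found by brute force over a grid of n_angles directions."""
    best_t = max(
        (2.0 * math.pi * k / n_angles for k in range(n_angles)),
        key=lambda t: f(x0 + step * math.cos(t), y0 + step * math.sin(t)),
    )
    return (math.cos(best_t), math.sin(best_t))

x0, y0 = 1.0, 0.5
gx, gy = grad_f(x0, y0)
norm = math.hypot(gx, gy)
grad_dir = (gx / norm, gy / norm)  # unit vector along the gradient

angle_errors = []
for step in (1.0, 0.1, 0.01, 0.001):
    dx, dy = best_direction(x0, y0, step)
    # Angle between the maximizing direction at this step length and the gradient.
    cos_angle = max(-1.0, min(1.0, dx * grad_dir[0] + dy * grad_dir[1]))
    angle_errors.append(math.acos(cos_angle))
    print(f"step={step:<6} best=({dx:+.4f}, {dy:+.4f})  "
          f"angle to gradient={angle_errors[-1]:.4f} rad")
```

For the smallest step, the reported angle is limited mainly by the resolution of the direction grid, so it should be read as "close to zero" rather than as an exact limit.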

To find exactly what direction $\vec{\delta}$ approaches, we can use the idea behind defining the derivative, which is to provide a linear approximation of a function around a point. For a multivariate function $f(\vec{x})$ that has a gradient $\nabla f$, moving an infinitesimal amount $\vec{\delta}$ from $\vec{x}_0$ to $\vec{x}_0 + \vec{\delta}$ changes the value of the function from $f(\vec{x}_0)$ to $f(\vec{x}_0) + \nabla f(\vec{x}_0) \cdot \vec{\delta}$. There will in general be corrections of higher order in $|\vec{\delta}|$ (typically of order $|\vec{\delta}|^2$ and beyond), but they are negligible compared with the linear term in the limit as $\vec{\delta} \to \vec{0}$. That's because every differentiable function can be approximated as linear in the neighborhood of any point in its domain.

The problem then becomes to find $\vec{\delta}$ such that $\nabla f(\vec{x}_0) \cdot \vec{\delta}$ is maximized (since this is the increase $f(\vec{x}_0 + \vec{\delta}) - f(\vec{x}_0)$ that is recorded). But remember there's a caveat: the length of all $\vec{\delta}$ must be held fixed. Because of this, finding the vector $\vec{\delta}$ that maximizes $\nabla f(\vec{x}_0) \cdot \vec{\delta}$ is equivalent to finding the unit vector $\hat{\delta}$ that maximizes $\nabla f(\vec{x}_0) \cdot \hat{\delta}$. From there you can check that, provided $\nabla f(\vec{x}_0) \neq \vec{0}$, choosing $\hat{\delta}$ to be the unit vector in the direction of $\nabla f (\vec{x}_0)$ maximizes this quantity.
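
One way to carry out that last check, sketched here under the assumption that $\nabla f(\vec{x}_0) \neq \vec{0}$, is the geometric form of the dot product (equivalently, the Cauchy-Schwarz inequality). The symbol $\theta$, for the angle between $\nabla f(\vec{x}_0)$ and $\hat{\delta}$, is introduced only for this sketch: $$\nabla f(\vec{x}_0) \cdot \hat{\delta} = |\nabla f(\vec{x}_0)|\,|\hat{\delta}| \cos\theta = |\nabla f(\vec{x}_0)| \cos\theta \le |\nabla f(\vec{x}_0)|,$$ with equality exactly when $\cos\theta = 1$, that is, when $$\hat{\delta} = \frac{\nabla f(\vec{x}_0)}{|\nabla f(\vec{x}_0)|}.$$ If the gradient vanishes at $\vec{x}_0$, the linear term is zero for every $\hat{\delta}$ and this first-order argument does not single out any direction.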

$\endgroup$
  • $\begingroup$ Really nice answer using an iterative limit. +1 from me $\endgroup$
    – K.defaoite
    Commented Jul 9, 2020 at 21:02
  • $\begingroup$ this answer changes everything $\endgroup$ Commented Jul 9, 2020 at 21:48
  • $\begingroup$ I wish there was a continuation to the last part: "choosing $\hat{\delta}$ to be the unit vector in the direction of $\nabla f (\vec{x}_0)$ maximizes this quantity" - why? $\endgroup$
    – HeyJude
    Commented Oct 14, 2023 at 23:19
$\begingroup$

Given a differentiable function $f$ of several variables and a point $P,$ recall that the directional derivative of $f$ at the point $P$ in the direction of a unit vector $\mathbf u$ is defined to be $D_\mathbf u f(P) = \nabla f_P \cdot \mathbf u,$ where $\cdot$ is the usual dot product and $\nabla f_P$ is the gradient of $f$ evaluated at the point $P.$ By the geometric interpretation of the dot product, we have that $D_\mathbf u f(P) = ||\nabla f_P|| \cos \theta,$ where $\theta$ is the angle between $\nabla f_P$ and $\mathbf u.$

Consequently, assuming $\nabla f_P \neq \mathbf 0,$ the directional derivative $D_\mathbf u f(P)$ is maximized exactly when $\cos \theta = 1.$ But this says that the angle between $\nabla f_P$ and $\mathbf u$ is $\theta = 0,$ i.e., $\nabla f_P$ and $\mathbf u$ point in the same direction.

Crucially, given a surface $z = f(x, y),$ the directional derivative of $f$ at the point $P$ in the direction of a unit vector $\mathbf u$ can also be written as $D_\mathbf u f(P) = \tan \psi,$ where $\psi$ is the angle of inclination of the surface $z = f(x, y)$ above the point $P$ in the direction $\mathbf u,$ i.e., the angle its tangent line in that direction makes with the horizontal. Under this interpretation, maximizing $D_\mathbf u f(P)$ also maximizes the angle of inclination, since $\tan$ is increasing on $(-\pi/2, \pi/2)$: as $\mathbf u = \frac{\nabla f_P}{||\nabla f_P||}$ maximizes $D_\mathbf u f(P),$ it maximizes the steepness of ascent.
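
To make the $\tan \psi$ interpretation concrete, here is a small Python sketch. The surface $f(x, y) = x^2 + xy$, the point $(1, 2)$, and the helper `inclination` are hypothetical choices made only for this example. It computes $\psi = \arctan\big(D_\mathbf u f(P)\big)$ over a grid of unit vectors $\mathbf u = (\cos t, \sin t)$ and finds the direction with the largest inclination.

```python
import math

def grad_f(x, y):
    # Gradient of the hypothetical sample surface z = f(x, y) = x**2 + x*y.
    return (2.0 * x + y, x)

def inclination(px, py, t):
    """Angle psi (radians) at which the surface rises when leaving (px, py)
    along the unit vector u = (cos t, sin t)."""
    gx, gy = grad_f(px, py)
    directional_derivative = gx * math.cos(t) + gy * math.sin(t)  # grad f_P . u
    return math.atan(directional_derivative)  # because D_u f(P) = tan(psi)

px, py = 1.0, 2.0
gx, gy = grad_f(px, py)
grad_angle = math.atan2(gy, gx)  # polar angle of the gradient direction

n = 3600
angles = [2.0 * math.pi * k / n for k in range(n)]
steepest_t = max(angles, key=lambda t: inclination(px, py, t))
max_psi = inclination(px, py, grad_angle)

print(f"steepest direction on the grid: t = {steepest_t:.4f} rad")
print(f"gradient direction:             t = {grad_angle:.4f} rad")
print(f"tan(max psi) = {math.tan(max_psi):.4f}, |grad f| = {math.hypot(gx, gy):.4f}")
```

Up to the grid spacing, the steepest direction found by the search matches the gradient direction, and the tangent of the largest inclination matches $||\nabla f_P||.$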

$\endgroup$
  • $\begingroup$ The directional derivative relies on $\nabla$ in the definition though... This argument seems rather circular. $\endgroup$
    – K.defaoite
    Commented Jul 9, 2020 at 21:03
  • $\begingroup$ Yes, the directional derivative is defined in terms of the gradient. What is the problem? $\endgroup$ Commented Jul 9, 2020 at 21:04
