34

I often see in physics that we multiply infinitesimals in order to use the chain rule. For example,

$$ \frac{dv}{dt} = \frac{dv}{dx} \cdot v(t)$$

But what bothers me about this is that it raises some serious existence questions for me: when we take the derivative of the velocity $v$ with respect to distance, that means we can write velocity as a function of distance. But how do we know that this is always possible? When we do these multiplications of differentials, we are implicitly assuming that $v$ can be changed from a function of time into a function of displacement.

I see this used ubiquitously, and I've seen crazier variations that literally swap differentials, like $dv \frac{dm}{dt} = dm \frac{dv}{dt}$, as shown in the answer by user "Fakemod" in this stack post.

9
  • 52
    Welcome to life as a physicist :-) Commented Aug 13, 2020 at 8:46
  • 11
    @JohnRennie Mathematicians hate this one weird trick!
    – Mark H
    Commented Aug 14, 2020 at 19:37
  • 1
    I usually see quantities like $\frac{dv}{dx}$ in the context of continuum mechanics, where $v(x)$ is a velocity field and $x$ is interpreted as indexing a specific particle in the continuum ("material coordinates") or specifying a location in space ("spatial coordinates"). The $v \cdot \nabla v$ nonlinear term in the Navier-Stokes equations is a famous instance of this. The answers below seem to be hung up on questions of differentiability, but I think the real question is about the physical meaning of $x$ in such derivatives.
    – jnez71
    Commented Dec 11, 2020 at 17:26
  • 1
    To some extent it does. I guess what it's missing is saying that, especially in continuum / field theories, $v(x,t)$ means "what is the velocity of the particle that happens to currently be at position $x$" (if $x$ is spatial coordinates) or "what is now the velocity of the particle that was at $x$ initially" (when $x$ is material coordinates). Thus $\frac{dv}{dx}(x,t)$ means "how different right now is the velocity of the particle at $x$ (spatial) or named $x$ (material) from its infinitesimally close neighbor?"
    – jnez71
    Commented Dec 11, 2020 at 18:18
  • 1
    $\frac{dv}{dx}$ isn't some imprecise idea that physics textbooks use without checking the validity of. Nor does it need the implicit function theorem to define. It is a perfectly fine mathematical construct when $v(x)$ is a velocity field. For discrete / finite particle systems, there is no concept of an "infinitesimally close neighbor", so you won't see $\frac{dv}{dx}$ in those contexts. You will see it when describing flow though! Remember, $\frac{dm}{dt}$ is not the rate-of-change of a single particle's mass (Newtonian particles don't change mass). It is a flow of particles through a volume.
    – jnez71
    Commented Dec 11, 2020 at 18:26

7 Answers

20

You are correct that you cannot (globally) write velocity as a function of distance. For example, as one commenter has already mentioned, throw a ball directly up in the air and wait for it to come down. When the ball is at height $h$ on the way up, it has a positive (upward directed) velocity. When it is at the same height $h$ on the way down, it has a negative (downward directed) velocity. So velocity is definitely not a (global) function of distance.

But this much is true: For any height $h$ except for the maximum height the ball ever reaches, there is some open interval around $h$ --- some range of heights from $h-\epsilon$ to $h+\epsilon$ --- in which you can treat velocity as a well-defined function of height while the ball is on its way up, and another well-defined function of height while the ball is on its way back down. And moreover that function is differentiable and obeys the chain rule. All of this is part of the content of the implicit function theorem, which you can google for.

If you just write velocity as a function of height, you do have to be careful to make it clear from context which of the two functions --- the "on the way up" function and the "on the way down" function --- you're referring to. You also have to make sure you don't try to pull this stunt when the ball is at the very top of its trajectory (or more generally, at points where its velocity is zero). Many books take it for granted that you're being careful about this, so they don't have to worry about it on your behalf.
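To make the two local branches concrete, here is a small sympy sketch (my own illustration, with the usual constant-gravity assumptions; $g$ and $v_0$ are just symbols) for a ball thrown straight up. Each branch of the inverted trajectory gives a perfectly good differentiable $v(h)$, and its derivative blows up only at the apex, where $v=0$:

```python
import sympy as sp

t = sp.symbols('t')
h, g, v0 = sp.symbols('h g v0', positive=True)

# Ball thrown straight up: position and velocity as functions of time
x = v0*t - sp.Rational(1, 2)*g*t**2
v = sp.diff(x, t)                        # v(t) = v0 - g*t

# Invert x(t) = h locally: one branch on the way up, one on the way down
disc = sp.sqrt(v0**2 - 2*g*h)            # real only below the apex h = v0**2/(2*g)
t_up = (v0 - disc)/g                     # earlier crossing, ball moving upward
t_down = (v0 + disc)/g                   # later crossing, ball moving downward

v_up = sp.simplify(v.subs(t, t_up))      #  +sqrt(v0**2 - 2*g*h)
v_down = sp.simplify(v.subs(t, t_down))  #  -sqrt(v0**2 - 2*g*h)

# dv/dh is well defined on each branch; it diverges only at the apex (v = 0)
print(sp.simplify(sp.diff(v_up, h)))     # -g/sqrt(v0**2 - 2*g*h), i.e. -g/v on this branch
print(sp.simplify(sp.diff(v_down, h)))   #  g/sqrt(v0**2 - 2*g*h), i.e. -g/v on this branch
```

On both branches the result is $dv/dh = -g/v$, which is exactly what "dividing" $dv/dt = -g$ by $dh/dt = v$ would have given.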

15

Well, this is one of the most common things mathematicians make fun of physicists for: we cancel differentials without a second thought, and we "NEVER" check whether a rule may legitimately be applied to our equations. The thing is, almost all functions that appear in nature or in real-life systems are, most of the time, continuous and differentiable. There are, for sure, some special cases, but for most simple tasks, e.g. in mechanics, this is quite valid.

So consider the case of $v$. In order to define a velocity, the object has to change its position over some amount of time, and furthermore we don't have infinite speeds in real life. This implies that $dx/dt$ always has some finite value. From this it follows that $v$ can be rewritten as a function of either $t$ or $x$.

I am not sure whether there are special cases or not, but for physicists it is not important, because in 99.9% of cases this will be true. If there are special cases, they will usually be "obviously strange". You should keep in mind that, at least in principle, we always check our calculations against experiment, so we generally have an experimental proof rather than a mathematical one.

6
  • 12
    I think your argument is flawed... just because $dx/dt$ is never infinite doesn't mean you can always write $v$ as a well-defined function of $x$. A simple counterexample: throw a ball straight up in the air and then catch it. You would need to define separate functions $v(x)$ to fully describe that, as there are different velocities at the same point in space at different times. Commented Aug 13, 2020 at 12:59
  • 1
    @BioPhysicist and also $v(x)$ would not be defined for the points in space higher than the ball reached when thrown up.
    – Peteris
    Commented Aug 13, 2020 at 16:16
    @BioPhysicist I believe that can be resolved by this Math SE answer. In your example, the relation between $v$ and $x$ looks like a rotated absolute value function. That's not a function, but if you pick a point on the curve, it locally looks like a function (except at the apex). That lets $dv/dx$ be defined; the difficulty is now locating the points, since $x$ doesn't work. We can evaluate $dv/dx$ given $t$, and if we already have a point we can consider small displacements $dx$. Like this answer says, in physics, this should usually work out.
    – HTNW
    Commented Aug 13, 2020 at 17:28
    @BioPhysicist generally speaking, I should then describe $v$ as $\frac{d\textbf{r}}{dt}$. But since we are talking about the way mathematics is used for calculations in physics, the system is normally described in as few dimensions as possible. I can always say that $x$ points directly upwards... If we want to be more fundamental we should describe the system with generalised coordinates $q_1, q_2, \ldots, q_n$...
    – Vid
    Commented Aug 13, 2020 at 17:31
  • 1
    “that almost all functions, which can appear in nature or real life systems are in most times continuous and differentiable” – definitely an exaggeration. Lots of processes like collisions between hard objects, phase transitions and chemical reactions are discontinuous (at least unless you zoom in very closely, which may be a theoretical escape but is generally completely impractical). It's just that physicists don't tend to worry about that: they restrict themselves to some subdomain or submanifold where the behaviour is continuous, and leave the general case for engineers/chemists etc. to worry about. Commented Aug 13, 2020 at 22:59
8

It is true that in nature there is only one truly independent variable, time. All others are "pseudo-independent". They are variables humans bless as independent in order to answer what-if scenarios and to establish mathematical models of systems by way of separation of variables. The common term for these "pseudo-independent" quantities is generalized coordinates.

Consider a complex mechanical system, like a human launching a ball while riding a skateboard. First, we decide what the degrees of freedom are and assign generalized coordinates to them. These are simple measurable quantities of distance, angle or something else geometrical, forming a generalized coordinate vector $$\boldsymbol{q} = \pmatrix{x_1 \\ \theta_2 \\ \vdots \\ q_j \\ \vdots} \tag{1}$$ In this example there are $n$ degrees of freedom. All the positions of important points on our mechanism can be found from these $n$ quantities. If there are $k$ kinematic hardpoints (such as joints, geometric centers, etc.) then for $i=1 \ldots k$ the Cartesian position vector is some function of the generalized coordinates and time $$ \boldsymbol{r}_i = \boldsymbol{\mathrm{pos}}_i(t,\, \boldsymbol{q}) \tag{2}$$

Here comes the chain rule part. Under the assumption that (2) is differentiable with respect to the generalized coordinates, and that contact conditions do not change due to separation or loss of traction, the velocity vector of each of the hardpoints is found by the chain rule

$$ \boldsymbol{v}_i = \boldsymbol{\mathrm{vel}}_i(t,\,\boldsymbol{q},\,\boldsymbol{\dot{q}}) = \frac{\partial \boldsymbol{r}_i}{\partial t} + \frac{\partial \boldsymbol{r}_i }{\partial x_1} \dot{x}_1 + \frac{\partial \boldsymbol{r}_i }{\partial \theta_2} \dot{\theta}_2 + \ldots + \frac{\partial \boldsymbol{r}_i }{\partial q_j} \dot{q}_j + \ldots \tag{3} $$ where $q_j$ is the j-th element of $\boldsymbol{q}$, and $\dot{q}_j$ its speed (being linear or angular).

The above is not a division of infinitesimals, but the multiplication of a partial derivative $\tfrac{\partial \boldsymbol{r}_i }{\partial q_j}$ by the speed $\dot{q}_j$ of the particular generalized coordinate.

Maybe you are more comfortable with this more rigorous notation using partial derivatives than with what you have seen so far. The term partial derivative means: take the derivative by varying only one quantity and holding all others constant. This is what allows us to use pseudo-independent quantities $q_j$ for the evaluation of the true derivative with respect to time (the one actual independent quantity).

The same logic is applied to higher derivatives as well

$$ \boldsymbol{a}_i = \boldsymbol{\rm acc}_i(t,\boldsymbol{q},\boldsymbol{\dot q},\boldsymbol{\ddot q}) = \frac{\partial \boldsymbol{v}_i}{\partial t} + \ldots + \frac{ \partial \boldsymbol{v}_i}{\partial q_j}\, \dot{q}_j + \ldots + \frac{ \partial \boldsymbol{v}_i}{\partial \dot{q}_j} \,\ddot{q}_j \tag{4} $$

The last part might be a bit confusing, but it becomes clearer when you express it in terms of actual degrees of freedom. Consider the degree of freedom $\theta_2$ and its time derivatives $\omega_2$ and $\alpha_2$. Then the terms $\frac{ \partial \boldsymbol{v}_i}{\partial \theta_2} \omega_2 $ and $\frac{ \partial \boldsymbol{v}_i}{\partial \omega_2} \alpha_2 $ are hopefully clearer, as $\boldsymbol{v}_i$ depends on both the position $\theta_2$ and the speed $\omega_2$.
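To see equation (3) at work on a small made-up example (not the skateboard system above): take the tip of an arm of length $L$ hinged to a cart that slides along the $x$-axis, with generalized coordinates $x_1$ (cart position) and $\theta_2$ (arm angle). A rough sympy sketch comparing the chain-rule sum against direct time differentiation:

```python
import sympy as sp

t = sp.symbols('t')
L = sp.symbols('L', positive=True)
x1 = sp.Function('x1')(t)          # cart displacement, a generalized coordinate
th2 = sp.Function('theta2')(t)     # arm angle, a generalized coordinate

# Hardpoint position as a function of the generalized coordinates, as in (2):
# the tip of an arm of length L attached to the sliding cart
q = [x1, th2]
r = sp.Matrix([x1 + L*sp.cos(th2), L*sp.sin(th2)])

# Velocity via the chain rule (3): sum over j of (dr/dq_j) * qdot_j
# (the explicit dr/dt term is zero here, since r has no explicit time dependence)
v_chain = sp.Matrix([
    sum(sp.diff(comp, qj) * sp.diff(qj, t) for qj in q)
    for comp in r
])

# Velocity by differentiating the position directly with respect to time
v_direct = r.diff(t)

print(sp.simplify(v_chain - v_direct))   # zero vector: both routes agree
```

The point is that each $\tfrac{\partial \boldsymbol{r}}{\partial q_j}$ is an ordinary partial derivative of a known geometric function, and multiplying it by $\dot q_j$ and summing reproduces the true time derivative.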

4
    I sort of get the equations, but I don't see how they connect to the question I asked; it would be nice if you could leave a statement at the end giving a big-picture overview of what the equations mean Commented Aug 13, 2020 at 21:41
    You feel reservations in changing $v$ from a function of time to a function of displacement. I share this reservation, but I also show how it is useful mathematically by breaking apart the complex problem of time evolution into multiple solvable problems of time and configuration evolution and putting it all back together with the chain rule. I feel like the paragraph between (2) and (3) attempts to directly address your concerns. Commented Aug 15, 2020 at 0:57
    Have to -1 despite appreciating the bulk of the explanation. The concern is the strong statements about time being a "true" independent variable, suggesting that there's some fundamental quality distinguishing time.
    – Nat
    Commented Aug 15, 2020 at 3:04
    @Nat - I guess it is "proper" time I am talking about, and its irreversibility is what makes it independent. Commented Aug 15, 2020 at 17:35
4

I like this question and there are already some good answers. I'm not going to repeat those, but I wanted to add a couple of points focused on the second part of your question, regarding "swapping" differentials.

The first is that the presence of a differential quantity is an abstraction that is usually only useful as an intermediate step in calculating something else. By that, I mean you never measure something like $\rho\ dV$ directly. You can only hope to measure:

  1. The integral of that quantity $\int \rho\ dV$ over some volume (equivalently, you back off of the abstraction and measure $\rho \Delta V$ for some finite volume $\Delta V$) --OR--
  2. The "ratio of differentials" (being deliberately loose for the moment), which in the limit is a derivative. So an expression like $f(t) dt = g(x) dx$ gets "divided through" to be $f(t) = g(x) (dx/dt) = g(x)v(t)$. We believe we know how to measure changes in quantities and gradients.

That's relevant to the second part of your question about "swapping" differentials, because when that's done legitimately it typically works because you're ultimately going to put that expression under an integral sign, and the notation conveniently reflects (some might prefer to say the notation is easily abused when applying) the integration-by-substitution rule $$ \int_a^b f(g(x)) g'(x) dx = \int_{g(a)}^{g(b)} f(u) du$$ which you could rewrite in Leibniz notation for $u = g(x)$ and get the appearance that you are swapping or canceling differentials.
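If you want to convince yourself that this "cancelling of differentials" under the integral sign really is just substitution, here is a quick sympy check with arbitrary made-up choices ($f(u)=u^2$, $g(x)=\sin x$, limits $0$ and $2$):

```python
import sympy as sp

x, u = sp.symbols('x u')
a, b = 0, 2                          # arbitrary illustrative limits

f = lambda s: s**2                   # made-up integrand f(u) = u**2
g = sp.sin(x)                        # made-up substitution u = g(x) = sin(x)

# Left side: integrate f(g(x)) * g'(x) dx from a to b
lhs = sp.integrate(f(g) * sp.diff(g, x), (x, a, b))

# Right side: integrate f(u) du from g(a) to g(b)
rhs = sp.integrate(f(u), (u, g.subs(x, a), g.subs(x, b)))

print(lhs, rhs, sp.simplify(lhs - rhs))   # both sin(2)**3/3, difference 0
```

Both sides come out to $\sin^3(2)/3$, so treating $g'(x)\,dx$ as $du$ was legitimate here.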

Since the integration-by-substitution rule is basically the chain rule in reverse, however, all of this brings us back to your initial question of why the chain rule is valid in physics. For that, I refer back to the other already-good answers.

2

that means we can write velocity as a function of distance

That's not quite the intended meaning. Rather, the intended meaning is:

If you could write velocity as a function of distance in the domain of interest, then the equation would hold.

It's up to you to deduce whether that assumption can be met satisfactorily in the problem, but usually it's quite obvious that it can.

One way to see this is that you can artificially restrict the domain to the portion of space and time that is of interest and disregard the rest of the domain, and then argue that this assumption would hold there.
(Notice I basically just rephrased the continuity notion of a limit here.)

The only way for this to be false in your particular example is to have multiple velocities at a given point in time (or no velocity at all), which generally wouldn't make sense in the (continuous) everyday world we're familiar with.

And if the discussion is about some unusual boundary condition where you can't take a limit on all sides and show the problem is continuous, then you wouldn't read such a claim about that situation without some kind of other (implicit or explicit) indication as to why it's true.

1
  • 1
    I really appreciate this answer as the only real answer to an ill-posed question. I would say that the real problem among physicists is not the manipulation of differentials but discussing functions without ever specifying the domain or functional dependencies. Commented Dec 24, 2020 at 15:06
2

In situations like these it might be good to take a step back and consider what we are actually looking at. In this case we are looking at the position $x$ as a function of time. So, starting from this, the only functions that are well-defined are \begin{align} x:\quad t\rightarrow &x(t)\\ v:\quad t\rightarrow &v(t)=x'(t) \end{align} We can rewrite our culprit $\frac{dv}{dx}$ in terms of these functions: \begin{align} \frac{dv}{dx}=\frac{dv}{dt}\frac{dt}{dx}=v'(t)\left[x'(t)\right]^{-1} \end{align} Here we used the chain rule and the fact that the derivative of the inverse function is the reciprocal of the derivative of the original function, i.e. $dy/dx=(dx/dy)^{-1}$. Immediately we can see two illuminating things: firstly, we can define the derivative $\frac{dv}{dx}$ because we can write $v$ as a function of $t$ and we can also write $t$ as a function of $x$. Secondly, this derivative is only defined if $x'(t)\neq 0$, so there are some constraints in doing this.

Now let's take as an example $x(t)=bt^2$. We can calculate this in two ways. The first way is to first substitute $t(x)$ and then differentiate with respect to $x$: \begin{align} t&=\pm\sqrt{\frac x b}\\ \implies v(x)&=v(t(x))=\pm 2\sqrt{bx}\\ \implies \frac{dv}{dx}&=\pm\sqrt{\frac b x} \end{align} The second way is to use the chain rule. From the second equation \begin{align} \frac{dv}{dx}&=v'(t)\left[x'(t)\right]^{-1}\\ &=2b[2bt]^{-1}\\ &=\frac 1 t\\ &=\pm\sqrt{\frac b x} \end{align} Perhaps unsurprisingly, these methods give the same result. The second method makes it really explicit which functions are used, but the first method can sometimes be obscured when $t$ is not mentioned, as in your question.
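For readers who like to verify such manipulations symbolically, here is a minimal sympy sketch of the same example, restricted to the positive branch ($t>0$) for simplicity:

```python
import sympy as sp

t, b, x = sp.symbols('t b x', positive=True)     # positive branch only

xt = b*t**2                     # x(t)
vt = sp.diff(xt, t)             # v(t) = 2*b*t

# Method 1: substitute t(x) = sqrt(x/b), then differentiate with respect to x
t_of_x = sp.sqrt(x/b)
v_of_x = vt.subs(t, t_of_x)                      # 2*sqrt(b*x)
dvdx_1 = sp.simplify(sp.diff(v_of_x, x))         # sqrt(b/x)

# Method 2: chain rule, dv/dx = v'(t)/x'(t), then express t in terms of x
dvdx_2 = sp.simplify((sp.diff(vt, t) / sp.diff(xt, t)).subs(t, t_of_x))

print(sp.simplify(dvdx_1 - dvdx_2))              # 0: both routes agree
```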

The main takeaway of this answer is that these tricks have a formal proof behind them, but often the author leaves it out for brevity. This way we can do more physics more quickly, but these tricks shouldn't come at the expense of your fundamental understanding. When you feel this happening, it can be useful to write down the functions you are using and the parameters on which they depend, and then try to prove these tricks yourself. A nice summary is: differentials are not algebraic entities, so you can't just rearrange them in fractions as if they were, but it turns out that in most cases you can get away with doing exactly that.

0

What we are really saying is that there is some function $f$ which when composed with position gives velocity. We have:

$$ v(t) = f \circ x(t)$$

Taking the derivative:

$$ \frac{dv}{dt} = \left[ \frac{df}{dx} \circ x(t) \right] \frac{dx}{dt}= v(t) \left[ \frac{df}{dx} \circ x(t)\right]$$
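As a concrete (made-up) check of this identity, take $x(t)=t^3$ for $t>0$ and $f(x)=3x^{2/3}$, which is chosen precisely so that $f\circ x = v$; a short sympy sketch:

```python
import sympy as sp

t = sp.symbols('t', positive=True)
x = sp.symbols('x', positive=True)

x_of_t = t**3                                    # made-up position x(t)
v_of_t = sp.diff(x_of_t, t)                      # v(t) = 3*t**2
f = 3*x**sp.Rational(2, 3)                       # chosen so that f(x(t)) = v(t)

print(sp.simplify(f.subs(x, x_of_t) - v_of_t))   # 0: f composed with x gives v

# The identity dv/dt = v(t) * (df/dx composed with x(t))
lhs = sp.diff(v_of_t, t)                         # 6*t
rhs = v_of_t * sp.diff(f, x).subs(x, x_of_t)     # 3*t**2 * 2/t = 6*t
print(sp.simplify(lhs - rhs))                    # 0
```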

1
    Deeply related are the parametric derivative and the Fréchet derivative; see here Commented Mar 15, 2022 at 15:17
