Math20512

How does the chain rule work in 2 variables?

Single variable case:

Recall from A-level that if \(f(x)\) is a function of \(x\) and \(x\) is a function of \(t\), then the chain rule says that \[\frac{\mathsf{d} f}{\mathsf{d} t} = \frac{\mathsf{d} f}{\mathsf{d} x}\;\frac{\mathsf{d}x}{\mathsf{d}t},\] as I hope you remember.

2 variable case:

Let \(f(x, y)\) be a function of 2 variables, and suppose each of \(x\) and \(y\) depends on time \(t\). Then \[\frac{\mathsf{d}}{\mathsf{d}t}f(x,y) = \frac{\mathsf{d}}{\mathsf{d}t}f(x(t),y(t)) = \frac{\partial f}{\partial x}\;\frac{\mathsf{d}x}{\mathsf{d}t} + \frac{\partial f}{\partial y}\;\frac{\mathsf{d}y}{\mathsf{d}t}.\] NB It's important to distinguish between partial derivatives and ordinary derivatives. A partial \(\partial\) should only be used if you are holding some of the variables constant. You are not doing that when you ask for \(\frac{\mathsf{d}}{\mathsf{d}t}f(x,y)\), because both \(x\) and \(y\) change as \(t\) changes; but in \(\frac{\partial f}{\partial x}\) you are holding \(y\) constant. (If it's not clear, come and ask me about it - it's important!)
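If you'd like to check the 2-variable formula numerically, here is a minimal Python sketch. The function \(f(x,y) = x^2 y\) and the curve \(x = \cos t\), \(y = \sin t\) are my own choices for illustration (they are not from the lectures), and all the function names are made up for this example. It compares the chain-rule value with a finite-difference estimate of \(\frac{\mathsf{d}}{\mathsf{d}t}f(x(t),y(t))\).

```python
import math

# Illustrative choice (an assumption, not from the course): f(x, y) = x^2 * y
def f(x, y):
    return x**2 * y

def df_dx(x, y):
    # Partial derivative with respect to x: y is held constant
    return 2 * x * y

def df_dy(x, y):
    # Partial derivative with respect to y: x is held constant
    return x**2

# Illustrative curve (also an assumption): x = cos t, y = sin t
def x_of_t(t):
    return math.cos(t)

def y_of_t(t):
    return math.sin(t)

def dx_dt(t):
    return -math.sin(t)

def dy_dt(t):
    return math.cos(t)

def chain_rule_2d(t):
    # df/dt = (df/dx)(dx/dt) + (df/dy)(dy/dt), partials evaluated at (x(t), y(t))
    x, y = x_of_t(t), y_of_t(t)
    return df_dx(x, y) * dx_dt(t) + df_dy(x, y) * dy_dt(t)

def finite_difference(t, h=1e-6):
    # Central difference of t -> f(x(t), y(t)); nothing is held constant here
    forward = f(x_of_t(t + h), y_of_t(t + h))
    backward = f(x_of_t(t - h), y_of_t(t - h))
    return (forward - backward) / (2 * h)

t0 = 0.7
print(chain_rule_2d(t0), finite_difference(t0))
```

The two printed numbers should agree to several decimal places. Note that the finite difference moves \(x\) and \(y\) together, which is exactly why the left-hand side is an ordinary derivative and not a partial one.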

n variables:

Extending from 2 to n variables doesn't really change anything:
Let \(f(x_1,\dots,x_n)\) be a function of \(n\) variables, and suppose each \(x_j\) depends on time \(t\). Then \[\frac{\mathsf{d}}{\mathsf{d}t}f(x_1,\dots,x_n) = \sum_{j=1}^n \frac{\partial f}{\partial x_j}\;\frac{\mathsf{d}x_j}{\mathsf{d}t}.\]
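The sum is easy to try out on a small case. In the sketch below, the helper `total_derivative` and the example \(f(x_1,x_2,x_3) = x_1 x_2 x_3\) with \(x_j(t) = t^j\) are assumptions made purely for illustration. This example is convenient because \(f = t^6\) along the curve, so the answer should be \(6t^5\).

```python
def total_derivative(partials, velocities):
    """Return sum_j (df/dx_j) * (dx_j/dt) for matching lists of numbers."""
    if len(partials) != len(velocities):
        raise ValueError("need one velocity dx_j/dt for each partial df/dx_j")
    return sum(p * v for p, v in zip(partials, velocities))

def example(t):
    # Hypothetical example: f = x1 * x2 * x3 along x1 = t, x2 = t^2, x3 = t^3
    x1, x2, x3 = t, t**2, t**3
    partials = [x2 * x3, x1 * x3, x1 * x2]   # df/dx1, df/dx2, df/dx3
    velocities = [1.0, 2 * t, 3 * t**2]      # dx1/dt, dx2/dt, dx3/dt
    return total_derivative(partials, velocities)

t0 = 2.0
print(example(t0), 6 * t0**5)  # chain-rule sum next to the direct answer 6 t^5
```

Each of the three terms contributes a multiple of \(t^5\) (namely \(t^5\), \(2t^5\) and \(3t^5\)), which add up to the expected \(6t^5\).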

Simple!

We can then recognise this last expression as the row vector \(\mathsf{d}f\) times the column vector \(\dot{\mathbf{x}}\), so that \[\frac{\mathsf{d}}{\mathsf{d}t}f(\mathbf{x}) = \mathsf{d}\!f\, \dot{\mathbf{x}}\] where \(\mathsf{d}f = (\partial f/\partial x_1,\dots, \partial f/\partial x_n)\) (a row vector) and \(\dot{\mathbf{x}}=(\dot{x}_1, \dot{x}_2,\dots,\dot{x}_n)^T\) (a column vector because of the transpose).

Even simpler!

So the 2-variable case above can be written

\[\frac{\mathsf{d}}{\mathsf{d}t}f(\mathbf{x}) = \mathsf{d}\!f\, \dot{\mathbf{x}} = \begin{pmatrix}\frac{\partial f}{\partial x}& \frac{\partial f}{\partial y}\end{pmatrix}\begin{pmatrix}\dot x\cr \dot y\end{pmatrix}.\]

Note \(\mathsf{d}f\) is often written \(\nabla f\) (grad \(f\)). [Strictly speaking, there is a difference: \(\mathsf{d}f\) is a row vector, while \(\nabla f\) is a column vector, so each is the transpose of the other.]
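To make the row-versus-column bookkeeping concrete, here is a small sketch that stores matrices as lists of rows. The helpers `matmul` and `transpose`, the function \(f(x,y) = x^2 y\), the point \((3, 2)\) and the velocity \((1, -1)^T\) are all assumptions chosen for illustration.

```python
def matmul(a, b):
    """Multiply matrices stored as lists of rows."""
    inner = len(b)
    if any(len(row) != inner for row in a):
        raise ValueError("columns of a must match rows of b")
    cols = len(b[0])
    return [[sum(row[k] * b[k][j] for k in range(inner)) for j in range(cols)]
            for row in a]

def transpose(m):
    """Swap rows and columns."""
    return [list(col) for col in zip(*m)]

# Hypothetical example: f(x, y) = x^2 * y at the point (x, y) = (3, 2)
x, y = 3.0, 2.0
df = [[2 * x * y, x**2]]    # row vector (1 x 2): (df/dx, df/dy)
x_dot = [[1.0], [-1.0]]     # column vector (2 x 1): assumed velocities (x-dot, y-dot)
grad_f = transpose(df)      # column vector (2 x 1): grad f

rate = matmul(df, x_dot)    # (1 x 2) times (2 x 1) gives a 1 x 1 matrix
print(rate)                 # [[3.0]]
```

Multiplying in this order only works because \(\mathsf{d}f\) is a row and \(\dot{\mathbf{x}}\) is a column. With \(\nabla f\) you would need to transpose first, i.e. compute `matmul(transpose(grad_f), x_dot)`.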

Page last modified on Tue, 22 January 2013, at 16:12 (GMT)