Differentiation in many variables
You all know how to differentiate, even when we are talking about many variables (partial differentiation). So the main difficulty here is the notation. Here are a few explanations which I hope will be helpful:
Degrees of freedom
- Each particle in a system has a position vector. So N particles means N position vectors: r1, r2, . . ., rN. If the particles move in R3, then each ri ∈ R3, say ri = (xi, yi, zi) giving 3N coordinates altogether. If the particles move in the plane R2 then each ri = (xi, yi) ∈ R2, so there are 2N coordinates altogether.
- The total number of coordinates needed to describe the configuration of the system is called the number of degrees of freedom of that system. For example, the (usual plane) pendulum has 1 degree of freedom: to specify its configuration you need only give the angle between the rod and the downward vertical (say).
- We often write the configuration as q = (q1, q2, ... , qd), when the system has d degrees of freedom. For example, for 2 particles in space (so 6 degrees of freedom),
q = (q1, q2, q3, q4, q5, q6) = (x1, y1, z1, x2, y2, z2) = (r1, r2).
In fact one should probably write (r1, r2) = ((x1, y1, z1), (x2, y2, z2)) (with the extra brackets) but writers are not usually so fussy.
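To make the bookkeeping concrete, here is a minimal Python sketch of passing between the position vectors and the configuration q. The helper names `flatten` and `unflatten` and the sample coordinates are made up for illustration; they are not part of the course material.

```python
# Hypothetical helpers: positions are plain tuples of floats.

def flatten(positions):
    """Concatenate position vectors r1, ..., rN into one configuration tuple q."""
    return tuple(coord for r in positions for coord in r)

def unflatten(q, dim=3):
    """Split q back into position vectors with dim components each."""
    return tuple(tuple(q[i:i + dim]) for i in range(0, len(q), dim))

# Two particles in space (sample values chosen arbitrarily).
r1 = (1.0, 2.0, 3.0)
r2 = (4.0, 5.0, 6.0)

q = flatten((r1, r2))   # (x1, y1, z1, x2, y2, z2)
dof = len(q)            # number of degrees of freedom, here 6
```

Calling `unflatten(q)` recovers `((x1, y1, z1), (x2, y2, z2))`, which is the version "with the extra brackets".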
Gradients and differentiation
- If q = (r1, r2, . . ., rN) then ∇jV(q) (which is the same as gradj V(q)) means the gradient with respect to the components of rj, that is,
\(\nabla_j V = \left(\frac{\partial V}{\partial x_j},\, \frac{\partial V}{\partial y_j},\, \frac{\partial V}{\partial z_j}\right).\)
- In Lagrange's equations: with q = (q1, q2, ... , qd), let v = (v1, v2, ... , vd) be the corresponding velocity. Then the equation
\[\frac{\mathrm{d}}{\mathrm{d}t}\left(\frac{\partial L}{\partial \mathbf{v}}\right) = \frac{\partial L}{\partial \mathbf{q}}\]
which is a vector equation, means that, for each j = 1, ..., d,
\[\frac{\mathrm{d}}{\mathrm{d}t}\left(\frac{\partial L}{\partial v_j}\right) = \frac{\partial L}{\partial q_j}\]
which is d scalar equations.
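As a numerical illustration of the block gradient ∇jV described above, the following Python sketch estimates it by central differences. The potential V (half the squared distance between two particles), the step size `h` and the function names are all assumptions chosen for the example. For this V one expects ∇1V = r1 − r2 and ∇2V = r2 − r1.

```python
def V(q):
    """Assumed example potential: V = |r1 - r2|^2 / 2 for two particles in R^3."""
    x1, y1, z1, x2, y2, z2 = q
    return 0.5 * ((x1 - x2) ** 2 + (y1 - y2) ** 2 + (z1 - z2) ** 2)

def partial(func, q, i, h=1e-6):
    """Central-difference estimate of the partial derivative of func w.r.t. q[i]."""
    up = list(q)
    up[i] += h
    down = list(q)
    down[i] -= h
    return (func(up) - func(down)) / (2 * h)

def grad_j(func, q, j, dim=3):
    """Gradient with respect to the components of rj only (j counts from 1)."""
    start = (j - 1) * dim
    return tuple(partial(func, q, i) for i in range(start, start + dim))

q = (1.0, 2.0, 3.0, 4.0, 6.0, 8.0)   # r1 = (1, 2, 3), r2 = (4, 6, 8)
g1 = grad_j(V, q, 1)                 # approximately r1 - r2 = (-3, -4, -5)
g2 = grad_j(V, q, 2)                 # approximately r2 - r1 = (3, 4, 5)
```

Each call to `grad_j` picks out 3 of the 6 partial derivatives of V, in the same way that each of the d scalar Lagrange equations picks out one component j.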
Example:
Find \(\partial f/\partial \mathbf{r}\) where \(f(\mathbf{r}) = \|\mathbf{r}-\mathbf{a}\|\), and a is some given fixed vector.
Solution: \(\partial f / \partial \mathbf{r}\) is the same as \(\nabla f\).
First find \(\nabla(f^2) = \nabla\|\mathbf{r - a}\|^2\) (a similar approach to sheet 1, question 1.2, except that there it was d/dt, not ∇, but it's all differentiation!)
Now, if we write \(\mathbf{r}= (x,y,z)\) and \(\mathbf{a} = (a, b, c)\), then \(\|\mathbf{r - a}\|^2 = (x - a)^2 + (y - b)^2 + (z - c)^2\) (call this \(g\))
so,
\[\frac{\partial g}{\partial x} = 2(x-a),\; \frac{\partial g}{\partial y} = 2(y-b) \quad \mbox{and}\quad\frac{\partial g}{\partial z} = 2(z-c).\]
In other words, as vectors,
\(\nabla g = 2(\mathbf{r - a})\).
On the other hand, \(g=f^2\), so
\(\nabla g = 2f\,\nabla f\) (the chain rule).
Consequently, provided \(\mathbf{r}\neq\mathbf{a}\) (so that \(f\neq 0\)), \(\nabla f = \nabla g / (2f)\), and finally
\[\nabla f = \frac{\mathbf{r}-\mathbf{a}}{\|\mathbf{r}-\mathbf{a}\|}\]
as required.
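The result can be sanity-checked numerically. The Python sketch below compares a central-difference gradient of f with the formula (r − a)/‖r − a‖. The sample vectors, the step size `h` and the helper name `numerical_gradient` are arbitrary choices for illustration.

```python
import math

def f(r, a):
    """Euclidean distance |r - a|."""
    return math.sqrt(sum((ri - ai) ** 2 for ri, ai in zip(r, a)))

def numerical_gradient(func, r, h=1e-6):
    """Central-difference estimate of the gradient of func at r."""
    grad = []
    for i in range(len(r)):
        up = list(r)
        up[i] += h
        down = list(r)
        down[i] -= h
        grad.append((func(up) - func(down)) / (2 * h))
    return grad

a = (1.0, 0.0, -1.0)   # the fixed vector (sample values)
r = (3.0, 2.0, 0.0)    # the point at which we differentiate; r - a = (2, 2, 1)

numeric = numerical_gradient(lambda p: f(p, a), r)
dist = f(r, a)                                       # |r - a| = 3
exact = [(ri - ai) / dist for ri, ai in zip(r, a)]   # (2/3, 2/3, 1/3)
```

The two lists agree up to the finite-difference error. Note also that `exact` is a unit vector: the gradient of the distance to a always has length 1 away from r = a.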
Hessian matrix
In place of the usual second derivative in one variable (useful for example in determining whether a critical point is a local min or a local max), in \(n\) variables one defines the Hessian matrix, which is the square \(n\times n\) matrix of all second partial derivatives:
\( H_f = D^2f = \left(\frac{\partial^2 f}{\partial x_i\partial x_j}\right).\)
For example, with \(n=3\), let \(f=x^3+yz^2\). Then
\(H_f = \pmatrix{6x&0&0\cr 0&0&2z\cr 0&2z&2y}.\)
Here we use \(\partial^2 f/\partial z^2=2y\) and \(\partial^2f/\partial y\partial z=2z\).
Note that the Hessian matrix is symmetric whenever the second partial derivatives are continuous, because then
\(\frac{\partial^2f}{\partial x_i\partial x_j } = \frac{\partial^2f}{\partial x_j\partial x_i }.\)
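The Hessian of the example \(f=x^3+yz^2\) can also be checked numerically. The Python sketch below approximates every second partial derivative by a central-difference stencil. The evaluation point, the step size `h` and the function name `hessian` are assumptions made for the illustration.

```python
def f(p):
    """The example function f = x^3 + y z^2."""
    x, y, z = p
    return x ** 3 + y * z ** 2

def hessian(func, p, h=1e-4):
    """Finite-difference estimate of the matrix of second partial derivatives."""
    n = len(p)
    H = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            def shifted(di, dj):
                # Move coordinate i by di*h and coordinate j by dj*h.
                s = list(p)
                s[i] += di * h
                s[j] += dj * h
                return func(s)
            # Four-point stencil; for i == j it reduces to a second
            # difference with step 2h.
            H[i][j] = (shifted(1, 1) - shifted(1, -1)
                       - shifted(-1, 1) + shifted(-1, -1)) / (4 * h * h)
    return H

point = (1.0, 2.0, 3.0)       # sample point (x, y, z)
H = hessian(f, point)
# Formula from the text at this point: [[6, 0, 0], [0, 0, 6], [0, 6, 4]]
```

Up to rounding error, the computed matrix matches the formula for \(H_f\) and is symmetric, as expected for a polynomial.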