The chain rule in calculus states that if one variable y depends on a second variable u which in turn depends on a third variable x, then the rate of change of y with respect to x can be computed as the product of the rate of change of y with respect to u times the rate of change of u with respect to x.
In Leibniz' symbolism, this can be written as
- dy/dx = (dy/du) * (du/dx)
A real world example will show that this rule makes sense: suppose you are climbing up a mountain and you are gaining elevation at a rate of 0.5 kilometers an hour. The temperature is lower at higher elevations; suppose the rate at which it decreases is 6 degrees per kilometer. How fast do you get colder? Well, we have to multiply: 6 degrees per kilometer times 0.5 kilometers per hour makes 3 degrees per hour. Every hour, you'll get three degrees colder. That is the heart of the chain rule.
In the modern treatment, the chain rule is seen as a formula for the derivative of the composition of two functions. Suppose the real-valued function f is defined on some open subset of the real numbers containing the number x, and g is defined on some open subset of the reals containing f(x). If f is differentiable at x and g is differentiable at f(x), then the composition g o f is differentiable at x and the derivative can be computed as
- (g o f)'(x) = g'(f(x)) * f '(x)
For example, in order to differentiate
- h(x) = sin(x2),
we write h(x) = g(f(x)) with g(u) = sin(u) and f(x) = x2 and the chain rule then yields
- h'(x) = cos(x2) * 2x
since g'(u) = cos(u) and f '(x) = 2 x.
The chain rule is a fundamental property of all definitions of derivative and is therefore valid in much more general contexts. For instance, if E, F and G are Banach spaces (which includes Euclidean space) and f : E -> F and g : F -> G are functions, and if x is an element of E such that f is differentiable at x and g is differentiable at f(x), then the derivative of the composition g o f at the point x is given by
- Dx(g o f) = Df(x)(g) o Dx(f)
A particularly nice formulation of the chain rule can be achieved in the most general setting: let M, N and P be Ck manifolds (or even Banach-manifolds) and let f : M -> N and g : N -> P be differentiable maps. The derivative of f, denoted by df, is then a map from the tangent bundle of M to the tangent bundle of N, and we may write
- d(g o f) = dg o df