Lagrangian mechanics is based on the *calculus of variations*,
which is the subject of optimization over a space of paths. One of
the most famous variational problems involves constraining a particle
to travel along a curve (imagine that the particle slides along a
frictionless track). The problem is to find the curve along which the
particle travels from one point to the other in minimum time, starting at rest, and being
accelerated only by gravity. The solution is a cycloid function called the *Brachistochrone curve*
[841]. Before this problem is described further, recall the
classical optimization problem from calculus in which the task is to
find extremal values (minima and maxima) of a function. Let $f$ denote a
smooth function from $\mathbb{R}$ to $\mathbb{R}$, and let $f(x)$ denote
its value for any $x \in \mathbb{R}$. From standard calculus, the extremal
values of $f$ are all $x \in \mathbb{R}$ for which $f'(x) = 0$. Suppose
that at some $x \in \mathbb{R}$, $f$ achieves a local minimum. To
serve as a local minimum, tiny perturbations of $x$ should result in
larger function values. Thus, there exists some $\epsilon > 0$ such that
$f(x + \delta) > f(x)$ for any $\delta \in \mathbb{R}$ with
$0 < |\delta| < \epsilon$. Each $\delta$ represents a possible perturbation of $x$.
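
As a quick numerical sketch (the function $f$ and its minimizer here are illustrative choices, not from the text), the perturbation test above can be checked directly:

```python
# Illustrative example: f(x) = (x - 2)^2 + 1 has a local minimum at x = 2,
# where f'(x) = 0; every small perturbation x + delta increases f.
def f(x):
    return (x - 2.0) ** 2 + 1.0

def f_prime(x):
    return 2.0 * (x - 2.0)

x_min = 2.0
assert abs(f_prime(x_min)) < 1e-12          # derivative vanishes at the minimum

epsilon = 0.1
for delta in (-epsilon / 2, epsilon / 2):   # perturbations in both directions
    assert f(x_min + delta) > f(x_min)      # each one yields a larger value
```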

The calculus of variations addresses a harder problem in which
optimization occurs over a space of functions. For each function, a
value is assigned by a criterion called a
*functional*.^{13.10} A procedure analogous to taking the derivative
of the function and setting it to zero will be performed. This will
be arrived at by considering tiny perturbations of an entire function,
as opposed to the perturbations mentioned above. Each
perturbation is itself a function, which is called a *variation*. For a function to minimize
a functional, any small enough perturbation of it must yield a larger
functional value. In the case of optimizing a function of one
variable, there are only two directions for the perturbation:
$\pm\delta$. See Figure 13.12. In the calculus of
variations, there are many different ``directions'' because of the
uncountably infinite number of ways to construct a small variation
function that perturbs the original function (the set of all
variations is an infinite-dimensional function space; recall
Example 8.5).

Let $\tilde{x} : T \rightarrow \mathbb{R}$ denote a smooth function from an
interval $T = [t_0, t_1] \subset \mathbb{R}$ into $\mathbb{R}$.
The functional is defined by integrating a function over the domain of
$\tilde{x}$. Let $L$ be a smooth, real-valued function of three
variables, $x$, $\dot{x}$, and $t$.^{13.11} The arguments of $L$ may be any
$(x, \dot{x}) \in \mathbb{R}^2$ and $t \in T$ to yield $L(x, \dot{x}, t)$, but each has a special
interpretation. For some smooth function
$\tilde{x}$, $L$ is used to
evaluate it at a particular $t \in T$ to obtain
$L(\tilde{x}(t), \dot{\tilde{x}}(t), t)$. A
*functional* $\Phi$ is constructed using $L$ to evaluate the
whole function $\tilde{x}$ as

$$\Phi(\tilde{x}) = \int_{t_0}^{t_1} L(\tilde{x}(t), \dot{\tilde{x}}(t), t)\, dt. \qquad (13.114)$$

The problem is to select an $\tilde{x}$ that optimizes $\Phi$. The approach is to take the derivative of $\Phi$ and set it equal to zero, just as in standard calculus; however, differentiating $\Phi$ with respect to $\tilde{x}$ is not standard calculus. This usually requires special conditions on the class of possible functions (e.g., smoothness) and on the vector space of variations, which are implicitly assumed to hold for the problems considered in this section.

As an example, consider the functional

$$\Phi(\tilde{x}) = \int_{t_0}^{t_1} \sqrt{1 + \dot{\tilde{x}}(t)^2}\, dt. \qquad (13.115)$$

When evaluated on a function $\tilde{x}$, this yields the arc length of the path.
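
As a hedged numerical sketch (the quadrature scheme and the two paths are illustrative assumptions), the arc-length functional can be approximated by a Riemann sum with finite-difference derivatives:

```python
import math

def phi(x, t0=0.0, t1=1.0, n=10000):
    """Approximate Phi(x) = integral of sqrt(1 + xdot^2) dt over [t0, t1]."""
    dt = (t1 - t0) / n
    total = 0.0
    for i in range(n):
        t = t0 + i * dt
        xdot = (x(t + dt) - x(t)) / dt    # forward-difference derivative
        total += math.sqrt(1.0 + xdot * xdot) * dt
    return total

line = lambda t: t                                 # straight path from (0,0) to (1,1)
bent = lambda t: t + 0.2 * math.sin(math.pi * t)   # curved path, same endpoints

assert abs(phi(line) - math.sqrt(2)) < 1e-6   # straight line: arc length sqrt(2)
assert phi(bent) > phi(line)                  # any detour is strictly longer
```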

Let $g$ be a smooth function over $T$, and let $\epsilon \in \mathbb{R}$ be a
small constant. Consider the function defined as $\tilde{x}(t) + \epsilon g(t)$ for all
$t \in T$. If $\epsilon = 0$, then
(13.114) remains the same. As $\epsilon$ is
increased or decreased, then $\Phi(\tilde{x} + \epsilon g)$ may change.
The function $g$ is like the ``direction'' in a directional
derivative. If for any smooth function $g$, there exists some
$\epsilon > 0$ such that the value $\Phi(\tilde{x} + \epsilon g)$
increases, then
$\tilde{x}$ is called an *extremal* of $\Phi$. Any small perturbation to
$\tilde{x}$ causes the
value of $\Phi$ to increase. Therefore, $\tilde{x}$
behaves like a local
minimum in a standard optimization problem.
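
A minimal sketch of this extremal test for the arc-length functional (the variation $g$ and the step sizes $\epsilon$ are illustrative assumptions): perturbing the straight line by $\epsilon g$, with $g$ vanishing at the endpoints, always increases the arc length.

```python
import math

def arc_length(x, n=5000):
    """Riemann-sum approximation of the arc-length functional on [0, 1]."""
    dt = 1.0 / n
    return sum(
        math.sqrt(1.0 + ((x(t + dt) - x(t)) / dt) ** 2) * dt
        for t in (i * dt for i in range(n))
    )

x_tilde = lambda t: t                    # candidate extremal: the straight line
g = lambda t: math.sin(math.pi * t)      # variation with g(0) = g(1) = 0

base = arc_length(x_tilde)
for eps in (-0.3, -0.1, 0.1, 0.3):
    perturbed = lambda t, e=eps: x_tilde(t) + e * g(t)
    assert arc_length(perturbed) > base  # every nonzero variation increases Phi
```

This is only a sanity check on a few sample variations; the calculus of variations handles all of the uncountably many directions $g$ at once.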

Let $h(t) = \epsilon g(t)$ for some $\epsilon \in \mathbb{R}$ and smooth function $g$. The differential of a functional can be approximated as [39]

$$
\begin{aligned}
\Phi(\tilde{x} + h) - \Phi(\tilde{x})
&= \int_{t_0}^{t_1} \Big( L(\tilde{x}+h, \dot{\tilde{x}}+\dot{h}, t) - L(\tilde{x}, \dot{\tilde{x}}, t) \Big)\, dt \\
&= \int_{t_0}^{t_1} \Big( \frac{\partial L}{\partial x}\, h + \frac{\partial L}{\partial \dot{x}}\, \dot{h} \Big)\, dt + \cdots \\
&= \int_{t_0}^{t_1} \Big( \frac{\partial L}{\partial x} - \frac{d}{dt}\frac{\partial L}{\partial \dot{x}} \Big)\, h \, dt
   + \left[ \frac{\partial L}{\partial \dot{x}}\, h \right]_{t_0}^{t_1} + \cdots \qquad (13.116)
\end{aligned}
$$

in which $\cdots$ represents higher order terms that will vanish in the limit. The last step follows from integration by parts:

$$\left[ \frac{\partial L}{\partial \dot{x}}\, h \right]_{t_0}^{t_1} = \int_{t_0}^{t_1} \frac{d}{dt}\Big(\frac{\partial L}{\partial \dot{x}}\Big)\, h \, dt + \int_{t_0}^{t_1} \frac{\partial L}{\partial \dot{x}}\, \dot{h} \, dt, \qquad (13.117)$$

which is just $uv = \int v\, du + \int u\, dv$ with $u = \partial L / \partial \dot{x}$ and $v = h$. Consider the value of (13.116) as $\epsilon$ becomes small, and assume that $h(t_0) = h(t_1) = 0$. For $\tilde{x}$ to be an extremal function, the change expressed in (13.116) should tend to zero as the variations approach zero. Based on further technical assumptions, including the Fundamental Lemma of the Calculus of Variations [39], the *Euler-Lagrange equation*,

$$\frac{d}{dt}\frac{\partial L}{\partial \dot{x}} - \frac{\partial L}{\partial x} = 0, \qquad (13.118)$$

is obtained as a necessary condition for $\tilde{x}$ to be an extremum. Intuition can be gained by studying the last line of (13.116). The integral attains a zero value precisely when (13.118) is satisfied. The other terms vanish because $h(t_0) = h(t_1) = 0$, and higher order terms disappear in the limit process.

The partial derivatives of $L$ with respect to $x$ and $\dot{x}$ are defined using standard calculus. The derivative $\partial L / \partial \dot{x}$ is evaluated by treating $\dot{x}$ as an ordinary variable (i.e., as $\partial L / \partial y$ when the variables are named as in $L(x, y, t)$). Following this, the derivative of $\partial L / \partial \dot{x}$ with respect to $t$ is taken. To illustrate this process, consider the following example.

Suppose that the cost is the arc-length integrand from (13.115),

$$L(x, \dot{x}, t) = \sqrt{1 + \dot{x}^2}. \qquad (13.119)$$

The partial derivatives with respect to $x$ and $\dot{x}$ are

$$\frac{\partial L}{\partial x} = 0 \qquad (13.120)$$

and

$$\frac{\partial L}{\partial \dot{x}} = \dot{x}\, (1 + \dot{x}^2)^{-1/2}. \qquad (13.121)$$

Taking the time derivative of (13.121) yields

$$\frac{d}{dt}\frac{\partial L}{\partial \dot{x}} = \ddot{x}\, (1 + \dot{x}^2)^{-3/2}. \qquad (13.122)$$

Substituting these into the Euler-Lagrange equation (13.118) yields

$$\ddot{x}\, (1 + \dot{x}^2)^{-3/2} = 0. \qquad (13.123)$$

This represents a second-order differential constraint that constrains the acceleration as $\ddot{x} = 0$. By constructing a 2D phase space, the constraint could be expressed using first-order differential equations.
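
As a hedged check of this computation (the finite-difference step and sample points are illustrative choices), the closed form of $\partial L/\partial\dot{x}$ in (13.121) can be compared against a numerical derivative of $L$, and the resulting Euler-Lagrange expression vanishes exactly when $\ddot{x} = 0$:

```python
import math

def L(x, xdot, t):
    return math.sqrt(1.0 + xdot * xdot)   # arc-length Lagrangian (13.119)

def dL_dxdot_numeric(x, xdot, t, h=1e-6):
    # Treat xdot as an ordinary variable and central-difference it.
    return (L(x, xdot + h, t) - L(x, xdot - h, t)) / (2.0 * h)

# Closed form (13.121): xdot * (1 + xdot^2)^(-1/2).
for xd in (-2.0, 0.0, 0.5, 3.0):
    exact = xd / math.sqrt(1.0 + xd * xd)
    assert abs(dL_dxdot_numeric(0.0, xd, 0.0) - exact) < 1e-8

# Euler-Lagrange expression: xddot * (1 + xdot^2)^(-3/2),
# which is zero precisely when the acceleration xddot is zero.
residual = lambda xdot, xddot: xddot * (1.0 + xdot * xdot) ** -1.5
assert residual(2.0, 0.0) == 0.0     # straight lines are extremals
assert residual(1.0, 2.0) != 0.0     # accelerating paths are not
```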

Steven M LaValle 2012-04-20