Analysis Theorems

4 Derivatives

Let \(\FF \) from now on denote \(\CC \) or \(\RR \).

Proposition 4.1. If \(U \subset \RR ^n\) is open and \(f:U \to \RR \) has a local extremum at a point \(a \in U\) and is differentiable there, then \(a\) is a critical point of \(f\), i.e. \(f'(a) = 0\).

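Proof. It suffices to consider \(n=1\), since applying that case along each coordinate direction shows that each of the \(\partial _if(a)\) in the general case is \(0\). In this case, write \(f(a+h) = f(a)+f'(a)h+\ee (h)\) with \(\frac {|\ee (h)|}{|h|} \to 0\) as \(h \to 0\). If \(|f'(a)|>0\), then for small enough \(h \neq 0\) we have \(\frac {|\ee (h)|}{|h|}<|f'(a)|\), so \(f(a+h)-f(a)\) has the same sign as \(f'(a)h\): the function decreases on one side of \(a\) and increases on the other, contradicting the extremum. □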

Proposition 4.2 (Mean Value Theorem). If \(f,g:[a,b]\to \RR \) are continuous, and differentiable on the interior \((a,b)\), then there is a \(c \in (a,b)\) so that

\[ (g(b)-g(a))f'(c)=(f(b)-f(a))g'(c) \]

Proof. Consider the function \(h(x) = (g(b)-g(a))f(x)-(f(b)-f(a))g(x)\). We have \(h(a)=g(b)f(a)-f(b)g(a)=h(b)\), so either \(h\) is constant, in which case any point works, or (by continuity on the compact interval) it attains a maximum or minimum on the interior of the interval, which by Proposition 4.1 yields a point \(c\) with \(h'(c) = 0\). □

Note that the Mean Value Theorem is most often used in the case where \(g(x) = x\).
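
As a quick illustration (the functions here are chosen only as an example), take \(f(x)=x^2\) and \(g(x)=x^3\) on \([0,1]\). The identity of Proposition 4.2 reads

\[ (1-0)\cdot 2c = (1-0)\cdot 3c^2 \]

which is solved in the interior by \(c = \frac {2}{3}\). With \(g(x)=x\) instead, the same \(f\) on \([a,b]\) gives \(2c(b-a) = b^2-a^2\), so \(c = \frac {a+b}{2}\).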

The chain rule is the statement that in the category \(\Diff \), \(d\) is an endofunctor that takes a manifold to its tangent bundle, and a map \(f\) to a map \(df\) (the pushforward, total derivative, or derivative) on tangent bundles. In the case of an open set \(U \subset \FF ^n\), the tangent bundle is canonically identified with \(U \times \FF ^n\). Here is the classical statement:

Theorem 4.3 (Chain Rule). Suppose we have open sets \(U \subset \FF ^l, V \subset \FF ^m, W \subset \FF ^n\), and maps \(f:U \to V, g:V \to W\) such that \(f\) is differentiable at \(a \in U\), and \(g\) is differentiable at \(f(a)\). Then \(g \circ f\) is differentiable at \(a\) and \((g \circ f)'(a) = g'(f(a))f'(a)\).

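Proof. Write \(f(a+h) = f(a)+f'(a)h+\ee _f(h)\) and \(g(f(a)+k) = g(f(a))+g'(f(a))k+\ee _g(k)\), where \(\frac {\Vert \ee _f(h)\Vert }{\Vert h\Vert } \to 0\) as \(h \to 0\) and \(\frac {\Vert \ee _g(k)\Vert }{\Vert k\Vert } \to 0\) as \(k \to 0\). We define \(k = f(a+h)-f(a)\). Then: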

\[ (g\circ f)(a+h)-(g\circ f)(a) = g(f(a+h))-g(f(a)) = g(f(a)+k)-g(f(a)) \]

\[ = g'(f(a))k + \ee _g(k) = g'(f(a))(f(a+h)-f(a)) + \ee _g(k) = g'(f(a))(f'(a)h+\ee _f(h)) + \ee _g(k) \]

\[=g'(f(a))f'(a)h + \ee \]

where \(\ee = g'(f(a))\ee _f(h) + \ee _g(k)\) so it suffices to show \(\frac {\Vert \ee \Vert }{\Vert h\Vert } \to 0\) as \(h \to 0\). With \(\Vert \cdot \Vert _o\) denoting the operator norm we have:

\[ \frac {\Vert \ee \Vert }{\Vert h\Vert } \leq \frac {\Vert g'(f(a))\Vert _o \Vert \ee _f(h)\Vert }{\Vert h\Vert } + \frac {\Vert \ee _g(k)\Vert }{\Vert k\Vert }\frac {\Vert f(a+h)-f(a)\Vert }{\Vert h \Vert } \]

\[\leq \frac {\Vert g'(f(a))\Vert _o \Vert \ee _f(h)\Vert }{\Vert h\Vert } + \frac {\Vert \ee _g(k)\Vert }{\Vert k\Vert }\bigg (\frac {\Vert f'(a)\Vert _o\Vert h \Vert }{\Vert h \Vert } + \frac {\Vert \ee _f(h)\Vert }{\Vert h \Vert }\bigg ) \]

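which tends to \(0\) by hypothesis, since \(k \to 0\) as \(h \to 0\) by continuity of \(f\) at \(a\) (when \(k = 0\) the second term is read as \(0\), because \(\ee _g(0)=0\)). □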
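
For a concrete check of the formula (the maps below are chosen only for illustration), let \(f(x,y) = (xy,\,x+y)\) and \(g(u,v) = u^2+v\). Then

\[ g'(u,v) = \begin{pmatrix} 2u & 1 \end{pmatrix}, \qquad f'(x,y) = \begin{pmatrix} y & x \\ 1 & 1 \end{pmatrix} \]

\[ g'(f(x,y))f'(x,y) = \begin{pmatrix} 2xy & 1 \end{pmatrix}\begin{pmatrix} y & x \\ 1 & 1 \end{pmatrix} = \begin{pmatrix} 2xy^2+1 & 2x^2y+1 \end{pmatrix} \]

which agrees with differentiating \((g\circ f)(x,y) = x^2y^2+x+y\) directly.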

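Corollary 4.4 (Product & Quotient Rules). If \(f,g\) are scalar-valued and differentiable at \(x\) (with \(g(x) \neq 0\) for the quotient), then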

\[\partial _i(f(x)g(x)) = \partial _if(x)g(x) + \partial _ig(x)f(x)\]

\[\partial _i\frac {f(x)}{g(x)} = \frac {\partial _if(x)g(x)-\partial _ig(x)f(x)}{g(x)^2}\]

Proof. Use the chain rule on the composite \(x \mapsto (f(x),g(x)) \mapsto f(x)g(x)\) and \(x \mapsto (f(x),g(x)) \mapsto \frac {f(x)}{g(x)}\). □
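
To spell the computation out, write \(m(u,v) = uv\) and \(q(u,v) = \frac {u}{v}\) (names used only for this illustration). Their derivatives are

\[ m'(u,v) = \begin{pmatrix} v & u \end{pmatrix}, \qquad q'(u,v) = \begin{pmatrix} \frac {1}{v} & -\frac {u}{v^2} \end{pmatrix} \]

so composing with \(x \mapsto (f(x),g(x))\), whose \(i\)-th partial derivative is \((\partial _if(x),\partial _ig(x))\), gives

\[ \partial _i(f(x)g(x)) = g(x)\partial _if(x) + f(x)\partial _ig(x), \qquad \partial _i\frac {f(x)}{g(x)} = \frac {\partial _if(x)}{g(x)} - \frac {f(x)\partial _ig(x)}{g(x)^2} \]

and putting the second expression over the common denominator \(g(x)^2\) recovers the quotient rule as stated.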

Corollary 4.5. If \(f\) is differentiable at \(x\) and the inverse of \(f\) is differentiable at \(f(x)\), then \((f^{-1})'(f(x)) = f'(x)^{-1}\).

Proof. Apply the chain rule to \(f^{-1}\circ f\). □
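
As a small example (chosen here for illustration), take \(f(x) = x^3\) on \(x>0\), with inverse \(f^{-1}(y) = y^{1/3}\). Then

\[ (f^{-1})'(y) = \tfrac {1}{3}y^{-2/3}, \qquad (f^{-1})'(x^3) = \tfrac {1}{3}x^{-2} = (3x^2)^{-1} = f'(x)^{-1} \]

At \(x = 0\) we have \(f'(0) = 0\), and correspondingly \(f^{-1}\) is not differentiable at \(0\), so the hypothesis on the inverse cannot be dropped.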

Corollary 4.6 (Mean Value Theorem Many Variables). If \(f:\RR ^n \to \RR \) is continuous at \(a,b\) and differentiable on a neighborhood which contains the line segment strictly between \(a\) and \(b\), then there is a \(c\) on this line segment satisfying \(f(b)-f(a) = f'(c)(b-a)\).

Proof. Consider the function \(g:[0,1] \to \RR ^n\), \(g(t) = a+t(b-a)\), which parametrizes the straight line from \(a\) to \(b\). By Proposition 4.2 applied to \(f\circ g\), there is a \(t_0 \in (0,1)\) with \((f\circ g)'(t_0) = f(b)-f(a)\), and we can use the chain rule to get \((f\circ g)'(t_0)= f'(g(t_0))g'(t_0)=f'(g(t_0))(b-a)\). Now take \(c = g(t_0)\). □

Corollary 4.7 (Mean Value Inequality). If \(f:\RR ^m \to \RR ^n\) is continuous at \(a,b\) and differentiable on a neighborhood which contains the line segment strictly between \(a\) and \(b\), then there is a \(c\) on the line segment such that \(\Vert f(b)-f(a)\Vert _2 \leq \Vert f'(c)\Vert _o \Vert b-a\Vert _2\).

Proof. If \(f(b) = f(a)\) there is nothing to prove. Otherwise, let \(u\) be the unit vector in the direction of \(f(b)-f(a)\). Then let \(U(x) = u \cdot x\), so \(U\circ f\) is a real valued function to which we can apply Corollary 4.6. We get \((U\circ f)(b)-(U\circ f)(a) = (U\circ f)'(c)(b-a) = U'(f(c))f'(c)(b-a) = u \cdot (f'(c)(b-a))\). Then we have, via the Cauchy-Schwarz inequality,

\[ \Vert f(b)-f(a)\Vert _2 = |(U\circ f)(b)-(U\circ f)(a)| = |u \cdot (f'(c)(b-a))| \]

\[ \leq \Vert u \Vert _2\Vert f'(c)(b-a)\Vert _2 \leq \Vert f'(c)\Vert _o\Vert b-a\Vert _2 \]

□
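
An example (not from the original notes) shows why only an inequality is available for vector-valued maps. Let \(f:\RR \to \RR ^2\), \(f(t) = (\cos t,\sin t)\), with \(a = 0\) and \(b = 2\pi \). Then

\[ f(b)-f(a) = 0, \qquad f'(t) = (-\sin t,\cos t), \qquad \Vert f'(t)\Vert _o = 1 \text { for all } t \]

so no \(c\) satisfies \(f(b)-f(a) = f'(c)(b-a)\), while the inequality \(0 \leq 1\cdot 2\pi \) of Corollary 4.7 holds for every \(c\).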

Corollary 4.8. If \(f:\RR ^m \to \RR ^n\) is differentiable in a convex bounded open set \(U\), and \(\Vert f'(x)\Vert _o\) is bounded on \(U\), then \(f\) is uniformly continuous on \(U\).

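Proof. Let \(M\) be a bound for \(\Vert f'(x)\Vert _o\) on \(U\). Since \(U\) is convex, the line segment between any \(x,y \in U\) lies in \(U\), so Corollary 4.7 gives \(\Vert f(x)-f(y) \Vert _2 \leq M\Vert x-y\Vert _2\). □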

Corollary 4.9 (L’Hôpital’s Rule). If \(f,g:\RR \to \RR \) are continuous, differentiable in a deleted neighborhood of \(0\), \(f(0) = g(0) = 0\), and the limits \(\lim _{x \to 0}f'(x)\) and \(\lim _{x \to 0}g'(x)\) exist with \(\lim _{x \to 0}g'(x) \neq 0\), then

\[ \lim _{x \to 0}\frac {f(x)}{g(x)} = \lim _{x \to 0}\frac {f'(x)}{g'(x)} \]

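Proof. Choose \(\delta \) small enough that \(g'\) is nonzero on the deleted interval \((-\delta ,\delta )\setminus \{0\}\); then \(g(x) \neq 0\) there as well, by Proposition 4.2. For such \(x\), use Proposition 4.2 on the interval between \(0\) and \(x\) to get \(\frac {f(x)}{g(x)} = \frac {f(x)-f(0)}{g(x)-g(0)} = \frac {f'(x_0)}{g'(x_0)}\) for some \(x_0\) strictly between \(0\) and \(x\). Since \(x_0 \to 0\) as \(x \to 0\), we are done. □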
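
For instance (an illustrative choice), with \(f(x) = e^x-1\) and \(g(x) = \sin x\) we have \(f(0) = g(0) = 0\) and \(\lim _{x \to 0}g'(x) = \lim _{x \to 0}\cos x = 1 \neq 0\), so

\[ \lim _{x \to 0}\frac {e^x-1}{\sin x} = \lim _{x \to 0}\frac {e^x}{\cos x} = 1 \]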

Theorem 4.10. If all partial derivatives of \(f:\FF ^m \to \FF ^n\) exist near a point \(a\) and are continuous at \(a\), then \(f'(a)\) exists.

Proof. It suffices to prove this when \(n=1\) as being differentiable is equivalent to being differentiable in each component. Now fix an \(h\) in a small enough ball. We have by Proposition 4.2:

\[f(a_1+h_1,\dots ,a_i+h_i,a_{i+1},\dots ,a_m) - f(a_1+h_1,\dots ,a_{i-1}+h_{i-1},a_{i},\dots ,a_m)\]

\[= h_i\partial _if(a_1+h_1,\dots ,a_{i-1}+h_{i-1}, \alpha _i,a_{i+1},\dots ,a_m)\]

where \(\alpha _i\) lies between \(a_i\) and \(a_i+h_i\). Summing over \(i\), the left-hand sides telescope, so we have

\[f(a+h)-f(a) = \sum _ih_i\partial _if(a_1+h_1,\dots ,a_{i-1}+h_{i-1},\alpha _i,a_{i+1},\dots ,a_m) \]

\[=\sum _ih_i\partial _if(a) + \ee \]

where, using \(|h_i| \leq \Vert h\Vert \),

\[\frac {\Vert \ee \Vert }{\Vert h\Vert } \leq \sum _i\Vert \partial _if(a_1+h_1,\dots ,a_{i-1}+h_{i-1},\alpha _i,a_{i+1},\dots ,a_m)-\partial _if(a)\Vert \]

which tends to \(0\) as \(h \to 0\) by continuity of the partial derivatives. □
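
A standard example (added here for illustration) shows that existence of the partial derivatives alone is not enough. Let \(f:\RR ^2 \to \RR \) be given by \(f(x,y) = \frac {xy}{x^2+y^2}\) for \((x,y) \neq 0\) and \(f(0,0) = 0\). Since \(f\) vanishes on both axes, \(\partial _1f(0,0) = \partial _2f(0,0) = 0\), but \(f(t,t) = \frac {1}{2}\) for \(t \neq 0\), so \(f\) is not even continuous at the origin. Consistently with the theorem, the partial derivatives are not continuous there:

\[ \partial _1f(x,y) = \frac {y(y^2-x^2)}{(x^2+y^2)^2} \text { for } (x,y) \neq 0, \qquad \partial _1f(0,y) = \frac {1}{y} \]

which is unbounded as \(y \to 0\).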

Proposition 4.11. If \(f:\FF ^2\to \FF \) is continuous near a point \(a\), has partials \(\partial _1f,\partial _2f,\partial _1\partial _2f,\partial _2\partial _1f\) near \(a\), and the mixed partials \(\partial _1\partial _2f,\partial _2\partial _1f\) are continuous near \(a\), then they are equal at \(a\).

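Proof. Write \(a = (x,y)\) and take small \(h,k \neq 0\). By Proposition 4.2, applied twice in each order, we have

\[hk\,\partial _1\partial _2f(x'',y'') = k\big (\partial _2f(x+h,y'')-\partial _2f(x,y'')\big ) \]

\[= f(x+h,y+k)-f(x+h,y)-f(x,y+k)+f(x,y)\]

\[ = f(x+h,y+k)-f(x,y+k)-f(x+h,y)+f(x,y)\]

\[ = h\big (\partial _1f(x',y+k)-\partial _1f(x',y)\big ) = hk\,\partial _2\partial _1f(x',y') \]

where \(x',x''\) lie between \(x\) and \(x+h\), and \(y',y''\) lie between \(y\) and \(y+k\). Dividing by \(hk\) and letting \(k,h \to 0\), by continuity of the mixed partials we are done. □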
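
The continuity hypothesis matters, as a standard example (added here for illustration) shows. Let \(f(x,y) = \frac {xy(x^2-y^2)}{x^2+y^2}\) for \((x,y) \neq 0\) and \(f(0,0) = 0\). Computing the first partials along the axes from the difference quotients gives

\[ \partial _1f(0,y) = \lim _{x \to 0}\frac {f(x,y)}{x} = -y, \qquad \partial _2f(x,0) = \lim _{y \to 0}\frac {f(x,y)}{y} = x \]

and therefore

\[ \partial _2\partial _1f(0,0) = -1 \neq 1 = \partial _1\partial _2f(0,0) \]

Here the mixed partials exist near the origin but are not continuous at it.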

Proposition 4.12 (One-variable Taylor Expansion). If \(f:\RR \to \RR \) has continuous derivatives up to order \(k+1\) near \(a\), then for small enough \(h\) we have

\[f(a+h) = \sum _{i=0}^k\frac {1}{i!}f^{(i)}(a)h^i + \frac {1}{(k+1)!}f^{(k+1)}(\alpha )h^{k+1} \]

with \(\alpha \) strictly between \(a\) and \(a+h\).

Proof. Let \(R(x) = f(a+x)-\sum _{i=0}^k\frac {1}{i!}f^{(i)}(a)x^i - \frac {1}{(k+1)!}cx^{k+1}\), where \(c\) is chosen so that \(R(h)=0\). We want to show \(c = f^{(k+1)}(\alpha )\) as above. Note that \(R^{(j)}(0) = 0\) for \(0 \leq j \leq k\). Now use Proposition 4.2 repeatedly: first on \(R(0)=R(h) = 0\) to get an \(a_1\) between \(0\) and \(h\) where \(R'(a_1)=0\), then on \(R'(0)=R'(a_1)=0\) to get an \(a_2\) where \(R''(a_2) = 0\), and so on. Finally we obtain an \(a_{k+1}\) with \(R^{(k+1)}(a_{k+1}) = 0\), and computing the \((k+1)^{th}\) derivative yields \(0=R^{(k+1)}(a_{k+1}) = f^{(k+1)}(a+a_{k+1})-c\), so \(\alpha = a+a_{k+1}\) works and we are done. □
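
As an illustration (the function is chosen only as an example), take \(f(x) = e^x\), \(a = 0\) and \(k = 2\). Every derivative of \(f\) is \(e^x\), so the expansion reads

\[ e^h = 1 + h + \frac {h^2}{2} + \frac {e^{\alpha }}{6}h^3 \]

for some \(\alpha \) between \(0\) and \(h\). In particular, for \(0 < h \leq 1\) the error of the quadratic approximation is at most \(\frac {e}{6}h^3\).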

Corollary 4.13 (Many-variable Taylor Expansion). If \(f:\RR ^n \to \RR \) has continuous partial derivatives up to order \(k+1\) near \(a\), then for small enough \(h\) we have

\[f(a+h) = \sum _{i=0}^k\frac {1}{i!}\sum _{j_1,\dots ,j_i = 1}^n\partial _{j_1}\dots \partial _{j_i}f(a)\prod _{l=1}^ih_{j_l} + \frac {1}{(k+1)!}\sum _{j_1,\dots ,j_{k+1} = 1}^n\partial _{j_1}\dots \partial _{j_{k+1}}f(\alpha )\prod _{l=1}^{k+1}h_{j_l}\]

with \(\alpha \) in the line segment between \(a\) and \(a+h\).

Proof. Apply the one-variable case to the composite of \(f\) with the line between \(a\) and \(a+h\), that is, to \(t \mapsto f(a+th)\) on \([0,1]\), using the chain rule to compute its derivatives. □
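
To make the reduction explicit in the lowest nontrivial case, set \(\phi (t) = f(a+th)\) (a name introduced only for this sketch). The chain rule gives

\[ \phi '(t) = \sum _{j=1}^n\partial _jf(a+th)h_j, \qquad \phi ''(t) = \sum _{j_1,j_2=1}^n\partial _{j_1}\partial _{j_2}f(a+th)h_{j_1}h_{j_2} \]

and the one-variable expansion of \(\phi \) around \(0\) with \(k = 1\), evaluated at \(t = 1\), yields for some \(\theta \in (0,1)\)

\[ f(a+h) = f(a) + \sum _{j=1}^n\partial _jf(a)h_j + \frac {1}{2}\sum _{j_1,j_2=1}^n\partial _{j_1}\partial _{j_2}f(a+\theta h)h_{j_1}h_{j_2} \]

so that \(\alpha = a+\theta h\) lies on the segment between \(a\) and \(a+h\). Each further derivative of \(\phi \) contributes one more partial derivative and one more factor of a component of \(h\).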