Analysis Theorems

6 Endomorphisms

Lemma 6.1 (Adjugates). Let \(M\) be a free \(R\)-module of finite rank \(n\). Given \(f \in \End _R(M)\), there is an element \(\adj (f) \in \End _R(M)\) such that \(f\adj (f) = \adj (f)f=\det (f)1_M\).

Proof. It suffices to prove this in the universal ring \(R=\ZZ [x_{ij}]\), \(1\leq i,j\leq n\), with an endomorphism \(f\) given by the matrix \((x_{ij})\). We have a natural isomorphism \(j:M \cong \Hom (\wedge ^{n-1}M,\wedge ^{n}M)\) so that \(j(x_1)(x_2\wedge \dots \wedge x_n) = x_1\wedge \dots \wedge x_n\). To get \(\adj (f)\), take the endomorphism corresponding under \(j\) to \(\Hom (\wedge ^{n-1}f,1_{\wedge ^{n}M})\). Now for each \(x_1\), \(j(\adj (f)f(x_1))\) takes \(x_2\wedge \dots \wedge x_n \mapsto f(x_2)\wedge \dots \wedge f(x_n) \mapsto f(x_1)\wedge \dots \wedge f(x_n) = \det (f)\,x_1\wedge \dots \wedge x_n\), which is the value of \(j(\det (f)x_1)\), so \(\adj (f)f=\det (f)1_M\). Note that since \(\det (f)\) is nonzero, \(\adj (f)\) has nonzero determinant, so is injective as \(R\) is an integral domain. Now we have \(\adj (f)f\adj (f)=\adj (f)\det (f)\) so by injectivity \(f\adj (f) = \det (f)1_M\). □
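
As a sanity check, the rank \(2\) case can be written out by hand. The entries \(a,b,c,d \in R\) are illustrative names, not notation from the lemma, and the sketch assumes the construction above specializes to the familiar \(2\times 2\) adjugate:

\[ f=\begin {pmatrix} a&b\\ c&d \end {pmatrix},\qquad \adj (f)=\begin {pmatrix} d&-b\\ -c&a \end {pmatrix},\qquad f\adj (f)=\adj (f)f=\begin {pmatrix} ad-bc&0\\ 0&ad-bc \end {pmatrix}=\det (f)1_M. \]

Both products agree because \(R\) is commutative, and no invertibility of \(\det (f)\) is needed.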

Lemma 6.2. If \(Av = \lambda v\) then for any polynomial \(p\), \(p(A)v = p(\lambda )v\).

Proof. By induction \(A^nv=\lambda ^nv\) for all \(n\geq 0\), and \(p(A)v=p(\lambda )v\) is a linear combination of these identities. □

Theorem 6.3 (Cayley-Hamilton). Let \(M\) be a finitely generated free \(R\)-module, and \(f \in \End (M)\). Then \(\chi _f(f)\equiv 0\).

Proof. \(f\) turns \(M\) into an \(R[T]\)-module, and we can extend scalars via \(R' = R[x]\otimes _R R[T]\) and \(M' = R[x]\otimes _R M\). Then \(\chi _f(x)\) is the determinant of \(y = x\otimes 1 - 1\otimes T \in R'\), viewed as an \(R[x]\)-linear endomorphism of the free \(R[x]\)-module \(M'\). \(\adj (y)\) commutes with \(y\) by Lemma 6.1, and since \(\adj (y)\) is \(R[x]\)-linear it commutes with \(x\otimes 1\), hence with \(1\otimes T = x\otimes 1 - y\) as well, so it commutes with all of \(R'\). Now we look at \(R'/(y),M'/(y)M'\), which substitutes \(T\) for \(x\), and note that \(\adj (y)\) has a well-defined action on the quotient as it commutes with \(R'\). \(M'/(y)M'\cong M\) since \(g(x)\otimes m = (g(x)\otimes 1)(1\otimes m) \equiv (1 \otimes g(T))(1\otimes m) = 1 \otimes g(T)m\) modulo \((y)M'\). Then since \(y\) annihilates \(M'/(y)M'\), \(y\adj (y)\) does as well, but this is multiplication by \(\chi _f(x)\otimes 1\equiv 1\otimes \chi _f(T)\), which is the action of \(\chi _f(f)\) on \(M\). □
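
A small numerical instance may make the statement concrete. The matrix \(A\) below is an arbitrary illustrative choice over \(\ZZ \), with \(\chi _A(x)=x^2-5x-2\):

\[ A=\begin {pmatrix} 1&2\\ 3&4 \end {pmatrix},\qquad A^2=\begin {pmatrix} 7&10\\ 15&22 \end {pmatrix},\qquad A^2-5A-2I=\begin {pmatrix} 7-5-2&10-10\\ 15-15&22-20-2 \end {pmatrix}=0. \]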

Corollary 6.4 (Determinant Trick). If \(f\) is an endomorphism of \(M\), an \(R\)-module generated by \(n\) elements, and \(fM\subset IM\) for an ideal \(I \subset R\), then \(f\) satisfies \(f^n+a_{1}f^{n-1}+\dots +a_n \equiv 0\) where \(a_i \in I^i\).

Proof. Choose a surjection \(R^n \to M\). By projectivity of free modules, \(f\) lifts to an endomorphism of \(R^n\) whose matrix has entries in \(I\), so it suffices to consider a free module, but then this follows from Theorem 6.3 by noting that the coefficients of \(\chi _f(x)\) are of the form described. □
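
For \(n=2\) the shape of the coefficients can be read off directly. Here \(b_{ij} \in I\) are hypothetical names for the entries of a matrix representing a lift of \(f\); they are not notation from the corollary:

\[ \chi _f(x)=\det \begin {pmatrix} x-b_{11}&-b_{12}\\ -b_{21}&x-b_{22} \end {pmatrix}=x^2-(b_{11}+b_{22})x+(b_{11}b_{22}-b_{12}b_{21}), \]

so \(a_1=-(b_{11}+b_{22}) \in I\) and \(a_2=b_{11}b_{22}-b_{12}b_{21} \in I^2\). In general \(a_i\) is, up to sign, a sum of \(i\times i\) minors, each a sum of products of \(i\) entries.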

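Corollary 6.5 (Nakayama’s Lemma). If \(M\) is a finitely generated \(R\)-module and \(IM=M\) for an ideal \(I \subset R\), then there is an \(a\equiv 1\pmod {I}\) such that \(aM = 0\).

Proof. Apply Corollary 6.4 to the identity map \(1_M\) and use the fact that \(1_MM\subset IM\). This gives \((1+a_1+\dots +a_n)M=0\) with \(a_i \in I^i \subset I\), so we can take \(a=1+a_1+\dots +a_n\). □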
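
A concrete instance over a non-local ring, chosen only for illustration, shows that \(M\) need not vanish in this generality:

\[ R=\ZZ ,\qquad I=(2),\qquad M=\ZZ /3\ZZ ,\qquad IM=2M=M,\qquad a=3\equiv 1\pmod {2},\qquad aM=0. \]

Here \(2\) acts invertibly on \(M\), so \(IM=M\) even though \(M\neq 0\).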

Corollary 6.6 (Nakayama’s Lemma). If \(M\) is a finitely generated module over a local ring \(R\) with maximal ideal \(m\) and \(mM=M\), then \(M=0\).

Proof. By Corollary 6.5 \(aM=0\) for some \(a\equiv 1\pmod {m}\), but such an \(a\) lies outside the maximal ideal and so is a unit, hence \(M=aM=0\). □

Corollary 6.7 (Nakayama’s Lemma). If \(M\) is a finitely generated module over a local ring \(R\) with maximal ideal \(m\), and \(N\) is a submodule with \(M = N+mM\), then \(M=N\).

Proof. Apply Corollary 6.6 to \(M/N\). □

Corollary 6.8 (Nakayama’s Lemma). If \(M\) is a finitely generated module over a local ring \(R\) with maximal ideal \(m\) and the images of \(m_1,\dots ,m_n\) generate \(M/mM\), then \(m_1,\dots ,m_n\) generate \(M\).

Proof. Apply Corollary 6.7 with \(N = \sum _1^nm_iR\), the submodule generated by the \(m_i\). □

Note that if the ring is not local, we can replace \(m\) by the Jacobson radical and Corollaries 6.6 to 6.8 still hold, since any \(a\equiv 1\) modulo the Jacobson radical is a unit.

Proposition 6.9. Every endomorphism \(f:V\to V\) on a finite dimensional vector space \(V\) over \(F\) has a minimal polynomial \(\mu _f\), satisfying \(\mu _f(f)=0\), \(g(f)=0 \implies \mu _f|g\), and its roots are the eigenvalues.

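Proof. Viewing \(V\) as an \(F[T]\)-module, since \(F[T]\) is a PID, everything is immediate except the last part. If \(\lambda \) is an eigenvalue with \(f(v)=\lambda v\) and \(v\neq 0\), then by Lemma 6.2 \(\mu _f(f)(v)=\mu _f(\lambda )v\), but the LHS is \(0\) and \(v\) is not, so \(\mu _f(\lambda )=0\). Conversely, if \(\mu _f(\lambda )=0\), write \(\mu _f(x)=(x-\lambda )g(x)\); by minimality \(g(f)\neq 0\), so some \(w=g(f)u\) is nonzero and satisfies \((f-\lambda )w=\mu _f(f)u=0\). □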

Proposition 6.10. After a change of basis, a submodule \(M\) of \(R^n\), a finitely generated free module over a PID \(R\), is of the form \(\bigoplus _1^n x_ir_iR\) with \(x_i|x_{i+1} \in R\) and \(\bigoplus _1^n r_iR = R^n\) (the \(r_i\) are the new basis). This representation is unique up to units, and the \(x_i\) are called the invariant factors.

Proof. Choose a map \(f_1:R^n\to R\) for which the image of \(M\) is maximal among such images (this uses that \(R\) is a PID, hence Noetherian). Let \(y_1 \in M\) be an element sent to a generator \(x_1\) of the image, which WLOG is nonzero. Now if \(\pi _i\) is the \(i^{th}\) projection, then \(x_1|\pi _i(y_1)\) for all \(i\) by maximality of \(f_1\), so we can let \(r_1 = \frac {y_1}{x_1}\). Now \(r_1\) gets sent to \(1\) by \(f_1\), so we can project orthogonal to \(r_1\) via a section \(s_1:R \to R^n\) taking \(1 \mapsto r_1\). Our projection is \(o_1(x) = x-s_1\circ f_1(x)\). This section gives \(R^n = r_1R\oplus o_1(R^n)\). The projection to \(o_1(R^n)\) is surjective, and by removing an appropriate generator and localizing at \((0)\), we see that our new module must be free of rank \(n-1\). Now we apply induction to \(o_1(M) \subset o_1(R^n)\) to get \(y_2,\dots ,y_n\) and \(r_2,\dots ,r_n\), and by looking at \(f_1\) we get \(x_1|x_2\). Uniqueness also follows from induction. □

Corollary 6.11 (Smith Canonical Form). A map \(f:M \to N\) between finitely generated free modules over a PID of ranks \(n\) and \(m\) has a Smith Canonical Form, i.e. after suitable changes of basis of \(M\) and \(N\) it is represented by a matrix of the form

\[\begin {pmatrix} x_1 & &\\ & \ddots & \\ & & x_{\min {(n,m)}} \end {pmatrix}\]

with \(x_i|x_{i+1}\). This representation is unique up to units.

Proof. The image is a submodule of \(R^m\) so this is a reformulation of Proposition 6.10. □
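
A worked reduction over \(R=\ZZ \) may help; the matrix is an arbitrary illustrative choice. Subtract \(3\) times the first row from the second, then \(2\) times the first column from the second, then multiply the second row by the unit \(-1\):

\[ \begin {pmatrix} 2&4\\ 6&8 \end {pmatrix}\to \begin {pmatrix} 2&4\\ 0&-4 \end {pmatrix}\to \begin {pmatrix} 2&0\\ 0&-4 \end {pmatrix}\to \begin {pmatrix} 2&0\\ 0&4 \end {pmatrix}, \]

so \(x_1=2\) and \(x_2=4\) with \(x_1|x_2\). As a consistency check, \(x_1\) is the gcd of the entries and \(x_1x_2=8\) agrees with the determinant \(-8\) up to a unit.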

Corollary 6.12 (Finitely Generated Modules over PIDs). A finitely generated module over a PID is of the form \(R^m \oplus \bigoplus _1^nR/(d_i)\) where \(d_i|d_{i+1}\). Moreover \(m\) is unique, and the \(d_i\) are unique up to units.

Proof. A finitely generated module over a PID is a quotient of a finite rank free module by a submodule, and the submodule has the form \(\bigoplus x_ir_iR\) according to Proposition 6.10, so the quotient is \(\bigoplus R/(x_i)\). The summands with \(x_i=0\) give \(R^m\), and those with \(x_i\) a unit vanish. □

Corollary 6.13 (Rational Canonical Form). Every endomorphism \(f:V\to V\) of a finite dimensional vector space over \(F\) has a unique Rational Canonical Form, i.e. is represented by a matrix of the form

\[\bigoplus _{i=1}^n\begin {pmatrix} 0&0&0&\dots &-a_0\\ 1&0&0&\dots &-a_1\\ 0&1&0&\dots &-a_2\\ \vdots &\vdots &\vdots &\ddots &\vdots \\ 0&0&0&\dots &-a_{k_i-1} \end {pmatrix} \]

where the monic polynomials (invariant factors) \(f_i(x)=x^{k_i}+\sum _{j=0}^{k_i-1}a_jx^j\), whose coefficients \(a_j\) depend on \(i\), satisfy \(f_i|f_{i+1}\). \(f_n\) is \(\mu _f(x)\) and \(\prod _if_i\) is \(\chi _f(x)\).

Proof. View \(V\) as a module over \(F[T]\), and we get \(V \cong \bigoplus _{i=1}^nF[T]/(f_i)\) from Corollary 6.12. Then we are done by picking \(1,T,\dots ,T^{k_i-1}\) as a basis of each summand. □
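
To illustrate, take the diagonal matrix \(A\) below acting on \(V=F^3\); this choice is only an example. As an \(F[T]\)-module \(V \cong F[T]/(T-1)\oplus F[T]/(T-1)\oplus F[T]/(T-2) \cong F[T]/(T-1)\oplus F[T]/((T-1)(T-2))\), so the invariant factors are \(f_1=x-1\) and \(f_2=x^2-3x+2\):

\[ A=\begin {pmatrix} 1&0&0\\ 0&1&0\\ 0&0&2 \end {pmatrix}\quad \leadsto \quad \begin {pmatrix} 1 \end {pmatrix}\oplus \begin {pmatrix} 0&-2\\ 1&3 \end {pmatrix}. \]

Here \(\mu _A=f_2=(x-1)(x-2)\) and \(\chi _A=f_1f_2=(x-1)^2(x-2)\).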

Corollary 6.14. If \(A\) is a matrix over \(F\), the invariant factors of \(A\) can be computed by finding the Smith Canonical Form of \(xI-A\).

Proof. If \(V\) has dimension \(n\), we can consider the \(F[T]\)-module homomorphism \(F[T]^n \to V\) mapping the generators \(r_i\) onto an \(F\)-basis \(v_i\) of \(V\), which is surjective. Writing \(Av_j=\sum _{i=1}^na_{ij}v_i\), the elements \(y_j=Tr_j-\sum _{i=1}^na_{ij}r_i\) are in the kernel, but note that \(\sum _jy_jF[T] + \sum _ir_iF = \sum _ir_iF[T] = F[T]^n\) and that \(\sum _ir_iF\) maps isomorphically onto \(V\), so the \(y_j\) actually generate the kernel. The \(y_j\) have the relations matrix \(xI-A^{t}\), which has the same Smith Canonical Form as \(xI-A\). By Corollary 6.11 after a change of basis it is in Smith Canonical Form with invariant factors \(f_1,\dots ,f_n\), so the kernel is of this form for an appropriate set of generators, and \(V \cong \bigoplus _{i=1}^nF[T]/(f_i)\). □
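
As a sketch of the computation, take the illustrative matrix \(A\) with rows \((2,1)\) and \((0,2)\). Swap the columns of \(xI-A\), add \((x-2)\) times the first row to the second, add \((x-2)\) times the first column to the second, and rescale by the unit \(-1\):

\[ \begin {pmatrix} x-2&-1\\ 0&x-2 \end {pmatrix}\to \begin {pmatrix} -1&x-2\\ x-2&0 \end {pmatrix}\to \begin {pmatrix} -1&x-2\\ 0&(x-2)^2 \end {pmatrix}\to \begin {pmatrix} -1&0\\ 0&(x-2)^2 \end {pmatrix}\to \begin {pmatrix} 1&0\\ 0&(x-2)^2 \end {pmatrix}. \]

The only nontrivial invariant factor is \((x-2)^2\), so \(\mu _A=\chi _A=(x-2)^2\). For comparison, \(A=2I\) gives \(xI-A\) already in Smith Canonical Form with invariant factors \(x-2,x-2\).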

Note that Corollary 6.12 also can be represented as \(R^m \oplus \bigoplus R/(p^i)\) where \(p\) varies over primes, and similarly Corollary 6.13 has a representation in this way.
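
For instance, over \(R=\ZZ \) the Chinese Remainder Theorem converts between the two descriptions. The module below is an arbitrary example with invariant factors \(2|12\):

\[ \ZZ /2\oplus \ZZ /12\cong \ZZ /2\oplus \ZZ /4\oplus \ZZ /3, \]

where the right-hand side lists the prime powers \(2,2^2,3\). Going back, one collects the largest power of each prime into the last invariant factor \(4\cdot 3=12\), then the next largest powers into the previous one, and so on.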

Corollary 6.15 (Jordan Canonical Form). Every endomorphism \(f:V\to V\) of a finite dimensional vector space over \(F\) has a Jordan Canonical Form after extending scalars to an algebraic closure, unique up to reordering the blocks, i.e. is represented by a matrix of the form

\[\bigoplus _{i=1}^n \begin {pmatrix} \lambda _i&1&0&\dots &0\\ 0&\lambda _i&1&\dots &0\\ 0&0&\lambda _i&\dots &0\\ \vdots &\vdots &\vdots &\ddots &\vdots \\ 0&0&0&\dots &\lambda _i \end {pmatrix} \]

Each summand is called a Jordan block.

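Proof. Over an algebraic closure every irreducible polynomial is linear, so as an \(F[T]\)-module we can decompose \(V \cong \bigoplus _{i=1}^n F[T]/(T-\lambda _i)^{j_i}\), and choose as an ordered basis for each summand \((T-\lambda _i)^{j_i-1},\dots ,T-\lambda _i,1\), so that \(T\) sends each basis vector to \(\lambda _i\) times itself plus the previous one. □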
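
The smallest nontrivial block can be checked by hand. In \(F[T]/(T-\lambda )^2\), take the ordered basis \(e_1=T-\lambda \), \(e_2=1\); the names \(e_1,e_2\) are introduced only for this example:

\[ Te_1=\lambda e_1+(T-\lambda )^2=\lambda e_1,\qquad Te_2=\lambda \cdot 1+(T-\lambda )=e_1+\lambda e_2,\qquad \text {so } T \text { acts by } \begin {pmatrix} \lambda &1\\ 0&\lambda \end {pmatrix}. \]

Ordering the basis the other way round would give the transpose, with the \(1\) below the diagonal.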

Corollary 6.16 (Diagonalization Theorem). A matrix is diagonalizable over an algebraic closure iff its Jordan Canonical Form is diagonal iff its minimal polynomial is separable.

Proof. The first equivalence follows from uniqueness of the Jordan Canonical Form. For the second, the minimal polynomial is the LCM of the minimal polynomials \((x-\lambda _i)^{j_i}\) of the Jordan blocks, so it has distinct roots iff every block has size \(j_i=1\). □
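
Two illustrative \(2\times 2\) matrices show both sides of the criterion:

\[ A=\begin {pmatrix} 2&1\\ 0&2 \end {pmatrix}:\ A-2I\neq 0,\ (A-2I)^2=0,\ \mu _A=(x-2)^2;\qquad B=\begin {pmatrix} 1&1\\ 0&2 \end {pmatrix}:\ (B-I)(B-2I)=\begin {pmatrix} 0&1\\ 0&1 \end {pmatrix}\begin {pmatrix} -1&1\\ 0&0 \end {pmatrix}=0,\ \mu _B=(x-1)(x-2). \]

So \(A\) has a repeated root in its minimal polynomial and is not diagonalizable, while \(B\) has distinct roots and is diagonalizable.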

Proposition 6.17. Commuting diagonalizable endomorphisms \(A,B\) are simultaneously diagonalizable.

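Proof. If \(v\) has eigenvalue \(\lambda \) for \(A\), then \(A(Bv) = BAv = \lambda Bv\), so \(A\)’s \(\lambda \)-eigenspace is \(B\)-invariant. The restriction of \(B\) to each eigenspace of \(A\) is diagonalizable, since its minimal polynomial divides that of \(B\), so choosing an eigenbasis of \(B\) inside each eigenspace of \(A\) simultaneously diagonalizes both. □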
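
A small example over \(\mathbb {Q}\), with matrices chosen only for illustration, shows the procedure. The two matrices commute, and \(B\) preserves the \(1\)-eigenspace of \(A\) spanned by the first two coordinate vectors:

\[ A=\begin {pmatrix} 1&0&0\\ 0&1&0\\ 0&0&2 \end {pmatrix},\qquad B=\begin {pmatrix} 0&1&0\\ 1&0&0\\ 0&0&3 \end {pmatrix},\qquad AB=BA=\begin {pmatrix} 0&1&0\\ 1&0&0\\ 0&0&6 \end {pmatrix}. \]

Diagonalizing \(B\) inside each eigenspace of \(A\) gives the common eigenbasis \((1,1,0),(1,-1,0),(0,0,1)\), with eigenvalue pairs \((1,1),(1,-1),(2,3)\) for \((A,B)\).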