1 Introduction
A real algebraic integer $\lambda \geq 1$ is Perron if $\lambda $ is strictly larger than all its other Galois conjugates in absolute value. In what follows, all matrices are considered to be square size. For a real matrix A, by $A>0$ we mean that all entries of A are positive. A non-negative real matrix A is primitive (or aperiodic) if there is some natural number k such that $A^k>0$ . A non-negative real matrix is irreducible if for any two indices i and j, there is some natural number $k=k(i,j)$ such that $(A^k)_{ij}>0$ . The Perron–Frobenius theorem implies that:
-
(1) the spectral radius of any integral primitive matrix is a Perron number; and
-
(2) the spectral radius of any integral irreducible matrix is equal to the nth root of a Perron number where n is the period of the irreducible matrix. Moreover, the spectral radius of an integral non-negative non-nilpotent matrix is equal to the spectral radius of an integral irreducible one.
Lind [Reference LindLin84, Theorem 1] proved a converse to the ‘integer’ Perron–Frobenius theorem, namely for every Perron number $\lambda $ there is an integral primitive matrix with spectral radius equal to $\lambda $ . It readily follows that for any Perron number $\lambda $ and any natural number n, there is an integral irreducible matrix with spectral radius equal to $\sqrt [n]{\lambda }$ ; see [Reference LindLin84, Theorem 3].
Associated to a non-negative and non-degenerate (that is, with no zero rows or columns) integral matrix $A = [a_{ij}]$ with spectral radius $\lambda $ is a shift of finite type with entropy equal to $\log (\lambda )$ , which is defined as the edge shift on a directed finite graph $G_A$ as follows. The graph $G_A$ has one vertex for each row of the matrix A, and there are exactly $a_{ij}$ oriented edges from the vertex $v_i$ to the vertex $v_j$ . In particular, the dimension of the matrix A (that is, its number of rows or columns) is equal to the number of vertices of the graph $G_A$ . The matrix is primitive if and only if the associated shift of finite type is topologically mixing. If the matrix is irreducible, then the corresponding shift of finite type is called irreducible. Irreducible shifts of finite type are those which are topologically transitive. See e.g. [Reference Lind and MarcusLM21].
Given a Perron number $\lambda $ , the Perron–Frobenius degree of $\lambda $ , $d_{\mathrm {PF}}(\lambda )$ , is the least dimension of an integral primitive matrix with spectral radius equal to $\lambda $ . Clearly we have
where d denotes the algebraic degree of $\lambda $ as an algebraic integer. It is easy to see that the equality happens if $\lambda $ is quadratic (see [Reference YazdiYaz21, Remark 3.1]), so we consider the case of ${d \geq 3}$ . Lind observed that if the trace (that is, the sum of Galois conjugates) of $\lambda $ is negative, then $d_{\mathrm {PF}}(\lambda )$ is strictly larger than d; see [Reference LindLin84, p. 289]. In [Reference YazdiYaz21], using an idea of Lind, we gave a lower bound for $d_{\mathrm {PF}}(\lambda )$ in terms of the layout of the two largest (in absolute value) Galois conjugates of $\lambda $ in the complex plane. As a corollary, it was shown that there are examples of cubic Perron numbers with arbitrarily large Perron–Frobenius degrees, a result previously known to Lind, McMullen, and Thurston, although unpublished.
Definition 1.1. Given a Perron number $\lambda $ , define the spectral ratio of $\lambda $ as $\max _i ({|\lambda _i|}/{\lambda })$ , where $\lambda _i \neq \lambda $ are the remaining Galois conjugates of $\lambda $ .
Notation 1.2. For a Perron number $\lambda $ , let $d_{\mathrm {PF}}^{\mathrm {irr}}(\lambda )$ be the smallest dimension of an integral irreducible matrix with spectral radius $\lambda $ .
In this paper, we give an explicit upper bound for $d_{\mathrm {PF}}^{\mathrm {irr}}(\lambda )$ in terms of the algebraic degree of $\lambda $ , the spectral ratio of $\lambda $ , and arithmetic information about the ring of integers $\mathcal {O}_{\mathbb {K}}$ of the number field $\mathbb {K}:= \mathbb {Q}(\lambda )$ . This arithmetic quantity, which we call the minimal Hermite-like thickness and denote it by $\tau _{\min }(\mathcal {O}_{\mathbb {K}})$ , was previously defined by Bayer Fluckiger [Reference Bayer FluckigerBay06] in relation to Minkowski’s conjecture; see Definition 3.1. Intuitively, once an inner product is chosen on $\mathbb {R}^d$ , the Hermite-like thickness is defined as the square of the covering radius, normalized properly, for the inclusion of the lattice $\mathcal {O}_{\mathbb {K}}$ in $\mathbb {R}^d$ . The minimal Hermite-like thickness is then defined by taking the infimum of Hermite-like thickness over an appropriate space of inner products on $\mathbb {R}^d$ . As a corollary of our main result, and using an inequality due to Banaszczyk and Bayer Fluckiger (see inequality (3.25)), we obtain a similar bound in terms of the discriminant $D_{\mathbb {K}}$ of $\mathcal {O}_{\mathbb {K}}$ instead of $\tau _{\min }(\mathcal {O}_{\mathbb {K}})$ . See Definition 3.3.
Theorem 1.3. Let $\lambda $ be a Perron number of algebraic degree $d \geq 3$ and spectral ratio $\rho $ . Set $\mathbb {K} := \mathbb {Q}(\lambda )$ . Let $\mathcal {O}_{\mathbb {K}}$ be the ring of integers of $\mathbb {K}$ , and denote the discriminant and the minimal Hermite-like thickness of $\mathbb {K}$ by, respectively, $D_{\mathbb {K}}$ and $\tau _{\min }(\mathcal {O}_{\mathbb {K}})$ . Then $d_{\mathrm {PF}}^{\mathrm {irr}}(\lambda )$ is bounded above by each of
and
Remark 1.4. Given a natural number n and an integral irreducible matrix A with spectral radius $\lambda $ , one can readily construct an integral irreducible matrix B with spectral radius $\sqrt [n]{\lambda }$ such that $\dim (B) = n \dim (A)$ . Hence, Theorem 1.3 can be used to give an upper bound for $d_{\mathrm {PF}}^{\mathrm {irr}}(\sqrt [n]{\lambda })$ . See e.g. the proof of [Reference LindLin84, Theorem 3].
Theorem 1.3 immediately translates into the context of irreducible shifts of finite type with a given entropy. We can derive an upper bound for the Perron–Frobenius degree using Theorem 1.3. See also Remark 3.9 and Question 5.1. First we need to introduce a notation.
Notation 1.5. Let $\mathbb {K}$ be a real number field (that is, with at least one real place), and $\rho \in (0,1)$ . Set
and denote
Note that for any $M>1$ , there are only finitely many Perron numbers in the interval $[1,M]$ with algebraic degree at most d. Hence, $\kappa (\mathbb {K}, \rho )$ can be computed in theory.
Theorem 1.6. Let $\mathbb {\lambda }$ be a Perron number of algebraic degree $d \geq 3$ and spectral ratio $\rho $ . Set $\mathbb {K}:= \mathbb {Q}(\lambda )$ . Denote the bound from Theorem 1.3 by $B(\mathbb {K}, \rho )$ ; note that $d=[\mathbb {K}:\mathbb {Q}]$ is uniquely determined by $\mathbb {K}$ . The Perron–Frobenius degree of $\lambda $ is bounded above by
1.1 Previous work
In [Reference LindLin84], Lind gave a method for producing all integral primitive matrices with a given Perron number $\lambda $ as their spectral radius. Let $B \colon \mathbb {R}^d \rightarrow \mathbb {R}^d$ be the companion matrix associated with $\lambda $ . In what follows, unless otherwise specified, all eigenvectors are column (right) eigenvectors; similarly for eigenspaces. Pick an eigenvector v for the eigenvalue $\lambda $ , and denote the one-dimensional eigenspace of $\lambda $ by $E_\lambda $ . Let E be the positive half-space associated with v, that is, if $\pi _1 \colon \mathbb {R}^d \rightarrow E_\lambda $ is the projection along the complementary invariant subspace, then
Theorem 1.7. (Lind)
Let $\lambda $ be a Perron number of algebraic degree d, and B and E be as above. If $A = [a_{ij}]$ is an n-dimensional primitive non-negative integral matrix with spectral radius $\lambda $ , then there are $z_i \in \mathbb {Z}^d \cap E$ for $1 \leq i \leq n$ such that $Bz_j = \sum _{i=1}^{n} a_{ij}z_i$ .
Conversely, if $\lambda $ , B, and E are as above, and the points $z_i \in \mathbb {Z}^d \cap E$ and a non-negative integral matrix $A=[a_{ij}]$ satisfy $Bz_j = \sum _{i=1}^{n} a_{ij}z_i$ , then every irreducible component of A has its spectral radius equal to $\lambda $ .
The above theorem of Lind gives a practical way to produce an integral primitive matrix with a given spectral radius; see [Reference LindLin84, p. 289]. However, it does not tell us how to find such a matrix with smallest (or close to smallest) dimension, since we are not given control over the size of the coordinates of $z_i$ . Nevertheless, the referee has kindly mentioned to me that there exists a simple, but not necessarily practical, algorithm that computes the Perron–Frobenius degree of a Perron number: Assume that $\lambda $ is given by its minimal polynomial, and denote the algebraic degree of $\lambda $ by d. For every positive integer n, there are only finitely many primitive $n \times n $ integral matrices with spectral radius less than or equal to $\lambda $ . One can algorithmically enumerate these. For each of them, one can algorithmically determine whether the spectral radius is equal to $\lambda $ . So for $n = d$ , we can check whether there is an integral primitive matrix of dimension n which has $\lambda $ as the spectral radius. Recursively, if we fail at n, then we try at $n+1$ . By Lind’s theorem, we eventually find an n where we succeed; that n is equal to $d_{\mathrm {PF}}(\lambda )$ .
For matrices with non-negative integral polynomial entries, the situation is different. See the work of Boyle and Lind, which gives a uniform upper bound (in fact, a 2 by 2 matrix) in this context [Reference Boyle and LindBL02]. For the related topic of the inverse spectral problem for non-negative integral matrices, see the works of Boyle and Handelman [Reference Boyle and HandelmanBH91] and Kim, Ormes and Roush [Reference Kim, Ormes and RoushKOR00] and the references therein.
1.2 Idea of the proof
In [Reference Thurston, Bonifant, Lyubich and SutherlandThu14], Thurston gave a simpler proof of Lind’s converse to the integer Perron–Frobenius theorem. Our proof of Theorem 1.3 follows Thurston’s approach, while controlling the dimension of a constructed matrix.
The tensor product $\mathbb {Q}(\lambda ) \otimes _{\mathbb {Q}} \mathbb {R}$ can be identified with $\mathbb {R}^d$ . Let $M_\lambda $ be the linear endomorphism of $\mathbb {R}^d \cong \mathbb {Q}(\lambda ) \otimes _{\mathbb {Q}} \mathbb {R}$ induced by multiplication by $\lambda $ in $\mathbb {Q}(\lambda )$ . The eigenvalues of $M_\lambda $ are the Galois conjugates of $\lambda $ . Then $\mathbb {R}^d$ decomposes into invariant subspaces of $M_\lambda $ corresponding to real places and pairs of conjugate complex places of $\lambda $ ; see the opening paragraphs to §3. Fix an eigenvector for $M_\lambda $ with eigenvalue $\lambda $ , and consider the positive half-space corresponding to $\lambda $ .
We start with a polygonal cone with apex at the origin that lies in the positive half-space and is invariant under $M_\lambda $ . We then perturb the vertices of the cone to obtain an invariant polygonal cone $\mathcal {C}$ with integral vertices; see Steps 2–4 of the proof of Theorem 1.3. It is during this perturbation that the minimal Hermite-like thickness appears in the picture. Since the polygonal cone has integral vertices, the semigroup S generated by the set of integral points in the cone $\mathcal {C}$ under addition of vectors is finitely generated; see Proposition 2.1. The cardinality of a generating set for the semigroup S gives an upper bound for the dimension of an integral non-negative matrix A with spectral radius $\lambda $ . Moreover, after possibly passing to an irreducible component of A, an integral irreducible matrix with spectral radius $\lambda $ is obtained; see Step 5. Finally we give an upper bound for the dimension of A; see Step 6.
2 Preliminaries
2.1 Lattice points
Throughout this article, by a lattice $\Lambda \subset \mathbb {R}^d$ we mean a lattice of full rank; that is, a discrete subgroup of $\mathbb {R}^d$ isomorphic to $\mathbb {Z}^d$ .
Proposition 2.1. Let $\Lambda \subset \mathbb {R}^d$ be a lattice, and $x_1 , \ldots , x_k \in \Lambda $ . Denote by $\mathcal {C}$ the convex cone over the points $x_i$ inside $\mathbb {R}^d$ , that is,
Let S be the semigroup generated by elements of $\mathcal {C} \cap \Lambda $ under vector addition. Define the compact set C, and the finite set $C_{\Lambda } \subset C$ as
Then $C_{\Lambda }$ is a finite generating set for the semigroup S.
Proof For $\alpha \in \mathbb {R}$ , denote the fractional part of $\alpha $ by $\{ \alpha \}$ , and let $\lfloor \alpha \rfloor = \alpha - \{ \alpha \}$ . Any point $y \in S$ can be written as
where $\alpha _i , \lfloor \alpha _i \rfloor \geq 0$ for each i. Note that we have $x_i \in C_{\Lambda }$ for each i, so the first parenthesis is a sum of elements of $C_{\Lambda }$ . As y and the first parenthesis are both in $\Lambda $ , the second parenthesis should represent a point in $\Lambda $ as well. On the other hand, the coefficients of the second parenthesis are in the interval $[0,1)$ , and so the second parenthesis lies in $C_\Lambda = C \cap \Lambda $ . We have written y as a sum of elements of $C_{\Lambda }$ , so $C_{\Lambda }$ is a generating set for the semigroup S. Since C is compact and $\Lambda $ is a lattice, $C_{\Lambda }= C \cap \Lambda $ is a finite set.
By a Euclidean space of dimension d, we mean a d-dimensional vector space $\mathbb {R}^d$ equipped with an inner product. A polytope is the convex hull of finitely many points in $\mathbb {R}^d$ . Let $\Lambda \subset \mathbb {R}^d$ be a lattice. If $\{ v_1, \ldots , v_d \} $ is a basis for $\Lambda \cong \mathbb {Z}^d$ , then a fundamental domain for $\Lambda $ is
If $\mathbb {R}^d$ is a Euclidean space, then the covolume of $\Lambda $ is defined as the volume of any fundamental domain for $\Lambda $ . A lattice polytope is a polytope whose vertices are lattice points.
Proposition 2.2. Let $\mathbb {R}^d$ be a Euclidean space, $\Lambda \subset \mathbb {R}^d$ be a lattice with covolume $\mathrm {Covol}(\Lambda )$ , and $P \subset \mathbb {R}^d$ be a d-dimensional lattice polytope with volume $\mathrm {Vol}(P)$ . Denote the number of lattice points in P by $|P \cap \Lambda |$ . Then
The equality happens exactly when P is a d-simplex with $|P \cap \Lambda |=d+1$ .
Proof First consider the special case that P has exactly $d+1$ vertices, and every lattice point in P is a vertex of P. If $v_0, v_1, \ldots , v_d \in \mathbb {R}^d$ are the vertices of P, then the volume of the parallelepiped formed by the vectors $v_1 - v_0 , \ldots , v_d - v_0$ is equal to $d! \times \mathrm {Vol}(P)$ . Since the volume of this parallelepiped is at least as large as the volume of a fundamental domain for $\Lambda$ , we have
implying that
In general, decompose P into d-simplices $\Delta _1, \ldots , \Delta _n$ with disjoint interiors such that each simplex $\Delta _i$ contains no lattice point except for its vertices. This can be done for example as follows. Decompose P into smaller polyhedra by coning off from one of the vertices of P. Here, by coning off from a point $v \in P$ , we mean that for every facet F of P, the polyhedron which is the convex hull of $F \cup v$ is added unless its dimension is strictly smaller than that of P; for example, if the starting polyhedron is a polygon, then coning off from a vertex v is just decomposing the polygon into triangles via adding all the diagonals emanating from v. For any of the resulting polyhedra, successively take a lattice point inside or on the boundary, and cone off from that lattice point. After finitely many repetitions, we arrive at the decomposition into $\Delta _i$ .
The desired inequality follows from adding up the corresponding inequalities for simplices $\Delta _1, \ldots , \Delta _n$ . Note that if P is not a d-simplex or $|P \cap \Lambda |> d+1$ , then at least one lattice point in $P \cap \Lambda $ is counted for more than one simplex $\Delta _i$ , and so the inequality is strict.
2.2 Minkowski sum and difference
For sets $A , B \subset \mathbb {R}^d$ , define their Minkowski sum as
Intuitively, $A + B$ is the union of all translates of A by elements of B
Define the Minkowski difference of A and B by
If B is empty, $A \div B$ is, by convention, equal to $\mathbb {R}^d$ . Intuitively, $A \div B$ is the intersection of all translates of A by the antipodes of elements of B
We have used the rather odd notation $\div $ for the Minkowski difference, in order to distinguish it from the set
The Minkowski sum and difference are not in general the inverse of each other.
Lemma 2.3. The following properties hold for sets $A , B, C \subset \mathbb {R}^d$ :
Moreover, if A and B are non-empty compact, convex sets, then
Proof The first three properties directly follow from the definition. We sketch the proof of the last implication, and refer the reader to e.g. [Reference SchneiderSch13, Lemma 3.1.11 and Section 1.7] for details. Pick an inner product $\langle \cdot , \cdot \rangle $ on $\mathbb {R}^d$ , and denote the support functions for A and B by, respectively, $h_A$ and $h_B$ . By definition, for any $x \in \mathbb {R}^d$ ,
and $h_B$ is defined similarly. By the separation theorem for convex sets, for a non-empty compact convex set $A $ , we have
Assuming $x \in (A+B)\div B$ , we would like to show that $x \in A$ . By hypothesis, $x + B \subset A + B$ . Equivalently, the support function for $x+B$ does not exceed that of $A+B$ pointwise. This implies
using the fact that $h_{A+B} = h_A + h_B$ for non-empty compact convex sets A and B; see [Reference SchneiderSch13, Theorem 1.7.5]. Cancelling $h_B$ from both sides gives us the inequality $h_{\{x\}} \leq h_A$ , which implies that $x \in A$ .
3 Proof of Theorem 1.3
We follow Bayer Fluckiger [Reference Bayer FluckigerBay06] and Jarvis [Reference JarvisJar14] for the definitions below. Let $\lambda $ be an algebraic integer, and $\mathbb {K} = \mathbb {Q}(\lambda )$ be the number field obtained by adjoining $\lambda $ to $\mathbb {Q}$ . We may interpret the points of $\mathbb {K}$ as lying in a d-dimensional real linear space as follows. Let $\sigma _1 , \ldots , \sigma _r$ be the real embeddings of $\mathbb {K}$ , and $\sigma _{r+1}, \overline {\sigma }_{r+1}, \ldots , \sigma _{r+s}, \overline {\sigma }_{r+s}$ be the pairwise conjugate complex embeddings of $\mathbb {K}$ , where
Consider the embedding
We may identify $\mathbb {C}$ with $\mathbb {R}^2$ , by identifying $a+bi$ with $(a,b)$ , then $\sigma $ becomes an embedding
The mapping $\sigma \colon \mathbb {K} \rightarrow \mathbb {R}^d$ identifies the vector space $\mathbb {R}^d$ with the tensor product $\mathbb {K}_{\mathbb {R}} : = \mathbb {K} \otimes _{\mathbb {Q}} \mathbb {R}$ ,
Note that $\mathbb {R}^r \times \mathbb {C}^s$ has a canonical involution, which is identity on $\mathbb {R}^r$ and complex conjugation on $\mathbb {C}^s$ . Let
Given $\alpha \in \mathfrak {B}$ , define the symmetric positive definite bilinear form $q_\alpha $ on $\mathbb {K}_{\mathbb {R}}$ by
Here, $\text {Trace}(x_1 , \ldots , x_{r+s}) := \sum _{i=1}^{r} x_i + \sum _{j=r+1}^{r+s} (x_j + \overline {x}_j)$ denotes the trace of $\mathbb {K}_{\mathbb {R}} \cong \mathbb {R}^r \times \mathbb {C}^s$ , and $\alpha x \bar {y}$ denotes component-wise product in $\mathbb {R}^r \times \mathbb {C}^s$ . Then $q_\alpha $ induces the norm $|\cdot |_\alpha $ on $\mathbb {K}_{\mathbb {R}}$ given by the following formula, where $\alpha = (\alpha _1, \ldots , \alpha _{r+s})$ with $\alpha _i \in \mathbb {R}^{>0}$ :
Denote the ring of integers of $\mathbb {K}$ by $\mathcal {O}_{\mathbb {K}}$ . Let $(\mathcal {O}_{\mathbb {K}}, q_\alpha )$ denote the lattice $\mathcal {O}_{\mathbb {K}}$ equipped with the inner product $q_\alpha $ . The maximum of $(\mathcal {O}_{\mathbb {K}}, q_\alpha )$ , denoted by $\max(\mathcal{O}_\mathbb{K}, q_\alpha)$ , is defined as
The covering radius of $(\mathcal {O}_{\mathbb {K}}, q_\alpha )$ is, by definition, the square root of $\max (\mathcal {O}_{\mathbb {K}}, q_\alpha )$ . Define the determinant of $(\mathcal {O}_{\mathbb {K}}, q_\alpha )$ as the determinant of the matrix of $q_\alpha $ in a $\mathbb {Z}$ -basis of $\mathcal {O}_{\mathbb {K}}$ ; that is, if $\omega _1, \ldots , \omega _d$ is a basis for the abelian group $\mathcal {O}_{\mathbb {K}} \cong \mathbb {Z}^d$ (under addition), then $\det (\mathcal {O}_{\mathbb {K}},q_\alpha )$ is the determinant of the $d \times d$ matrix $(q_\alpha (w_i, w_j))$ . With this definition, the determinant of $(\mathcal {O}_{\mathbb {K}}, q_\alpha )$ is equal to the square of the volume of any fundamental domain for the lattice $(\mathcal {O}_{\mathbb {K}},q_\alpha )$ . We remark that some texts define the determinant as the volume of a fundamental domain, but we preferred to follow Bayer Fluckiger’s convention as in [Reference Bayer FluckigerBay06].
Definition 3.1. Define the Hermite-like thickness $\tau (\mathcal {O}_{\mathbb {K}}, q_\alpha )$ of $(\mathcal {O}_{\mathbb {K}}, q_\alpha )$ as
Define the minimal Hermite-like thickness as
where $\mathfrak {B}$ is as in equation (3.2).
Remark 3.2. Although we called $\tau _{\min }(\mathcal {O}_{\mathbb {K}})$ the minimal Hermite-like thickness, it should be noted that the minimum is taken over the set of inner products coming from elements of $\mathfrak {B}$ and not all possible inner products on $\mathbb {R}^d$ .
It is clear that the concepts of maximum, covering radius, and Hermite-like thickness can be defined more generally for a lattice in a Euclidean space; see [Reference Bayer FluckigerBay06].
Definition 3.3. Let $\mathbb {K}$ be a number field. Assume that $\omega _1, \ldots , \omega _d$ is any integral basis for $\mathcal {O}_{\mathbb {K}}$ . Denote the complete list of places of $\mathbb {K}$ by $\sigma _1, \ldots , \sigma _d$ . The discriminant of $\mathbb {K}$ is defined as the square of the determinant of the $d \times d$ matrix $(\sigma _i(\omega _j))$ .
See [Reference JarvisJar14, Chs 3 and 7] for further properties of the discriminant as well as the embedding of $\mathcal {O}_{\mathbb {K}}$ in $\mathbb {R}^d$ .
Theorem 1.3. Let $\lambda $ be a Perron number of algebraic degree $d \geq 3$ and spectral ratio $\rho $ . Set $\mathbb {K} := \mathbb {Q}(\lambda )$ . Let $\mathcal {O}_{\mathbb {K}}$ be the ring of integers of $\mathbb {K}$ , and denote the discriminant and the minimal Hermite-like thickness of $\mathbb {K}$ by, respectively, $D_{\mathbb {K}}$ and $\tau _{\min }(\mathcal {O}_{\mathbb {K}})$ . Then $d_{\mathrm {PF}}^{\mathrm {irr}}(\lambda )$ is bounded above by each of
and
Proof Let $\mathbb {K}_{\mathbb {R}}:= \mathbb {K} \otimes _{\mathbb {Q}} \mathbb {R}$ . Let $\sigma _1 , \ldots , \sigma _{r+s}$ be as before, and identify $\mathbb {K}_{\mathbb {R}}$ with $\mathbb {R}^r \times \mathbb {C}^s \cong \mathbb {R}^d$ . A place $\sigma _j$ is a field homomorphism $\sigma _j \colon \mathbb {Q}(\lambda ) \rightarrow (\mathbb {R} \text { or } \mathbb {C})$ and so $\sigma _j$ is completely determined by $\sigma _j(\lambda )$ , which is one of the Galois conjugates of $\lambda $ . Assume that $\sigma _1$ is the real place corresponding to $\lambda $ itself; that is, $\sigma _1(\lambda )=\lambda $ . Let $M_\lambda $ be the linear endomorphisms of $\mathbb {K}_{\mathbb {R}}$ induced by multiplication by $\lambda $ in $\mathbb {K}$ . The eigenvalues of $M_\lambda $ are the Galois conjugates of $\lambda $ . For any Galois conjugate $\lambda _i$ , denote by $E_i$ the invariant subspace for $M_\lambda $ with eigenvalue $\lambda _i$ , and let $\pi _i \colon \mathbb {K}_{\mathbb {R}} \rightarrow E_i$ be the projection along the complementary invariant subspace. Therefore, $\pi _i$ is the projection onto the ith factor under the identification $\mathbb {K}_{\mathbb {R}} \cong \mathbb {R}^r \times \mathbb {C}^s$ .
Before going into the details, we explain the main idea when $\lambda $ is a cubic algebraic integer that is not totally real. In order to find a non-negative integral matrix with spectral radius $\lambda $ , we follow Thurston’s proof of Lind’s theorem. See [Reference Thurston, Bonifant, Lyubich and SutherlandThu14, pp. 353–354] and [Reference LindLin84]. Let $E_1$ be the one-dimensional invariant subspace for $M_\lambda $ with eigenvalue $\lambda $ , and $E_2$ be the two-dimensional invariant subspace corresponding to the pair of complex Galois conjugates $\{ \delta , \overline {\delta } \} $ of $\lambda $ . The endomorphism $M_\lambda $ of $\mathbb {K}_{\mathbb {R}}$ leaves $E_1$ and $E_2$ invariant, and it acts on $E_1 \cong \mathbb {R}$ and $E_2 \cong \mathbb {C}$ by multiplication by, respectively, the numbers $\lambda $ and $\delta $ . Pick a large positive integer $N = N(\delta , \lambda )$ such that if $P_\delta $ is a regular N-gon inscribed in a circle of radius R around the origin in $E_2$ , then $P_\delta \subset E_2 \cong \mathbb {C}$ is invariant under multiplication by the complex number $\delta /\lambda $ . Such an integer N exists since, by the Perron condition, the absolute value of $\delta /\lambda $ is strictly less than $1$ . Let v be an eigenvector of $M_\lambda $ with eigenvalue $\lambda $ , and $E_{2}^v$ be the affine plane $Rv + E_2$ . Then $E_2^v$ lies in the positive half-space E corresponding to $\lambda $ and v.
Denote the vertices of the shifted polygon $P_v := Rv+ P_\delta \subset E_2^v$ by $v_1 , \ldots , v_k$ , and choose integral points $z_1, \ldots , z_k \subset \mathbb {R}^3$ such that the distance between $z_i$ and $v_i$ is ‘small’. Since the cone over the points $v_1, \ldots , v_k$ lies in the positive half-space and is invariant under $M_\lambda $ , and the distance between $z_i$ and $v_i$ is small, it is reasonable to expect that the cone $\mathcal {C}$ over the points $z_i$ also lies in the positive half-space E and is invariant under $M_\lambda $ for ‘large’ R. Let S be the semigroup generated by the set of integral points in the cone $\mathcal {C}$ under vector addition. Then $M_\lambda $ preserves S and induces an action $M_\lambda ^S$ on S. Moreover, S has a finite generating set, and we can estimate an upper bound for the size $|G|$ of a generating set G using Proposition 2.1. If we write the action of $M_\lambda ^S$ on S in the generating set G, we obtain a non-negative integral matrix of size $|G|$ whose spectral radius is equal to $\lambda $ . The details of the proof are as follows.
Step 1: Choosing an inner product on $\mathbb {R}^r \times \mathbb {C}^s$ . Define $\mathfrak {B}$ as in equation (3.2). Pick $\alpha \in \mathfrak {B}$ and equip $\mathbb {R}^r \times \mathbb {C}^s$ with the inner product $q_\alpha $ . Note that for the norm $|\cdot |_\alpha $ and for any $x \in \mathbb {K}_{\mathbb {R}}$ ,
and so
Let $\ell $ be the covering radius of $(\mathcal {O}_{\mathbb {K}}, q_\alpha )$ . Then
Step 2: Defining the polygon $P_j$ in the invariant subspace $E_j$ for each $j>1$ . Define
Hence
By the Perron condition, $\rho _j \in (0,1)$ for each $j>1$ . Set
Remark 3.4. Clearly $R_j$ does not depend on $1<j \leq r$ . However, we decided that this notation would be more suitable if one would like to improve the bounds in the article by substituting $\rho _j$ instead of $\rho $ in the definition of $R_j$ . Similarly, in what follows, $R_j$ for $j>r$ will not depend on j.
For each real place $\sigma _j$ with $j>1$ , define $P_j$ as the set of points of distance at most $R_j$ from the origin in $E_j$ ; in particular, $P_j$ is an interval. Note, for future use, that
For $j>r$ , define the natural number $N_j \geq 3$ as the smallest positive integer satisfying
Note, for later use, that
In the above, we used the fact that the smallest whole square exceeding $u \geq 16$ is less than or equal to $(\sqrt {u}+1)^2 \leq 3u/2 +1$ . Set
For $j>r$ , define the solid polygon $P_j \subset E_j$ as a regular $N_j$ -gon inscribed in the circle of radius $R_j$ around the origin. Define the polytope P as the product of $P_j$ for $j>1$ ,
The number of vertices of P is equal to the product of the number of vertices of $P_j$ for $j>1$ , which is equal to $2^{r-1} \times \prod _{j>r} N_j$ .
Step 3: Defining the cone $\mathcal {C}$ in the positive half-space of $\lambda $ . Let $v \in E_1$ be the positive (with respect to $\pi _1$ ) unit length eigenvector for the eigenvalue $\lambda $ . Here, the unit length is considered with respect to the distance $| \cdot |_\alpha $ . Let E be the positive half-space corresponding to v. Set
Note, for future use, that
Denote by $P_v$ the translated polytope $Lv + P$ , and let $\mathcal {C}_v \subset E$ be the cone over the polytope $P_v$ . Equivalently, if ${1}/{L} \cdot P_j$ is the dilation of $P_j$ by the factor ${1}/{L}$ centered at the origin of $E_j$ , then
Denote the vertices of $P_v$ by $v_1, v_2, \ldots , v_k$ , where
Pick integral points $z_1, z_2, \ldots , z_k$ in $\mathcal {O}_{\mathbb {K}}$ such that the distance between $v_i$ and $z_i$ does not exceed the covering radius $\ell $ of $(\mathcal {O}_{\mathbb {K}}, q_\alpha )$ .
First we show that each $z_i$ lies in the positive half-space E. Identify the subspace $E_1$ with $\mathbb {R}$ via
It is enough to show that for each i, we have $\pi _1(z_i)>0$ . By the triangle inequality
where the second inequality holds by inequality (3.5), and the last inequality is true by inequality (3.15).
Define $\mathcal {C}$ as the cone over the points $z_1, z_2, \ldots , z_k$ with apex at the origin,
For each $z_i$ , define $w_i$ as the intersection point of the ray through $z_i$ from the origin and the affine hyperplane $Lv+E_1^c$ , where $E_1^c = \oplus _{j>1} E_j$ is the complementary invariant subspace to $E_1$ . Then
Equivalently, if $Q_v \subset Lv + E_1^c$ is the convex hull of $w_1, w_2, \ldots , w_k$ , then
Hence, $\mathcal {C}$ is the cone over the points $z_1, z_2, \ldots , z_k$ , or, equivalently, the cone over the points $w_1, w_2, \ldots , w_k$ . In what follows, we will use the two descriptions of $\mathcal {C}$ as needed.
Step 4: Showing that the cone $\mathcal {C}$ is invariant. We need a few lemmas to prove that $\mathcal {C}$ is invariant.
Lemma 3.5. For each i, we have
Proof For the first inequality, we have
where we used equations (3.4) and (3.14).
Assume that $w_i = r_i z_i$ for some positive real number $r_i$ . Note that
where we used inequality (3.5) in the first line above. Therefore,
Hence, by the triangle inequality,
The last implication above, $L \leq 2(L- \ell )$ , holds by inequality (3.15).
Set
Lemma 3.6. The following estimates hold for every $z_i$ , where $M_\lambda $ is the linear endomorphism of $\mathbb {K}_{\mathbb {R}}$ induced by multiplication by $\lambda $ in $\mathbb {K}$ .
-
(a)
$$\begin{align*}\frac{|\pi_j (M_\lambda(z_i))|_\alpha}{|\pi_1 (M_\lambda(z_i))|_\alpha} \leq \rho_j \bigg( \frac{R_j + \ell}{L - \ell} \bigg) \quad \text{for every } j>1.\end{align*}$$ -
(b)
$$\begin{align*}\frac{|\pi_j (M_\lambda(z_i))|_\alpha}{|\pi_1 (M_\lambda(z_i))|_\alpha} \leq \frac{R_j - b}{L} \quad \text{for every } 1<j\leq r. \end{align*}$$ -
(c)
$$\begin{align*}\frac{|\pi_j(M_\lambda(z_i))|_\alpha}{|\pi_1(M_\lambda(z_i))|_\alpha} \leq \frac{1}{L} \bigg( R_j \bigg(1 - \frac{5}{N_j^2} \bigg) - b \bigg) \quad \text{for every } j>r. \end{align*}$$
Proof (a) We have
where we have used inequality (3.5).
(b) Assume that $\sigma _j$ is a real place where $1 < j \leq r$ , and set $R^{\prime }_j = R_j/\ell $ , $L' = L/ \ell $ , and $b' = b / \ell $ . Then
Hence, using $\rho _j \in (0,1)$ , it is enough to have
On the other hand, $L' \geq R_j'$ by equation (3.14). Hence, it suffices to have
which holds by equation (3.9).
(c) Now assume that $\sigma _j$ is a complex place where $j>r$ . Then
By equation (3.14), we have $L \geq R_j = N_j^2 \ell $ , and so
Set $b' = b/\ell $ . After substituting $R_j = N_j^2 \ell $ and dividing both sides by $\ell $ , it is enough to have
Multiplying both sides by $N_j^2$ and then expanding, the last inequality is equivalent to
So, using $\rho _j \in (0,1)$ and neglecting the positive term $5+b'$ , it suffices to have
and the last inequality holds by inequality (3.11).
Recall that $\mathcal {C}$ is the cone over the points $w_1, \ldots , w_k$ with apex at the origin, that is, the cone over the polytope $Q_v$ (that is, the convex hull of $w_i$ ). For each i, the point $w_i$ of $Q_v$ is of distance at most $b = (2\sqrt {d}+2)\ell $ from the corresponding point $v_i$ of $P_v$ . Since b is ‘small’, we expect the polytope $Q_v$ to contain a ‘smaller version’ of $P_v$ . More precisely, define
where
Lemma 3.7. We have $P_v^b \subset Q_v$ .
Proof Let B be the ball of radius b around the origin in $E_1^c = \oplus _{j>1} E_j$ . We show that
where $Q_v + B$ denotes the Minkowski sum. To see this, pick an arbitrary point $x \in P_v$ and write it as a convex combination
with $0 \leq \alpha _i \leq 1$ and $\sum _{i=1}^{k} \alpha _i =1$ . Then
If we set $y = \sum _{i=1}^{k} \alpha _i w_i$ and $z = \sum _{i=1}^{k} \alpha _i (v_i - w_i)$ , then $y \in Q_v$ . Moreover, by the triangle inequality, $z \in B$ , since each term $v_i - w_i$ lies in B. This shows that $P_v \subseteq Q_v + B$ .
If we denote the Minkowski difference by $\div $ , then we have
where the last equality holds since $Q_v$ and B are non-empty, compact, and convex. See Lemma 2.3. Therefore, it suffices to show that $P_v^b \subset P_v \div B$ . It follows from the definition that $P_v^b+B \subset P_v$ . Hence, assuming that $P_v^b$ is non-empty, we have
where we used Lemma 2.3 twice. This completes the proof.
Define
and set
Lemma 3.8. We have $W \subset V \subset P_v^b \subset Q_v$ .
Proof The inclusion $W \subset V$ follows from the following simple fact: Let T be a regular N-gon inscribed in a circle of radius R centered at the point O. Every point in $\partial T$ is of distance at least $R(1 - 5/N^2)$ from O.
For completeness, we give a proof of the above fact. After scaling, we may assume that $R=1$ . The minimum distance $d(O ,x)$ for $x \in \partial T$ is obtained by drawing the perpendicular from O to a side of T. Such a perpendicular has length $\cos ({\pi }/{N})$ . Hence
where we have used the inequality $\cos (x)> 1 - ({x^2}/{2})$ for $0 < x <1$ .
The inclusion $P_v^b \subset Q_v$ was proved in Lemma 3.7. Now we show that $V \subset P_v^b$ . Pick arbitrary points $x \in V$ and $y \in \partial P_v = \partial (Lv + P)$ . Since $P = \prod _{j>1} P_j$ is a product, there is $j>1$ such that $\pi _j(y) \in \partial P_j$ . Therefore
In the above, we used the fact that projection onto a non-empty closed convex set in a Euclidean space is distance decreasing; see e.g. [Reference SchneiderSch13, Theorem 1.2.1]. Hence $x \in P_v^b$ , proving the inclusion $V \subset P_v^b$ .
Using equation (3.20), to show that $\mathcal {C}$ is invariant, it is enough to prove that for every $z_i$ ,
Equivalently, we would like to show that for every $j>1$ and every $z_i$ , the following inequalities hold:
But the above inequalities hold by Lemma 3.6. This finishes the proof of the invariance of $\mathcal {C}$ .
Step 5: Defining a non-negative integral matrix A, where each irreducible component of A has spectral radius equal to $\lambda $ . Let S be the semigroup generated by elements of $\mathcal {C} \cap \mathcal {O}_{\mathbb {K}}$ under vector addition. By Proposition 2.1, the set of integral points (that is, elements of $\mathcal {O}_{\mathbb {K}}$ ) in the compact region
generates S. Enumerate the set of points in $C \cap \mathcal {O}_{\mathbb {K}}$ by $c_1, \ldots , c_n$ . Recall that $M_\lambda $ acts on S. Moreover, $M_\lambda $ can be represented by the companion matrix B when written in the basis $\{ 1, \lambda , \ldots , \lambda ^{d-1} \}$ of $\mathbb {K}_{\mathbb {R}} = \mathbb {Q}(\lambda ) \otimes \mathbb {R}$ . Let $A=[a_{ij}]$ be a non-negative integral matrix corresponding to the action of B on S in the basis $\{ c_1, \ldots , c_n\}$ . Hence
where $B \colon \mathbb {R}^d \rightarrow \mathbb {R}^d$ is the companion matrix associated with $\lambda $ . There might be various choices for A, when $Bc_j$ can be written as a non-negative linear combination of $c_1, \ldots , c_n$ in more than one way.
Lind’s argument [Reference LindLin84, pp. 288–289] shows that every irreducible component of A has spectral radius $\lambda $ . We follow the proof given in [Reference Lind and MarcusLM21, Lemma 11.1.10] briefly for the reader’s convenience. Replace A by one of its irreducible components; this amounts to taking a minimal subset of $\{ c_1 , \ldots , c_n \}$ for which equation (3.23) holds. Let $\mu $ be the spectral radius of A. Let $e_i$ be the ith unit vector in $\mathbb {R}^n$ , and define the linear map $P \colon \mathbb {R}^n \rightarrow \mathbb {R}^d$ by $P(e_i)=c_i$ . Then by equation (3.23), we have $PA=BP$ . Since A is non-negative and irreducible, by the Perron–Frobenius theorem A has a positive eigenvector $v_\mu $ corresponding to $\mu $ . Since $Pv_\mu $ is a positive linear combination of the $c_j$ , and each $c_j$ satisfies $\pi _1(c_j)>0$ , we necessarily have $\pi _1(Pv_\mu )>0$ and so $Pv_\mu \neq 0$ . Moreover,
Hence, $Pv_\mu $ is an eigenvector for B with eigenvalue $\mu $ . However, the only eigenvectors of B with positive $E_1$ component lie in $E_1$ . Hence $Pv_\mu $ is a multiple of v (the eigenvector with eigenvalue $\lambda $ ) and
Remark 3.9. In [Reference Thurston, Bonifant, Lyubich and SutherlandThu14, p. 354], Thurston gave a new proof of Lind’s converse to the integer Perron–Frobenius theorem. Thurston assumed that the matrix constructed via his method can be taken to be primitive as well. If this is the case, then Theorem 1.3 readily upgrades to give an upper bound for the Perron–Frobenius degree of a Perron number.
Step 6: Bounding the dimension of the matrix A. By Proposition 2.2, the number of integral points in C is at most
We give an upper bound for ${\mathrm{Vol}}(C)$ via a series of lemmas.
Lemma 3.10. For every $j>1$ and every $w_i$ , we have
Proof By the triangle inequality,
In the first line above, inequality (3.5) is used. The second line follows from Lemma 3.5, and the last inequality holds by equations (3.9) and (3.13), and inequality (3.11).
Lemma 3.11. Let $M:= 2 k \sqrt {d}$ , where k is the number of vertices of $P_v$ . Then
Proof Since $Q_v$ is the convex hull of $w_i$ , the ray from the origin and passing through an arbitrary point p of C intersects $Q_v$ at some point w. Therefore, if we define $M_1$ as
then
It is enough to show that $M_1 \leq M$ . For every point w in $Q_v$ ,
On the other hand, if p is a point in C, then by the triangle inequality, we have
where we used Lemma 3.5 for the implication $|v_i|_\alpha \leq \sqrt {d} L$ . Combining the previous inequalities gives
Lemma 3.12. We have
where ${\mathrm{Vol}}(Q_v)$ is the $(d-1)$ -dimensional volume of $Q_v$ .
Proof We have
where ${\mathrm{Vol}}$ , ${\mathrm{Length}}$ , and ${\mathrm{Area}}$ are with respect to the inner product $q_\alpha $ . For $j>1$ , define $\textbf {R}_j$ as
Therefore
By Lemma 3.10 and the triangle inequality, we have $\textbf {R}_j < 2 R_j$ . The lemma now follows from the inequalities $\textbf {R}_j < 2 R_j$ and $\pi < 4$ , and the equality $r+2s=d$ .
Lemma 3.13. We have
where C is defined as in equation (3.22).
Proof Note that
where we used equation (3.17), inequality (3.12), and the identity $r+2s=d$ . By Lemma 3.11,
The upper bound above is the volume of a cone over $M \cdot Q_v$ , where $M \cdot Q_v$ denotes a dilation of $Q_v$ with the factor of M from the point $0 \in \mathbb {R}^d$ . The height of this cone is $ML$ , since the height of $Q_v$ measured perpendicularly from its apex along the $E_1$ -axis is L. Hence, the volume of the cone is equal to
Therefore, we have
where we used Lemma 3.12. Moreover, using inequalities (3.10) and (3.12), and equation (3.13), we have
Combining with the previous upper bound for ${\mathrm{Vol}}(C)$ , and using $L < ({16 \sqrt {d} \ell }/({1-\rho }))$ , we have
where we used the relation $r+2s =d$ .
Therefore, the number of integral points in C is at most
In the first line above, we used the inequality
The second line above uses the definition of $\ell $ and Definition 3.1. By taking infimum over all $\alpha \in \mathfrak {B}$ , we obtain the upper bound
for the number of integral points in C and hence for the dimension of the matrix A.
Bayer Fluckiger showed in [Reference Bayer FluckigerBay06, Proposition 4.2], as a corollary of the work of Banaszczyk [Reference BanaszczykBan93, Theorem 2.2], that for any $\alpha \in \mathfrak {B}$ ,
This gives the upper bound
for the dimension of A. Both parts of the theorem now follow from the crude estimates
The bound in Theorem 1.3 is perhaps enormous compared to the Perron–Frobenius degree, so we have not tried to make the constants optimal. The point is to have an explicit bound in terms of data that we believe are relevant to the Perron–Frobenius degree; see Question 5.2.
Remark 3.14. Denote the covering conjecture for dimension d by $C_d$ .
Conjecture 3.15. (Covering conjecture)
The covering radius of any well-rounded unimodular lattice L in $\mathbb {R}^d$ (with the standard norm $|\cdot |$ ) satisfies
Equality happens if and only if $L = g \cdot \mathbb {Z}^d$ for some $g \in \mathrm {SO}_d(\mathbb {R})$ .
See McMullen [Reference McMullenMcM05] for the definition of well-rounded lattice and the application of the covering conjecture to Minkowski’s conjecture. The covering conjecture is proved for $d \leq 9$ ; see [Reference Kathuria and RakaKR16] and the references therein. Moreover, it is known to be false for $d \geq 30$ ; see [Reference Regev, Shapira and WeissRSW17].
In [Reference Bayer FluckigerBay06, p. 313], Bayer Fluckiger pointed out that McMullen’s results [Reference McMullenMcM05] together with the covering conjecture $C_d$ imply that for any totally real $\lambda $ of degree d, we have $\tau _{\min } (\mathcal {O}_{\mathbb {K}}) \leq ({d}/{4})$ . Hence, for any $d \leq 9$ and for any totally real Perron number $\lambda $ of degree d,
It would be interesting to know which number fields satisfy an inequality similar to that of $\tau _{\min } (\mathcal {O}_{\mathbb {K}}) \leq ({d}/{4})$ with upper bound only depending on the degree d. As pointed out kindly by McMullen to me, it is easy to see that such a bound does not exist for imaginary quadratic fields $\mathbb {Q}(\sqrt {n})$ for $n<0$ . On the other hand, Bayer Fluckiger showed that the inequality $\tau _{\min } (\mathcal {O}_{\mathbb {K}}) \leq ({d}/{4})$ holds for fields of the form $\mathbb {K} = \mathbb {Q}(\zeta _{p^r})$ or $\mathbb {K} = \mathbb {Q}(\zeta _{p^r}+\zeta ^{-1}_{p^r})$ , where p is an odd prime number, r is a natural number, and $\zeta _{p^r}$ is a primitive $p^r$ th root of unity; see respectively [Reference Bayer FluckigerBay06, p. 319, line 4] and [Reference Bayer FluckigerBay06, Lemma 8.5].
Example 3.16. (Pisot numbers)
A real algebraic integer $\alpha>1$ is called Pisot if all other Galois conjugates of $\alpha $ lie in the unit circle $\{ z \in \mathbb {C} \, | \, |z|<1\}$ . Note, a Pisot number is always Perron. We collect a few facts about Pisot numbers.
-
(1) A number field is called real if it has at least one real place. It is clear that every number field containing a Pisot number with the same degree should be real. Pisot [Reference PisotPis38] proved that every real number field $\mathbb {K}$ of degree d contains a Pisot number of degree d. Moreover, the set of Pisot numbers of degree d in $\mathbb {K}$ is closed under multiplication; see [Reference MeyerMey72, Corollary, p. 33].
-
(2) The smallest Pisot number is the largest root p of $x^3 - x -1$ and is known as the plastic constant. This was identified as the smallest known Pisot number by Salem [Reference SalemSal44], and Siegel proved it to be the smallest possible Pisot number [Reference SiegelSie44]. In particular, if $\lambda $ is a Pisot number with spectral ratio $\rho $ , then $\rho < p^{-1}$ implying that
$$\begin{align*}\frac{1}{1-\rho} < \frac{1}{1 - p^{-1}}. \end{align*}$$
Let $\mathbb {K}$ be a real number field of degree $d \geq 3$ . Let $\lambda $ be any Pisot number in $\mathbb {K}$ of degree d, and note that $\mathbb {Q}(\lambda ) = \mathbb {K}$ . By Theorem 1.3, $d_{\mathrm {PF}}^{\mathrm {irr}}(\lambda )$ is bounded above by
Note this upper bound only depends on $\mathbb {K}$ and not on the Pisot number $\lambda \in \mathbb {K}$ . Every number field $\mathbb {K}$ has only finitely many subfields $\mathbb {K}' \subset \mathbb {K}$ . Hence there is a similar upper bound depending only on $\mathbb {K}$ for an arbitrary Pisot number $\lambda \in \mathbb {K}$ (with any degree).
4 Primitive matrices
In this section, we use Theorem 1.3 to derive an upper bound for the Perron–Frobenius degree of a Perron number $\lambda $ . A useful observation is that if there exists an irreducible matrix A with spectral radius $\lambda -1$ , then $\lambda $ is the spectral radius of the primitive matrix $I+A$ . This is because $I+A$ is irreducible and has positive trace; hence it is primitive.
Theorem 1.6. Let $\mathbb {\lambda }$ be a Perron number of algebraic degree $d \geq 3$ and spectral ratio $\rho $ . Set $\mathbb {K}:= \mathbb {Q}(\lambda )$ . Denote the bound from Theorem 1.3 by $B(\mathbb {K}, \rho )$ , and let $\kappa (\mathbb {K}, \rho )$ be as in Notation 1.5. The Perron–Frobenius degree of $\lambda $ is bounded above by
Proof We may assume that $\lambda \geq M = 1+ ({4}/({1-\rho }))$ . First we show that $\lambda -1$ is Perron. The Galois conjugates of $\lambda -1$ are equal to $\lambda _i -1$ . We have
where we used the assumptions $\rho <1$ and $\lambda \geq 1+ ({4}/({1-\rho }))$ . Denote the spectral ratio for $\lambda -1$ by $\mu $ . Note that
In fact, we just showed that for every i,
so
Inequality (4.1) in particular implies that
Now $\lambda -1$ is a Perron number of spectral ratio $\mu $ and it generates the same number field $\mathbb {K}$ as $\lambda $ . By Theorem 1.3, there is an integral irreducible matrix A with spectral radius $\lambda -1$ and dimension at most $B(\mathbb {K}, \mu )$ . Then $I + A$ is a primitive matrix of the same dimension and with spectral radius $\lambda $ .
5 Questions
In Theorem 1.3, for technical reasons, we constructed an integral irreducible matrix with spectral radius equal to a given Perron number, although it would have been more natural to construct an integral primitive matrix instead. This motivates the following questions.
Question 5.1.
-
(1) Are there upper bounds for $d_{\mathrm {PF}}(\lambda )$ in terms of $d_{\mathrm {PF}}^{\mathrm {irr}}(\lambda )$ ?
-
(2) Does $d_{\mathrm {PF}}(\lambda ) = d_{\mathrm {PF}}^{\mathrm {irr}}(\lambda )$ always hold?
The lower bound for the Perron–Frobenius degree in [Reference YazdiYaz21] is in terms of the two largest Galois conjugates in the complex plane. Therefore, the following is a natural question.
Question 5.2. Let d and $\rho $ denote respectively the algebraic degree and the spectral ratio. Is there an upper bound $B=B(d,\rho )$ for $d_{\mathrm {PF}}^{\mathrm {irr}}(\lambda )$ (respectively $d_{\mathrm {PF}}(\lambda )$ ) where $\lambda $ is an arbitrary Perron number?
We expect the above question to have a negative answer, meaning that in Theorem 1.3, the arithmetic information on $\mathcal {O}_{\mathbb {K}}$ cannot be overlooked. A unit algebraic integer $\lambda $ is bi-Perron if all other Galois conjugates of $\lambda $ lie in the annulus $\{ c \in \mathbb {C} \hspace {1mm} | \hspace {1mm} \lambda ^{-1} < |z| < \lambda \}$ except possibly for $\pm \lambda ^{-1}$ . See [Reference McMullenMcM14].
Question 5.3. How can the bound in Theorem 1.3 be improved for special classes of algebraic integers such as totally real Perron numbers, Pisot numbers, Salem numbers, or bi-Perron numbers?
In particular, we ask the following about Pisot numbers.
Question 5.4. Let d denote the algebraic degree. Is there an upper bound $B = B(d)$ for $d_{\mathrm {PF}}^{\mathrm {irr}}(\lambda )$ (respectively $d_{\mathrm {PF}}(\lambda )$ ) where $\lambda $ is an arbitrary Pisot number?
Acknowledgements
I would like to thank Douglas Lind for sharing his intuition that some lattice property of the ring of integers should play a role in the Perron–Frobenius degree. Many thanks also go to Curtis T. McMullen and Eva Bayer Fluckiger for helpful comments connected to Remark 3.14. I am grateful to the anonymous referee for their very helpful comments, in particular for explaining how the bound for primitive matrices in Theorem 1.6 can be obtained. During this work, the author was supported by a Glasstone Research Fellowship in Science, a Titchmarsh Fellowship, and a UKRI Postdoctoral Research Fellowship.