1. Introduction
Let $\mathcal{M}$ be the set of monic polynomials belonging to the polynomial ring $\mathbb{F}_q[t]$ with coefficients in the finite field $\mathbb{F}_q$ with q elements, where $q \geq 2$ is a prime power. A random Rademacher multiplicative function $f \,{:}\, \mathcal{M} \to \{-1,0,1\}$ over $\mathbb{F}_q[t]$ is obtained by picking independent random variables f(P) uniformly distributed on $\{\pm 1\}$ (that is, taking the value $\pm 1$ with probability $1/2$ each) for monic irreducible polynomials P, extending f multiplicatively to all squarefree monic polynomials, and setting f to be zero for all non-squarefree monic polynomials. For example, if $F=P_1\cdots P_k$ for distinct irreducible monic polynomials $P_1,\dots, P_k$ , then $f(F) = f(P_1) \cdots f(P_k)$ . For any positive integers k and n, set
where $\omega(F)$ is the number of distinct irreducible factors of the polynomial F, and $\deg(F)$ is the degree of F. The purpose of this paper is to establish the following theorem.
Theorem 1. Let f be a random Rademacher multiplicative function over $\mathbb{F}_q[t]$ , where $q \geq 2$ is a fixed prime power. If $k \geq 1$ satisfies $k=o(\!\log n)$ as $n\to\infty$ , then
converges in distribution to the standard normal distribution N(0,1) as $n\to\infty$ .
This result is motivated by the study of random multiplicative functions over the integers, which were introduced by Wintner [Reference Wintner12] to heuristically model the Möbius function. A random Rademacher multiplicative function $f \,{:}\, \mathbb{N} \to \{-1,0,1\}$ over the integers is similarly obtained by picking independent random variables f(p) for each prime p, extending it multiplicatively to all squarefree integers, and setting it to be zero for all non-squarefree integers. If $\pi_k(x)$ is the number of squarefree integers $\leq x$ with k distinct prime factors, then the sum
parallels the quantity in (1·1). Indeed, integers of size x are known to heuristically correspond to polynomials in $\mathbb{F}_q[t]$ of degree $n \approx \log x$ . Improving upon a result of Hough [Reference Hough5], Harper [Reference Harper4] established the following theorem which motivates our Theorem 1.
Theorem 2. (Harper) Let f be a random Rademacher multiplicative function over the integers. If $k \geq 1$ satisfies $k=o(\!\log\log x)$ as $x \to \infty$ , then (1·2) converges in distribution to the standard normal distribution N(0,1) as $x \to \infty$ .
Notice the range $k=o(\!\log\log x)$ in Theorem 2 over the integers corresponds precisely to the range $k = o(\!\log n)$ in Theorem 1 over the polynomial ring $\mathbb{F}_q[t]$ . Theorem 1 can therefore be viewed as an extension of Theorem 2 to the function field setting. For an introduction to multiplicative functions over function fields, we refer the reader to work of Granville, Harper and Soundararajan [Reference Granville, Harper and Soundararajan3], whose conventions we follow here.
The proof strategy for Theorem 1 adapts Harper’s key ideas with the verification of three conditions in a martingale central limit theorem (Theorem 3). In Section 2, we prepare this strategy and define our martingale difference sequence. The analysis of this martingale allows us to efficiently reduce the theorem to a natural counting problem (Lemma 4), just as Harper did in [Reference Harper4, section 4·2]. However, this counting problem for function fields introduces cases which did not appear for the integers. The source of these new cases is simple: two distinct irreducible polynomials can have the same degree, but two distinct rational primes cannot have the same size. Since our martingale is filtered based on the degree of the largest irreducible factor (similar to the size of the prime for integers), this distinction creates new terms in our sums that we must carefully treat; see the remark following Lemma 4 for details.
In Section 3, we proceed to analyse these sums and complete the proof of Theorem 1 with some technical estimates. Although these combinatorial sums are somewhat more intricate, the estimation of these sums is simpler due to the familiar analytic benefits of function fields over integers. The key technical lemma for this analysis (Lemma 5) is proved in Section 4. We use recent results on the size of $\mathcal{P}_{k}(n)$ by Gómez-Colunga et al. [Reference Gómez-Colunga, Kavaler, McNew and Zhu2] and Afshar and Porritt [Reference Afshar and Porritt1], which respectively parallel classical estimates for $\pi_k(x)$ by Hardy and Ramanujan, and Sathe and Selberg.
We conclude the introduction with a few remarks on the sharpness of Theorem 1 and possible extensions. Harper showed that the range $k=o(\!\log\log x)$ in Theorem 2 is optimal [Reference Harper4, corollary 1]. He further established a normal approximation for sums like (1·2) with the looser restriction $\omega(m) \leq k$ and also for a larger class of random multiplicative functions [Reference Harper4, theorem 3]. It would be of interest to determine whether the range $k=o(\!\log n)$ in Theorem 1 is optimal and whether similar extensions hold in our setting. It seems plausible that such results carry over by similar arguments, but we did not pursue those investigations. If the range $k=o(\!\log n)$ is optimal as Harper’s work would suggest, then this indicates that the proof of Theorem 1 is quite delicate and sensitive to even minor losses.
Notation
Let $q \geq 2$ be a prime power. Let $\mathbb{F}_q[t]$ be the polynomial ring with coefficients in the finite field $\mathbb{F}_q$ with q elements. Let $\mathcal{M}$ be the set of monic polynomials belonging to $\mathbb{F}_q[t]$ . We shall use capital letters to denote a polynomial F in $\mathcal{M}$ , writing $\deg\!(F)$ for the degree of the polynomial F, $\omega(F)$ for the number of distinct irreducible factors of F, and $\textrm{P}^+(F)$ for the maximum degree of an irreducible dividing F. The letters P and Q will be reserved for monic irreducible polynomials. For integers $k, n \geq 1$ , let $\mathcal{P}_k(n)$ be the set of squarefree polynomials F in $\mathcal{M}$ with $\omega(F) = k$ and $\deg(F) = n$ . The letter f denotes a random Rademacher multiplicative function over $\mathbb{F}_q[t]$ . The relation $u \ll v$ means that there exists an absolute positive constant C such that $|u| \leq C v$ . If the constant C depends on a parameter, say $\varepsilon$ , then we shall write $u \ll_{\varepsilon} v$ .
2. Plan for the proof of Theorem 1
For integers $k, n \geq 1$ and a random Rademacher multiplicative function f over $\mathbb{F}_q[t]$ , define
Notice $\mathbb{E}[f(F)] = 0$ for any non-trivial squarefree F because f is multiplicative and $(f(P))_P$ is a sequence of independent random variables with mean zero. Hence, $S^{(k)}(n)$ has mean zero. Also, since $\mathbb{E}[f(F)f(G)]=1$ if $F=G$ and 0 otherwise, it follows that
Thus the mean of (1·1) is zero and its variance is indeed one. Our goal is to prove that $S^{(k)}(n)$ , normalised by its standard deviation $\sqrt{|\mathcal{P}_{k}(n)|}$ , converges in distribution to the standard normal as $n \to \infty$ , provided $k = o(\!\log n)$ . The strategy follows that of Harper [Reference Harper4] with appropriate modifications and simplifications as mentioned earlier in the introduction.
First, notice $\mathcal{P}_{1}(n)$ is the set of irreducible monic polynomials of degree n, so $S^{(1)}(n)$ is a sum of $|\mathcal{P}_{1}(n)|$ independent random variables uniform on $\{\pm 1\}$ . Thus, the classical central limit theorem implies that $S^{(1)}(n)$ converges in distribution to N(0,1) as $n \to \infty$ . We may therefore assume throughout that $k \geq 2$ .
2·1 Central limit theorem for martingale difference sequences
To prove convergence in distribution to the standard normal, we want to use a central limit theorem that gives information on the convergence of the partial sums of a martingale difference sequence. The result we use was obtained by McLeish [Reference McLeish6], but we state it as it appeared in [Reference Harper4].
Theorem 3. (McLeish) For $n \in \mathbb{N}$ , suppose that $k_n \in \mathbb{N}$ , and that $X_{i,n}$ , $1 \leq i \leq k_n$ , is a martingale difference sequence on $(\Omega, \mathscr{F}, (\mathscr{F}_{i,n})_i, \mathbb{P})$ . Write $S_n \,{:\!=}\, \sum_{i \leq k_n}X_{i,n}$ and suppose that the following conditions hold:
-
(i) $\displaystyle\sum_{i \leq k_n}\mathbb{E}[X_{i,n}^2] \to 1$ as $n\to \infty;$
-
(ii) for each $\varepsilon > 0$ , we have $\displaystyle\sum_{i \leq k_n}\mathbb{E}\left[X_{i,n}^2\textbf{1}_{|X_{i,n}| > \varepsilon}\right] \to 0$ as $n\to \infty ;$
-
(iii) $\displaystyle\limsup_{\substack{n\to \infty}} \sum_{i \leq k_n}\sum_{j \leq k_n, j\neq i} \mathbb{E}\left[X_{i,n}^2 X_{j,n}^2\right] \leq 1.$
Then, $S_n$ converges in distribution to N(0,1) as $n\to \infty$ .
Let us describe the martingale difference sequence in our problem. Let $n \geq k \geq 2$ . Write $\textrm{P}^+(F)$ for the maximum degree of the irreducible factors of F. For $d \geq 1$ , define
and set
Notice the set $\mathcal{P}_{k,n}(n)$ is empty as $k \geq 2$ , so $\mathcal{P}_{k}(n)$ is the union of $\mathcal{P}_{k,d}(n)$ over $1 \leq d \leq n-1$ and therefore $S^{(k)}(n) = \sum_{d=1}^{n-1} S^{(k)}_d (n)$ .
Writing $F\in \mathcal{P}_{k,d}(n)$ as $F=QF'$ , where Q is a degree d factor of F (among possibly many), it follows by multiplicativity and independence that $\mathbb{E}[f(F)] = \mathbb{E}[f(Q)] \mathbb{E}[f(F')].$ Since $\mathbb{E}[ f(Q) \mid \{f(P): \deg P< d\} ] = \mathbb{E}[f(Q)] =0$ , we get that
and so by the linearity of expectation, it follows that
Hence, if $\mathscr{F}_d$ denotes the sigma algebra generated by $\{f(P)\,{:}\, \deg P < d\}$ , then $(S_d^{(k)}(n))_{d\leq n-1}$ is a martingale difference sequence with respect to $(\mathscr{F}_d)_{d\leq n-1}$ .
We will therefore apply Theorem 3 to the random variables $S_d^{(k)}(n)/\sqrt{\left|\mathcal{P}_k(n)\right|}$ , which still form a martingale difference sequence and whose sum over $d\leq n-1$ equals $S^{(k)}(n)/\sqrt{\left|\mathcal{P}_k(n)\right|}$ , the quantity considered in Theorem 1. For convenience, we also use the notation
2·2 Reduction to some counting problems
By a computation similar to (2·1), it follows that $\mathbb{E}[S_d^{(k)}(n)^2]=\left|\mathcal{P}_{k,d}(n)\right|$ , so that condition (i) of Theorem 3 holds for all n, not just in the limit. Proving Theorem 1 then boils down to verifying conditions (ii) and (iii) of Theorem 3. The second condition stated in terms of our normalised random variables asks that for all $\varepsilon>0$ ,
as $n\to\infty.$ This quantity is at most
Thus, it suffices to prove that
as $n\to\infty$ . The third condition becomes
Equivalently, we will show
as $n\to\infty$ . For any $1 \leq d,e \leq n-1$ , it will therefore be convenient to express $\mathbb{E}\left[S_d^{(k)}(n)^2S_e^{(k)}(n)^2\right]$ in terms of an explicit counting problem.
Lemma 4. With the same notation as above,
where
and $J_{k,d,e}(n) = 0$ if $d \neq e$ , otherwise
Proof. Expanding out the sums and applying linearity of expectation, we get that
Notice the expectation on the right-hand side is nonzero only when WXYZ is a square, in which case it equals 1. This is the counting problem which we proceed to reformulate. We may write WX as the product of a square part $U^2$ and a square-free part M so $W=UA$ and $X=U(M/A)$ for some A that divides M. Note that U, A and $M/A$ are all relatively prime and the maximum degree of their irreducible factors is $\leq d$ . A similar reasoning for YZ gives some other square part, say $V^2$ , and forces their squarefree part to be M as well so $Y = VB$ and $Z = V(M/B)$ for some B that divides M. Again, V, B, and $M/B$ are all relatively prime and the maximum degree of their irreducible factors is $\leq e$ .
With this notation in mind, we proceed to count the corresponding contributions according to cases. First, if $M = 1$ then $A=B=1$ , so this case contributes at most
Next, we count the terms in (2·6) where $M \neq 1$ so $\deg M$ and $\omega(M)$ are both non-zero. As M is non-trivial, we have that $A \in P_{t, \leq d}(\ell)$ for some $t \in \{1,\dots,k\}$ and $\ell \in\{1,\dots,n\}$ . Comparing the degrees and number of irreducible factors of $W = UA$ and $X = U(M/A)$ , we deduce that $\textrm{P}^+(UA) = d$ and
As $A \in \mathcal{P}_{{t,\leq d}}({\ell})$ , this implies that $U \in \mathcal{P}_{{k-t, \leq d}}({n-\ell})$ and $M \in \mathcal{P}_{{2t, \leq d}}({2\ell})$ . A similar analysis holds when comparing $Y = VB$ and $Z = V(M/B)$ but, since the polynomial M is common to both arguments, it follows that A and B necessarily have the same degree and same number of prime factors and so do U and V. Hence, $B \in \mathcal{P}_{{t,\leq e}}(\ell), V \in \mathcal{P}_{{k-t,\leq e}}({n-\ell})$ , and $M \in \mathcal{P}_{{2t,\leq \min\{d,e\}}}({2\ell})$ . The terms in (2·6) with $M \neq 1, t \in \{1,\dots,k-1\}$ , and $\ell \in \{1,\dots,n-1\}$ therefore contribute at most $I_{k,d,e}(n)$ .
Continuing with this notation, the last case to consider is when $M \neq 1$ and $t=k$ (or equivalently $\ell =n$ ) in which case $U=V=1$ . Notice $U=V=1$ implies that $WXYZ = U^2 V^2 M^2 = M^2$ has a prime factor of degree $\max\{d,e\}$ yet $\textrm{P}^+(M) \leq \min\{ d,e\}$ . If $d \neq e$ , this leads to a contradiction, so this last case occurs if and only if $d=e$ . Thus, M has at least two distinct degree d factors in this case and $\textrm{P}^+(A) = \textrm{P}^+(B) = d$ where A and B divide M. Thus, there exists a pair of distinct irreducibles $P, Q \in \mathcal{P}_{1}(d)$ such that $M=PQM'$ , where M $^{\prime}$ belongs to $\mathcal{P}_{{2k-2}}({2n-2d})$ and at least one of the following holds:
If, say, the first situation holds, then $A = PA'$ and $B = QB'$ for $A',B' \in \mathcal{P}_{{k-1}}({n-d})$ dividing M $^{\prime}$ . A similar statement holds for the other cases. Combining all of these observations, we see that the terms in (2·6) with $M \neq 1$ and $t=k$ contribute at most $J_{k,d,e}(n)$ , as required.
Remark. This lemma and its proof possess the key differences between the function field setting and the integers. Crucially, the product WXYZ can form a square in a new way and contribute to (2·6). Namely, if $d \leq e$ then W and X do not need to share the same irreducible factor of degree d; these factors of degree d can instead pair with factors from Y and Z. This manifests in (2·4) by allowing M to have these large irreducible factors of degree $d = \min\{d,e\}$ and also by creating the additional terms (2·5) which do not appear in [Reference Harper4, section 4·2]. If the irreducible factors of degree d from W and X (resp. of degree e from Y and Z) are paired in a one-to-one manner, then only U (resp. V) in (2·4) would have these large factors of degree d (resp. degree e) and moreover (2·5) would not exist. This is precisely what happens for Harper in the integer setting. Namely, if integers w and x share the same largest prime factor p, then $p^2$ always divides wx since the size of the prime corresponds uniquely to the prime itself.
Now, using Lemma 4 with $d=e$ , we see that (2·2) holds provided that
Similarly, (2·3) holds provided that
Since
both (2·2) and (2·3) will therefore be satisfied provided
as $n \to \infty$ . This establishes Theorem 1 assuming (2·7) holds.
3. Completing the proof of Theorem 1
It remains to prove (2·7), which rests on the following key technical lemma whose proof is postponed to Section 4.
Lemma 5. Fix an integer $r \geq 1$ . If k and n are integers such that $r \leq k \leq (\log n)/3$ , then
In particular, if $r \geq 2$ is fixed and $k = o(\!\log n)$ as $n \to \infty$ , then the above is $o(|\mathcal{P}_{k}(n)|^2)$ .
Assuming Lemma 5, it suffices to show that each of the three sums in (2·7) are $o( |\mathcal{P}_{k}(n)|^2)$ provided $k =o(\!\log n)$ as $n \to \infty$ . We deal with each estimate in separate subsections.
3·1 Estimate for $\sum_{d=1}^{n-1}\left|\mathcal{P}_{k,d}(n)\right|^2$
For $F\in\mathcal{P}_{k,d}(n)$ , one has $F=PF'$ for some $P \in \mathcal{P}_{1}(d)$ and $F' \in \mathcal{P}_{{k-1}}({n-d})$ . This implies that $|\mathcal{P}_{k,d}(n)| \leq |\mathcal{P}_{1}(d)| |\mathcal{P}_{{k-1}}({n-d})|$ and so
This is a subsum of Lemma 5 with $r=2$ so it is $o(|\mathcal{P}_k(n)|^2)$ as $n \to \infty$ , as required.
3·2 Estimate for $\sum_{d=1}^{n-1}\sum_{e=1}^{n-1} I_{k,d,e}(n)$
Consider the definition of $I_{k,d,e}(n)$ in (2·4). The condition $\textrm{P}^+(UA) = d$ implies that at least one of the following holds: $\textrm{P}^+(U) = d$ or $\textrm{P}^+(A) = d$ . Summing over d and, in some cases, dropping the requirement that the maximum degree of the irreducible factors of our polynomials is $\leq d$ or $\leq e$ , this implies that $\sum_{d=1}^{n-1} I_{k,d,e}(n)$ is at most
Applying the same argument to the condition $\textrm{P}^+(VB) = e$ and summing over e, it follows that $\sum_{d=1}^{n-1} \sum_{e=1}^{n-1} I_{k,d,e}(n) $ is at most
Fix $t \in \{1,\dots,k-1\}$ and $\ell \in \{1,\dots,n-1\}$ . Notice that the double sum with U and V is equal to $|\mathcal{P}_{{k-t}}({n-\ell})|^2$ . Next, consider the triple sum with M, A, and B. Writing $G = \gcd(A,B)$ , we have that $A = GA', B= GB'$ , and $M = GA'B'M'$ for some M $^{\prime}$ coprime to A $^{\prime}$ , B $^{\prime}$ , and G. Since A and B have the same degree and same number of prime factors (and hence so do A $^{\prime}$ and B $^{\prime}$ ), it follows that G and M $^{\prime}$ must have the same degree and same number of prime factors. Namely, if $G \in \mathcal{P}_{j}(g)$ for some integer $0 \leq j \leq t$ and some integer $0 \leq g \leq \ell$ , then $M' \in \mathcal{P}_{j}(g)$ and $A', B' \in \mathcal{P}_{{t-j}}({\ell-g})$ . Note the case $j=0$ (and hence $g=0$ ) occurs when $G=M'=1$ so $M = AB$ , and the case $j=t$ (and hence $g=\ell$ ) occurs when $A'=B'=1$ so $M = GM'$ . Combining these observations implies that
Inserting these estimates in (3·2), we conclude that $\sum_{d=1}^{n-1} \sum_{e=1}^{n-1} I_{k,d,e}(n)$ is at most
Since $k = o(\!\log n)$ as $n \to \infty$ , both of these sums are $o( |\mathcal{P}_{k}(n)|^2)$ by Lemma 5, as required.
3·3 Estimate for $\sum_{d=1}^{n-1} J_{k,d,d}(n)$
From (2·5), we have that
Notice the inner triple sum is the same as (3·3) with $\ell = n-d$ and $t=k-1$ . Thus, $\sum_{d=1}^{n-1} J_{k,d,d}(n)$ is at most
Since $k=o(\!\log n)$ as $n\to\infty$ , all of these sums are $o(|\mathcal{P}_{k}(n)|^2)$ by Lemma 5. This completes the proof of (2·7) and the proof of Theorem 1.
4. Proof of Lemma 5
All that remains is to prove Lemma 5. To do so, we shall first require an estimate for the size of $\mathcal{P}_{k}(n)$ that is uniform for all integers k and n. Gómez-Colunga et al. [Reference Gómez-Colunga, Kavaler, McNew and Zhu2] have recently established such a result.
Proposition 6. (Gómez-Colunga--Kavaler--McNew--Zhu) Uniformly for all $k, n \geq 1$ ,
This corresponds to a classical result of Hardy and Ramanujan [Reference Ramanujan and Hardy7] for the integers: there exists a constant $B>0$ such that, for all $k\geq 1$ and $x\geq 2$ , we have
where $\pi_k(x)$ is the number of squarefree integers up to x with k prime factors.
Sathe [Reference Sathe8, Reference Sathe9] and Selberg [Reference Selberg10] famously derived an asymptotic estimate for $\pi_k(x)$ when $k =o(\!\log \log x)$ . We shall also need an asymptotic estimate for $|\mathcal{P}_{k}(n)|$ that is valid when $k = o(\!\log n)$ . Although the estimate $|\mathcal{P}_{k}(n)| \sim {q^n (\!\log n)^{k-1}}/{n (k-1)!}$ (see, e.g., [Reference Warlimont11]) suffices for our purposes, we state here the strongest and most recent result on an estimate for $|\mathcal{P}_{k}(n)|$ , which is a so-called Sathe–Selberg formula for function fields established by Afshar and Porritt [Reference Afshar and Porritt1].
Proposition 7. (Afshar--Porritt). Let $A>1$ . Uniformly for all $n\geq2$ and $1\leq k\leq A\log n$ ,
where
and $\Gamma(\!\cdot\!)$ is the Gamma function defined as $ \Gamma(z) = \int_{0}^{\infty} x^{z-1}e^{-x} dx. $
Propositions 6 and 7 imply our key technical lemma.
Proof of Lemma 5. The second estimate follows from (3·1) since
and Proposition 7 implies that if $k = o(\!\log n)$ as $n \to \infty$ , then $|\mathcal{P}_{k}(n)| \sim {q^n (\!\log n)^{k-1}}/{n (k-1)!}$ .
To prove (3·1), we proceed by induction on r. For $r=1$ , the claim follows immediately from Proposition 6. For $r \geq 2$ , if $n_1+\cdots+n_r = n$ , then at least one of $n_1,\dots,n_r$ is at most $\lfloor n/r \rfloor $ . By symmetry, we may assume it is $n_r$ so the left-hand side of (3·1) is at most
Notice that $n-n_r \geq n/2$ as $r \geq 2$ . Since $k \leq (\log n)/3$ by assumption, this implies that $k-k_r \leq k-1 \leq \log(n/2)/3 \leq \log(n-n_r)/3$ . Thus, by the inductive hypothesis, the above is
where for brevity we have set $c=2-\log 2$ . Applying Proposition 6, we see that this is at most
Note that for any integer $m \geq 0$ ,
Using this estimate on the inner sum over $n_r$ , it follows that (4·1) is
For the final sum over $k_r$ , notice that the ratio of consecutive summands is equal to
since $k \leq (\log n)/3$ by assumption. Hence, the final sum over $k_r$ is dominated by its value at the endpoint $k_r=1$ , yielding the desired estimate. This establishes Lemma 5.
Acknowledgments
This research was conducted as part of the 2020 Fields Undergraduate Summer Research Program. The authors are grateful to the Fields Institute for their financial support and facilitating their online collaboration.