
Density theorems for anisotropic point configurations

Published online by Cambridge University Press:  03 May 2021

Vjekoslav Kovač
Affiliation:
Department of Mathematics, Faculty of Science, University of Zagreb, Bijenička cesta 30, 10000 Zagreb, Croatia

Abstract

Several results in the existing literature establish Euclidean density theorems of the following strong type. These results claim that every set of positive upper Banach density in the Euclidean space of an appropriate dimension contains isometric copies of all sufficiently large elements of a prescribed family of finite point configurations. So far, all results of this type discussed linear isotropic dilates of a fixed point configuration. In this paper, we initiate the study of analogous density theorems for families of point configurations generated by anisotropic dilations, i.e., families with power-type dependence on a single parameter interpreted as their size. More specifically, we prove nonisotropic power-type generalizations of a result by Bourgain on vertices of a simplex, a result by Lyall and Magyar on vertices of a rectangular box, and a result on distance trees, which is a particular case of the treatise of distance graphs by Lyall and Magyar. Another source of motivation for this paper is providing additional evidence for the versatility of the approach stemming from the work of Cook, Magyar, and Pramanik and its modification used recently by Durcik and the present author. Finally, yet another purpose of this paper is to single out anisotropic multilinear singular integral operators associated with the above combinatorial problems, as they are interesting on their own.

Type
Article
Copyright
© Canadian Mathematical Society 2021

1 Introduction

1.1 Overview of previous results

The topic of our interest is density theorems within Euclidean Ramsey theory. Such theorems typically attempt to identify “many” dilates of a given finite point configuration in all sufficiently large measurable sets $A\subseteq \mathbb{R}^d$ . Here, the concept of largeness has to be interpreted in an appropriate measure-theoretic sense, namely by requiring that a certain density of A be strictly positive. A general and very convenient notion of density is the upper Banach density, defined for a measurable set $A\subseteq \mathbb{R}^d$ as

$$ \begin{align*} \overline{\delta}(A) := \limsup_{R\rightarrow\infty}\sup_{x\in\mathbb{R}^d}\frac{|A\cap(x+[0,R]^d)|}{R^d}. \end{align*} $$
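To illustrate the definition, here is a short worked computation for two simple sets. Both sets (an orthant and a bounded set) are illustrative examples of our own choosing, not objects taken from the results discussed in this paper. The first one shows the effect of the supremum over all translates $x$.

```latex
% Example 1: the orthant A = [0,\infty)^d.
% For every R > 0 the cube x + [0,R]^d with x = \mathbf{0} lies inside A, so
\begin{align*}
\sup_{x\in\mathbb{R}^d} \frac{|A\cap(x+[0,R]^d)|}{R^d} = 1
\quad\text{for each } R>0,
\qquad\text{hence}\qquad \overline{\delta}(A) = 1 .
\end{align*}
% Example 2: a bounded measurable set A \subseteq [-M,M]^d.
\begin{align*}
\frac{|A\cap(x+[0,R]^d)|}{R^d} \leq \frac{(2M)^d}{R^d} \xrightarrow[R\to\infty]{} 0
\quad\text{uniformly in } x,
\qquad\text{hence}\qquad \overline{\delta}(A) = 0 .
\end{align*}
```

In particular, positive upper Banach density only requires that the set be thick inside some arbitrarily large cubes, not everywhere in the space.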

A quite strong density theorem for the simplest possible point configuration, namely the set of two points, was shown independently by Furstenberg et al. [Reference Furstenberg, Katznelson, Weiss, Nešetřil and Rödl27], and Falconer and Marstrand [Reference Falconer and Marstrand23]:

  • For every measurable set $A\subseteq \mathbb{R}^2$ satisfying $\overline {\delta }(A)>0$ , there is a positive number $\lambda _0=\lambda _0(A)$ such that for each $\lambda \in [\lambda _0,\infty )$ , there exist points $x,x'\in A$ satisfying $|x-x'|=\lambda $ .

Here and in what follows, $|v|$ denotes the Euclidean norm of a vector $v\in \mathbb{R}^d$ . The claim extends to higher dimensions, but we always consider only the smallest dimension in which the result is known to hold. Bourgain [Reference Bourgain4] generalized the above result to the set of vertices $\Delta \subseteq \mathbb{R}^{n}$ of a nondegenerate n-dimensional (i.e., $(n+1)$ -point) simplex:

  • For every measurable set $A\subseteq \mathbb{R}^{n+1}$ satisfying $\overline {\delta }(A)>0$ , there is a positive number $\lambda _0=\lambda _0(A,\Delta )$ such that for each $\lambda \in [\lambda _0,\infty )$ , the set A contains an isometric copy of $\lambda \Delta $ .

Note that the configuration $\Delta $ is initially given in the n-dimensional Euclidean space, while the ambient Euclidean space (i.e., the one containing the set A) has dimension $n+1$ . This dimensional increase is used in all known proofs of the aforementioned result, giving an additional “degree of freedom,” but at the time of writing, it is not known if it is really necessary when $n\geq 2$ .

The paper [Reference Bourgain4] was very influential and it motivated a series of papers handling more complicated point configurations. Pursuing one possible direction, Lyall and Magyar initiated the study of density theorems for product-type point configurations. In [Reference Lyall and Magyar45], they considered Cartesian products $\Delta _1\times \Delta _2$ , where both $\Delta _1$ and $\Delta _2$ are sets of vertices of nondegenerate simplices, while in [Reference Lyall and Magyar46], they extended their study to Cartesian products of finitely many such sets. An interesting (and already nontrivial) particular case is the set of vertices of an n-dimensional rectangular box, $\Box = \{0,b_1\} \times \cdots \times \{0,b_n\} \subseteq \mathbb{R}^n$ for some $b_1,\ldots ,b_n>0$ . One of the results by Lyall and Magyar [Reference Lyall and Magyar46, Theorem 1.1(i)] reads as follows:

  • For every measurable set $A\subseteq \mathbb{R}^{2}\times \cdots \times \mathbb{R}^{2}=(\mathbb{R}^2)^n$ satisfying $\overline {\delta }(A)>0$ , there is a positive number $\lambda _0=\lambda _0(A,\Box )$ such that for each $\lambda \in [\lambda _0,\infty )$ , the set A contains an isometric copy of $\lambda \Box $ with sides parallel to the distinguished $2$ -dimensional coordinate planes. In other words, for each $\lambda \in [\lambda _0,\infty )$ , one can find $x_1,\ldots ,x_n, y_1,\ldots ,y_n \in \mathbb{R}^2$ satisfying

    $$ \begin{align*} \big\{ (x_1 + r_1 y_1, x_2 + r_2 y_2, \ldots, x_n + r_n y_n) : (r_1,\ldots,r_n)\in\{0,1\}^n \big\} \subseteq A \end{align*} $$
    and $|y_k| = \lambda b_k$ for $k=1,2,\ldots ,n$ .

In fact, Durcik and the present author first established a weaker result [Reference Durcik and Kovač15, Theorem 1], with $(\mathbb{R}^2)^n$ replaced by $(\mathbb{R}^5)^n$ , and then also reproved the above result [Reference Durcik and Kovač16, Theorem 3]. The main concern of the paper [Reference Durcik and Kovač16] was a certain quantitative aspect that will not be discussed here, but the proof given there will turn out to be quite relevant for the present paper.

Generalizing Bourgain’s result in another direction, Lyall and Magyar [Reference Lyall and Magyar47] studied density theorems for the so-called distance graphs. Informally speaking, these are graphs embedded in a Euclidean space that carry information about lengths of their edges. Certain nondegeneracy conditions are then needed in order to have meaningful results. We will neither give a precise definition of those concepts here, nor formulate the most general known result on this topic, which is [Reference Lyall and Magyar47, Theorem 2]. Instead, we will state its particular case when the graph is a tree, as this is the one that we are about to generalize later.

Take a tree $\mathcal {T}=(V,E)$ on a finite set of vertices V, having E as the set of its edges. Suppose that we are also given a function $\ell \colon E\to (0,\infty )$ , so that $\ell (e)$ is interpreted as the “length” of an edge $e\in E$ . One could say that $\mathcal {T}$ equipped with $\ell $ is a distance tree. A special case of [Reference Lyall and Magyar47, Theorem 2] by Lyall and Magyar reads as follows:

  • For every measurable set $A\subseteq \mathbb{R}^{2}$ satisfying $\overline {\delta }(A)>0$ , there is a positive number $\lambda _0=\lambda _0(A,\mathcal {T},\ell )$ such that for each $\lambda \in [\lambda _0,\infty )$ , the set A contains a set of points $\{x_v:v\in V\}$ satisfying $|x_u-x_v|=\lambda \ell (e)$ whenever $e\in E$ is an edge connecting vertices $u,v\in V$ .

Interestingly, distance trees and their even more special cases, distance chains, also play a prominent role in somewhat related problems [Reference Bennett, Iosevich and Taylor1, Reference Iosevich and Taylor38]. Distance trees are not rigid point configurations. One can still talk about their “isometric copies” within A (and remain in line with the previous formulations), if one defines the concept of isometric distance graphs in an obvious way; see [Reference Lyall and Magyar47].

In formulations of all of these results, the emphasis needs to be put on the fact that all sufficiently large dilates of the configuration can be identified in the set A. Note that finding just any dilate is trivial, because the Lebesgue density arguments identify sufficiently small dilates of any given finite configuration inside a set of positive measure. On the other hand, the existence of merely some sufficiently large dilates can be deduced easily from Szemerédi’s theorem [Reference Szemerédi56] or its multidimensional version by Furstenberg and Katznelson [Reference Furstenberg and Katznelson26].

Let us also mention that there are many related papers that study lower-dimensional subsets A of the Euclidean space [Reference Bennett, Iosevich and Taylor1, Reference Chan, Łaba and Pramanik6, Reference Greenleaf, Iosevich, Liu and Palsson31–Reference Greenleaf, Iosevich and Pramanik33, Reference Iosevich and Liu36–Reference Iosevich and Taylor38, Reference Yavicoli59], subsets A of the multidimensional integer lattice [Reference Bulinski5, Reference Lyall and Magyar48, Reference Magyar49], or patterns with arithmetic structure [Reference Cook, Magyar and Pramanik10, Reference Denson, Pramanik and Zahl11, Reference Durcik and Kovač15–Reference Durcik, Kovač and Rimanić17, Reference Fraser, Guo and Pramanik24, Reference Fraser and Pramanik25, Reference Henriot, Łaba and Pramanik34, Reference Keleti39, Reference Łaba and Pramanik44, Reference Shmerkin51]. We do not discuss these types of results here.

1.2 Statements of new results

It is natural to start asking questions on generalizations of the above results to configurations that are not dilated equally in all directions. For instance, one might ask if a positive upper Banach density set necessarily contains copies of rectangles with sides $\lambda $ and $\lambda ^2$ for all sufficiently large numbers $\lambda $ . In the present paper, we attempt to answer a few questions of this type, without insisting on formulating the most general possible results; see the comments in Section 6.2 below. The presented proofs will reveal which of the techniques found in the literature allow modifications applicable to the anisotropic setting, and which objects from harmonic analysis appear along the way.

First, we turn to n-dimensional simplices in $\mathbb{R}^{n+1}$ . Let us try to come up with a natural polynomial formulation of Bourgain’s result. For instance, it does not make sense to look for triangles with sides $\lambda $ , $\lambda ^2$ , and $\lambda ^3$ , because these three numbers fail to satisfy the triangle inequality for large $\lambda $ . However, one can still require that adjacent sides of the desired triangle are $\lambda $ and $\lambda ^2$ units long and that the angle between them is fixed. Similarly, it can also be interesting to study right-angled simplices with perpendicular edges of lengths $\lambda , \lambda ^2, \ldots , \lambda ^n$ , i.e., simplices scaled according to the so-called moment curve $\lambda \mapsto (\lambda ,\lambda ^2,\ldots ,\lambda ^n)$ .
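The failure of the triangle inequality mentioned above can be made explicit by a one-line computation. The following is only a quick verification; the threshold that comes out happens to be the golden ratio.

```latex
% For \lambda > 0, the side \lambda^3 exceeds the sum of the other two sides exactly when
\begin{align*}
\lambda + \lambda^2 < \lambda^3
\iff \lambda^2 - \lambda - 1 > 0
\iff \lambda > \frac{1+\sqrt{5}}{2} \approx 1.618 .
\end{align*}
% Hence no triangle with sides \lambda, \lambda^2, \lambda^3 exists once \lambda > (1+\sqrt{5})/2.
```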

In fact, polynomials will not play any role here, and the approach will generalize naturally to power-type dilations with ratios of the form $\lambda ^a b$ , where $a,b>0$ are fixed parameters. In particular, a does not need to be an integer. In other words, we will be working with anisotropic power-type dilations

(1.1) $$ \begin{align} (x_1,\ldots,x_n) \mapsto (\lambda^{a_1} b_1 x_1,\ldots, \lambda^{a_n} b_n x_n), \end{align} $$

but not necessarily in the standard coordinate system. Here and in what follows, n is a fixed positive integer, while

(1.2) $$ \begin{align} a_1,a_2,\ldots,a_n, b_1,b_2,\ldots,b_n>0 \end{align} $$

are fixed parameters.

Our first result is an anisotropic version of the above theorem of Bourgain [Reference Bourgain4]. Suppose that we are also given linearly independent unit vectors

(1.3) $$ \begin{align} u_1, u_2, \ldots, u_n \in \mathbb{R}^{n}, \end{align} $$

in addition to the numbers (1.2). The idea is that these vectors determine directions of the edges (meeting at a single point) of the simplex with vertices $\Delta = \{\textbf {0}, b_1 u_1, b_2 u_2, \ldots , b_n u_n\}$ .

Theorem 1 For every measurable set $A\subseteq \mathbb{R}^{n+1}$ satisfying $\overline {\delta }(A)>0$ , there is a positive number $\lambda _0=\lambda _0(A,a_1,\ldots ,a_n,b_1,\ldots ,b_n,u_1,\ldots ,u_n)$ such that for each $\lambda \in [\lambda _0,\infty )$ , one can find a point $x\in \mathbb{R}^{n+1}$ and vectors $y_1,y_2,\ldots ,y_n\in \mathbb{R}^{n+1}$ satisfying

$$ \begin{align*} \{x,x+y_1,x+y_2,\ldots,x+y_n\} \subseteq A \end{align*} $$

and

$$ \begin{align*} y_k\cdot y_l = \lambda^{a_k} b_k u_k\cdot \lambda^{a_l} b_l u_l \quad \text{for } k,l=1,2,\ldots,n. \end{align*} $$

In other words, for each $\lambda \in [\lambda _0,\infty )$ , the set A contains an isometric copy of

$$ \begin{align*} \big\{\textbf{0}, \lambda^{a_1} b_1 u_1, \lambda^{a_2} b_2 u_2, \ldots, \lambda^{a_n} b_n u_n\big\}. \end{align*} $$

A notable particular case of Theorem 1 is obtained when the unit vectors (1.3) are mutually orthogonal, i.e., the simplex in question is right-angled. In this case, the theorem simply guarantees the existence of mutually orthogonal vectors $y_k$ with lengths $\lambda ^{a_k} b_k$ such that a translate of $\{\textbf {0},y_1,\ldots ,y_n\}$ is contained in A.
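As a sanity check on the formulation of Theorem 1, the following short computation shows that prescribing the inner products $y_k\cdot y_l$ prescribes all pairwise distances in the configuration. The concrete parameters in the second display ( $n=2$ , $a_1=1$ , $a_2=2$ , $b_1=b_2=1$ ) are chosen by us only for illustration.

```latex
% General fact: all distances are determined by the Gram matrix (y_k \cdot y_l), using |u_k| = 1.
\begin{align*}
|y_k|^2 = y_k\cdot y_k = \lambda^{2a_k} b_k^2, \qquad
|y_k - y_l|^2 = \lambda^{2a_k} b_k^2 - 2\lambda^{a_k+a_l} b_k b_l\, u_k\cdot u_l + \lambda^{2a_l} b_l^2 .
\end{align*}
% Illustration: n = 2, u_1 \perp u_2, a_1 = 1, a_2 = 2, b_1 = b_2 = 1
% (a right-angled triangle with legs \lambda and \lambda^2, sought in \mathbb{R}^3).
\begin{align*}
|y_1| = \lambda, \qquad |y_2| = \lambda^2, \qquad y_1\cdot y_2 = 0, \qquad
|y_1 - y_2| = \sqrt{\lambda^2+\lambda^4} = \lambda\sqrt{1+\lambda^2} .
\end{align*}
```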

Now, we turn to n-dimensional rectangular boxes in $\mathbb{R}^{2n}$ . Our second result is an anisotropic generalization of the above theorem of Lyall and Magyar [Reference Lyall and Magyar46].

Theorem 2 For every measurable set $A\subseteq (\mathbb{R}^{2})^n$ satisfying $\overline {\delta }(A)>0$ , there is a positive number $\lambda _0=\lambda _0(A,a_1,\ldots ,a_n,b_1,\ldots ,b_n)$ such that for each $\lambda \in [\lambda _0,\infty )$ , one can find $x_1,\ldots ,x_n, y_1,\ldots ,y_n \in \mathbb{R}^2$ satisfying

$$ \begin{align*} \big\{ (x_1 + r_1 y_1, x_2 + r_2 y_2, \ldots, x_n + r_n y_n) : (r_1,\ldots,r_n)\in\{0,1\}^n \big\} \subseteq A \end{align*} $$

and

$$ \begin{align*} |y_k| = \lambda^{a_k} b_k\quad \text{for } k=1,2,\ldots,n. \end{align*} $$

In other words, for each $\lambda \in [\lambda _0,\infty )$ , the set A contains an isometric copy of

$$ \begin{align*} \{0,\lambda^{a_1} b_1\} \times \{0,\lambda^{a_2} b_2\} \times \cdots\times \{0,\lambda^{a_n} b_n\} \subset \mathbb{R}^n \end{align*} $$

with sides parallel to the distinguished $2$ -dimensional coordinate planes.

At first sight, it might appear that Theorem 2 generalizes the “right-angled case” of Theorem 1, because vertices of a right-angled simplex clearly form a subset of the set of vertices of an appropriate rectangular box. A subtle distinction is that Theorem 1 already holds in the $(n+1)$ -dimensional Euclidean space, but there we allowed the simplex to rotate in all possible directions. As opposed to that, Theorem 2 is easily seen to fail in fewer than $2n$ dimensions, because in it we consider only rotations coming from n coordinate planes of the splitting $(\mathbb{R}^{2})^n=\mathbb{R}^2\times \mathbb{R}^2\times \cdots \times \mathbb{R}^2$ . A $(2n-1)$ -dimensional counterexample is the set $A\subseteq \mathbb{R}\times \mathbb{R}^2\times \cdots \times \mathbb{R}^2$ obtained by restricting the first coordinate to $\bigcup _{m\in \mathbb{Z}}[m-1/10,m+1/10]$ . On the other hand, the corresponding result for arbitrarily rotated rectangular boxes in fewer than $2n$ dimensions has been neither proved nor disproved at the time of writing, even in the case of a cube, i.e., when all parameters from (1.2) are equal to $1$ .
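The counterexample can be verified directly. The computation below is a sketch under the reading that $A = S\times (\mathbb{R}^2)^{n-1}$ with $S=\bigcup _{m\in \mathbb{Z}}[m-1/10,m+1/10]$ ; the particular sequence of scales $\lambda $ used in it is our own choice.

```latex
% Density: S is 1-periodic and occupies a proportion 2/10 of each period, so
\begin{align*}
\overline{\delta}(A) = \frac{2}{10} = \frac{1}{5} > 0 .
\end{align*}
% Failure: in the first factor \mathbb{R} we would need x_1, x_1 + y_1 \in S with |y_1| = \lambda^{a_1} b_1.
% Choose \lambda such that \lambda^{a_1} b_1 = m + 1/2 for some positive integer m.
% If x_1 \in S, i.e., \operatorname{dist}(x_1,\mathbb{Z}) \leq 1/10, then
\begin{align*}
\operatorname{dist}(x_1 + y_1,\mathbb{Z})
\geq \operatorname{dist}(y_1,\mathbb{Z}) - \operatorname{dist}(x_1,\mathbb{Z})
\geq \frac{1}{2} - \frac{1}{10} = \frac{2}{5} > \frac{1}{10},
\end{align*}
% so x_1 + y_1 \notin S. The numbers \lambda = ((m+1/2)/b_1)^{1/a_1} are arbitrarily large.
```

Thus the conclusion of Theorem 2 fails for an unbounded set of parameters $\lambda $ , even though the set has positive upper Banach density.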

Finally, we give an anisotropic generalization of the aforementioned result of Lyall and Magyar on distance trees [Reference Lyall and Magyar47]. Let $\mathcal {T}=(V,E)$ be a finite tree with vertices V and edges E. It is convenient to identify the set of edges E with $\{1,2,\ldots ,n\}$ , and the number of vertices is then equal to $n+1$ . This way, the parameters $a_k,b_k$ from (1.2) are associated to the edges $k\in E$ of the tree. We no longer need to mention any length function $\ell $ , as the assignment $k\mapsto a_k,b_k$ gives rise to an even more complex structure. However, if we want to have some length function defined explicitly, then we can set $\ell (k):=b_k$ for each edge k.

Theorem 3 For every measurable set $A\subseteq \mathbb{R}^{2}$ satisfying $\overline {\delta }(A)>0$ , there is a positive number $\lambda _0=\lambda _0(A,\mathcal {T},a_1,\ldots ,a_n,b_1,\ldots ,b_n)$ such that for each $\lambda \in [\lambda _0,\infty )$ , one can find a set of points

$$ \begin{align*} \{ x_v : v\in V \} \subseteq A \end{align*} $$

satisfying

$$ \begin{align*} |x_u-x_v| = \lambda^{a_k} b_k\quad \text{for each edge } k\in E \text{ joining vertices } u,v\in V. \end{align*} $$

In other words, for each $\lambda \in [\lambda _0,\infty )$ , the set A contains an embedding of the distance tree combinatorially isomorphic to $\mathcal {T}$ and having the numbers $\ell (k)=\lambda ^{a_k} b_k$ as lengths of its edges.

Note that Theorem 3 is placed in two dimensions only. If we disregarded its dimensional sharpness, the particular case $a_1=\cdots =a_n$ , $b_1=\cdots =b_n$ of Theorem 3 would be a consequence of Theorem 2, because each tree is easily seen to be a subgraph of the hypercube graph in a sufficiently large dimension.

Proofs of Theorems 1–3 will rely on a few relatively well-known ideas from harmonic analysis, modulo a general approach described in Section 1.3. Therefore, the main contribution of the present paper lies simply in recollecting, selecting, and reapplying those ideas to the above combinatorial problems. Further connections between harmonic analysis and the combinatorics of the Euclidean space will be discussed at the end of the paper, in Section 6.1.

1.3 General scheme of the approach

The proofs of Theorems 1–3 will be presented in Sections 3–5, respectively. Here, we only discuss the general outline.

We will fit the proofs to the scheme that we are about to describe. The approach is an abstraction of the method stemming from the work of Bourgain [Reference Bourgain4] and first used by Cook et al. [Reference Cook, Magyar and Pramanik10] in a way that emphasizes the role of estimates for multilinear singular integrals or similar objects. Its variant was named the largeness–smoothness multiscale approach by Durcik and the present author [Reference Durcik and Kovač16]. Limitations of the method are essentially only the limitations within the field of multilinear harmonic analysis, which has seen vast and rapid development over the last few decades. The approach has already been reused several times after [Reference Cook, Magyar and Pramanik10] (see the papers [Reference Durcik and Kovač15–Reference Durcik, Kovač and Rimanić17], which study arithmetic progressions and related configurations), but here, we want to point out that the method is also effective for many geometric configurations without any algebraic structure.

For each of the studied problems, we will define a counting form $\mathcal {N}^{0}_{\lambda }$ that identifies the configuration associated with the parameter $\lambda>0$ . In order to prove the claim, it is sufficient to show that $\mathcal {N}^{0}_{\lambda }$ is positive for all $\lambda $ that are sufficiently large depending on the set A. Besides $\lambda $ , which can be thought of as a scale of largeness, there will be another scale $0<\varepsilon \leq 1$ , interpreted as a scale of smoothness. A two-parameter family of counting forms $\mathcal {N}^{\varepsilon }_{\lambda }$ will recover $\mathcal {N}^{0}_{\lambda }$ in the limit as $\varepsilon \to 0$ . The reason for “smoothing” or “blurring out” up to scale $\varepsilon $ is that the smoother configuration can be identified with a more direct counting argument. Thus, the method starts by decomposing $\mathcal {N}^{0}_{\lambda }$ as

(1.4) $$ \begin{align} \mathcal{N}^{1}_{\lambda} + \big(\mathcal{N}^{\varepsilon}_{\lambda}-\mathcal{N}^{1}_{\lambda}\big) + \big(\mathcal{N}^{0}_{\lambda} - \mathcal{N}^{\varepsilon}_{\lambda}\big). \end{align} $$

The smoothest term $\mathcal {N}^{1}_{\lambda }$ can be thought of as the structured part, and its lower bound is a simpler problem. Typically, one needs to zoom the picture to scale $\lambda $ and then perform a direct counting argument. In this paper, we need to handle mismatched scales $\lambda ^{a_1},\ldots ,\lambda ^{a_n}$ simultaneously, which is a novel complication in relation to simplices, boxes, or trees, but it has already appeared in the work of Bourgain on polynomial three-term progressions [Reference Bourgain3]. The term $\mathcal {N}^{1}_{\lambda }$ is a reason why we will first show several estimates for general filtrations of a fixed probability space in Section 2.3.

The third term $\mathcal {N}^{0}_{\lambda }-\mathcal {N}^{\varepsilon }_{\lambda }$ is interpreted as the uniform part, and some oscillatory phenomenon should guarantee that it converges to $0$ uniformly in $\lambda $ as $\varepsilon \to 0$ . Quite often and also in this paper, the only oscillatory estimate needed is the decay of the Fourier transform of a spherical measure; see (2.1) and (2.2) below. Thus, one can fix a sufficiently small $\varepsilon>0$ such that the uniform part is always dominated by the structured part.

The middle term $\mathcal {N}^{\varepsilon }_{\lambda }-\mathcal {N}^{1}_{\lambda }$ is the error part. It cannot be efficiently estimated for a fixed value of $\lambda $ , so we rather attempt to control it “on the average” for sufficiently many scales $\lambda _1<\lambda _2<\cdots <\lambda _J$ satisfying, say, $\lambda _{j+1}\geq 2\lambda _j$ for each j. More precisely, sums of the form

(1.5) $$ \begin{align} \sum_{j=1}^{J} \big|\mathcal{N}^{\varepsilon}_{\lambda_j}-\mathcal{N}^{1}_{\lambda_j}\big| \end{align} $$

for lacunary scales $\lambda _j$ are shown to satisfy a bound that is allowed to blow up as $\varepsilon \to 0$ , but is at the same time nontrivial in the total number of scales J, i.e., it grows like $o(J)$ as $J\to \infty $ . When J is sufficiently large, pigeonholing guarantees that at least one of the individual errors $|\mathcal {N}^{\varepsilon }_{\lambda _j}-\mathcal {N}^{1}_{\lambda _j}|$ is sufficiently small. It is precisely the multiscale quantity (1.5) that resembles a certain multilinear singular integral form. Bounds for (1.5) are shown using certain “cancellation” between different scales $\lambda _j$ . This can be done using techniques from multilinear harmonic analysis, by treating (1.5) as a multisublinear integral operator. This is a route that we follow in the present paper, except that we clean up the proofs by reducing the aforementioned operator bounds merely to several Gaussian identities and estimates that we first establish in Section 2.2. To a large extent, these have already appeared in [Reference Durcik and Kovač16].

Finally, the actual result is shown by contradiction: we assume that the set A contains no copies of the desired configuration associated with a lacunary sequence of parameters $\lambda _1<\lambda _2<\cdots $ . By choosing $\varepsilon>0$ sufficiently small and then choosing J sufficiently large, we can guarantee that $|\mathcal {N}^{\varepsilon }_{\lambda _j}-\mathcal {N}^{1}_{\lambda _j}|$ and $|\mathcal {N}^{0}_{\lambda _j}-\mathcal {N}^{\varepsilon }_{\lambda _j}|$ are both dominated by $\mathcal {N}^{1}_{\lambda _j}$ for at least one index j. This implies $\mathcal {N}^{0}_{\lambda _j}>0$ , which contradicts our hypothesis that A does not contain the desired configuration of size $\lambda _j$ . A small technicality is that it is more convenient to work with a localized version B of the given set A. Details of the method applied to specific configurations can be found, for instance, in [Reference Cook, Magyar and Pramanik10, Section 2] or [Reference Durcik, Kovač and Rimanić17, Section 3]. We will also be completely rigorous about these details in Sections 3–5.
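The logic of the last paragraph can be summarized in a few lines. The sketch below is schematic: the constant $c$ , the modulus $\omega $ , and the quantities $C(\varepsilon )$ and $\theta (J)$ are hypothetical placeholders for the bounds that have to be proved for each concrete configuration, and they are not the actual quantities appearing in the later sections.

```latex
% Hypothetical bounds, assumed valid for all scales under consideration:
%   structured part:  \mathcal{N}^{1}_{\lambda} \geq c > 0,
%   uniform part:     |\mathcal{N}^{0}_{\lambda} - \mathcal{N}^{\varepsilon}_{\lambda}| \leq \omega(\varepsilon),
%                     where \omega(\varepsilon) \to 0 as \varepsilon \to 0,
%   error part:       \sum_{j=1}^{J} |\mathcal{N}^{\varepsilon}_{\lambda_j} - \mathcal{N}^{1}_{\lambda_j}| \leq C(\varepsilon)\,\theta(J),
%                     where \theta(J)/J \to 0 as J \to \infty.
% Step 1: fix \varepsilon such that \omega(\varepsilon) \leq c/3.
% Step 2: fix J such that C(\varepsilon)\,\theta(J) \leq cJ/3; by pigeonholing there is an index j with
\begin{align*}
\big|\mathcal{N}^{\varepsilon}_{\lambda_j} - \mathcal{N}^{1}_{\lambda_j}\big|
\leq \frac{1}{J}\sum_{i=1}^{J} \big|\mathcal{N}^{\varepsilon}_{\lambda_i} - \mathcal{N}^{1}_{\lambda_i}\big|
\leq \frac{c}{3} .
\end{align*}
% Step 3: insert these bounds into the decomposition (1.4):
\begin{align*}
\mathcal{N}^{0}_{\lambda_j}
\geq \mathcal{N}^{1}_{\lambda_j}
- \big|\mathcal{N}^{\varepsilon}_{\lambda_j} - \mathcal{N}^{1}_{\lambda_j}\big|
- \big|\mathcal{N}^{0}_{\lambda_j} - \mathcal{N}^{\varepsilon}_{\lambda_j}\big|
\geq c - \frac{c}{3} - \frac{c}{3} = \frac{c}{3} > 0 .
\end{align*}
```

Note the order of the choices: $\varepsilon $ is fixed first, so the possible blow-up of $C(\varepsilon )$ is harmless once J is taken large enough.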

Already the pioneering work of Bourgain [Reference Bourgain4] used a particular case of the above scheme of proof. The main novelty introduced by Cook et al. [Reference Cook, Magyar and Pramanik10] is that the smoother version $\mathcal {N}^{\varepsilon }_{\lambda }$ of the counting form $\mathcal {N}^{0}_{\lambda }$ need not be obtained by smoothing the input functions (see Sections 4 and 5), even though this is one legitimate possibility (see Section 3). Admittedly, this general scheme was primarily devised for studying “more singular configurations,” such as arithmetic progressions; see [Reference Cook, Magyar and Pramanik10, Reference Durcik and Kovač15–Reference Durcik, Kovač and Rimanić17]. It can be an overkill in relation to simplices or boxes, as the papers [Reference Bourgain4, Reference Lyall and Magyar45, Reference Lyall and Magyar47] proceed by following a different philosophy. However, the present work does benefit from the power and flexibility of the general largeness–smoothness multiscale approach. Namely, we will define $\mathcal {N}^{\varepsilon }_{\lambda }$ using the heat flow; the motivation comes from [Reference Durcik and Kovač16, Section 7], while [Reference Durcik and Kovač16, Sections 3–6] also employ a similar time-space dynamics. That way, we will make use of a simple fact that the heat equation remains essentially the same after a power-type change of the time variable; compare formulae (2.6) and (2.7) below.

As we said, Gaussians play a prominent role throughout the paper, so Section 2.2 will recall their properties needed in the proofs. The most notable ones will be the heat equation (2.6), estimates (2.8) and (2.9), and identities (2.10) and (2.15) below. They allow us to easily estimate convolutions with general probability measures both from below (which will be needed in Sections 3.1, 4.1, and 5.1) and from above (which will be needed in Sections 3.2, 4.2, and 5.2). It is quite likely that other semigroup structures work as well, at least in Section 3, as Bourgain [Reference Bourgain4] used the Poisson kernel for simplices. In any case, the present paper tries to advertise Gaussians as convenient mollifiers for the problems studied here.

1.4 Organization of the paper

Let us briefly explain how the rest of the paper is organized. Section 2 discusses the notation used throughout the paper. It also recalls a few basic notions from Fourier analysis, proves a couple of identities concerning Gaussian functions, and shows a few inequalities for conditional expectations on a general probability space. Section 3 establishes Theorem 1 on anisotropic simplices, Section 4 establishes Theorem 2 on anisotropic rectangular boxes, while Section 5 proves Theorem 3 on anisotropic trees. Each of these sections is divided further into three subsections that, respectively, handle structured, error, and uniform terms from the basic splitting (1.4). Finally, Section 6 discusses anisotropic multilinear singular integral operators that are motivated by the above combinatorial problems. It also comments on possible generalizations of the results and limitations of the approach.

2 Notation and preliminaries

2.1 Basic notation

Let A and B be two nonnegative quantities. We write $A\lesssim B$ and $B\gtrsim A$ if $A\leq C B$ holds for some (unimportant) finite positive constant C. We write $A\sim B$ if $c B\leq A\leq C B$ holds for finite positive constants c and C. Throughout the paper, it will be understood that any of these constants $c,C$ are allowed to depend on the dimension of the ambient Euclidean space, the number n, the parameters from (1.2), and (in Section 3) also on the unit vectors from (1.3), but are independent of all other parameters or variables.

The open Euclidean ball with radius r centered at x will be denoted $\textrm {B}(x,r)$ . The (standard) inner product and the Euclidean norm on $\mathbb{R}^d$ will be written as $(x,y)\mapsto x\cdot y$ and $x\mapsto |x|$ , respectively. The distance from a point $x\in \mathbb{R}^d$ to a set $S\subseteq \mathbb{R}^d$ will be denoted $\mathop {\textrm {dist}}(x,S)$ . The linear span of a set of vectors $S\subseteq \mathbb{R}^d$ will be written as $\mathop {\textrm {span}}(S)$ . We will write $\mathbbm{1}_A$ for the indicator function of a set $A\subseteq \mathbb{R}^d$ . The floor function is denoted $x\mapsto \lfloor x\rfloor $ , i.e., $\lfloor x\rfloor $ is the largest integer less than or equal to $x\in \mathbb{R}$ . The complex imaginary unit will be written as $\mathbbm{i}$ . The logarithm function will be written “ $\log $ ,” and it will be understood that its base is the number e.

We will always specify the measure with respect to which the integrals are evaluated, unless we are working with the Lebesgue measure. Similarly, we will simply write $|A|$ for the Lebesgue measure of A. By $\fint _A f(x) \,\textrm {d}x := |A|^{-1}\int _A f(x) \,\textrm {d}x$ , we denote the average of a locally integrable complex function f over a bounded measurable set $A\subseteq \mathbb{R}^d$ of positive measure. We will write $f\mapsto \|f\|_{\textrm {L}^p}$ for the norm of $\textrm {L}^p(\mathbb{R}^d)$ , $p\in [1,\infty ]$ , and $(f,g)\mapsto \langle f,g\rangle _{\textrm {L}^2}$ for the inner product in $\textrm {L}^2(\mathbb{R}^d)$ .

Now, we come to dilates and convolutions of functions and measures. For an integrable function $f\colon \mathbb{R}^d\to \mathbb{C}$ and a number $\lambda \in \mathbb{R}\setminus \{0\}$ , we define

$$ \begin{align*} f_{\lambda}(x) := |\lambda|^{-d} f(\lambda^{-1}x) \quad \text{for }x\in\mathbb{R}^d. \end{align*} $$

More generally, for a finite measure $\nu $ on Borel subsets $A\subseteq \mathbb{R}^d$ , we write

$$ \begin{align*} \nu_{\lambda}(A) := \nu(\lambda^{-1}A) = \nu(\{\lambda^{-1}x : x\in A\}). \end{align*} $$

Note that the normalizations of $f_{\lambda }$ and $\nu _{\lambda }$ are consistent with each other: if $\nu $ happens to be an absolutely continuous measure with density f, then $\nu _{\lambda }$ will have $f_{\lambda }$ for its density. In fact, for a bounded measurable function $h\colon \mathbb{R}^d\to \mathbb{C}$ , we have

$$ \begin{align*} \int_{\mathbb{R}^d} h(x) \,\textrm{d}\nu_{\lambda}(x) & = \int_{\mathbb{R}^d} h(\lambda x) \,\textrm{d}\nu(x), \\ \int_{\mathbb{R}^d} h(x) f_{\lambda}(x) \,\textrm{d}x & = \int_{\mathbb{R}^d} h(\lambda x) f(x) \,\textrm{d}x. \end{align*} $$
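The second of these identities, and with it the claimed consistency of the two normalizations, follows from a linear change of variables. The lines below merely spell out this routine step.

```latex
% Substitute x = \lambda y, so that \textrm{d}x = |\lambda|^d \,\textrm{d}y:
\begin{align*}
\int_{\mathbb{R}^d} h(x) f_{\lambda}(x) \,\textrm{d}x
= \int_{\mathbb{R}^d} h(x)\, |\lambda|^{-d} f(\lambda^{-1}x) \,\textrm{d}x
= \int_{\mathbb{R}^d} h(\lambda y) f(y) \,\textrm{d}y .
\end{align*}
% Taking h to be the indicator function of a Borel set B gives
\begin{align*}
\int_{B} f_{\lambda}(x) \,\textrm{d}x = \int_{\lambda^{-1}B} f(y) \,\textrm{d}y = \nu(\lambda^{-1}B) = \nu_{\lambda}(B),
\end{align*}
% i.e., f_\lambda is the density of \nu_\lambda whenever f is the density of \nu.
```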

If $g\colon \mathbb{R}^d\to \mathbb{C}$ is another integrable function, then it makes sense to define

$$ \begin{align*} (f\ast g)(x) := \int_{\mathbb{R}^d} f(y) g(x-y) \,\textrm{d}y \quad \text{for a.e. }x\in\mathbb{R}^d. \end{align*} $$

It is well known that the operation of convolution is commutative. More generally, the convolution of a finite measure $\nu $ and an integrable function g is defined as

$$ \begin{align*} (\nu\ast g)(x) := \int_{\mathbb{R}^d} g(x-y) \,\textrm{d}\nu(y) \quad \text{for a.e. }x\in\mathbb{R}^d. \end{align*} $$

Even at this level of generality, the operation of convolution is associative, i.e., $({\nu \ast f})\ast g=\nu \ast (f\ast g)$ , while the Dirac delta measure at the origin, denoted $\delta _{\textbf {0}}$ , serves as the identity element.

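The Fourier transform of an integrable function $f\colon \mathbb{R}^d\to \mathbb{C}$ is $\widehat {f}\colon \mathbb{R}^d\to \mathbb{C}$ defined as

$$ \begin{align*} \widehat{f}(\xi) := \int_{\mathbb{R}^d} f(x) e^{-2\pi\mathbbm{i} x\cdot\xi} \,\textrm{d}x \quad \text{for } \xi\in\mathbb{R}^d, \end{align*} $$

while the Fourier transform of a finite Borel measure $\nu $ is $\widehat {\nu }\colon \mathbb{R}^d\to \mathbb{C}$ defined as

$$ \begin{align*} \widehat{\nu}(\xi) := \int_{\mathbb{R}^d} e^{-2\pi\mathbbm{i} x\cdot\xi} \,\textrm{d}\nu(x) \quad \text{for } \xi\in\mathbb{R}^d. \end{align*} $$

Basic properties of the Fourier transform can be found in any introductory textbook on harmonic analysis, such as [Reference Stein and Weiss54]. For instance, it is useful to know that, for any $\lambda \in \mathbb{R}\setminus \{0\}$ and $f,\nu $ as above, one has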

$$ \begin{align*} \widehat{f_{\lambda}}(\xi) = \widehat{f}(\lambda \xi), \quad \widehat{\nu_{\lambda}}(\xi) = \widehat{\nu}(\lambda \xi) \quad \text{for } \xi\in\mathbb{R}^d. \end{align*} $$
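The dilation rule for $\widehat{f_{\lambda}}$ can be checked in one line. As a sketch, assume $\lambda>0$ and the convention in which the Fourier kernel is $e^{-2\pi \textrm{i} x\cdot\xi}$ (other standard normalizations behave identically), and apply the earlier identity $\int h(x) f_{\lambda}(x)\,\textrm{d}x=\int h(\lambda x) f(x)\,\textrm{d}x$ with $h(x)=e^{-2\pi \textrm{i} x\cdot\xi}$:

$$ \begin{align*} \widehat{f_{\lambda}}(\xi) = \int_{\mathbb{R}^d} e^{-2\pi \textrm{i} x\cdot\xi} f_{\lambda}(x) \,\textrm{d}x = \int_{\mathbb{R}^d} e^{-2\pi \textrm{i} x\cdot(\lambda\xi)} f(x) \,\textrm{d}x = \widehat{f}(\lambda\xi). \end{align*} $$

The computation for $\nu_{\lambda}$ is the same, with $\textrm{d}\nu(x)$ in place of $f(x)\,\textrm{d}x$.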

Important instances of Borel measures are spherical measures. If $\sigma $ is the normalized surface measure of the $(d-1)$ -dimensional standard unit sphere $\mathbb{S}^{d-1}$ in $\mathbb{R}^{d}$ , $d\geq 2$ , then its Fourier transform satisfies the well-known decay

(2.1) $$ \begin{align} |\widehat{\sigma}(\xi)| \lesssim \min\big\{1,|\xi|^{-(d-1)/2}\big\} \quad \text{for } \xi\in\mathbb{R}^d; \end{align} $$

see [Reference Stein52, Section VIII.5.B]. If $\sigma $ is a normalized $(k-1)$ -dimensional spherical measure supported on a sphere of radius $r>0$ in a k-dimensional plane in $\mathbb{R}^{d}$ orthogonal to some $(d-k)$ -dimensional linear subspace $H\subset \mathbb{R}^{d}$ , then

(2.2) $$ \begin{align} |\widehat{\sigma}(\xi)| \lesssim \min\left\{1,\left(r\mathop{\textrm{dist}}(\xi,H)\right)^{-(k-1)/2}\right\} \quad \text{for } \xi\in\mathbb{R}^d. \end{align} $$

2.2 Gaussian identities

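We write $\mathbb{g}$ for the standard d-dimensional Gaussian function,

$$ \begin{align*} \mathbb{g}(x) := e^{-\pi |x|^2} \quad \text{for } x\in\mathbb{R}^d. \end{align*} $$

We also reserve special letters for its partial derivatives,

$$ \begin{align*} \mathbb{h}^{(l)} := \partial_l \mathbb{g} \quad \text{for } l=1,2,\ldots,d, \end{align*} $$

and for its Laplacian

$$ \begin{align*} \mathbb{k} := \Delta \mathbb{g} = \sum_{l=1}^{d} \partial_l \mathbb{h}^{(l)}. \end{align*} $$

Basic properties of the Fourier transform easily yield

$$ \begin{align*} \widehat{\mathbb{g}}(\xi) = e^{-\pi |\xi|^2}, \quad \widehat{\mathbb{h}^{(l)}}(\xi) = 2\pi \textrm{i} \xi_l e^{-\pi |\xi|^2}, \end{align*} $$

and

$$ \begin{align*} \widehat{\mathbb{k}}(\xi) = -4\pi^2 |\xi|^2 e^{-\pi |\xi|^2}, \end{align*} $$

where $\xi =(\xi _1,\xi _2,\ldots ,\xi _d)\in \mathbb{R}^d$ is arbitrary. These formulae (and the fact that the Fourier transform interchanges convolutions and pointwise products) imply the following convolution identities:

(2.3) $$ \begin{align} \mathbb{g}_{\alpha} \ast \mathbb{g}_{\beta} = \mathbb{g}_{\sqrt{\alpha^2+\beta^2}}, \end{align} $$
(2.4) $$ \begin{align} \sum_{l=1}^{d} \mathbb{h}^{(l)}_{\alpha} \ast \mathbb{h}^{(l)}_{\beta} = \frac{\alpha\beta}{\alpha^2+\beta^2} \,\mathbb{k}_{\sqrt{\alpha^2+\beta^2}}, \end{align} $$

and

(2.5) $$ \begin{align} \mathbb{k}_{\alpha} \ast \mathbb{g}_{\beta} = \frac{\alpha^2}{\alpha^2+\beta^2} \,\mathbb{k}_{\sqrt{\alpha^2+\beta^2}} \end{align} $$

for any $\alpha ,\beta \in (0,\infty )$ . Identities (2.3)–(2.5) will be useful for splitting a dilate of $\mathbb{g}$ or $\mathbb{k}$ into a convolution of one term with a desired scale and a uniquely determined remaining term; see Sections 3.2 and 4.2.

The above Gaussian functions are easily seen to satisfy the heat equation:

(2.6) $$ \begin{align} \frac{\partial}{\partial t} \mathbb{g}_{t}(x) = \frac{1}{2\pi t} \,\mathbb{k}_{t}(x) \end{align} $$

on $(t,x)\in (0,\infty )\times \mathbb{R}^d$ . By a simple chain rule, (2.6) generalizes to

(2.7) $$ \begin{align} \frac{\partial}{\partial t} \big( \mathbb{g}_{t^a b}(x) \big) = \frac{a}{2\pi t} \,\mathbb{k}_{t^a b}(x), \end{align} $$

where $a,b\in (0,\infty )$ can be arbitrary. The heat equation will govern the smoothing dynamics; see the beginning of Section 3.2 and the beginning of Section 4.2.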
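Both the Gaussian semigroup identity (2.3) and the constant in the heat equation (2.6) can be read off on the Fourier side. The following sketch assumes the normalization in which the standard Gaussian is $e^{-\pi |x|^2}$, writes $\mathbb{g}_t$ for its $\textrm{L}^1$-normalized dilate and $\mathbb{k}_t$ for the corresponding dilate of its Laplacian, and uses $\widehat{\mathbb{g}_t}(\xi)=e^{-\pi t^2|\xi|^2}$ and $\widehat{\mathbb{k}_t}(\xi)=-4\pi^2t^2|\xi|^2e^{-\pi t^2|\xi|^2}$:

$$ \begin{align*} \widehat{\mathbb{g}_{\alpha}}(\xi) \,\widehat{\mathbb{g}_{\beta}}(\xi) & = e^{-\pi \alpha^2 |\xi|^2} e^{-\pi \beta^2 |\xi|^2} = e^{-\pi (\alpha^2+\beta^2) |\xi|^2} = \widehat{\mathbb{g}_{\sqrt{\alpha^2+\beta^2}}}(\xi), \\ \frac{\partial}{\partial t} \widehat{\mathbb{g}_{t}}(\xi) & = -2\pi t |\xi|^2 e^{-\pi t^2 |\xi|^2} = \frac{1}{2\pi t} \big( {-4\pi^2 t^2 |\xi|^2} e^{-\pi t^2 |\xi|^2} \big) = \frac{1}{2\pi t} \,\widehat{\mathbb{k}_{t}}(\xi). \end{align*} $$

The factor $1/2\pi$ obtained here is the source of the constant $2\pi$ in identity (2.15) below.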

On the one hand, because Gaussian tails decay faster than any polynomial, we trivially have

(2.8)

for $x\in \mathbb{R}^d$ . On the other hand,

(2.9)

for $x\in \mathbb{R}^d$ . In words, Schwartz tails are dominated by a superposition of dilated Gaussians. Formula (2.9) was first used in a similar context by Durcik [Reference Durcik12]. It can be shown easily by investigating the asymptotic behavior of the right-hand side as $|x|\to \infty $ ; see the details in [Reference Durcik12, Section 3] or [Reference Durcik, Kovač, Škreb and Thiele18, Section 3]. A combination of (2.8) and (2.9) will be used to bound a convolution of a Gaussian and an arbitrary probability measure with bounded support by a superposition of (nontranslated) Gaussians; see the computations leading to (3.11) and (3.13) below.

We claim a simple identity:

(2.10)

for any compactly supported $f\in \textrm {L}^2(\mathbb{R}^d)$ and $a,b\in (0,\infty )$ . Indeed, using (2.4), (2.7), and (2.3), respectively, the left-hand side of (2.10) can be rewritten as

Next, for fixed parameters (1.2), for any choice of $\gamma _1,\ldots ,\gamma _n\in (0,\infty )$ , and for a compactly supported real-valued $f\in \textrm {L}^{2^n}((\mathbb{R}^d)^n)$ , we define

(2.11)

Here, we denote

(2.12) $$ \begin{align} \mathcal{F}(\textbf{x}) := \prod_{(r_1,\ldots,r_n)\in\{0,1\}^n} f(x_1^{r_1}, \ldots, x_n^{r_n}), \end{align} $$

so that this is a function of $\textbf {x}=(x_1^0,x_1^1,\ldots ,x_n^0,x_n^1)\in (\mathbb{R}^d)^{2n}$ , and write formally

(2.13) $$ \begin{align} \textrm{d}\textbf{x} := \textrm{d}x_1^0\,\textrm{d}x_1^1 \,\textrm{d}x_2^0\,\textrm{d}x_2^1 \cdots \textrm{d}x_n^0\,\textrm{d}x_n^1. \end{align} $$

We will also denote

(2.14) $$ \begin{align} \mathcal{F}^{(m)}(x) := \prod_{(r_1,\ldots,r_{m-1},r_{m+1}\ldots,r_n)\in\{0,1\}^{n-1}} f(x_1^{r_1}, \ldots, x_{m-1}^{r_{m-1}}, x, x_{m+1}^{r_{m+1}}, \ldots, x_{n}^{r_{n}}), \end{align} $$

for $x\in \mathbb{R}^d$ and $m\in \{1,\ldots ,n\}$ , keeping in mind that $\mathcal {F}^{(m)}(x)$ also depends on other variables than just x.

Formula (2.4) allows us to rewrite

which reveals that $\Theta ^{n,m}_{\gamma _1,\ldots ,\gamma _n}(f)$ is nonnegative and well-defined. This time, we claim the identity

(2.15) $$ \begin{align} \sum_{m=1}^{n} a_m \Theta^{n,m}_{\gamma_1,\ldots,\gamma_n}(f) = 2\pi \|f\|_{\textrm{L}^{2^n}((\mathbb{R}^d)^n)}^{2^n}. \end{align} $$

First, by the product rule for differentiation and the generalized heat equation (2.7), we can write

(2.16)

A consequence of the last display and the fundamental theorem of calculus is

The first limit above equals (allowing a slightly informal usage of $\delta _{\textbf {0}}$ ):

$$ \begin{align*} & \int_{(\mathbb{R}^d)^{2n}} \mathcal{F}(\textbf{x}) \Bigg(\prod_{k=1}^{n}\delta_{\textbf{0}}(x_k^0-x_k^1)\Bigg) \,\textrm{d}\textbf{x} \\ & = \int_{(\mathbb{R}^d)^{n}} f(x_1,\ldots,x_n)^{2^n} \,\textrm{d}x_1 \cdots \textrm{d}x_n = \|f\|_{\textrm{L}^{2^n}((\mathbb{R}^d)^n)}^{2^n}, \end{align*} $$

while the second one is $0$ . This proves (2.15).

Every summand on the left-hand side of (2.15) is nonnegative, so each $\Theta ^{n,m}_{\gamma _1,\ldots ,\gamma _n}(f)$ is bounded by $(2\pi /a_m)\|f\|_{\textrm {L}^{2^n}((\mathbb{R}^d)^n)}^{2^n}$ . Note that this bound is uniform in the parameters $\gamma _1,\ldots ,\gamma _n$ . Identities (2.10) and (2.15) will be used to bound multiscale expressions coming from the study of the error parts in the decomposition (1.4). They can be viewed, respectively, as cheap substitutes for square function estimates and bounds for entangled multilinear singular integrals (mentioned in Section 6.1).

2.3 Conditional expectations

A few concepts from probability theory will be useful in the proofs below. Even though the dyadic setting would be sufficient, the general probabilistic notation makes arguments elegant and concise.

We are working in a fixed probability space $(\Omega ,\mathcal {F},\mathbb{P})$ . The associated expectation is the operator $\mathbb{E}\colon f\mapsto \int _{\Omega } f\,\textrm {d}\mathbb{P}$ defined on $\textrm {L}^1(\Omega ,\mathcal {F},\mathbb{P})$ . Moreover, for any $\sigma $ -algebra $\mathcal {G}\subseteq \mathcal {F}$ , one can construct the operator of conditional expectation with respect to $\mathcal {G}$ as the map

$$ \begin{align*} \textrm{L}^1(\Omega,\mathcal{F},\mathbb{P}) \to \textrm{L}^1(\Omega,\mathcal{G},\mathbb{P}), \quad f\mapsto \mathbb{E}(f|\mathcal{G}) \end{align*} $$

such that

$$ \begin{align*} \int_{A} \mathbb{E}(f|\mathcal{G})\,\textrm{d}\mathbb{P} = \int_{A} f\,\textrm{d}\mathbb{P} \end{align*} $$

for every $f\in \textrm {L}^1(\Omega ,\mathcal {F},\mathbb{P})$ and every $A\in \mathcal {G}$ . The proof of its existence (and uniqueness) can be found in many textbooks on introductory probability theory, such as the one by Durrett [Reference Durrett22]. If $\mathcal {G}$ is generated by a finite partition of $\Omega $ into atoms and if, among these atoms, $A_1,\ldots ,A_N$ have nonzero probability, then we have a simple formula

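(2.17) $$ \begin{align} \mathbb{E}(f|\mathcal{G}) = \sum_{j=1}^{N} \bigg( \frac{1}{\mathbb{P}(A_j)} \int_{A_j} f \,\textrm{d}\mathbb{P} \bigg) \mathbb{1}_{A_j} \quad \textrm{a.s.} \end{align} $$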

In words, on each of the atoms, $\mathbb{E}(f|\mathcal {G})$ is constantly equal to the average value of f over that atom. As a very special case of only one atom, we get

(2.18) $$ \begin{align} \mathbb{E}\big(f\big|\{\emptyset,\Omega\}\big) = \mathbb{E}f \quad \textrm{a.s.} \end{align} $$
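A minimal illustration of this averaging description, on a hypothetical toy space chosen only for concreteness: take $\Omega=[0,1)$ with the Lebesgue measure, let $\mathcal{G}$ be generated by the two atoms $[0,1/2)$ and $[1/2,1)$, and let $f(x)=x$. Averaging f over each atom gives

$$ \begin{align*} \mathbb{E}(f|\mathcal{G})(x) = \begin{cases} 2\int_{0}^{1/2} y \,\textrm{d}y = \frac{1}{4} & \text{for } x\in[0,1/2), \\ 2\int_{1/2}^{1} y \,\textrm{d}y = \frac{3}{4} & \text{for } x\in[1/2,1), \end{cases} \end{align*} $$

and averaging once more over the whole space returns $\frac {1}{2}\cdot \frac {1}{4}+\frac {1}{2}\cdot \frac {3}{4}=\frac {1}{2}=\mathbb{E}f$, in agreement with (2.18).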

The following properties of conditional expectations are standard; see [Reference Durrett22, Section 4.1]. The operator $f\mapsto \mathbb{E}(f|\mathcal {G})$ is linear and monotone. If $f\in \textrm {L}^1(\Omega ,\mathcal {F},\mathbb{P})$ and $g\in \textrm {L}^{\infty }(\Omega ,\mathcal {G},\mathbb{P})$ , $\mathcal {G}\subseteq \mathcal {F}$ , then

(2.19) $$ \begin{align} \mathbb{E}(fg|\mathcal{G}) = \mathbb{E}(f|\mathcal{G}) g \quad \textrm{a.s.} \end{align} $$

Next, for $p\in [1,\infty )$ and $f\in \textrm {L}^p(\Omega ,\mathcal {F},\mathbb{P})$ , we have

(2.20) $$ \begin{align} \big|\mathbb{E}(f|\mathcal{G})\big|^p \leq \mathbb{E}\big(|f|^p\big|\mathcal{G}\big) \quad \textrm{a.s.} \end{align} $$

In particular, the conditional expectation is a (not necessarily strict) contraction in the $\textrm {L}^p$ norms. Finally, if $\mathcal {G}$ and $\mathcal {H}$ are two $\sigma $ -algebras such that $\mathcal {F}\supseteq \mathcal {G}\supseteq \mathcal {H}$ , then

(2.21) $$ \begin{align} \mathbb{E}\big(\mathbb{E}(f|\mathcal{G})\big|\mathcal{H}\big) = \mathbb{E}(f|\mathcal{H}) = \mathbb{E}\big(\mathbb{E}(f|\mathcal{H})\big|\mathcal{G}\big) \quad \textrm{a.s.} \end{align} $$

for any integrable function f.

Now, suppose that we are given a filtration $(\mathcal {G}_m)_{m=0}^{\infty }$ of the probability space $(\Omega ,\mathcal {F},\mathbb{P})$ , i.e., a sequence of $\sigma $ -algebras satisfying $\mathcal {G}_0\subseteq \mathcal {G}_1\subseteq \mathcal {G}_2\subseteq \cdots \subseteq \mathcal {F}$ . For brevity, let us denote the conditional expectation operator $f\mapsto \mathbb{E}(f|\mathcal {G}_m)$ simply by $\mathbb{E}_m$ , for each index m. Conditional expectations with respect to filtrations are widely studied in the literature on martingales; see, for instance, [Reference Durrett22, Chapter 4]. We will need the following slightly nonstandard inequality in Sections 3.1 and 5.1. We claim that for any nonnegative bounded measurable function f and for nonnegative integers $m,m_1,m_2,\ldots ,m_n$ satisfying $m\leq \min \{m_1,\ldots ,m_n\}$ , we have

(2.22) $$ \begin{align} \mathbb{E}_m \big( f (\mathbb{E}_{m_1}f) (\mathbb{E}_{m_2}f) \cdots (\mathbb{E}_{m_n}f) \big) \geq (\mathbb{E}_m f)^{n+1} \quad \textrm{a.s.} \end{align} $$

For the proof of (2.22), we can assume, without loss of generality, that $m\leq m_1\leq \cdots \leq m_n$ . We use induction on $k=0,1,\ldots ,n-1$ to show

(2.23) $$ \begin{align} \mathbb{E}_m \bigg( \bigg( \prod_{i=1}^{k}\mathbb{E}_{m_i}f \bigg) (\mathbb{E}_{m_{k+1}}f)^{n+1-k} \bigg) \geq (\mathbb{E}_m f)^{n+1} \quad \textrm{a.s.} \end{align} $$

The case $k=n-1$ of (2.23) is precisely (2.22), because we can use (2.21) and (2.19), respectively, to equate their left-hand sides:

$$ \begin{align*} \mathbb{E}_m \bigg( f \bigg( \prod_{i=1}^{n}\mathbb{E}_{m_i}f \bigg) \bigg) = \mathbb{E}_m \mathbb{E}_{m_n} \bigg( f \bigg( \prod_{i=1}^{n}\mathbb{E}_{m_i}f \bigg) \bigg) = \mathbb{E}_m \bigg( \bigg( \prod_{i=1}^{n-1}\mathbb{E}_{m_i}f \bigg) (\mathbb{E}_{m_n}f)^2 \bigg) \quad \textrm{a.s.} \end{align*} $$

The induction basis $k=0$ of (2.23) is just a consequence of (2.20) and (2.21):

$$ \begin{align*} \mathbb{E}_m \big( (\mathbb{E}_{m_1}f)^{n+1} \big) \geq (\mathbb{E}_m \mathbb{E}_{m_1}f)^{n+1} = (\mathbb{E}_m f)^{n+1} \quad \textrm{a.s.} \end{align*} $$

For the induction step, we only need to rewrite and estimate the left-hand side of (2.23) as

$$ \begin{align*} & \mathbb{E}_m \mathbb{E}_{m_{k}} \bigg( \bigg( \prod_{i=1}^{k}\mathbb{E}_{m_i}f \bigg) (\mathbb{E}_{m_{k+1}}f)^{n+1-k} \bigg) = \mathbb{E}_m \bigg( \bigg( \prod_{i=1}^{k}\mathbb{E}_{m_i}f \bigg) \,\mathbb{E}_{m_{k}} \big( (\mathbb{E}_{m_{k+1}}f)^{n+1-k} \big) \bigg) \\ & \geq \mathbb{E}_m \bigg( \bigg( \prod_{i=1}^{k}\mathbb{E}_{m_i}f \bigg) (\mathbb{E}_{m_{k}}\mathbb{E}_{m_{k+1}}f)^{n+1-k} \bigg) = \mathbb{E}_m \bigg( \bigg( \prod_{i=1}^{k-1}\mathbb{E}_{m_i}f \bigg) (\mathbb{E}_{m_{k}}f)^{n+2-k} \bigg) \quad \textrm{a.s.}, \end{align*} $$

where we used properties (2.21), (2.19), (2.20), and (2.21) again, in that order. This completes the inductive proof of (2.23) and thus also confirms (2.22).

An immediate consequence of (2.22) combined with (2.18), (2.21), and (2.20) is a scalar inequality

(2.24) $$ \begin{align} \mathbb{E} \big( f (\mathbb{E}_{m_1}f) (\mathbb{E}_{m_2}f) \cdots (\mathbb{E}_{m_n}f) \big) \geq (\mathbb{E} f)^{n+1} \end{align} $$

for any nonnegative bounded measurable f and nonnegative integers $m_1,m_2,\ldots ,m_n$ . On the other hand, an easy generalization of (2.22) is

(2.25) $$ \begin{align} \mathbb{E}_m \left(f \prod_{i=1}^{n}\mathbb{E}_{m_i}f\right) \geq \left(\overbrace{\prod_{\substack{i\\m_i<m}} \mathbb{E}_{m_i}f}^{N\text{ factors}}\right)(\mathbb{E}_m f)^{n+1-N} \quad \textrm{a.s.},\end{align}$$

where m is now a completely arbitrary nonnegative integer and N is the number of indices i satisfying $m_i<m$ . One only needs to use (2.19) to factor out N terms from the left-hand side of (2.25) and then apply (2.22) to the remaining $n+1-N$ terms. Inequality (2.25) will be used to resolve nested conditional expectations when bounding them from below.
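For orientation, the case $n=1$ of (2.24) can be verified directly. The following sketch uses only (2.18), (2.21), (2.19), and (2.20) with $p=2$, for a nonnegative bounded measurable f and any nonnegative integer $m_1$:

$$ \begin{align*} \mathbb{E}\big( f \,\mathbb{E}_{m_1}f \big) = \mathbb{E}\big( \mathbb{E}_{m_1} ( f \,\mathbb{E}_{m_1}f ) \big) = \mathbb{E}\big( (\mathbb{E}_{m_1}f)^2 \big) \geq \big( \mathbb{E}\,\mathbb{E}_{m_1}f \big)^2 = (\mathbb{E}f)^2. \end{align*} $$

The general case follows the same pattern, peeling off one conditional expectation at a time, starting from the finest $\sigma $ -algebra.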

3 Anisotropic simplices: proof of Theorem 1

Our approach is closest in spirit to Bourgain’s original proof from [Reference Bourgain4]. One superficial difference is that we are using the heat kernel, where Bourgain used the Poisson kernel. Other notable differences are the treatment of “mismatched” scales in the structured part using (3.5) below and the way in which nonlinear scales are treated in the error part. Even though Lyall and Magyar gave several very slick alternative proofs for the case of isotropic “linear” dilates of a nondegenerate simplex [Reference Lyall and Magyar45–Reference Lyall and Magyar47] (also see [Reference Huckaba, Lyall and Magyar35]), we still find Bourgain’s proof the easiest one to adapt to our general scheme.

Recall that we were given positive numbers (1.2) and unit vectors (1.3). However, here, we embed $\mathbb{R}^n\cong \mathbb{R}^n\times \{0\}\subset \mathbb{R}^{n+1}$ , so that the $u_k$ are now viewed as unit vectors in $\mathbb{R}^{n+1}$ . Let $\mu $ be the normalized Haar measure on the special orthogonal group $\textrm {SO}(n+1,\mathbb{R})$ . For a compactly supported measurable function $f\colon \mathbb{R}^{n+1}\to [0,1]$ , the counting form is defined as

$$ \begin{align*} \mathcal{N}^{0}_{\lambda}(f) := \int_{\mathbb{R}^{n+1}} \int_{\textrm{SO}(n+1,\mathbb{R})} f(x) \bigg( \prod_{k=1}^{n} f(x + \lambda^{a_k} b_k U u_k) \bigg) \,\textrm{d}\mu(U) \,\textrm{d}x, \end{align*} $$

while its smoothened variant is

for $\lambda>0$ and $0<\varepsilon \leq 1$ . Denote

(3.1) $$ \begin{align} a:=a_1+a_2+\cdots+a_n,\quad c:=\min\{a_1,a_2,\ldots,a_n\}. \end{align} $$

As explained in Section 1.3, it is sufficient to show that for any choice of numbers $\delta ,\varepsilon \in (0,1]$ , positive integer J, scales $0<\lambda _1<\cdots <\lambda _J$ satisfying $\lambda _{j+1}\geq 2\lambda _j$ for each index j, yet another scale $\lambda \in (0,\lambda _J]$ , a sufficiently large number $R>0$ (depending on J and the scales $\lambda _j$ ), and a measurable set $B\subseteq [0,R]^{n+1}$ with measure $|B|\geq \delta R^{n+1}$ , we have

(3.2)
(3.3)
(3.4)

Saying that R is sufficiently large means, for instance, assuming that $R\geq 2\lambda _J^{a_k}b_k$ for each $k\in \{1,\ldots ,n\}$ .

Once we have (3.2)–(3.4), the actual argument establishing Theorem 1 takes $\delta $ to be a fixed positive number smaller than $\overline {\delta }(A)$ . Afterward, we choose $\varepsilon $ small enough such that

$$ \begin{align*} \varepsilon^{c/2} \leq \vartheta_1 \delta^{n+1} \end{align*} $$

and then J large enough, so that

$$ \begin{align*} \varepsilon^{-(n+2)a} \log(1/\varepsilon) J^{-1/2} \leq \vartheta_2 \delta^{n+1}. \end{align*} $$

Here, $\vartheta _1,\vartheta _2>0$ are sufficiently small constants, depending only on the implicit constants from (3.2)–(3.4). Next, we take an unbounded sequence of scales $(\lambda _j)_{j=1}^{\infty }$ such that the set A does not contain the desired configuration (which is here an anisotropic simplex) of size $\lambda _j$ for any index j. As a consequence, $\mathcal {N}^{0}_{\lambda _j}(\mathbb{1}_B)$ vanishes for each j and for every bounded measurable subset B of a translate of A. We sparsify the sequence to achieve $\lambda _{j+1}\geq 2\lambda _j$ . Take $x\in \mathbb{R}^{n+1}$ and a sufficiently large R for which the set $B := (A-x)\cap [0,R]^{n+1}$ has measure at least $\delta R^{n+1}$ . By pigeonholing over the summands in (3.3), we can find an index $j\in \{1,2,\ldots ,J\}$ such that $\mathcal {N}^{0}_{\lambda _j}(\mathbb{1}_B)$ is at least a positive constant multiple of $\delta ^{n+1} R^{n+1}$ . That way, we arrive at a contradiction with the vanishing of $\mathcal {N}^{0}_{\lambda _j}(\mathbb{1}_B)$ .

3.1 The structured part: proof of (3.2)

For $k=2,\ldots ,n$ , let us write the orthogonal projection of $u_k$ onto $\mathop {\textrm {span}}(\{u_1,\ldots ,u_{k-1}\})$ explicitly as

$$ \begin{align*} \beta_{k,1} u_1 + \beta_{k,2} u_2 + \cdots + \beta_{k,k-1} u_{k-1} \end{align*} $$

for some scalars $\beta _{k,1},\beta _{k,2},\ldots ,\beta _{k,k-1}\in \mathbb{R}$ . Moreover, denote

$$ \begin{align*} d_k := \mathop{\textrm{dist}}\big(u_k, \mathop{\textrm{span}}(\{u_1,\ldots,u_{k-1}\})\big)>0. \end{align*} $$

By the symmetry present in $\mathcal {N}^{\varepsilon }_{\lambda }$ when integrating over all systems $(y_1,\ldots ,y_n)=(\lambda ^{a_1} b_1 U u_1,\ldots ,\lambda ^{a_n} b_n U u_n)$ , we can rewrite the smoother counting form as in [Reference Bourgain4]:

Here, $\sigma $ denotes the spherical measure supported on $\mathbb{S}^n\subset \mathbb{R}^{n+1}$ , while, for $k=2,\ldots ,n$ , we write $\sigma ^{y_1,\ldots ,y_{k-1}}$ for the $(n-k+1)$ -dimensional spherical measure supported on the sphere that is centered at

$$ \begin{align*} \beta_{k,1} \frac{y_1}{|y_1|} + \beta_{k,2} \frac{y_2}{|y_2|} + \cdots + \beta_{k,k-1} \frac{y_{k-1}}{|y_{k-1}|}, \end{align*} $$

has radius $d_k$ , and belongs to an $(n-k+2)$ -dimensional plane in $\mathbb{R}^{n+1}$ orthogonal to $\mathop {\textrm {span}}(\{y_1,\ldots ,y_{k-1}\})$ . We normalize these measures in a way that each of them has its total mass equal to $1$ and we view them as being defined on all Borel subsets of $\mathbb{R}^{n+1}$ , even though their supports are lower-dimensional sets contained in the standard unit sphere $\mathbb{S}^n$ . In addition, note that the $n+1$ integrals over $\mathbb{R}^{n+1}$ are nested and the integration is performed in the order from the innermost one to the outermost one.

Here, we concentrate on the smoothest case $\varepsilon =1$ . It is easy to observe that for any probability measure $\nu $ supported inside the standard unit sphere $\mathbb{S}^n\subset \mathbb{R}^{n+1}$ , we can bound pointwise

where

The integral in $y_n$ in $\mathcal {N}^{1}_{\lambda }(f)$ can be recognized as a triple convolution,

Now, we do the same to the integral in $y_{n-1}$ , etc. Repeating this process n times, we end up with

$$ \begin{align*} \mathcal{N}^{1}_{\lambda}(f) \gtrsim \int_{\mathbb{R}^{n+1}} f(x) \left( \prod_{k=1}^{n} (f\ast\varphi_{\lambda^{a_k}b_k})(x) \right) \,\textrm{d}x. \end{align*} $$

Once we know that

(3.5)

holds for every $t_1,t_2,\ldots ,t_n\in (0,R/2]$ and every measurable function $f\colon [0,R]^{d}\to [0,1]$ , then (3.2) will follow simply by choosing $t_k=\lambda ^{a_k}b_k$ , for $k=1,\ldots ,n$ , and taking $d=n+1$ and $f=\mathbb{1}_B$ , the indicator function of the set B. However, (3.5) is just a straightforward generalization of [Reference Durcik, Guo and Roos14, Lemma 2.1] by Durcik, Guo, and Roos, which, in turn, is an elaboration of Bourgain’s [Reference Bourgain3, Lemma 6] (stated there without proof). The paper [Reference Durcik, Guo and Roos14] established the special case $d=1$ , $n=2$ , $R=1$ of (3.5) elegantly, by dominating convolutions $f\ast \varphi _{t_k}$ pointwise from below by a dyadic martingale. We are about to reuse this idea to give a quick proof of (3.5) in general. Even if this generalization is quite clear and expected, we still choose to be sufficiently detailed, because we will have to argue similarly in an even greater generality in Section 5.1.

Let us turn $[0,R]^d$ into a probability space by taking $\mathbb{P}$ to be the Lebesgue measure normalized by the factor $R^{-d}$ . The dyadic filtration $(\mathcal {G}_m)_{m=0}^{\infty }$ of $[0,R]^d$ is obtained by defining $\mathcal {G}_m$ to be the finite $\sigma $ -algebra (i.e., a finite algebra of sets) generated by $2^{dm}$ congruent cubes of side length $2^{-m}R$ that partition $[0,R]^d$ . Let $m_k$ be the smallest nonnegative integer such that $2^{-m_k}R d^{1/2} < t_k$ . By this choice, a cube of side length $2^{-m_k}R$ has diameter smaller than $t_k$ , so, if it contains a point x, then it remains fully inside the ball $B(x,t_k)$ . Because of this and equation (2.17), we clearly have

$$ \begin{align*} f\ast\varphi_{t_k} \gtrsim \mathbb{E}(f|\mathcal{G}_{m_k}) = \mathbb{E}_{m_k}f \quad \textrm{a.e.} \end{align*} $$

Now, (3.5) becomes a consequence of the probabilistic inequality (2.24).
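The pointwise lower bound of $f\ast \varphi _{t_k}$ by $\mathbb{E}_{m_k}f$ used above also needs the cubes to be not much smaller than the ball. Here is a sketch of that step, under the assumption (ours, for this illustration) that $\varphi $ is bounded from below by a positive constant on the unit ball. Since $t_k\leq R/2<Rd^{1/2}$, we have $m_k\geq 1$, and the minimality of $m_k$ gives $2^{-(m_k-1)}Rd^{1/2}\geq t_k$, so

$$ \begin{align*} \frac{t_k}{2d^{1/2}} \leq 2^{-m_k}R < \frac{t_k}{d^{1/2}}. \end{align*} $$

Consequently, if Q is the cube of $\mathcal {G}_{m_k}$ containing x, then $Q\subseteq B(x,t_k)$ and $|Q|\geq (2d^{1/2})^{-d}t_k^d$, which yields, for nonnegative f,

$$ \begin{align*} (f\ast\varphi_{t_k})(x) \gtrsim t_k^{-d} \int_{B(x,t_k)} f(y) \,\textrm{d}y \geq t_k^{-d} \int_{Q} f(y) \,\textrm{d}y \gtrsim \frac{1}{|Q|} \int_{Q} f(y) \,\textrm{d}y = (\mathbb{E}_{m_k}f)(x). \end{align*} $$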

3.2 The error part: proof of (3.3)

We will keep writing f interchangeably with the indicator function $\mathbb{1}_B$ , where B is as before. Using the product rule for differentiation and applying the generalized heat equation (2.7), we get

Thus, by the fundamental theorem of calculus, for any $0<\alpha <\beta $ , the difference $\mathcal {N}^{\alpha }_{\lambda }(f)-\mathcal {N}^{\beta }_{\lambda }(f)$ can be written as

$$ \begin{align*} \sum_{m=1}^{n}\mathcal{L}_{\lambda}^{\alpha,\beta,m}(f), \end{align*} $$

where

(3.6)

By symmetry, it is sufficient to consider the case $m=n$ , and the same reasoning as in the previous subsection gives

(3.7)

The following arguments will not depend on the dimension of the ambient space $\mathbb{R}^{n+1}$ . We will emphasize this fact by writing it as $\mathbb{R}^d$ and remembering that $d={n+1\geq 2}$ . This will also be convenient for recycling the same computation in Section 5. Here, we need to control

$$ \begin{align*} \sum_{j=1}^{J}|\mathcal{L}_{\lambda_j}^{\varepsilon,1,n}(f)|. \end{align*} $$

Set

(3.8) $$ \begin{align} \theta:=10^{-1/a_n}e^{-1}. \end{align} $$

Let us begin the estimation by multiplying the inner integral of $\mathcal {L}_{\lambda _j}^{\varepsilon ,1,n}(f)$ with

(3.9) $$ \begin{align} \int_{\theta t\lambda_j}^{e\theta t\lambda_j} \,\frac{\textrm{d}s}{s} = 1 \end{align} $$

and rewriting the whole expression as

For

(3.10) $$ \begin{align} j\in\{1,\ldots,J\},\quad \varepsilon\leq t\leq 1,\quad \theta t\lambda_j\leq s\leq e\theta t\lambda_j, \end{align} $$

denote

$$ \begin{align*} r = r(j,s,t) := \big((t\lambda_j)^{2a_n}-s^{2a_n}\big)^{1/2a_n}, \end{align*} $$

remembering that r depends on j, s, and t, and observing that $s\sim t\lambda _j \sim r$ . Convolution identity (2.4) gives

so that, introducing the integration variable y,

Take $\nu $ to be an arbitrary probability measure supported on a subset of the standard unit sphere in $\mathbb{R}^{d}$ , even though the following reasoning will only be applied with $\nu =\sigma ^{y_1,\ldots ,y_{n-1}}$ in the current section. Expanding the convolution according to its definition and using (2.8), (2.9) enables us to estimate

so, dilating by $s^{a_n}b_n$ , we also get

(3.11)

for $j,t,s$ as in (3.10) and for $x\in \mathbb{R}^d$ . Consequently,

In the next step, we realize that $y_{n-1}$ only appears in one of the factors of the integrand and as one of the integration variables in the last expression. For that reason, we can rewrite the integral in $y_{n-1}$ as a convolution and get

(3.12)

Now, we observe that, by (2.8) and (2.9),

for a general $\nu $ as before and for any $k\in \{1,\ldots ,n-1\}$ . Rescaling this by $s^{a_k}b_k$ yields

(3.13)

Convolutions with dilates of the standard Gaussian $\mathbb{g}$ can be controlled pointwise by the Hardy–Littlewood maximal function $\textrm {M}$ (see [Reference Stein and Weiss54, Formula (3.9)]), so (3.13) implies

(3.14)

We use (3.14) with $k=n-1$ and $\nu =\sigma ^{y_1,\ldots ,y_{n-2}}_{-1}$ to further bound the expression (3.12). Repeating the previous step $n-2$ times more, i.e., for $k=n-2,\ldots ,2,1$ , we end up with

where a is the sum of the numbers $a_k$ , as in (3.1). Observing $f\leq \textrm {M}f$ and using the Cauchy–Schwarz inequality, we get

Now, we sum in j and observe that, for each fixed $t\in [\varepsilon ,1]$ , the intervals $[\theta t\lambda _j,e\theta t\lambda _j]$ , $j=1,\ldots ,J$ , cover any fixed point from $(0,\infty )$ at most two times. By one last application of the Cauchy–Schwarz inequality (this time for the sum in j) followed by boundedness of $\textrm {M}$ on $\textrm {L}^{2n}(\mathbb{R}^d)$ , we conclude

That way, identity (2.10) and trivial estimates $\|f\|_{\textrm {L}^{2n}}\leq R^{d/2n}$ , $\|f\|_{\textrm {L}^{2}}\leq R^{d/2}$ , together with $d=n+1$ , finish the proof of (3.3).
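Two elementary facts used in this subsection can be checked directly from the choice (3.8) of $\theta $ and the lacunarity $\lambda _{j+1}\geq 2\lambda _j$ ; the following is only a sketch of these verifications. First, for $j,s,t$ as in (3.10), we have $s\leq e\theta t\lambda _j=10^{-1/a_n}t\lambda _j$, so $s^{2a_n}\leq 10^{-2}(t\lambda _j)^{2a_n}$ and

$$ \begin{align*} \tfrac{99}{100} (t\lambda_j)^{2a_n} \leq r^{2a_n} = (t\lambda_j)^{2a_n} - s^{2a_n} \leq (t\lambda_j)^{2a_n}, \quad (s^{a_n}b_n)^2 + (r^{a_n}b_n)^2 = \big((t\lambda_j)^{a_n}b_n\big)^2. \end{align*} $$

This confirms $s\sim t\lambda _j\sim r$ and shows that the scales $s^{a_n}b_n$ and $r^{a_n}b_n$ recombine to $(t\lambda _j)^{a_n}b_n$. Second, if a point $p\in (0,\infty )$ belongs to the intervals with indices $j<j'$, then

$$ \begin{align*} \theta t\lambda_{j'} \leq p \leq e\theta t\lambda_j \quad \Longrightarrow \quad 2^{j'-j}\lambda_j \leq \lambda_{j'} \leq e\lambda_j \quad \Longrightarrow \quad j'-j\leq 1, \end{align*} $$

because $e<4$. Hence, no point is covered by three of the intervals.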

3.3 The uniform part: proof of (3.4)

Take $0<\vartheta <\varepsilon $ and a measurable function $f\colon [0,R]^{n+1}\to [0,1]$ . From the previous subsection, we know that $\mathcal {N}^{\vartheta }_{\lambda }(f)-\mathcal {N}^{\varepsilon }_{\lambda }(f)$ is the sum of $\mathcal {L}_{\lambda }^{\vartheta ,\varepsilon ,m}(f)$ over $m=1,\ldots ,n$ . Once again, by symmetry, it is sufficient to consider the case $m=n$ , when we have the representation (3.7). Using $0\leq f\leq 1$ and the Cauchy–Schwarz inequality in the variable x, we get

Another application of the Cauchy–Schwarz inequality, this time in the variables $y_1,\ldots ,y_{n-1}$ , followed by Plancherel’s identity, leads to

Observe that $\sigma ^{y_1,\ldots ,y_{n-1}}$ is just the circle measure inside a two-dimensional plane orthogonal to $\mathop {\textrm {span}}(\{y_1,\ldots ,y_{n-1}\})$ as long as $y_1,\ldots ,y_{n-1}$ are linearly independent, which happens almost surely. Using the decay estimate (2.2), we get

$$ \begin{align*} \big|\widehat{\sigma}^{y_1,\ldots,y_{n-1}}(\zeta)\big| \lesssim \mathop{\textrm{dist}}\big(\zeta,\mathop{\textrm{span}}(\{y_1,\ldots,y_{n-1}\})\big)^{-1/2}, \end{align*} $$

for every $\zeta \in \mathbb{R}^{n+1}$ . By rescaling $y_1,\ldots ,y_{n-1}$ , this, in turn, gives

(3.15)

where

$$ \begin{align*} \mathcal{I}(\zeta) := \int_{(\mathbb{R}^{n+1})^{n-1}} & \mathop{\textrm{dist}}\big(\zeta,\mathop{\textrm{span}}(\{y_1,\ldots,y_{n-1}\})\big)^{-1} \\ & \textrm{d}\sigma^{y_1,\ldots,y_{n-2}}(y_{n-1}) \cdots \textrm{d}\sigma^{y_1}(y_2) \,\textrm{d}\sigma(y_1). \end{align*} $$

Integrating over all rotations $U\in \textrm {SO}(n+1,\mathbb{R})$ and taking $(y_1,\ldots ,y_{n-1})=(U u_1,\ldots ,U u_{n-1})$ , we want to conclude

(3.16) $$ \begin{align} \mathcal{I}(\zeta) \lesssim |\zeta|^{-1}. \end{align} $$

We could proceed as in [Reference Bourgain4] or [Reference Lyall and Magyar45], but a more elementary argument is also available. Instead of integrating over U, we can rather take $y_1,\ldots ,y_{n-1}$ to be fixed unit vectors that span the coordinate plane $\mathbb{R}^{n-1}\times \{(0,0)\}$ and integrate over all possible directions of $\zeta $ determined by the standard unit sphere $\mathbb{S}^{n}$ in $\mathbb{R}^{n+1}$ . This will yield the same result for $\mathcal {I}(\zeta )$ up to a constant. Writing $\zeta $ in the $(n+1)$ -dimensional spherical coordinates,

$$ \begin{align*} \zeta = |\zeta| (\cos\phi_1, \sin\phi_1\cos\phi_2, \ldots, \sin\phi_1\cdots\sin\phi_{n-1}\cos\phi_n, \sin\phi_1\cdots\sin\phi_n), \end{align*} $$

$\phi _1,\ldots ,\phi _{n-1}\in [0,\pi ]$ , $\phi _n\in [0,2\pi )$ , and observing

$$ \begin{align*} \mathop{\textrm{dist}}\big(\zeta,\mathop{\textrm{span}}(\{y_1,\ldots,y_{n-1}\})\big) = |\zeta| \sin\phi_1 \cdots \sin\phi_{n-1}, \end{align*} $$

we can estimate $\mathcal {I}(\zeta )$ by a constant multiple of

$$ \begin{align*} |\zeta|^{-1} \int_{[0,\pi]^{n-1}\times[0,2\pi)} \frac{\sin^{n-1}\phi_1\sin^{n-2}\phi_2 \cdots \sin\phi_{n-1} \,\textrm{d}\phi_1\,\textrm{d}\phi_2\cdots\textrm{d}\phi_n}{\sin\phi_1\sin\phi_2 \cdots \sin\phi_{n-1}} \lesssim |\zeta|^{-1}. \end{align*} $$
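To see why the last integral is finite, note that the singularity of the reciprocal distance is exactly cancelled by the Jacobian of the spherical coordinates. Explicitly, for $n\geq 2$, the integrand simplifies to

$$ \begin{align*} \frac{\prod_{i=1}^{n-1} \sin^{n-i}\phi_i}{\prod_{i=1}^{n-1} \sin\phi_i} = \prod_{i=1}^{n-2} \sin^{n-1-i}\phi_i \in [0,1], \end{align*} $$

so the integral over $[0,\pi ]^{n-1}\times [0,2\pi )$ is at most $2\pi ^n$.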

This establishes (3.16) and thus also gives

(3.17)

for $\zeta \in \mathbb{R}^{n+1}$ and $t>0$ . Substituting $\zeta =\lambda ^{a_n}b_n\xi $ , plugging (3.17) into (3.15), and using Plancherel’s theorem again, we finally get

$$ \begin{align*} \big|\mathcal{L}_{\lambda}^{\vartheta,\varepsilon,n}(f)\big| \lesssim \|f\|_{\textrm{L}^2(\mathbb{R}^{n+1})}^2 \int_{\vartheta}^{\varepsilon} t^{a_n/2} \,\frac{\textrm{d}t}{t} \lesssim R^{n+1}\varepsilon^{a_n/2}. \end{align*} $$

This proves (3.4).
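The last integral in t can be evaluated explicitly. As a short check, using $0<\vartheta <\varepsilon \leq 1$ and $a_n\geq c$ with c as in (3.1),

$$ \begin{align*} \int_{\vartheta}^{\varepsilon} t^{a_n/2} \,\frac{\textrm{d}t}{t} = \frac{2}{a_n} \big( \varepsilon^{a_n/2} - \vartheta^{a_n/2} \big) \leq \frac{2}{a_n} \,\varepsilon^{a_n/2} \leq \frac{2}{a_n} \,\varepsilon^{c/2}. \end{align*} $$

The same computation with $a_m$ in place of $a_n$ handles the remaining terms, and the bound is uniform in $\vartheta $, which explains the exponent $c/2$ appearing in the choice of $\varepsilon $ made at the beginning of this section.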

The arguments in this subsection reveal that the nature of the dilations is irrelevant for this part of the proof, which is why we were able to proceed similarly to Bourgain [Reference Bourgain4] or Lyall and Magyar [Reference Lyall and Magyar45, Reference Lyall and Magyar47].

4 Anisotropic rectangular boxes: proof of Theorem 2

In this section, $\sigma $ will denote exclusively the circular measure in $\mathbb{R}^2$ . Many elements of the proof will be similar to the corresponding ingredients in [Reference Durcik and Kovač16, Section 7]. Still, a few modifications are needed.

For a compactly supported measurable function $f\colon (\mathbb{R}^2)^n\to [0,1]$ and for $\lambda>0$ and $0<\varepsilon \leq 1$ , this time, we define

$$ \begin{align*} \mathcal{N}^{0}_{\lambda}(f) := \int_{(\mathbb{R}^2)^{2n}} \left( \prod_{(r_1,\ldots,r_n)\in\{0,1\}^n} \!\!\! f(x_1 + r_1 y_1, \ldots, x_n + r_n y_n) \right) \left( \prod_{k=1}^{n}\textrm{d}x_k\,\textrm{d}\sigma_{\lambda^{a_k}b_k}(y_k) \right) \end{align*} $$

and

where we recall the notation (2.12)–(2.14), specialized to $d=2$ .

Theorem 2 will follow once we establish:

(4.1)
(4.2)
(4.3)

Here, a, c, $\delta $ , $\varepsilon $ , J, $\lambda _j$ , $\lambda $ , and R are just as in Section 3, while $B\subseteq ([0,R]^2)^n$ is a measurable set satisfying $|B|\geq \delta R^{2n}$ . We set $f=\mathbb{1}_B$ and continue using the notation (2.12)–(2.14).

4.1 The structured part: proof of (4.1)

We partition a major part of the cube $([0,R]^2)^n$ into the collection of rectangular boxes $Q_1 \times \cdots \times Q_n$ , where each $Q_k$ is a square of the form $[l \lambda ^{a_k}b_k,{(l+1)} \lambda ^{a_k}b_k)\times [l' \lambda ^{a_k}b_k,(l'+1) \lambda ^{a_k}b_k)$ for some integers $0\leq l,l'\leq \lfloor \lambda ^{-a_k}b_k^{-1}R\rfloor -1$ . Each square $Q_k$ has measure $(\lambda ^{a_k}b_k)^2$ , so each box has measure $\prod _{k=1}^{n}(\lambda ^{a_k}b_k)^2$ , and the total number of boxes is clearly comparable to $R^{2n}\prod _{k=1}^{n}\lambda ^{-2a_k}$ .
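The count of boxes can be made explicit. As a sketch, write $u_k:=\lambda ^{-a_k}b_k^{-1}R$ (a shorthand introduced only for this remark) and assume, as in Section 3, that R is large enough to guarantee $u_k\geq 2$ for each k. Then, $u_k/2\leq \lfloor u_k\rfloor \leq u_k$, and the number of boxes equals

$$ \begin{align*} \prod_{k=1}^{n} \lfloor u_k\rfloor^2 \in \bigg[ 4^{-n} \prod_{k=1}^{n} u_k^2, \ \prod_{k=1}^{n} u_k^2 \bigg], \quad \text{where} \quad \prod_{k=1}^{n} u_k^2 = R^{2n} \prod_{k=1}^{n} \lambda^{-2a_k} b_k^{-2}. \end{align*} $$

In particular, the boxes cover a subset of $([0,R]^2)^n$ of measure at least $4^{-n}R^{2n}$, which justifies calling it a major part of the cube.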

When estimating $\mathcal {N}^{1}_{\lambda }(f)$ from below, we restrict the domain of integration with additional constraints requiring that $x_k^0$ , $x_k^1$ lie in the same square $Q_k$ mentioned above, for each $k=1,\ldots ,n$ . Using the box–Gowers–Cauchy–Schwarz inequality [Reference Green and Tao30, Reference Shkredov50, Reference Tao57], or simply by several applications of the ordinary Cauchy–Schwarz inequality, we obtain

Applying this to each of the choices of $Q_1\times \cdots \times Q_n$ , using

and recalling $|Q_k|\sim \lambda ^{2a_k}$ , we get

so discrete Jensen’s inequality for the power function $t\mapsto t^{2^n}$ gives (4.1).

4.2 The error part: proof of (4.2)

Using the product rule and the generalized heat equation (2.7), we obtain

for $\lambda>0$ , and $y_1,\ldots ,y_n\in \mathbb{R}^2$ . By the fundamental theorem of calculus, the difference $\mathcal {N}^{\alpha }_{\lambda }(f)-\mathcal {N}^{\beta }_{\lambda }(f)$ is, for any $0<\alpha <\beta $ , equal to the sum of n terms given by

for $m=1,\ldots ,n$ . By symmetry, it is sufficient to fix $m=n$ and prove an upper bound for

(4.4) $$ \begin{align} \sum_{j=1}^{J}|\mathcal{L}_{\lambda_j}^{\varepsilon,1,n}(f)|. \end{align} $$

Take $\theta $ as in (3.8) and use (3.9), transforming $\mathcal {L}_{\lambda _j}^{\varepsilon ,1,n}(f)$ into

For $j,s,t$ as in (3.10), this time, we denote

$$ \begin{align*} r = r(j,s,t) := \big((t\lambda_j)^{2a_n}-2s^{2a_n}\big)^{1/2a_n} \end{align*} $$

and observe, once again, that $s\sim t\lambda _j \sim r$ . From (2.4) and (2.5), we see

so (4.4) is at most a constant times

The same computation leading to (3.13) still applies, so we can use this formula again, this time with $d=2$ and $\nu =\sigma $ :

(4.5)

for $k=1,\ldots ,n-1$ . Very similarly, imitating the proof of (3.11), we get

(4.6)

Estimates (4.5) and (4.6) bound (4.4) by a constant multiple of

Using the Cauchy–Schwarz inequality in all variables other than $x_n^0$ and $x_n^1$ , observing that the two obtained terms are equal, expanding out the square, and collapsing back the convolution using identity (2.4), we see that this expression is at most

(4.7)

For a fixed t and varying j, the number of times the intervals $[\theta t\lambda _j,e\theta t\lambda _j]$ cover any fixed point from $(0,\infty )$ is at most a constant depending only on $a_n$ . Using this observation in connection with (4.7) and recognizing the inner expression as (2.11), we finally obtain

$$ \begin{align*} \sum_{j=1}^{J}|\mathcal{L}_{\lambda_j}^{\varepsilon,1,n}(f)| \lesssim \varepsilon^{-3a} \log(1/\varepsilon) \int_{[1,\infty)^n} \Theta^{n,n}_{\gamma_1,\ldots,\gamma_{n-1},2^{1/2}}(f) \,\frac{\textrm{d}\gamma_1}{\gamma_1^2} \cdots \frac{\textrm{d}\gamma_n}{\gamma_n^2}. \end{align*} $$

Thus, the bound for $\Theta ^{n,n}_{\gamma _1,\ldots ,\gamma _{n-1},2^{1/2}}(f)$ coming from identity (2.15) completes the proof of (4.2).

4.3 The uniform part: proof of (4.3)

Take $0<\vartheta <\varepsilon \leq 1$ . In order to control $\mathcal {N}^{\vartheta }_{\lambda }(f)-\mathcal {N}^{\varepsilon }_{\lambda }(f)$ , we need to bound $|\mathcal {L}_{\lambda }^{\vartheta ,\varepsilon ,m}(f)|$ for $m=1,\ldots ,n$ .

Once we fix m and a number t such that $\vartheta \leq t\leq \varepsilon $ , Plancherel’s theorem yields

(4.8)

We use the decay estimate (2.1) for $\widehat {\sigma }$ to conclude

(4.9)

for each $\zeta \in \mathbb{R}^2$ . Then, we simply take $\zeta =\lambda ^{a_m}b_m\xi $ in (4.9) and combine it with (4.8). By another application of Plancherel’s theorem, we see that (4.8) is at most a constant times

$$ \begin{align*} t^{a_m/2} \|\mathcal{F}^{(m)}\|_{\textrm{L}^2(\mathbb{R}^2)}^2 \leq t^{a_m/2} R^2. \end{align*} $$

Recall that

. Integrating in all of the remaining variables, we get

and thus also

It remains to send $\vartheta \to 0^+$ .

5 Anisotropic trees: proof of Theorem 3

After all of the material presented in Sections 3 and 4, the proof of Theorem 3 will not require many new ideas. We merely need to pick and reapply a few elements of the proofs of Theorems 1 and 2, so we will be somewhat brief.

Once again, let $\sigma $ be the circular measure supported on $\mathbb{S}^1\subseteq \mathbb{R}^2$ . Take a compactly supported measurable function $f\colon \mathbb{R}^2\to [0,1]$ and parameters $\lambda \in (0,\infty )$ , $\varepsilon \in (0,1]$ . Let us jump straight to the definition of the smoother counting form:

where we write $u(k),v(k)$ for the two vertices that are joined by the edge $k\in E$ . The actual counting form $\mathcal {N}^{0}_{\lambda }$ , obtained in the limit as $\varepsilon \to 0$ , can also be defined explicitly by declaring a particular vertex as the “root” of the tree; see a similar reasoning in Section 5.1 below. We do not need a formula for $\mathcal {N}^{0}_{\lambda }(f)$ in the proof.

In order to prove Theorem 3, it is enough to show the following:

(5.1)
(5.2)
(5.3)

where a, c, J, $\lambda _j$ , $\lambda $ , and R were described in Section 3 and $B\subseteq [0,R]^2$ is a measurable set satisfying $|B|\geq \delta R^2$ . We write f interchangeably with the indicator function $\mathbb{1}_B$ of the set B.

5.1 The structured part: proof of (5.1)

Denote $t_k:=\lambda ^{a_k}b_k$ for each $k\in E$ . Just as in Section 3.1, turn $[0,R]^2$ into a probability space and define the dyadic filtration $(\mathcal {G}_m)_{m=0}^{\infty }$ on it.

Let us declare an arbitrary vertex $v_{\mathcal {T}}\in V$ to be the tree root or the tree top. We can imagine that the tree $\mathcal {T}$ is “hung” upside down by holding it by the vertex $v_{\mathcal {T}}$ . This naturally leads to the relations “is a child of” and “is a parent of” on the set of vertices V. For any $v\in V$ , let $\mathcal {T}_v=(V_v,E_v)$ be the subtree of $\mathcal {T}$ consisting of v as its root and of all descendants of v with respect to $\mathcal {T}$ .

For any tree $\mathcal {T}=(V,E)$ with root $v_{\mathcal {T}}$ , we define the following function of only one variable $x_{v_{\mathcal {T}}}$ :

By induction on the tree structure, we are going to show that for any such tree $\mathcal {T}$ , there exist nonnegative integers $m_1,m_2,\ldots ,m_{|E|}$ (depending also on the numbers $t_k$ ) such that the following pointwise inequality holds:

(5.4) $$ \begin{align} \mathcal{A}_{\mathcal{T}}f \geq f (\mathbb{E}_{m_1}f) (\mathbb{E}_{m_2}f) \cdots (\mathbb{E}_{m_{|E|}}f) \quad \textrm{a.e.} \end{align} $$

for any nonnegative bounded measurable function f. Indeed, the induction basis is the case when the tree $\mathcal {T}$ has only one vertex and no edges, and then (5.4) reduces to the trivial inequality $f\geq f$ . For the induction step, let $k_1,\ldots ,k_s$ be all edges incident with the root $v_{\mathcal {T}}$ and let $v_1,\ldots ,v_s$ , respectively, be the corresponding children of $v_{\mathcal {T}}$ . Clearly,

i.e.,

For each $i\in \{1,\ldots ,s\}$ , let $l_i$ be the smallest nonnegative integer such that $2^{-{l_i}}R<t_{k_i}$ . That way, we get

(5.5) $$ \begin{align} \mathcal{A}_{\mathcal{T}}f \gtrsim f \ \prod_{i=1}^{s} \mathbb{E}_{l_i} \mathcal{A}_{\mathcal{T}_{v_i}}f \quad \textrm{a.e.} \end{align} $$
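To make the choice of $l_i$ concrete, here is a short computation. It assumes that $\mathcal {G}_{l}$ is generated by dyadic squares of side length $2^{-l}R$ , which is how we read the construction from Section 3.1. The condition $2^{-l}R<t_{k_i}$ is equivalent to $l>\log _2(R/t_{k_i})$ , so

$$ \begin{align*} l_i = \max\big\{0,\ \lfloor \log_2(R/t_{k_i}) \rfloor + 1\big\}. \end{align*} $$

Whenever $l_i\geq 1$ , minimality of $l_i$ gives $2^{-(l_i-1)}R\geq t_{k_i}$ , i.e.,

$$ \begin{align*} \frac{t_{k_i}}{2} \leq 2^{-l_i}R < t_{k_i}. \end{align*} $$

Under the stated assumption, the squares generating $\mathcal {G}_{l_i}$ have side length comparable to $t_{k_i}$ , which is the kind of comparability one expects behind the implicit constant in (5.5).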

Let us apply the induction hypothesis of (5.4) to each of the subtrees $\mathcal {T}_{v_i}$ and then resolve the nested conditional expectations using inequality (2.25). Plugging these into (5.5) and multiplying them for all i, we conclude that $\mathcal {A}_{\mathcal {T}}f$ also satisfies a lower bound of the form (5.4) for some nonnegative integers $m_1,\ldots ,m_{|E|}$ , which we do not need to make explicit. This finalizes the induction step and establishes the claim (5.4).
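The number of factors in (5.4) can be checked by a simple count. It is sketched here under the assumption that each application of (2.25) preserves the number of functions involved. The edge set E is the disjoint union of the s edges $k_1,\ldots ,k_s$ incident with the root and the edge sets $E_{v_1},\ldots ,E_{v_s}$ of the subtrees, so

$$ \begin{align*} |E| = s + \sum_{i=1}^{s} |E_{v_i}| = \sum_{i=1}^{s} \big(1+|E_{v_i}|\big). \end{align*} $$

By the induction hypothesis, each child $v_i$ contributes a product of $1+|E_{v_i}|$ functions inside $\mathbb {E}_{l_i}$ , namely f itself and $|E_{v_i}|$ conditional expectations of f. Summing over i is consistent with the $|E|$ conditional expectations on the right-hand side of (5.4).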

Combining (5.4) with (2.24) and applying it to

, we finally conclude

which is precisely (5.1).

5.2 The error part: proof of (5.2)

Generalized heat equation (2.7) and the fundamental theorem of calculus allow us to rewrite the difference $\mathcal {N}^{\varepsilon }_{\lambda }(f)-\mathcal {N}^{1}_{\lambda }(f)$ as

$$ \begin{align*} \sum_{m\in E}\mathcal{L}_{\lambda}^{\varepsilon,1,m}(f), \end{align*} $$

where, this time,

The proof of (3.3) presented in Section 3.2 does not recognize any graph structure at all. Thus, the same proof carries over here, replacing measures $\sigma ^{y_1,\ldots ,y_{k-1}}$ (associated with varying spheres of different dimensions) always with the same circle measure $\sigma $ .

5.3 The uniform part: proof of (5.3)

The proof of (3.4) given in Section 3.3 does not see any graph structure either. It keeps cancellation at only one crucial place, i.e., associated with only one chosen edge. For this reason, we can proceed with only very minor modifications of the arguments from either Section 3.3 or 4.3.

6 Closing remarks

6.1 Multilinear anisotropic singular integrals

As we have already said, a part of the motivation behind this work lies in encouraging connections between the techniques from multilinear harmonic analysis and the problems on combinatorics of the Euclidean space. In this subsection, we want to single out singular integrals that can naturally be associated with problems studied in Theorems 1–3. This correspondence should not be understood literally, but rather on the level of heuristics and methodology. After all, the proofs of the above theorems did not use any estimates for the integral operators that will be mentioned here. In Sections 3–5, we preferred to use ad hoc shortcuts in the form of the Gaussian domination estimate (2.9) and identities (2.10) and (2.15). However, it would be a pity not to mention analytical counterparts of these problems, if for nothing else, then to point out where to look for the techniques for handling similar or more general combinatorial questions.

First, for a tuple $\textbf {f}=(f_0,f_1,\ldots ,f_n)$ of measurable functions on $\mathbb{R}^d$ and for a certain singular kernel K, we define the translation-invariant multilinear form

(6.1) $$ \begin{align} \Lambda_{K}(\textbf{f}) := \textrm{p.v.} \int_{(\mathbb{R}^{d})^{n+1}} K(x_1-x_0,\ldots,x_n-x_0) \,\left(\prod_{k=0}^{n} f_k(x_k)\,\textrm{d}x_k \right). \end{align} $$

From either (3.6) or (3.7), we are naturally led to the study of multilinear singular integral forms (6.1). For instance, if we were allowed to replace the measures $\mu $ or $\sigma ^{y_1,\ldots ,y_{k-1}}$ by the Dirac delta measure at the origin, we would obtain the kernel K given by

(6.2)

The study of multilinear singular integrals (6.1) with Calderón–Zygmund kernels K was initiated by Coifman and Meyer in the 1970s (for instance, see [7–9]), while a more systematic treatment was first given by Grafakos and Torres [29]. However, (6.2) is not the most usual singular kernel. It satisfies the Calderón–Zygmund estimates with respect to quasinorms associated with anisotropic power-type dilations (1.1). Such more general dilation structures have been studied by Stein and Wainger [53]. Strictly speaking, classical results, such as those from [29], do not apply to kernels (6.2), but the same techniques do, and the proofs can be repeated mutatis mutandis (also see [28]). Unsurprisingly, Section 3.2 already comes quite close to proving some $\textrm {L}^p$ bounds for (6.1), with its use of maximal and square function estimates, the trivial case of the latter being identity (2.10). However, it is conceivable that, in the future, one reduces a certain combinatorial problem to the study of (6.1) for quite different kernels K.

Next, for a tuple of measurable functions $\textbf {f}=(f_{r_1,\ldots ,r_n})_{(r_1,\ldots ,r_n)\in \{0,1\}^n}$ on $(\mathbb{R}^d)^{n}$ indexed by points from $\{0,1\}^n$ and for a singular kernel K, we define

(6.3) $$ \begin{align} \Theta_{K}(\textbf{f}) := \textrm{p.v.} \int_{(\mathbb{R}^d)^{2n}} & \left( \prod_{(r_1,\ldots,r_n)\in\{0,1\}^n} f_{r_1,\ldots,r_n}(x_1 + r_1 y_1, \ldots, x_n + r_n y_n) \right) \nonumber \\ & \times \,K(y_1,\ldots,y_n) \left( \prod_{k=1}^{n}\textrm{d}x_k\,\textrm{d}y_k \right) . \end{align} $$

These forms have also been studied extensively when K is a usual Calderón–Zygmund kernel, i.e., one satisfying Calderón–Zygmund estimates with respect to the Euclidean metric. The first $\textrm {L}^p$ bounds for the forms (6.3) were established in the case $d=1$ and $n=2$ by Durcik [12, 13]. Prior to that, their dyadic model had been investigated by the present author [40, 41] and by Thiele and the present author [43]. Estimates for rather general “entangled” multilinear singular integrals of the above type (i.e., with cubical structure) have been shown recently by Durcik and Thiele [21]. It is also interesting to mention that the study of multiparameter variants of these objects has only recently been initiated by Bernicot and Durcik [2]. It could be interesting to study (6.3) for more general dilation structures, i.e., when K is a generalized Calderón–Zygmund kernel, such as in the case (6.2), which is relevant here. In this context, a similar but still different object has been studied by Škreb and the present author [42]. Some generalizations of the result by Durcik [12] are straightforward: the single $\textrm {L}^{2^n}\times \cdots \times \textrm {L}^{2^n}$ bound for (6.3) can be extracted easily from the presented proof of Theorem 2 combined with a cone decomposition of the kernel. It is quite likely that $\Theta _K$ still satisfies the same $\textrm {L}^p$ estimates from [13], but we do not attempt such generalizations here.

Finally, after Theorem 3, a closely related topic of further investigation could be establishing estimates for entangled multilinear singular integrals associated with bipartite graphs or r-partite r-regular hypergraphs. Dyadic models of these problems are significantly easier: they have already been handled quite generally by the present author [40] and Stipčić [55], respectively. Otherwise, the only cases and variants of entangled singular integral forms studied so far are the so-called “twisted paraproduct operator” [20, 41], the operators with cubical structure [12, 13, 21], and the operators that resemble multilinear Hilbert transforms [16, 19, 58, 60].

6.2 Comments on the anisotropic setting

Configurations generated by anisotropic power-type dilations (1.1) have not been studied prior to this work. We found this setting sufficiently interesting, because it fits nicely into the general method explained in Section 1.3. Still, the results formulated in Section 1.2 are far from being definitive.

In relation to simplices, the fact that we are dilating by power functions $\lambda \mapsto \lambda ^a b$ plays little role in the proof of Theorem 1. Structured and uniform parts are handled much more generally, while the control of the error part essentially requires control of the multiscale objects of the form (6.1). Perhaps by studying multilinear singular integral forms (6.1) more closely, one can hope for further generalizations of Theorem 1. In the present paper, the setting (1.1) was just very convenient.

The same comment does not apply to rectangular boxes. So far, the only known way of handling entangled singular forms (6.3) is quite rigid and uses the same steps as those employed in the proof of Theorem 2. All of the papers [12, 13, 18, 19, 21] used domination by Gaussians (2.9) and some form of the product rule (2.16), or equivalently, integration by parts. This leaves less flexibility, so algebraic properties, such as (2.3)–(2.7), are now crucial in the proof.

Finally, one might wonder why we do not study general distance graphs in Theorem 3. When vertices of the graph (or points of the configuration) are added one by one, it is desirable that the location of the new vertex depends on the locations of previous vertices and $\lambda $ in a reasonably simple way. In general, dilations (1.1) with different exponents $a_k$ can make this dependence “very nonlinear.” This complication arises already when the distance graph in question is a cycle of length $3$ . Suppose that, for all sufficiently large $\lambda $ , we want to find points $x_0,x_1,x_2\in A\subseteq \mathbb{R}^d$ such that $|x_0-x_1|=\lambda $ , $|x_0-x_2|=|x_1-x_2|=\lambda ^2$ . Once $x_0$ and $x_1$ are located, the third point $x_2$ has to lie on the hyperplane H orthogonal to the segment joining $x_0$ and $x_1$ and passing through its midpoint $(x_0+x_1)/2$ . However, the distance from $x_2$ to $(x_0+x_1)/2$ needs to be $\lambda \sqrt {4\lambda ^2-1}/2$ , and we would need to stretch the corresponding spherical measure by this radical factor. For a similar reason in Theorem 1, we do not determine a simplex solely by lengths of its edges, but rather fix the angles between all pairs of its edges meeting at a single vertex.
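The radical factor in the triangle example can be verified by the Pythagorean theorem. The following is only a worked check of the distance stated above, with $m:=(x_0+x_1)/2$ introduced here as shorthand for the midpoint. Since $x_2-m$ is orthogonal to $x_0-m$ , while $|x_0-m|=\lambda /2$ and $|x_0-x_2|=\lambda ^2$ , we have

$$ \begin{align*} |x_2-m|^2 = |x_0-x_2|^2 - |x_0-m|^2 = \lambda^4 - \frac{\lambda^2}{4} = \frac{\lambda^2(4\lambda^2-1)}{4}, \end{align*} $$

so $|x_2-m|=\lambda \sqrt {4\lambda ^2-1}/2$ for $\lambda >1/2$ . This is not a power function of $\lambda $ , and hence it falls outside the dilation structure (1.1).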

Acknowledgment

The author is grateful to Alex Amenta, Polona Durcik, and João Pedro Ramos for useful discussions and to both anonymous referees for numerous suggestions on improving the text. He also appreciates the hospitality of the Georgia Institute of Technology in the academic year 2019–2020.

Footnotes

This work is supported in part by the Croatian Science Foundation project UIP-2017-05-4129 (MUNHANAP) and in part by the Fulbright Scholar Program.

References

[1] Bennett, M., Iosevich, A., and Taylor, K., Finite chains inside thin subsets of $\mathbb{R}^d$. Anal. PDE 9(2016), no. 3, 597–614.
[2] Bernicot, F. and Durcik, P., Boundedness of some multi-parameter fiber-wise multiplier operators. Indiana Univ. Math. J. Preprint, 2020. arXiv:2007.02211
[3] Bourgain, J., A Szemerédi type theorem for sets of positive density in $\mathbb{R}^k$. Israel J. Math. 54(1986), no. 3, 307–316.
[4] Bourgain, J., A nonlinear version of Roth’s theorem for sets of positive density in the real line. J. Analyse Math. 50(1988), 169–181.
[5] Bulinski, K., Spherical recurrence and locally isometric embeddings of trees into positive density subsets of $\mathbb{R}^d$. Math. Proc. Cambridge Philos. Soc. 165(2018), no. 2, 267–278.
[6] Chan, V., Łaba, I., and Pramanik, M., Finite configurations in sparse sets. J. Anal. Math. 128(2016), 289–335.
[7] Coifman, R. R. and Meyer, Y., On commutators of singular integrals and bilinear singular integrals. Trans. Amer. Math. Soc. 212(1975), 315–331.
[8] Coifman, R. R. and Meyer, Y., Au delà des opérateurs pseudo-différentiels. Astérisque, 57, Société mathématique de France, Paris, 1978.
[9] Coifman, R. R. and Meyer, Y., Commutateurs d’intégrales singulières et opérateurs multilinéaires. Ann. Inst. Fourier (Grenoble) 28(1978), no. 3, 177–202.
[10] Cook, B., Magyar, Á., and Pramanik, M., A Roth-type theorem for dense subsets of $\mathbb{R}^d$. Bull. Lond. Math. Soc. 49(2017), no. 4, 676–689.
[11] Denson, J., Pramanik, M., and Zahl, J., Large sets avoiding rough patterns. In: M. T. Rassias (ed.), Harmonic Analysis and Applications, Springer Optimization and Its Applications 168, Springer, Cham, 2021, pp. 59–75.
[12] Durcik, P., An ${\mathsf{L}}^4$ estimate for a singular entangled quadrilinear form. Math. Res. Lett. 22(2015), no. 5, 1317–1332.
[13] Durcik, P., ${\mathsf{L}}^p$ estimates for a singular entangled quadrilinear form. Trans. Amer. Math. Soc. 369(2017), no. 10, 6935–6951.
[14] Durcik, P., Guo, S., and Roos, J., A polynomial Roth theorem on the real line. Trans. Amer. Math. Soc. 371(2019), no. 10, 6973–6993.
[15] Durcik, P. and Kovač, V., Boxes, extended boxes, and sets of positive upper density in the Euclidean space. Math. Proc. Cambridge Philos. Soc. Published Online (2021). doi:10.1017/S0305004120000316
[16] Durcik, P. and Kovač, V., A Szemerédi-type theorem for subsets of the unit cube. Anal. PDE Preprint, 2020. arXiv:2003.01189
[17] Durcik, P., Kovač, V., and Rimanić, L., On side lengths of corners in positive density subsets of the Euclidean space. Int. Math. Res. Not. 2018(2018), no. 22, 6844–6869.
[18] Durcik, P., Kovač, V., Škreb, K. A., and Thiele, C., Norm-variation of ergodic averages with respect to two commuting transformations. Ergod. Th. & Dynam. Sys. 39(2019), no. 3, 658–688.
[19] Durcik, P., Kovač, V., and Thiele, C., Power-type cancellation for the simplex Hilbert transform. J. Anal. Math. 139(2019), 67–82.
[20] Durcik, P. and Roos, J., Averages of simplex Hilbert transforms. Proc. Amer. Math. Soc. 149(2021), 633–647.
[21] Durcik, P. and Thiele, C., Singular Brascamp–Lieb inequalities with cubical structure. Bull. Lond. Math. Soc. 52(2020), no. 2, 283–298.
[22] Durrett, R., Probability: theory and examples. 5th ed., Cambridge Series in Statistical and Probabilistic Mathematics, 49, Cambridge University Press, Cambridge, 2019.
[23] Falconer, K. J. and Marstrand, J. M., Plane sets with positive density at infinity contain all large distances. Bull. Lond. Math. Soc. 18(1986), no. 5, 471–474.
[24] Fraser, R., Guo, S., and Pramanik, M., Polynomial Roth theorems on sets of fractional dimensions. Int. Math. Res. Not. Preprint, 2019. arXiv:1904.11123
[25] Fraser, R. and Pramanik, M., Large sets avoiding patterns. Anal. PDE 11(2018), no. 5, 1083–1111.
[26] Furstenberg, H. and Katznelson, Y., An ergodic Szemerédi theorem for commuting transformations. J. Anal. Math. 38(1978), no. 1, 275–291.
[27] Furstenberg, H., Katznelson, Y., and Weiss, B., Ergodic theory and configurations in sets of positive density. In: Nešetřil, J. and Rödl, V. (eds.), Mathematics of Ramsey theory, Algorithms and Combinatorics, 5, Springer, Berlin, 1990, pp. 184–198.
[28] Ghosh, A., Bhojak, A., Mohanty, P., and Shrivastava, S., Sharp weighted estimates for multi-linear Calderón–Zygmund operators on non-homogeneous spaces. J. Pseudo-Differ. Oper. Appl. 11(2020), 1833–1867.
[29] Grafakos, L. and Torres, R. H., Multilinear Calderón–Zygmund theory. Adv. Math. 165(2002), no. 1, 124–164.
[30] Green, B. and Tao, T., The primes contain arbitrarily long arithmetic progressions. Ann. of Math. (2) 167(2008), no. 2, 481–547.
[31] Greenleaf, A., Iosevich, A., Liu, B., and Palsson, E., An elementary approach to simplexes in thin subsets of Euclidean space. Preprint, 2016. arXiv:1608.04777
[32] Greenleaf, A., Iosevich, A., and Mkrtchyan, S., Existence of similar point configurations in thin subsets of $\mathbb{R}^d$. Math. Z. 297(2021), 855–865.
[33] Greenleaf, A., Iosevich, A., and Pramanik, M., On necklaces inside thin subsets of $\mathbb{R}^d$. Math. Res. Lett. 24(2017), no. 2, 347–362.
[34] Henriot, K., Łaba, I., and Pramanik, M., On polynomial configurations in fractal sets. Anal. PDE 9(2016), no. 5, 1153–1184.
[35] Huckaba, L., Lyall, N., and Magyar, Á., Simplices and sets of positive upper density in $\mathbb{R}^d$. Proc. Amer. Math. Soc. 145(2017), no. 6, 2335–2347.
[36] Iosevich, A. and Liu, B., Equilateral triangles in subsets of $\mathbb{R}^d$ of large Hausdorff dimension. Israel J. Math. 231(2019), no. 1, 123–137.
[37] Iosevich, A. and Magyar, Á., Simplices in thin subsets of Euclidean spaces. Preprint, 2020. arXiv:2009.04902
[38] Iosevich, A. and Taylor, K., Finite trees inside thin subsets of $\mathbb{R}^d$. In: A. Karapetyants, V. Kravchenko, and E. Liflyand (eds.), Modern methods in operator theory and harmonic analysis, Springer Proceedings in Mathematics & Statistics, 291, Springer, Cham, 2019, pp. 51–56.
[39] Keleti, T., Construction of one-dimensional subsets of the reals not containing similar copies of given patterns. Anal. PDE 1(2008), no. 1, 29–33.
[40] Kovač, V., Bellman function technique for multilinear estimates and an application to generalized paraproducts. Indiana Univ. Math. J. 60(2011), no. 3, 813–846.
[41] Kovač, V., Boundedness of the twisted paraproduct. Rev. Mat. Iberoam. 28(2012), no. 4, 1143–1164.
[42] Kovač, V. and Škreb, K. A., One modification of the martingale transform and its applications to paraproducts and stochastic integrals. J. Math. Anal. Appl. 426(2015), no. 2, 1143–1163.
[43] Kovač, V. and Thiele, C., A T(1) theorem for entangled multilinear dyadic Calderón–Zygmund operators. Illinois J. Math. 57(2013), no. 3, 775–799.
[44] Łaba, I. and Pramanik, M., Arithmetic progressions in sets of fractional dimension. Geom. Funct. Anal. 19(2009), no. 2, 429–456.
[45] Lyall, N. and Magyar, Á., Product of simplices and sets of positive upper density in $\mathbb{R}^d$. Math. Proc. Cambridge Philos. Soc. 165(2018), no. 1, 25–51.
[46] Lyall, N. and Magyar, Á., Weak hypergraph regularity and applications to geometric Ramsey theory. Trans. Amer. Math. Soc. Preprint, 2019.
[47] Lyall, N. and Magyar, Á., Distance graphs and sets of positive upper density in $\mathbb{R}^d$. Anal. PDE 13(2020), no. 3, 685–700.
[48] Lyall, N. and Magyar, Á., Distances and trees in dense subsets of $\mathbb{R}^d$. Israel J. Math. 240(2020), 769–790.
[49] Magyar, Á., k-Point configurations in sets of positive density of $\mathbb{Z}^n$. Duke Math. J. 146(2009), no. 1, 1–34.
[50] Shkredov, I. D., On a problem of Gowers (in Russian). Izv. Ross. Akad. Nauk Ser. Mat. 70(2006), no. 2, 179–221. English translation in Izv. Math. 70(2006), no. 2, 385–425.
[51] Shmerkin, P., Salem sets with no arithmetic progressions. Int. Math. Res. Not. IMRN 7(2017), 1929–1941.
[52] Stein, E. M., Harmonic analysis: real-variable methods, orthogonality, and oscillatory integrals. Princeton Mathematical Series, 43, Princeton University Press, Princeton, NJ, 1993.
[53] Stein, E. M. and Wainger, S., Problems in harmonic analysis related to curvature. Bull. Amer. Math. Soc. 84(1978), no. 6, 1239–1295.
[54] Stein, E. M. and Weiss, G., Introduction to Fourier analysis on Euclidean spaces. Princeton Mathematical Series, 32, Princeton University Press, Princeton, NJ, 1971.
[55] Stipčić, M., T(1) theorem for dyadic singular integral forms associated with hypergraphs. J. Math. Anal. Appl. 481(2020), no. 2, Article no. 123496.
[56] Szemerédi, E., On sets of integers containing no k elements in arithmetic progression. Acta Arith. 27(1975), 199–245.
[57] Tao, T., The ergodic and combinatorial approaches to Szemerédi’s theorem. In: A. Granville, M. B. Nathanson, and J. Solymosi (eds.), Additive combinatorics, CRM Proceedings and Lecture Notes, 43, American Mathematical Society, Providence, RI, 2007, pp. 145–193.
[58] Tao, T., Cancellation for the multilinear Hilbert transform. Collect. Math. 67(2016), no. 2, 191–206.
[59] Yavicoli, A., Patterns in thick compact sets. Israel J. Math. Preprint, 2020. arXiv:1910.10057
[60] Zorin-Kranich, P., Cancellation for the simplex Hilbert transform. Math. Res. Lett. 24(2017), no. 2, 581–592.