1. Introduction
Let
$h(\mathbf{Y})=-\int_{\Bbb{R}^{m}}{f_{\mathbf{Y}}(\mathbf{y};\ t)\log f_{\mathbf{Y}}(\mathbf{y};\ t)\,\mathrm{d}\mathbf{y}}$
denote the differential entropy of a random vector
$\mathbf{Y}$
with probability density function (PDF)
$f_{\mathbf{Y}}(\mathbf{y};\ t)$
depending on a real parameter t. The entropy power of an m-variate random vector
$\mathbf{Y}$
is defined by
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20231221162030379-0164:S0021900222001280:S0021900222001280_eqnU1.png?pub-status=live)
which was first introduced by Shannon [13]. One of the most important inequalities in information theory is the entropy power inequality (EPI), which gives a lower bound for the differential entropy of the sum of the independent random vectors
$\mathbf{X}$
and
$\mathbf{Y}$
as
$N(\mathbf{X}+\mathbf{Y})\geq N(\mathbf{X})+N(\mathbf{Y})$
. The first complete proof of the EPI was given by Stam [15], who, in developing it, proved an equality called de Bruijn’s identity. This identity links Fisher information with Shannon’s differential entropy (see [5]). Consider the additive Gaussian noise channel model
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20231221162030379-0164:S0021900222001280:S0021900222001280_eqn1.png?pub-status=live)
in which the input signal
$\mathbf{X}=(X_{1},\ldots,X_{m})^\top$
and the additive noise
$\mathbf{W}_{t}=(W_{t,1},\ldots,W_{t,m})^\top$
are two m-variate random vectors and
$\mathbf{W}_{t}$
is normally distributed with mean vector
$\mathbf{0}$
and covariance matrix
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20231221162030379-0164:S0021900222001280:S0021900222001280_eqn2.png?pub-status=live)
where the
$\sigma_{ij}$
,
$i,j=1,2,\ldots ,m$
, are real numbers. De Bruijn’s identity, generalized by Costa [7] to multivariate random variables, is given by
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20231221162030379-0164:S0021900222001280:S0021900222001280_eqn3.png?pub-status=live)
in which
$\mathbf{X}$
and
$\mathbf{W}_{t}$
are independent random vectors and
$J(\mathbf{Y})$
stands for the Fisher information of
$f_{\mathbf{Y}}(\mathbf{y};\ t)$
, defined by
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20231221162030379-0164:S0021900222001280:S0021900222001280_eqn4.png?pub-status=live)
There are several applications of the EPI, such as bounding the capacity of certain kinds of channels and proving converses of channel or source coding theorems; see, e.g., [6, 18]. Considering the channel model (1), Costa [7] presented an extension of the EPI for the case in which
$\mathbf{W}_{t}$
is independent of
$\mathbf{X}$
with
$\Sigma_{\mathbf{W}_{t}}=t\mathbf{I}_{m}$
, where
$\mathbf{I}_{m}$
is the
$m\times m$
identity matrix. That is,
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20231221162030379-0164:S0021900222001280:S0021900222001280_eqnU2.png?pub-status=live)
or, equivalently,
$N(\mathbf{X}+\mathbf{W}_{t})$
is concave in t, i.e.
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20231221162030379-0164:S0021900222001280:S0021900222001280_eqn5.png?pub-status=live)
Later, Dembo [8] provided another simple proof of Costa’s concavity inequality (5) via the Stam Fisher information inequality [15], given by
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20231221162030379-0164:S0021900222001280:S0021900222001280_eqnU3.png?pub-status=live)
where X and W are independent random variables. Also, Villani [17] used some advanced methods to simplify Costa’s proof of the inequality (5).
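As a numerical illustration of the concavity property (5) in the classical independent setting, the following Python sketch evaluates the entropy power of a one-dimensional output Y = X + W_t for a Gaussian-mixture input and checks that its second difference in t is nonpositive. It assumes the standard normalization N(Y) = exp(2h(Y)/m)/(2πe) with m = 1; the mixture parameters `a` and `s2`, the integration grid, and all function names are hypothetical choices made for this illustration and are not taken from the paper.

```python
import math


def mixture_pdf(y, t, a=2.0, s2=0.25):
    """PDF of Y = X + W_t, where X ~ 0.5*N(-a, s2) + 0.5*N(a, s2)
    and W_t ~ N(0, t) is independent of X (illustrative parameters)."""
    var = s2 + t
    norm = 1.0 / math.sqrt(2.0 * math.pi * var)
    return 0.5 * norm * (
        math.exp(-((y + a) ** 2) / (2.0 * var))
        + math.exp(-((y - a) ** 2) / (2.0 * var))
    )


def entropy(t, lo=-20.0, hi=20.0, n=8000):
    """Differential entropy h(Y) in nats, using a midpoint Riemann sum."""
    dy = (hi - lo) / n
    total = 0.0
    for k in range(n):
        f = mixture_pdf(lo + (k + 0.5) * dy, t)
        if f > 0.0:
            total -= f * math.log(f) * dy
    return total


def entropy_power(t):
    """Entropy power for m = 1, assuming N(Y) = exp(2 h(Y)) / (2 pi e)."""
    return math.exp(2.0 * entropy(t)) / (2.0 * math.pi * math.e)


def second_difference(t, dt=0.1):
    """Central second difference of the entropy power in t."""
    return entropy_power(t - dt) - 2.0 * entropy_power(t) + entropy_power(t + dt)


# Concavity in t means every second difference is nonpositive.
second_diffs = [second_difference(t) for t in (0.5, 1.0, 2.0, 4.0)]
```

Under these assumptions each entry of `second_diffs` should be nonpositive up to quadrature error, in line with Costa's result for independent noise. The check says nothing about the dependent case treated later in the paper.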
As mentioned before, in all of the above results the assumption of independence between the input signal
$\mathbf{X}$
and the additive noise
$\mathbf{W}_{t}$
has been required. However, there are several real situations, such as in radar and sonar systems, in which the noise is highly dependent on the transmitted signal [11]. It was illustrated in [16] that, under some assumptions, Shannon’s EPI can hold for weakly dependent random variables; [3] extended the EPI to dependent random variables with arbitrary distributions; and [10] provided certain conditions under which the conditional EPI can hold for dependent summands as well.
One of the best methods for describing the dependency structure among random variables is by copula functions. Copula theory was first introduced in [14] in order to achieve the connection between a joint PDF and its marginals. In [4], the authors extended two inequalities based on the Fisher information when the input signal and noise components are dependent and their dependence structure is modeled by several well-known copulas. There are several families of copulas with different dependence structures. The Gaussian copula is one of the most widely used, and describes different levels of dependence between marginal components. In the present paper, by considering the additive Gaussian noise channel model (1) where the input signal
$\mathbf{X}$
and noise
$\mathbf{W}_{t}$
are dependent random vectors obeying the multivariate Gaussian copula, first, an extension of de Bruijn’s identity (3) is derived, and then Costa’s concavity inequality (5) is proved, under some mild conditions.
The rest of the paper is organized as follows. In Section 2 we recall the copula theory concept and the basic definition of the multivariate Gaussian copula function, along with one of its particular cases. In Section 3 we provide a generalization of the first-order derivatives of the differential entropy and Fisher information, provided that the input signal and noise components are dependent variables. Based on these derivatives, Costa’s concavity inequality is then extended for the case in which the random vector
$\mathbf{X}$
is composed of independent coordinates. Finally, we illustrate the one-dimensional versions of our results in Section 4.
Let us first establish the fundamental definitions and notation used in this paper. Let
$\phi(\mathbf{y})$
and
$\psi(\mathbf{y})$
be twice continuously differentiable functions on
$\Bbb{R}^{m}$
, and V be any closed and simply connected m-dimensional region in
$\Bbb{R}^{m}$
bounded by a piecewise smooth, closed, and oriented surface S. We recall Green’s identity [1], which is stated as
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20231221162030379-0164:S0021900222001280:S0021900222001280_eqn6.png?pub-status=live)
in which
$\nabla\phi$
and
$\nabla\psi$
are the gradients of
$\phi$
and
$\psi$
, respectively,
$\mathbf{n}_{S}$
denotes the unit vector normal to the surface S, and
$\nabla\psi\cdot\mathbf{n}_{S}$
is the inner product of the two vectors. Now, the m-dimensional Stokes’ theorem is recalled: it states that if
$\mathbf{F}\colon\Bbb{R}^{m}\rightarrow\Bbb{R}^{m}$
is a vector field over
$\Bbb{R}^{m}$
, then
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20231221162030379-0164:S0021900222001280:S0021900222001280_eqn7.png?pub-status=live)
where
$\partial V=S$
is the boundary of V.
We denote the PDF and cumulative distribution function (CDF) of a random variable X by
$f_{X}(x)$
and
$F_{X}(x)$
, respectively.
2. Copula background
Copula theory is popular in multivariate distribution analysis as copulas allow easy modeling of the distribution of a random vector by its marginals. A copula is a multivariate CDF with standard uniform marginal distributions which couples univariate distribution functions to generate a multivariate CDF and indicates the dependency structure of the random variables. Copulas are an important tool in the study of dependency between variables since they allow us to separate the effect of dependency from the effects of the marginal distributions [9]. In recent years, there has been a revival of copulas in applications where the matter of dependency between random variables is of great importance [2].
The fundamental theorem for copulas was introduced by Sklar [14] and illustrates the role that copulas play in the relationship between multivariate CDFs and their univariate marginals. In the n-dimensional case, Sklar’s theorem states that if
$F_{T_1,T_2,\ldots,T_n}$
is an n-dimensional CDF with marginals
$F_{T_1},F_{T_2},\ldots,F_{T_n}$
, then there exists an n-copula
$C\colon I^{n}\longrightarrow I$
such that
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20231221162030379-0164:S0021900222001280:S0021900222001280_eqn8.png?pub-status=live)
where
$I=[0,1]$
. If
$F_{T_1},F_{T_2},\ldots,F_{T_n}$
are continuous, the n-copula C is unique; otherwise, C is uniquely determined on the range of
$F_{T_1}$
$\times$
the range of
$F_{T_2}$
$\times\cdots\times$
the range of
$F_{T_n}$
. Conversely, if C is an n-copula and
$F_{T_1},F_{T_2},\ldots,F_{T_n}$
are univariate distribution functions, then
$F_{T_1,T_2,\ldots,T_n}$
defined by (8) is a joint CDF with marginals
$F_{T_1},F_{T_2},\ldots,F_{T_n}$
.
For any n-copula function C, there exists a corresponding copula density function c:
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20231221162030379-0164:S0021900222001280:S0021900222001280_eqn9.png?pub-status=live)
Therefore, if
$f_{T_1,T_2,\ldots,T_n}$
,
$f_{T_1},f_{T_2},\ldots,f_{T_n}$
, and c are the density functions of
$F_{T_1,T_2,\ldots,T_n}$
,
$F_{T_1},F_{T_2},\ldots,F_{T_n}$
, and C, respectively, the relation in (8) yields
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20231221162030379-0164:S0021900222001280:S0021900222001280_eqn10.png?pub-status=live)
where
$u_{1},u_{2},\ldots ,u_{n}$
are related to
$t_{1},t_{2},\ldots ,t_{n}$
through the marginal distribution functions
$u_{1}=F_{T_1}(t_{1})$
,
$u_{2}=F_{T_2}(t_{2})$
, …,
$u_{n}=F_{T_n}(t_{n})$
.
Let us recall the definition of one of the most popular copulas, the multivariate Gaussian copula, which we consider here.
Definition 1. The n-dimensional Gaussian copula with covariance matrix
$\boldsymbol{\Sigma}$
is defined by
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20231221162030379-0164:S0021900222001280:S0021900222001280_eqn11.png?pub-status=live)
where
$\Phi_{\boldsymbol{\Sigma}}$
denotes the CDF of the n-variate normal random vector with mean vector
$\mathbf{0}$
and covariance matrix
$\boldsymbol{\Sigma}$
,
$\Phi^{-1}$
is the inverse of the univariate standard Gaussian CDF, and
$0\leq u_{1},u_{2},\ldots ,u_{n}\leq 1$
.
In this paper we consider the special version of the n-dimensional Gaussian copula with
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20231221162030379-0164:S0021900222001280:S0021900222001280_eqnU4.png?pub-status=live)
and
$-1/(n-1)<\rho<1$
in which
$\mathbf{1}_{n}=(1,1,\ldots,1)_{1\times n}^\top$
. Thus, from (9), the n-dimensional Gaussian copula density is given by
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20231221162030379-0164:S0021900222001280:S0021900222001280_eqn12.png?pub-status=live)
where
$\phi_{\boldsymbol{\Sigma}}$
is the PDF of the n-variate Gaussian distribution, and
$z_{i}=\Phi^{-1}(u_{i})$
,
$i=1,2,\ldots,n$
. Since
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20231221162030379-0164:S0021900222001280:S0021900222001280_eqnU5.png?pub-status=live)
we have
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20231221162030379-0164:S0021900222001280:S0021900222001280_eqn13.png?pub-status=live)
Now, due to the fact that
$\big(\sum_{i=1}^{n}z_{i}\big)^{2}=\sum_{i=1}^{n}z_{i}^{2}+\sum_{i\neq j}z_{i}z_{j}$
, substituting (13) into (12) yields
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20231221162030379-0164:S0021900222001280:S0021900222001280_eqn14.png?pub-status=live)
where
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20231221162030379-0164:S0021900222001280:S0021900222001280_eqnU6.png?pub-status=live)
Remark 1. Note that setting
$\boldsymbol{\Sigma}=\mathbf{I}_{n}$
, i.e.
$\rho=0$
, in (11) leads to the independence copula
$C_{\mathbf{I}_{n}}(u_{1},u_{2},\ldots ,u_{n}) = u_{1}u_{2}\cdots u_{n}$
, which is equivalent to the random variables
$T_{1},T_{2},\ldots,T_{n}$
being independent.
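The equicorrelated Gaussian copula density can be evaluated in closed form. The sketch below assumes that the covariance matrix has the form Σ = (1 − ρ) I_n + ρ 1_n 1_n^T, which is consistent with the stated range −1/(n − 1) < ρ < 1, and uses the standard expression c(u) = |Σ|^{-1/2} exp(−½ z^T(Σ^{-1} − I_n) z) with z_i = Φ^{-1}(u_i). The determinant and inverse of Σ are the usual rank-one-update formulas; the function name is hypothetical.

```python
import math
from statistics import NormalDist

STD_NORMAL = NormalDist()  # univariate standard Gaussian


def equicorrelated_copula_density(u, rho):
    """Gaussian copula density at u = (u_1, ..., u_n), n >= 2, assuming
    Sigma = (1 - rho) * I_n + rho * 1_n 1_n^T (an assumption of this sketch)."""
    n = len(u)
    if n < 2 or not (-1.0 / (n - 1) < rho < 1.0):
        raise ValueError("need n >= 2 and -1/(n-1) < rho < 1")
    z = [STD_NORMAL.inv_cdf(ui) for ui in u]
    sum_of_squares = sum(zi * zi for zi in z)
    square_of_sum = sum(z) ** 2
    # det(Sigma) = (1 - rho)^(n-1) * (1 + (n-1) * rho)
    det = (1.0 - rho) ** (n - 1) * (1.0 + (n - 1) * rho)
    # z^T Sigma^{-1} z, using the closed-form inverse of Sigma
    quad = (sum_of_squares
            - rho / (1.0 + (n - 1) * rho) * square_of_sum) / (1.0 - rho)
    return math.exp(-0.5 * (quad - sum_of_squares)) / math.sqrt(det)


# rho = 0 corresponds to the independence case of Remark 1.
density_at_independence = equicorrelated_copula_density([0.2, 0.7, 0.9], 0.0)
```

With ρ = 0 the quadratic forms cancel and the determinant is one, so the density is identically one, which matches Remark 1.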
A particular case of the n-dimensional Gaussian copula is the bivariate Gaussian copula. If we put
$n=2$
and
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20231221162030379-0164:S0021900222001280:S0021900222001280_eqnU7.png?pub-status=live)
with
$-1<\rho<1$
, then the bivariate Gaussian copula is defined by
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20231221162030379-0164:S0021900222001280:S0021900222001280_eqnU8.png?pub-status=live)
where
$\rho\in(\!-1,1)$
is the Gaussian copula parameter and
$\Phi_{2}$
is the bivariate standard Gaussian CDF. The Gaussian copula density for
$-1<\rho<1$
is obtained as
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20231221162030379-0164:S0021900222001280:S0021900222001280_eqn15.png?pub-status=live)
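To connect the bivariate copula density with relation (10), the following sketch builds a joint PDF as the product of the copula density and the two marginal PDFs, and compares it with the bivariate Gaussian PDF when both marginals are standard Gaussian. It assumes the usual closed form of the bivariate Gaussian copula density, c(u, v) = (1 − ρ²)^{-1/2} exp(−(ρ²(z₁² + z₂²) − 2ρ z₁ z₂) / (2(1 − ρ²))) with z₁ = Φ^{-1}(u) and z₂ = Φ^{-1}(v); the function names are hypothetical.

```python
import math
from statistics import NormalDist

STD_NORMAL = NormalDist()


def gaussian_copula_density(u, v, rho):
    """Bivariate Gaussian copula density (standard closed form, assumed here)."""
    z1, z2 = STD_NORMAL.inv_cdf(u), STD_NORMAL.inv_cdf(v)
    exponent = -(rho ** 2 * (z1 ** 2 + z2 ** 2) - 2.0 * rho * z1 * z2) / (
        2.0 * (1.0 - rho ** 2)
    )
    return math.exp(exponent) / math.sqrt(1.0 - rho ** 2)


def joint_pdf_from_copula(x, w, rho):
    """Joint PDF = copula density at the marginal CDF values times the
    marginal PDFs, here with standard Gaussian marginals."""
    c = gaussian_copula_density(STD_NORMAL.cdf(x), STD_NORMAL.cdf(w), rho)
    return c * STD_NORMAL.pdf(x) * STD_NORMAL.pdf(w)


def bivariate_normal_pdf(x, w, rho):
    """PDF of a standard bivariate Gaussian with correlation rho."""
    q = (x * x - 2.0 * rho * x * w + w * w) / (1.0 - rho ** 2)
    return math.exp(-0.5 * q) / (2.0 * math.pi * math.sqrt(1.0 - rho ** 2))


difference = abs(joint_pdf_from_copula(0.3, -1.1, 0.6)
                 - bivariate_normal_pdf(0.3, -1.1, 0.6))
```

Replacing the standard Gaussian marginals in `joint_pdf_from_copula` by other continuous marginals would keep the same dependence structure while changing the joint distribution, which is the separation that Sklar's theorem provides.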
3. The general case
Consider the additive Gaussian noise channel model (1). Let
$\mathbf{X}$
and
$\mathbf{W}_{t}$
be two dependent random vectors with a differentiable joint PDF
$f_{\mathbf{X},\mathbf{W}_{t}}(\mathbf{x},\mathbf{w}_{t})$
. Then, for the PDF of
$\mathbf{Y}$
, we obtain
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20231221162030379-0164:S0021900222001280:S0021900222001280_eqn16.png?pub-status=live)
where
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20231221162030379-0164:S0021900222001280:S0021900222001280_eqnU9.png?pub-status=live)
First, recall that assuming
$\mathbf{X}$
and
$\mathbf{W}_{t}$
are independent random vectors and
$\Sigma_{\mathbf{W}_{t}}=t\mathbf{I}_{m}$
, [7, 17] used the heat equation given by
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20231221162030379-0164:S0021900222001280:S0021900222001280_eqnU10.png?pub-status=live)
in their proofs. We now need to generalize this heat equation to the case of multivariate random vectors, as below.
Lemma 1.
Suppose that
$\mathbf{W}_{t}$
in channel model (1) has the covariance matrix (2), and let
$\mathbf{X}$
and
$\mathbf{W}_{t}$
be two dependent random vectors whose dependence structure is modeled by the multivariate Gaussian copula (14). Then, we have
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20231221162030379-0164:S0021900222001280:S0021900222001280_eqn17.png?pub-status=live)
where
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20231221162030379-0164:S0021900222001280:S0021900222001280_eqnU11.png?pub-status=live)
Proof. Using (10) and (14), by setting
$\mathbf{T}=(\mathbf{X},\mathbf{W}_{t})$
and
$n=2m$
, we have
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20231221162030379-0164:S0021900222001280:S0021900222001280_eqnU12.png?pub-status=live)
where
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20231221162030379-0164:S0021900222001280:S0021900222001280_eqnU13.png?pub-status=live)
in which
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20231221162030379-0164:S0021900222001280:S0021900222001280_eqnU14.png?pub-status=live)
because
$W_{t,k}$
,
$k=1,2,\ldots,m$
, are normally distributed with zero mean and variance t. Thus,
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20231221162030379-0164:S0021900222001280:S0021900222001280_eqnU15.png?pub-status=live)
By some easy calculations, this expression can be rewritten as (17).
Lemma 2. Based on the same assumptions as in Lemma 1, we have
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20231221162030379-0164:S0021900222001280:S0021900222001280_eqn18.png?pub-status=live)
in which
$q(\mathbf{y};\ t) = (q_{1}(\mathbf{y};\ t),q_{2}(\mathbf{y};\ t),\ldots,q_{m}(\mathbf{y};\ t))$
and
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20231221162030379-0164:S0021900222001280:S0021900222001280_eqnU16.png?pub-status=live)
where
$p_{j}(\mathbf{y};\ t) = {{{{\bf E}}}}_{\mathbf{X}\mid \mathbf{Y}}\big[ \sum_{i=1}^{m}\Phi^{-1}(F_{X_{i}}(X_{i})) + ({1}/{\sqrt{t}})\sum_{k\neq j}(Y_{k}-X_{k}) \mid \mathbf{Y} = \mathbf{y}\big]$
.
Proof. According to Lemma 1, differentiating (17) with respect to t and
$y_{j}$
yields
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20231221162030379-0164:S0021900222001280:S0021900222001280_eqn19.png?pub-status=live)
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20231221162030379-0164:S0021900222001280:S0021900222001280_eqn20.png?pub-status=live)
respectively. Thus, for the second-order derivative of (17) with respect to
$y_{j}$
, we obtain
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20231221162030379-0164:S0021900222001280:S0021900222001280_eqn21.png?pub-status=live)
Now, according to (16) and (19), we have
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20231221162030379-0164:S0021900222001280:S0021900222001280_eqn22.png?pub-status=live)
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20231221162030379-0164:S0021900222001280:S0021900222001280_eqn23.png?pub-status=live)
Thus, due to (19), by combining (22) with (23), we obtain
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20231221162030379-0164:S0021900222001280:S0021900222001280_eqnU17.png?pub-status=live)
where
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20231221162030379-0164:S0021900222001280:S0021900222001280_eqnU18.png?pub-status=live)
Therefore, using (20), the proof is complete.
Now, we need to derive the first- and second-order derivatives of the differential entropy
$h(\mathbf{Y})$
, which are key instruments in establishing our main result.
Theorem 1.
Under the assumptions of Lemma 2, the first-order derivative of the entropy
$h(\mathbf{Y})$
is given by
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20231221162030379-0164:S0021900222001280:S0021900222001280_eqn24.png?pub-status=live)
where
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20231221162030379-0164:S0021900222001280:S0021900222001280_eqnU19.png?pub-status=live)
Proof. Using (18), we obtain
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20231221162030379-0164:S0021900222001280:S0021900222001280_eqn25.png?pub-status=live)
To apply Green’s identity (6) to the second term in (25), we assume that
$V_{r}$
is the
$m$-sphere of radius r centered at the origin with boundary
$S_{r}=\partial V_{r}$
. Now, we apply Green’s identity to the second term in (25) with
$\phi(\mathbf{y})=\log f_{\mathbf{Y}}(\mathbf{y};\ t)$
and
$\psi(\mathbf{y})=f_{\mathbf{Y}}(\mathbf{y};\ t)$
, and then take the limit on both sides as
$r\rightarrow +\infty$
. Thus,
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20231221162030379-0164:S0021900222001280:S0021900222001280_eqn26.png?pub-status=live)
where
$\mathbf{n}_{S_{r}}$
is the unit vector normal to the surface
$S_{r}$
. Consider the identity
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20231221162030379-0164:S0021900222001280:S0021900222001280_eqn27.png?pub-status=live)
where
$\mathbf{F}\colon\Bbb{R}^{m}\rightarrow \Bbb{R}^{m}$
. We set
$\mathbf{F}(\mathbf{y})=q(\mathbf{y};\ t)$
and
$\phi(\mathbf{y})=\log f_{\mathbf{Y}}(\mathbf{y};\ t)$
, and then, using Stokes’ theorem (7) and taking limits on both sides as
$r\rightarrow +\infty$
, we get
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20231221162030379-0164:S0021900222001280:S0021900222001280_eqn28.png?pub-status=live)
In Appendix A, the surface integrals in (26) and (28) over the surface
$S_{r}$
are shown to vanish as r approaches
$+\infty$
. Therefore, by substituting (26) and (28) into (25), the theorem is proved.
Remark 2. Note that, in Theorem 1, from (24) with
$\rho=0$
, we obtain
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20231221162030379-0164:S0021900222001280:S0021900222001280_eqnU20.png?pub-status=live)
That is, the first-order derivative of the entropy
$h(\mathbf{Y})$
reduces to the case when
$\mathbf{X}$
and
$\mathbf{W}_{t}$
are independent random vectors with
$\Sigma_{\mathbf{W}_{t}}=t\mathbf{I}_{m}$
as in [7].
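For the independent case of Remark 2, the classical de Bruijn identity can be checked numerically on a Gaussian input, where everything is available in closed form. The sketch below assumes the classical form (∂/∂t) h(Y) = ½ J(Y), with J(Y) taken as the trace of the Fisher information matrix, and uses X ~ N(0, σ² I_m) independent of W_t ~ N(0, t I_m); the values of `sigma2` and `m` and the function names are hypothetical.

```python
import math


def gaussian_entropy(t, sigma2=1.5, m=3):
    """h(Y) in nats for Y = X + W_t with X ~ N(0, sigma2 * I_m) and
    W_t ~ N(0, t * I_m) independent, so that Y ~ N(0, (sigma2 + t) * I_m)."""
    return 0.5 * m * math.log(2.0 * math.pi * math.e * (sigma2 + t))


def gaussian_fisher_information(t, sigma2=1.5, m=3):
    """J(Y) = E[|grad log f_Y|^2] = m / (sigma2 + t) for the Gaussian above."""
    return m / (sigma2 + t)


def central_derivative(func, t, eps=1e-5):
    """Central finite-difference approximation of d func / dt."""
    return (func(t + eps) - func(t - eps)) / (2.0 * eps)


t0 = 0.8
lhs = central_derivative(gaussian_entropy, t0)   # d h(Y) / dt
rhs = 0.5 * gaussian_fisher_information(t0)      # J(Y) / 2
```

Here `lhs` and `rhs` agree up to finite-difference error. This only exercises the ρ = 0 special case and does not test the additional terms that appear in (24) under dependence.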
According to Theorem 1, to provide the second-order derivative of
$h(\mathbf{Y})$
, it is sufficient to derive the first-order derivative of the Fisher information
$J(\mathbf{Y})$
. First, we need the following lemma.
Lemma 3. According to Lemma 2, the following two equations hold:
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20231221162030379-0164:S0021900222001280:S0021900222001280_eqn29.png?pub-status=live)
where
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20231221162030379-0164:S0021900222001280:S0021900222001280_eqn30.png?pub-status=live)
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20231221162030379-0164:S0021900222001280:S0021900222001280_eqn31.png?pub-status=live)
and
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20231221162030379-0164:S0021900222001280:S0021900222001280_eqn32.png?pub-status=live)
Proof. Simply, we know that
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20231221162030379-0164:S0021900222001280:S0021900222001280_eqnU21.png?pub-status=live)
Also, from (18), we can write
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20231221162030379-0164:S0021900222001280:S0021900222001280_eqnU22.png?pub-status=live)
which implies (29). To prove (30), we have
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20231221162030379-0164:S0021900222001280:S0021900222001280_eqnU23.png?pub-status=live)
Also, since
$\nabla\cdot q(\mathbf{y};\ t)=\sum_{j=1}^{m}(\partial/\partial y_{j})q_{j}(\mathbf{y};\ t)$
, (31) is obtained. Now, to prove (32), we obtain
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20231221162030379-0164:S0021900222001280:S0021900222001280_eqn33.png?pub-status=live)
where
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20231221162030379-0164:S0021900222001280:S0021900222001280_eqnU24.png?pub-status=live)
Together with (33), this completes the proof.
Theorem 2.
Under the conditions of Lemma 2, the first-order derivative of the Fisher information
$J(\mathbf{Y})$
is as follows:
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20231221162030379-0164:S0021900222001280:S0021900222001280_eqn34.png?pub-status=live)
where
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20231221162030379-0164:S0021900222001280:S0021900222001280_eqnU25.png?pub-status=live)
Proof. According to the Fisher information (4), we know that
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20231221162030379-0164:S0021900222001280:S0021900222001280_eqn35.png?pub-status=live)
Based on Lemma 2, the first term in (35) is expressed as
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20231221162030379-0164:S0021900222001280:S0021900222001280_eqn36.png?pub-status=live)
By applying Green’s identity (6) to the first term in (36) and taking the limit as r tends to
$+\infty$
, we obtain
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20231221162030379-0164:S0021900222001280:S0021900222001280_eqn37.png?pub-status=live)
Similarly, using Green’s identity for the second term in (37) and taking the limit, we have
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20231221162030379-0164:S0021900222001280:S0021900222001280_eqn38.png?pub-status=live)
The first terms in (37) and (38) can be shown to vanish (see Appendix B), and therefore, by comparing (37) with (38), we can write
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20231221162030379-0164:S0021900222001280:S0021900222001280_eqnU26.png?pub-status=live)
Substituting this into (36) yields
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20231221162030379-0164:S0021900222001280:S0021900222001280_eqn39.png?pub-status=live)
Also, by using (29) in Lemma 3, the second term in (35) can be rewritten as
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20231221162030379-0164:S0021900222001280:S0021900222001280_eqn40.png?pub-status=live)
Now, according to (30), we obtain
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20231221162030379-0164:S0021900222001280:S0021900222001280_eqnU27.png?pub-status=live)
Therefore, from this and (38), we have
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20231221162030379-0164:S0021900222001280:S0021900222001280_eqn41.png?pub-status=live)
Thanks to the identity (31), for the second term in (40) we obtain
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20231221162030379-0164:S0021900222001280:S0021900222001280_eqn42.png?pub-status=live)
Using Green’s identity, we arrive at
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20231221162030379-0164:S0021900222001280:S0021900222001280_eqn43.png?pub-status=live)
whose first term becomes zero (see Appendix B). Using the identity
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20231221162030379-0164:S0021900222001280:S0021900222001280_eqnU28.png?pub-status=live)
the second term in (43) is rewritten as
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20231221162030379-0164:S0021900222001280:S0021900222001280_eqnU29.png?pub-status=live)
By combining this with (42) and (43), we get
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20231221162030379-0164:S0021900222001280:S0021900222001280_eqn44.png?pub-status=live)
Also, we have
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20231221162030379-0164:S0021900222001280:S0021900222001280_eqn45.png?pub-status=live)
whose first term vanishes (see Appendix B), and
$p(\mathbf{y};\ t)=(p_{1}(\mathbf{y};\ t),p_{2}(\mathbf{y};\ t),\ldots,p_{m}(\mathbf{y};\ t))$
. From (45), combining (35), (39), (40), (41), and (44), we obtain
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20231221162030379-0164:S0021900222001280:S0021900222001280_eqnU30.png?pub-status=live)
Hence, based on the relation (32) in Lemma 3, the proof is complete.
Remark 3. It is interesting to see that, if we put
$\rho=0$
in (34), it reduces to
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20231221162030379-0164:S0021900222001280:S0021900222001280_eqnU31.png?pub-status=live)
That is, Theorem 2 recovers the case where
$\mathbf{X}$
and
$\mathbf{W}_{t}$
are independent random vectors as a special case. Hence, Theorem 2 encompasses the result of [17] as a corollary.
We can now establish the main result of this paper.
Theorem 3.
Let
$\mathbf{X}$
and
$\mathbf{W}_{t}$
in channel model (1) be two dependent random vectors whose dependence structure is modeled by the multivariate Gaussian copula. For any
$\rho>-1/(2m-1)$
, under the conditions
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20231221162030379-0164:S0021900222001280:S0021900222001280_eqn46.png?pub-status=live)
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20231221162030379-0164:S0021900222001280:S0021900222001280_eqn47.png?pub-status=live)
the entropy power
$N(\mathbf{X}+\mathbf{W}_{t})$
is concave in t, i.e.
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20231221162030379-0164:S0021900222001280:S0021900222001280_eqnU32.png?pub-status=live)
Proof. Simply, we have
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20231221162030379-0164:S0021900222001280:S0021900222001280_eqnU33.png?pub-status=live)
Since the entropy power is nonnegative, to show that
$(\partial^{2}/\partial t^{2})N(\mathbf{Y})\leq 0$
, it is sufficient to prove that
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20231221162030379-0164:S0021900222001280:S0021900222001280_eqnU34.png?pub-status=live)
Based on Theorem 1, this is equivalent to
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20231221162030379-0164:S0021900222001280:S0021900222001280_eqnU35.png?pub-status=live)
Thus, since
$\rho>-1/(2m-1)$
and
$\delta(\rho ,m)>0$
, due to the condition (46a), we must prove that
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20231221162030379-0164:S0021900222001280:S0021900222001280_eqnU36.png?pub-status=live)
According to the proof of the proposition in [17, p. 3], we have
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20231221162030379-0164:S0021900222001280:S0021900222001280_eqn48.png?pub-status=live)
Hence, according to Theorem 2, (47), and assumption (46b), the proof is complete.
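The reduction used at the start of the proof follows from the chain rule: if N = exp(2h/m)/(2πe), then ∂²N/∂t² = (2/m) N [∂²h/∂t² + (2/m)(∂h/∂t)²], so concavity of N is equivalent to the bracket being nonpositive. The sketch below checks this on an independent Gaussian example, X ~ N(0, diag(λ₁, …, λ_m)) with W_t ~ N(0, t I_m), where h and its derivatives are explicit. The normalization of N, the eigenvalues, and the function names are assumptions of this illustration; the dependent case of Theorem 3 is not covered.

```python
import math

# Hypothetical eigenvalues of the covariance matrix of a Gaussian input X.
EIGENVALUES = (0.5, 1.0, 3.0)
M = len(EIGENVALUES)


def h(t):
    """Differential entropy of Y = X + W_t ~ N(0, diag(lambda_i + t))."""
    return 0.5 * sum(math.log(2.0 * math.pi * math.e * (lam + t))
                     for lam in EIGENVALUES)


def h_first(t):
    """First derivative of h with respect to t."""
    return 0.5 * sum(1.0 / (lam + t) for lam in EIGENVALUES)


def h_second(t):
    """Second derivative of h with respect to t."""
    return -0.5 * sum(1.0 / (lam + t) ** 2 for lam in EIGENVALUES)


def entropy_power(t):
    """N(Y) = exp(2 h(Y) / m) / (2 pi e) (assumed normalization)."""
    return math.exp(2.0 * h(t) / M) / (2.0 * math.pi * math.e)


def concavity_criterion(t):
    """Bracket h'' + (2/m) (h')^2; N is concave at t iff this is <= 0."""
    return h_second(t) + (2.0 / M) * h_first(t) ** 2


def entropy_power_second_derivative(t):
    """Second derivative of N obtained from the chain rule."""
    return (2.0 / M) * entropy_power(t) * concavity_criterion(t)
```

For this example the criterion is nonpositive by the Cauchy-Schwarz inequality, with equality only when all eigenvalues coincide, and `entropy_power_second_derivative` can be compared against a finite-difference second derivative of `entropy_power`.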
4. The one-dimensional case
In this section, by considering the channel model (1) with
$m=1$
, we describe special versions of our main results.
Corollary 1.
Let X and
$W_{t}$
in the channel model
$Y=X+W_{t}$
be dependent one-dimensional random variables, and let
$W_{t}$
be normally distributed with mean zero and variance t. If their dependence structure is modeled by the bivariate Gaussian copula (15), then
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20231221162030379-0164:S0021900222001280:S0021900222001280_eqnU37.png?pub-status=live)
where
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20231221162030379-0164:S0021900222001280:S0021900222001280_eqn49.png?pub-status=live)
in which
$p'(y;\ t)={{{{\bf E}}}}_{X\mid Y}[\Phi^{-1}(F_{X}(X))\mid Y=y]$
.
Proof. Since
$W_{t}$
is normally distributed with mean zero and variance t, from (15),
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20231221162030379-0164:S0021900222001280:S0021900222001280_eqnU38.png?pub-status=live)
Thus, by some simple calculations, we obtain
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20231221162030379-0164:S0021900222001280:S0021900222001280_eqn50.png?pub-status=live)
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20231221162030379-0164:S0021900222001280:S0021900222001280_eqn51.png?pub-status=live)
Now, by comparing (49) with (50), we obtain
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20231221162030379-0164:S0021900222001280:S0021900222001280_eqn52.png?pub-status=live)
in which
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20231221162030379-0164:S0021900222001280:S0021900222001280_eqn53.png?pub-status=live)
where
$p'(y;\ t)={{{{\bf E}}}}_{X\mid Y}[\Phi^{-1}(F_{X}(X))\mid Y=y]$
. Hence,
$q_{j}(\mathbf{y};\ t)$
and
$p_{j}(\mathbf{y};\ t)$
in Lemma 2 reduce to
$q'(y;\ t)$
and
$p'(y;\ t)$
, respectively. Now, since X and
$W_{t}$
are one-dimensional, it is sufficient to set
$m=1$
and
$p_{j}(\mathbf{y};\ t)=p'(y;\ t)$
in (24). Therefore, the proof is complete.
Remark 4. Corollary 1 is equivalent to a result in [Reference Khoolenjani and Alamatsaz12].
Now, under the same conditions as in Corollary 1, according to the relations (51) and (52), the first-order derivative of the Fisher information,
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20231221162030379-0164:S0021900222001280:S0021900222001280_eqn54.png?pub-status=live)
simply follows by setting
$m=1$
and
$p_{j}(\mathbf{y};\ t)=p'(y;\ t)$
in (34). This coincides with the result in [Reference Asgari, Alamatsaz and Khoolenjani4], where a direct proof of (53) is provided.
Using the first-order derivatives of the entropy and the Fisher information of the output signal Y, we now obtain the concavity of Shannon’s entropy power in the one-dimensional case.
Corollary 2.
Given the channel model (1), assume that X and
$W_{t}$
are dependent random variables whose dependence structure is modeled by the bivariate Gaussian copula (14). Under the assumptions
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20231221162030379-0164:S0021900222001280:S0021900222001280_eqn55.png?pub-status=live)
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20231221162030379-0164:S0021900222001280:S0021900222001280_eqn56.png?pub-status=live)
the entropy power
$N(X+W_{t})$
is concave in t.
Example 1. Consider the channel model
$Y=X+W_{t}$
with
$W_{t}=\sqrt{t}W$
. Let X be standard Gaussian and suppose that X and
$W_{t}$
are jointly distributed according to the bivariate Gaussian copula, i.e. X and W are two dependent random variables distributed according to a bivariate standard Gaussian distribution with the PDF
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20231221162030379-0164:S0021900222001280:S0021900222001280_eqnU39.png?pub-status=live)
Because Y is a linear combination of the jointly Gaussian variables X and W, it is normally distributed with mean zero and variance
$1+t+2\sqrt{t}\rho$
. Thus, since
$(X,Y)\sim N_{2}(\mathbf{0},\Sigma_{X,Y})$
with
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20231221162030379-0164:S0021900222001280:S0021900222001280_eqnU40.png?pub-status=live)
we have
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20231221162030379-0164:S0021900222001280:S0021900222001280_eqnU41.png?pub-status=live)
Further, we observe that
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20231221162030379-0164:S0021900222001280:S0021900222001280_eqnU42.png?pub-status=live)
Thus, by (48), we can write
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20231221162030379-0164:S0021900222001280:S0021900222001280_eqnU43.png?pub-status=live)
As we can see, both conditions (54a) and (54b) are satisfied when
$\rho>0$
. Thus, based on Corollary 2,
$N(X+W_{t})$
is concave in t.
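The conclusion of Example 1 can be checked numerically. The sketch below assumes the standard one-dimensional form of Shannon's entropy power, $N(Y)=\mathrm{e}^{2h(Y)}/(2\pi\mathrm{e})$, and the Gaussian entropy $h(Y)=\tfrac{1}{2}\log(2\pi\mathrm{e}\,\mathrm{Var}(Y))$, so that $N(X+W_{t})=1+t+2\rho\sqrt{t}$ with second derivative $-\rho/(2t^{3/2})$. The function names are hypothetical and are introduced only for illustration; the finite-difference test is a sanity check on a grid, not a proof.

```python
import math


def output_variance(t: float, rho: float) -> float:
    """Var(Y) for Y = X + sqrt(t) * W with Var(X) = Var(W) = 1 and Corr(X, W) = rho."""
    return 1.0 + t + 2.0 * rho * math.sqrt(t)


def entropy_power(t: float, rho: float) -> float:
    """Entropy power of the Gaussian output Y (assumed form: exp(2 h) / (2 pi e))."""
    h = 0.5 * math.log(2.0 * math.pi * math.e * output_variance(t, rho))
    return math.exp(2.0 * h) / (2.0 * math.pi * math.e)


def second_derivative(t: float, rho: float, step: float = 1e-3) -> float:
    """Central finite-difference estimate of the second derivative of N in t."""
    left = entropy_power(t - step, rho)
    mid = entropy_power(t, rho)
    right = entropy_power(t + step, rho)
    return (right - 2.0 * mid + left) / step**2


def is_concave_on_grid(rho: float, grid: list) -> bool:
    """True if the estimated second derivative is non-positive at every grid point."""
    return all(second_derivative(t, rho) <= 0.0 for t in grid)


# Grid of t values in (0, 5]; the step above keeps t - step positive.
grid = [0.1 * k for k in range(1, 51)]

print(is_concave_on_grid(0.5, grid))   # positive correlation
print(is_concave_on_grid(-0.5, grid))  # negative correlation
```

For $\rho>0$ the estimated second derivative stays negative on the grid, in line with Example 1, whereas for $\rho<0$ the closed form $-\rho/(2t^{3/2})$ is positive, so concavity fails in this Gaussian setting.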
5. Conclusions
In this paper, assuming a multivariate Gaussian copula dependence structure, we have derived the first- and second-order derivatives of the differential entropy of the output signal in the m-dimensional additive Gaussian noise channel model. Using these derivatives, we have generalized Costa’s concavity inequality to the case where the coordinates of the input signal and noise are dependent according to a multivariate Gaussian copula model. We have also specialized our results to the one-dimensional case and provided an illustrative example.
Appendix A. Vanishing surface integrals of Theorem 1
We need to prove that
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20231221162030379-0164:S0021900222001280:S0021900222001280_eqn57.png?pub-status=live)
We first assume that
$h(\mathbf{Y})$
is finite. Next, we integrate the surface integral in (55) over
$r\geq 0$
and then, by applying the identity (27) and Stokes’ theorem, we obtain
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20231221162030379-0164:S0021900222001280:S0021900222001280_eqn58.png?pub-status=live)
Since the limit in the first part of (56) exists, due to
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20231221162030379-0164:S0021900222001280:S0021900222001280_eqnU44.png?pub-status=live)
the first term in (56) vanishes. Now, since
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20231221162030379-0164:S0021900222001280:S0021900222001280_eqnU45.png?pub-status=live)
for the second term in (56) we can write
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20231221162030379-0164:S0021900222001280:S0021900222001280_eqn59.png?pub-status=live)
Further, we know that
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20231221162030379-0164:S0021900222001280:S0021900222001280_eqn60.png?pub-status=live)
On the other hand, from (20), we have
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20231221162030379-0164:S0021900222001280:S0021900222001280_eqn61.png?pub-status=live)
Now, since for all
$j=1,2,\ldots,m$
,
$\vert E(W_{t,j}\mid \mathbf{Y}=\mathbf{y})\vert<+\infty$
, the first and second terms in (59) must be finite too. Therefore, we have
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20231221162030379-0164:S0021900222001280:S0021900222001280_eqn62.png?pub-status=live)
and, due to (58), the right-hand side of inequality (57) is finite. Hence, the integral in (56) is finite and, since the limit in (55) exists, the desired result (55) is proved.
Now, we need to prove that
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20231221162030379-0164:S0021900222001280:S0021900222001280_eqn63.png?pub-status=live)
To this end, we consider the integral from
$r=0$
to
$r=+\infty$
of the surface integral. Thus, we have
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20231221162030379-0164:S0021900222001280:S0021900222001280_eqn64.png?pub-status=live)
Since
$f_{\mathbf{Y}}(\mathbf{y};\ t)$
converges to zero as
$\|\mathbf{y}\|\rightarrow\infty$
, we have
$f_{\mathbf{Y}}(\mathbf{y};\ t)\log f_{\mathbf{Y}}(\mathbf{y};\ t)\rightarrow 0$
as
$\|\mathbf{y}\|\rightarrow\infty$
. Therefore,
$\log f_{\mathbf{Y}}(\mathbf{y};\ t)$
is finite and, due to (59), the right-hand side of (62) becomes finite. Hence, since the limit in (61) exists, we can conclude the relation in (61).
Appendix B. Vanishing surface integrals of Theorem 2
We intend to prove that
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20231221162030379-0164:S0021900222001280:S0021900222001280_eqn65.png?pub-status=live)
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20231221162030379-0164:S0021900222001280:S0021900222001280_eqn66.png?pub-status=live)
First, we consider the integral of the surface integral in (63) over
$r\geq 0$
:
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20231221162030379-0164:S0021900222001280:S0021900222001280_eqn67.png?pub-status=live)
By (58) and (60), the right-hand side of (65) is finite and, since the limit
$u_{1}$
exists, this proves that
$u_{1}=0$
.
To show that
$u_{2}=0$
, we write
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20231221162030379-0164:S0021900222001280:S0021900222001280_eqn68.png?pub-status=live)
Because
$\big\vert\int_{0}^{+\infty}\int_{S_{r}}{f_{\mathbf{Y}}(\mathbf{y};\ t) \|\nabla\log f_{\mathbf{Y}}(\mathbf{y};\ t)\|^{2}\,\mathrm{d} S_{r}}\,\mathrm{d} r\big\vert =\vert J(\mathbf{Y})\vert<+\infty$
and
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20231221162030379-0164:S0021900222001280:S0021900222001280_eqnU46.png?pub-status=live)
the first term in (66) becomes zero and the absolute value of the second term is finite. Thus, since the limit
$u_{2}$
exists, we have
$u_{2}=0$
.
In a similar way, we consider the integral from
$r=0$
to
$r=+\infty$
of the surface integral in (64):
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20231221162030379-0164:S0021900222001280:S0021900222001280_eqn69.png?pub-status=live)
Using (21), we have
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20231221162030379-0164:S0021900222001280:S0021900222001280_eqn70.png?pub-status=live)
Also, from (20), we obtain
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20231221162030379-0164:S0021900222001280:S0021900222001280_eqn71.png?pub-status=live)
Since, for all
$j=1,2,\ldots,m$
,
${\bf E}(W_{t,j}^{2}\mid \mathbf{Y}=\mathbf{y})<+\infty$
, the first, third, and fourth terms in (68) are finite too and, due to (69),
$(\partial/\partial y_{j})q_{j}(\mathbf{y};\ t)$
is finite as well. Therefore, from (59), the right-hand side of (67) is finite and, together with the fact that the limit
$u_{3}$
exists, it follows that
$u_{3}=0$
.
Similarly, to show that
$u_{4}=0$
, we find the sequence of relations
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20231221162030379-0164:S0021900222001280:S0021900222001280_eqnU47.png?pub-status=live)
Using similar steps, we can see that
$u_{4}=0$
.
Acknowledgements
We express our gratitude to the associate editor and the anonymous reviewers whose comments had a noticeable impact on improving the manuscript.
Funding information
There are no funding bodies to thank relating to the creation of this article.
Competing interests
There were no competing interests to declare which arose during the preparation or publication process of this article.