
RANDOM FINITE SUBSETS WITH EXPONENTIAL DISTRIBUTIONS

Published online by Cambridge University Press:  15 December 2006

Kyle Siegrist
Affiliation:
Department of Mathematical Sciences, University of Alabama in Huntsville, Huntsville, AL, E-mail: siegrist@math.uah.edu

Abstract

Let S denote the collection of all finite subsets of ℕ₊, the set of positive integers. We define an operation on S that makes S into a positive semigroup with set inclusion as the associated partial order. Positive semigroups are the natural home for probability distributions with exponential properties, such as the memoryless and constant rate properties. We show that there are no exponential distributions on S, but that S can be partitioned into subsemigroups, each of which supports a one-parameter family of exponential distributions. We then find the distribution on S that is closest to exponential, in a certain sense. This work might have applications to the problem of selecting a finite sample from a countably infinite population in the most random way.

Type
Research Article
Copyright
© 2007 Cambridge University Press

1. INTRODUCTION

The motivation for this article is the problem of selecting a finite set of positive integers in “the most random way possible.” The phrase in quotes is ambiguous, of course, but we will at least obtain a partial solution by considering the collection of finite sets as a positive semigroup (S,·) whose associated partial order is set inclusion. Positive semigroups are the natural mathematical home for probability distributions with exponential-type properties, including the memoryless and constant failure rate properties, special properties related to uniform distributions, and maximum entropy properties. All of these relate, in some sense, to the degree of randomness of the distribution.

More specifically, our goal is to describe the semigroup (S,·) and study probability distributions on S that have exponential properties. We will show that there are no exponential distributions on S, but that S naturally partitions into a countable collection of subsemigroups, each of which supports a one-parameter family of exponential distributions. We will then construct a two-parameter family of distributions on S that are close to exponential in a strong sense. For basic information about probability distributions on semigroups, see Högnäs and Mukherjea [2]. For properties and characterizations of the standard exponential distribution, see Azlarov and Volodin [1]. For the general theory of random sets, see Matheron [3].

2. POSITIVE SEMIGROUPS

A semigroup (S, ·) consists of a set S and a binary operation · on S that is associative:

x(yz) = (xy)z,  x, y, z ∈ S.
A positive semigroup is a semigroup (S, ·) that has an identity e, satisfies the left cancellation law, and has no nontrivial inverses; that is, for all x, y, z ∈ S, the following hold:

  1. xe = ex = x.
  2. xy = xz implies y = z.
  3. xy = e implies x = y = e.

The relation ⪯ on S defined by x ⪯ y if and only if y = xt for some t ∈ S is a partial order on S; that is, x ⪯ x for each x ∈ S (the reflexive property), x ⪯ y and y ⪯ x imply x = y (the antisymmetric property), and x ⪯ y and y ⪯ z imply x ⪯ z (the transitive property). If x ⪯ y, then the element t ∈ S satisfying xt = y is unique and is denoted x⁻¹y. For each x ∈ S, the mapping t ↦ xt is an order isomorphism from S onto the set xS = {y ∈ S : x ⪯ y}; that is, s ⪯ t if and only if xs ⪯ xt for all s, t, x ∈ S. Indeed, the algebraic assumptions are precisely the ones needed for the partially ordered set (S, ⪯) to have this self-similarity property.

Positive semigroups are often found embedded in groups. Specifically, suppose that (G, ·, ⪯) is a left-ordered group; that is, (G, ·) is a group and ⪯ is a partial order on G satisfying x ⪯ y implies zx ⪯ zy for x, y, z ∈ G. Let e denote the identity element of G and let S = {x ∈ G : x ⪰ e} denote the set of positive elements of G. Then (S, ·) is a positive semigroup and ⪯ restricted to S is the associated partial order. Conversely, suppose that (G, ·) is a group and that S is a positive subsemigroup of G. For x, y ∈ G, define x ⪯ y if and only if xt = y for some t ∈ S. Then (G, ·, ⪯) is a left-ordered group with S as the set of positive elements. On the other hand, as this article hopefully illustrates, there are interesting positive semigroups that cannot be embedded in groups.

In general, topological and measure-theoretic assumptions are imposed on S as well, but in this article we will only be concerned with discrete semigroups (where S is countably infinite). In this case, the one additional assumption needed is that [e, x] = {t ∈ S : t ⪯ x} is finite for each x ∈ S, so that the partially ordered set (S, ⪯) is locally finite. Note that counting measure # is left-invariant for (S, ·); that is, #(xA) = #(A) for x ∈ S and A ⊆ S. Moreover, # is the unique left-invariant measure, up to multiplication by positive constants.

If T is a nonempty subset of S that contains e and is closed under the operation ·, then (T, ·) is also a positive semigroup, since the other algebraic assumptions are simply inherited. However, the partial order associated with T is not, in general, the restriction of ⪯ to T, but a sub-partial order of this restriction; that is, if x, y ∈ T, then x ⪯_T y implies x ⪯ y, but the converse is not true unless x⁻¹y ∈ T. If T is closed under · but does not contain e, then T ∪ {e} is also closed under · and, hence, is a positive subsemigroup of S.

3. EXPONENTIAL DISTRIBUTIONS

Suppose that (S, ·) is a (discrete) positive semigroup and that X is a random variable taking values in S (so that P(X = x) > 0 for each x ∈ S). The tail probability function of X is the mapping x ↦ P(X ⪰ x). In general, this function does not uniquely determine the distribution of X.

Random variable X has an exponential distribution if

P(X ∈ xA | X ⪰ x) = P(X ∈ A),  x ∈ S, A ⊆ S.    (1)

Equivalently, the conditional distribution of x⁻¹X given X ⪰ x is the same as the distribution of X for each x ∈ S. Random variable X has a memoryless distribution if (1) holds for all x ∈ S and all A of the form yS, where y ∈ S. Equivalently,

P(X ⪰ xy | X ⪰ x) = P(X ⪰ y),  x, y ∈ S,
so that the conditional tail probability function of x⁻¹X given X ⪰ x is the same as the tail probability function of X. In the language of reliability theory, X has constant failure rate (or simply constant rate) if the (discrete) probability density function f is proportional to the tail probability function F:

f(x) = α F(x),  x ∈ S,

for some positive constant α. In general,

∑_{x∈S} P(X ⪰ x) = E(#[e, X]),

so that if X has constant rate, then the rate constant must be

α = 1/E(#[e, X]).
There are a number of nice properties and characterizations of exponential distributions on positive semigroups, perhaps a bit surprising given the minimal algebraic assumptions. We mention a few of these; for more details, see Siegrist [5,6,7]. First, X has an exponential distribution on S if and only if X is memoryless and has constant rate. In general, however, a distribution can have one of these properties (memoryless or constant rate), but not the other. Second, F : S → (0,1] is the tail probability function of an exponential distribution on S if and only if F(xy) = F(x)F(y) for all x, y ∈ S and

∑_{x∈S} F(x) < ∞

(the reciprocal of this sum is then the rate constant). This characterization prescribes a method for finding all exponential distributions on a given positive semigroup. Finally, suppose that X and Y are independent and identically distributed (i.i.d.) on S. Then the common distribution is exponential if and only if the conditional distribution of X given XY = z is uniform on [e, z] for each z ∈ S. Furthermore, if X1, X2, … are i.i.d. exponential and Yn = X1 ··· Xn is the corresponding “gamma” variable of order n ∈ ℕ₊, then the conditional distribution of (Y1, …, Yn) given Yn+1 = z is uniform on the set

{(y1, …, yn) : e ⪯ y1 ⪯ y2 ⪯ ··· ⪯ yn ⪯ z}.
For general positive semigroups, counting measure # is replaced by a left-invariant measure λ; sums become integrals with respect to λ; and density functions and uniform distributions are with respect to λ as well. At least in terms of the algebraic structure, exponential distributions specify the most random way to select elements of S. Of course, the motivating example for this theory is the positive semigroup ([0,∞),+). The associated partial order is the ordinary order and Lebesgue measure is the invariant measure. The exponential distributions for this positive semigroup are the ordinary exponential distributions. Rowell and Siegrist [4] explore positive semigroups isomorphic to the standard one, in the context of reliability theory. We briefly mention a few discrete examples.

Example 1: Let ℕ = {0, 1, 2, …} denote the set of nonnegative integers. Then (ℕ, +) is a positive semigroup, and the associated partial order is the ordinary order ≤. The exponential distribution with rate parameter p ∈ (0,1) has tail probability function and density function given by

F(x) = (1 − p)^x,  f(x) = p(1 − p)^x,  x ∈ ℕ.
Of course, this is the standard geometric distribution with success parameter p.
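This can be checked numerically; the sketch below uses an arbitrary illustrative value p = 0.4:

```python
from math import isclose

# Exponential (= geometric) distribution on (N, +): with rate parameter p,
# tail F(x) = (1 - p)^x and density f(x) = p * (1 - p)^x.  p is illustrative.
p = 0.4
F = lambda x: (1 - p) ** x
f = lambda x: p * (1 - p) ** x

for x in range(10):
    for y in range(10):
        assert isclose(F(x + y), F(x) * F(y))  # memoryless: F(x + y) = F(x)F(y)
    assert isclose(f(x), p * F(x))             # constant rate p
```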

Example 2: Let ℕ₊ = {1, 2, …} denote the set of positive integers. Then (ℕ₊, ·) is a positive semigroup, where · is ordinary multiplication. The associated partial order is the division order: x ⪯ y if and only if x divides y. The exponential distributions turn out to be precisely the Dirichlet distributions with completely multiplicative coefficient functions; the zeta distribution is an important special case (see Siegrist [7]).

Example 3: Let A be a finite alphabet and let S = A* denote the space of all finite words with letters from A. The free semigroup (S, ·) is a positive semigroup where · is the concatenation operation. The identity is the “empty word” e, and x ⪯ y if and only if x is a prefix of y. An exponentially distributed variable has the form

X = X1 X2 ··· XL,

where the random letters X1, X2, … are i.i.d. on the alphabet A and where the length L is independent of (X1, X2, …) and has a geometric distribution on ℕ.
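A small sketch makes the tail function of such a word variable concrete; the letter distribution q and the parameter β (with P(L ≥ n) = β^n) are illustrative assumptions, not values from the text:

```python
from math import isclose

# Tail probability of an exponential word variable on A*: with i.i.d. letters
# drawn from q and an independent geometric length L with P(L >= n) = beta^n,
# the tail is F(w) = beta^len(w) * prod(q[letter]).  beta and q are illustrative.
beta = 0.7
q = {"a": 0.2, "b": 0.5, "c": 0.3}

def F(w):
    out = beta ** len(w)
    for ch in w:
        out *= q[ch]
    return out

# Multiplicative under concatenation, as an exponential tail must be.
for u in ["", "a", "ab", "cab"]:
    for v in ["", "b", "ca"]:
        assert isclose(F(u + v), F(u) * F(v))
```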

4. THE POSITIVE SEMIGROUP OF FINITE SUBSETS

Let S denote the set of all finite subsets of ℕ₊. It is easy to see that the partially ordered set (S, ⊆) satisfies the self-similarity property described in Section 2. Our goal in this section is to define and study the corresponding positive semigroup.

We identify a nonempty subset x of ℕ₊ with the function given by

x(i) = the ith smallest element of x,

with domain {1, 2, …, #(x)} if x is finite or with domain ℕ₊ if x is infinite. If x is nonempty and finite, max(x) denotes the maximum value of x; by convention, we take max(Ø) = 0 and max(x) = ∞ if x is infinite. Thus, #(x) ≤ max(x) for every x. If x and y are nonempty subsets of ℕ₊ with max(y) ≤ #(x), we let x ∘ y denote the subset whose function is the ordinary composition of x and y: (x ∘ y)(i) = x(y(i)). We also define x ∘ Ø = Ø for any x ⊆ ℕ₊. Note that x ∘ y is always defined when x is infinite.

We now define a binary operation · on S by

x · y = x ∪ (xᶜ ∘ y),  x, y ∈ S,

where xᶜ = ℕ₊ ∖ x denotes the complement of x. Note that the operation is well defined since xᶜ is infinite. The operation might seem contrived, but it is not. Up to a relabeling of the positive integers, there is only one way to associate a positive semigroup with the subset partial order.
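The operation is easy to transcribe into code. The following sketch (the function name product and the frozenset representation are mine) lists the complement of x just far enough to index every element of y:

```python
def product(x, y):
    """The product x·y = x ∪ (x^c ∘ y) on finite subsets of {1, 2, ...}."""
    x, y = set(x), set(y)
    if not y:
        return frozenset(x)                    # x · Ø = x
    # List the complement of x far enough to index every element of y.
    xc = [i for i in range(1, max(y) + len(x) + 1) if i not in x]
    return frozenset(x | {xc[j - 1] for j in y})

# Associative, but not commutative, and right cancellation fails:
a, b, c = {1, 3}, {2}, {1, 2, 4}
assert product(product(a, b), c) == product(a, product(b, c))
assert product({1}, {1}) == product({2}, {1}) == frozenset({1, 2})
```

The last assertion is the identity {i}{i} = {i + 1}{i} (with i = 1) that is used repeatedly below.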

Theorem 1: (S,·) is a positive semigroup with the subset partial order.

Proof: The associative rule holds, and in fact

x · y · z = x ∪ (xᶜ ∘ y) ∪ (xᶜ ∘ yᶜ ∘ z),  x, y, z ∈ S.

The empty set is the identity:

Ø · x = ℕ₊ ∘ x = x,  x · Ø = x ∪ (xᶜ ∘ Ø) = x.

The left-cancellation law holds: Suppose that xy = xz. Then x ∪ (xᶜ ∘ y) = x ∪ (xᶜ ∘ z) by definition and, hence, xᶜ ∘ y = xᶜ ∘ z since the pairs of sets in each union are disjoint. However, then y = z. There are no nontrivial inverses: If xy = Ø, then x ∪ (xᶜ ∘ y) = Ø. Hence, we must have x = Ø and, therefore, also y = Ø.

Finally, the associated partial order is the subset order. Suppose first that xu = y. Then x ∪ (xᶜ ∘ u) = y, so x ⊆ y. Conversely, suppose that x ⊆ y. Let u be the set of ranks that the elements of y ∖ x occupy in xᶜ; that is, u = {i ∈ ℕ₊ : xᶜ(i) ∈ y ∖ x}. Then xᶜ ∘ u = y ∖ x, so x ∪ (xᶜ ∘ u) = y and, hence, xu = y. █

Note that the irreducible elements of (S, ·) are the singletons {i}, where i ∈ ℕ₊. Note also that

{i} · {i + 1} = {i, i + 2},  {i + 1} · {i} = {i, i + 1},  {i} · {i} = {i + 1} · {i} = {i, i + 1}.

Thus, the binary operation is not commutative and the right-cancellation law does not hold, so (S, ·) satisfies just the minimal algebraic assumptions of a positive semigroup. In particular, S cannot be embedded in a group. Finally, if i1 < i2 < ··· < in, then

{i1, i2, …, in} = {i1} · {i2 − 1} · {i3 − 2} ··· {in − (n − 1)}.
Proposition 1: For x, y ∈ S,

#(x · y) = #(x) + #(y),    (2)

max(x · y) = max(x) if max(y) ≤ max(x) − #(x);  max(x · y) = max(y) + #(x) if max(y) > max(x) − #(x).    (3)
Proof: #(x · y) = #[x ∪ (xᶜ ∘ y)] = #(x) + #(xᶜ ∘ y), since x and xᶜ ∘ y are disjoint. However, clearly, #(xᶜ ∘ y) = #(y), which gives (2).

Equation (3) is trivial if x or y is the identity (Ø), so we will assume that x and y are nonempty. Note that, by definition,

max(xᶜ ∘ y) = xᶜ(max(y)),

since the function xᶜ is strictly increasing. Let i = #(x) and n = max(x). Then n ∈ x and the remaining i − 1 elements of x are in {1, 2, …, n − 1}. Hence, xᶜ contains n − i elements of {1, 2, …, n − 1}, together with all of the elements of {n + 1, n + 2, …}. If max(y) ≤ n − i, then max(xᶜ ∘ y) = xᶜ(max(y)) ≤ n − 1, so max(x · y) = n = max(x). If max(y) > n − i, then max(x · y) = max(xᶜ ∘ y) = xᶜ(max(y)) = n + [max(y) − (n − i)] = max(y) + i. █
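Both parts of Proposition 1 can be verified mechanically on random pairs of finite sets; product is the same illustrative implementation of the operation · sketched earlier:

```python
import random

def product(x, y):
    """x·y = x ∪ (x^c ∘ y) on finite subsets of the positive integers."""
    x, y = set(x), set(y)
    if not y:
        return frozenset(x)
    xc = [i for i in range(1, max(y) + len(x) + 1) if i not in x]
    return frozenset(x | {xc[j - 1] for j in y})

rng = random.Random(1)
for _ in range(500):
    x = set(rng.sample(range(1, 20), rng.randint(1, 6)))
    y = set(rng.sample(range(1, 20), rng.randint(1, 6)))
    z = product(x, y)
    assert len(z) == len(x) + len(y)            # #(x·y) = #(x) + #(y)
    if max(y) <= max(x) - len(x):               # Eq. (3), first case
        assert max(z) == max(x)
    else:                                       # Eq. (3), second case
        assert max(z) == max(y) + len(x)
```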

The semigroup (S, ·) has an interesting structure, but to see it we need some additional notation. For k ∈ ℕ, let

Sk = {x ∈ S : max(x) − #(x) = k}

and let Tk = {Ø} ∪ Sk. For n, k ∈ ℕ, let

Sn,k = {x ∈ S : #(x) = n, max(x) = n + k}.

Of course, S0,0 = {Ø}. If n ∈ ℕ₊ and k ∈ ℕ, then

#(Sn,k) = C(n + k − 1, n − 1),    (4)

since x ∈ Sn,k must contain the element n + k and n − 1 elements from {1, 2, …, n + k − 1}; here C(a, b) denotes the binomial coefficient. If we interpret the binomial coefficient as 1 when n = k = 0, then (4) is valid for n = k = 0 also.
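A brute-force enumeration confirms the count in (4) for small n and k:

```python
from itertools import combinations
from math import comb

def S_nk(n, k):
    """All n-element subsets of {1, ..., n + k} whose maximum is n + k."""
    return [frozenset(c) for c in combinations(range(1, n + k + 1), n) if max(c) == n + k]

for n in range(1, 6):
    for k in range(5):
        block = S_nk(n, k)
        assert len(block) == comb(n + k - 1, n - 1)       # the count in (4)
        assert all(max(x) - len(x) == k for x in block)   # each x lies in S_k
```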

Theorem 2: Sk is a subsemigroup of S, and hence Tk is a positive subsemigroup of S, for each k ∈ ℕ. The associated partial order on Tk is the subset partial order.

Proof: We first show that xy ∈ Sk for x, y ∈ Sk. The result is trivial if x = Ø or y = Ø (which can only happen when k = 0). Thus, we will assume that x and y are nonempty. Then max(y) > max(x) − #(x), since the left-hand side is k + #(y) and the right-hand side is k. By Proposition 1, max(xy) = max(y) + #(x). Hence,

max(xy) − #(xy) = max(y) + #(x) − [#(x) + #(y)] = max(y) − #(y) = k.

Therefore, xy ∈ Sk. To prove the last statement in the theorem, it suffices to show that if x, y ∈ Sk and x ⊆ y, then x⁻¹y ∈ Sk. Thus, suppose that x ∈ Sm,k, y ∈ Sn,k, where m < n, and xu = y for some u ∈ S (so that x ⊆ y). Then max(y) = n + k > m + k = max(x), so by another application of Proposition 1, max(y) = max(u) + m and, hence, max(u) = n − m + k. However, #(u) = n − m and, hence, u ∈ Sk. █
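The closure property in Theorem 2 can be spot-checked numerically; the helper name defect for the quantity max(x) − #(x) is mine:

```python
import random

def product(x, y):
    """x·y = x ∪ (x^c ∘ y) on finite subsets of the positive integers."""
    x, y = set(x), set(y)
    if not y:
        return frozenset(x)
    xc = [i for i in range(1, max(y) + len(x) + 1) if i not in x]
    return frozenset(x | {xc[j - 1] for j in y})

def defect(x):
    """The index k with x ∈ S_k, namely max(x) − #(x)."""
    return max(x) - len(x)

rng = random.Random(7)
for _ in range(300):
    k = rng.randint(0, 4)
    def draw():
        # An element of S_k: n − 1 elements below n + k, plus n + k itself.
        n = rng.randint(1, 5)
        rest = rng.sample(range(1, n + k), n - 1) if n > 1 else []
        return frozenset(rest) | {n + k}
    x, y = draw(), draw()
    assert defect(x) == defect(y) == k
    assert defect(product(x, y)) == k     # S_k is closed under the product
```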

Note that

S0 = {{1, 2, …, n} : n ∈ ℕ},

where {1, 2, …, 0} = Ø by convention. If y ∈ S, then

{1, 2, …, m} · y = {1, 2, …, m} ∪ {m + j : j ∈ y}.

In particular, {1, 2, …, m} · {1, 2, …, n} = {1, 2, …, m + n}, so (S0, ·) is isomorphic to (ℕ, +), the positive semigroup in Example 1, and x ↦ #(x) is an isomorphism. Finally, note that Ø ∈ S0, so T0 = S0. To characterize the exponential distributions on Tk, we must first characterize the minimal elements of Sk (which are the irreducible elements of Tk).

Proposition 2: The set of irreducible elements of Tk (equivalently, the set of minimal elements of Sk ∖ {Ø}) is

Mk = {x ∈ Sk : x ≠ Ø and x(i) ≤ k for 1 ≤ i < #(x)}.

There are 2^k such minimal elements.

Proof: First, we show that if x ∈ Sk is not a minimal element of Sk, then x ∉ Mk. Thus, suppose that x = uv, where u, v ∈ Sk are nonempty. Then max(u) > k and max(u) ∈ u ⊆ uv = x. Moreover, max(u) < max(x), so the rank of max(u) in x is less than #(x) = #(u) + #(v). Therefore, x ∉ Mk.

Next, we show that if x ∈ Sk ∖ Mk is nonempty, then x is not a minimal element of Sk. Thus, suppose that x ∈ Sk and x(i) > k for some i < #(x). Construct u ∈ S as follows: x(i) ∈ u and u contains x(i) − k − 1 elements of x that are smaller than x(i). This can be done since x(i) − i ≤ k and, hence, x(i) − k − 1 ≤ i − 1, and by definition, x contains i − 1 elements smaller than x(i). Now note that max(u) − #(u) = x(i) − [x(i) − k] = k, so u ∈ Sk. By construction, u ⊆ x, so there exists v ∈ S such that uv = x. By Theorem 2, v ∈ Sk and, hence, x is not a minimal element of Sk.

Next, note that if x ∈ Sk and #(x) ≥ k + 2, then x ∉ Mk, since one of the k + 1 elements of x of rank less than #(x) must be at least k + 1. For n ≤ k + 1, the number of elements x ∈ Mk with #(x) = n is C(k, n − 1), since x must contain n + k and n − 1 elements in {1, 2, …, k}. Hence,

#(Mk) = ∑_{n=1}^{k+1} C(k, n − 1) = 2^k. █
For example, the minimal elements of S2 are {3}, {1,4}, {2,4}, and {1,2,5}.
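The description of Mk translates directly into an enumeration, which reproduces the count 2^k and the example above for k = 2:

```python
from itertools import combinations

def minimal_elements(k):
    """M_k: x with max(x) = #(x) + k whose other elements all lie in {1, ..., k}."""
    out = []
    for n in range(1, k + 2):        # #(x) >= k + 2 is impossible for x in M_k
        for rest in combinations(range(1, k + 1), n - 1):
            out.append(frozenset(rest) | {n + k})
    return out

M2 = minimal_elements(2)
assert sorted(sorted(x) for x in M2) == [[1, 2, 5], [1, 4], [2, 4], [3]]
assert all(len(minimal_elements(k)) == 2 ** k for k in range(8))
```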

5. EXPONENTIAL DISTRIBUTIONS ON Tk

Theorem 3: There are no memoryless distributions on S and, hence, no exponential distributions.

Proof: Suppose that X is a random variable taking values in S and that X has a memoryless distribution, with tail probability function F. By the memoryless property,

F(xy) = F(x)F(y),  x, y ∈ S.

However, {i}{i} = {i + 1}{i}, as noted earlier, so we must have

F({i}) = F({i + 1})

for every i ∈ ℕ₊. Next, note that if i1 < i2 < ··· < in, then by another application of the memoryless property (using the singleton factorization noted earlier),

F({i1, i2, …, in}) = F({i1}) F({i2 − 1}) ··· F({in − (n − 1)}) = [F({1})]^n.

Since F(x) = P(x ⊆ X), it therefore follows that the events

{i ∈ X},  i ∈ ℕ₊,

are independent, with common probability F({1}) > 0. Hence, by the second Borel–Cantelli lemma, infinitely many of the events must occur with probability 1, so X is infinite, a contradiction. █

Although there are no exponential distributions on S, the subsemigroup Tk has a one-parameter family of exponential distributions for each k ∈ ℕ.

Theorem 4: A random variable X taking values in Tk has an exponential distribution if and only if the tail probability function F and density function f have the following form, for some α ∈ (0,1):

F(x) = α^{#(x)},  x ∈ Tk,    (5)

f(x) = [(1 − α)^{k+1}/((1 − α)^{k+1} + α)] α^{#(x)},  x ∈ Tk.    (6)
Proof: The function F(x) = α^{#(x)} takes values in (0,1] and satisfies F(xy) = F(x)F(y) for all x, y ∈ Tk. Moreover, by (4),

∑_{x∈Tk} F(x) = 1 + ∑_{n=1}^∞ C(n + k − 1, n − 1) α^n = 1 + α/(1 − α)^{k+1} < ∞.

It follows that F and f as given earlier are the tail probability function and density function, respectively, of an exponential distribution.

Conversely, suppose that F is the tail probability function of a memoryless distribution on Tk. As noted earlier, T0 is isomorphic to (ℕ, +), with # an isomorphism. Thus, if k = 0, F must have the form F(x) = α^{#(x)}, where α = F({1}) ∈ (0,1). For general k, we will show by induction on #(x) that F(x) = α^{#(x)}, where α = F({k + 1}) ∈ (0,1). The result is trivially true if #(x) = 0, since x = Ø. The result is also trivially true if #(x) = 1, since the only such x ∈ Tk is x = {k + 1}. Suppose now that F(x) = α^{#(x)} for all x ∈ Tk with #(x) ≤ n. Let x ∈ Tk with #(x) = n + 1. If x is not irreducible, then x = uv, where u, v ∈ Tk, #(u) ≤ n, #(v) ≤ n, and #(u) + #(v) = #(x). In this case,

F(x) = F(u)F(v) = α^{#(u)} α^{#(v)} = α^{#(x)}.

On the other hand, if x is irreducible, let j = min{i ∈ x : i + 1 ∉ x}. Note that j < max(x), since every element of x other than max(x) = #(x) + k is at most k. Now let y ∈ Tk be obtained from x by replacing j with j + 1. Note that #(y) = #(x) and, moreover, yᶜ can be obtained from xᶜ by replacing j + 1 with j. We claim that xx = yx; that is, x ∪ (xᶜ ∘ x) = y ∪ (yᶜ ∘ x). To see this, note first that if i ≠ j and i ≠ j + 1, then i ∈ x if and only if i ∈ y, and i ∈ xᶜ if and only if i ∈ yᶜ. On the other hand, j ∈ x and j ∈ yᶜ ∘ x, since j = yᶜ(x(1)) (by definition, there are x(1) − 1 elements less than x(1) in yᶜ; the next element of yᶜ is j). Similarly, j + 1 ∈ y and j + 1 ∈ xᶜ ∘ x, since j + 1 = xᶜ(x(1)). Since xx = yx, it follows from the memoryless property that F(x)² = F(xx) = F(yx) = F(y)F(x) and, hence, F(x) = F(y). Continuing this process, we find that F(x) = F(y) for some y ∈ Sk that is not minimal, but with #(y) = #(x). It then follows that F(x) = F(y) = α^{#(y)} = α^{#(x)}, and the proof is complete. █

To illustrate the last part of the proof, suppose that x = {3,4,5,8,15} ∈ T10. Then j = 5, y = {3,4,6,8,15}, and xx = yx = {3,4,5,6,7,8,9,12,15,20}.
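This identity can be checked mechanically with the same illustrative implementation of the operation used earlier:

```python
def product(x, y):
    """The operation x·y = x ∪ (x^c ∘ y) on finite subsets of the positive integers."""
    x, y = set(x), set(y)
    if not y:
        return frozenset(x)
    xc = [i for i in range(1, max(y) + len(x) + 1) if i not in x]
    return frozenset(x | {xc[j - 1] for j in y})

x = frozenset({3, 4, 5, 8, 15})   # an element of T_10: max(x) - #(x) = 10
y = frozenset({3, 4, 6, 8, 15})   # x with j = 5 replaced by j + 1 = 6
assert product(x, x) == product(y, x) == frozenset({3, 4, 5, 6, 7, 8, 9, 12, 15, 20})
```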

Suppose that X has the exponential distribution on Tk given in Theorem 4. From the general theory in Section 3, the expected number of subsets of X in Tk is the reciprocal of the rate parameter in the density function. Thus,

E(#{t ∈ Tk : t ⊆ X}) = 1 + α/(1 − α)^{k+1}.

If k = 0 (recall that S0 = T0), note that the density in (6) reduces to

f(x) = (1 − α) α^{#(x)},  x ∈ T0.    (7)

On the other hand, suppose that k ∈ ℕ₊. Then

P(X ∈ Sk) = 1 − f(Ø) = α/[(1 − α)^{k+1} + α].

Thus, the conditional distribution of X given X ∈ Sk has density function

x ↦ (1 − α)^{k+1} α^{#(x)−1},  x ∈ Sk.    (8)
The density function of X depends on x ∈ Tk only through #(x). The following corollary gives the distribution of #(X).

Corollary 1: Suppose that X has the exponential distribution in Theorem 4 and let U = #(X). Then

P(U = n) = [(1 − α)^{k+1}/((1 − α)^{k+1} + α)] C(n + k − 1, n − 1) α^n,  n ∈ ℕ,    (9)

E(U) = α(1 + kα)/[(1 − α)((1 − α)^{k+1} + α)],    (10)

where we interpret the binomial coefficient as 1 when n = 0.

When k = 0, (9) gives P(U = n) = (1 − α)α^n for n ∈ ℕ, so U has a geometric distribution on ℕ. In general, U has a modified negative binomial distribution. It is easy to see from (10) that for each k ∈ ℕ, E(U) is a strictly increasing function of α and maps (0,1) onto (0,∞). Thus, the exponential distribution on Tk can be reparameterized by expected cardinality. Moreover, the exponential distribution maximizes entropy with respect to this parameter:

Corollary 2: The exponential distribution in Theorem 4 maximizes entropy over all distributions on Tk with expected value given by (10).

Proof: We use the usual inequality for entropy: If f and g are probability density functions of random variables X and Y, respectively, taking values in Tk, then

−∑_{x∈Tk} g(x) ln g(x) ≤ −∑_{x∈Tk} g(x) ln f(x).    (11)

If X has the exponential distribution in Theorem 4 and if E(#(Y)) = E(#(X)), then substituting into the right-hand side of (11), we see that the entropy of Y is bounded above by

−ln(ck) − μk ln(α),

where ck is the rate parameter of the exponential density in (6) and μk is the mean cardinality in (10). Of course, the entropy of X achieves this upper bound. █
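As a numeric sanity check on Corollary 1, the following sketch (with an illustrative α = 0.3 and series truncation at 400 terms) confirms that the probabilities in (9) sum to 1 and that the mean matches (10):

```python
from math import comb, isclose

# Exponential distribution on T_k: tail F(x) = alpha^#(x), rate
# c_k = (1 - alpha)^(k+1) / ((1 - alpha)^(k+1) + alpha).  alpha is illustrative.
alpha, N = 0.3, 400          # N truncates the series; ample for alpha = 0.3
for k in range(4):
    c = (1 - alpha) ** (k + 1) / ((1 - alpha) ** (k + 1) + alpha)
    # P(U = n), with the n = 0 binomial coefficient read as 1
    pmf = [c] + [c * comb(n + k - 1, n - 1) * alpha ** n for n in range(1, N)]
    assert isclose(sum(pmf), 1.0, abs_tol=1e-9)
    mean = sum(n * q for n, q in enumerate(pmf))
    target = alpha * (1 + k * alpha) / ((1 - alpha) * ((1 - alpha) ** (k + 1) + alpha))
    assert isclose(mean, target, abs_tol=1e-9)
```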

6. ALMOST EXPONENTIAL DISTRIBUTIONS ON S

There are no exponential distributions on S. However, we can define distributions that are “close” to exponential by forming mixtures of the distributions in (7) and (8). Thus, suppose that X takes values in S with probability mass function

f(x) = β0 (1 − α0) α0^{#(x)} for x ∈ S0,  f(x) = βk (1 − αk)^{k+1} αk^{#(x)−1} for x ∈ Sk, k ∈ ℕ₊,    (12)

where αk, βk ∈ (0,1) for each k ∈ ℕ and ∑_{k=0}^∞ βk = 1. Thus, the conditional distribution of X given X ∈ Sk is the same as the corresponding conditional distribution of an exponential variable on Tk (with parameter αk). Note that the conditional distribution of X on Tk itself is not exponential. In fact, we cannot construct a distribution on S by requiring that the conditional distributions on Tk be exponential for each k, essentially because these semigroups share Ø and, thus, are not disjoint. The distribution of X is as close to exponential as possible, in the sense that X is essentially exponential on each of the subsemigroups Sk, and these semigroups partition S.

There is not much that we can say about the general distribution in (12). In the remainder of this section we will study a special case with particularly nice properties. For our first construction, let N have a geometric distribution on ℕ with rate parameter 1 − r ∈ (0,1), as in Example 1. Next, given N = n, the random variable X is distributed on the subsets of {1, 2, …, n} so that i ∈ X, independently, with probability p for each i ∈ {1, 2, …, n}. Of course, if N = 0, then X = Ø.

Theorem 5: For x ∈ S,

P(X = x) = [(1 − r)/(1 − r(1 − p))] p^{#(x)} (1 − p)^{max(x)−#(x)} r^{max(x)},    (13)

P(X ⊇ x) = p^{#(x)} r^{max(x)}.    (14)
Proof: For x ∈ S, conditioning on N gives

P(X = x) = ∑_{n=0}^∞ P(N = n) P(X = x | N = n).

If n < max(x), then x is not a subset of {1, 2, …, n}, so P(X = x | N = n) = 0. If n ≥ max(x), then x is a subset of {1, 2, …, n} and, by assumption, P(X = x | N = n) = p^{#(x)} (1 − p)^{n−#(x)}. Substituting gives

P(X = x) = ∑_{n=max(x)}^∞ (1 − r) r^n p^{#(x)} (1 − p)^{n−#(x)},

which simplifies to (13). By a similar argument,

P(X ⊇ x) = ∑_{n=max(x)}^∞ (1 − r) r^n p^{#(x)},

which simplifies to (14). █
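The simplification in the proof can be confirmed numerically by comparing the closed form with a truncated version of the conditioning sum; r and p below are arbitrary illustrative values:

```python
from math import isclose

r, p = 0.6, 0.35

def pmf(x):
    """Closed form for P(X = x) from the proof above."""
    n, m = len(x), max(x, default=0)
    return (1 - r) / (1 - r * (1 - p)) * p ** n * (1 - p) ** (m - n) * r ** m

def pmf_direct(x, N=2000):
    """Condition on the geometric variable N: sum over n >= max(x)."""
    n, m = len(x), max(x, default=0)
    return sum((1 - r) * r ** j * p ** n * (1 - p) ** (j - n) for j in range(m, N))

for x in [set(), {1}, {2, 5}, {1, 2, 3}, {4, 7, 9}]:
    assert isclose(pmf(x), pmf_direct(x), rel_tol=1e-9)
```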

The distribution of X depends on x ∈ S only through #(x) and max(x) − #(x). As before, let U = #(X) and now let V = max(X) − #(X).

Corollary 3: For n ∈ ℕ₊ and k ∈ ℕ,

P(U = n, V = k) = C(n + k − 1, n − 1) [(1 − r)/(1 − r(1 − p))] (pr)^n [r(1 − p)]^k,    (15)

and P(U = 0, V = 0) = (1 − r)/(1 − r(1 − p)).
Corollary 4: For n ∈ ℕ₊ and k ∈ ℕ, the conditional distribution of X given U = n, V = k is uniform on Sn,k.

Corollary 5: For n ∈ ℕ₊, the conditional distribution of V given U = n is negative binomial with parameters n and r(1 − p) (when n = 0, the conditional distribution of V is point mass at 0).

Corollary 6: The distribution of U is geometric with parameter (1 − r)/(1 − r + rp).

Of course, Corollaries 4–6 determine the distribution of X and give an alternate way of constructing the distribution in the first place: We first give U a geometric distribution with a parameter a ∈ (0,1); given U = n, we give V a negative binomial distribution with parameters n and b ∈ (0,1); and, finally, given U = n and V = k, we give X the uniform distribution on Sn,k. The two constructions are equivalent, since there is a one-to-one correspondence between the pairs of parameters (r,p) and (a,b).
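Corollary 6 can be spot-checked by summing the joint probability mass function of (U, V) (Corollary 3) over k; again r and p are illustrative:

```python
from math import comb, isclose

r, p = 0.5, 0.3
C0 = (1 - r) / (1 - r * (1 - p))

def P_UV(n, k):
    """Joint pmf of (U, V): C(n+k-1, n-1) * C0 * (pr)^n * (r(1-p))^k for n >= 1."""
    if n == 0:
        return C0 if k == 0 else 0.0
    return comb(n + k - 1, n - 1) * C0 * (p * r) ** n * (r * (1 - p)) ** k

a = (1 - r) / (1 - r + r * p)          # success parameter of U (Corollary 6)
for n in range(6):
    P_U = sum(P_UV(n, k) for k in range(400))
    assert isclose(P_U, a * (1 - a) ** n, abs_tol=1e-9)
```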

Our next goal is to study the distribution of the random subset X on the subsemigroups Sk. First, note that for x ∈ Sk,

P(X ⊇ x, X ∈ Sk) = ∑_{u∈Tk} P(X = xu) = P(X = x) ∑_{u∈Tk} (rp)^{#(u)} = P(X = x) [1 + rp/(1 − rp)^{k+1}].

Thus, for k ∈ ℕ, X has constant rate

(1 − rp)^{k+1}/[(1 − rp)^{k+1} + rp]

on the subsemigroup Sk. In particular, for x ∈ S0,

P(X ⊇ x, X ∈ S0) = [1/(1 − rp)] P(X = x) = [(1 − r)/((1 − r(1 − p))(1 − rp))] (rp)^{#(x)}.

Hence, X has the memoryless property on S0 (in addition to the constant rate property). To find the conditional distribution of X given X ∈ Sk, we first need P(X ∈ Sk) or, equivalently, the probability density function of V.

Corollary 7: V has a modified geometric distribution:

P(V = 0) = (1 − r)/[(1 − r(1 − p))(1 − rp)],

P(V = k) = [(1 − r)rp/(1 − r(1 − p))] · [r(1 − p)]^k/(1 − rp)^{k+1},  k ∈ ℕ₊.    (16)
Corollary 8: The conditional distributions of X on Sk are as follows:

P(X = x | X ∈ S0) = (1 − rp)(rp)^{#(x)},  x ∈ S0,

P(X = x | X ∈ Sk) = (1 − rp)^{k+1} (rp)^{#(x)−1},  x ∈ Sk, k ∈ ℕ₊.
Thus, X has an almost exponential distribution in the sense of (12), with αk = rp for each k ∈ ℕ and with the mixing probabilities βk = P(V = k) given in Corollary 7.

From Theorem 3, no exponential distribution on S exists because the events {i ∈ X}, i ∈ ℕ₊, would have to be independent with a common probability. The next corollary explores these events for the random variable in Theorem 5.

Corollary 9: Suppose that X has the distribution in Theorem 5.

1. P(i ∈ X) = p r^i for i ∈ ℕ₊.

2. If i1 < i2 < ··· < in, then

P(i1 ∈ X, i2 ∈ X, …, in ∈ X) = p^n r^{in}.

3. For j ∈ ℕ₊, the events {1 ∈ X}, {2 ∈ X}, …, {j − 1 ∈ X} are conditionally independent given {j ∈ X}, with P(i ∈ X | j ∈ X) = p for i < j.

Property 3 in Corollary 9 is clearly a result of the original construction of X. Property 2 is reminiscent of a Markov property. This property implies that the events {i ∈ X}, i ∈ ℕ₊, are positively correlated, but asymptotically uncorrelated. In fact, the correlation decays exponentially, since for i < j,

cor[1(i ∈ X), 1(j ∈ X)] = p r^{(j−i)/2} (1 − r^i)/√[(1 − p r^i)(1 − p r^j)].
From Corollaries 5 and 6, we can compute the expected values of U = #(X) and W = max(X) = U + V:

E(U) = rp/(1 − r),    (17)

E(W) = rp/[(1 − r)(1 − r(1 − p))].    (18)

It is easy to see from (17) and (18) that (E(U), E(W)), as a function of (r, p), maps (0,1)² one-to-one and onto {(c, d) : 0 < c < d < ∞}. Thus, the distribution of X can be reparameterized by expected cardinality and expected maximum. Moreover, the distribution of X maximizes entropy with respect to these parameters. The proof of the following corollary is essentially the same as the proof of Corollary 2.

Corollary 10: The distribution in Theorem 5 maximizes entropy among all distributions on S with expected cardinality given by (17) and expected maximum given by (18).

Of fundamental importance in the general theory of random sets [3] is the hitting probability function G:

G(x) = P(X ∩ x ≠ Ø),  x ⊆ ℕ₊.

This function completely determines the distribution of a random set, and in the general setting (which lacks the algebraic structure that we have here), it plays the role of a “distribution function.” Note that G is defined for all subsets of the positive integers, not just finite subsets.

Theorem 6: Suppose that X has the almost exponential distribution with parameters p and r given in Theorem 5. Then

G(x) = p ∑_{i≥1} (1 − p)^{i−1} r^{x(i)},

where the sum extends over the domain of x and, as usual, x(i) is the ith smallest element of x.

Proof: Suppose first that x is finite (so that x ∈ S). From the standard inclusion–exclusion formula (or from [3]),

G(x) = P(X ∩ x ≠ Ø) = ∑_{Ø ≠ y ⊆ x} (−1)^{#(y)+1} P(X ⊇ y).

Hence, substituting the tail probability (14), we have

G(x) = ∑_{Ø ≠ y ⊆ x} (−1)^{#(y)+1} p^{#(y)} r^{max(y)} = ∑_{i=1}^{#(x)} r^{x(i)} p ∑_{j=0}^{i−1} C(i − 1, j)(−p)^j = p ∑_{i=1}^{#(x)} (1 − p)^{i−1} r^{x(i)},

where in the second step the subsets y are grouped according to max(y) = x(i). For infinite x, the formula holds by passing to the limit and using the continuity of probability. █
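The formula for G can be checked against the inclusion–exclusion computation directly; r and p below are illustrative values:

```python
from itertools import combinations
from math import isclose

r, p = 0.55, 0.4

def G_formula(x):
    """G(x) = p * sum_i (1-p)^(i-1) * r^x(i), x(i) the ith smallest element of x."""
    return p * sum((1 - p) ** i * r ** xi for i, xi in enumerate(sorted(x)))

def G_inclusion_exclusion(x):
    """Inclusion-exclusion over nonempty subsets y of x, using the tail
    P(X ⊇ y) = p^#y * r^max(y) from Theorem 5."""
    x = sorted(x)
    total = 0.0
    for m in range(1, len(x) + 1):
        for y in combinations(x, m):
            total += (-1) ** (m + 1) * p ** m * r ** max(y)
    return total

for x in [{1}, {2, 5}, {1, 3, 4, 9}]:
    assert isclose(G_formula(x), G_inclusion_exclusion(x))
assert isclose(G_formula({3}), p * r ** 3)   # P(i ∈ X) = p r^i
```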

Acknowledgment

I am grateful to Marcus Pendergrass for suggesting this model.

REFERENCES

[1] Azlarov, A.T. & Volodin, N.A. (1986). Characterization problems associated with the exponential distribution. Berlin: Springer-Verlag.
[2] Högnäs, G. & Mukherjea, A. (1995). Probability measures on semigroups. New York: Plenum Press.
[3] Matheron, G. (1975). Random sets and integral geometry. New York: Wiley.
[4] Rowell, G.H. & Siegrist, K. (1998). Relative aging of distributions. Probability in the Engineering and Informational Sciences 12: 469–478.
[5] Siegrist, K. (1994). Exponential distributions on semigroups. Journal of Theoretical Probability 7: 725–737.
[6] Siegrist, K. (2006). Decomposition of exponential distributions on positive semigroups. Journal of Theoretical Probability 19: 204–220.
[7] Siegrist, K. (2007). Exponential and gamma distributions on positive semigroups, with applications to Dirichlet distributions. Bernoulli (to appear).