Published online by Cambridge University Press: 23 March 2005
We present some tips for generating multiple sequences of two- and three-valued random variables.
Suppose Xj, j = 1,…,n, are independent Bernoulli random variables with respective means pj, j = 1,…,n, and that we are interested in using simulation to estimate E [h(X1,…,Xn)] by generating m independent values of h. The usual simulation approach is to generate the data one row at a time: first generate X1,1,…,X1,n and evaluate h(X1,1,…,X1,n); then generate X2,1,…,X2,n and evaluate h(X2,1,…,X2,n); and so on (where Xi,j is Bernoulli with parameter pj).
However, a better approach is to first generate the sequence of values Xi,1, i = 1,…,m, then the sequence of values Xi,2, i = 1,…,m, and so on. This is advantageous because we can generate a sequence of independent and identically distributed (i.i.d.) Bernoulli random variables by generating only the times at which the less likely value (0 or 1) occurs. That is, suppose we want to generate Y1,…,Ym, independent Bernoulli random variables with mean p. Let a = min(p, 1 − p), and let v equal 1 if a = p and let v equal 0 otherwise. Now, imagining that the random variables Y1,…,Ym are the results of m successive trials, we can generate the trial numbers at which we obtain the value v by generating independent geometric random variables having parameter a. To generate such a geometric random variable G, generate a random number U and set
G = [ln(U)/ln(1 − a)] + 1,
where [x] is the largest integer less than or equal to x. Continue to generate such geometrics until their sum is at least m. Suppose G1,…,Gr are generated. If G1 + ··· + Gr = m, then trial numbers G1, G1 + G2, …, G1 + ··· + Gr result in outcome v and the others in outcome 1 − v; if G1 + ··· + Gr > m, then trial numbers G1, G1 + G2, …, G1 + ··· + Gr−1 (the first r − 1 partial sums) result in outcome v and the others in outcome 1 − v.
With R equal to the number of geometrics we need generate, we have
R = N(m − 1) + 1,
where N(k) is the number of renewals by time k of the Bernoulli renewal process whose interarrival times are the Gi. Consequently,
E [R] = E [N(m − 1)] + 1 = (m − 1)a + 1;
that is, the expected number of geometrics that are needed to generate the results of the m trials is (m − 1)a + 1. (Intuitively, the preceding is true because we generate all of the first m − 1 trial numbers that result in outcome v, as well as the result of trial m.)
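The procedure above can be sketched in Python as follows (a minimal illustration; the function name `bernoulli_sequence` and its interface are ours, not from the article):

```python
import math
import random

def bernoulli_sequence(m, p, rng=random):
    """Generate m i.i.d. Bernoulli(p) values by simulating only the
    trial numbers at which the less likely outcome occurs."""
    a = min(p, 1 - p)          # probability of the rarer value
    v = 1 if a == p else 0     # the rarer value itself
    out = [1 - v] * m          # default every trial to the likely value
    if a == 0:                 # p is 0 or 1: nothing random to do
        return out
    pos = 0                    # trials accounted for so far
    while pos < m:
        u = rng.random()
        # geometric(a) via G = [ln(U)/ln(1 - a)] + 1
        pos += int(math.log(u) / math.log(1 - a)) + 1
        if pos <= m:           # partial sums exceeding m are discarded
            out[pos - 1] = v
    return out
```

Note that the loop runs roughly (m − 1)a + 1 times on average, in line with the expected number of geometrics derived above, rather than m times as in the naive approach.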
Remark: One downside of our suggestion to generate the Bernoulli array in a vertical rather than a horizontal fashion is the additional memory requirement. (If generated in a horizontal fashion, then once a row is generated and h evaluated, the Bernoulli row data can be discarded.) However, a variant of our idea can still be utilized even if one wants to generate the complete row vector. For instance, suppose we want to generate X1,…,Xn, where Xi is Bernoulli with parameter pi. Let ai = min(pi, 1 − pi) and let vi equal 1 if ai = pi and vi equal 0 otherwise. Start by generating the first Xi, i = 1,…,n, that results in its probability-ai value; that is, generate a random number U. If U ≤ (1 − a1)···(1 − an), set Xi = 1 − vi for i = 1,…,n and stop. Otherwise, let
F = min{j : U > (1 − a1)···(1 − aj)}
and set Xi = 1 − vi for i = 1,…,F − 1 and XF = vF. If F = n, stop; if F < n, use the same procedure to generate the values of XF+1,…,Xn.
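This row-vector variant can be sketched as follows (again a minimal illustration under our own naming; the inverse-transform search for F uses the fact that the product (1 − a1)···(1 − aj) is nonincreasing in j):

```python
import random

def bernoulli_vector(ps, rng=random):
    """Generate X1,...,Xn with Xi ~ Bernoulli(ps[i-1]) by jumping
    directly to the next index whose less likely value occurs."""
    n = len(ps)
    a = [min(p, 1 - p) for p in ps]                       # rare-value probabilities
    v = [1 if ai == p else 0 for ai, p in zip(a, ps)]     # rare values
    x = [0] * n
    start = 0
    while start < n:
        u = rng.random()
        prod, f = 1.0, None
        for j in range(start, n):
            prod *= 1 - a[j]
            if u > prod:          # F = min{j : U > (1-a_start)...(1-a_j)}
                f = j
                break
        if f is None:             # no remaining index takes its rare value
            for j in range(start, n):
                x[j] = 1 - v[j]
            break
        for j in range(start, f): # indices before F take their likely value
            x[j] = 1 - v[j]
        x[f] = v[f]
        start = f + 1             # restart the procedure from F + 1
    return x
```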
Remark: The results reported in [1] were what motivated us to consider more efficient ways to generate the data needed to estimate E [h(X1,…,Xn)].
Presented now is a competing approach for when we want m i.i.d. Bernoulli random variables having parameter p. Generate N, a binomial (m, p) random variable; N will equal the number of the random variables that are equal to 1. Now, let M = min(N, m − N) and generate the first M positions of a random permutation of 1,…,m (this requires M random numbers and is easily done; see [2]). If M = N, these are the trial numbers whose value is 1; if M ≠ N, these are the trial numbers whose value is 0. This method requires, on average, 1 + E [M] < 1 + ma random numbers, where a = min(p, 1 − p) and where the 1 refers to the generation of the binomial. The binomial method has the advantage over the geometric method in that it is easier to generate the first r positions of a random permutation than to generate r geometric random variables. On the other hand, it requires generating a binomial random variable (which can be done efficiently by use of the inverse transform method starting at [mp] and going either up or down; see [2]).
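A sketch of the binomial method follows. For simplicity it draws the binomial naively with m random numbers, which forgoes the savings; an efficient implementation would use the inverse transform method starting at [mp], as noted above. The partial random permutation is generated by the first M steps of a Fisher-Yates shuffle. The function name is ours:

```python
import random

def bernoulli_by_binomial(m, p, rng=random):
    """Generate m i.i.d. Bernoulli(p) values by drawing N ~ Binomial(m, p)
    and placing the minority value at M = min(N, m - N) random positions."""
    # Naive binomial draw (an efficient version would use inverse transform).
    n_ones = sum(rng.random() < p for _ in range(m))
    k = min(n_ones, m - n_ones)
    mark = 1 if k == n_ones else 0   # the minority value
    out = [1 - mark] * m
    # First k positions of a random permutation of 0..m-1 (partial Fisher-Yates).
    idx = list(range(m))
    for i in range(k):
        j = rng.randrange(i, m)
        idx[i], idx[j] = idx[j], idx[i]
        out[idx[i]] = mark
    return out
```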
We now obtain an approximation to E [M].
Lemma 1: Let Z be a standard normal random variable with distribution function Φ and density φ = Φ′. Then
E [|Z + c|] = 2φ(c) + c(2Φ(c) − 1).
Proof: For c > 0,
E [|Z + c|] = ∫_{−c}^{∞} (z + c)φ(z) dz − ∫_{−∞}^{−c} (z + c)φ(z) dz = [φ(c) + cΦ(c)] + [φ(c) − c(1 − Φ(c))] = 2φ(c) + c(2Φ(c) − 1),
where we have used that −φ(z) is an antiderivative of zφ(z).
The argument for c < 0 is similar. █
Proposition 1: If N is a binomial random variable with parameters m and p, then for m large,
E [min(N, m − N)] ≈ m/2 − √(mp(1 − p)) [2φ(c) + c(2Φ(c) − 1)],
where
c = (p − 1/2)√(m/(p(1 − p))).
Proof: Writing min(N, m − N) = m/2 − |N − m/2| and using the normal approximation N ≈ mp + √(mp(1 − p)) Z gives
|N − m/2| ≈ √(mp(1 − p)) |Z + c|,
where Z is a standard normal. The proof now follows from Lemma 1. █
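As a numerical check on Proposition 1, the following sketch (ours, not from the article) compares the exact value of E [M], computed by direct summation over the binomial mass function, with the normal approximation:

```python
import math

def phi(x):
    """Standard normal density."""
    return math.exp(-x * x / 2) / math.sqrt(2 * math.pi)

def Phi(x):
    """Standard normal distribution function, via the error function."""
    return (1 + math.erf(x / math.sqrt(2))) / 2

def exact_EM(m, p):
    """E[min(N, m - N)] for N ~ Binomial(m, p), by direct summation."""
    total = 0.0
    pmf = (1 - p) ** m                     # P(N = 0)
    for k in range(m + 1):
        total += min(k, m - k) * pmf
        if k < m:                          # recursive pmf update
            pmf *= (m - k) / (k + 1) * p / (1 - p)
    return total

def approx_EM(m, p):
    """Proposition 1: E[M] ~ m/2 - sqrt(mp(1-p)) * E|Z + c|."""
    s = math.sqrt(m * p * (1 - p))
    c = (p - 0.5) * math.sqrt(m / (p * (1 - p)))
    return m / 2 - s * (2 * phi(c) + c * (2 * Phi(c) - 1))
```

For m = 100 the two quantities agree closely both in the symmetric case p = 0.5 (where c = 0) and away from it.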
Suppose that we want to generate the results of m independent trials when each trial has three possible outcomes, with respective probabilities P1, P2, and P3, and let i1, i2, i3 be a permutation of 1, 2, 3 such that Pi1 ≤ Pi2 ≤ Pi3. Two ways to generate the trial results are as follows:
1. Generate the trials that result in outcome i1 by successively generating geometrics with parameter Pi1, as in the previous section. Suppose this results in trials j1,…,jr having outcome i1. Then determine which of the remaining m − r trials result in outcome i2 by generating geometrics with parameter Pi2 /(1 − Pi1). For instance, suppose we want 10 trials, where P1 = 0.2, P2 = 0.3, and P3 = 0.5. If the generated geometrics with parameter 0.2 are 4, 5, and 3 (we stopped when the sum of these geometrics was at least 10) and the generated geometrics with parameter 3/8 are 1, 3, and 4 (we stopped when the sum of these geometrics was at least 8), then trials 4 and 9 result in outcome 1 (the partial sums 4 and 9; the final sum 12 exceeds 10 and is discarded); the 1st, 4th, and 8th of the eight remaining trials, namely trials 1, 5, and 10, result in outcome 2; and the remaining trials 2, 3, 6, 7, and 8 result in outcome 3.
2. A second approach is to generate a binomial random variable Ni1 with parameters m and Pi1, and then generate a binomial random variable Ni2 with parameters m − Ni1 and Pi2 /(1 − Pi1). Let Ni3 = m − Ni1 − Ni2. If Nir ≤ Nis ≤ Nit, generate the first Nir + Nis positions of a random permutation of 1,…,m. The trial numbers resulting in outcome ir are the first Nir values of this permutation, the trial numbers resulting in outcome is are the following Nis values of this permutation, and the other trials result in outcome it.
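The first approach extends naturally to any number of outcomes; a minimal Python sketch follows (the function name and interface are ours, and the outcome probabilities are assumed to be given in increasing order, as in the article):

```python
import math
import random

def multinomial_trials(m, probs, rng=random):
    """Outcomes of m independent trials with outcome probabilities
    `probs` (assumed sorted in increasing order), assigning the rarest
    outcome first via geometric skips, as in approach 1."""
    outcome = [None] * m
    remaining = list(range(m))   # trial indices not yet assigned
    p_left = 1.0                 # total probability of unassigned outcomes
    for label, p in enumerate(probs[:-1]):
        a = p / p_left           # conditional probability among remaining trials
        hits, pos = [], 0
        while pos < len(remaining):
            u = rng.random()
            pos += int(math.log(u) / math.log(1 - a)) + 1   # geometric(a)
            if pos <= len(remaining):
                hits.append(remaining[pos - 1])
        for t in hits:
            outcome[t] = label
        remaining = [t for t in remaining if outcome[t] is None]
        p_left -= p
    for t in remaining:          # leftover trials get the most likely outcome
        outcome[t] = len(probs) - 1
    return outcome
```

Here outcome labels 0, 1, 2 play the roles of i1, i2, i3; with probs = (0.2, 0.3, 0.5) the second pass uses the conditional parameter 0.3/0.8 = 3/8, as in the worked example.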
This work was supported by National Science Foundation grants (DMS-0072207 to H. B., ECS-00245355 to M. M., and DMI-9901053 with the University of California to S. M. R.).