
RARE EVENT SIMULATION

Published online by Cambridge University Press:  12 December 2005

Agnès Lagnoux
Affiliation:
Laboratoire de Statistique et Probabilités, Université Paul Sabatier, 31062 Toulouse Cedex 4, France, E-mail: lagnoux@cict.fr

Abstract

This article deals with estimating the probabilities of rare events using fast simulation based on the splitting method. In this technique, the sample paths are split into multiple copies at various stages of the simulation. Our aim is to optimize the algorithm and to obtain a precise confidence interval for the estimator using branching processes. The numerical results presented suggest that the method is reasonably efficient.

Type: Research Article
Copyright: © 2006 Cambridge University Press

1. INTRODUCTION

The analysis of rare events is of great importance in many fields because of the risk associated with the event. Their probabilities are often of order 10^{-9} to 10^{-12}. One can study them in several ways: the first is statistical analysis, based on the standard extreme value distributions, but this requires a long observation period (see Aldous [1]); the second is modeling, which leads to estimating the rare event probability either by an analytical approach (see Sadowsky [10]) or by simulation.

In this article we focus on the simulation approach based on the Monte Carlo method. Nevertheless, a crude simulation is impractical for estimating such small probabilities: to estimate probabilities of order 10^{-10} with acceptable confidence would require simulating at least 10^{12} events (which corresponds to the occurrence of 100 rare events).

To overcome these limits, fast simulation techniques are applied. In particular, importance sampling (IS) is a refinement of Monte Carlo methods. The main idea of IS is to make the occurrence of the rare event more frequent. More precisely, IS consists of selecting a change of measure that minimizes the variance of the estimator. Using another method based on particle systems, Cerou, Del Moral, Legland, and Lezaud [3] gave theoretical results on the convergence of this kind of algorithm. In this article, we deal with the RESTART (REpetitive Simulation Trials After Reaching Thresholds) algorithm presented by Villen-Altamirano and Villen-Altamirano [11] and based on splitting. The basic idea of splitting is to partition the state space of the system into a series of nested subsets and to consider the rare event as the intersection of a nested sequence of events. When a given subset is entered by a sample trajectory, random retrials are generated from the initial state corresponding to the state of the system at the entry point. Thus, the system trajectory is split into a number of new subtrajectories. However, the analysis of the RESTART model presents numerous difficulties because of the lack of hypotheses and the complexity of the formulas.

In this article we build a simple model of splitting for which we are able to derive precise conclusions. It is based on the same idea: before entering the rare event A, there exist intermediate states visited more often than A by the trajectory: A = B_{M+1} ⊂ B_M ⊂ ··· ⊂ B_1. Let

$$P_1 = P(B_1), \qquad P_i = P(B_i \mid B_{i-1}), \quad i = 2, \ldots, M+1.$$

The fact that a sample trajectory enters B_i is represented by a Bernoulli trial. Every time a sample trajectory enters a subset B_i, i = 1, …, M, it is divided into a number R_i of subtrajectories starting from level i. More precisely, we generate N random variables with common law Bernoulli Ber(P_1) and check whether the subset B_1 is reached. If so, we duplicate the trials in R_1 retrials of Ber(P_2) and check whether the subset B_2 is reached. Thus,

$$P = P(A) = P_1 P_2 \cdots P_{M+1},$$

and an unbiased estimator of P is

$$\hat{P} = \frac{N_A}{N R_1 \cdots R_M},$$

where the Bernoulli trials at each level are independent and identically distributed (i.i.d.), N_A is the number of trials that reach A during the simulation, and N is the number of particles initially generated. An optimal algorithm is chosen via the minimization of the variance of P̂ for a given budget. For this, we have to describe the cost of a given simulation: each time a particle is launched, it generates an average cost that is assumed here to be a function h of the transition probability. Therefore, the (average) cost is

$$C = N \sum_{i=0}^{M} r_i P_{i|0}\, h(P_{i+1}),$$

where r_i = R_1 ··· R_i, i = 1, …, M, r_0 = 1, and P_{i|0} = P_1 ··· P_i, i = 1, …, M + 1, and P_{0|0} = 1. Then the optimal algorithm is described by

$$P_i = P^{1/(M+1)} = e^{y_0}, \qquad R_i = \frac{1}{P_i},$$

and M is given by M = [ln P/y_0] − 1 or M = [ln P/y_0], where y_0 is the solution of Eq. (30). The optimal sampling numbers and transition probabilities are independent of the budget, which only determines the optimal number N of independent particles first generated. In the special case of h = 1,

$$y_0 \approx -1.5936, \qquad P_i = e^{y_0} \approx 0.2032, \qquad R_i = \frac{1}{P_i} \approx 5.$$

Thus, the optimal sampling number and the optimal transition probabilities are independent of the rare event probability. For example, if P = 10^{-12} and C = 10^3, then M = 16, P_i ≈ 0.2, R_i = 5, and N = 59.
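As an illustration, the idealized Bernoulli splitting scheme just described can be sketched in a few lines (the function name and the toy parameter values below are ours, chosen for a fast run, not the optimal ones):

```python
import random

def splitting_estimate(P0, R, M, N, rng):
    """One run of the idealized splitting scheme: M + 1 Bernoulli(P0) stages,
    each success below the final level A being split into R retrials."""
    trials = N
    for stage in range(M + 1):
        successes = sum(1 for _ in range(trials) if rng.random() < P0)
        trials = successes * R if stage < M else successes
    return trials / (N * R ** M)       # unbiased estimator N_A / (N R_1 ... R_M)

rng = random.Random(1)
P0, R, M, N = 0.2, 5, 9, 1000          # toy values: P = P0^(M+1) ~ 1.0e-7
runs = 20
est = sum(splitting_estimate(P0, R, M, N, rng) for _ in range(runs)) / runs
print(est, P0 ** (M + 1))              # averaged estimate vs. exact P
```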

Example 1.1: To analyze the behavior of the different implementations described earlier, we perform a simulation experiment using these methods. We consider a queuing network and want to estimate the probability of full buffer occupancy in the finite-buffer queuing system M/M/1/C_0. The results are presented in Figure 1. As expected, and since we proceed for a given cost C (C = 10^4), the crude simulation stops after a few iterations, the number of samples run at the beginning being insufficient. However, note that the splitting simulation and the theoretical analysis give very close results.

Figure 1: Comparison between the different methods: queuing theory model. Level of confidence 95/100.

Example 1.2: This model can be applied to approximate counting (see Jerrum and Sinclair [7] and Diaconis and Holmes [5]). Given a positive real vector a = (a_i)_{i=1}^n and a real number b, we want to estimate the number of 0–1 vectors x = (x_i)_{i=1}^n such that

$$a \cdot x = \sum_{i=1}^{n} a_i x_i \le b.$$

For more details, see Section 3.2.

Remark 1.1: Hereafter we will take all the R_i equal to R and all the P_i equal to P_0 = 1/R. Thus, RP_0 = 1.

The aim of the article is to give a precise confidence interval for P̂. The bound involving the variance of P̂ and given by the Markov inequality is not precise enough. Therefore, as done in the theory of large deviations, we introduce the Laplace transform of P̂, which can be rewritten as

$$\mathbb{E}\big[e^{-\lambda \hat P}\big] = \Big[g\big(f_M\big(e^{-\lambda/(N R^M)}\big)\big)\Big]^N,$$

where f_M is the Mth functional iterate of a Bin(R, P_0) generating function (g.f.) and g is the g.f. of a Ber(P_0). The elementary theory of branching processes leads to precise bounds on f_M and to a precise confidence interval that we can compare to the confidence interval obtained using only the variance. For example, for P = 10^{-9}, C = 10^8, and α = 0.02, the variance gives a bound of about 10^{-2} and the Laplace transform gives a bound of about 10^{-12}.

The article is organized as follows. Section 2 describes the importance splitting model, presents our model and goals (the analysis of the behavior of the probability P of a rare event), and introduces an estimator P̂. Section 3 is dedicated to the optimization of the algorithm. In Section 4 we obtain a precise confidence interval for the estimator via branching processes. Finally, in Section 5 we conclude and discuss the merits of this approach and potential directions for further research.

2. IMPORTANCE SPLITTING MODEL

Our goal is to estimate the probability of a rare event A corresponding, for example, to the hit of a certain level L by a process X(t). The main hypothesis is that before entering the target event, there exist intermediate states visited more frequently than A by the trajectory. Thus, define a sequence of sets of states B_i such that A = B_{M+1} ⊂ B_M ⊂ ··· ⊂ B_1, which determines a partition of the state space into regions B_i \ B_{i+1} called the importance regions (as represented in Figure 2). In general, these sets are defined through a function Φ, called the importance function, from the state space to ℝ, such that for all i, B_i = {Φ ≥ T_i} for some values T_i called thresholds, with T_1 ≤ T_2 ≤ ··· ≤ T_M ≤ L.

Figure 2: Splitting model.

In this model a more frequent occurrence of the rare event is achieved by performing a number of simulation retrials when the process enters regions where the chance of occurrence of the rare event is higher. The fundamental idea consists of generating N Bernoulli Ber(P_1) trials and checking whether the subset B_1 is reached. If so, we duplicate the trials in R_1 retrials of Bernoulli Ber(P_2) and check whether the subset B_2 is reached, and so on. If none of the higher levels is reached, the simulation stops.

Thus, by the Bayes formula,

$$P = P(A) = P(B_{M+1} \mid B_M)\, P(B_M \mid B_{M-1}) \cdots P(B_2 \mid B_1)\, P(B_1) = P_{M+1} P_M \cdots P_1.$$

Then P is the product of M + 1 quantities (conditional probabilities) that are easier to estimate, and with more accuracy, than the probability P of the rare event itself, for a given simulation effort.

The estimator P̂ defined in (2) can be rewritten as

$$\hat P = \frac{1}{N R_1 \cdots R_M} \sum_{i_0=1}^{N} \sum_{i_1=1}^{R_1} \cdots \sum_{i_M=1}^{R_M} 1_{i_0}\, 1_{i_0 i_1} \cdots 1_{i_0 i_1 \cdots i_M},$$

where $1_{i_0 i_1 \cdots i_j}$ represents the result of the jth trial. In that case,

$$N_A = \sum_{i_0=1}^{N} \sum_{i_1=1}^{R_1} \cdots \sum_{i_M=1}^{R_M} 1_{i_0}\, 1_{i_0 i_1} \cdots 1_{i_0 i_1 \cdots i_M}.$$

Moreover, we define P = P(A) as the probability of reaching A, and we suppose that the process forgets the past after reaching a level; this holds as soon as the process is Markov.

3. STUDY OF THE VARIANCE AND OPTIMIZATION

3.1. Variance of the Estimator

First, note that P̂ is unbiased, since

$$\mathbb{E}[\hat P] = \frac{\mathbb{E}[N_A]}{N R_1 \cdots R_M} = \frac{N (P_1 R_1)(P_2 R_2) \cdots (P_M R_M)\, P_{M+1}}{N R_1 \cdots R_M} = P_1 P_2 \cdots P_{M+1} = P.$$
As done in [11], the variance of the estimator P̂ is derived by induction; the variance for k thresholds is given by

$$\operatorname{Var}(\hat P_k) = \frac{P_{k+1|0}^2}{N} \sum_{i=0}^{k} \frac{1 - P_{i+1}}{r_i\, P_{i+1|0}}, \qquad (12)$$

where P̂_k represents the estimator of P in a simulation with k thresholds.

Clearly, the formula holds in straightforward simulation (i.e., when k = 0), since P̂_0 is a renormalized sum of i.i.d. Bernoulli variables with parameter P, so that Var(P̂_0) = P(1 − P)/N.

To go from k to k + 1, assume (12); we then have to prove that this formula holds for k + 1 thresholds. First, note that for all random variables X and Z that are independent conditionally on the event B, with X σ(B)-measurable and X the indicator of B, we have

$$\operatorname{Var}(XZ) = \mathbb{E}[X]\operatorname{Var}(Z \mid B) + \operatorname{Var}(X)\,\mathbb{E}[Z \mid B]^2. \qquad (13)$$
Now let

$$\hat P_{k+1} = \frac{1}{N} \sum_{i_0=1}^{N} X_{i_0} Z_{i_0}.$$

The random variables X_{i_0} are i.i.d. with common law Ber(P_1), and conditionally on the event B_1, X_{i_0} and Z_{i_0} are independent. Note that each Z_{i_0} is the estimator of P/P_1 in a model with k thresholds, T_2 to T_{k+1}, for the trajectory issued from the success of X_{i_0}. Thus,

$$\mathbb{E}[Z_{i_0} \mid B_1] = P_2 \cdots P_{k+2} = \frac{P_{k+2|0}}{P_1},$$

and by the induction hypothesis (applied with R_1 initial particles and transition probabilities P_2, …, P_{k+2}),

$$\operatorname{Var}(Z_{i_0} \mid B_1) = \frac{(P_{k+2|0}/P_1)^2}{R_1} \sum_{j=0}^{k} \frac{1 - P_{j+2}}{r_j'\, P'_{j+1|0}}, \qquad r_j' = R_2 \cdots R_{j+1}, \quad P'_{j|0} = P_2 \cdots P_{j+1}.$$

So applying (13) with X ∼ Ber(P_1) and Z ≡ Z_{i_0}, and using $P_1 R_1\, r_j' P'_{j+1|0} = r_{j+1} P_{j+2|0}$, we have

$$\operatorname{Var}(\hat P_{k+1}) = \frac{P_{k+2|0}^2}{N} \sum_{i=0}^{k+1} \frac{1 - P_{i+1}}{r_i\, P_{i+1|0}}.$$

Thus, for M thresholds,

$$\operatorname{Var}(\hat P) = \frac{P^2}{N} \sum_{i=0}^{M} \frac{1 - P_{i+1}}{r_i\, P_{i+1|0}}.$$
Remark 3.1: The induction principle has a concrete interpretation: if, in a simulation with M steps, the retrials generated at the first level are not taken into account except for one, which we call the main trial, then we have a simulation with M − 1 steps.
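As a sanity check, one can compare the empirical variance of the estimator over many independent runs with the variance formula above (the formula as written here is our reconstruction, and the run parameters below are arbitrary toy values):

```python
import random

def splitting_estimate(P0, R, M, N, rng):
    """Idealized splitting run (as in Section 1), returning N_A / (N R^M)."""
    trials = N
    for stage in range(M + 1):
        successes = sum(1 for _ in range(trials) if rng.random() < P0)
        trials = successes * R if stage < M else successes
    return trials / (N * R ** M)

rng = random.Random(2)
P0, R, M, N, runs = 0.3, 3, 4, 200, 2000
samples = [splitting_estimate(P0, R, M, N, rng) for _ in range(runs)]
mean = sum(samples) / runs
var = sum((x - mean) ** 2 for x in samples) / (runs - 1)

# Var(P_hat) = (P^2 / N) * sum_{i=0}^{M} (1 - P_{i+1}) / (r_i * P_{i+1|0}),
# with P_i = P0 and R_i = R, so r_i = R^i and P_{i+1|0} = P0^(i+1):
P = P0 ** (M + 1)
theory = (P ** 2 / N) * sum((1 - P0) / (R ** i * P0 ** (i + 1)) for i in range(M + 1))
print(var, theory)     # the two values should roughly agree
```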

3.2. Optimization of the Parameters

As stated in Section 1, our aim is to minimize the variance for a fixed budget, which yields optimal values for N, R_1, …, R_M, P_1, …, P_{M+1}, and M. Therefore, we have to describe the cost of a given simulation: each time a particle is launched, it generates an average cost h, a function of the transition probability. We assume the following:

  • The cost h for a particle to reach B_i starting from B_{i−1} depends only on P_i (not on the starting level).
  • h is decreasing in x (the smaller the transition probability, the harder the transition and the higher the cost).
  • h is nonnegative.
  • h(x) converges to a constant (in general, small) as x converges to 1.

The (average) cost is then

$$C = \mathbb{E}\left[N\, h(P_1) + \sum_{i=1}^{M} N_i R_i\, h(P_{i+1})\right],$$

where N_i is the number of trials that have reached threshold i. Finally, since $\mathbb{E}[N_i] = N r_{i-1} P_{i|0}$,

$$C = N \sum_{i=0}^{M} r_i P_{i|0}\, h(P_{i+1}).$$
Example 3.1: We want to study the model of the simple random walk on ℤ starting from zero that we kill as soon as it reaches the level −1 or k (success if we reach k, failure otherwise).

So let $X_n = \sum_{i=1}^{n} Y_i$, where {Y_n} is a sequence of i.i.d. random variables valued in {−1, 1} with

$$P(Y_n = 1) = P(Y_n = -1) = \tfrac{1}{2},$$

and define T_k = inf{n ≥ 0 : X_n = −1 or k}.

One can easily check that $X_n$ and $X_n^2 - n$ are martingales. By Doob's stopping theorem, $\mathbb{E}[X_{T_k}] = 0$ and $\mathbb{E}[X_{T_k}^2] = \mathbb{E}[T_k]$, which yields

$$P(X_{T_k} = k) = \frac{1}{k+1} =: p \qquad \text{and} \qquad \mathbb{E}[T_k] = k = \frac{1}{p} - 1$$

(i.e., the cost needed to reach the next level is (1/p) − 1 if p is the success probability).
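A quick simulation confirms the two identities used here, p = 1/(k + 1) and E[T_k] = k = 1/p − 1 (the parameter values are arbitrary):

```python
import random

def walk_to_barrier(k, rng):
    """Symmetric random walk from 0, killed at -1 or k.
    Returns (reached k?, number of steps taken)."""
    x, steps = 0, 0
    while -1 < x < k:
        x += 1 if rng.random() < 0.5 else -1
        steps += 1
    return x == k, steps

rng = random.Random(3)
k, runs = 4, 20000
results = [walk_to_barrier(k, rng) for _ in range(runs)]
p_hat = sum(hit for hit, _ in results) / runs
mean_steps = sum(steps for _, steps in results) / runs
print(p_hat, 1 / (k + 1))      # success probability p = 1/(k+1)
print(mean_steps, k)           # average cost E[T_k] = k = 1/p - 1
```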

To minimize the variance of P̂, the optimal values are derived in three steps:

  1. The optimal values of N, R_1, …, R_M are derived when P_1, …, P_{M+1} are considered fixed (i.e., the thresholds B_i are fixed).
  2. Replacing these optimal values in the variance, we derive the optimal transition probabilities P_1, …, P_{M+1}.
  3. Replacing these optimal values in the variance, we derive M, the optimal number of thresholds.

Optimal values for N, R_1, …, R_M. Using the method of Lagrange multipliers (minimizing the variance under the cost constraint), we get

$$N r_i = \frac{C}{P_{i|0}} \sqrt{\frac{1-P_{i+1}}{P_{i+1}\, h(P_{i+1})}} \Bigg/ \sum_{j=0}^{M} \sqrt{\frac{(1-P_{j+1})\, h(P_{j+1})}{P_{j+1}}}, \qquad i = 0, \ldots, M.$$
Optimal values for P_1, …, P_{M+1}. Thus, the variance becomes

$$\operatorname{Var}(\hat P) = \frac{P^2}{C} \left( \sum_{j=0}^{M} \sqrt{\frac{(1-P_{j+1})\, h(P_{j+1})}{P_{j+1}}} \right)^{2}.$$

Proceeding as previously under the constraint P = P_1 ··· P_{M+1}, we obtain that all of the P_i's satisfy the same equation, of the form G(P_i) = λ. If we assume that there exists a unique solution to this equation, we have P_i = g(λ); hence, P = g(λ)^{M+1} and g(λ) = P^{1/(M+1)}. Finally,

$$P_i = P^{1/(M+1)}, \qquad i = 1, \ldots, M+1.$$
Optimal value for M. The optimal values for P_1, …, P_{M+1} imply that the optimal R_i becomes 1/P_i, i = 1, …, M; thus,

$$\operatorname{Var}(\hat P)\, C = P^2 (\ln P)^2\, \frac{(e^{-y} - 1)\, h(e^{y})}{y^2}, \qquad y := \frac{\ln P}{M+1},$$

which we want to minimize in M. Note that R_i P_i = 1. Let

$$f(y) = \frac{(e^{-y} - 1)\, h(e^{y})}{y^2},$$

whose derivative cancels in

$$F(y) := y\left[(1 - e^{y})\, h'(e^{y}) - e^{-y}\, h(e^{y})\right] - 2(e^{-y} - 1)\, h(e^{y}) = 0. \qquad (30)$$

In general, this does not give an integer. We have y_0 = ln P/(M + 1) (i.e., M + 1 = [ln P/y_0] or [ln P/y_0] + 1). Let ln P/y_0 = n + x with 0 < x < 1. Then the following hold:

  • If we take M + 1 = n, then y = ln P/n.
  • If we take M + 1 = n + 1, then y = ln P/(n + 1).

The value of the ratio ρ := f(n − 1)/f(n), where by abuse of notation f(M) denotes f(ln P/(M + 1)), gives the best choice for M as follows:

  • If ρ < 1, M = n − 1.
  • If ρ > 1, M = n.

Thus, the optimal number of thresholds is given by M = [ln P/y_0] − 1 or M = [ln P/y_0], where y_0 solves F(y) = 0. Then M minimizes

$$f\!\left(\frac{\ln P}{M+1}\right).$$
Example 3.2: For h = 1, we have to solve y = 2(e^y − 1). We get y_1 = 0 and y_2 ≈ −1.5936; y_2 is a minimum, and the optimal value of M is

$$M = \left[\frac{\ln P}{y_2}\right] - 1 \quad \text{or} \quad \left[\frac{\ln P}{y_2}\right].$$

With P = 10^{-k}, we have

$$M \approx \left[\frac{k \ln 10}{1.5936}\right] - 1 \approx [1.445\, k] - 1.$$

Note that M increases as P decreases, and with this value of M, each R_i and P_i become

$$P_i = e^{y_2} \approx 0.2032, \qquad R_i = \frac{1}{P_i} \approx 4.92.$$
Thus, the optimal sampling number and the optimal transition probabilities are independent of the rare event probability.

Moreover, asymptotically, M = n = [ln P/y_0] − 1; thus, since $r_i P_{i|0} = 1$ at the optimum, the budget constraint reduces to $C = N(M+1)\, h(e^{y_0})$, and

$$N = \frac{C}{(M+1)\, h(e^{y_0})}.$$
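Putting the three steps together for h = 1 gives a small calculator for the optimal parameters. The following sketch solves y = 2(e^y − 1) by bisection and uses the budget relation C = N(M + 1) reconstructed above (an assumption of this sketch); with P = 10^{-12} and C = 10^3 it recovers the values quoted in the Introduction:

```python
import math

def F(y):
    """Optimality condition (30) for h = 1: y - 2*(e^y - 1) = 0."""
    return y - 2 * math.expm1(y)

def bisect(f, lo, hi, tol=1e-12):
    """Plain bisection; assumes f changes sign on [lo, hi]."""
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if f(lo) * f(mid) > 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

y0 = bisect(F, -2.0, -1.0)                 # nonzero root, y0 ~ -1.5936
P_target, C = 1e-12, 1e3
M = round(math.log(P_target) / y0) - 1     # optimal number of thresholds
P_i = P_target ** (1 / (M + 1))            # optimal transition probability
R_i = 1 / P_i                              # optimal splitting number
N = C / (M + 1)                            # budget constraint for h = 1
print(y0, M, P_i, R_i, N)                  # -1.5936..., 16, ~0.197, ~5.08, ~58.8
```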
Application 3.1: In approximate counting, remember that the goal is to estimate the number of Knapsack solutions (i.e., the cardinality of Ω defined by

$$\Omega = \{x \in \{0,1\}^n : a \cdot x \le b\}$$

for a given positive real vector a = (a_i)_{i=1}^n and real number b). We might try to apply the Markov chain Monte Carlo (MCMC) method [9]: construct a Markov chain with state space Ω and transitions from each state x = (x_1, …, x_n) ∈ Ω defined by the following:

  • With probability ½, let y = x; otherwise,
  • select i uniformly at random in {1, …, n} and let y′ = (x_1, …, x_{i−1}, 1 − x_i, x_{i+1}, …, x_n);
  • if a · y′ ≤ b, then let y = y′; else let y = x.

The new state is y. This random walk on the hypercube truncated by the hyperplane a · x = b converges to the uniform distribution over Ω. This suggests a procedure for selecting Knapsack solutions almost uniformly at random: starting in state (0, …, 0), simulate the chain for sufficiently many steps that the distribution over states is "close" to uniform;¹ then return the current state. Of course, sampling over Ω is not the same as estimating the size of Ω. However, the first task leads to the second.

¹ The problem is to bound the number of steps necessary to make the Markov chain "close" to stationarity. More precisely, we need a bound on the mixing time

$$\tau(\epsilon) = \max_x \min\{t : \Delta_x(t') \le \epsilon \text{ for all } t' \ge t\},$$

where Δ_x(t) = max_{S⊂Ω} |P^t(x, S) − Π(S)| and Π is the stationary distribution. In [7], an explicit bound on the number of steps that suffice is derived.
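A minimal sketch of this walk (the function names are ours; the transition rule is the one in the list above):

```python
import random

def knapsack_step(x, a, b, rng):
    """One transition of the lazy walk on the truncated hypercube."""
    if rng.random() < 0.5:                   # with probability 1/2, stay put
        return x
    i = rng.randrange(len(a))                # pick a coordinate uniformly
    y = list(x)
    y[i] = 1 - y[i]                          # flip it
    if sum(ai * yi for ai, yi in zip(a, y)) <= b:
        return tuple(y)                      # accept: still a Knapsack solution
    return x                                 # reject: stay at x

def sample(a, b, n_steps, rng):
    """Run the chain from (0,...,0) for n_steps and return the current state."""
    x = tuple(0 for _ in a)
    for _ in range(n_steps):
        x = knapsack_step(x, a, b, rng)
    return x

rng = random.Random(4)
print(sample([1, 2, 3, 4], 3, 1000, rng))    # an (almost) uniform Knapsack solution
```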

Keep the vector a fixed but allow b to vary. Use Ω(b) instead of Ω to emphasize the dependence on b. Assume without loss of generality that a_1 ≤ ··· ≤ a_n and define

$$b_1 = 0, \qquad b_{i+1} = \min\Big\{b, \sum_{j=1}^{i} a_j\Big\}, \quad i = 1, \ldots, n.$$

One can check that

$$|\Omega(b_i)| \ge \frac{|\Omega(b_{i+1})|}{n+1}.$$

Now write

$$|\Omega(b)| = \frac{|\Omega(b_{n+1})|}{|\Omega(b_n)|} \cdot \frac{|\Omega(b_n)|}{|\Omega(b_{n-1})|} \cdots \frac{|\Omega(b_2)|}{|\Omega(b_1)|} \cdot |\Omega(b_1)|, \qquad |\Omega(b_1)| = 1.$$

The ratio ρ_i = |Ω(b_i)|/|Ω(b_{i+1})| may be estimated by sampling almost uniformly from Ω(b_{i+1}) using the Markov chain and computing the fraction of the samples that lie within Ω(b_i).

Now take a = [1, 2, 3, 4], b = 3, h = 1, R = 5, and C = 2600. We chose the levels as follows: first, define b_1 = 0, b_2 = 1, b_3 = 3, b_4 = 3, and b_5 = b; second, define B_0 = Ω, B_1 = Ω(b_4), B_2 = Ω(b_3), B_3 = Ω(b_2), and B_4 = Ω(b_1). Thus, here, M = n − 1, N = C/n, and n_step = 1020. Obviously, Card(Ω) = 5. We run three different simulations: the first, suggested in [7], consists of estimating the n ratios independently; the others are the crude and splitting simulations. We obtain the following estimates of Card(Ω):

  • Estimation by crude simulation = 4.088
  • Estimation by the n ratios independently = 5.44
  • Estimation by splitting = 5.019

Even though the levels are not optimal, splitting provides an improvement.
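For concreteness, here is a sketch of the "n ratios" estimator on this toy instance (sample sizes and step counts are arbitrary choices of ours; the true count is 5):

```python
import random

def step(x, a, b, rng):
    """One move of the truncated-hypercube walk (see the sketch above)."""
    if rng.random() < 0.5:
        return x
    i = rng.randrange(len(a))
    y = list(x)
    y[i] = 1 - y[i]
    return tuple(y) if sum(u * v for u, v in zip(a, y)) <= b else x

def estimate_count(a, levels, n_samples, n_steps, rng):
    """|Omega(b)| as a telescoping product of ratios |Omega(b_i)|/|Omega(b_{i+1})|,
    each ratio estimated by sampling almost uniformly from the larger set."""
    est = 1.0                                    # |Omega(b_1)| = |Omega(0)| = 1
    for lo, hi in zip(levels, levels[1:]):
        inside = 0
        for _ in range(n_samples):
            x = tuple(0 for _ in a)
            for _ in range(n_steps):             # approximate sampling from Omega(hi)
                x = step(x, a, hi, rng)
            inside += sum(u * v for u, v in zip(a, x)) <= lo
        est *= n_samples / max(inside, 1)        # multiply by the inverse ratio
    return est

rng = random.Random(5)
print(estimate_count([1, 2, 3, 4], [0, 1, 3, 3, 3], 200, 200, rng))  # close to 5
```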

Let us describe briefly the possible solutions of (30). Remember that we want to solve (30); that is, if z = e^y and z ≠ 0, 1,

$$H(z) = L(z), \qquad H(z) := \frac{z(1-z)\, h'(z)}{h(z)}, \qquad L(z) := \frac{\ln z - 2(z-1)}{\ln z}. \qquad (37)$$

First, let z_0 be the solution of 2(z − 1) = ln z. Since h′ ≤ 0, H is negative, and a quick survey shows that L is positive on ]0, z_0[ and negative on ]z_0, 1[. As a consequence, the solutions of (37) lie in ]z_0, 1[, if they exist. Thus, solving (30) is equivalent to studying the intersections between H and L. A quick survey of these functions shows that we have two cases (see Figure 3):

Case 1: An odd number of intersections between L and H

Case 2: An even number of intersections or 0 between L and H

Note that y = 0 is a solution of (30). In case 1, it corresponds to a maximum, and in case 2, it corresponds to a minimum. The second case is excluded since we made the assumption h(1) > 0.

Figure 3: Behavior of H and L.

Remark 3.2: The solution y = 0 corresponds to the following optimal values:

$$P_i = e^{0} = 1, \qquad R_i = \frac{1}{P_i} = 1.$$

However, P_i = 1 implies that P = 1, and R_i = 1 means that we just perform a crude simulation.

Example 3.3: Here, P = 10^{-12} and C = 10^4.

1. For the cost of Example 3.1, h(x) = (1/x) − 1, we have h(1) = 0 and we are in the second case: the unique solution y = 0 is the minimum.

2. Let h(x) = 1/x − 8x² + 12x − 5. Then h(1) = 0 and we are in the first case: y = 0 and y ≈ −0.9919 are the solutions; y = 0 is the maximum and the other solution is the minimum. Taking y ≈ −0.9919, we obtain

$$P_i = e^{y} \approx 0.371, \qquad R_i = \frac{1}{P_i} \approx 2.70,$$

and we can take R = 3 and N = 23.

3. Let h(x) = (1/x − 1)² e^{6x}. Then h(1) = 0 and we are in the second case: y = 0, y_1 ≈ −0.4612, and y_2 ≈ −0.5645 are the solutions; y = 0 is the minimum and the second solution is the maximum.

4. Let h(x) = 1/x. Here, h(1) = 1. We want to solve (30), whose solutions are y = 0 and y ≈ −0.6438. Taking y ≈ −0.6438, we obtain

$$P_i = e^{y} \approx 0.525, \qquad R_i = \frac{1}{P_i} \approx 1.90,$$

and we can take R = 2 and N = 34.

Thus, the control of the variance of P̂ gives a crude confidence interval for P. Indeed, by Chebyshev's inequality, we get

$$P\!\left(\left|\frac{\hat P}{P} - 1\right| \ge \alpha\right) \le \frac{\operatorname{Var}(\hat P)}{\alpha^2 P^2} = \frac{(M+1)(1 - P_0)}{N \alpha^2 P_0}$$

(using the optimal values, for which $r_i P_{i+1|0} = P_0$). This estimation is, in general, useless. For example, for h = 1, M = 12, and α = 10^{-2}, the upper bound becomes ≈ 5 × 10^5/N. To obtain a bound lower than 1, we need N ≥ 5 × 10^5. To improve it, we will use Chernoff's bounding method instead of the Markov inequality: for all λ > 0,

$$P(\hat P \le x) = P\big(e^{-\lambda \hat P} \ge e^{-\lambda x}\big) \le e^{\lambda x}\, \mathbb{E}\big[e^{-\lambda \hat P}\big] = e^{\lambda x + \psi(\lambda)},$$

where $\psi(\lambda) = \ln \mathbb{E}[e^{-\lambda \hat P}]$ is the log-Laplace transform of P̂. Optimization on λ > 0 provides

$$P(\hat P \le x) \le \exp\Big(\inf_{\lambda > 0}\,\big[\lambda x + \psi(\lambda)\big]\Big).$$

Similarly, a bound for the upper tail $P(\hat P \ge x)$ is obtained with λ < 0. Let ψ* be the Cramér transform of ψ: ψ*(τ) = sup_λ [λτ − ψ(λ)]. Thus,

$$P(\hat P \le x) \le e^{-\psi^*(-x)}.$$

So we want to obtain an accurate lower bound of ψ*.

Remark 3.3: Although we would therefore like to take R_i so that R_i P_i = 1, we are constrained to choose R_i to be a positive integer. Hereafter, we suppose that we are in the optimal case, where R_i = 1/P_i is an integer.

4. LAPLACE TRANSFORM OF THE ESTIMATOR P̂

To study the Laplace transform of P̂, we turn to the theory of branching processes (see Harris [6], Lyons [8], and Athreya and Ney [2]). More precisely, we consider our splitting model as a Galton–Watson process, the thresholds representing the different generations.

4.1. Description of the Model and First Results

We consider a Galton–Watson model (Z_n), where the size of the nth generation, Z_n, is the number of particles that have reached the level B_n, with one particle run at the beginning. Then Z_0 = 1 and Z_n satisfies the recurrence relation

$$Z_{n+1} = \sum_{i=1}^{Z_n} X_i^{(n)},$$

where $X_i^{(n)}$ is the number of particles among R_n that have reached the (n + 1)st level. The $(X_i^{(n)})_{i \ge 1}$ are i.i.d. with common law Binomial with parameters (R_n, P_{n+1}), and $X_i^{(0)} \sim \mathrm{Ber}(P_1)$. Take the optimal values of Section 3.2: R_i = R and P_i = P_0 = 1/R.

Let f be the g.f. of a Bin(R, P_0) and g the g.f. of a Ber(P_0), so that g is the g.f. of Z_1 and the g.f. of Z_{n+1} is $g \circ f_n$, where $f_n$ denotes the nth functional iterate of f. Since $\hat P = N_A/(N R^M)$ and N_A is the sum of N i.i.d. copies of Z_{M+1}, we get

$$\mathbb{E}\big[e^{-\lambda \hat P}\big] = \Big[g\big(f_M\big(e^{-\lambda/(N R^M)}\big)\big)\Big]^N.$$

Thus, we are interested in the expression of f_M, the Mth functional iterate of f.

Here,

$$m := f'(1) = R P_0 = 1,$$

so we are in the critical case of the branching process, which ensures that the simulation algorithm stops with probability 1 as M → ∞ (see [6]), since if $f^{(3)}(1) < \infty$,

$$P(Z_n > 0) \sim \frac{2}{n f''(1)} \qquad (n \to \infty).$$

This emphasizes the rarity character when the number M of thresholds increases and the probabilities between the levels decrease.

In our case,

$$f(s) = (P_0 s + 1 - P_0)^R, \qquad f''(1) = R(R-1)P_0^2 = 1 - P_0.$$
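Numerically, the critical behavior is easy to observe: iterating f from s = 0 gives the extinction probability P(Z_n = 0) = f_n(0), whose complement decays like Kolmogorov's estimate (here R = 5 and P_0 = 0.2, as in the optimal h = 1 design):

```python
def f(s, R=5, P0=0.2):
    """Generating function of Bin(R, P0); critical since R * P0 = 1."""
    return (P0 * s + 1 - P0) ** R

alpha1 = 5 * 4 * 0.2 ** 2 / 2            # f''(1)/2 = (1 - P0)/2 = 0.4
for n in (10, 100, 1000):
    s = 0.0
    for _ in range(n):
        s = f(s)                          # s = f_n(0) = P(Z_n = 0)
    print(n, 1 - s, 1 / (n * alpha1))     # survival probability vs 1/(n*alpha1)
```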
The iterated function fM has no explicit tractable form and we will derive bounds for fM(s) around s = 1. To do this, we state a general result on the Laplace transform in critical Galton–Watson models, which we could not find in the literature.

4.2. Bounds of f_n(s) for 0 ≤ s < 1 and m = 1

Remark 4.1: Remember that f_n and its derivatives are convex. Furthermore, for all 0 ≤ s ≤ 1, s ≤ f(s) ≤ f(1) = 1, and by induction, f(s) ≤ f_2(s) ≤ ··· ≤ 1. Finally, we obtain f_n(s) → 1, since f_n(s) ≥ f_n(0) = P(Z_n = 0) → 1 in the critical case.

Proposition 4.1: Let α_1 = f′′(1)/2, C = (max_{s∈[0,1]} f′′′(s))/(6α_1), and γ_n = nα_1[1 − (C/n)(log n + 1)] − α_1. Then, for s close to 1 and large n,

$$1 - \frac{1-s}{1 + \gamma_n(1-s)} \;\le\; f_n(s) \;\le\; 1 - \frac{1-s}{1 + n\alpha_1\left[1 + \frac{C}{n}(\log n + 1)\right](1-s)}. \qquad (58)$$
Proof: Upper bound: Using Taylor's expansion, with f_n(s) ≤ θ_n ≤ f_n(1) = 1,

$$1 - f_{n+1}(s) = 1 - f(f_n(s)) = \big(1 - f_n(s)\big) - \frac{f''(\theta_n)}{2}\,\big(1 - f_n(s)\big)^2,$$

since f′(1) = 1. Let r_n = 1 − f_n(s); r_n satisfies

$$r_{n+1} = r_n - \frac{f''(\theta_n)}{2}\, r_n^2.$$

Now let α_0 = f′′(0)/2. Define the decreasing sequences (a_n) and (b_n) satisfying

$$a_0 = b_0 = 1 - s, \qquad a_{n+1} = a_n - \alpha_1 a_n^2, \qquad b_{n+1} = b_n - \alpha_0 b_n^2.$$

Then

$$a_n \le r_n \le b_n. \qquad (63)$$

1. b_n's upper bound: Since 0 ≤ b_j ≤ 1, we have

$$\frac{1}{b_{j+1}} = \frac{1}{b_j} \cdot \frac{1}{1 - \alpha_0 b_j} \ge \frac{1}{b_j} + \alpha_0.$$

Thus,

$$b_n \le \frac{b_0}{1 + n\alpha_0 b_0}.$$

2. a_n's lower bound: Apply this upper bound to a_n (α_0 becoming α_1):

$$a_n \le \frac{a_0}{1 + n\alpha_1 a_0}. \qquad (66)$$

By substituting (66) in

$$\frac{1}{a_{j+1}} = \frac{1}{a_j} + \alpha_1 + \frac{\alpha_1^2\, a_j}{1 - \alpha_1 a_j},$$

we get

$$\frac{1}{a_n} \le \frac{1}{a_0} + n\alpha_1\left[1 + \frac{C}{n}(\log n + 1)\right]. \qquad (67)$$

Finally, (63) and (67) lead to the upper bound of f_n in (58).

Lower bound: For γ > 0, write $h_\gamma(s) = 1 - \frac{1-s}{1+\gamma(1-s)}$, so that the left-hand side of (58) is $h_{\gamma_n}(s)$. In fact, we prove by induction that

$$f_n(s) \ge h_{\gamma_n}(s) \qquad \text{for } 1 - \frac{1}{n} \le s \le 1. \qquad (68)$$

For n = 1, the left-hand side of (68) is given by Remark 4.1. Then note that $\gamma_{n+1} - \gamma_n \to \alpha_1 > 0$; thus, for n large enough, $\gamma_{n+1} \ge \gamma_n$. For all 1 − 1/(n + 1) ≤ s ≤ 1, we also have 1 − (1/n) ≤ s ≤ 1. However, by induction, we have $f_n(s) \ge h_{\gamma_n}(s)$, and so, since f is increasing, taking $s' = h_{\gamma_n}(s)$,

$$f_{n+1}(s) = f(f_n(s)) \ge f\big(h_{\gamma_n}(s)\big)$$

(i.e., $f_{n+1}(s) \ge h_{\gamma'}(s)$, where γ′ is obtained from a Taylor expansion of f at s′). Note that f(h_{γ_n}(s)) and h_{γ_{n+1}}(s) agree to first order at s = 1; more precisely, we have

$$f\big(h_{\gamma_n}(s)\big) \ge h_{\gamma_{n+1}}(s) \qquad \text{for } 1 - \frac{1}{n+1} \le s \le 1,$$

and we finally obtain the left-hand side of (58) since γ ↦ h_γ is increasing. █

In the particular case of $f(s) = (P_0 s + 1 - P_0)^R$, we can derive a more precise lower bound:

Proposition 4.2: For s close to 1,

$$f_n(s) \ge 1 - \frac{1-s}{1 + n\alpha_1(1-s)}. \qquad (74)$$

Observe that this is precise at s = 1.

Proof: Let h(s) = 1 − ((1 − s)/(1 + α_1(1 − s))). Since f(1) = h(1) = 1, f′(1) = h′(1) = 1, and f′′(1) = h′′(1) = 2α_1, the sign of f − h near 1 depends on the sign of the third derivative of f − h at 1, which is negative here; then h ≤ f near 1. Since f is increasing and the nth iterate of h is of the same form with α_1 replaced by nα_1, we deduce (74) by induction. █

We plot in Figure 4a the upper bound and the two lower bounds for P = 10^{-12} and s near 1.
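The bounds can also be checked by brute-force iteration; for instance, the fractional-linear lower bound of Proposition 4.2 (the comparison below uses R = 5 and P_0 = 0.2, our running toy values):

```python
def f(s, R=5, P0=0.2):
    return (P0 * s + 1 - P0) ** R          # critical Bin(R, P0) g.f.

def f_iter(s, n):
    for _ in range(n):
        s = f(s)
    return s

def h_bound(s, gamma):
    """Fractional-linear comparison function 1 - (1-s)/(1 + gamma*(1-s))."""
    return 1 - (1 - s) / (1 + gamma * (1 - s))

alpha1 = (1 - 0.2) / 2
for n, s in ((10, 0.99), (50, 0.995), (200, 0.999)):
    print(n, h_bound(s, n * alpha1), f_iter(s, n))   # lower bound <= f_n(s)
```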

Figure 4: Bounds of f_M(s).

4.3. Bounds of f_n(s) for s ≥ 1 and m = 1

Remark 4.2: First, let us note that, by convexity, for all s ≥ 1,

$$f(s) \ge f(1) + f'(1)(s - 1) = s;$$

hence, f(s) ≥ s, and by induction on n,

$$s \le f(s) \le f_2(s) \le \cdots \le f_n(s).$$

We remark that for s > 1, the iterated function increases rapidly to infinity.

Proposition 4.3: Let γ_n′ = nα_1[1 + (C/n)(log n + 1)] − α_1. Then, for s close to 1 and large n,

$$1 + \frac{s-1}{1 - \gamma_n(s-1)} \;\le\; f_n(s) \;\le\; 1 + \frac{s-1}{1 - \gamma_n'(s-1)}.$$

Proof: Proceeding as in Proposition 4.1 leads to the upper bound. Here, f_n → ∞ as n → +∞, which prevents us from making a Taylor expansion around 1. To overcome this difficulty, consider k_n, the inverse function of f_n; it is the nth functional iterate of the g.f. k (the inverse function of f), which takes the value 1 at 1, whose derivative at 1 is 1 and whose second derivative is negative, and k_n → 1 as n → +∞. Thus, making a Taylor expansion and using the same tools as previously, we get

$$k_n(s) \le 1 + \frac{s-1}{1 + \gamma_n(s-1)}, \qquad 1 \le s \le s_n,$$

where β_2 = k′′(s)/2 and s_n := 1 + (1/nα_1). Using the link between k_n and f_n and the upper bound of k_n,

$$f_n(s) \ge 1 + \frac{s-1}{1 - \gamma_n(s-1)}.$$

The lower bound of k_n leads to an upper bound of f_n. However, it provides no improvement. █

As done earlier, we can derive a more precise upper bound in the particular case of $f(s) = (P_0 s + 1 - P_0)^R$:

Proposition 4.4: For s close to 1,

$$f_n(s) \le 1 + \frac{s-1}{1 - n\alpha_1(s-1)}.$$
We plot in Figure 4b these three bounds for P = 10^{-12} and s near 1.

About the geometric distribution. If the law of X is such that the probabilities p_k are in geometric proportion ($p_k = b c^{k-1}$ for k = 1, 2, …, and $p_0 = 1 - p_1 - p_2 - \cdots$, with b, c > 0 and b ≤ 1 − c), then the associated g.f. is a rational (fractional-linear) function:

$$f(s) = 1 - \frac{b}{1-c} + \frac{bs}{1 - cs}.$$

Taking b = (1 − c)² and c = α_1/(1 + α_1) leads to

$$f(s) = 1 - \frac{1-s}{1 + \alpha_1(1-s)}, \qquad f_n(s) = 1 - \frac{1-s}{1 + n\alpha_1(1-s)}.$$
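The reason the geometric case is explicit is that these fractional-linear maps compose additively in γ, so the nth iterate of $h_{\alpha_1}$ is exactly $h_{n\alpha_1}$; a two-line check:

```python
def h(s, gamma):
    return 1 - (1 - s) / (1 + gamma * (1 - s))

# Composing h(., a) after h(., b) gives h(., a + b), hence the closed-form iterate.
a, b, s = 0.4, 0.7, 0.3
print(h(h(s, b), a), h(s, a + b))    # identical up to rounding
```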
So we have compared the nth functional iterate of a Binomial g.f. to that of a geometric g.f. This suggests comparing the importance splitting models with Binomial and with geometric laws. The geometric-law model is set up in the following way: we run particles one after the other; as long as the next level is not reached, we keep on generating particles; then we start again from the level the particle is at (the geometric distribution being the law of the number of trials up to the first success).

This link is also stressed by Cosnard and Demongeot in [4]: for m = 1 and σ² = f′′(1) = 2α_1, the asymptotic behavior of f_n is the same as that of the iterates of the geometric g.f. with the same variance (i.e., $h_{n\alpha_1}$).

4.4. Optimization of the Cramér Transform

Remember that

$$\psi^*(\tau) = \sup_{\lambda}\,\big[\lambda \tau - \psi(\lambda)\big], \qquad \psi(\lambda) = \ln \mathbb{E}\big[e^{-\lambda \hat P}\big] = N \ln g\big(f_M\big(e^{-\lambda/(N R^M)}\big)\big).$$

Considering the gradient of the functions, we prove that the supremum for λ ≥ 0 is reached near zero. So we can use the upper bounds for f_M obtained in the previous subsections, which lead to lower bounds for ψ*:

$$\psi^*\big(-(1-\alpha)P\big) \ge N F(\alpha), \qquad \psi^*\big((1+\alpha)P\big) \ge N G(\alpha),$$

where F and G are obtained by replacing f_M with its bounds. Finally,

$$P\!\left(\left|\frac{\hat P}{P} - 1\right| \ge \alpha\right) \le e^{-N F(\alpha)} + e^{-N G(\alpha)}.$$

One can easily obtain explicit but complex expressions for F(x) and G(x). We plot in Figure 5 the upper bounds obtained by the variance and by the Laplace transform, for different values of α, the prescribed error of the confidence interval. We take P = 10^{-9} and the optimal values obtained above for the parameters. Note that the upper bound given by the Laplace transform is better than the bound given by Chebyshev's inequality with the variance. In the preceding example, where P = 10^{-9}, if we fix α = 0.05 and require a bound L close to 0.01, then the corresponding costs needed are 3 × 10^7 for the variance and 3 × 10^6 for the Laplace transform.
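To illustrate the orders of magnitude, the following sketch evaluates both bounds directly, using the product form of the Laplace transform from Section 4.1 (as reconstructed there, an assumption of this sketch) and a crude grid search in place of the exact optimization in λ. With P ≈ 10^{-9}, C = 10^8 (so M = 12 and N ≈ C/(M + 1)), and α = 0.02, it reproduces the ~10^{-2} versus ~10^{-12} gap quoted in the Introduction:

```python
import math

def f(s, R, P0):
    return (P0 * s + 1 - P0) ** R               # Bin(R, P0) g.f.

def psi(lmbda, N, R, P0, M):
    """log E[exp(-lambda * P_hat)] under the product form [g(f_M(.))]^N."""
    s = math.exp(-lmbda / (N * R ** M))
    for _ in range(M):
        s = f(s, R, P0)
    return N * math.log(P0 * s + 1 - P0)        # g(s) = P0*s + 1 - P0

R, P0, M = 5, 0.2, 12
N = 10 ** 8 // (M + 1)
P = P0 ** (M + 1)                               # 0.2^13 ~ 8.2e-10, close to 1e-9
alpha = 0.02

# Chebyshev bound from the variance (optimal design: r_i * P_{i+1|0} = P0):
cheb = (M + 1) * (1 - P0) / (N * alpha ** 2 * P0)

# Chernoff bound for the lower tail P(P_hat <= (1 - alpha) * P):
x = (1 - alpha) * P
best = 0.0                                      # lambda = 0 gives the trivial bound 1
for j in range(1, 4000):
    lam = j * 1e9
    best = min(best, lam * x + psi(lam, N, R, P0, M))
print(cheb, math.exp(best))                     # ~1.7e-2 vs. an exponentially small bound
```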

Figure 5: Upper bounds obtained by the variance and the Laplace transform.

5. CONCLUSION

The simplified model described here has two main shortcomings. First, we cannot, in general, choose the optimal levels P_i. In practice, we only have an empirical estimation of the P_i, and we can adjust the levels according to them; a more precise analysis is then needed to get confidence intervals for the estimation. Second, the optimal sampling number at each level is not, in general, an integer. Therefore, in practice, the number of particles generated at each step should be chosen at random, equal to either $\lfloor 1/P_i \rfloor$ or $\lceil 1/P_i \rceil$, so that its mean is $1/P_i$. Thus, we finally need to work in a random environment. This requires precise asymptotics of random iterates of the Laplace transform, whose analysis is more delicate than the one presented here and will be the subject of a forthcoming paper.

Acknowledgments

I gratefully acknowledge my supervisors Dominique Bakry and Pascal Lezaud for their helpful suggestions and comments. I would also like to thank Persi Diaconis for telling me about approximate counting and Pieter-Tjerk de Boer for making a number of valuable remarks that improved the precision of the figures.

REFERENCES

Aldous, D. (1989). Probability approximations via the Poisson clumping heuristic. Applied Mathematical Sciences, Vol. 77. New York: Springer-Verlag.
Athreya, K.B. & Ney, P.E. (1972). Branching processes. Die Grundlehren der mathematischen Wissenschaften, Band 196. New York: Springer-Verlag.
Cerou, F., Del Moral, P., Legland, F., & Lezaud, P. (2002). Genetic genealogical models in rare event analysis. Preprint. Available from http://www.lsp.ups-tlse.fr/delmoral/preprints.html.
Cosnard, M. & Demongeot, J. (1984). Théorèmes de point fixe et processus de Galton–Watson. Annales des sciences mathématiques du Québec 8(1): 5–21.
Diaconis, P. & Holmes, S. (1995). Three examples of Monte-Carlo Markov chains: At the interface between statistical computing, computer science, and statistical mechanics. In D. Aldous, P. Diaconis, J. Spencer, & J.M. Steele (eds.), Discrete probability and algorithms. IMA Volumes in Mathematics and its Applications, Vol. 72. New York: Springer-Verlag, pp. 43–56.
Harris, T.E. (2002). The theory of branching processes. Dover Phoenix Editions. Mineola, NY: Dover.
Jerrum, M. & Sinclair, A. (1996). The Markov chain Monte Carlo method: An approach to approximate counting and integration. In D. Hochbaum (ed.), Approximation algorithms for NP-hard problems. Boston: PWS Publishing, pp. 482–520.
Lyons, R. (2002). Probability on trees and networks. Preprint. Available from http://mypage.iu.edu/rdlyons/.
Norris, J.R. (1998). Markov chains. Cambridge Series in Statistical and Probabilistic Mathematics. Cambridge: Cambridge University Press.
Sadowsky, J.S. (1996). On Monte Carlo estimation of large deviations probabilities. Annals of Applied Probability 6(2): 399–422.
Villen-Altamirano, J. & Villen-Altamirano, M. (1994). RESTART: A method for accelerating rare event simulations. Amsterdam: North-Holland, pp. 71–76.