Samples from the Dirichlet measure
[+1] [0] fishiwhj
[2013-08-09 09:27:58]
[ probability-theory stochastic-processes self-learning simulation ]
[ https://math.stackexchange.com/questions/463472/samples-from-the-dirichlet-measure ]

Ferguson (1973) [1], Definition 2, says that $X_1, \ldots, X_n$ is a sample of size $n$ from a random probability measure $G$ on $(\mathcal{X}, \mathcal{B})$ if:

$$ P(X_1 \in C_1, \ldots, X_n \in C_n \mid G(A_1), \ldots, G(A_m), G(C_1), \ldots, G(C_n)) = \prod_{j=1}^n G(C_j) $$ almost surely, for every $m = 1, 2, \ldots$ and all measurable sets $A_1, \ldots, A_m, C_1, \ldots, C_n$.
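To make the right-hand side concrete, here is a minimal sketch of what the condition says once a realization of $G$ is held fixed: the $X_j$ behave like i.i.d. draws from $G$, so the joint probability factorizes. The three-point space, the masses in `G`, and the sets `C1`, `C2` are hypothetical values chosen only for illustration; they are not from Ferguson's paper.

```python
import random

def sample_given_G(G, n, rng):
    """Draw X_1..X_n i.i.d. from a fixed realization G (dict: point -> mass)."""
    points = list(G)
    weights = [G[p] for p in points]
    return rng.choices(points, weights=weights, k=n)

def mass(G, C):
    """G(C) for a set C of points."""
    return sum(G[p] for p in C)

rng = random.Random(0)
G = {"a": 0.5, "b": 0.3, "c": 0.2}   # hypothetical realization of G
C1, C2 = {"a", "b"}, {"b", "c"}      # hypothetical measurable sets

trials = 50_000
hits = 0
for _ in range(trials):
    x1, x2 = sample_given_G(G, 2, rng)
    hits += (x1 in C1) and (x2 in C2)

empirical = hits / trials            # estimate of P(X1 in C1, X2 in C2 | G)
product = mass(G, C1) * mass(G, C2)  # G(C1) * G(C2)
print(empirical, product)
```

The two printed numbers should agree up to Monte Carlo error. The sets need not be disjoint for the product form to hold.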

I'm confused about this definition of generating samples from a random probability measure. Ferguson shows that $G$ induces a probability measure over the set of all probability measures from $\mathcal{B}$ to $[0, 1]$ (Ferguson 1973, Section 3), and by his construction we know that the joint distribution of $(G(B), G(B^c))$ is a Dirichlet distribution with parameters $(\alpha(B), \alpha(B^c))$. Since the marginal distribution of $G(B)$ is therefore known, we can draw samples of $G(B)$.
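As a small sketch of that last step: a two-component Dirichlet$(\alpha(B), \alpha(B^c))$ vector has first coordinate distributed as Beta$(\alpha(B), \alpha(B^c))$, so $G(B)$ can be drawn with a Beta sampler. The values `alpha_B` and `alpha_Bc` below are hypothetical. Note that each draw is a number in $[0, 1]$ (a value of the random variable $G(B)$), not a point of $\mathcal{X}$.

```python
import random

rng = random.Random(0)
alpha_B, alpha_Bc = 2.0, 3.0   # hypothetical values of alpha(B), alpha(B^c)

# G(B) ~ Beta(alpha(B), alpha(B^c)); G(B^c) = 1 - G(B).
draws = [rng.betavariate(alpha_B, alpha_Bc) for _ in range(50_000)]

mean_GB = sum(draws) / len(draws)
expected = alpha_B / (alpha_B + alpha_Bc)  # E[G(B)] = alpha(B) / alpha(X)
print(mean_GB, expected)
```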

One question is, by his definition, why do we need the condition that given $G(C_1), \ldots, G(C_n)$, the events $\{X_i \in C_i\}$ for $i = 1, \ldots, n$ are independent of the rest of the process?

Furthermore, given measurable sets $C_1, \ldots, C_n$, there seem to be two ways to generate samples from a random probability measure $G$:

  1. Draw samples from the joint distribution of $(G(C_1), \ldots, G(C_n))$ directly, or
  2. Draw each $X_i \in C_i$ from the distribution of $G(C_i)$ for $i = 1, \ldots, n$ separately.

The difference is that samples from the first method may not fall into the regions $C_i$. Is this similar to generating Dirichlet samples from Gamma random variables?
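For reference, the Gamma construction I have in mind is roughly the following sketch: draw independent $Y_i \sim \mathrm{Gamma}(\alpha_i, 1)$ and normalize by their sum. I am assuming a finite measurable partition $B_1, \ldots, B_k$ of $\mathcal{X}$ with hypothetical masses $\alpha_i = \alpha(B_i)$; the function name `dirichlet_from_gammas` is mine.

```python
import random

def dirichlet_from_gammas(alphas, rng):
    """One Dirichlet(alphas) draw via normalized independent Gamma(alpha_i, 1) variables."""
    ys = [rng.gammavariate(a, 1.0) for a in alphas]
    total = sum(ys)
    return [y / total for y in ys]

rng = random.Random(0)
alphas = [1.0, 2.0, 3.0]   # hypothetical alpha(B_1), alpha(B_2), alpha(B_3)

g = dirichlet_from_gammas(alphas, rng)   # one draw of (G(B_1), G(B_2), G(B_3))
print(g, sum(g))

# The average of the first coordinate should approach alpha_1 / sum(alphas).
n = 20_000
mean_first = sum(dirichlet_from_gammas(alphas, rng)[0] for _ in range(n)) / n
print(mean_first, alphas[0] / sum(alphas))
```

The output is a vector of probabilities for the cells of the partition, i.e. a draw of the random masses themselves rather than of points $X_i$.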