When the methods for finding the directions of arrows in directed graphical models are applied, it turns out that in several cases independent variables like the subject’s genotype or age come out as caused by variables that are dependent, like blood tests or brain region sizes. This shows that applying causality methods gives misleading results in a case like this, where it is known that many important variables cannot be measured and some are not even known.

One without classes or with a different number of classes. One can also let the number of classes vary and evaluate the model by finding the posterior distribution of the number of classes. The data probability is obtained by integrating, over the Dirichlet distribution, the sum over all assignments of cases to classes of the assignment probability times the product of all resulting case probabilities according to the respective class models. Needless to say, this integration is feasible only for a handful of cases, where the data are too meager to permit any significant conclusion about the number of classes and their distributions.
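The computation described above can be sketched by brute force for a very small data set. The sketch below is only an illustration under stated assumptions, not the chapter's own procedure: the cases are binary, each class uses a Bernoulli model with a Beta(1, 1) prior on its parameter, the class probabilities have a symmetric Dirichlet prior with parameter `alpha`, and the prior over the number of classes is uniform. All function names are hypothetical. The Dirichlet integral is done in closed form for each assignment, so what remains is the sum over assignments.

```python
from itertools import product
from math import exp, lgamma


def log_beta(a, b):
    """Logarithm of the Beta function B(a, b)."""
    return lgamma(a) + lgamma(b) - lgamma(a + b)


def log_assignment_prob(counts, alpha):
    """Log probability of one assignment of cases to classes, with the
    symmetric Dirichlet(alpha) prior on class probabilities integrated out
    (assumed prior, chosen for illustration)."""
    k, n = len(counts), sum(counts)
    out = lgamma(k * alpha) - lgamma(n + k * alpha)
    for c in counts:
        out += lgamma(c + alpha) - lgamma(alpha)
    return out


def log_class_data_prob(values, a=1.0, b=1.0):
    """Log probability of the binary cases in one class under a Bernoulli
    model with a Beta(a, b) prior integrated out (assumed class model)."""
    ones = sum(values)
    zeros = len(values) - ones
    return log_beta(a + ones, b + zeros) - log_beta(a, b)


def data_probability(data, num_classes, alpha=1.0):
    """Sum over all num_classes ** len(data) assignments of the assignment
    probability times the probability of the cases in each class."""
    total = 0.0
    for assignment in product(range(num_classes), repeat=len(data)):
        groups = [[x for x, z in zip(data, assignment) if z == k]
                  for k in range(num_classes)]
        logp = log_assignment_prob([len(g) for g in groups], alpha)
        logp += sum(log_class_data_prob(g) for g in groups)
        total += exp(logp)
    return total


def posterior_num_classes(data, max_classes, alpha=1.0):
    """Posterior over 1..max_classes classes, assuming a uniform prior
    over the number of classes."""
    evidences = [data_probability(data, k, alpha)
                 for k in range(1, max_classes + 1)]
    norm = sum(evidences)
    return [e / norm for e in evidences]
```

Calling `posterior_num_classes([1, 1, 0, 0, 1], 3)` enumerates 1 + 32 + 243 assignments. The number of terms grows as the number of classes raised to the number of cases, which is why exact evaluation is limited to a handful of cases.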

