They can use the prior distribution to encode experts’ knowledge about the problem—their answer to Hume’s question. For example, we can design an initial Bayesian network for medical diagnosis by interviewing doctors, asking them which symptoms they think depend on which diseases, and adding the corresponding arrows. This is the “prior network,” and the prior distribution can penalize alternative networks by the number of arrows that they add or remove from it. But doctors are fallible, so we’ll let the data override them: if the increase in likelihood from adding an arrow outweighs the
...more
This is also true of the FinTech vertical where use of experts to create labels, encoders for future distributions, et al.