we should keep in mind that having a lot of data does not guarantee that there are underlying rules to be learned. We should make sure that the underlying process actually contains dependencies and that the collected data provides enough information to learn them with acceptable accuracy.
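One rough way to check this in practice is to compare a simple model against a trivial baseline on held-out data. The sketch below is only an illustration under stated assumptions: the synthetic data, the linear model, and the helper names (`fit_line`, `mse`, `skill_over_baseline`) are all hypothetical and chosen for the example. A skill near zero means that this particular model found nothing to learn. It does not prove that no dependency exists, since a different model might still capture one.

```python
import random


def fit_line(xs, ys):
    """Ordinary least squares fit of y = a * x + b (hypothetical helper)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    a = sxy / sxx
    return a, mean_y - a * mean_x


def mse(preds, ys):
    """Mean squared error between predictions and targets."""
    return sum((p - y) ** 2 for p, y in zip(preds, ys)) / len(ys)


def skill_over_baseline(xs, ys, train_fraction=0.7):
    """Return 1 - MSE_model / MSE_baseline, measured on held-out data.

    The baseline always predicts the mean of the training targets.
    Values near 0 (or below) suggest the model learned nothing useful.
    """
    split = int(len(xs) * train_fraction)
    x_train, y_train = xs[:split], ys[:split]
    x_test, y_test = xs[split:], ys[split:]

    a, b = fit_line(x_train, y_train)
    baseline = sum(y_train) / len(y_train)

    model_err = mse([a * x + b for x in x_test], y_test)
    baseline_err = mse([baseline] * len(y_test), y_test)
    return 1.0 - model_err / baseline_err


# Synthetic data (an assumption for illustration, not real measurements).
rng = random.Random(0)
xs = [rng.uniform(0, 10) for _ in range(2000)]
dependent = [2.0 * x + rng.gauss(0, 1) for x in xs]  # y depends on x
independent = [rng.gauss(0, 1) for _ in xs]          # y ignores x

skill_dep = skill_over_baseline(xs, dependent)
skill_ind = skill_over_baseline(xs, independent)
print(f"skill with a real dependency: {skill_dep:.3f}")
print(f"skill on pure noise:          {skill_ind:.3f}")
```

Both datasets have the same number of samples, yet only the first lets the model beat the baseline by a wide margin. Collecting more of the second kind of data would not help, because there is no relationship between input and output to recover.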

