Economic history becomes BIG DATA: Ran Abramitzky, Roy Mi...

Economic history becomes BIG DATA: Ran Abramitzky, Roy Mill, and Santiago Pérez: Linking Individuals Across Historical Sources: a Fully Automated Approach: "Linking individuals across historical datasets relies on information such as name and age that is both non-unique and prone to enumeration and transcription errors...



...These errors make it impossible to find the correct match with certainty. We suggest a fully automated method for linking historical datasets that enables researchers to create samples that minimize type I (false positives) and type II (false negatives) errors. The first step of the method uses the Expectation-Maximization (EM) algorithm, a standard tool in statistics, to compute the probability that each two observations correspond to the same individual. The second step uses these estimated probabilities to determine which records to use in the analysis. We provide codes to implement this method...
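To make the two steps in the abstract concrete, here is a minimal sketch, not the authors' code. It assumes a simplified Fellegi-Sunter-style model in which each candidate pair of records is summarized by binary agreement indicators (for example: first name agrees, last name agrees, age agrees), treated as conditionally independent. The function names, starting values, and the thresholds `min_best` and `max_second` are all hypothetical choices made for illustration; the paper's own comparison variables and selection rule may differ.

```python
from math import prod

EPS = 1e-6  # keeps probabilities strictly inside (0, 1) to avoid division by zero


def _clip(x):
    return min(max(x, EPS), 1 - EPS)


def _posterior(g, p, m, u):
    """P(pair is a true match | agreement pattern g) under the current parameters."""
    like_match = p * prod(mj if gj else 1 - mj for gj, mj in zip(g, m))
    like_non = (1 - p) * prod(uj if gj else 1 - uj for gj, uj in zip(g, u))
    return like_match / (like_match + like_non)


def em_match_probabilities(patterns, n_iter=200):
    """Step 1 (sketch): EM for a two-class mixture over 0/1 agreement patterns.

    patterns: list of equal-length tuples of 0/1, one tuple per candidate pair.
    Returns one estimated match probability per candidate pair.
    """
    k, n = len(patterns[0]), len(patterns)
    # Assumed starting values: matches are rare and agree on most fields.
    p, m, u = 0.1, [0.9] * k, [0.1] * k
    for _ in range(n_iter):
        # E-step: posterior match probability of every pair.
        w = [_posterior(g, p, m, u) for g in patterns]
        sw = sum(w)
        # M-step: re-estimate the match share and per-field agreement rates.
        p = _clip(sw / n)
        m = [_clip(sum(wi * g[j] for wi, g in zip(w, patterns)) / max(sw, EPS))
             for j in range(k)]
        u = [_clip(sum((1 - wi) * g[j] for wi, g in zip(w, patterns)) / max(n - sw, EPS))
             for j in range(k)]
    return [_posterior(g, p, m, u) for g in patterns]


def select_links(candidates, probs, min_best=0.9, max_second=0.5):
    """Step 2 (sketch): keep a link only if it is both likely and unambiguous.

    candidates: list of (id_a, id_b) pairs aligned with probs.
    A record in A is linked to its best candidate in B when that candidate's
    probability is at least min_best and the runner-up is at most max_second.
    """
    by_a = {}
    for (a, b), pr in zip(candidates, probs):
        by_a.setdefault(a, []).append((pr, b))
    links = {}
    for a, cands in by_a.items():
        cands.sort(reverse=True)
        best_p, best_b = cands[0]
        second_p = cands[1][0] if len(cands) > 1 else 0.0
        if best_p >= min_best and second_p <= max_second:
            links[a] = best_b
    return links


if __name__ == "__main__":
    # Toy data: mostly non-matching pairs plus a small cluster of full agreements.
    patterns = ([(1, 1, 1)] * 10 + [(1, 1, 0)] * 2 + [(1, 0, 0)] * 10
                + [(0, 1, 0)] * 10 + [(0, 0, 1)] * 8 + [(0, 0, 0)] * 60)
    probs = em_match_probabilities(patterns)
    candidates = [(f"a{i}", f"b{i}") for i in range(len(patterns))]
    print(len(select_links(candidates, probs)), "links kept")
```

Raising `min_best` or lowering `max_second` trades false positives for false negatives, which is the type I versus type II tension the abstract describes. This sketch does not enforce that each record in the second source is used at most once; a fuller version would apply the same ambiguity check in both directions.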




#shouldread
Published on April 26, 2018 16:48


J. Bradford DeLong's Blog
