2019: Probabilistic Record Matching for Entity Resolution Using Markov Logic Networks

Mochamad Hariadi, S.T., M.Sc., Ph.D.


Abstract

Entity resolution (ER) is the problem of identifying objects that refer to the same real-world entity and merging them into a single representation. In the database context, ER is also known as record linkage, the task of determining which records refer to the same entity. The statistical, probabilistic approach to this type of ER is called probabilistic record linkage (PRL). PRL has been applied to a variety of ER problems, including derivatives that use machine learning as an improvement. However, the probabilistic approach has difficulty handling missing data, which commonly occur in unreliable datasets. Such unreliable data introduce additional uncertainty and can reduce the quality of the final result. This paper discusses an alternative approach to PRL that uses a Markov logic network (MLN) to infer whether record pairs match in unreliable datasets, especially datasets with a high rate of missing data. The proposed …