Open Knowledge Graph Canonicalization
Open Information Extraction approaches lead to the creation of large knowledge bases (KBs) from the web. The problem with such methods is that their entities and relations are not canonicalized, which leads to the storage of redundant and ambiguous facts. For example, an Open KB storing <Barack Obama, was born in, Honolulu> and <Obama, took birth in, Honolulu> does not know that "Barack Obama" and "Obama" refer to the same entity. Similarly, "took birth in" and "was born in" refer to the same relation. The problem of Open KB canonicalization is to identify groups of equivalent entities and relations in the KB.
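To make the goal concrete, here is a minimal sketch of what canonicalization buys once the groups of equivalent phrases are known. It is not the method used to find those groups: the clusters below are written by hand for the example above, and the helper names `build_lookup` and `canonicalize` are hypothetical. Each cluster's first member is assumed to be its representative.

```python
from typing import Dict, List, Tuple

Triple = Tuple[str, str, str]


def build_lookup(clusters: List[List[str]]) -> Dict[str, str]:
    """Map every phrase in a cluster to the cluster's first member."""
    return {phrase: cluster[0] for cluster in clusters for phrase in cluster}


def canonicalize(
    triples: List[Triple],
    np_clusters: List[List[str]],
    rel_clusters: List[List[str]],
) -> List[Triple]:
    """Rewrite triples using cluster representatives and drop duplicates."""
    np_map = build_lookup(np_clusters)
    rel_map = build_lookup(rel_clusters)
    seen = set()
    result: List[Triple] = []
    for subj, rel, obj in triples:
        # Phrases that belong to no cluster are kept unchanged.
        canon = (np_map.get(subj, subj), rel_map.get(rel, rel), np_map.get(obj, obj))
        if canon not in seen:
            seen.add(canon)
            result.append(canon)
    return result


# Hand-written clusters for illustration only (assumed, not learned).
triples = [
    ("Barack Obama", "was born in", "Honolulu"),
    ("Obama", "took birth in", "Honolulu"),
]
np_clusters = [["Barack Obama", "Obama"]]
rel_clusters = [["was born in", "took birth in"]]

canonical_triples = canonicalize(triples, np_clusters, rel_clusters)
print(canonical_triples)
```

With these clusters, the two redundant triples collapse into a single canonical fact. The hard part of the problem, which this sketch leaves out, is discovering the clusters automatically.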
Datasets
| Dataset | # Gold Entities | # NPs | # Relations | # Triples |
|---|---|---|---|---|
| Base | 150 | 290 | 3K | 9K |
| Ambiguous | 446 | 717 | 11K | 37K |
| ReVerb45K | 7.5K | 15.5K | 22K | 45K |
Noun Phrase Canonicalization