85 — Open-World Knowledge Graph Completion

Baoxu Shi & Tim Weninger (1711.03438)

Read on 14 November 2017
#neural-network  #graph-theory  #knowledge-graph  #natural-language  #nlp 

Knowledge graphs are essentially relationship edgelists, where each edge is a relationship between $h$ (head), with a relationship $r$ to $t$ (tail).

For example, $<Earth, BelongsTo, SolarSystem>$ and $<Earth, Orbits, Sol>$ are both relationships that use Earth as a head entity. $<Mars, Orbits, Sol>$ also uses Sol as a tail entity.

These graphs are commonly “closed-world”, meaning that the KG includes every relationship that is applicable to a world, and every world rule is represented in the graph.

Another common implementation of KGs — “open world” — enables the “expectation” of probablistically extant but unknown rules. This task is known as Knowledge Graph Completion. For example, in the above KG, $<Mars, BelongsTo, SolarSystem>$ is one relationship that might be predicted without being supplied.

Shi and Weninger design an energy function $L(T)$ that, for a given tuple $T$, determines a “closeness” metric for the tuple compared to other tuples in the graph; this can be used both to fill in components of a tuple as well as determine the probability that a completely novel tuple exists and belongs to the graph.

They present ConMask, a TensorFlow-based system that performs KG completion using a neural-network-based $L$ based on natural language processing from Wikipedia pages or other text sources.

ConMask is particularly performant compared to its peers; predictive systems like this also have the added advantage of taking less storage space because the relationships can be guessed at runtime.

Future work aims to expand ConMask to also predict relationships that cannot be found in literature; that is, it will be able to determine relationships between objects where a relationship was not necessarily expected to exist.