Abstract

Relation extraction, the task of extracting facts from natural language text and creating machine-readable
knowledge, is a long-standing goal of artificial intelligence. Today, most approaches to relation
extraction are based on machine learning and thus starved by scarce training data. Distant supervision,
which automatically creates training data, only works with relations that already populate
a knowledge base. In particular, most dynamic, time-dependent event relations are ephemeral and
are rarely stored in a pre-existing knowledge base. This drawback seriously limits the applicability of
distant supervision.
To address these challenges, we present four novel techniques: VELVET,
NEWSSPIKE-PARA, NEWSSPIKE-RE, and NEWSSPIKE-SCALE. They are based on two key ideas.
The first is ontological smoothing, which maps the target relations to database views over
a background knowledge base, and thus allows distant supervision to work on user-specified
relations. The second is temporal correspondence, which exploits parallel news streams
to generate accurate training sentences for large sets of event relations.
In this dissertation, we formalize the characteristics necessary for ontological smoothing and
temporal correspondence. We develop algorithms that automatically learn scalable relation
extractors. Our experiments show that the learned extractors produce high-quality
extractions for both static and event relations.