Review of New Types of Relation Extraction Methods
This is explained by the fact that patterns do not tend to uniquely identify a given relation. The systems which participated in MUC and dealt with relation extraction also relied on rich rules for identifying relations (Fukumoto et al. 1998; Garigliano et al. 1998; Humphreys et al. 1998). Humphreys et al. (1998) mention that they tried to add only those rules which were (almost) certain never to generate errors in analysis; in other words, they adopted a low-recall, high-precision approach. However, in this case many relations may be missed due to the lack of unambiguous rules to extract them. To conclude, knowledge-based methods are not easily portable to other domains and involve too much manual labour. However, they can be used effectively if the main aim is to get results quickly in well-defined domains and document collections.

5 Supervised Methods

Supervised methods rely on a training set where domain-specific examples have been tagged. Such systems automatically learn extractors for relations by using machine-learning techniques. The main problem with these methods is that the development of a suitably tagged corpus can take a lot of time and effort. On the other hand, such systems can easily be adapted to a different domain provided there is training data. Extractors for supervised relation extraction can be learnt in several ways: kernel methods (Zhao and Grishman 2005; Bunescu and Mooney 2006), logistic regression (Kambhatla 2004), augmented parsing (Miller et al. 2000) and Conditional Random Fields (CRF) (Culotta et al. 2006).

In RE in general, and supervised RE in particular, a lot of research has been done on IS-A relations and the extraction of taxonomies. Several resources were built on top of the collaboratively constructed Wikipedia: YAGO (Suchanek et al. 2007), DBpedia (Auer et al. 2007), Freebase (Bollacker et al. 2008) and WikiNet (Nastase et al. 2010). In general, Wikipedia is becoming more and more popular as a source for RE, e.g. (Ponzetto and Strube 2007; Nguyen et al. 2007a, b, c). Query logs are also considered a valuable source of information for RE, and their analysis is even argued to give better results than other suggested methods in the field (Paşca 2007, 2009).

5.1 Weakly-supervised Methods

Some supervised systems also use bootstrapping to make the construction of the training data easier. These methods are also sometimes referred to as "bootstrapping information extraction". Brin (1998) describes the DIPRE (Dual Iterative Pattern Relation Expansion) method, used for identifying the authors of books. It uses an initial small set of seeds, or a set of hand-constructed extraction patterns, to begin the training process. Once occurrences of the needed information are found, they are in turn used for the recognition of new patterns.
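To make the loop concrete, here is a minimal sketch of a DIPRE-style iteration in Python. This is not Brin's implementation: the corpus format (a list of sentence strings), the naive string-context patterns and all function names are assumptions made for illustration.

```python
import re

def find_occurrences(corpus, seeds):
    """Locate seed (author, book) pairs in sentences and record their contexts."""
    occurrences = []
    for sentence in corpus:
        for author, book in seeds:
            if author in sentence and book in sentence:
                # Keep a (prefix, middle, suffix) context as a candidate pattern.
                prefix, _, rest = sentence.partition(author)
                middle, _, suffix = rest.partition(book)
                occurrences.append((prefix[-10:], middle, suffix[:10]))
    return occurrences

def learn_patterns(occurrences, min_count=2):
    """Promote contexts that recur often enough to serve as extraction patterns."""
    counts = {}
    for occ in occurrences:
        counts[occ] = counts.get(occ, 0) + 1
    return [occ for occ, c in counts.items() if c >= min_count]

def apply_patterns(corpus, patterns):
    """Extract new (author, book) pairs wherever a learnt pattern matches."""
    pairs = set()
    for prefix, middle, suffix in patterns:
        regex = re.compile(re.escape(prefix) + r"(.+?)" +
                           re.escape(middle) + r"(.+?)" + re.escape(suffix))
        for sentence in corpus:
            for author, book in regex.findall(sentence):
                pairs.add((author.strip(), book.strip()))
    return pairs

def bootstrap(corpus, seeds, iterations=3):
    """DIPRE-style loop: seeds -> occurrences -> patterns -> new seeds."""
    for _ in range(iterations):
        occurrences = find_occurrences(corpus, seeds)
        patterns = learn_patterns(occurrences)
        seeds = seeds | apply_patterns(corpus, patterns)  # seeds is a set of pairs
    return seeds
```

Each iteration feeds the newly extracted pairs back in as seeds, so a single spurious pair accepted early keeps generating patterns in every later round.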
Regardless of how promising bootstrapping may seem, error propagation becomes a serious problem: mistakes in extraction at the initial stages generate more mistakes at later stages and decrease the accuracy of the extraction process. For example, errors that go back to named entity recognition, e.g. extracting incomplete proper names, result in incorrect seeds being chosen for the next step of bootstrapping. Another problem that can occur is semantic drift. This happens when the senses of words are not taken into account, so that each iteration moves further away from the original meaning.

Some researchers (Kozareva and Hovy 2010; Hovy et al. 2009; Kozareva et al. 2008) have suggested ways to avoid this problem and enhance the performance of the method by using doubly-anchored patterns (which include both the class name and a class member) as well as graph structures. Such patterns have two anchor seed positions, "{type} such as {seed} and *", and one open position for the terms to be learnt; for example, the pattern "Presidents such as Ford and {X}" can be used to learn the names of presidents (see the sketch below). Graphs are used for storing information about patterns, the words found and links to the entities they helped to find. This data is then used for calculating the popularity and productivity of the candidate words. This approach helps to enhance the accuracy of bootstrapping and to find high-quality information using only a few seeds.
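A doubly-anchored pattern can be approximated with a simple regular expression. The sketch below is illustrative only; the capitalised-name heuristic for the open slot is an assumption, not the authors' code.

```python
import re

def doubly_anchored(type_name, seed):
    """Build a regex for the doubly-anchored pattern '{type} such as {seed} and *'."""
    return re.compile(
        re.escape(type_name) + r" such as " + re.escape(seed) +
        r" and ([A-Z]\w*(?: [A-Z]\w*)*)"  # open slot: a capitalised name
    )

pattern = doubly_anchored("Presidents", "Ford")
text = "Presidents such as Ford and Ronald Reagan favoured deregulation."
print(pattern.findall(text))  # ['Ronald Reagan']
```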
Kozareva (2012) employs a similar approach for the extraction of cause-effect relations, where the pattern used for bootstrapping has the form "X and Y verb Z", instantiated, for example, with the verb "cause". Human-based evaluation reports 89 % accuracy on 1500 examples.

6 Self-supervised Systems

Self-supervised systems go further in making the process of information extraction unsupervised. The KnowItAll Web IE system (Etzioni et al. 2005), an example of a self-supervised system, learns "to label its own training examples using only a small set of domain-independent extraction patterns". It uses a set of generic patterns to automatically instantiate relation-specific extraction rules, then learns domain-specific extraction rules, and the whole process is repeated iteratively.

The Intelligence in Wikipedia (IWP) project (Weld et al. 2008) is another example of a self-supervised system. It bootstraps from the Wikipedia corpus, exploiting the fact that each article corresponds to a primary object and that many articles contain infoboxes (brief tabular information about the article). The system uses Wikipedia infoboxes as a starting point for training classifiers for the page type. IWP trains extractors for the various attributes, and these can later be used for extracting information from general Web pages. The disadvantage of IWP is that the number of relations described in Wikipedia infoboxes is limited, so not all relations can be extracted using this method.

6.1 Open Information Extraction

Etzioni et al. (2008) introduced the notion of Open Information Extraction, which is opposed to Traditional Relation Extraction. Open information extraction is "a novel extraction paradigm that tackles an unbounded number of relations". This method does not presuppose a predefined set of relations and is targeted at all relations that can be extracted. The Open Relation Extraction approach is relatively new, so only a small number of projects use it. TextRunner (Banko and Etzioni 2008; Banko et al. 2007) is an example of such a system. A set of relation-independent lexico-syntactic patterns is used to build a relation-independent extraction model. It was found that 95 % of all relations in English can be described by only 8 general patterns, e.g. "E1 Verb E2". The input of such a system is only a corpus and some relation-independent heuristics; relation names are not known in advance.

Conditional Random Fields (CRF) are used to identify spans of tokens believed to indicate explicit mentions of relationships between entities, and the whole problem of relation extraction is treated as a problem of sequence labeling. The set of linguistic features used in this system is similar to those used by other state-of-the-art relation extraction systems and includes, for example, part-of-speech tags, regular expressions for detecting capitalization and punctuation, and context words. At this stage of development the system "is able to extract instances of the four most frequently observed relation types: Verb, Noun+Prep, Verb+Prep and Infinitive". It has a number of limitations, which are however common to all RE systems: it extracts only explicitly expressed relations that are primarily word-based, and relations must occur between entity names within the same sentence. Banko and Etzioni (2008) report a precision of 88.3 % and a recall of 45.2 %. Even though the system shows very good results, the relations are not specified, so there are difficulties in using them in other systems: the output consists of tuples stating that there is some relation between two entities, but there is no generalization of these relations.

Wu and Weld (2010) combine the idea of Open Relation Extraction with the use of Wikipedia infoboxes and produce two systems called WOEparse and WOEpos. WOEparse improves on TextRunner dramatically, but it is 30 times slower than TextRunner. WOEpos does not have this disadvantage and still shows an improved F-measure over TextRunner of between 15 % and 34 % on three corpora.

Fader et al. (2011) identify several flaws in previous work on Open Information Extraction: "the learned extractors ignore both 'holistic' aspects of the relation phrase (e.g., is it contiguous?) as well as lexical aspects (e.g., how many instances of this relation are there?)". They target these problems by introducing syntactic constraints (e.g., they require the relation phrase to match a POS tag pattern) and lexical constraints. Their system ReVerb achieves an AUC which is 30 % better than WOE (Wu and Weld 2010) and TextRunner (Banko and Etzioni 2008).
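As an illustration of such a syntactic constraint, the sketch below checks whether a candidate relation phrase matches a verb-centred pattern of Penn Treebank POS tags. ReVerb's actual regular expression differs; the simplified tag classes here are an assumption.

```python
import re

# Simplified verb-centred constraint: one or more verbs, optionally followed by
# nouns/adjectives/determiners, optionally ending in a preposition or particle
# (so phrases like "was born in" pass, while stray noun phrases do not).
RELATION_POS = re.compile(r"^V\w*( V\w*)*( (NN\w*|JJ|DT))*( (IN|RP|TO))?$")

def is_relation_phrase(tagged_tokens):
    """tagged_tokens: list of (word, POS tag) pairs for the candidate phrase."""
    tags = " ".join(tag for _, tag in tagged_tokens)
    return bool(RELATION_POS.match(tags))

print(is_relation_phrase([("was", "VBD"), ("born", "VBN"), ("in", "IN")]))  # True
print(is_relation_phrase([("the", "DT"), ("capital", "NN")]))               # False
```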
Nakashole et al. (2012) approach this problem from another angle. They mine for patterns expressing various relations and organize them into hierarchies. They explore binary relations between entities and employ frequent itemset mining (Agrawal et al. 1993; Srikant and Agrawal 1996) to identify the most frequent patterns. Their work results in a resource called PATTY, which contains 350,569 pattern synsets together with subsumption relations and achieves 84.7 % accuracy. Unlike ReVerb (Fader et al. 2011), which constrains patterns to verbs or verb phrases that end with prepositions, PATTY can learn arbitrary patterns. The authors employ so-called syntactic-ontological-lexical patterns (SOL patterns). These patterns constitute a sequence of words, POS tags, wildcards and ontological types. For example, the pattern "⟨person⟩'s [adj] voice * song" would match the strings "Amy Winehouse's soft voice in Rehab" and "Elvis Presley's solid voice in his song All shook up". Their approach is based on collecting dependency paths from sentences in which two named entities are tagged (YAGO2 (Hoffart et al. 2011) is used as the database of named entities). The textual pattern is then extracted by finding the shortest path connecting the two entities. All of these patterns are transformed into SOL patterns (abstractions of the textual patterns). Frequent itemset mining is used for this: all textual patterns are decomposed into n-grams (n consecutive words). A SOL pattern contains only those n-grams that appear frequently in the corpus; the remaining word sequences are replaced by wildcards. The support set of a pattern is the set of entity pairs that appear in place of the entity placeholders in all strings in the corpus that match the pattern. Patterns are connected into one synset (and so considered synonymous) if their support sets coincide; the overlap of support sets is also used to identify subsumption relations between synsets.

6.2 Distant Learning

Mintz et al. (2009) introduce the term "distant supervision". The authors use Freebase, a large semantic database containing 7,300 relations between 9 million named entities. For each pair of entities that appears in a Freebase relation, they identify all sentences containing those entities in a large unlabeled corpus; in the next step, textual features are extracted from those sentences and used to train a relation classifier (a minimal sketch of this labeling step is given at the end of this section). Even though the 67.6 % precision achieved with this method leaves room for improvement, it has inspired many researchers to investigate further in this direction, and there are currently a number of papers trying to enhance "distant learning" in several ways. Some researchers target the heuristics used to map the relations in the databases to the texts; for example, Takamatsu et al. (2012) argue that improving the matching makes the data less noisy and therefore enhances the quality of relation extraction in general. Yao et al. (2010) propose an undirected graphical model for relation extraction which employs "distant learning" but enforces selectional preferences, and Riedel et al. (2010) report a 31 % error reduction compared to (Mintz et al. 2009).

Another problem that has been addressed is language ambiguity (Yao et al. 2011, 2012). Most methods cluster shallow or syntactic patterns of relation mentions but consider only one possible sense per pattern; this assumption is often violated in reality. Yao et al. (2011) use generative probabilistic models in which both entity type constraints within a relation and features on the dependency path between entity mentions are exploited. This research is similar to DIRT (Lin and Pantel 2001), which explores the distributional similarity of dependency paths in order to discover different representations of the same semantic relation. However, Yao et al. (2011) take a different approach and apply LDA (Blei et al. 2003) with a slight modification: the observations are relation tuples rather than words. As a result, instead of representing semantically related words, the latent topic variable represents a relation type. The authors combine three models: Rel-LDA, Rel-LDA1 and Type-LDA. In the third model the features of a tuple are split into relation-level features and entity-level features. Relation-level features include the dependency path, trigger, lexical and POS features; entity-level features include the entity mention itself and its named entity tag. These models output clusterings of the observed relation tuples and their associated textual expressions.
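The following sketch illustrates the distant-supervision labeling step described above: any sentence mentioning both members of a known knowledge-base pair is taken as a positive training example for that pair's relation. The toy knowledge base, the single middle-words feature and the function names are assumptions for illustration, not Mintz et al.'s implementation.

```python
# Toy knowledge base: (entity1, entity2) -> relation name.
KB = {
    ("Barack Obama", "Honolulu"): "place_of_birth",
    ("Steve Jobs", "Apple"): "founded",
}

def middle_words(sentence, e1, e2):
    """Feature: the words between the two entity mentions, if both occur."""
    i, j = sentence.find(e1), sentence.find(e2)
    if i == -1 or j == -1:
        return None
    start, end = (i + len(e1), j) if i < j else (j + len(e2), i)
    return sentence[start:end].split()

def label_corpus(corpus):
    """Pair every sentence with each KB relation whose entities it mentions."""
    examples = []
    for sentence in corpus:
        for (e1, e2), relation in KB.items():
            feats = middle_words(sentence, e1, e2)
            if feats is not None:
                examples.append((feats, relation))
    return examples

corpus = ["Barack Obama was born in Honolulu in 1961."]
print(label_corpus(corpus))  # [(['was', 'born', 'in'], 'place_of_birth')]
```

The noise this heuristic introduces, since not every co-occurrence of an entity pair actually expresses the knowledge-base relation, is precisely what the follow-up work cited above tries to reduce.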