A model for information extraction in Portuguese based on text patterns - Citegraph

Paper Info

Title
A model for information extraction in Portuguese based on text patterns

Abstract
This paper proposes an information extraction model that identifies text patterns representing relations between two entities. Given a set of entity pairs representing a specific relation, it is possible to find text patterns representing that relation within sentences from documents containing those entities. Once those text patterns are identified, it is possible to attempt the extraction of a complementary entity, provided that the first entity of the relation and the related text patterns are given. The pattern selection relies on regular expressions, frequency and the identification of less relevant words. Modern search engine APIs and HTML parsers are used to retrieve and parse web pages in real time, eliminating the need for a pre-established corpus. The retrieval of document counts within a timeframe is also used to aid in the selection of the extracted entities.
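
The two steps described in the abstract (learning patterns from seed entity pairs, then using them to extract a complementary entity) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names, the stopword list, the `min_count` threshold, the sample sentences, and the assumption that an entity is a run of capitalized words are all hypothetical choices made for illustration. Web retrieval and document counts are left out, so the sketch works on sentences already held in memory.

```python
import re
from collections import Counter

# Hypothetical list of low-relevance Portuguese words (not taken from the paper).
STOPWORDS = {"a", "o", "de", "da", "do", "e", "em", "no", "na"}

# Hypothetical seed pairs (person, birthplace) and sample sentences.
PAIRS = [("Fernando Pessoa", "Lisboa"), ("José Saramago", "Azinhaga")]
SENTENCES = [
    "Fernando Pessoa nasceu em Lisboa em 1888.",
    "José Saramago nasceu em Azinhaga, no Ribatejo.",
    "Fernando Pessoa viveu em Lisboa.",
    "Carlos Drummond de Andrade nasceu em Itabira.",
]

def find_patterns(pairs, sentences, min_count=2):
    """Return the texts found between seed entities, most frequent first."""
    counts = Counter()
    for first, second in pairs:
        # Non-greedy capture of whatever lies between the two entities.
        rx = re.compile(re.escape(first) + r"\s+(.+?)\s+" + re.escape(second))
        for sentence in sentences:
            for match in rx.finditer(sentence):
                pattern = match.group(1)
                # Discard patterns made up only of low-relevance words.
                if all(w.lower() in STOPWORDS for w in pattern.split()):
                    continue
                counts[pattern] += 1
    # Keep only patterns seen at least min_count times.
    return [p for p, n in counts.most_common() if n >= min_count]

def extract_second_entity(first, patterns, sentences):
    """Propose candidates for the missing entity, with their match counts."""
    # Assumption: an entity is one or more consecutive capitalized words.
    entity = r"([A-ZÀ-ÖØ-Þ]\w*(?:\s+[A-ZÀ-ÖØ-Þ]\w*)*)"
    candidates = Counter()
    for pattern in patterns:
        rx = re.compile(
            re.escape(first) + r"\s+" + re.escape(pattern) + r"\s+" + entity
        )
        for sentence in sentences:
            for match in rx.finditer(sentence):
                candidates[match.group(1)] += 1
    return candidates.most_common()

patterns = find_patterns(PAIRS, SENTENCES)
candidates = extract_second_entity("Carlos Drummond de Andrade", patterns, SENTENCES)
```

In this sketch, frequency filtering happens in `find_patterns`, while `extract_second_entity` only ranks candidates by how often they matched. A fuller version would presumably also weigh candidates by document counts, as the abstract mentions.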

Year	DOI	Venue
2013	10.1007/978-3-642-37256-8_30	CICLing (2)
Keywords	Field	DocType
text pattern,pattern selection,information extraction model,modern search engines apis,document count,html parsers,specific relation,entity pair,complementary entity,related text pattern	Text mining,Regular expression,Search engine,Information retrieval,Web page,Computer science,Portuguese,Information extraction,Natural language processing,Artificial intelligence,Parsing,Relationship extraction	Conference
Citations	PageRank	References
0	0.34	8
Authors
2

Authors (2 rows)

Name	Order	Citations	PageRank
Tiago Luis Bonamigo	1	11	1.00
Renata Vieira	2	82	11.44
