Information Extraction


Selected Tools

Amilcare, (University of Sheffield)
  Amilcare is an adaptive Information Extraction (IE) tool developed at the University of Sheffield as part of the AKT project. It induces symbolic extraction rules from XML-annotated documents. Amilcare has been released to a number of companies and academic institutions. It has been integrated into the MnM annotation tool (developed at KMi, Open University) and into OntoMat-Annotizer (developed at AIFB, University of Karlsruhe), where it provides IE-based support for annotation. Preprocessing is done with ANNIE, the IE system distributed with GATE.
GATE, (University of Sheffield)
  GATE comprises an architecture, a framework (or SDK), and a graphical development environment. The system has been used for many language processing projects, in particular for Information Extraction in many languages. It supports the full lifecycle of language processing components, from corpus collection and annotation through system evaluation.
Intelligent Miner for Text, (IBM)
  Includes a wide range of text analysis tools for feature extraction, clustering, categorization and summarization.
SProUT, (DFKI)
  SProUT is a platform for developing multilingual shallow text processing systems. It consists of several linguistic processing resources and provides a grammar development and testing environment. It can also be used to build higher-level linguistic components by flexibly combining existing resources. Currently, the platform provides three linguistic processing resources: a tokenizer, a gazetteer checker, and a morphology component (lexical resources are provided for English, German, French, Italian, Spanish, Chinese, and Japanese). A grammar in SProUT consists of a set of rules. The left-hand side of a rule is a regular expression over typed feature structures and represents the recognition pattern; the right-hand side is a typed feature structure that specifies how the output structure is constructed.
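
The rule shape described for SProUT grammars (a pattern over feature structures on the left, an output structure on the right) can be illustrated with a small Python sketch. This is not SProUT's grammar syntax or API. Feature structures are simplified to flat dictionaries, the pattern language is restricted to a sequence with optional elements, and all names (Elem, Rule, person_rule, the feature names) are hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional, Tuple

# Simplified stand-in for a typed feature structure: a flat dict.
# Real typed feature structures are nested and typed; this is an assumption.
FS = Dict[str, str]


@dataclass
class Elem:
    """One element of the left-hand side pattern."""
    constraint: FS               # features a token must carry to match
    optional: bool = False       # element may be skipped (like `?` in a regex)
    bind: Optional[str] = None   # name under which the matched token is stored


def matches(token: FS, constraint: FS) -> bool:
    """A token matches if it agrees with every feature in the constraint."""
    return all(token.get(key) == value for key, value in constraint.items())


@dataclass
class Rule:
    lhs: List[Elem]                      # recognition pattern
    rhs: Callable[[Dict[str, FS]], FS]   # builds the output structure

    def apply_at(self, tokens: List[FS], start: int) -> Optional[Tuple[FS, int]]:
        """Try to match the pattern at `start`.

        Returns (output structure, end position) or None.
        Matching is greedy and does not backtrack.
        """
        pos = start
        bindings: Dict[str, FS] = {}
        for elem in self.lhs:
            if pos < len(tokens) and matches(tokens[pos], elem.constraint):
                if elem.bind:
                    bindings[elem.bind] = tokens[pos]
                pos += 1
            elif not elem.optional:
                return None
        return self.rhs(bindings), pos


def build_person(bindings: Dict[str, FS]) -> FS:
    """Right-hand side: assemble the output structure from the bound tokens."""
    return {
        "type": "person",
        "title": bindings.get("title", {}).get("surface", ""),
        "given_name": bindings["given"]["surface"],
        "surname": bindings["sur"]["surface"],
    }


# Pattern: optional title, a gazetteer given name, a capitalized token.
person_rule = Rule(
    lhs=[
        Elem({"type": "title"}, optional=True, bind="title"),
        Elem({"type": "given_name"}, bind="given"),
        Elem({"type": "token", "case": "capitalized"}, bind="sur"),
    ],
    rhs=build_person,
)

# Hypothetical output of a tokenizer plus gazetteer lookup.
tokens: List[FS] = [
    {"surface": "Dr.", "type": "title"},
    {"surface": "Anna", "type": "given_name"},
    {"surface": "Schmidt", "type": "token", "case": "capitalized"},
    {"surface": "spoke", "type": "token", "case": "lower"},
]

if __name__ == "__main__":
    print(person_rule.apply_at(tokens, 0))
```

Binding matched tokens by name is what lets the right-hand side copy values from the recognized input into the output structure; a unification-based system would express the same sharing through coreferences rather than a lookup table.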