Ontology Learning from Text

Ontologies are formal, explicit specifications of shared conceptualizations, representing the concepts and relations that are relevant for a given domain of discourse [4]. Currently, ontologies are mostly constructed by hand, which is very inefficient and may become a major barrier to their large-scale use in knowledge markup for the Semantic Web. Creating ambitious Semantic Web applications based on ontological knowledge requires new, highly adaptive and distributed ways of handling and using knowledge that enable semi-automatic construction and refinement of ontologies. Ontology construction can be automated by combining linguistic analysis with machine learning approaches to text mining, which together provide facilities for ontology construction and refinement (see e.g. [5] and the overview of ontology learning methods in [3]).

As human language is a primary mode of knowledge transfer, ontology development could be based more directly on the linguistic analysis of relevant documents. In recent years, a number of systems have been introduced that are designed along these lines, e.g. ASIUM [2], TextToOnto [6], Ontolearn [7] and OntoLT [1], all of which combine a certain level of linguistic analysis with machine learning algorithms to find potentially interesting concepts and relations between them.

A typical approach in ontology learning from text first involves the extraction of (more or less complex) terms from a domain-specific corpus. Extracted terms are statistically processed to determine their relevance for the domain corpus at hand and clustered into groups with the purpose of identifying a taxonomy of potential classes. Additionally, relations can be identified, mostly by computing a statistical measure of connectedness between identified clusters.
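
The pipeline described above can be illustrated with a minimal sketch. It is not the method of any of the cited systems. It assumes that candidate terms are stopword-free word n-grams, that relevance is the log ratio of a term's smoothed relative frequency in the domain corpus to that in a general reference corpus, and that connectedness is pointwise mutual information over document-level co-occurrence. For brevity it scores individual terms rather than clusters, and the clustering step is omitted. All function names, the stopword list and the toy corpora are hypothetical.

```python
import math
import re
from collections import Counter
from itertools import combinations

# Tiny illustrative stopword list (an assumption, not a standard resource).
STOPWORDS = {"the", "a", "an", "of", "was", "and", "in", "to", "is"}


def extract_terms(text, max_len=2):
    """Return candidate terms: word n-grams (n <= max_len) without stopwords."""
    tokens = re.findall(r"[a-z]+", text.lower())
    terms = []
    for n in range(1, max_len + 1):
        for i in range(len(tokens) - n + 1):
            gram = tokens[i:i + n]
            if not any(tok in STOPWORDS for tok in gram):
                terms.append(" ".join(gram))
    return terms


def relevance(domain_docs, reference_docs):
    """Score each domain term by the log ratio of its add-one smoothed
    relative frequency in the domain corpus versus the reference corpus."""
    dom = Counter(t for d in domain_docs for t in extract_terms(d))
    ref = Counter(t for d in reference_docs for t in extract_terms(d))
    dom_total, ref_total = sum(dom.values()), sum(ref.values())
    vocab = len(set(dom) | set(ref))
    scores = {}
    for term, count in dom.items():
        p_dom = (count + 1) / (dom_total + vocab)
        p_ref = (ref[term] + 1) / (ref_total + vocab)
        scores[term] = math.log(p_dom / p_ref)
    return scores


def connectedness(domain_docs, terms):
    """Pointwise mutual information for term pairs that co-occur in at
    least one document; keys are alphabetically sorted (term_a, term_b)."""
    wanted = set(terms)
    n_docs = len(domain_docs)
    doc_terms = [set(extract_terms(d)) & wanted for d in domain_docs]
    single = Counter(t for ts in doc_terms for t in ts)
    pairs = Counter(p for ts in doc_terms for p in combinations(sorted(ts), 2))
    return {
        (a, b): math.log((count * n_docs) / (single[a] * single[b]))
        for (a, b), count in pairs.items()
    }


# Toy corpora, invented purely for illustration.
DOMAIN_DOCS = [
    "the goalkeeper saved the penalty",
    "the goalkeeper missed the penalty kick",
    "the referee awarded a penalty",
    "the striker scored a goal",
]
REFERENCE_DOCS = [
    "the weather was fine today",
    "the referee of the journal was strict",
    "a fine kick of luck",
]

if __name__ == "__main__":
    scores = relevance(DOMAIN_DOCS, REFERENCE_DOCS)
    for term in sorted(scores, key=scores.get, reverse=True)[:5]:
        print(f"{term}: {scores[term]:.3f}")
    print(connectedness(DOMAIN_DOCS, ["goalkeeper", "penalty", "referee"]))
```

In a fuller system, the high-scoring terms would then be clustered (for example by the similarity of their contexts) to propose candidate classes, and the connectedness measure would be computed between clusters rather than between individual terms.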


[1] Buitelaar P. / Olejnik D. / Sintek M.: A Protégé Plug-In for Ontology Extraction from Text Based on Linguistic Analysis. In: Proceedings of the European Semantic Web Symposium ESWS-2004, Greece, May 2004.

[2] Faure D. / Nédellec C. / Rouveirol C.: Acquisition of Semantic Knowledge using Machine learning methods: The System ASIUM. Technical report number ICS-TR-88-16, 1998.

[3] Gomez-Perez A. / Manzano-Macho D.: A Survey of Ontology Learning Methods and Techniques. Deliverable 1.5, OntoWeb Project, 2003.

[4] Gruber T.: Towards principles for the design of ontologies used for knowledge sharing. Int. Journal of Human and Computer Studies 43(5/6), 1994, 907-928.

[5] Maedche A.: Ontology Learning for the Semantic Web. The Kluwer International Series in Engineering and Computer Science, Volume 665, 2003.

[6] Maedche A. / Staab S.: Semi-automatic Engineering of Ontologies from Text. In: Proceedings of the 12th International Conference on Software Engineering and Knowledge Engineering, 2000.

[7] Navigli R. / Velardi P. / Gangemi A.: Ontology Learning and its application to automated terminology translation. IEEE Intelligent Systems, vol. 18:1, January/February 2003.