Building a multilingual database with wordnets for several European languages.
EuroWordNet was a European resources and development project supported by the Human Language Technology sector of the Telematics Applications Programme (see also the description of all the funded projects).

EuroWordNet is a multilingual database with wordnets for several European languages (Dutch, Italian, Spanish, German, French, Czech and Estonian). The wordnets are structured in the same way as the American wordnet for English (Princeton WordNet, Miller et al. 1990) in terms of synsets (sets of synonymous words) with basic semantic relations between them. Each wordnet represents a unique language-internal system of lexicalizations. In addition, the wordnets are linked to an Inter-Lingual-Index, based on the Princeton wordnet. Via this index, the languages are interconnected so that it is possible to go from the words in one language to similar words in any other language. The index also gives access to a shared top-ontology of 63 semantic distinctions. This top-ontology provides a common semantic framework for all the languages, while language-specific properties are maintained in the individual wordnets. The database can be used for, among other things, monolingual and cross-lingual information retrieval, as the users in the project demonstrated.
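The route from a word in one language to similar words in another, via the Inter-Lingual-Index, can be illustrated with a minimal sketch. This is not the EuroWordNet database format or the Polaris API: the class names, the `translate` method, the language codes and the index identifier `ili-dog` are all hypothetical, and the sketch assumes each synset links to exactly one index record, which is a simplification.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Synset:
    """A set of synonymous words in one language (hypothetical structure)."""
    language: str
    words: tuple
    ili_id: str  # assumed: one Inter-Lingual-Index record per synset


class MultilingualIndex:
    """Toy store that connects language-specific synsets through a shared index."""

    def __init__(self):
        self._by_ili = {}   # ili_id -> synsets in any language
        self._by_word = {}  # (language, word) -> synsets containing that word

    def add(self, synset):
        self._by_ili.setdefault(synset.ili_id, []).append(synset)
        for word in synset.words:
            self._by_word.setdefault((synset.language, word), []).append(synset)

    def translate(self, word, source, target):
        """Return target-language words that share an index record with `word`."""
        results = []
        for synset in self._by_word.get((source, word), []):
            for other in self._by_ili.get(synset.ili_id, []):
                if other.language != target:
                    continue
                for candidate in other.words:
                    if candidate not in results:
                        results.append(candidate)
        return results


# Illustrative data only; "ili-dog" is a made-up identifier.
index = MultilingualIndex()
index.add(Synset("nl", ("hond",), "ili-dog"))
index.add(Synset("it", ("cane",), "ili-dog"))
index.add(Synset("es", ("perro",), "ili-dog"))

print(index.translate("hond", "nl", "es"))
```

Because every wordnet points only at the shared index, adding a further language in this sketch means adding its synsets with index identifiers; no pairwise links between languages are needed.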
The EuroWordNet project was completed in the summer of 1999. The design of the database, the defined relations, the top-ontology and the Inter-Lingual-Index are now frozen. Nevertheless, many other institutes and research groups are developing similar wordnets in other languages (European and non-European) using the EuroWordNet specification. If compatible, these wordnets can be added to the above database and, via the index, connected to any other wordnet. The EuroWordNet format is defined by the EuroWordNet Database Editor Polaris. A specification can be found in the user manual of the database. To our knowledge, wordnets are currently being developed for Swedish, Norwegian, Danish, Greek, Portuguese, Basque, Catalan, Romanian, Lithuanian, Russian, Bulgarian and Slovenian.
The cooperative framework of EuroWordNet is continued through the Global WordNet Association. This is a free and public association that builds on EuroWordNet and Princeton WordNet. Its aim is to stimulate the further building of wordnets, further standardization and interlinking, the development of tools, and the dissemination of information.
CTO
Irion Technologies, Delft
Email: Piek.Vossen@irion.nl