Digital Dictionary of the German Language
The Digital Dictionary of the German Language (German: Digitales Wörterbuch der deutschen Sprache, DWDS) is a project of the Berlin-Brandenburg Academy of Sciences and Humanities whose aim is to create a digital dictionary system based on very large electronic text corpora.
It builds on the six-volume Dictionary of Contemporary German (German: Wörterbuch der deutschen Gegenwartssprache, WDG) and links it with its own text and dictionary resources. It provides the user with the latest spelling, pronunciation in the form of audio files and a wide range of information on the form, use and meaning of its headwords.
Components
In its current version, the Word Information System, the DWDS links four types of lexical information: dictionary articles from the WDG; automatically generated information on synonyms, hyponyms and hypernyms from the WDG; textual examples from the DWDS core corpus; and statistical co-occurrence information from the core corpus (so-called collocations, which indicate how frequently neighbouring words occur together).
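The idea behind co-occurrence statistics can be illustrated with a minimal sketch that counts how often two words appear next to each other in a handful of toy sentences. The function name `count_cooccurrences`, the window size and the sample sentences are assumptions made purely for illustration; the sketch does not reproduce the method the DWDS uses to compute its collocations.

```python
from collections import Counter


def count_cooccurrences(sentences, window=1):
    """Count unordered word pairs occurring within `window` tokens of each other."""
    counts = Counter()
    for sentence in sentences:
        tokens = sentence.lower().split()
        for i, left in enumerate(tokens):
            # Look only at the following `window` tokens so that each
            # pair of positions is counted exactly once.
            for right in tokens[i + 1 : i + 1 + window]:
                if left != right:
                    counts[tuple(sorted((left, right)))] += 1
    return counts


# Hypothetical sample data, not taken from any DWDS corpus.
sample = [
    "starker Kaffee am Morgen",
    "ein starker Kaffee hilft",
    "starker Regen am Abend",
]
pairs = count_cooccurrences(sample, window=1)
print(pairs.most_common(3))
```

Raw neighbour counts like these are only a starting point; corpus tools typically weight them with an association measure so that very frequent words do not dominate the results.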
Dictionary
The Dictionary of Contemporary German (WDG) was compiled at the German Academy of Sciences at Berlin (from 7 October 1972: Academy of Sciences of the GDR) between 1952 and 1977 under the direction of Ruth Klappenbach. The WDG comprises over 4,500 pages and contains 60,000 headwords, or 121,000 if compounds are included. From February 2002 to March 2004, the WDG was digitised, structured and prepared for research under the leadership of the Berlin-Brandenburg Academy of Sciences and Humanities. The text corpus was compiled and expanded between 2000 and 2003 with the support of the German Research Foundation (DFG) and has been available as a reference work on a website since March 2003.
Text corpora
The text corpora for the DWDS are continuously being expanded. As of May 2018, they comprise 13 billion running words (tokens) and consist of two large sub-corpora: the core corpus and the supplementary corpus.
The DWDS core corpus comprises about 100 million text words; it is spread evenly over the entire 20th century and balanced by text type. Four text types form the basis of the corpus: fiction (28.42%), newspaper texts (27.36%), academic texts (23.15%) and utility texts (21.05%). Because complete temporal balance could not be achieved for the transcribed texts of spoken language, these are available as an independent corpus under Special Corpora. The core corpus is regarded as at least equal in quality to the British National Corpus (BNC), which was previously considered the standard.
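The stated shares can be turned into rough word counts per text type. The sketch below assumes a round total of exactly 100 million words, whereas the text only says "about 100 million", so the results are approximations; the names `shares_percent` and `approximate_words` are illustrative. The four percentages add up to 99.98%, presumably because of rounding in the published figures.

```python
CORE_CORPUS_WORDS = 100_000_000  # assumption: "about 100 million" taken as exact

# Shares of the four text types as given in the text.
shares_percent = {
    "fiction": 28.42,
    "newspaper": 27.36,
    "academic texts": 23.15,
    "utility texts": 21.05,
}


def approximate_words(total, shares):
    """Convert percentage shares into approximate absolute word counts."""
    return {name: round(total * pct / 100) for name, pct in shares.items()}


estimates = approximate_words(CORE_CORPUS_WORDS, shares_percent)
for name, words in estimates.items():
    print(f"{name}: ~{words / 1_000_000:.2f} million words")
```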
The DWDS has concluded usage agreements for copyrighted texts with over 20 publishers and numerous public and private text providers. As a result, it can make works by authors such as Thomas and Heinrich Mann, Martin Walser, Heinrich Böll, Jürgen Habermas and Victor Klemperer available for internet research. The DWDS core corpus is the first reference corpus of the German language of the 20th century.[1]
The supplementary corpus comprises over 1.5 billion text words in about 3.5 million documents. It is designed less for balance than for volume and topicality, and consists mainly of newspaper sources from the years 1980–2006. All sources can be cited bibliographically, and care was taken during preparation to ensure a spread of content and quality.
Open corpora
The DWDS corpora can be searched free of charge. However, because of the usage agreements with the rights holders, prior registration is required for a large number of texts. More than 10,000 users are registered in the DWDS Word Information System.
- DWDS core corpus
- Der Tagesspiegel corpus (1996–2005)
- Berliner Zeitung corpus (1946–1993)
- Berliner Zeitung corpus (1994–2005)
- Corpus of Jewish periodicals of the 19th and 20th centuries
- DDR corpus (9 million words)
- Neues Deutschland corpus (1946–1990)
- Die ZEIT corpus (1946–2016)
- Spoken language corpus
The core of the DWDS dictionary is the Wörterbuch der deutschen Gegenwartssprache (Dictionary of Contemporary German). Approximately 2,600 of the 90,000 entries in the WDG that had GDR-typical content or wording were revised by the DWDS project group. A group of lexicographers reformulated the meanings and examples in more neutral terms or, where they illustrated genuinely GDR-specific usage, marked them accordingly. This revision affected approximately 2,500 entries.[2]
References