Dutch Word List
Last updated: 2023-03-10
This repository contains the official OpenTaal Dutch word list, comprising over 400,000 words compiled from contributions and curated sources. The list is provided in UTF-8 encoding and is alphabetically sorted.
Contents
Primary File
wordlist.txt – Complete UTF-8 word list (one word per line).
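As a minimal sketch of how the one-word-per-line layout could be read, the Python below loads the file into a list. The function names `parse_words` and `load_words` are hypothetical helpers for illustration and are not part of this repository. Only line endings are stripped, on the assumption that some entries may contain internal spaces.

```python
from pathlib import Path
from typing import Iterable, List


def parse_words(lines: Iterable[str]) -> List[str]:
    """Return one entry per non-empty line, with line endings removed."""
    words = []
    for line in lines:
        # Strip only the line terminator so spaces inside an entry survive.
        word = line.rstrip("\r\n")
        if word:
            words.append(word)
    return words


def load_words(path: str = "wordlist.txt") -> List[str]:
    """Read a word list file (hypothetical helper, not shipped with the list)."""
    # Assumption: the file is UTF-8 with one entry per line, as described above.
    with Path(path).open(encoding="utf-8") as handle:
        return parse_words(handle)
```

Calling `load_words()` from the repository root would then return the entries in file order.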
Metadata
datetimeversion.txt – Timestamp and version information.
Component Files
elements/basiswoorden-gekeurd.txt – Approved base words (~200k entries).
elements/basiswoorden-ongekeurd.txt – Unapproved base words, including proper nouns and compounds (~41k entries).
elements/flexies-ongekeurd.txt – Unapproved inflections (~170k entries).
elements/wordparts.tsv – Word parts containing spaces (TSV format).
elements/corrections.tsv – Common misspellings with corrections (TSV format).
elements/romeinse-cijfers.txt – Roman numerals (~4k entries).
elements/wordlist-ascii.txt – ASCII-only subset (excludes accented characters).
elements/wordlist-non-ascii.txt – Entries containing non-ASCII characters.
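The ASCII and non-ASCII files listed above describe a simple partition of the entries. A minimal sketch of such a partition follows. It is an illustration only, not necessarily how the project generates these files, and `split_by_ascii` is a hypothetical name.

```python
from typing import Iterable, List, Tuple


def split_by_ascii(words: Iterable[str]) -> Tuple[List[str], List[str]]:
    """Partition entries into (ASCII-only, containing non-ASCII characters)."""
    ascii_words: List[str] = []
    non_ascii_words: List[str] = []
    for word in words:
        # str.isascii() (Python 3.7+) is True when every character is ASCII.
        if word.isascii():
            ascii_words.append(word)
        else:
            non_ascii_words.append(word)
    return ascii_words, non_ascii_words
```

Input order is preserved within each group, so a sorted input yields two sorted outputs.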
Character Set
The word list contains standard Latin letters (a–z, A–Z), Dutch diacritics (e.g., é, ë, ï), superscript and subscript digits (e.g., ², ³), and these punctuation characters: ' . - / + & @ ?
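To check entries against the character set described above, something like the following sketch could be used. The diacritics are given only by example, so the code assumes that any accented Latin letter is acceptable. That rule, and the names `is_expected_char` and `unexpected_chars`, are assumptions for illustration.

```python
import string
import unicodedata

# Punctuation characters named in the character set description.
PUNCTUATION = set("'.-/+&@?")


def is_expected_char(ch: str) -> bool:
    """Return True if ch fits the described character set (with assumptions)."""
    if ch in string.ascii_letters or ch in PUNCTUATION:
        return True
    name = unicodedata.name(ch, "")
    category = unicodedata.category(ch)
    # Assumption: "Dutch diacritics" is read as any upper- or lowercase
    # Latin letter, e.g. LATIN SMALL LETTER E WITH ACUTE.
    if name.startswith("LATIN") and category in ("Ll", "Lu"):
        return True
    # Superscript and subscript digits such as SUPERSCRIPT TWO.
    if name.startswith(("SUPERSCRIPT", "SUBSCRIPT")) and category == "No":
        return True
    return False


def unexpected_chars(word: str) -> set:
    """Collect the characters in word that fall outside the described set."""
    return {ch for ch in word if not is_expected_char(ch)}
```

An empty result from `unexpected_chars` means the entry uses only characters covered by this reading of the description. Spaces are reported as unexpected because the description does not list them.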