puzzle-generator/vocab/README.md
2025-12-19 14:02:07 +01:00


Dutch Word List

Last updated: 2023-03-10

This repository contains the official OpenTaal Dutch word list, comprising over 400,000 words compiled from contributions and curated sources. The list is provided in UTF-8 encoding and is alphabetically sorted.

Contents

Primary File

  • wordlist.txt Complete UTF-8 word list (one word per line).
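
As a minimal sketch of reading the primary file, assuming only the one-word-per-line UTF-8 layout described above: the helper name `load_words` is illustrative and not part of the repository.

```python
from pathlib import Path


def load_words(path):
    """Return the non-empty lines of a UTF-8 word list, in file order."""
    text = Path(path).read_text(encoding="utf-8")
    # Strip surrounding whitespace and skip blank lines.
    return [line.strip() for line in text.splitlines() if line.strip()]
```

Calling `load_words("wordlist.txt")` from the directory that holds the file should return a list with one entry per word.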

Metadata

  • datetimeversion.txt Timestamp and version information.

Component Files

  • elements/basiswoorden-gekeurd.txt Approved base words (~200k entries).
  • elements/basiswoorden-ongekeurd.txt Unapproved base words, including proper nouns and compounds (~41k entries).
  • elements/flexies-ongekeurd.txt Unapproved inflections (~170k entries).
  • elements/wordparts.tsv Word parts containing spaces (TSV format).
  • elements/corrections.tsv Common misspellings with corrections (TSV format).
  • elements/romeinse-cijfers.txt Roman numerals (~4k entries).
  • elements/wordlist-ascii.txt ASCII-only subset (excludes accented characters).
  • elements/wordlist-non-ascii.txt Entries containing non-ASCII characters.
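
The ASCII and non-ASCII component files describe a simple partition of the word list. The sketch below shows one way such a split could be reproduced; it is an assumption about the approach, not the script used to build the files, and `split_by_ascii` is a hypothetical helper name.

```python
def split_by_ascii(words):
    """Partition words into (ascii_only, non_ascii), preserving order."""
    ascii_only, non_ascii = [], []
    for word in words:
        # str.isascii() is True when every character is in U+0000..U+007F.
        (ascii_only if word.isascii() else non_ascii).append(word)
    return ascii_only, non_ascii


# Made-up sample entries, for illustration only.
sample = ["aap", "café", "zee", "coördinatie"]
ascii_words, other_words = split_by_ascii(sample)
print(ascii_words)  # ['aap', 'zee']
print(other_words)  # ['café', 'coördinatie']
```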

Character Set

The word list includes standard Latin letters (a-z, A-Z), Dutch diacritics (e.g., é, ë, ï), superscript and subscript digits (e.g., ², ³), and these punctuation characters: ' . - / + & @ ?
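
To check entries against the character set described above, a small validator along these lines may help. The diacritic and digit sets here contain only the examples named in this section, since the full sets are not listed; treat `DIACRITICS`, `SUPER_SUB_DIGITS`, and `unexpected_chars` as illustrative names to extend as needed.

```python
import string

# Punctuation characters listed in this section.
PUNCTUATION = set("'.-/+&@?")
# Assumption: only the examples named in the text; the full sets are not given.
DIACRITICS = set("éëï")
SUPER_SUB_DIGITS = set("²³")

ALLOWED = set(string.ascii_letters) | PUNCTUATION | DIACRITICS | SUPER_SUB_DIGITS


def unexpected_chars(word):
    """Return the sorted characters of `word` that fall outside ALLOWED."""
    return sorted(set(word) - ALLOWED)


print(unexpected_chars("café"))    # []
print(unexpected_chars("m²"))      # []
print(unexpected_chars("hallo!"))  # ['!']
```

Words stored in decomposed Unicode form (a base letter followed by a combining accent) would be flagged by this check; normalizing with `unicodedata.normalize("NFC", word)` first avoids that.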