DWD weather forecast parser

Parses DWD open weather data into JSON. Detailed or simple worldwide hourly 10-day forecasts can be generated automatically, without the need for a dedicated API.

Deutscher Wetterdienst (DWD, German Meteorological Service) provides free weather forecasts on their open data server. Amongst others, the so-called MOSMIX forecasts are pretty detailed and include hourly data for 10 days in advance. While there is no official live API, the corresponding file can easily be downloaded and parsed into JSON, which is more suitable for further processing.

This project is the basis for Wäsche-Wetter, the Germany-wide live forecast on the best conditions for drying laundry.

The provided script parses the XML-based upstream format and is designed to be easy to set up and use. It should also be relatively fast with a small memory footprint, which makes it suitable for very small (or very big) deployments. No real API is needed when it is sufficient to access or serve the generated JSON file(s) locally. The output format should be convenient for post-processing or plotting with custom scripts. Alternatively, the parser class can be imported for custom implementations.

If working with or extending a whole framework is an option, the Wetterdienst project might be an alternative worth looking into. Its parser is very similar but does not fit the standalone use case at hand.

Usage

The MOSMIX parser is provided as a single Python script and can be invoked directly on KMZ or KML files:

usage: parse_dwd_mosmix.py [-h] --in-file MOSMIX.KMZ --out-file FILE.JSON {timestamps,stations,forecasts} ...

Parser for (possibly compressed) DWD MOSMIX KML XML files into JSON.

optional arguments:
  -h, --help            show this help message and exit
  --in-file MOSMIX.KMZ  input kmz/kml file to read
  --out-file FILE.JSON  output json file to write

parser modes:
    timestamps          parse declared forecast timestamps
    stations            parse station information
      --timezones       determine timezones from coordinates
    forecasts           parse per-station forecasts
      --limit STATIONS  comma-separated list of stations

The most recent forecasts can be obtained from the DWD open data server.

Apart from the input and output files, one of the three modes (timestamps, stations, or forecasts) must be given:

Typically, 240 sample points are provided, representing hourly values for the next 10 days. The corresponding points in time are written as standard integer Unix timestamps (UTC) via:

./parse_dwd_mosmix.py --in-file MOSMIX_S_LATEST_240.kmz --out-file timestamps.json timestamps
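For post-processing, the timestamps usually need to be turned into proper date objects. The following is a minimal sketch, assuming that timestamps.json holds a flat JSON list of integers; the function names are made up for illustration and are not part of the parser.

```python
import json
from datetime import datetime, timezone


def to_datetimes(timestamps):
    """Convert integer Unix timestamps into timezone-aware UTC datetimes."""
    return [datetime.fromtimestamp(ts, tz=timezone.utc) for ts in timestamps]


def load_timestamps(path):
    """Read the file written by the timestamps mode.

    Assumption: the file contains a flat JSON list of integers.
    """
    with open(path, encoding="utf-8") as handle:
        return to_datetimes(json.load(handle))
```

Calling load_timestamps("timestamps.json") would then return one datetime per forecast step, which can be converted to a station's local time with the standard astimezone() method.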

The list of contained stations should stay relatively constant but can be extracted by:

./parse_dwd_mosmix.py --in-file MOSMIX_S_LATEST_240.kmz --out-file stations.json stations --timezones

With --timezones, the tz property will contain the station’s timezone as guessed from its coordinates (if the timezonefinder package is installed). A typical result would look like:

{
    "desc": "HANNOVER",
    "ele": 56,
    "lat": 52.47,
    "lng": 9.68,
    "name": "10338",
    "tz": "Europe/Berlin"
},
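A common follow-up task is to find the station closest to a given location. The sketch below uses the haversine formula on the lat and lng properties shown above, assuming they are decimal degrees. Whether stations.json is a JSON array or an object keyed by station is not assumed here, so both are accepted; all function names are hypothetical.

```python
import json
import math


def as_station_list(data):
    """Accept either a JSON array of stations or an object keyed by station."""
    return list(data.values()) if isinstance(data, dict) else list(data)


def haversine_km(lat1, lng1, lat2, lng2):
    """Great-circle distance in kilometres between two coordinates."""
    radius_km = 6371.0  # mean Earth radius
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lng2 - lng1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    return 2 * radius_km * math.asin(math.sqrt(a))


def nearest_station(stations, lat, lng):
    """Return the station record closest to the given coordinate."""
    return min(stations, key=lambda s: haversine_km(lat, lng, s["lat"], s["lng"]))


def load_stations(path):
    """Read the file written by the stations mode."""
    with open(path, encoding="utf-8") as handle:
        return as_station_list(json.load(handle))
```

The linear scan is sufficient for a few thousand stations; a spatial index would only pay off for many repeated lookups.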

The actual forecasts for all stations, or for a subset selected with --limit, can then be obtained via:

./parse_dwd_mosmix.py --in-file MOSMIX_S_LATEST_240.kmz --out-file forecasts.json forecasts --limit "HANNOVER,STUTTGART-ECHT.,NEW YORK"

This will give entries such as:

"HANNOVER": {
    "FF": [7.72, 7.2, 6.69, 6.69, 6.69, 6.69, 6.17, 6.69, 6.69, 6.17, 5.66, 5.66, 5.14, 5.14, 5.14, 5.14, 4.63, …
    "DD": [166.0, 176.0, 185.0, 191.0, 198.0, 197.0, 193.0, 192.0, 192.0, 203.0, 196.0, 199.0, 191.0, 198.0, …
    "TTT": [286.75, 287.05, 287.55, 288.45, 288.95, 289.45, 289.55, 289.05, 288.45, 287.75, 287.65, 287.05, …
    "Rh50": [null, null, null, null, null, null, null, null, null, 6.0, null, null, null, null, null, null, null, …
    …
},
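To work with such an entry, the value lists can be combined with the timestamps from above. The following sketch assumes that each list is index-aligned with the timestamps list (one value per forecast step) and that JSON null marks a missing value, as in the Rh50 row. The function name is hypothetical.

```python
from datetime import datetime, timezone


def temperature_series(timestamps, forecast):
    """Pair each timestamp with the TTT value converted from K to °C.

    Assumption: forecast["TTT"] has one entry per timestamp.
    Missing values (None, i.e. JSON null) are skipped.
    """
    series = []
    for ts, kelvin in zip(timestamps, forecast["TTT"]):
        if kelvin is None:
            continue
        when = datetime.fromtimestamp(ts, tz=timezone.utc)
        series.append((when, round(kelvin - 273.15, 2)))
    return series
```

With both files loaded via json.load, a call like temperature_series(timestamps, forecasts["HANNOVER"]) would give a list of (datetime, °C) pairs that is ready for plotting.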

Installation

No installation is needed. The script is self-contained and can be executed locally without additional dependencies. Tested with Python 3.8.

However, for a more convenient invocation, it can also easily be installed via pip (using sudo for system-wide installation):

pip install .[extra]
pip uninstall parse-dwd-mosmix

With extra, the optional packages are installed as well: lxml for improved XML parsing performance, and timezonefinder for determining station timezones from their coordinates.

Background

The MOSMIX forecast approach relies on a statistical combination of weather prediction models with meteorological observations. The data is provided in the KMZ file format, which is merely a compressed KML XML file. More documentation on the broader topic of freely available information can be found at the DWD open data reference page.
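Since KMZ is a ZIP container, the embedded KML document can be inspected with the Python standard library alone. The sketch below is not taken from the parser script. It simply returns the first archive member ending in .kml, under the assumption that a MOSMIX archive contains exactly one such file; the function names are hypothetical.

```python
import io
import zipfile


def read_kml(source):
    """Return the KML document inside a KMZ archive as bytes.

    `source` may be a file path or a binary file-like object.
    """
    with zipfile.ZipFile(source) as archive:
        kml_names = [n for n in archive.namelist() if n.lower().endswith(".kml")]
        if not kml_names:
            raise ValueError("archive contains no .kml member")
        return archive.read(kml_names[0])


def read_kml_bytes(kmz_bytes):
    """Same as read_kml, for KMZ content that is already in memory."""
    return read_kml(io.BytesIO(kmz_bytes))
```

Reading the whole document into memory is fine for a quick look. For the full worldwide file, a streaming XML parser keeps the memory footprint smaller.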

Weather Stations

There are currently 5971 worldwide stations listed, with emphasis on Europe (and Germany in particular):

cat stations.json | jq -r '.[] .tz' | cut -d / -f 1 | sort | uniq -c
  182 Africa
  489 America
   11 Antarctica
    3 Arctic
  559 Asia
   30 Atlantic
   41 Australia
  225 Etc
 4374 Europe
   17 Indian
   40 Pacific
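The same tally can be produced in Python when jq is not at hand. This is a rough equivalent of the pipeline above, assuming the station records are available as a list of dictionaries with a tz property; unlike the jq variant, records without a timezone are skipped. The function name is hypothetical.

```python
from collections import Counter


def count_regions(stations):
    """Count stations per top-level timezone region, e.g. 'Europe'."""
    return Counter(
        station["tz"].split("/")[0]
        for station in stations
        if station.get("tz")  # skip records without a timezone
    )
```

Iterating over sorted(count_regions(stations).items()) gives the regions in the same alphabetical order as the listing above.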

The official list is provided as a plain text file.

Forecast Values

For example, the following forecast values seem to be especially interesting for general-purpose applications:

TTT
Temperature 2m above surface (K)
Td
Dewpoint 2m above surface (K). Note that this value can be used to calculate a relative humidity value via the Magnus formula.
FF
Wind speed (m/s)
DD
Wind direction (0°–360°)
N
Total cloud cover (%)
wwM
Probability for fog within the last hour (%)
RR1c
Total precipitation (i.e., rain, snow, etc.) during the last hour (kg/m2). Note that for rain water under normal circumstances, 1 kg/m2 equals 1 l/m2, which corresponds to a water depth of 1 mm (SI units FTW).
ww
Significant weather as single value describing the overall conditions (predefined constants)
Rad1h
Global irradiance (kJ/m2)
SunD1
Sunshine duration during the last hour (s)
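The relative humidity mentioned for Td can be derived from TTT and Td as follows. This is a sketch of the Magnus approximation over liquid water, using one commonly used pair of coefficients (17.625 and 243.04 °C); other parameterizations exist and give slightly different results. The function name is hypothetical.

```python
import math

MAGNUS_A = 17.625  # dimensionless coefficient (assumed parameterization)
MAGNUS_B = 243.04  # coefficient in °C (assumed parameterization)


def relative_humidity(ttt_kelvin, td_kelvin):
    """Approximate relative humidity in percent from TTT and Td (both in K).

    Returns None if either input is missing.
    """
    if ttt_kelvin is None or td_kelvin is None:
        return None
    temperature = ttt_kelvin - 273.15  # Magnus formula expects °C
    dewpoint = td_kelvin - 273.15

    def exponent(celsius):
        return MAGNUS_A * celsius / (MAGNUS_B + celsius)

    # Ratio of actual to saturation vapour pressure
    return 100.0 * math.exp(exponent(dewpoint) - exponent(temperature))
```

For an air temperature of 20 °C and a dewpoint of 10 °C, this yields a relative humidity of a little over 50 %. When both values are equal, the air is saturated and the result is 100 %.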

For an authoritative reference, structured information is available, as well as an overview of the elements included in the “S” and/or “L” variants.

Code & Download