m.crawl.thredds - List dataset URLs from a Thredds Data Server (TDS) catalog.


temporal, import, download, data, metadata, netcdf, thredds, opendap


m.crawl.thredds --help
m.crawl.thredds input=string [print=string[,string,...]] services=string[,string,...] [filter=string] [skip=string[,string,...]] [output=name] [separator=character] [modified_before=string] [modified_after=string] [authentication=name] [nprocs=Number of cores] [--overwrite] [--help] [--verbose] [--quiet] [--ui]


--overwrite
Allow output files to overwrite existing files
--help
Print usage summary
--verbose
Verbose module output
--quiet
Quiet module output
--ui
Force launching GUI dialog


input=string [required]
URL of a catalog on a thredds server
print=string[,string,...]
Additional information to print
Options: service, dataset_size
services=string[,string,...] [required]
Services of thredds server to crawl
Comma separated list of services names (lower case) of thredds server to crawl, typical services are: httpserver, netcdfsubset, opendap, wms
Default: httpserver
filter=string
Regular expression for filtering dataset and catalog URLs
Default: .*
skip=string[,string,...]
Regular expression(s) for skipping sub-catalogs / URLs (e.g. ".*jpeg.*,.*metadata.*")
output=name
Name of the output file (stdout if omitted)
Default: -
separator=character
Field separator
Special characters: pipe, comma, space, tab, newline
Default: pipe
modified_before=string
Latest modification timestamp of datasets to include in the output
ISO-formatted date or timestamp (e.g. "2000-01-01T12:12:55.03456Z" or "2000-01-01")
modified_after=string
Earliest modification timestamp of datasets to include in the output
ISO-formatted date or timestamp (e.g. "2000-01-01T12:12:55.03456Z" or "2000-01-01")
authentication=name
Authentication for thredds server
File with authentication information (username and password) for thredds server
nprocs=Number of cores
Number of cores to use for crawling thredds server
Default: 1

An increasing amount of spatio-temporal data, such as climate observations, forecast data, or satellite imagery, is provided through Thredds Data Servers (TDS).

m.crawl.thredds crawls the catalog of a Thredds Data Server (TDS) starting from the catalog-URL provided in the input. It is a wrapper module around the Python library thredds_crawler. m.crawl.thredds returns a list of dataset URLs, optionally with additional information on the service type and data size. Depending on the format of the crawled datasets, the output of m.crawl.thredds may be used as input to t.rast.import.netcdf.
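As an illustration of consuming the returned list, the following minimal Python sketch splits module output on the field separator. It assumes that the dataset URL is the first field on each line and that any printed extra information follows it. That column order, the helper name parse_crawl_output and the sample lines are assumptions made for this example, not documented behaviour.

```python
def parse_crawl_output(text, separator="|"):
    """Split each non-empty output line into (url, extra_fields)."""
    records = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        # Assumption: the URL comes first, optional extra fields follow.
        url, *extra = line.split(separator)
        records.append((url, extra))
    return records


# Hypothetical output lines; URLs and values are placeholders.
sample = (
    "https://example.org/thredds/a.nc|12.5\n"
    "https://example.org/thredds/b.nc|3.0\n"
)
records = parse_crawl_output(sample)
```

With the default pipe separator, each record keeps the URL apart from whatever additional fields were requested through the print option.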

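The returned list of datasets can be filtered by modification time (modified_before, modified_after) and by a regular expression matched against dataset and catalog URLs (filter).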
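The filtering logic can be sketched in plain Python as follows. This is an illustration under stated assumptions, not the module's implementation: the helper names parse_iso and keep_dataset are hypothetical, the regular expression is assumed to be applied with re.match, the time bounds are assumed to be inclusive, and all timestamps are treated as UTC.

```python
import re
from datetime import datetime

# Timestamp layouts matching the examples given for modified_before/modified_after.
ISO_FORMATS = ("%Y-%m-%dT%H:%M:%S.%fZ", "%Y-%m-%dT%H:%M:%SZ", "%Y-%m-%d")


def parse_iso(value):
    """Parse an ISO-formatted date or timestamp into a naive datetime."""
    for fmt in ISO_FORMATS:
        try:
            return datetime.strptime(value, fmt)
        except ValueError:
            continue
    raise ValueError(f"Unsupported timestamp: {value!r}")


def keep_dataset(url, modified, pattern=".*",
                 modified_after=None, modified_before=None):
    """Return True if the dataset passes the regex and time-window filters."""
    # A regular expression needs ".*" where a shell glob would use "*".
    if re.match(pattern, url) is None:
        return False
    stamp = parse_iso(modified)
    if modified_after and stamp < parse_iso(modified_after):
        return False
    if modified_before and stamp > parse_iso(modified_before):
        return False
    return True
```

For example, keep_dataset("https://example.org/seNorge_2018_2020.nc", "2021-02-05T10:00:00Z", pattern=".*2018_202.*", modified_after="2021-02-01") accepts the dataset, while the same call with modified_before="2021-02-01" rejects it.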

When crawling larger Thredds installations, skipping irrelevant branches of the server's tree of datasets can greatly speed up the process. The skip option takes a comma-separated list of regular expressions that exclude branches (and also leaf datasets) from the search; for example, ".*metadata.*" directs the module not to look for datasets inside a "metadata" directory.
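The effect of a comma-separated skip list can be sketched in Python as below. The helper should_skip and the sample URLs are hypothetical, and the sketch assumes that each pattern is tested with re.match against the full URL; the actual matching is delegated to thredds_crawler and may differ in detail.

```python
import re


def should_skip(url, skip_option):
    """Return True if any pattern in the comma-separated list matches the URL."""
    patterns = [p for p in skip_option.split(",") if p]
    return any(re.match(p, url) for p in patterns)


# Hypothetical catalog entries.
urls = [
    "https://example.org/thredds/catalog/data/2021/file.nc",
    "https://example.org/thredds/catalog/metadata/file.xml",
    "https://example.org/thredds/catalog/quicklooks/preview.jpeg",
]
kept = [u for u in urls if not should_skip(u, ".*jpeg.*,.*metadata.*")]
```

Only the first URL remains in kept; the "metadata" branch and the JPEG preview are excluded before any further requests would be made for them.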

Authentication to the Thredds Server (if required) can be provided either through a text file, where the first line contains the username and the second the password, or through interactive user input (if authentication=-). Alternatively, username and password can be passed through the environment variables THREDDS_USER and THREDDS_PASSWORD.
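A minimal Python sketch of resolving credentials from these sources is given below. The function read_credentials is hypothetical, the interactive case (authentication=-) is left out, and giving the file precedence over the environment variables is an assumption made for this example.

```python
import os


def read_credentials(path=None, environ=os.environ):
    """Return (username, password) from a two-line file or the environment."""
    if path:
        with open(path, encoding="utf-8") as handle:
            lines = handle.read().splitlines()
        if len(lines) < 2:
            raise ValueError("Expected username on line 1 and password on line 2")
        return lines[0].strip(), lines[1].strip()
    # Fall back to the environment variables named in this manual.
    user = environ.get("THREDDS_USER")
    password = environ.get("THREDDS_PASSWORD")
    if user and password:
        return user, password
    return None
```

Since the credentials file holds a plain-text password, it is sensible to restrict its read permissions to the owning user.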


The Thredds data catalog is crawled recursively. Providing the URL to the root of a catalog on a Thredds server with many hierarchies and datasets can therefore be quite time-consuming, even if executed in parallel (nprocs > 1).
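Why the starting point matters can be illustrated with a small Python sketch that walks a hypothetical in-memory catalog tree and counts one request per catalog visited. It uses a level-by-level traversal with a thread pool purely for illustration; this is not how thredds_crawler is implemented, and the CATALOG dictionary and the fetch and crawl functions are made up for this example.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical catalog tree: each catalog maps to (sub-catalogs, datasets).
CATALOG = {
    "root": (["obs", "model"], []),
    "obs": (["obs/2020", "obs/2021"], []),
    "obs/2020": ([], ["obs/2020/a.nc"]),
    "obs/2021": ([], ["obs/2021/b.nc", "obs/2021/c.nc"]),
    "model": ([], ["model/d.nc"]),
}


def fetch(catalog):
    """Stand-in for one HTTP request that reads a catalog."""
    return CATALOG[catalog]


def crawl(start, nprocs=1):
    """Visit all catalogs below start; return (datasets, number of requests)."""
    datasets, requests, level = [], 0, [start]
    with ThreadPoolExecutor(max_workers=nprocs) as pool:
        while level:
            # Catalogs on the same level are fetched in parallel.
            results = list(pool.map(fetch, level))
            requests += len(level)
            level = []
            for children, found in results:
                level.extend(children)
                datasets.extend(found)
    return sorted(datasets), requests
```

In this toy tree, crawl("root", nprocs=2) issues five requests to find four datasets, whereas crawl("obs/2021") needs a single request for its two datasets. Parallel workers shorten the wall-clock time per level but do not reduce the number of catalogs that have to be read.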


List modelled climate observation datasets from the Norwegian Meteorological Institute:
# Get a list of all data for "seNorge"
m.crawl.thredds input=""

# Get a list of the most recent data for "seNorge"
m.crawl.thredds input="" modified_after="2021-02-01"

# Get a list of the most recent data for "seNorge" that match a regular expression
# Note the "." before the "*"
m.crawl.thredds input="" \
modified_after="2021-02-01" filter=".*2018_202.*"

List Sentinel-2A data from the Norwegian Ground Segment (NBS) for 2 Feb 2021:
# Get a list of all Sentinel-2A data for 2. Feb 2021 with dataset size
m.crawl.thredds input="" print="data_size"|107.6

# Get a list of WMS end-points to all Sentinel-2A data for 2. Feb 2021
m.crawl.thredds input="" services="wms"


m.crawl.thredds is a wrapper around the thredds_crawler Python library.

SEE ALSO: t.rast.import.netcdf


Stefan Blumentrath, Norwegian Institute for Nature Research (NINA), Oslo


Available at: m.crawl.thredds source code (history)

