Data input

The data input component provides functions for reading remote sensing images, hotspot locations and (effective) wind fields.

Source databases

The location and type of emission sources are used as input for plume detection and emission quantification. ddeq uses xarray datasets to store point source information. The dataset contains source names (source), longitudes (lon_o), latitudes (lat_o), labels for visualization (label) and source types (type).
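
For illustration, a source dataset with the variables listed above can be built by hand. This is a minimal sketch, not the output of a ddeq reader: the use of source as the shared dimension, the type strings and the two example entries (with approximate coordinates) are assumptions.

```python
import xarray as xr

# Hand-built point source dataset with the variables named above.
# Assumption: "source" is the dimension shared by all variables.
sources = xr.Dataset(
    coords={"source": ["Berlin", "Janschwalde"]},
    data_vars={
        "lon_o": ("source", [13.41, 14.46]),  # approximate longitudes (degrees east)
        "lat_o": ("source", [52.52, 51.84]),  # approximate latitudes (degrees north)
        "label": ("source", ["Berlin", "Jänschwalde"]),  # text shown in plots
        "type": ("source", ["city", "power plant"]),  # illustrative type strings
    },
)

# Look up a single source by name.
berlin = sources.sel(source="Berlin")
print(float(berlin["lon_o"]), float(berlin["lat_o"]))
```

Keeping the name as the coordinate and the label as a separate variable allows a plain identifier for lookups and a display string with special characters for figures.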

CSV files

ddeq includes a small list of sources as a comma-separated values (CSV) file that primarily contains cities and power plants used in previous studies. User-defined files containing other sources can be prepared in the same format.

The CSV file can be read with the following function:

ddeq.misc.read_point_sources(filename=None)

Read a list of point sources and convert it to the format used by the plume detection algorithm.

Parameters:

filename (str, default: None) – Name of CSV file with point source information (see “sources.csv” in ddeq.DATA_PATH for an example).

Returns:

xarray dataset containing point source locations

Return type:

xr.Dataset

CoCO2 point source database

ddeq includes the CoCO2 global emission point source database:

ddeq.coco2.read_ps_catalogue(filename=None)

Read CoCO2 point source catalogue [Guevara2023] in the format supported by ddeq.

Parameters:

filename (str, default: None) – Name of CSV file with point source information from CoCO2 database (see “coco2_ps_catalogue_v1.1.csv” in ddeq.DATA_PATH for an example).

Returns:

xarray dataset containing point source locations

Return type:

xr.Dataset

Notes

[Guevara2023]

Guevara, M., Enciso, S., Tena, C., Jorba, O., Dellaert, S., Denier van der Gon, H., and Pérez García-Pando, C.: A global catalogue of CO2 emissions and co-emitted species from power plants at a very high spatial and temporal resolution, Earth Syst. Sci. Data Discuss. [preprint], https://doi.org/10.5194/essd-2023-95, in review, 2023.

Remote sensing images

ddeq requires that trace gas images are provided as an xr.Dataset with variables for the trace gas columns and their uncertainties (e.g. “CO2” and “CO2_precision”). These variables need a units attribute, which is used for automatic unit conversion, and a noise_level attribute, which is used as random uncertainty. In addition, the central longitude and latitude of the pixels need to be provided as lon and lat.

The variable time is used to link the remote sensing image with other input data such as wind fields and a priori information (i.e. sources dataset) and output datasets. The dimensions are currently (“time”,), i.e. a single time per image.
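
The requirements above can be sketched with a small synthetic image. This is only an illustration under stated assumptions: the dimension names nobs and nrows, the image size, the values and the placement of the noise_level attribute on the CO2 variable are hypothetical choices, not taken from ddeq.

```python
import numpy as np
import xarray as xr

ny, nx = 4, 3  # hypothetical image size
dims = ("nobs", "nrows")  # assumed names for the two image dimensions

# Pixel centers on a small regular grid (arrays of shape (ny, nx)).
lon, lat = np.meshgrid(np.linspace(8.0, 8.2, nx), np.linspace(47.0, 47.3, ny))

rng = np.random.default_rng(0)
image = xr.Dataset(
    data_vars={
        # Trace gas columns with the attributes described above.
        "CO2": (
            dims,
            410.0 + rng.normal(0.0, 0.7, size=(ny, nx)),
            {"units": "ppm", "noise_level": 0.7},
        ),
        # Per-pixel uncertainty of the trace gas columns.
        "CO2_precision": (dims, np.full((ny, nx), 0.7), {"units": "ppm"}),
    },
    coords={
        "lon": (dims, lon),
        "lat": (dims, lat),
        # A single time per image, stored with dimensions ("time",).
        "time": ("time", np.array(["2015-07-02T10:00"], dtype="datetime64[ns]")),
    },
)

print(image)
```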

Sentinel-5P/TROPOMI

Sentinel-5P/TROPOMI images can be read using xr.open_dataset after being downloaded and prepared by the ddeq.download_S5P module.

To iterate over TROPOMI data, a dataset class is available that loads TROPOMI data on demand. The class is used by the divergence method.

class ddeq.sats.Level2TropomiDataset
__init__(pattern, root='', qa_value=0.75)

Level-2 class for TROPOMI NO2 product.

Parameters:
  • pattern (str) – A filename pattern used to match the TROPOMI files for a given date. The date is formatted into the pattern to find the correct file, for example “S5P_NO2_%Y%m%d.nc”.

  • root (str) – Data path to TROPOMI files.

  • qa_value (float) – Minimum quality assurance (qa) value. The TROPOMI team recommends accepting values above 0.75.
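
The date formatting described for pattern can be sketched with the standard library alone. How the class resolves files internally is not documented here, so the following only shows how a strftime-style pattern expands for a given date; the root directory and the date are hypothetical.

```python
import os
from datetime import datetime

pattern = "S5P_NO2_%Y%m%d.nc"  # example pattern from the parameter description
root = "/data/tropomi"  # hypothetical data path

date = datetime(2021, 6, 15)

# strftime replaces %Y, %m and %d with year, month and day of the date.
filename = date.strftime(pattern)
path = os.path.join(root, filename)

print(path)
```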

classmethod __new__(*args, **kwargs)
read_date(date)

Returns a list of TROPOMI NO2 Level-2 data.

Parameters:

date (datetime.datetime)

Returns:

List of TROPOMI datasets for given date.

Return type:

list of xr.Dataset

Synthetic CO2M images from the SMARTCARB dataset

ddeq.smartcarb.read_level2(filename, co2_noise_scenario='medium', co2_cloud_threshold=0.01, co2_scaling=1.0, no2_noise_scenario='high', no2_cloud_threshold=0.3, no2_scaling=1.0, co_noise_scenario=None, co_cloud_threshold=0.05, co_scaling=1.0, make_no2_error_cloud_dependent=True, use_constant=False, seed='orbit', only_observations=True, add_background=False)

Read synthetic XCO2, NO2 and CO observations from the SMARTCARB project [Kuhlmann2020].

Parameters:
  • filename (str) – Name of SMARTCARB Level-2 file

  • co2_noise_scenario (str, optional) – Noise scenario used to add random uncertainty to the CO2 observations for vegetation albedo and solar zenith angle of 50° (VEG50 scenario): “low” -> 0.5 ppm, “medium” -> 0.7 ppm and “high” -> 1.0 ppm.

  • co2_cloud_threshold (float, optional) – Cloud fraction used for masking bad pixels with 1% default cloud fraction.

  • co2_scaling (float, optional) – Scaling applied to model tracer with anthropogenic CO2 emissions

  • no2_noise_scenario (str, optional) – Noise scenario used to add random uncertainty to the NO2 observations: “low” -> 1e15 molecules cm-2 or 15% (whichever is larger) and “high” -> 2e15 molecules cm-2 or 20% (whichever is larger)

  • no2_cloud_threshold (float, optional) – Cloud fraction used for masking bad pixels with 30% default cloud fraction.

  • no2_scaling (float, optional) – Scaling applied to model tracer with anthropogenic NO2 emissions.

  • co_noise_scenario (str, optional) – Noise scenario used to add random uncertainty to the CO observations: “low” -> 4e17 molecules cm-2 or 10% (whichever is larger) and “high” -> 4e17 molecules cm-2 or 20% (whichever is larger)

  • co_cloud_threshold (float, optional) – Cloud fraction used for masking bad pixels with 5% default cloud fraction.

  • co_scaling (float, optional) – Scaling applied to model tracer with anthropogenic CO emissions

  • make_no2_error_cloud_dependent (boolean, optional) – If True, NO2 uncertainty depends on cloud fraction.

  • use_constant (boolean, optional) – Use constant emissions if True and time-varying emissions otherwise.

  • seed (string, optional) – Seed used before generating the random noise for the Level-2 images. If seed is “orbit”, the seed is calculated from the trace gas, satellite and orbit number, so that the same image results every time the data is read, which is useful for benchmarking studies.

  • only_observations (boolean, optional) – If False, noise-free trace gas array without cloud filtering will be added to the dataset.

  • add_background (boolean, optional) – If True, add array containing the background tracers, i.e. from anthropogenic emissions outside the model domain and, for CO2, biospheric fluxes.

Returns:

CO2M Level-2 orbit from SMARTCARB dataset.

Return type:

xr.Dataset
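
Two ideas from the parameter list can be sketched in plain Python: the "whichever is larger" rule of the NO2 noise scenarios and a seed that makes the noise reproducible. This is not the ddeq implementation; the function names and the seed string are hypothetical, and only the scenario values are taken from the description above.

```python
import random

# (absolute floor in molecules cm-2, relative uncertainty) per NO2 scenario
NO2_SCENARIOS = {"low": (1e15, 0.15), "high": (2e15, 0.20)}


def no2_uncertainty(column, scenario="high"):
    """Return the larger of the absolute floor and the relative uncertainty."""
    floor, relative = NO2_SCENARIOS[scenario]
    return max(floor, relative * abs(column))


def add_noise(columns, scenario, seed):
    """Add Gaussian noise; the same seed always gives the same result."""
    rng = random.Random(seed)
    return [c + rng.gauss(0.0, no2_uncertainty(c, scenario)) for c in columns]


columns = [5e15, 2e16]  # hypothetical NO2 columns in molecules cm-2

# Hypothetical seed built from trace gas, satellite and orbit number.
seed = "NO2-CO2M_1-1234"

first = add_noise(columns, "high", seed)
second = add_noise(columns, "high", seed)
print(first == second)
```

For the small column the absolute floor applies, while for the large column the relative term dominates.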

Notes

[Kuhlmann2020]

Kuhlmann, G., Clément, V., Marshall, J., Fuhrer, O., Broquet, G., Schnadt-Poberaj, C., Löscher, A., Meijer, Y., & Brunner, D. (2020). Synthetic XCO2, CO and NO2 observations for the CO2M and Sentinel-5 satellites [Data set]. Zenodo. https://doi.org/10.5281/zenodo.4048228

class ddeq.smartcarb.Level2Dataset
__init__(data_path, constellation='ace', co2_noise_scenario='medium', co2_cloud_threshold=0.01, co2_scaling=1.0, no2_noise_scenario='high', no2_cloud_threshold=0.3, no2_scaling=1.0, co_noise_scenario=None, co_cloud_threshold=0.05, co_scaling=1.0, make_no2_error_cloud_dependent=True)

A container class to provide access to SMARTCARB Level-2 data for given constellation and uncertainty scenario.

Parameters:
  • data_path (str) – Path to SMARTCARB Level-2 files

  • constellation (str, optional) – Code used for CO2M constellation.

  • co2_noise_scenario (str, optional) – Noise scenario used to add random uncertainty to the CO2 observations for vegetation albedo and solar zenith angle of 50° (VEG50 scenario): “low” -> 0.5 ppm, “medium” -> 0.7 ppm and “high” -> 1.0 ppm.

  • co2_cloud_threshold (float, optional) – Cloud fraction used for masking bad pixels with 1% default cloud fraction.

  • co2_scaling (float, optional) – Scaling applied to model tracer with anthropogenic CO2 emissions

  • no2_noise_scenario (str, optional) – Noise scenario used to add random uncertainty to the NO2 observations: “low” -> 1e15 molecules cm-2 or 15% (whichever is larger) and “high” -> 2e15 molecules cm-2 or 20% (whichever is larger)

  • no2_cloud_threshold (float, optional) – Cloud fraction used for masking bad pixels with 30% default cloud fraction.

  • no2_scaling (float, optional) – Scaling applied to model tracer with anthropogenic NO2 emissions.

  • co_noise_scenario (str, optional) – Noise scenario used to add random uncertainty to the CO observations: “low” -> 4e17 molecules cm-2 or 10% (whichever is larger) and “high” -> 4e17 molecules cm-2 or 20% (whichever is larger)

  • co_cloud_threshold (float, optional) – Cloud fraction used for masking bad pixels with 5% default cloud fraction.

  • co_scaling (float, optional) – Scaling applied to model tracer with anthropogenic CO emissions

  • make_no2_error_cloud_dependent (boolean, optional) – If True, NO2 uncertainty depends on cloud fraction.

classmethod __new__(*args, **kwargs)
read_date(date)

Returns a list of SMARTCARB Level-2 data for the given date using the constellation and uncertainty scenario of the instance.

Synthetic CO2M images from the CoCO2 library of plumes

ddeq.coco2.read_level2(filename, data_path='.', co2_noise=0.7, no2_noise=3.3e-05, mask_out_of_domain=False, drop_duplicates=True)

Read CO2M-like Level-2 from CoCO2 library of plumes [Koene2022].

Parameters:
  • filename (str) – {team}_{region}_{suffix}.nc

  • data_path (str, optional) – Data path to filename.

  • co2_noise (float, optional) – Random noise added to CO2 fields (default: 0.7 ppm)

  • no2_noise (float, optional) – Random noise added to NO2 fields (default: 33 µmol m-2 = 2e15 cm-2)

  • mask_out_of_domain (boolean, optional) – For MicroHH simulations, remove CO2/NO2 values from CAMS outside MicroHH model domain.

  • drop_duplicates (boolean, optional) – If True, drop duplicated times.

Return type:

xr.Dataset
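
The default no2_noise is given in mol m-2, while the parameter description also quotes it in molecules cm-2. The conversion between the two can be checked with a short calculation; the helper name is hypothetical.

```python
AVOGADRO = 6.02214076e23  # molecules per mol (exact SI value)
CM2_PER_M2 = 1.0e4  # square centimeters per square meter


def mol_m2_to_molecules_cm2(column):
    """Convert a column density from mol m-2 to molecules cm-2."""
    return column * AVOGADRO / CM2_PER_M2


no2_noise = 3.3e-05  # default value in mol m-2 (33 µmol m-2)
print(f"{mol_m2_to_molecules_cm2(no2_noise):.2e} molecules cm-2")
```

The result is close to the 2e15 cm-2 quoted for the default.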

Notes

[Koene2022]

Erik Koene, & Dominik Brunner. (2022). CoCO2 WP4.1 Library of Plumes (1.0) [Data set]. Zenodo. https://doi.org/10.5281/zenodo.7448144

Wind fields

Effective winds are used by all methods to estimate emissions. A wind dataset is an xr.Dataset with dimensions of (“time”, “lon”, “lat”) or (“time”, “source”).

ddeq.era5.read(sng_filename=None, lvl_filename=None, method=None, levels=None, heights=None, weights=None, level_units='', times=None, extent=None, sources=None, lons=None, lats=None, height_offset=None)

Read and prepare ERA-5 data using different methods.

Parameters:
  • sng_filename (str) – Filename of ERA-5 data on a single model level with the following variables:

    • lon: longitude
    • lat: latitude
    • sp: surface pressure
    • z: geopotential
    • u10: u-wind at 10 m
    • v10: v-wind at 10 m
    • u100: u-wind at 100 m
    • v100: v-wind at 100 m
    • blh: boundary layer height

  • lvl_filename (str, optional) – Filename of ERA-5 data on model or pressure levels with the following variables:

    • z: geopotential (only pressure levels)
    • u: u-wind
    • v: v-wind
    • t: temperature (only model levels)
    • q: specific humidity (only model levels)

  • method (str) – Method used to compute the effective wind speed:

    • “10m”: Wind speed at 10 meters.
    • “100m”: Wind speed at 100 meters.
    • “levels”: Average of the model/pressure levels provided by the levels parameter. If level_units is “index”, levels are selected using the isel method; otherwise the sel method is used.
    • “heights”: Wind for heights given by the heights parameter.
    • “pbl-mean”: Mean wind in the planetary boundary layer.
    • “pbl-mid”: Wind at the middle of the planetary boundary layer.
    • “GNFR-A”: Wind weighted by the emission profile for the public power sector (i.e. GNFR category A).

  • levels (int, float) – Model/pressure levels used with “levels” method (see method description).

  • level_units (str) – Units of values provided by levels parameter (see method description).

  • heights (number) – Heights used for interpolation with “heights” method.

  • times (pd.Timestamp) – Select times from ERA5 nearest to the provided times.

  • extent (dict) – Clip the ERA5 field to the provided extent, given as a dict with the keys “north”, “west”, “south” and “east”. Longitudes use the range -180 to +180.

  • sources (xr.Dataset) – Source dataset with longitude and latitude of sources. If provided, winds will be interpolated to source locations.

  • lons (xr.DataArray) – Longitude on which model fields are interpolated.

  • lats (xr.DataArray) – Latitude on which model fields are interpolated.

  • height_offset (xr.DataArray) – Offset for height (TODO).

Returns:

Dataset of effective wind speed either on a longitude-latitude grid or for each source.

Return type:

xr.Dataset with dims (time, lon, lat) or (time, source)
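
Because the extent parameter expects longitudes between -180 and +180, a region defined with 0 to 360 longitudes has to be converted first. The following is a minimal sketch of such a conversion; the helper name and the region bounds are hypothetical, and only the dict keys are taken from the parameter description.

```python
def wrap_longitude(lon):
    """Map a longitude in degrees to the interval [-180, 180)."""
    return (lon + 180.0) % 360.0 - 180.0


# Hypothetical region crossing the prime meridian, given in 0..360 longitudes.
west_360, east_360 = 350.0, 10.0

extent = {
    "north": 55.0,
    "west": wrap_longitude(west_360),
    "south": 45.0,
    "east": wrap_longitude(east_360),
}

print(extent)
```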

ddeq.wind.read_smartcarb(time, lon, lat, radius=None, data_path='.', method='linear', average=False)

Read SMARTCARB winds at a given location and time. The winds are interpolated from the SMARTCARB model grid to the given lon and lat. If lon and lat are given as scalars, a radius around the location can be provided for which winds are extracted.

Parameters:
  • radius – Size of the square around the location, given in rotated degrees, used for averaging.

  • data_path – Path to SMARTCARB wind fields.

  • method – Interpolation method (used by the xr.DataArray.interp method).

  • average – Average the extracted winds.

Returns:

xr.Dataset with wind components U and V as well as wind speed and direction.
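
The relation between the wind components and the wind speed and direction can be sketched as follows. The direction convention used by ddeq is not stated here, so this example assumes the meteorological convention (direction from which the wind blows, in degrees clockwise from north); the function name is hypothetical.

```python
import math


def wind_speed_direction(u, v):
    """Return wind speed and meteorological wind direction in degrees.

    u is the eastward and v the northward wind component.
    """
    speed = math.hypot(u, v)
    # atan2(v, u) is the direction the wind blows towards, measured
    # counterclockwise from east; convert it to "from" direction,
    # clockwise from north.
    direction = (270.0 - math.degrees(math.atan2(v, u))) % 360.0
    return speed, direction


speed, direction = wind_speed_direction(3.0, 4.0)
print(speed, direction)
```

With this convention a westerly wind (u > 0, v = 0) has a direction of 270 degrees.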