Utilities for large scale climate data analysis

The xarrayutils.utils module contains several functions that have proven useful in several of my day to day projects with both observational and model data.

Linear regression

One of the operations many scientists do is calculating a linear trend along a specified dimension (e.g. time) on each grid point of a dataset. linear_trend makes this very easy. For demonstration purposes lets load some monthly gridded Argo data from APDRC

[22]:
import xarray as xr
import numpy as np

%matplotlib inline
[23]:
path = 'http://apdrc.soest.hawaii.edu:80/dods/public_data/Argo_Products/monthly_mean/monthly_mixed_layer'
ds = xr.open_dataset(path, use_cftime=True)
ds
<frozen importlib._bootstrap>:219: RuntimeWarning: numpy.ndarray size changed, may indicate binary incompatibility. Expected 80 from C header, got 88 from PyObject
[23]:
<xarray.Dataset>
Dimensions:  (lat: 60, lon: 120, time: 240)
Coordinates:
  * time     (time) object 2001-01-15 00:00:00 ... 2020-12-15 00:00:00
  * lat      (lat) float64 -88.5 -85.5 -82.5 -79.5 -76.5 ... 79.5 82.5 85.5 88.5
  * lon      (lon) float64 1.5 4.5 7.5 10.5 13.5 ... 349.5 352.5 355.5 358.5
Data variables:
    mld      (time, lat, lon) float32 ...
    smld     (time, lat, lon) float32 ...
    nmld     (time, lat, lon) float32 ...
    ild      (time, lat, lon) float32 ...
    sild     (time, lat, lon) float32 ...
    nild     (time, lat, lon) float32 ...
    ttd      (time, lat, lon) float32 ...
    sttd     (time, lat, lon) float32 ...
    nttd     (time, lat, lon) float32 ...
    blt      (time, lat, lon) float32 ...
    sblt     (time, lat, lon) float32 ...
    nblt     (time, lat, lon) float32 ...
    tid      (time, lat, lon) float32 ...
    stid     (time, lat, lon) float32 ...
    ntid     (time, lat, lon) float32 ...
    mlt      (time, lat, lon) float32 ...
    smlt     (time, lat, lon) float32 ...
    nmlt     (time, lat, lon) float32 ...
    mls      (time, lat, lon) float32 ...
    smls     (time, lat, lon) float32 ...
    nmls     (time, lat, lon) float32 ...
Attributes:
    title:          3x3 bin-averaged Mixed Layer Monthly mean (from 2001)
    Conventions:    COARDS\nGrADS
    dataType:       Grid
    documentation:  http://apdrc.soest.hawaii.edu/projects/Argo/index.html
    history:        Wed Jan 06 09:40:44 HST 2021 : imported by GrADS Data Ser...
[24]:
ds.load()
[24]:
<xarray.Dataset>
Dimensions:  (lat: 60, lon: 120, time: 240)
Coordinates:
  * time     (time) object 2001-01-15 00:00:00 ... 2020-12-15 00:00:00
  * lat      (lat) float64 -88.5 -85.5 -82.5 -79.5 -76.5 ... 79.5 82.5 85.5 88.5
  * lon      (lon) float64 1.5 4.5 7.5 10.5 13.5 ... 349.5 352.5 355.5 358.5
Data variables:
    mld      (time, lat, lon) float32 nan nan nan nan nan ... nan nan nan nan
    smld     (time, lat, lon) float32 nan nan nan nan nan ... nan nan nan nan
    nmld     (time, lat, lon) float32 0.0 0.0 0.0 0.0 0.0 ... 0.0 0.0 0.0 0.0
    ild      (time, lat, lon) float32 nan nan nan nan nan ... nan nan nan nan
    sild     (time, lat, lon) float32 nan nan nan nan nan ... nan nan nan nan
    nild     (time, lat, lon) float32 0.0 0.0 0.0 0.0 0.0 ... 0.0 0.0 0.0 0.0
    ttd      (time, lat, lon) float32 nan nan nan nan nan ... nan nan nan nan
    sttd     (time, lat, lon) float32 nan nan nan nan nan ... nan nan nan nan
    nttd     (time, lat, lon) float32 0.0 0.0 0.0 0.0 0.0 ... 0.0 0.0 0.0 0.0
    blt      (time, lat, lon) float32 nan nan nan nan nan ... nan nan nan nan
    sblt     (time, lat, lon) float32 nan nan nan nan nan ... nan nan nan nan
    nblt     (time, lat, lon) float32 0.0 0.0 0.0 0.0 0.0 ... 0.0 0.0 0.0 0.0
    tid      (time, lat, lon) float32 nan nan nan nan nan ... nan nan nan nan
    stid     (time, lat, lon) float32 nan nan nan nan nan ... nan nan nan nan
    ntid     (time, lat, lon) float32 0.0 0.0 0.0 0.0 0.0 ... 0.0 0.0 0.0 0.0
    mlt      (time, lat, lon) float32 nan nan nan nan nan ... nan nan nan nan
    smlt     (time, lat, lon) float32 nan nan nan nan nan ... nan nan nan nan
    nmlt     (time, lat, lon) float32 0.0 0.0 0.0 0.0 0.0 ... 0.0 0.0 0.0 0.0
    mls      (time, lat, lon) float32 nan nan nan nan nan ... nan nan nan nan
    smls     (time, lat, lon) float32 nan nan nan nan nan ... nan nan nan nan
    nmls     (time, lat, lon) float32 0.0 0.0 0.0 0.0 0.0 ... 0.0 0.0 0.0 0.0
Attributes:
    title:          3x3 bin-averaged Mixed Layer Monthly mean (from 2001)
    Conventions:    COARDS\nGrADS
    dataType:       Grid
    documentation:  http://apdrc.soest.hawaii.edu/projects/Argo/index.html
    history:        Wed Jan 06 09:40:44 HST 2021 : imported by GrADS Data Ser...

Lets find out how much the salinity in each grid point changed over the full period (20 years)

[25]:
from xarrayutils.utils import linear_trend
[26]:
# create an array
salinity_regressed = linear_trend(ds.mls, 'time')
salinity_regressed
[26]:
<xarray.Dataset>
Dimensions:    (lat: 60, lon: 120)
Coordinates:
  * lat        (lat) float64 -88.5 -85.5 -82.5 -79.5 ... 79.5 82.5 85.5 88.5
  * lon        (lon) float64 1.5 4.5 7.5 10.5 13.5 ... 349.5 352.5 355.5 358.5
Data variables:
    slope      (lat, lon) float64 nan nan nan nan ... 0.08067 0.07533 -0.005028
    intercept  (lat, lon) float64 nan nan nan nan ... 20.33 22.07 22.88 32.38
    r_value    (lat, lon) float64 nan nan nan nan nan ... 1.0 0.9608 1.0 -0.116
    p_value    (lat, lon) float64 nan nan nan nan nan ... nan 0.03919 nan 0.884
    std_err    (lat, lon) float64 nan nan nan nan ... 0.0 0.01646 0.0 0.03045

Now we can plot the slope as a map

[27]:
salinity_regressed.slope.plot(robust=True)
[27]:
<matplotlib.collections.QuadMesh at 0x7f18a6d33280>
_images/utils_9_1.png

linear_trend converts the dimension over which to integrate into logical indicies, so the units of the plot above are (salinity/timestep of the original product), so here PSS/month.

Correlation maps

But what about a bit more complex task? Lets find out how mixedlayer salinity and temperature correlate. For this we use xr_linregress (for which linear_trend is just a thin wrapper):

[28]:
from xarrayutils.utils import linear_trend, xr_linregress
[29]:
tempxsalt = xr_linregress(ds.mlt, ds.mls, dim='time')
[30]:
tempxsalt.r_value.plot()
[30]:
<matplotlib.collections.QuadMesh at 0x7f18a6857700>
_images/utils_15_1.png

This works in any dimension the dataset has:

[31]:
tempxsalt = xr_linregress(ds.mlt, ds.mls, dim='lon')
/srv/conda/envs/notebook/lib/python3.8/site-packages/xarray/core/computation.py:724: RuntimeWarning: invalid value encountered in sqrt
  result_data = func(*input_data)
[32]:
tempxsalt.r_value.plot(x='time')
[32]:
<matplotlib.collections.QuadMesh at 0x7f18a5a87850>
_images/utils_18_1.png

This map shows that in lower latitudes spatial patterns of salinity are generally anticorrlated with temperature and vice versa in the high latitudes.

Hatch sign agreement

It can often be useful to indicate if the sign along an averaged (or otherwise aggregated) dimension. For instance to show if forced changes have a consistent sign in reference to a multi-member mean. sign_agreement makes this easy. Consider this synthetic example:

[33]:
x = np.linspace(-np.pi, np.pi, 25)
y = np.linspace(-np.pi, np.pi, 25)
xx, yy = np.meshgrid(x,y)

data1 = np.sin(xx)
data2 = np.sin(yy)
data3 = np.ones_like(xx)

np.fill_diagonal(data3,np.nan)
np.fill_diagonal(data3[1:],np.nan)
np.fill_diagonal(data3[:,1:],np.nan)
np.fill_diagonal(data3[:,2:],np.nan)

da = xr.DataArray(np.array([data1, data2, data3]), dims=['member','x', 'y'])

da.plot(col='member')
[33]:
<xarray.plot.facetgrid.FacetGrid at 0x7f18a651b4f0>
_images/utils_22_1.png

Taking the mean of these fields, suggests that values increase in the upper-left, upper-right, lower-right quadrant, and the missing values in the third layer distort the mean.

[34]:
da.mean('member').plot()
[34]:
<matplotlib.collections.QuadMesh at 0x7f18a6545520>
_images/utils_24_1.png

Lets produce a mask to see where all elements along the member dimension have the same sign:

[35]:
from xarrayutils.utils import sign_agreement

sign_agreement(da, da.mean('member'), 'member', threshold=1.0).plot()
[35]:
<matplotlib.collections.QuadMesh at 0x7f18a6172940>
_images/utils_26_1.png

You could use this information to indicate the areas of the average, where the members do not agree by hatching:

[36]:
da.mean('member').plot()
sign_agreement(
    da, da.mean('member'), 'member'
).plot.contourf(
    colors='none',
    hatches=['..', None],
    levels=[0,0.5],
    add_colorbar=False
)
[36]:
<matplotlib.contour.QuadContourSet at 0x7f18a66cb2b0>
_images/utils_28_1.png

Masking values in the mixed layer

Sometimes it is helpful to analyze data by excluding the values in the mixed layer. This can be easily done with mask_mixedlayer. Let’s see how:

First load a CMIP6 dataset from the cloud

[37]:
import intake
url = "https://raw.githubusercontent.com/NCAR/intake-esm-datastore/master/catalogs/pangeo-cmip6.json"
col = intake.open_esm_datastore(url)
cat = col.search(
    table_id='Omon',
    grid_label='gn',
    experiment_id='historical',
    member_id='r1i1p1f1',
    variable_id=['thetao','mlotst'],#,
    source_id=["ACCESS-ESM1-5"]
)
ddict = cat.to_dataset_dict(
    zarr_kwargs={'consolidated':True, 'decode_times':True},
)

--> The keys in the returned dictionary of datasets are constructed as follows:
        'activity_id.institution_id.source_id.experiment_id.table_id.grid_label'
100.00% [1/1 00:00<00:00]
[38]:
ds = ddict['CMIP.CSIRO.ACCESS-ESM1-5.historical.Omon.gn']
ds
[38]:
<xarray.Dataset>
Dimensions:             (bnds: 2, i: 360, j: 300, lev: 50, member_id: 1, time: 1980, vertices: 4)
Coordinates:
  * i                   (i) int32 0 1 2 3 4 5 6 ... 353 354 355 356 357 358 359
  * j                   (j) int32 0 1 2 3 4 5 6 ... 293 294 295 296 297 298 299
    latitude            (j, i) float64 dask.array<chunksize=(300, 360), meta=np.ndarray>
    longitude           (j, i) float64 dask.array<chunksize=(300, 360), meta=np.ndarray>
  * time                (time) datetime64[ns] 1850-01-16T12:00:00 ... 2014-12...
    time_bnds           (time, bnds) datetime64[ns] dask.array<chunksize=(1980, 2), meta=np.ndarray>
  * member_id           (member_id) <U8 'r1i1p1f1'
  * lev                 (lev) float64 5.0 15.0 25.0 ... 5.499e+03 5.831e+03
    lev_bnds            (lev, bnds) float64 dask.array<chunksize=(50, 2), meta=np.ndarray>
Dimensions without coordinates: bnds, vertices
Data variables:
    mlotst              (member_id, time, j, i) float32 dask.array<chunksize=(1, 196, 300, 360), meta=np.ndarray>
    vertices_latitude   (j, i, vertices) float64 dask.array<chunksize=(300, 360, 4), meta=np.ndarray>
    vertices_longitude  (j, i, vertices) float64 dask.array<chunksize=(300, 360, 4), meta=np.ndarray>
    thetao              (member_id, time, lev, j, i) float32 dask.array<chunksize=(1, 5, 50, 300, 360), meta=np.ndarray>
Attributes:
    grid:                    native atmosphere N96 grid (145x192 latxlon)
    forcing_index:           1
    table_id:                Omon
    variant_label:           r1i1p1f1
    source:                  ACCESS-ESM1.5 (2019): \naerosol: CLASSIC (v1.0)\...
    tracking_id:             hdl:21.14100/4ca9ce2d-6374-4e22-8ad3-1a5a6dfb8fd...
    initialization_index:    1
    physics_index:           1
    run_variant:             forcing: GHG, Oz, SA, Sl, Vl, BC, OC, (GHG = CO2...
    branch_time_in_parent:   21915.0
    history:                 2019-11-15T15:38:06Z ; CMOR rewrote data to be c...
    cmor_version:            3.4.0
    grid_label:              gn
    nominal_resolution:      250 km
    intake_esm_varname:      mlotst\nthetao
    experiment_id:           historical
    realization_index:       1
    parent_activity_id:      CMIP
    sub_experiment_id:       none
    source_id:               ACCESS-ESM1-5
    branch_time_in_child:    0.0
    parent_experiment_id:    piControl
    activity_id:             CMIP
    data_specs_version:      01.00.30
    realm:                   ocean
    product:                 model-output
    Conventions:             CF-1.7 CMIP-6.2
    experiment:              all-forcing simulation of the recent past
    mip_era:                 CMIP6
    institution:             Commonwealth Scientific and Industrial Research ...
    branch_method:           standard
    frequency:               mon
    institution_id:          CSIRO
    title:                   ACCESS-ESM1-5 output prepared for CMIP6
    table_info:              Creation Date:(30 April 2019) MD5:e14f55f257ccea...
    parent_variant_label:    r1i1p1f1
    parent_mip_era:          CMIP6
    parent_time_units:       days since 0101-1-1
    further_info_url:        https://furtherinfo.es-doc.org/CMIP6.CSIRO.ACCES...
    sub_experiment:          none
    source_type:             AOGCM
    version:                 v20191115
    parent_source_id:        ACCESS-ESM1-5
    license:                 CMIP6 model data produced by CSIRO is licensed u...
    intake_esm_dataset_key:  CMIP.CSIRO.ACCESS-ESM1-5.historical.Omon.gn

We can remove the values in the mixed layer

[39]:
from xarrayutils.utils import mask_mixedlayer
ds_wo_ml = mask_mixedlayer(ds, ds.mlotst)
ds_wo_ml
/home/jovyan/xarrayutils/xarrayutils/utils.py:805: UserWarning: Cell bounds [{z_bounds}] not found in input. Masking is performed with cell centers, which might be less accurate
  warnings.warn(
[39]:
<xarray.Dataset>
Dimensions:             (bnds: 2, i: 360, j: 300, lev: 50, member_id: 1, time: 1980, vertices: 4)
Coordinates:
  * i                   (i) int32 0 1 2 3 4 5 6 ... 353 354 355 356 357 358 359
  * j                   (j) int32 0 1 2 3 4 5 6 ... 293 294 295 296 297 298 299
    latitude            (j, i) float64 dask.array<chunksize=(300, 360), meta=np.ndarray>
    longitude           (j, i) float64 dask.array<chunksize=(300, 360), meta=np.ndarray>
  * time                (time) datetime64[ns] 1850-01-16T12:00:00 ... 2014-12...
    time_bnds           (time, bnds) datetime64[ns] dask.array<chunksize=(1980, 2), meta=np.ndarray>
  * member_id           (member_id) <U8 'r1i1p1f1'
  * lev                 (lev) float64 5.0 15.0 25.0 ... 5.499e+03 5.831e+03
    lev_bnds            (lev, bnds) float64 dask.array<chunksize=(50, 2), meta=np.ndarray>
Dimensions without coordinates: bnds, vertices
Data variables:
    mlotst              (member_id, time, j, i, lev) float32 dask.array<chunksize=(1, 196, 300, 360, 50), meta=np.ndarray>
    vertices_latitude   (j, i, vertices, lev, member_id, time) float64 dask.array<chunksize=(300, 360, 4, 50, 1, 196), meta=np.ndarray>
    vertices_longitude  (j, i, vertices, lev, member_id, time) float64 dask.array<chunksize=(300, 360, 4, 50, 1, 196), meta=np.ndarray>
    thetao              (member_id, time, lev, j, i) float32 dask.array<chunksize=(1, 5, 50, 300, 360), meta=np.ndarray>
Attributes:
    grid:                                 native atmosphere N96 grid (145x192...
    forcing_index:                        1
    table_id:                             Omon
    variant_label:                        r1i1p1f1
    source:                               ACCESS-ESM1.5 (2019): \naerosol: CL...
    tracking_id:                          hdl:21.14100/4ca9ce2d-6374-4e22-8ad...
    initialization_index:                 1
    physics_index:                        1
    run_variant:                          forcing: GHG, Oz, SA, Sl, Vl, BC, O...
    branch_time_in_parent:                21915.0
    history:                              2019-11-15T15:38:06Z ; CMOR rewrote...
    cmor_version:                         3.4.0
    grid_label:                           gn
    nominal_resolution:                   250 km
    intake_esm_varname:                   mlotst\nthetao
    experiment_id:                        historical
    realization_index:                    1
    parent_activity_id:                   CMIP
    sub_experiment_id:                    none
    source_id:                            ACCESS-ESM1-5
    branch_time_in_child:                 0.0
    parent_experiment_id:                 piControl
    activity_id:                          CMIP
    data_specs_version:                   01.00.30
    realm:                                ocean
    product:                              model-output
    Conventions:                          CF-1.7 CMIP-6.2
    experiment:                           all-forcing simulation of the recen...
    mip_era:                              CMIP6
    institution:                          Commonwealth Scientific and Industr...
    branch_method:                        standard
    frequency:                            mon
    institution_id:                       CSIRO
    title:                                ACCESS-ESM1-5 output prepared for C...
    table_info:                           Creation Date:(30 April 2019) MD5:e...
    parent_variant_label:                 r1i1p1f1
    parent_mip_era:                       CMIP6
    parent_time_units:                    days since 0101-1-1
    further_info_url:                     https://furtherinfo.es-doc.org/CMIP...
    sub_experiment:                       none
    source_type:                          AOGCM
    version:                              v20191115
    parent_source_id:                     ACCESS-ESM1-5
    license:                              CMIP6 model data produced by CSIRO ...
    intake_esm_dataset_key:               CMIP.CSIRO.ACCESS-ESM1-5.historical...
    mixed_layer_values_removed_based_on:  lev

Or to have the mixed layer values only

[40]:
from xarrayutils.utils import mask_mixedlayer
ds_ml_only = mask_mixedlayer(ds, ds.mlotst, mask='inside')
ds_ml_only
/home/jovyan/xarrayutils/xarrayutils/utils.py:805: UserWarning: Cell bounds [{z_bounds}] not found in input. Masking is performed with cell centers, which might be less accurate
  warnings.warn(
[40]:
<xarray.Dataset>
Dimensions:             (bnds: 2, i: 360, j: 300, lev: 50, member_id: 1, time: 1980, vertices: 4)
Coordinates:
  * i                   (i) int32 0 1 2 3 4 5 6 ... 353 354 355 356 357 358 359
  * j                   (j) int32 0 1 2 3 4 5 6 ... 293 294 295 296 297 298 299
    latitude            (j, i) float64 dask.array<chunksize=(300, 360), meta=np.ndarray>
    longitude           (j, i) float64 dask.array<chunksize=(300, 360), meta=np.ndarray>
  * time                (time) datetime64[ns] 1850-01-16T12:00:00 ... 2014-12...
    time_bnds           (time, bnds) datetime64[ns] dask.array<chunksize=(1980, 2), meta=np.ndarray>
  * member_id           (member_id) <U8 'r1i1p1f1'
  * lev                 (lev) float64 5.0 15.0 25.0 ... 5.499e+03 5.831e+03
    lev_bnds            (lev, bnds) float64 dask.array<chunksize=(50, 2), meta=np.ndarray>
Dimensions without coordinates: bnds, vertices
Data variables:
    mlotst              (member_id, time, j, i, lev) float32 dask.array<chunksize=(1, 196, 300, 360, 50), meta=np.ndarray>
    vertices_latitude   (j, i, vertices, lev, member_id, time) float64 dask.array<chunksize=(300, 360, 4, 50, 1, 196), meta=np.ndarray>
    vertices_longitude  (j, i, vertices, lev, member_id, time) float64 dask.array<chunksize=(300, 360, 4, 50, 1, 196), meta=np.ndarray>
    thetao              (member_id, time, lev, j, i) float32 dask.array<chunksize=(1, 5, 50, 300, 360), meta=np.ndarray>
Attributes:
    grid:                                 native atmosphere N96 grid (145x192...
    forcing_index:                        1
    table_id:                             Omon
    variant_label:                        r1i1p1f1
    source:                               ACCESS-ESM1.5 (2019): \naerosol: CL...
    tracking_id:                          hdl:21.14100/4ca9ce2d-6374-4e22-8ad...
    initialization_index:                 1
    physics_index:                        1
    run_variant:                          forcing: GHG, Oz, SA, Sl, Vl, BC, O...
    branch_time_in_parent:                21915.0
    history:                              2019-11-15T15:38:06Z ; CMOR rewrote...
    cmor_version:                         3.4.0
    grid_label:                           gn
    nominal_resolution:                   250 km
    intake_esm_varname:                   mlotst\nthetao
    experiment_id:                        historical
    realization_index:                    1
    parent_activity_id:                   CMIP
    sub_experiment_id:                    none
    source_id:                            ACCESS-ESM1-5
    branch_time_in_child:                 0.0
    parent_experiment_id:                 piControl
    activity_id:                          CMIP
    data_specs_version:                   01.00.30
    realm:                                ocean
    product:                              model-output
    Conventions:                          CF-1.7 CMIP-6.2
    experiment:                           all-forcing simulation of the recen...
    mip_era:                              CMIP6
    institution:                          Commonwealth Scientific and Industr...
    branch_method:                        standard
    frequency:                            mon
    institution_id:                       CSIRO
    title:                                ACCESS-ESM1-5 output prepared for C...
    table_info:                           Creation Date:(30 April 2019) MD5:e...
    parent_variant_label:                 r1i1p1f1
    parent_mip_era:                       CMIP6
    parent_time_units:                    days since 0101-1-1
    further_info_url:                     https://furtherinfo.es-doc.org/CMIP...
    sub_experiment:                       none
    source_type:                          AOGCM
    version:                              v20191115
    parent_source_id:                     ACCESS-ESM1-5
    license:                              CMIP6 model data produced by CSIRO ...
    intake_esm_dataset_key:               CMIP.CSIRO.ACCESS-ESM1-5.historical...
    mixed_layer_values_removed_based_on:  lev
[41]:
import matplotlib.pyplot as plt
roi = dict(i=150, time=0, j=slice(50, 200))
plt.figure(figsize=[10, 15])
for di, (data, label) in enumerate(zip([ds, ds_wo_ml, ds_ml_only], ['full data', 'mixed layer removed', 'mixed layer only'])):
    plt.subplot(3,1,di+1)
    data.thetao.isel(lev=slice(0,20),**roi).plot(yincrease=False)
    ds.mlotst.isel(**roi).plot(x='j')
    plt.title(label)
_images/utils_36_0.png

In this case the cell bounds are not available for the model output, but mask_mixedlayer has the option to take those into account and e.g. remove cells only if the lower boundary is within the mixed layer. All you need to do is correctly specify z_bounds with the variable/coordinate name of the cell bounds.

Removing bottom values

For similar reasons you might sometimes want to mask out the values closest to the ocean bottom. remove_bottom_values will mask out the bottom-most grid cell at every location.

[42]:
from xarrayutils.utils import remove_bottom_values
ds_bot_removed = remove_bottom_values(ds)
ds_bot_removed
[42]:
<xarray.Dataset>
Dimensions:             (bnds: 2, i: 360, j: 300, lev: 50, member_id: 1, time: 1980, vertices: 4)
Coordinates:
  * i                   (i) int32 0 1 2 3 4 5 6 ... 353 354 355 356 357 358 359
  * j                   (j) int32 0 1 2 3 4 5 6 ... 293 294 295 296 297 298 299
    latitude            (j, i) float64 dask.array<chunksize=(300, 360), meta=np.ndarray>
    longitude           (j, i) float64 dask.array<chunksize=(300, 360), meta=np.ndarray>
  * time                (time) datetime64[ns] 1850-01-16T12:00:00 ... 2014-12...
    time_bnds           (time, bnds) datetime64[ns] dask.array<chunksize=(1980, 2), meta=np.ndarray>
  * member_id           (member_id) <U8 'r1i1p1f1'
  * lev                 (lev) float64 5.0 15.0 25.0 ... 5.499e+03 5.831e+03
    lev_bnds            (lev, bnds) float64 dask.array<chunksize=(50, 2), meta=np.ndarray>
Dimensions without coordinates: bnds, vertices
Data variables:
    mlotst              (member_id, time, j, i, lev) float32 dask.array<chunksize=(1, 196, 300, 360, 50), meta=np.ndarray>
    vertices_latitude   (j, i, vertices, lev) float64 dask.array<chunksize=(300, 360, 4, 50), meta=np.ndarray>
    vertices_longitude  (j, i, vertices, lev) float64 dask.array<chunksize=(300, 360, 4, 50), meta=np.ndarray>
    thetao              (member_id, time, lev, j, i) float32 dask.array<chunksize=(1, 5, 50, 300, 360), meta=np.ndarray>
Attributes:
    grid:                    native atmosphere N96 grid (145x192 latxlon)
    forcing_index:           1
    table_id:                Omon
    variant_label:           r1i1p1f1
    source:                  ACCESS-ESM1.5 (2019): \naerosol: CLASSIC (v1.0)\...
    tracking_id:             hdl:21.14100/4ca9ce2d-6374-4e22-8ad3-1a5a6dfb8fd...
    initialization_index:    1
    physics_index:           1
    run_variant:             forcing: GHG, Oz, SA, Sl, Vl, BC, OC, (GHG = CO2...
    branch_time_in_parent:   21915.0
    history:                 2019-11-15T15:38:06Z ; CMOR rewrote data to be c...
    cmor_version:            3.4.0
    grid_label:              gn
    nominal_resolution:      250 km
    intake_esm_varname:      mlotst\nthetao
    experiment_id:           historical
    realization_index:       1
    parent_activity_id:      CMIP
    sub_experiment_id:       none
    source_id:               ACCESS-ESM1-5
    branch_time_in_child:    0.0
    parent_experiment_id:    piControl
    activity_id:             CMIP
    data_specs_version:      01.00.30
    realm:                   ocean
    product:                 model-output
    Conventions:             CF-1.7 CMIP-6.2
    experiment:              all-forcing simulation of the recent past
    mip_era:                 CMIP6
    institution:             Commonwealth Scientific and Industrial Research ...
    branch_method:           standard
    frequency:               mon
    institution_id:          CSIRO
    title:                   ACCESS-ESM1-5 output prepared for CMIP6
    table_info:              Creation Date:(30 April 2019) MD5:e14f55f257ccea...
    parent_variant_label:    r1i1p1f1
    parent_mip_era:          CMIP6
    parent_time_units:       days since 0101-1-1
    further_info_url:        https://furtherinfo.es-doc.org/CMIP6.CSIRO.ACCES...
    sub_experiment:          none
    source_type:             AOGCM
    version:                 v20191115
    parent_source_id:        ACCESS-ESM1-5
    license:                 CMIP6 model data produced by CSIRO is licensed u...
    intake_esm_dataset_key:  CMIP.CSIRO.ACCESS-ESM1-5.historical.Omon.gn
[43]:
roi = dict(i=150, time=0, j=slice(50, 100))
plt.figure(figsize=[10, 15])
for di, (data, label) in enumerate(zip([ds, ds_bot_removed], ['full data', 'bottom layer removed'])):
    plt.subplot(2,1,di+1)
    data.thetao.isel(lev=slice(38, -2),**roi).plot(yincrease=False, robust=True)
    plt.title(label)
    plt.grid()
_images/utils_40_0.png
[ ]: