Skip to content

Comparing Two Climate Models

In this notebook, I will be comparing two climate reanalysis models:

  • NCEP-DOE Reanalysis 2: Surface
  • ERA5

I will be looking at the following variables:

  • Surface Pressure
  • Mean Sea Level Pressure
  • Total Column Water

The idea is simple: these two models should have very similar properties. I will be trying to user RBIG in order to assess how similar these models are. I'll be looking at the following IT measures

  • Entropy
  • Total Correlation
  • Mutual Information

If these climate models are that similar, then they should exhibit similar IT measures.

Data - Climate Models

import os, sys
cwd = os.getcwd()
source_path = f"{cwd}/../../../"
sys.path.insert(0, f'{source_path}')

import numpy as np


# Data Loaders
from src.data.climate.amip import DataDownloader
from src.data.climate.amip import DataLoader

# ESDC tools
sys.path.insert(0, f'/home/emmanuel/code/py_esdc')
from esdc.standardize import normalize_temporal
from esdc.transform import regrid_data

import pandas as pd
import xarray as xr
from tqdm import tqdm
from sklearn import preprocessing

import seaborn as sns
import matplotlib.pyplot as plt
plt.style.use('ggplot')
%matplotlib inline

%load_ext autoreload
%autoreload 2
The autoreload extension is already loaded. To reload it, use:
  %reload_ext autoreload
amip_data_path = f"/home/emmanuel/projects/2020_rbig_rs/data/climate/raw/amip/"
era5_path = f"/home/emmanuel/projects/2020_rbig_rs/data/climate/raw/era5/"
ncep_path = f"/home/emmanuel/projects/2020_rbig_rs/data/climate/raw/ncep/"
results_path = f"/home/emmanuel/projects/2020_rbig_rs/data/climate/results/"
fig_path = f"/home/emmanuel/projects/2020_rbig_rs/reports/figures/climate/"

Dataset - GISS

loader = DataLoader()

dataset = 'ipsl_cm5b_lr'

cmip5_data = loader.load_amip_data(dataset)

cmip5_data
<xarray.Dataset>
Dimensions:    (bnds: 2, lat: 96, lon: 96, time: 360)
Coordinates:
  * time       (time) object 1979-01-16 12:00:00 ... 2008-12-16 12:00:00
  * lat        (lat) float64 -90.0 -88.11 -86.21 -84.32 ... 86.21 88.11 90.0
  * lon        (lon) float64 0.0 3.75 7.5 11.25 15.0 ... 345.0 348.8 352.5 356.2
Dimensions without coordinates: bnds
Data variables:
    time_bnds  (time, bnds) object dask.array<chunksize=(360, 2), meta=np.ndarray>
    lat_bnds   (lat, bnds) float64 dask.array<chunksize=(96, 2), meta=np.ndarray>
    lon_bnds   (lon, bnds) float64 dask.array<chunksize=(96, 2), meta=np.ndarray>
    psl        (time, lat, lon) float32 dask.array<chunksize=(360, 96, 96), meta=np.ndarray>
Attributes:
    institution:            IPSL (Institut Pierre Simon Laplace, Paris, France)
    institute_id:           IPSL
    experiment_id:          amip
    source:                 IPSL-CM5B-LR (2011) : atmos : LMDZ5 (LMDZ5_NPv3.1...
    model_id:               IPSL-CM5B-LR
    forcing:                Nat,Ant,GHG,SA,Oz,LU,SS,Ds,BC,MD,OC,AA
    parent_experiment_id:   N/A
    parent_experiment_rip:  N/A
    branch_time:            0.0
    contact:                ipsl-cmip5 _at_ ipsl.jussieu.fr Data manager : Se...
    comment:                This atmospheric only simulation include natural ...
    references:             Model documentation and further reference availab...
    initialization_method:  1
    physics_version:        1
    tracking_id:            b06de09a-a87a-4f4d-bb64-63ad47eab808
    product:                output
    experiment:             AMIP
    frequency:              mon
    creation_date:          2012-06-11T18:49:23Z
    history:                2012-06-11T18:49:23Z CMOR rewrote data to comply ...
    Conventions:            CF-1.4
    project_id:             CMIP5
    table_id:               Table Amon (31 January 2011) 53b766a395ac41696af4...
    title:                  IPSL-CM5B-LR model output prepared for CMIP5 AMIP
    parent_experiment:      N/A
    modeling_realm:         atmos
    realization:            1
    cmor_version:           2.7.1

Test I - AMIP vs. ERA5

ERA5

def get_era5():


    era5_data = xr.open_mfdataset(f"{era5_path}*.nc", combine="by_coords")
    era5_data = era5_data.rename({'msl': 'mslp', 'latitude': 'lat', 'longitude': 'lon'})
    era5_data.attrs['model_id'] = 'era5'
    era5_data = era5_data.rename({'mslp': 'psl'})
    return era5_data
era5_data = get_era5()
<xarray.Dataset>
Dimensions:  (lat: 721, lon: 1440, time: 487)
Coordinates:
  * lon      (lon) float32 0.0 0.25 0.5 0.75 1.0 ... 359.0 359.25 359.5 359.75
  * lat      (lat) float32 90.0 89.75 89.5 89.25 ... -89.25 -89.5 -89.75 -90.0
  * time     (time) datetime64[ns] 1979-01-01 1979-02-01 ... 2019-07-01
Data variables:
    psl      (time, lat, lon) float32 dask.array<chunksize=(487, 721, 1440), meta=np.ndarray>
    sp       (time, lat, lon) float32 dask.array<chunksize=(487, 721, 1440), meta=np.ndarray>
Attributes:
    Conventions:  CF-1.6
    history:      2019-10-07 09:20:10 GMT by grib_to_netcdf-2.10.0: /opt/ecmw...
    model_id:     era5

ReGrid

cmip5_coords = len(cmip5_data.lat) + len(cmip5_data.lon)
era5_coords = len(era5_data.lat) + len(era5_data.lon)

if cmip5_coords >= era5_coords:
    cmip5_data = regrid_data(era5_data, cmip5_data)
else:
    era5_data = regrid_data(cmip5_data, era5_data)
Create weight file: nearest_s2d_721x1440_96x96.nc
Remove file nearest_s2d_721x1440_96x96.nc
era5_data
<xarray.Dataset>
Dimensions:  (lat: 96, lon: 96, time: 487)
Coordinates:
  * time     (time) datetime64[ns] 1979-01-01 1979-02-01 ... 2019-07-01
  * lon      (lon) float64 0.0 3.75 7.5 11.25 15.0 ... 345.0 348.8 352.5 356.2
  * lat      (lat) float64 -90.0 -88.11 -86.21 -84.32 ... 84.32 86.21 88.11 90.0
Data variables:
    psl      (time, lat, lon) float64 dask.array<chunksize=(487, 96, 96), meta=np.ndarray>
    sp       (time, lat, lon) float64 dask.array<chunksize=(487, 96, 96), meta=np.ndarray>

Test II - NCEP Data

ncep_data = xr.open_mfdataset(f"{ncep_path}*mon.mean.nc", combine="by_coords")
ncep_data = ncep_data.rename({'mslp': 'psl'})
ncep_data.attrs['model_id'] = 'ncar_ncep_doe_2'
ncep_data
<xarray.Dataset>
Dimensions:    (lat: 73, lon: 144, nbnds: 2, time: 489)
Coordinates:
  * lat        (lat) float32 90.0 87.5 85.0 82.5 ... -82.5 -85.0 -87.5 -90.0
  * lon        (lon) float32 0.0 2.5 5.0 7.5 10.0 ... 350.0 352.5 355.0 357.5
  * time       (time) datetime64[ns] 1979-01-01 1979-02-01 ... 2019-09-01
Dimensions without coordinates: nbnds
Data variables:
    psl        (time, lat, lon) float32 dask.array<chunksize=(489, 73, 144), meta=np.ndarray>
    time_bnds  (time, nbnds) datetime64[ns] dask.array<chunksize=(489, 2), meta=np.ndarray>
    pr_wtr     (time, lat, lon) float32 dask.array<chunksize=(489, 73, 144), meta=np.ndarray>
    pres       (time, lat, lon) float32 dask.array<chunksize=(489, 73, 144), meta=np.ndarray>
Attributes:
    Conventions:    CF-1.0
    title:          Monthly NCEP/DOE Reanalysis 2
    history:        created 2002/03 by Hoop (netCDF2.3)
    comments:       Data is from \nNCEP/DOE AMIP-II Reanalysis (Reanalysis-2)...
    platform:       Model
    source:         NCEP/DOE AMIP-II Reanalysis (Reanalysis-2) Model
    institution:    National Centers for Environmental Prediction
    dataset_title:  NCEP-DOE AMIP-II Reanalysis
    References:     https://www.esrl.noaa.gov/psd/data/gridded/data.ncep.rean...
    source_url:     http://www.cpc.ncep.noaa.gov/products/wesley/reanalysis2/
    model_id:       ncar_ncep_doe_2

ReGrid

cmip5_coords = len(cmip5_data.lat) + len(cmip5_data.lon)
ncep_coords = len(ncep_data.lat) + len(ncep_data.lon)

if cmip5_coords >= era5_coords:
    cmip5_data = regrid_data(ncep_data, cmip5_data)
else:
    ncep_data = regrid_data(cmip5_data, ncep_data)
Create weight file: nearest_s2d_73x144_96x96.nc
Remove file nearest_s2d_73x144_96x96.nc
ncep_data
<xarray.Dataset>
Dimensions:  (lat: 96, lon: 96, time: 489)
Coordinates:
  * time     (time) datetime64[ns] 1979-01-01 1979-02-01 ... 2019-09-01
  * lon      (lon) float64 0.0 3.75 7.5 11.25 15.0 ... 345.0 348.8 352.5 356.2
  * lat      (lat) float64 -90.0 -88.11 -86.21 -84.32 ... 84.32 86.21 88.11 90.0
Data variables:
    psl      (time, lat, lon) float64 dask.array<chunksize=(489, 96, 96), meta=np.ndarray>
    pr_wtr   (time, lat, lon) float64 dask.array<chunksize=(489, 96, 96), meta=np.ndarray>
    pres     (time, lat, lon) float64 dask.array<chunksize=(489, 96, 96), meta=np.ndarray>