
Thesis Outline

---

Chapter I - Introduction


1.1 Earth Science in the Wild

1.1.1 Problems

  • Earth Observation
  • Climate
  • Extreme Events - Drought
  • Ocean

1.1.2 Data

  • Remote Sensing
  • Physical Models
  • Generalized Models
  • Emulation

1.1.3 Drawbacks

  • What/Which? - best model?
  • How? - how does the model work?
  • Why? - get knowledge

1.2 - Fundamental ML Problems

1.2.1 - Representations

  • Discriminative
  • Generative
  • Information

1.2.2 - Interpretation

  • Affect (Sensitivity)
  • Noise
  • Uncertainty

1.2.3 - Understanding

  • Correlation
  • Dependence
  • Causation

1.3 - This Thesis

1.3.1 - Investigation

  • Research Question: Can we push forward the notion of uncertainty in EO applications?
  • Discriminative Approach - GPs
    • Sensitivity
    • Uncertainty
  • Generative Approach - RBIG
    • IT measures
  • Intermediate Steps: Investigations that needed to happen
  • Uncertain GPs
  • HSIC Parameter Estimation
  • Applications:
  • Earth Science Data Cubes
  • Ocean Data
  • Drought Indices
  • Climate Models

1.3.2 - Outreach

  • Reproducibility
  • Blog Posts
  • PyCon (?)

1.3.3 - My Contributions

  • Code

    Wherever possible, I include links to model zoos, along with tutorials on how to do certain things from scratch.

  • Toy Examples

    I will include many toy examples to highlight important concepts.

  • Blogs

    The blog posts highlight things that I think the ML community should know.

  • Sleeper Theorems

    As with many papers, reading this thesis requires knowing a number of background results, and it is not always possible to put them in the appendix. I will put these in boxes.

  • Supplementary Material

    I am a strong advocate of telling the story without going through unnecessary mathematics, so I move all necessary derivations to the appendix, along with the notation and supplementary material.

1.3.4 - Organization

  • Part I - Data Representation Approach
  • Part II - Discriminative Modeling Approach
  • Part III - Generative Modeling Approach

---

Chapter II - Data Representation


2.1 Kernel Methods

  • Theory
  • Regression and Classification (KRR, GPR, SVM)
  • Dimensionality Reduction (KPCA, KECA, DRR)
  • Density (KDE, Distribution Regression)
  • Information Theory (Renyi Stuff)
  • Similarity Measures - Covariance vs. Correlation (HSIC, KA)

Sleeper Theorems

  • Mercer's Theorem
  • HSIC ≡ MMD

2.2 Density Estimation

  • Parametric
  • Gaussian
  • Mixture of Gaussians
  • Classical
  • Binning (Histogram)
  • Kernel - Smooth
  • kNN - Adaptive
  • Neural Density Estimation
  • Normalizing Flows
  • Density Destructor
  • Gaussianization
  • Conditional Density Estimation

2.3 Neural Networks

  • Discriminative Neural Networks
  • Probabilistic Neural Networks
  • Fully Bayesian Neural Networks

2.4 Other

  • Generalized Linear Models
  • Ensemble Methods

2.5 All Connected

  • Kernel Entropy Components
  • Conditional

2.6 Modeling Approaches

2.6.1 Discriminative Models

2.6.2 Generative Models

2.6.3 Information Theory

  • Signal
  • ...
  • Approximate
  • Measures (I, H, MI, TC)
  • Change of Variables

---

Chapter III - Discriminative Model


3.1 Regression

3.1.1 Kernel Ridge Regression

3.1.2 Gaussian Process Regression


3.2 Sensitivity

3.2.1 Concept

3.2.2 Derivative


3.3 Uncertainty


3.4 Applications

  • ESDC
  • Sampling
  • Principal Curves
  • ESDC - HSIC Sensitivity

---

Chapter IV - Generative Model


4.1 Outline


4.2 Probability

  • Concepts
  • Bayesian Formulation
  • Change of Variables

Sleeper Theorems

  • Variational Inference
  • Jensen's Inequality
  • Change of Variables

4.3 Generative Models

  • Density Destructors
  • Normalizing Flows

Sleeper Theorems

  • MSE vs LL vs KLD

4.4 Information Theory

4.4.1 IT Measures

  • Signal
  • ...
  • Approximate
  • Measures (I, H, MI, TC)

4.4.2 Estimators

  • Gaussian
  • KNN/KDP
  • RBIG
  • HSIC/KA

4.5 - Applications

4.5.1 Spatial-Temporal Analysis

  • EGU18 (C)
  • CI19 (W)
  • RBIG 4 RS (J)

4.5.2 Droughts

  • CI19 (W)
  • RBIG 4 RS (J)

4.5.3 Climate Models

  • AGU (C)
  • RBIG 4 RS (J)

4.6 - Lab Notebooks

  • Change of Variables Proof of Concept
  • RBIG - Step-by-Step

Kernel Derivatives (Regression)

  • Regression: KRR, GPR
  • Classification: SVM
  • Feature Selection: O/KECA
  • Dependence: HSIC, rHSIC
  • Kernel Model Zoo

Application

  • Regression: Sampling
  • Regression: Sensitivity Analysis
  • Input Uncertainty

Appendix

  • Kernel Methods
  • Theorem
  • Sensitivity

Input Uncertainty

  • Extended Literature Review
  • Connection with Kalman Filters
  • GP Model Zoo (GPy, TF, GPyTorch)

Context

  • Uncertainty in the Literature

Application

  • IASI
  • Ocean bbP
  • Alvaro Data

Appendix

  • Taylor Series
  • Moment Matching
  • Variational Inference

IT Measures

Application

  • Earth Science Data Cubes (spatial-temporal)
  • Drought Indices
  • Climate Models

Dependence

  • Covariance vs. Correlation
  • HSIC and Kernel Alignment
  • Mutual Information


Appendices

Mathematical Preliminaries

  • Linear Algebra
  • Probability Theory