
Multi-Output Models

In regression, this is when we learn a single function f, or several functions, to predict multiple outputs y.


Concretely, we have a regression problem where we are trying to predict y\in \mathbb{R}^{N\times P}, where N is the number of samples and P is the number of outputs. The question is then: how do we model all of these outputs jointly?

There are some questions:

  • Are the outputs correlated or independent?
  • Is there missing data?
  • Do we actually need to model all of them jointly?
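The first two questions can often be checked directly on the target matrix before choosing a model. The sketch below is only an illustration on synthetic data: the array `Y` (shape N × P, with NaN marking missing entries) and every variable name are assumptions made for the example, not part of any particular library.

```python
import numpy as np

rng = np.random.default_rng(0)
N, P = 500, 3  # samples, outputs

# Synthetic targets: outputs 0 and 1 share a latent signal, output 2 is independent.
latent = rng.normal(size=N)
Y = np.column_stack([
    latent + 0.1 * rng.normal(size=N),
    -latent + 0.1 * rng.normal(size=N),
    rng.normal(size=N),
])
Y[rng.random(N) < 0.2, 2] = np.nan  # hide roughly 20% of output 2

# Missing data: fraction of missing entries per output.
missing_frac = np.isnan(Y).mean(axis=0)

# Correlations: use only the rows where every output is observed.
complete = ~np.isnan(Y).any(axis=1)
corr = np.corrcoef(Y[complete], rowvar=False)  # shape (P, P)

print("missing fraction per output:", missing_frac)
print("output correlation matrix:\n", corr.round(2))
```

Strong off-diagonal entries in `corr` suggest that a joint model may be worth its extra cost; near-zero entries suggest that independent single-output models could be enough.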


Firstly, there is some confusion about the terminology. I've heard the following names:


Multi-task: when we have different outputs, possibly with different learning objectives and even different data types, e.g. one output is a regression task, one is a classification problem, one is a meta-learning task, etc. (a.k.a. the different tasks). The easiest parallel I can think of is self-driving cars: there is one overall objective, to drive without crashing, but reaching it requires solving many different tasks.


Multi-output: typically when we have one task but more than one output, e.g. multiple outputs for regression or multiple outputs for classification. A concrete example is predicting temperature and wind speed from the same inputs.


Multi-fidelity: the same as multi-output, but geared toward situations where we know one of the outputs is of lower quality, e.g. a regression problem where one of the targets has a lower resolution or is missing some data.

These definitions come from a discussion I had with the GPflow community. I have yet to see papers that use these terms consistently. I have split the sections below by name but, as the definitions suggest, there is a lot of overlap.

🆕 New!

You'll find these below but I wanted to highlight them because they're pretty cool.

Scalable Exact Inference in Multi-Output Gaussian Processes - Bruinsma et al (2020)

They show a nice trick where you learn an invertible projection on your output space to reduce the crazy number of outputs.

📜 Paper

💻📝 Code | Julia

📺 ICML Prezi

A Framework for Interdomain and Multioutput Gaussian Processes - Van der Wilk et al. (2020)

A full framework in GPflow that allows maximum flexibility when working with multi-output GPs.

📜 Paper

💻📝 Demo Notebook

📺 Lectures

GPSS Summer School - Alvarez (2017)

📺 Video

📋 Slides

📜 Literature


Multi-task problems tend to arise when we have a multi-output problem but do not necessarily observe all of the outputs. We also assume that the output dimensions are correlated. The intrinsic model of coregionalization (ICM) from (Bonilla & Williams, 2008) has the form:

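\begin{aligned} \text{cov}(f_i(X), f_j(X')) &= k(X,X') \cdot \mathbf{B}_{ij} \\ \mathbf{B} &= \mathbf{W}\mathbf{W}^\top + \text{diag}(\boldsymbol{\kappa}) \end{aligned}

Here k is a kernel shared by all outputs and \mathbf{B} \in \mathbb{R}^{P\times P} is the coregionalization matrix, which is positive semi-definite by construction when the entries of \boldsymbol{\kappa} are non-negative (written \boldsymbol{\kappa} rather than k to avoid a clash with the kernel). This is useful for problems with a small number of outputs because it is quite an expensive method: the joint covariance over N samples and P outputs is NP \times NP, so exact inference scales as \mathcal{O}(N^3P^3).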
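The ICM covariance above can be written as a Kronecker product of the coregionalization matrix and the input kernel. The following is a minimal NumPy sketch, assuming a squared-exponential kernel for k and a low-rank W; the function names (`rbf_kernel`, `coregionalization_matrix`, `icm_covariance`) and the chosen sizes are hypothetical and only serve the illustration.

```python
import numpy as np

def rbf_kernel(X, X2, lengthscale=1.0, variance=1.0):
    """Squared-exponential kernel k(X, X') between two sets of inputs."""
    sq_dist = np.sum(X**2, 1)[:, None] + np.sum(X2**2, 1)[None, :] - 2 * X @ X2.T
    return variance * np.exp(-0.5 * np.maximum(sq_dist, 0.0) / lengthscale**2)

def coregionalization_matrix(W, kappa):
    """B = W W^T + diag(kappa); positive semi-definite when kappa >= 0."""
    return W @ W.T + np.diag(kappa)

def icm_covariance(X, X2, B, **kernel_kwargs):
    """Joint covariance with blocks B[i, j] * k(X, X'), ordered output by output."""
    return np.kron(B, rbf_kernel(X, X2, **kernel_kwargs))

rng = np.random.default_rng(0)
N, P, R = 5, 3, 2            # samples, outputs, rank of W (assumed values)
X = rng.normal(size=(N, 1))
W = rng.normal(size=(P, R))
kappa = np.full(P, 0.1)

B = coregionalization_matrix(W, kappa)
K_joint = icm_covariance(X, X, B)   # shape (N * P, N * P)
print(K_joint.shape)
```

The size of `K_joint` grows with both N and P, which is where the cost of the method comes from.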

Multi-task Gaussian Process prediction - Bonilla et al. (2007)

-> paper

-> GPSS 2008 Slides


Efficient multioutput Gaussian processes through variational inducing kernels - Alvarez et al. (2011)

-> paper

Remarks on multi-output Gaussian process regression - Liu et al. (2018)

-> pdf

Heterogeneous Multi-output Gaussian Process Prediction - Moreno-Muñoz et al. (2018)

-> paper

-> code

A Framework for Interdomain and Multioutput Gaussian Processes - Van der Wilk et al. (2020)

-> Paper

-> Demo Notebook

Fast Approximate Multi-output Gaussian Processes - Joukov & Kulic (2020)

-> paper

Scalable Exact Inference in Multi-Output Gaussian Processes - Bruinsma et al (2020)

They show a nice trick where you learn an invertible projection on your output space to reduce the crazy number of outputs.

📜 Paper

💻📝 Code | Julia

📺 ICML Prezi
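To give a feel for why a projection on the output space can help, here is a small numerical sketch. It reflects my reading of the idea rather than the authors' code: it assumes a mixing matrix H = U S^{1/2} whose columns are orthogonal, a projection T = S^{-1/2} U^T, and observation noise of the form \sigma^2 I + H D H^\top. All names and sizes are assumptions chosen for the illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
P, M = 6, 2   # observed outputs, latent processes (M <= P)

# Mixing matrix H = U S^{1/2}: U has orthonormal columns, S is diagonal and positive.
U, _ = np.linalg.qr(rng.normal(size=(P, M)))   # reduced QR, so U.T @ U = I_M
S = np.diag(rng.uniform(0.5, 2.0, size=M))
H = U @ np.sqrt(S)

# Projection T = S^{-1/2} U^T maps the P outputs down to M pseudo-observations.
T = np.linalg.inv(np.sqrt(S)) @ U.T

# Assumed noise structure: sigma^2 I + H D H^T with D diagonal.
sigma2 = 0.1
D = np.diag(rng.uniform(0.01, 0.1, size=M))
Sigma = sigma2 * np.eye(P) + H @ D @ H.T

# T undoes the mixing, and the projected noise is diagonal.
projected_noise = T @ Sigma @ T.T
print(np.round(T @ H, 6))
print(np.round(projected_noise, 6))
```

Under these assumptions the projected noise is diagonal, so the M latent processes can be treated as independent single-output GP problems instead of one joint problem over all P outputs.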


Deep Multi-fidelity Gaussian Processes - Raissi & Karniadakis (2016)

-> paper

-> blog

Deep Gaussian Processes for Multi-fidelity Modeling - Cutajar et al. (2019)

-> paper

-> notebook

-> poster

-> Code


I decided to include a special section about the software because there is no real go-to library for dealing with multioutput GPs as of now.

Exact GP

  • Demo Notebook.

    Use this if you have correlated outputs with a small number of outputs and samples.
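For reference, exact inference with correlated outputs can be sketched in plain NumPy. This is not the code from the demo notebook: it assumes an ICM-style covariance B[i, j] · k(x, x'), a squared-exponential kernel on 1-D inputs, fully observed outputs, and fixed hyperparameters. The helper names (`rbf`, `icm_posterior_mean`) are hypothetical.

```python
import numpy as np

def rbf(X, X2, lengthscale=1.0):
    """Squared-exponential kernel for 1-D inputs stored as column vectors."""
    d2 = (X[:, None, 0] - X2[None, :, 0]) ** 2
    return np.exp(-0.5 * d2 / lengthscale**2)

def icm_posterior_mean(X, Y, X_new, B, noise=1e-2, lengthscale=1.0):
    """Exact GP posterior mean under cov(f_i(x), f_j(x')) = B[i, j] k(x, x')."""
    N, P = Y.shape
    K = np.kron(B, rbf(X, X, lengthscale)) + noise * np.eye(N * P)
    K_star = np.kron(B, rbf(X_new, X, lengthscale))
    y = Y.T.reshape(-1)                       # stack outputs: [y_1; y_2; ...; y_P]
    L = np.linalg.cholesky(K)                 # O((N P)^3): the expensive step
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y))
    return (K_star @ alpha).reshape(P, -1).T  # shape (N_new, P)

rng = np.random.default_rng(0)
X = np.linspace(0, 5, 30)[:, None]
f = np.sin(X[:, 0])
# Two perfectly anti-correlated outputs plus observation noise.
Y = np.column_stack([f, -f]) + 0.05 * rng.normal(size=(30, 2))

W = np.array([[1.0], [-1.0]])
B = W @ W.T + 0.01 * np.eye(2)

X_new = np.linspace(0.5, 4.5, 9)[:, None]
mean = icm_posterior_mean(X, Y, X_new, B)
print(np.round(mean, 2))
```

The Cholesky factorization of the NP × NP matrix is what limits this approach to a small number of outputs and samples.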


Sparse GP
