Guide

This guide demonstrates the usage of PairPlots and shows several ways you can customize it to your liking.

Set up:

using CairoMakie
using PairPlots
using DataFrames

CairoMakie is great for making high quality static figures. Try GLMakie or WGLMakie for interactive plots!

We will use DataFrames here to wrap our tables and provide pleasant table listings. You can use any Tables.jl compatible source, including simple named tuples of vectors for each column.
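
As a minimal sketch of the named-tuple route, the following builds a two-column table without DataFrames and passes it straight to pairplot. The column names `a` and `b`, the sample size, and the output filename are arbitrary choices for illustration, and the sketch assumes that pairplot returns a Makie Figure that can be handed to `save`.

```julia
using CairoMakie
using PairPlots

# A named tuple of equal-length vectors is a valid Tables.jl column table,
# so no DataFrame wrapper is needed. Column names here are illustrative.
table = (;
    a = randn(1_000),
    b = 2 .* randn(1_000) .+ 1,
)

fig = pairplot(table)

# Write the static figure to disk with Makie's `save` (filename is arbitrary).
save("pairplot_example.png", fig)
```

Switching the extension to `.pdf` or `.svg` should give vector output when CairoMakie is the active backend.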

Single Series

Let's create a basic table of data to visualize.

N = 100_000
α = [2randn(N÷2) .+ 6; randn(N÷2)]
β = [3randn(N÷2); 2randn(N÷2)]
γ = randn(N)
δ = β .+ 0.6randn(N)

df = DataFrame(;α, β, γ, δ)
8×4 DataFrame
 Row │ α        β          γ           δ
     │ Float64  Float64    Float64     Float64
─────┼──────────────────────────────────────────
   1 │ 3.265     0.737036   0.721302    1.01452
   2 │ 3.0813    0.699166   1.55058     0.977892
   3 │ 5.50537  -1.94619   -0.675681   -1.17317
   4 │ 5.35906  -0.641086   1.6856     -0.420589
   5 │ 6.61853  -2.96523   -0.313896   -2.8728
   6 │ 5.26281   0.953755  -1.18737     1.01822
   7 │ 5.17691   3.72952    0.9621      3.61009
   8 │ 7.26968   1.93363   -0.0143716   1.19763

We can plot this data directly using pairplot, and add customizations iteratively.

pairplot(df)

Override the axis labels:

pairplot(
    df,
    labels = Dict(
        # basic string
        :α => "parameter 1",
        # Makie rich text
        :β => rich("parameter 2", font=:bold, color=:blue),
        # LaTeX String
        :γ => L"\frac{a}{b}",
    )
)

Let's move on to more complex examples. The full syntax of the pairplot function is:

pairplot(
    PairPlots.Series(source) => (::PairPlots.VizType...),
)

That is, it accepts a list of pairs, each mapping a PairPlots.Series to a tuple of "visualization layers". As we'll see later on, you can pass keyword arguments to a series, or to a specific visualization layer, to customize its behaviour and appearance. If you don't need to adjust any parameters for a whole series, you can just pass in a data source and PairPlots will wrap it for you:

pairplot(
    source => (::PairPlots.VizType...),
)
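
To make the schematic forms above concrete, here is a small sketch with real arguments filled in. The table, the label text, and the colours are illustrative assumptions; the keywords `label` and `color` and the layer types are ones used elsewhere in this guide.

```julia
using CairoMakie
using PairPlots

# Illustrative data: any Tables.jl compatible source works here.
table = (; x = randn(2_000), y = randn(2_000))

fig = pairplot(
    # Keyword arguments given to the Series belong to the whole series...
    PairPlots.Series(table, label="samples", color=:steelblue) => (
        # ...while keyword arguments given to a layer customize only that layer.
        PairPlots.Scatter(),
        PairPlots.MarginDensity(color=:black),
    ),
)
```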

Let's see how this works by iteratively building up the default visualization. First, create a basic histogram plot:

pairplot(
    df => (PairPlots.Hist(),) # note the comma
)
Note

A tuple or list of visualization types is required, even if you just want one. Make sure to include the comma in these examples.
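
The comma matters because of how Julia parses parentheses, independent of PairPlots. The sketch below uses a plain Symbol as a stand-in for a layer object (the name `layer` is hypothetical) to show the difference:

```julia
# `layer` is a stand-in value; the same rule applies to PairPlots layer objects.
layer = :hist

# Parentheses alone only group an expression, so this is still a Symbol.
not_a_tuple = (layer)

# A trailing comma makes a one-element tuple.
one_tuple = (layer,)

# A one-element vector (the "list" form) needs no trailing comma.
one_vector = [layer]

println(typeof(not_a_tuple))  # Symbol
println(typeof(one_tuple))    # Tuple{Symbol}
println(typeof(one_vector))   # Vector{Symbol}
```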

Or, a histogram with hexagonal binning:

pairplot(
    df => (PairPlots.HexBin(),)
)

Scatter plots:

pairplot(
    df => (PairPlots.Scatter(),)
)

Filled contour plots:

pairplot(
    df => (PairPlots.Contourf(),)
)

Outlined contour plots:

pairplot(
    df => (PairPlots.Contour(),)
)

Now let's combine a few plot types. Scatter and contours:

pairplot(
    df => (PairPlots.Scatter(), PairPlots.Contour())
)

Scatter and contours, but hiding points that fall inside the $2\sigma$ contour:

pairplot(
    df => (PairPlots.Scatter(filtersigma=2), PairPlots.Contour())
)

Placing a HexBin layer underneath:

pairplot(
    df => (
        PairPlots.HexBin(colormap=Makie.cgrad([:transparent, :black])),
        PairPlots.Scatter(filtersigma=2, color=:black),
        PairPlots.Contour(color=:black)
    )
)

Margin plots

We can add additional visualization layers to the diagonals of the plots using the same syntax.

pairplot(
    df => (
        PairPlots.HexBin(colormap=Makie.cgrad([:transparent, :black])),
        PairPlots.Scatter(filtersigma=2, color=:black),
        PairPlots.Contour(color=:black),

        # New:
        PairPlots.MarginDensity()
    )
)

Adjust the KDE bandwidth of the margin density. The value multiplies the default bandwidth: a value larger than 1 increases smoothing, and a value less than 1 decreases it.

pairplot(
    df => (
        PairPlots.HexBin(colormap=Makie.cgrad([:transparent, :black])),
        PairPlots.Scatter(filtersigma=2, color=:black),
        PairPlots.Contour(color=:black),

        PairPlots.MarginDensity(bandwidth=0.5)
    )
)

Use a histogram instead of a smoothed kernel density estimate, and add confidence limits:

pairplot(
    df => (
        PairPlots.HexBin(colormap=Makie.cgrad([:transparent, :black])),
        PairPlots.Scatter(filtersigma=2, color=:black),
        PairPlots.Contour(color=:black),

        # New:
        PairPlots.MarginHist(),
        PairPlots.MarginConfidenceLimits(),
    )
)

Truth Lines

You can quickly add lines to mark particular values of each variable on all subplots using Truth:

pairplot(
    df,
    PairPlots.Truth(
        (;
            α = [0, 6],
            β = 0,
            γ = 0,
            δ = [-1, 0, +1],
        ),
        label="Mean Values"
    )
)
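
The marked values do not have to be typed in by hand. As a sketch, the following computes per-column medians with the Statistics standard library and passes them to Truth as a named tuple. The table, column names, and label are illustrative, and the sketch assumes Truth accepts floating-point scalars the same way it accepts the integer values shown above.

```julia
using CairoMakie
using PairPlots
using Statistics

# Illustrative data with a known offset in `x`.
table = (; x = randn(5_000) .+ 1, y = 2 .* randn(5_000))

# Compute the marker positions from the data, one value per column.
medians = (; x = median(table.x), y = median(table.y))

fig = pairplot(
    table,
    PairPlots.Truth(medians, label="Sample medians"),
)
```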

Customize Axes

You can customize the axes of the subplots freely in two ways. For these examples, we'll create a variable that is log-normally distributed.

dfln = DataFrame(;α, β, γ=10 .^ γ, δ)

First, you can pass axis parameters for all plots along the diagonal using the diagaxis keyword, or for all plots below the diagonal using the bodyaxis keyword.

Turn on grid lines for the body axes:

pairplot(dfln, bodyaxis=(;xgridvisible=true, ygridvisible=true))

Apply a pseudo-log scale on the margin plots along the diagonal:

pairplot(dfln, diagaxis=(;yscale=Makie.pseudolog10, ygridvisible=true))
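
The two keywords address different sets of axes, so it seems reasonable to combine them in one call. The sketch below assumes that combination is supported; the data is illustrative, and the attributes are ordinary Makie Axis attributes.

```julia
using CairoMakie
using PairPlots

# Illustrative data.
table = (; x = randn(2_000), y = randn(2_000))

fig = pairplot(
    table,
    # Applied to every off-diagonal (body) axis.
    bodyaxis = (; xgridvisible = true, ygridvisible = true),
    # Applied to every diagonal (margin) axis.
    diagaxis = (; xgridvisible = true),
)
```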

The second way you can control the axes is by table column. This allows you to customize how an individual variable is presented across the pair plot.

For example, we can apply a log scale to all axes that the γ variable is plotted against:

pairplot(
    dfln => (PairPlots.Scatter(), PairPlots.MarginStepHist()),
    axis=(;
        γ=(;
            scale=log10
        )
    )
)
Note

We do not prefix the attribute with x or y. PairPlots.jl will add the correct prefix as needed.

Warning

Log scale variables usually work best with Scatter series. Histogram-based and contour-based series sometimes extend past zero, which breaks the log scale.

There is also special support for setting the axis limits of each variable:

pairplot(
    dfln => (PairPlots.Scatter(), PairPlots.MarginStepHist()),
    axis=(;
        α=(;
            lims=(;low=-10, high=+10)
        ),
        γ=(;
            scale=log10
        )
    )
)

This applies the correct limits either to the vertical axis or horizontal axis as appropriate. Note that the parameters low and/or high must be passed as a named tuple.
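
Since low and high are independent, a one-sided limit can be sketched as follows. The data and column names are illustrative, and the sketch assumes that an omitted bound is left to the automatic axis limits.

```julia
using CairoMakie
using PairPlots

# Illustrative data: `a` is non-negative by construction.
table = (; a = abs.(randn(2_000)), b = randn(2_000))

fig = pairplot(
    table => (PairPlots.Scatter(), PairPlots.MarginStepHist()),
    axis = (;
        # Only the lower limit of `a` is pinned; `high` is not given.
        a = (; lims = (; low = 0)),
    ),
)
```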

Adding a title

fig = pairplot(df)
Label(fig[0,:], "This is the title!")
fig
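
The title is an ordinary Makie Label, so it can be styled with Label attributes. A small sketch, with illustrative data, title text, and font settings:

```julia
using CairoMakie
using PairPlots

# Illustrative data.
table = (; x = randn(2_000), y = randn(2_000))

fig = pairplot(table)

# Row 0 sits above the pair plot; `fontsize` and `font` are Makie Label attributes.
Label(fig[0, :], "Posterior samples", fontsize = 20, font = :bold)

fig
```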

Layouts

The pairplot function integrates easily within larger Makie Figures.

Customizing the figure:

fig = Figure(size=(400,400))
pairplot(fig[1,1], df => (PairPlots.Contourf(),))
fig
Note

If you only need to pass arguments to Figure, for convenience you can use pairplot(df, figure=(;...)).
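
For instance, the figure size from the example above can be set without constructing the Figure yourself. The data here is illustrative:

```julia
using CairoMakie
using PairPlots

# Illustrative data.
table = (; x = randn(2_000), y = randn(2_000))

# Keyword arguments in `figure` are passed on to Makie's Figure.
fig = pairplot(table, figure = (; size = (400, 400)))
```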

You can plot into one part of a larger figure:

fig = Figure(size=(800,800))

scatterlines(fig[1,1], randn(40))

pairplot(fig[1,2], df)

lines(fig[2,:], randn(200))


colsize!(fig.layout, 2, 450)
rowsize!(fig.layout, 1, 450)

fig

Adjust the spacing between axes inside a pair plot:

fig = Figure(size=(600,600))

# Pair Plots must go into a Makie GridLayout. If you pass a GridPosition instead,
# PairPlots will create one for you.
# We can then adjust the spacing within that GridLayout.

gs = GridLayout(fig[1,1])
pairplot(gs, df)

rowgap!(gs, 0)
colgap!(gs, 0)

fig

Multiple Series

You can plot multiple series by simply passing more than one table to pairplot. The tables don't need to share all of the same column names.

# The simplest table format is just a named tuple of vectors.
# You can also pass a DataFrame, or any other Tables.jl compatible object.
table1 = (;
    x = randn(10000),
    y = randn(10000),
)

table2 = (;
    x = 1 .+ randn(10000),
    y = 2 .+ randn(10000),
    z = randn(10000),
)

pairplot(table1, table2)

You may want to add a legend:

c1 = Makie.wong_colors(0.5)[1]
c2 = Makie.wong_colors(0.5)[2]

pairplot(
    PairPlots.Series(table1, label="table 1", color=c1, strokecolor=c1),
    PairPlots.Series(table2, label="table 2", color=c2, strokecolor=c2),
)

You can customize each series independently if you wish.

pairplot(
    table2 => (
        PairPlots.HexBin(colormap=:magma),
        PairPlots.MarginDensity(color=:orange),
        PairPlots.MarginConfidenceLimits(color=:black),
    ),
    table1 => (PairPlots.Contour(color=:cyan, strokewidth=5),),
)