# Cross-Validation
This tutorial is currently a stub and will be expanded in the future. In the meantime, if you have questions, please open an issue on GitHub.
## Calculating Pointwise Likelihoods
After you have defined a model and sampled from its posterior (e.g. via `octofit`), you can examine how each datapoint influences the posterior using the following function.
```julia
# assuming `model` and `chain` have already been defined...
likelihood_mat = Octofitter.pointwise_like(model, chain)
```
`likelihood_mat` is now an `Nsamples × Ndata` matrix. The columns are ordered the same way the data are defined in the model.
You may see a few additional entries you didn't expect: each `UniformCircular` and `ObsPriorAstromONeil2019` adds an additional likelihood object under the hood.
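As a quick sanity check, you can compare the shape of the matrix against your expectations. This is a minimal sketch assuming a single chain, so that the row count matches the number of posterior samples in `chain`:

```julia
# minimal sketch: check that rows correspond to posterior samples
# (assumes `chain` holds a single chain, as sampled above)
nsamples, ndata = size(likelihood_mat)
@assert nsamples == size(chain, 1)
println("posterior samples: $nsamples, likelihood terms: $ndata")
```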
## Pareto-Smoothed Importance Sampling
After you have generated `likelihood_mat`, you can use the Julia package ParetoSmooth.jl to efficiently calculate a leave-one-out cross-validation score. This technique takes a single posterior chain and, using the pointwise likelihoods, generates `N_datapoints` new chains, each adjusted as if the corresponding datapoint had been held out from the model.
In broad terms, this test verifies that no individual datapoint is overly skewing the results.
```julia
using ParetoSmooth

result = psis_loo(
    # ParetoSmooth expects the matrix as (data × draws), hence the transpose
    collect(likelihood_mat'),
    # assign every draw to a single chain
    chain_index=ones(Int, size(chain, 1))
)
```
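Once `result` is computed, a common rule of thumb (from Vehtari, Gelman, and Gabry 2017) is that Pareto k values below roughly 0.7 indicate the importance-sampling approximation is reliable for that datapoint. Here is a minimal sketch of such a check, assuming the `pointwise` field of `result` is queried as in the plotting example below:

```julia
# flag any datapoints whose Pareto k exceeds the ~0.7 reliability
# threshold; such points have an outsized influence on the posterior
# and their held-out likelihoods may be poorly approximated
pareto_k = collect(result.pointwise(:pareto_k))
flagged = findall(>(0.7), pareto_k)
if isempty(flagged)
    println("All Pareto k values look acceptable (< 0.7)")
else
    println("Check datapoints: ", flagged)
end
```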
You can plot the pointwise diagnostics like so:
```julia
using CairoMakie

fig = Figure()

# Pareto shape parameter k for each datapoint
ax = Axis(
    fig[1,1],
    xlabel="data #",
    ylabel="Pareto K"
)
scatter!(ax, result.pointwise(:pareto_k))

# Monte Carlo standard error of each pointwise estimate
ax = Axis(
    fig[2,1],
    xlabel="data #",
    ylabel="MCSE"
)
scatter!(ax, result.pointwise(:mcse))

# effective number of parameters per datapoint
ax = Axis(
    fig[3,1],
    xlabel="data #",
    ylabel="P_EFF"
)
scatter!(ax, result.pointwise(:p_eff))

fig
```