Basic Astrometry Fit
Here is a worked example of a one-planet model fit to relative astrometry (positions measured between the planet and the host star).
Start by loading the Octofitter and Distributions packages:
using Octofitter, Distributions
Specifying the data
We will create a likelihood object to contain our relative astrometry data. We can specify this data in several formats. It can be listed in the code or loaded from a file (eg. a CSV file, FITS table, or SQL database).
astrom_like = PlanetRelAstromLikelihood(
(epoch = 50000, ra = -505.7637580573554, dec = -66.92982418533026, σ_ra = 10, σ_dec = 10, cor=0),
(epoch = 50120, ra = -502.570356287689, dec = -37.47217527025044, σ_ra = 10, σ_dec = 10, cor=0),
(epoch = 50240, ra = -498.2089148883798, dec = -7.927548139010479, σ_ra = 10, σ_dec = 10, cor=0),
(epoch = 50360, ra = -492.67768482682357, dec = 21.63557115669823, σ_ra = 10, σ_dec = 10, cor=0),
(epoch = 50480, ra = -485.9770335870402, dec = 51.147204404903704, σ_ra = 10, σ_dec = 10, cor=0),
(epoch = 50600, ra = -478.1095526888573, dec = 80.53589069730698, σ_ra = 10, σ_dec = 10, cor=0),
(epoch = 50720, ra = -469.0801731788123, dec = 109.72870493064629, σ_ra = 10, σ_dec = 10, cor=0),
(epoch = 50840, ra = -458.89628893460525, dec = 138.65128697876773, σ_ra = 10, σ_dec = 10, cor=0),
)
PlanetRelAstromLikelihood Table with 6 columns and 8 rows:
epoch ra dec σ_ra σ_dec cor
┌────────────────────────────────────────────
1 │ 50000 -505.764 -66.9298 10 10 0
2 │ 50120 -502.57 -37.4722 10 10 0
3 │ 50240 -498.209 -7.92755 10 10 0
4 │ 50360 -492.678 21.6356 10 10 0
5 │ 50480 -485.977 51.1472 10 10 0
6 │ 50600 -478.11 80.5359 10 10 0
7 │ 50720 -469.08 109.729 10 10 0
8 │ 50840 -458.896 138.651 10 10 0
In Octofitter, epoch
is always the modified Julian date (measured in days). If you're not sure what this is, you can get started by just putting in arbitrary time offsets measured in days.
In this case, we specified ra
and dec
offsets in milliarcseconds. We could instead specify sep
(projected separation) in milliarcseconds and pa
in radians. You cannot mix the two formats in a single PlanetRelAstromLikelihood
but you can create two different likelihood objects, one for each format.
You can also specify it in separation (mas) and positon angle (rad):
astrom_like_2 = PlanetRelAstromLikelihood(
(epoch = 50000, sep = 505.7637580573554, pa = deg2rad(24.1), σ_sep = 10, σ_pa =deg2rad(1.2), cor=0),
# ...etc.
)
Another way we could specify the data is by column:
astrom_like = PlanetRelAstromLikelihood(Table(;
epoch= [50000, 50120, 50240, 50360,50480, 50600, 50720, 50840,],
ra = [-505.764, -502.57, -498.209, -492.678,-485.977, -478.11, -469.08, -458.896,],
dec = [-66.9298, -37.4722, -7.92755, 21.6356, 51.1472, 80.5359, 109.729, 138.651, ],
σ_ra = fill(10.0, 8),
σ_dec = fill(10.0, 8),
cor = fill(0.0, 8)
))
PlanetRelAstromLikelihood Table with 6 columns and 8 rows:
epoch ra dec σ_ra σ_dec cor
┌────────────────────────────────────────────
1 │ 50000 -505.764 -66.9298 10.0 10.0 0.0
2 │ 50120 -502.57 -37.4722 10.0 10.0 0.0
3 │ 50240 -498.209 -7.92755 10.0 10.0 0.0
4 │ 50360 -492.678 21.6356 10.0 10.0 0.0
5 │ 50480 -485.977 51.1472 10.0 10.0 0.0
6 │ 50600 -478.11 80.5359 10.0 10.0 0.0
7 │ 50720 -469.08 109.729 10.0 10.0 0.0
8 │ 50840 -458.896 138.651 10.0 10.0 0.0
Finally we could also load the data from a file somewhere. Here is an example of loading a CSV:
using CSV # must install CSV.jl first
astrom_like = CSV.read("mydata.csv", PlanetRelAstromLikelihood)
You can also pass a DataFrame or any other table format.
Creating a planet
We now create our first planet model. Let's name it planet b
. The name of the planet will be used in the output results.
In Octofitter, we specify planet and system models using a "probabilistic programming language". Quantities with a ~
are random variables. The distributions on the right hand sides are priors. You must specify a proper prior for any quantity which is allowed to vary.
We now create our planet b
model using the @planet
macro.
@planet b Visual{KepOrbit} begin
a ~ truncated(Normal(10, 4), lower=0.1, upper=100)
e ~ Uniform(0.0, 0.5)
i ~ Sine()
ω ~ UniformCircular()
Ω ~ UniformCircular()
θ ~ UniformCircular()
tp = θ_at_epoch_to_tperi(system,b,50420)
end astrom_like
In the model definition, b
is the name of the planet (it can be anything), Visual{KepOrbit}
is the type of orbit parameterization (see the PlanetOrbits.jl documentation page).
After the begin
comes our variable specification. Quantities with a ~
are random variables aka. our priors. You must specify a proper prior for any quantity which is allowed to vary. "Uninformative" priors like 1/x
must be given bounds, and can be specified with LogUniform(lower, upper)
.
Make sure that variables like mass and eccentricity can't be negative. You can pass a distribution to truncated
to prevent this, e.g. M ~ truncated(Normal(1, 0.1),lower=0)
.
Priors can be any univariate distribution from the Distributions.jl package.
For a KepOrbit
you must specify the following parameters:
a
: Semi-major axis, astronomical units (AU)i
: Inclination, radianse
: Eccentricity in the range [0, 1)ω
: Argument of periastron, radiusΩ
: Longitude of the ascending node, radians.tp
: Epoch of periastron passage
Many different distributions are supported as priors, including Uniform
, LogNormal
, LogUniform
, Sine
, and Beta
. See the section on Priors for more information. The parameters can be specified in any order.
You can also hardcode a particular value for any parameter if you don't want it to vary. Simply replace eg. e ~ Uniform(0, 0.999)
with e = 0.1
. This =
syntax works for arbitrary mathematical expressions and even functions. We use it here to reparameterize tp
.
tp
is a date which sets the location of the planet around its orbit. It repeats every orbital period and the orbital period depends on the scale of the orbit. This makes it quite hard to sample from. We therefore reparameterize using θ
parameter, representing the position angle of the planet at a given reference epoch. This parameterization speeds up sampling quite a bit!
After the variables block are zero or more Likelihood
objects. These are observations specific to a given planet that you would like to include in the model. If you would like to sample from the priors only, don't pass in any observations.
For this example, we specify PlanetRelAstromLikelihood
block. This is where you can list the position of a planet relative to the star at different epochs.
When you have created your planet, you should see the following output. If you don't, you can run display(b)
to force the text to be output:
Planet b
Derived:
ω, Ω, θ, tp,
Priors:
a Truncated(Distributions.Normal{Float64}(μ=10.0, σ=4.0); lower=0.1, upper=100.0)
e Distributions.Uniform{Float64}(a=0.0, b=0.5)
i Sine()
ωy Distributions.Normal{Float64}(μ=0.0, σ=1.0)
ωx Distributions.Normal{Float64}(μ=0.0, σ=1.0)
Ωy Distributions.Normal{Float64}(μ=0.0, σ=1.0)
Ωx Distributions.Normal{Float64}(μ=0.0, σ=1.0)
θy Distributions.Normal{Float64}(μ=0.0, σ=1.0)
θx Distributions.Normal{Float64}(μ=0.0, σ=1.0)
Octofitter.UnitLengthPrior{:ωy, :ωx}: √(ωy^2+ωx^2) ~ LogNormal(log(1), 0.02)
Octofitter.UnitLengthPrior{:Ωy, :Ωx}: √(Ωy^2+Ωx^2) ~ LogNormal(log(1), 0.02)
Octofitter.UnitLengthPrior{:θy, :θx}: √(θy^2+θx^2) ~ LogNormal(log(1), 0.02)
PlanetRelAstromLikelihood Table with 6 columns and 8 rows:
epoch ra dec σ_ra σ_dec cor
┌────────────────────────────────────────────
1 │ 50000 -505.764 -66.9298 10.0 10.0 0.0
2 │ 50120 -502.57 -37.4722 10.0 10.0 0.0
3 │ 50240 -498.209 -7.92755 10.0 10.0 0.0
4 │ 50360 -492.678 21.6356 10.0 10.0 0.0
5 │ 50480 -485.977 51.1472 10.0 10.0 0.0
6 │ 50600 -478.11 80.5359 10.0 10.0 0.0
7 │ 50720 -469.08 109.729 10.0 10.0 0.0
8 │ 50840 -458.896 138.651 10.0 10.0 0.0
Creating a system
A system represents a host star with one or more planets. Properties of the whole system are specified here, like parallax distance and mass of the star. This is also where you will supply data like images, astrometric acceleration, or stellar radial velocity since they don't belong to any planet in particular.
@system Tutoria begin
M ~ truncated(Normal(1.2, 0.1), lower=0.1)
plx ~ truncated(Normal(50.0, 0.02), lower=0.1)
end b
Tutoria
is the name we have given to the system. It could be eg PDS70
, or anything that will help you keep track of the results.
The variables block works just like it does for planets. Here, the two parameters you must provide are:
M
: Gravitational parameter of the central body, expressed in units of Solar mass.plx
: Distance to the system expressed in milliarcseconds of parallax.
M
is always required for all choices of parameterizations supported by PlanetOrbits.jl. plx
is needed due to our choice to use Visual{...}
orbit and relative astrometry. The prior on plx
can be looked up from GAIA for many targets by using the function gaia_plx
:
plx ~ gaia_plx(;gaia_id=12345678910)
After that, just list any planets that you want orbiting the star. Here, we pass planet b
.
This is also where we could pass likelihood objects for system-wide data like stellar radial velocity.
You can display your system object by running display(Tutoria)
(or whatever you chose to name your system).
Prepare model
We now convert our declarative model into efficient, compiled code:
model = Octofitter.LogDensityModel(Tutoria)
LogDensityModel for System Tutoria of dimension 11 and 11 epochs with fields .ℓπcallback and .∇ℓπcallback
This type implements the julia LogDensityProblems.jl interface and can be passed to a wide variety of samplers.
Sampling
Now we are ready to draw samples from the posterior:
# Provide a seeded random number generator for reproducibility of this example.
# This is not necessary in general: you may simply omit the RNG parameter if you prefer.
using Random
rng = Random.Xoshiro(1234)
chain = octofit(rng, model)
Chains MCMC chain (1000×28×1 Array{Float64, 3}):
Iterations = 1:1:1000
Number of chains = 1
Samples per chain = 1000
Wall duration = 4.25 seconds
Compute duration = 4.25 seconds
parameters = M, plx, b_a, b_e, b_i, b_ωy, b_ωx, b_Ωy, b_Ωx, b_θy, b_θx, b_ω, b_Ω, b_θ, b_tp
internals = n_steps, is_accept, acceptance_rate, hamiltonian_energy, hamiltonian_energy_error, max_hamiltonian_energy_error, tree_depth, numerical_error, step_size, nom_step_size, is_adapt, loglike, logpost, tree_depth, numerical_error
Summary Statistics
parameters mean std mcse ess_bulk ess_tail r ⋯
Symbol Float64 Float64 Float64 Float64 Float64 Floa ⋯
M 1.1922 0.0928 0.0046 415.0194 325.1179 1.0 ⋯
plx 49.9998 0.0202 0.0006 1068.3874 652.1067 0.9 ⋯
b_a 11.9580 2.3312 0.1750 209.5838 267.9094 1.0 ⋯
b_e 0.1556 0.1141 0.0068 207.0118 109.7143 1.0 ⋯
b_i 0.6575 0.1435 0.0094 229.9286 378.3921 1.0 ⋯
b_ωy -0.0718 0.7387 0.0575 191.4211 721.9466 1.0 ⋯
b_ωx -0.0547 0.6797 0.0534 190.2914 627.0753 1.0 ⋯
b_Ωy 0.0366 0.7596 0.1862 20.6763 383.6202 1.0 ⋯
b_Ωx 0.0516 0.6591 0.1169 44.6409 382.3885 1.0 ⋯
b_θy 0.0746 0.0105 0.0004 853.0921 421.5223 1.0 ⋯
b_θx -0.9998 0.0995 0.0034 891.9627 732.3138 1.0 ⋯
b_ω -0.3002 1.8883 0.1311 265.8957 613.1630 1.0 ⋯
b_Ω -0.3158 1.7797 0.2983 59.1633 346.3920 1.0 ⋯
b_θ -1.4963 0.0074 0.0002 860.3811 694.8675 1.0 ⋯
b_tp 42573.2234 5303.6544 320.8974 297.6587 355.0625 1.0 ⋯
2 columns omitted
Quantiles
parameters 2.5% 25.0% 50.0% 75.0% 97.5%
Symbol Float64 Float64 Float64 Float64 Float64
M 1.0089 1.1275 1.1929 1.2501 1.3678
plx 49.9631 49.9858 49.9996 50.0137 50.0390
b_a 8.5508 10.2071 11.5200 13.3323 17.3831
b_e 0.0042 0.0592 0.1372 0.2297 0.3927
b_i 0.3630 0.5632 0.6643 0.7616 0.9144
b_ωy -1.0766 -0.7954 -0.1658 0.6785 1.0605
b_ωx -1.0430 -0.7086 -0.1005 0.5658 1.0471
b_Ωy -1.0603 -0.7392 0.0928 0.8235 1.0960
b_Ωx -1.0479 -0.5569 0.1474 0.6303 1.0468
b_θy 0.0553 0.0679 0.0736 0.0807 0.0986
b_θx -1.2147 -1.0635 -0.9915 -0.9265 -0.8306
b_ω -3.0102 -2.1157 -0.2224 1.1551 3.0129
b_Ω -2.9810 -2.1192 0.2041 0.9337 2.9524
b_θ -1.5105 -1.5014 -1.4961 -1.4912 -1.4825
b_tp 30032.1408 39371.4262 43955.8358 46379.8996 49815.2229
You will get an output that looks something like this with a progress bar that updates every second or so. You can reduce or completely silence the output by reducing the verbosity
value down to 0.
Once complete, the chain
object will hold the sampler results. Displaying it prints out a summary table like the one shown above. For a basic model like this, sampling should take less than a minute on a typical laptop.
Diagnostics
The first thing you should do with your results is check a few diagnostics to make sure the sampler converged as intended.
A few things to watch out for: check that you aren't getting many (any, really) numerical errors (ratio_divergent_transitions
). This likely indicates a problem with your model: either invalid values of one or more parameters are encountered (e.g. the prior on semi-major axis includes negative values) or that there is a region of very high curvature that is failing to sample properly. This latter issue can lead to a bias in your results.
One common mistake is to use a distribution like Normal(10,3)
for semi-major axis. This left tail of this distribution includes negative values, and our orbit model is not defined for negative semi-major axes. A better choice is a truncated(Normal(10,3), lower=0.1)
distribution (not including zero, since a=0 is not defined).
You may see some warnings during initial step-size adaptation. These are probably nothing to worry about if sampling proceeds normally afterwards.
You should also check the acceptance rate (mean_accept
) is reasonably high and the mean tree depth (mean_tree_depth
) is reasonable (~4-8). Lower than this and the sampler is taking steps that are too large and encountering a U-turn very quicky. Much larger than this and it might be being too conservative.
Next, you can make a trace plot of different variabes to visually inspect the chain:
using CairoMakie
lines(
chain["b_a"][:],
axis=(;
xlabel="iteration",
ylabel="semi-major axis (AU)"
)
)
And an auto-correlation plot:
using StatsBase
using CairoMakie
lines(
autocor(chain["b_e"][:], 1:500),
axis=(;
xlabel="lag",
ylabel="autocorrelation",
)
)
This plot shows that these samples are not correlated after only about 5 iterations. No thinning is necessary.
To confirm convergence, you may also examine the rhat
column from chains. This diagnostic approaches 1 as the chains converge and should at the very least equal 1.0
to one significant digit (3 recommended).
Finaly, you might consider running multiple chains. Simply run octofit
multiple times, and store the result in different variables. Ten you can combined the chains using chainscat
and run additional inter-chain convergence diagnostics:
using MCMCChains
chain1 = octofit(model)
chain2 = octofit(model)
chain3 = octofit(model)
merged_chains = chainscat(chain1, chain2, chain3)
gelmandiag(merged_chains)
Gelman, Rubin, and Brooks diagnostic
parameters psrf psrfci
Symbol Float64 Float64
M 1.0002 1.0013
plx 1.0006 1.0012
b_a 1.0016 1.0053
b_e 1.0029 1.0110
b_i 1.0047 1.0099
b_ωy 1.0084 1.0311
b_ωx 1.0095 1.0348
b_Ωy 1.0114 1.0418
b_Ωx 1.0045 1.0173
b_θy 1.0004 1.0029
b_θx 1.0003 1.0021
b_ω 1.0150 1.0495
b_Ω 1.0047 1.0142
b_θ 0.9999 1.0011
b_tp 1.0063 1.0194
As an additional convergence test.
Analysis
As a first pass, let's plot a sample of orbits drawn from the posterior. The function octoplot
is a conveninient way to generate a 9-panel plot of velocities and position:
using CairoMakie
octoplot(model,chain)
This function draws orbits from the posterior and displays them in a plot. Any astrometry points are overplotted.
Pair Plot
A very useful visualization of our results is a pair-plot, or corner plot. We can use the octocorner
function and our PairPlots.jl package for this purpose:
using CairoMakie
using PairPlots
octocorner(model, chain, small=true)
Remove small=true
to display all variables.
In this case, the sampler was able to resolve the complicated degeneracies between eccentricity, the longitude of the ascending node, and argument of periapsis.
Saving your chain
Variables can be retrieved from the chains using the following sytnax: sma_planet_b = chain["b_a",:,:]
. The first index is a string or symbol giving the name of the variable in the model. Planet variables are prepended by the name of the planet and an underscore.
You can save your chain in FITS table format by running:
Octofitter.savechain("mychain.fits", chain)
You can load it back via:
chain = Octofitter.loadchain("mychain.fits")