Joint Gaia-Hipparcos Astrometry (G23H)

This tutorial demonstrates how to fit orbit models using the G23H catalog, a comprehensive dataset that combines calibrated proper motions from Hipparcos, Gaia DR2, and Gaia DR3. This method provides the tightest constraints yet on planetary companions using existing published data from Gaia and Hipparcos, and is described in Thompson et al. (submitted).

Overview

The G23H method combines multiple sources of astrometric information into a single joint likelihood:

  • Hipparcos proper motions and intermediate astrometric data (IAD)
  • Hipparcos-Gaia proper motion anomaly (from the HGCA)
  • Calibrated Gaia DR2 proper motions (cross-calibrated against DR3 reference frame)
  • DR3-DR2 scaled position differences (sensitive to short-term proper motion changes)
  • Gaia DR3 proper motions
  • Gaia astrometric excess noise (via RUWE/UEVA modeling)
  • Gaia RV variability constraints (from the 'paired' catalog)

This approach can detect and characterize Jovian planets on ~3-20 AU orbits around nearby stars, sometimes providing sufficient constraints to confirm planetary companions using only existing Gaia and Hipparcos data.

Basic Example: Fitting a Known Exoplanet Host

Let's fit a model to a star with a known companion. We'll use a minimal example that demonstrates the key components.

using Octofitter
using Distributions
using CairoMakie
using FiniteDiff, DifferentiationInterface

Creating the Observation Object

The G23HObs object encapsulates all the G23H data for a single star. You can specify either a Gaia DR3 source ID or a Hipparcos ID:

# Using Gaia DR3 source ID
absastrom = G23HObs(gaia_id=2738776816458107136)

# Or using Hipparcos ID (automatically resolved to Gaia ID)
absastrom = G23HObs(hip_id=384)

# Approximation for faster sampling -- set to false for real use 
absastrom = G23HObs(;
    gaia_id=2738776816458107136,  
    freeze_epochs=true
)
[ Info: renormalizing hipparcos uncertainties according to Nielsen et al (2020). Pass `renormalize=false` to disable.
[ Info: Contacting the GAIA scan forecast tool GOST: https://gaia.esac.esa.int/gost/
[ Info: Retrieving forecasted GAIA scans from GOST: https://gaia.esac.esa.int/gost/
[ Info: done
┌ Info: Detected known gap in Gaia scans; skipping.
│   window = 57049.77664351836
└   note = "Phased-array antenna anomaly"
┌ Info: Detected known gap in Gaia scans; skipping.
│   window = 57049.85064500151
└   note = "PAA anomaly "
┌ Info: Detected known gap in Gaia scans; skipping.
│   window = 57355.87779363431
└   note = "Payload data handling unit anomaly"
┌ Info: Count of missed or rejected transits:
└   dr3 = 7
┌ Info: Count of RV transits:
│   transits_rv = 19
└   total_transits = 35
[ Info: Added the following observation variables:

On first use, you'll be prompted to download the G23H catalog (~14 GB). This only happens once; the catalog is then cached locally.

The observation object automatically:

  • Loads the calibrated proper motions from all epochs
  • Fetches Gaia scan angles from the GOST service
  • Prepares the Hipparcos IAD if available
  • Sets up priors on Gaia noise parameters based on the catalog values

Selecting Data Subsets

You can choose which observation types to include in your fit using likeobj_from_epoch_subset:

# Available observation types:
# :iad_hip   - Hipparcos intermediate astrometric data
# :ra_hip    - Hipparcos RA proper motion
# :dec_hip   - Hipparcos Dec proper motion
# :ra_hg     - Hipparcos-Gaia RA proper motion
# :dec_hg    - Hipparcos-Gaia Dec proper motion
# :ra_dr2    - Gaia DR2 RA proper motion
# :dec_dr2   - Gaia DR2 Dec proper motion
# :ra_dr32   - DR3-DR2 RA scaled position difference
# :dec_dr32  - DR3-DR2 Dec scaled position difference
# :ra_dr3    - Gaia DR3 RA proper motion
# :dec_dr3   - Gaia DR3 Dec proper motion
# :ueva_dr3  - Gaia DR3 astrometric excess noise (RUWE)
# :rv_dr3    - Gaia DR3 RV variability

# Example: use only a subset of observations
# obs_to_include = [:ra_hip, :dec_hip, :ra_hg, :dec_hg, :ra_dr3, :dec_dr3, :ueva_dr3, :rv_dr3]
# indices = findall([kind ∈ obs_to_include for kind in absastrom.table.kind])
# absastrom = Octofitter.likeobj_from_epoch_subset(absastrom, indices)

Defining the Planet Model

For G23H fitting, we typically use AbsoluteVisual{KepOrbit} which properly handles the barycentric motion:

planet = Planet(
    name="b",
    basis=AbsoluteVisual{KepOrbit},
    variables=@variables begin
        P ~ LogUniform(1/365.25, 10000)# Period in Julian years (1 day to 10000yrs)
        # Now convert to the units expected by Kepler's third law
        # The conversion factor accounts for the IAU definitions
        P_for_kepler = P * Octofitter.PlanetOrbits.year2day_julian / Octofitter.PlanetOrbits.kepler_year_to_julian_day_conversion_factor
        q ~ LogUniform(1e-5, 1)
        mass = q * system.M_pri / Octofitter.mjup2msol
        e ~ Uniform(0, 0.9)  # Eccentricity
        ω ~ Uniform(0, 2pi)  # Argument of periastron
        i ~ Sine()
        Ω ~ Uniform(0, 2pi)
        τ ~ Uniform(0.0, 1.0)  # Fraction of period past periastron
        M = system.M_pri + mass * Octofitter.mjup2msol
        # Now apply Kepler's third law: a^3 = P^2 * M
        # where P is in "Keplerian years", a in AU, M in solar masses
        a = cbrt(P_for_kepler^2 * M)
        tp = τ * P * 365.25 + 57388.5  # Time of periastron [MJD]
    end
)
Planet b
Derived:
          P_for_kepler = (P * Octofitter.PlanetOrbits.year2day_julian) / Octofitter.PlanetOrbits.kepler_year_to_julian_day_conversion_factor
                mass = (q * system.M_pri) / Octofitter.mjup2msol
                   M = system.M_pri + mass * Octofitter.mjup2msol
                   a = cbrt(P_for_kepler ^ 2 * M)
                  tp = τ * P * 365.25 + 57388.5

Priors:
                   P ~ Distributions.LogUniform{Float64}(a=0.0027378507871321013, b=10000.0)
                   q ~ Distributions.LogUniform{Float64}(a=1.0e-5, b=1.0)
                   e ~ Distributions.Uniform{Float64}(a=0.0, b=0.9)
                   ω ~ Distributions.Uniform{Float64}(a=0.0, b=6.283185307179586)
                   i ~ Sine()
                   Ω ~ Distributions.Uniform{Float64}(a=0.0, b=6.283185307179586)
                   τ ~ Distributions.Uniform{Float64}(a=0.0, b=1.0)


Defining the System Model

The system model includes priors on the primary star mass, parallax, and proper motion:

ref_epoch = Octofitter.meta_gaia_DR3.ref_epoch_mjd

sys = System(
    name="target",
    companions=[planet],
    observations=[absastrom],
    variables=@variables begin
        M_pri = 1.0  # Primary mass [Msol] - set to your target's mass
        # You can also make it a variable to be marginalized over:
        # M_pri ~ truncated(Normal(1.0, 0.1), lower=0.01)

        # Parallax prior (use catalog value, truncated)
        plx ~ truncated(Normal(absastrom.catalog.parallax, absastrom.catalog.parallax_error),lower=max(0, absastrom.catalog.parallax-10absastrom.catalog.parallax_error))

        # Proper motion priors (use wide priors - the data will constrain these)
        pmra ~ Uniform(absastrom.catalog.pmra_dr3 - 10, absastrom.catalog.pmra_dr3 + 10)
        pmdec ~ Uniform(absastrom.catalog.pmdec_dr3 - 10, absastrom.catalog.pmdec_dr3 + 10)

        # Fixed coordinates and RV from catalog -- used for high proper motion & RV propagation
        dec = $absastrom.catalog.dec
        ra = $absastrom.catalog.ra
        rv = isnan($absastrom.catalog.radial_velocity) ? 0.0 : absastrom.catalog.radial_velocity * 1e3

        # Reference epoch for RV, RA, DEC, and parallax
        ref_epoch = $ref_epoch

    end
)

model = Octofitter.LogDensityModel(sys; verbosity=4, autodiff=AutoFiniteDiff())
LogDensityModel for System target of dimension 20 and 13 epochs with fields .ℓπcallback and .∇ℓπcallback

Initialization

Normally, we recommend calling initialize! which uses a very robust initialization strategy for MCMC; however, for posteriors that are fairly wide open, it suffices to use a simpler strategy that is faster, like the following:

# Hack to quickly find good starting positions
Octofitter._kepsolve_use_threads[] = true
initial_θ = collect(Octofitter.guess_starting_position(model, 10000)[1])
model.starting_points = fill(collect(model.link(initial_θ)), 100)
@showtime model.ℓπcallback(model.starting_points[1])
-4.782597753785216e18

Sampling

G23H models often have complex, multi-modal posteriors. We strongly recomment using the Pigeons.jl sampler:

using Pigeons

# For initial exploration, start with fewer rounds
chain, pt = octofit_pigeons(
    model,
    n_chains=32,
    n_rounds=6,  # Increase to ~12 for production runs
    explorer=SliceSampler(),
    n_chains_variational=0,
    variational=nothing,
    multithreaded=true,
)
(chain = MCMC chain (64×119×1 Array{Float64, 3}), pt = PT(checkpoint = false, ...))

You can continue sampling by incrementing rounds:

increment_n_rounds!(pt, 2)
chain, pt = octofit_pigeons(pt)

Analysis and Visualization

After sampling, examine the results:

# Mass vs. separation plot with marginal histograms
Octofitter.dotplot(model, chain)

Important Considerations

Use wide priors for pmra/pmdec

Similar to HGCA fitting, use wide priors for pmra and pmdec. Do not use Gaia DR3 proper motion values as tight priors—this would double-count information since the G23H data already incorporates Gaia astrometry.

Computational cost

G23H models are more computationally expensive than simple HGCA models because they marginalize over Gaia's unpublished observation epochs. Consider using freeze_epochs=true for faster (but approximate) sampling during initial exploration.

Quality checks

For stars where rho_dr2_dr3_cat approaches 1, the DR2-DR3 constraints may be unreliable. Consider fitting without the DR2 and DR32 epochs for such targets.

See Also