Model description

All of the following models are the same as our main model m3, except for the noted changes to test robustness.

r1: Relaxed exclusion criteria

For the three historical populations, we imposed quite stringent exclusion criteria to ensure sufficient data quality for our intended analysis. Here, we relax these criteria. This was not necessary for the modern Swedish data, because there were no exclusion criteria to relax.

model_filename = make_path("r1_relaxed_exclusion_criteria")
if (file.exists(model_filename)) {
    # `model` is assumed to have been loaded from model_filename in a step not shown here
    r1 = model
}

r2: Fewer covariates

Adding covariates increases the complexity of the model and makes it harder to interpret. We chose to adjust for many potential confounds because we are interested in causal isolation of the paternal age effect. Here we show what happens when only birth cohort and average paternal age in the family are adjusted for.

model_filename = make_path("r2_few_controls")
r2 = model

r3: Continuous birth order control

We chose to control for birth order/number of older siblings as a categorical variable, lumping all those who had more than 5 in the category 5+. Because a continuous covariate is also plausible, we tested this alternative model as well.

model_filename = make_path("r3_birth_order_continuous")
r3 = model
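
The two codings compared here can be sketched as follows. This is a minimal illustration, not the code used in our analysis scripts: the vector `birth_order` and the helper `lump_birth_order` are hypothetical, and we assume that a birth order of exactly 5 also falls into the "5+" category.

```r
# Hypothetical birth orders for eight children (illustrative data only)
birth_order <- c(1, 2, 3, 5, 6, 9, 1, 4)

# Categorical coding: cap at 5 and label the top category "5+"
# (assumption: a birth order of exactly 5 also falls into "5+")
lump_birth_order <- function(x, cap = 5) {
  capped <- pmin(x, cap)
  labels <- c(as.character(seq_len(cap - 1)), paste0(cap, "+"))
  factor(capped, levels = seq_len(cap), labels = labels)
}

birth_order_cat <- lump_birth_order(birth_order)

# Continuous coding: the raw count enters the model as one linear term
birth_order_cont <- as.numeric(birth_order)

table(birth_order_cat)
```

The categorical version allows each birth order to have its own effect, whereas the continuous version assumes that each additional older sibling shifts the outcome by the same amount.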

r4: Control number of dependent siblings

Birth order is usually used as a proxy variable for parental investment, the assumption being that older siblings require parental attention. However, there are reasons to doubt this, as fully-grown siblings probably do not compete for the same resources. To obtain a clearer proxy for competing siblings, we computed and adjusted for the number of siblings who were alive and younger than five at the time of birth of the anchor child.

model_filename = make_path("r4_control_dependent_sibs")
r4 = model
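
To make the definition of dependent siblings concrete, the following sketch counts, for each child in a made-up family, the older siblings who were alive and younger than five when that child was born. The data frame `family` and the function `count_dependent_sibs` are hypothetical names introduced for illustration; the actual computation in our pipeline may differ in detail.

```r
# Hypothetical family: birth and death in decimal years (NA = no recorded death)
family <- data.frame(
  child = c("a", "b", "c", "d"),
  birth = c(1800.0, 1802.5, 1804.0, 1806.0),
  death = c(NA, 1803.0, NA, NA)
)

# For each anchor, count siblings who were born earlier, were still alive at
# the anchor's birth, and were younger than `max_age` years at that time.
count_dependent_sibs <- function(birth, death, max_age = 5) {
  vapply(seq_along(birth), function(i) {
    older <- birth < birth[i]
    alive <- is.na(death) | death > birth[i]
    young <- (birth[i] - birth) < max_age
    sum(older & alive & young)
  }, integer(1))
}

family$dependent_sibs <- count_dependent_sibs(family$birth, family$death)
family
```

In this toy family, child d has three older siblings but only one dependent sibling, because the eldest was already six and the second had died.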

r5: Birth order interacted with number of siblings

Plausibly, being first-born has a different effect when one is an only child than when one has two siblings, and so on. Here, we allow for such an interaction effect.

model_filename = make_path("r5_birth_order_interact_siblings")
r5 = model

r6: No birth order control

Paternal age and birth order are highly collinear with each other and with maternal age. Including birth order therefore widens the standard errors of each of these predictors, and this choice may be disputed. Here we show what happens when we simply omit the birth order control.

model_filename = make_path("r6_no_birth_order_control")
r6 = model

r7: Less control for parental loss

We adjusted for parental loss very stringently, including covariates for parental loss up to age 45. Here we show what happens when we control only for parental loss in the first year and in the first five years of life.

model_filename = make_path("r7_less_parental_loss_control")
r7 = model

r8: Adjust for being first-/last-born adult son

Inheritance is linked to birth order and being male in several of the historical populations. Here, we adjust for the anchor being the first or last born adult son in a family. This implies that we control for our outcome to a certain extent, as “adult sons” cannot have died before adulthood, but a paternal age effect on mortality could still be detected for siblings other than the first- and last-born adults.

model_filename = make_path("r8_adjust_for_first_born_adult")
r8 = model

r9: Continuous birth year adjustment

In our main model, we control for birth cohort in 5-year bins (lumping small bins). We chose to do so because nonlinear and even sharply spiking effects of birth cohort are plausible (e.g. due to epidemics). This decision may be disputed, as it aggregates birth years within 5-year bins. Here, we instead fit a thin-plate spline on the continuous birth year variable. This allows for smooth nonlinear (but not spiking) birth cohort effects.

model_filename = make_path("r9_continuous_byear_adjustment")
r9 = model
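
The binned coding that this model replaces can be sketched as below. The example data, the threshold `min_n`, the label "other", and the helper `lump_small_bins` are assumptions made for this illustration only; they are not the values or names used in our scripts.

```r
# Hypothetical birth years (illustrative data only)
birth_year <- c(1701, 1702, 1703, 1704, 1706, 1707, 1708, 1713, 1721, 1722, 1723)

# Assign each year to the start of its 5-year bin, e.g. 1703 -> 1700
bin_start <- floor(birth_year / 5) * 5

# Lump bins with fewer than `min_n` births into one "other" category
# (assumption: threshold and label are made up for this sketch)
lump_small_bins <- function(bins, min_n = 2) {
  counts <- table(bins)
  small <- names(counts)[counts < min_n]
  out <- as.character(bins)
  out[out %in% small] <- "other"
  factor(out)
}

birth_cohort <- lump_small_bins(bin_start)
table(birth_cohort)
```

Each bin receives its own coefficient, so a single bad cohort can spike, whereas a spline on `birth_year` would force neighbouring years to have similar effects.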

r10: Group-level slope added

Paternal age effects may vary between families. Although we did not explore between-family moderators of paternal age effects in our study, we tested whether adding a group-level slope for within-family paternal age differences would change the results by allowing for shrinkage. This model also lets us examine how much between-family variation remains to be explained in potential future moderator analyses.

model_filename = make_path("r10_add_random_slope")
r10 = model

r11: Separate group-level effects for each parent

Most anchors in our sample are full biological siblings and, especially in the historical populations, divorce and remarriage were rare. Therefore, we chose to include only one group-level effect, for the parent couple (i.e. one group-level effect per father-mother dyad). Including one intercept per parent is potentially a better way to adjust for genetic propensities inherited from either parent and also allows this propensity to be estimated from half-siblings, whereas half-sibling relationships were ignored in our main models. This comes at the cost of modelling complexity.

model_filename = make_path("r11_separate_random_effects_for_parents")
r11 = model

r12: Sex moderation

It need not be the case that paternal age has the same effect on male and female children. For example, male children inherit only the small Y chromosome from the father, but female children inherit the larger X chromosome, so that paternal age predicts X-chromosomal de novo mutations in females but not in males (Francioli et al., 2016). At the same time, the autism literature suggests that males are less robust to heritable and de novo autism risk variants and that these effects are not simply due to having only one X chromosome (Werling & Geschwind, 2015). Here we let a dummy variable for being male moderate the paternal age effect.

model_filename = make_path("r12_sex_moderation")
r12 = model

r13: Control paternal age at first birth

We already control for the average paternal age at which the children in a family were born. The mean is a more complete summary of the reproductive timing of the father than the age at first birth. However, far more literature has examined age at first birth and it has the advantage of never being censored (although we of course try to rule out censoring by choosing appropriate subsets). Therefore, we added age at first birth as a covariate in this model.

model_filename = make_path("r13_control_paternal_afb")
r13 = model

r14: Compare lfe

Most of the previous literature has not used multilevel modelling, but linear group fixed effects (essentially dummy variables on the many thousands of families in the model). We believe our multilevel modelling approach has the advantage of allowing us to examine the effect of including predictors at the level of the family in the same model.

This allows us to
a) appropriately model a zero-inflated outcome such as number of children including those who died young (we’re not aware of a linear group fixed effect approach that handles hurdle or zero-inflated models)
b) examine group-level slopes for paternal age and potentially to examine moderators at the level of the family (though we did not do this)
c) explicitly model confounders at the level of the family (e.g. number of siblings).

Nevertheless, the prevalence of this approach in the literature mandates that we show how our approach compares. We fit this model using the R package “lfe” and the function felm. All covariates that were not estimable in principle were removed (i.e. number of siblings, paternalage.mean).

r14 = readRDS(make_path("r14_compare_lfe"))

r15: Using a moderator by region, group-level effects by parish

In this model we attempted to allow for regional variation in paternal age effects and to better control residual variation. Our approach was two-fold: to moderate paternal age by region and to add a random effect for the church parish in which the individual was born. However, for the modern Swedish data, we had no geographic data and no regional information, so this model was not fit.

model_filename = make_path("r15_region_moderator_parish_ranef")
if(file.exists(model_filename)) {
    r15 = model
}

r16: Restrict to Skellefteå

Only in the DDB (historical Swedish data), parishes in some of the regions were still unlinked. This means that individuals could occur in more than one parish and not be linked. However, the region of Skellefteå was fully linked. Here, we test what happens when we restrict our dataset to Skellefteå.

model_filename = make_path("r16_restrict_to_skelleftea")
if(file.exists(model_filename)) {
    r16 = model
}

r17: Simulating Down syndrome cases

  1. We assumed that 4 in 1000 births are children with Down syndrome (four times the actual rate).
  2. We randomly excluded 33% of all children who were born to a mother older than 40 and who had no children themselves (many times the actual rate at that age).

model_filename = make_path("r17_simulate_downs")
r17 = model
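
The random exclusion in step 2 can be sketched as follows. This is a minimal illustration on simulated data, not our analysis code: the data frame `sim`, its columns, and the distributions used to generate them are assumptions made only so that the example runs.

```r
set.seed(1)  # reproducible sketch

# Hypothetical data: maternal age at birth and number of children per anchor
n <- 1000
sim <- data.frame(
  maternal_age = runif(n, min = 18, max = 46),
  children = rpois(n, lambda = 1.5)
)

# Candidates: mother older than 40 at birth and anchor remained childless
candidate <- sim$maternal_age > 40 & sim$children == 0

# Randomly flag about 33% of the candidates for exclusion
excluded <- candidate & (runif(n) < 0.33)

sim_reduced <- sim[!excluded, ]
c(candidates = sum(candidate), excluded = sum(excluded), kept = nrow(sim_reduced))
```

Refitting the model on the reduced data shows how much of the paternal age effect could be attributed to such cases under deliberately exaggerated assumptions.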

r18: Reversing hurdle_poisson and poisson

To make models computationally feasible and because early mortality was negligible, we fit the very large modern Swedish dataset with a poisson() family distribution. All historical datasets had high early mortality, so we thought a hurdle_poisson() was more appropriate. Here, we show what happens when we reverse this. The hurdle_poisson() model can be fit to the modern Swedish data here, because we only use a subset.

model_filename = make_path("r18_hurdle_poisson")
r18 = model
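
For readers unfamiliar with the distinction, the sketch below writes out the probability mass function of a hurdle Poisson distribution in its usual textbook form and compares it with a plain Poisson. The function `dhurdle_poisson` and the parameter values are hypothetical and chosen for illustration; this is not the internal code of the modelling software.

```r
# Probability mass function of a hurdle Poisson distribution.
# hu     : probability of observing a zero (the hurdle part)
# lambda : rate of the zero-truncated Poisson for positive counts
dhurdle_poisson <- function(y, lambda, hu) {
  positive <- (1 - hu) * dpois(y, lambda) / (1 - dpois(0, lambda))
  ifelse(y == 0, hu, positive)
}

y <- 0:10
p_hurdle <- dhurdle_poisson(y, lambda = 2, hu = 0.4)
p_poisson <- dpois(y, lambda = 2)

# Compare the mass on counts 0 to 3 under both distributions
round(rbind(hurdle = p_hurdle, poisson = p_poisson)[, 1:4], 3)
```

With high early mortality, many individuals have zero children for reasons unrelated to the process that governs family size among survivors, which is what the separate zero probability `hu` is meant to capture.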

r19: Normal distribution

Previous analysts sometimes decided to use the normal distribution to predict (potentially zero-inflated) count data. Here, we refit our models using a normal distribution for the outcome. Under this choice, estimates of the paternal age effect can differ substantially in magnitude, but they did not change direction.

model_filename = make_path("r19_normal_distribution")
r19 = model

r20: No adjustment for maternal age

In this model, we test what happens when we do not adjust for maternal age, because it is highly collinear with paternal age.

model_filename = make_path("r20_no_maternalage_control")
r20 = model

r21: Continuous adjustment for maternal age

In this model, we adjust for maternal age using a continuous variable instead of three bins. This does not allow for nonlinear effects, but also does not aggregate the predictor. We cannot simultaneously compare full siblings, test the effects of maternal and paternal age, and adjust for average maternal and paternal age in the family, because the predictors would be redundant. It is therefore not possible to fully disentangle the contributions of maternal and paternal age while comparing full siblings.

model_filename = make_path("r21_continuous_maternalage")
r21 = model

r22: Relaxed exclusion and censoring criteria

Like r1, but we use a 30-years-later cutoff year for our birth cohorts, relaxing our censoring requirements.

model_filename = make_path("r22_relaxed_exclusion_censoring")
if(file.exists(model_filename)) {
    r22 = model
}

r23: Student’s t and half-Cauchy priors

To demonstrate that our results are robust to the choice of priors, we use Student’s t priors (fatter tails than normal priors) for our population-level effects and a half-Cauchy prior for our group-level effect for the family.

model_filename = make_path("r23_student_cauchy_priors")
r23 = model
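
What "fatter tails" means in practice can be shown with a short calculation. The sketch assumes unit scale and 3 degrees of freedom for the Student's t distribution; these values are chosen for illustration and are not necessarily the ones used in the model.

```r
# Prior probability of an effect larger than 4 in absolute value
# (assumption: unit scale and 3 degrees of freedom, for illustration only)
tail_normal  <- 2 * pnorm(-4)
tail_student <- 2 * pt(-4, df = 3)
tail_cauchy  <- 2 * pcauchy(-4)  # equals P(X > 4) for a half-Cauchy

signif(c(normal = tail_normal, student_t3 = tail_student, cauchy = tail_cauchy), 2)
```

Heavier-tailed priors shrink large estimates less strongly, so similar results under both sets of priors suggest that the estimates are driven by the data rather than by the prior.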

r24: Improper flat priors

To demonstrate that our results are robust to the choice of priors, we use improper flat priors. These priors should make the model’s results comparable to a frequentist maximum likelihood approach.

model_filename = make_path("r24_uniform_priors")
r24 = model

r25: Adjust for migration status

In the three historical populations, records were kept in the parish. Although records were linked between parishes in all populations except three out of four provinces in historical Sweden, migration might sometimes lead to censoring of records. However, adjusting for migration may constitute a partial adjustment for the outcome, as offspring with lower fitness might be more likely to migrate. Hence, we show the results of doing so as a robustness analysis. In these analyses, we adjusted for a “migrated” dummy variable. Migration was defined differently depending on the population. In Québec, we had flags denoting immigrants and emigrants. Few immigrants were included in our analyses anyway, as we needed parental information for our analyses. Emigrants were people who left Québec. In historical Sweden, migration was logged as migration from the parish of birth. In the Krummhörn, we set migrated to true when the parish of death/burial differed from the parish of birth/baptism.
No migration information was available in 20th-century Sweden, but records there weren’t kept in parishes, so this should not pose a problem.

model_filename = make_path("r25_migration_status")
if(file.exists(model_filename)) {
    r25 = model
}
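
The Krummhörn rule described above amounts to a simple comparison of two columns. The sketch below uses placeholder parish names and hypothetical column names, and it assumes that a missing parish of death leaves the dummy missing; the handling of missing values in our actual pipeline is not shown here.

```r
# Hypothetical records (parish names are placeholders)
records <- data.frame(
  id = 1:4,
  parish_birth = c("A", "A", "B", "C"),
  parish_death = c("A", "B", "B", NA),
  stringsAsFactors = FALSE
)

# migrated is TRUE when the two parishes differ; an unknown parish of death
# leaves the dummy NA (assumption made for this sketch)
records$migrated <- records$parish_death != records$parish_birth
records
```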

Robustness check comparison

Here we show the effect of paternal age for each episode.


In reference to m3, the main reported model, the robustness models were implemented as follows: r1 relaxed exclusion criteria (not in 20th-century Sweden), r2 adjusted only for birth cohort and average paternal age in the family, r3 adjusted for birth order as a continuous variable, r4 adjusted for number of dependent siblings instead of birth order, r5 interacted birth order with number of siblings, r6 did not adjust for birth order, r7 adjusted only for parental loss in the first 5 years, r8 adjusted for being the first-/last-born adult son, r9 adjusted for a continuous nonlinear thin-plate spline for birth year instead of 5-year bins, r10 added a group-level slope for paternal age, r11 included separate group-level effects for each parent instead of one per marriage, r12 added a moderation by anchor sex, r13 adjusted for paternal age at first birth, r14 compared a model with linear group fixed effects, r15 added a moderator by region and group-level effects by church parish (not in 20th-century Sweden), r16 was restricted to Skellefteå (only in historical Sweden), r17 simulated Down syndrome cases, r18 reversed hurdle Poisson and Poisson distribution for the respective populations, r19 used a normal distribution, r20 did not adjust for maternal age, r21 adjusted for maternal age as a continuous variable, r22 relaxed exclusion criteria and included 30 more years of birth cohorts, allowing for more potential censoring, r23 used Student’s t distributions for population-level priors and half-Cauchy priors for the family variance component, r24 used noninformative priors, which should lead to results comparable with maximum likelihood, r25 controlled for migration status (not in 20th-century Sweden).

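library(dplyr)    # bind_rows, mutate, %>%
library(stringr)  # str_match

max_r = 25
m3 = readRDS(make_path("m3_children_linear"))
# lstype() is a project helper, assumed to return the names of all brmsfit objects
rob_checks = lstype("brmsfit")
robustness = data.frame()
for (i in seq_along(rob_checks)) {
    # keep the third row of the effect summary for this model
    chk = paternal_age_10y_effect(get(rob_checks[i]))[3,]
    chk$model = rob_checks[i]
    # extract the short label (e.g. "r12" or "m3") from the object name
    chk$robustness_analysis = str_match(rob_checks[i], "\\b([rm]\\d+)")[,2]
    robustness = bind_rows(robustness, chk)
}
# parse the interval bounds from strings of the form "[lower;upper]"
robustness = robustness %>% mutate(
    median_estimate = as.numeric(median_estimate),
    lower95 = as.numeric(str_match(ci_95, "\\[(-?[0-9.]+);")[,2]),
    upper95 = as.numeric(str_match(ci_95, ";(-?[0-9.]+)\\]")[,2])
)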

# order the analyses from r25 down to r1, followed by the main model m3
ggplot(robustness %>% mutate(robustness_analysis = factor(robustness_analysis, levels = c(paste0("r", max_r:1), "m3"))),
       aes(x = robustness_analysis, y = median_estimate, ymin = lower95, ymax = upper95)) +
    geom_hline(yintercept = 0, linetype = 'dashed') +
    geom_pointrange() +
    geom_text(aes(label = robustness_analysis, group = effect), vjust = -0.8) +
    xlab("Robustness analysis") +
    ylab("Percentage change in outcome by paternal age") +
    theme(axis.ticks.y = element_blank(), axis.text.y = element_blank())

saveRDS(robustness, file = make_path("robustness"))