Sweden (1947-2000), main models

Loading details

source("0__helpers.R")
opts_chunk$set(warning=TRUE, cache=F,tidy=FALSE,dev=c('png'),fig.width=20,fig.height=12.5)

make_path = function(file) {
    get_coefficient_path(file, "swed")
} 
# options for each chunk calling knit_child
opts_chunk$set(warning=FALSE, message = FALSE, echo = FALSE)

Analysis description

Data subset

The swed_subset_survival.1 dataset contains only those participants where paternal age is known and birth years are from 1969 to 2000. We use it for the analyses pertaining to survival. We randomly sampled 100,000 families from the whole dataset.

The swed.1 dataset contains all those participants where paternal age is known and birth years are from 1947 to 1959. We use it for the analyses pertaining to marriage, divorce and reproductive success.

Model description

All of the following models have the following in common:

Estimation

We fit all models using brms v. 1.2.0, a Bayesian regression analysis statistical package. brms uses Stan, a probabilistic programming langugage to fit models using Hamiltonian Monte Carlo.

Covariates

We adjusted for average paternal age within families to isolate the effect of paternal age differences between siblings. We further adjusted for birth cohort in five-year groupings (small groupings at the edge of the range were lumped) to account for secular changes in mortality and fertility, as well as residual censoring. We adjusted for parental deaths in the first 45 years of life to remove effects related to orphanhood and parental senescence (in categories of 0-1, 2-5, 6-10, …, 45+, unknown) for both parents separately. Parental loss at 45+ served as the reference category. We adjusted for maternal age (up to 20, 21-34, 35+), which we binned to reduce multicollinearity with paternal age and to capture nonlinear effects. A maternal age of 21-34 served as the reference category. We also adjusted for number of siblings continuously, number of older siblings (0-5, 5+), and being born last. Being first-born served as the reference category.

Model stratification

We added random intercepts for each family (father-mother dyad). We then controlled for the average paternal age in the family. Hence, the paternal age effects in the plot are split into those between families and those within families or between siblings. We are interested in the effect of paternal age between siblings, as this effect cannot be explained by e.g. genetic propensities of the father to reproduce late.

Priors

We used weakly informative priors that are documented in each model summary.

m1: No sibling comparison

Here, we ignore the pedigree structure of the data to see whether it matters for the estimation of the paternal age effect.

Model summary

Full summary

model_summary = summary(model, use_cache = FALSE, priors = TRUE)
print(model_summary)
##  Family: poisson(log) 
## Formula: children ~ paternalage + birth_cohort + male + maternalage.factor + paternal_loss + maternal_loss + older_siblings + nr.siblings + last_born 
##    Data: model_data (Number of observations: 1408177) 
## Samples: 6 chains, each with iter = 800; warmup = 300; thin = 1; 
##          total post-warmup samples = 3000
##     ICs: LOO = Not computed; WAIC = Not computed
##  
## Priors: 
## b ~ normal(0,5)
## 
## Population-Level Effects: 
##                        Estimate Est.Error l-95% CI u-95% CI Eff.Sample
## Intercept                  0.73      0.00     0.72     0.74       3000
## paternalage               -0.05      0.00    -0.05    -0.04       3000
## birth_cohort1950M1955      0.00      0.00     0.00     0.00       3000
## birth_cohort1955M1960      0.00      0.00     0.00     0.01       3000
## male1                     -0.06      0.00    -0.07    -0.06       3000
## maternalage.factor1420     0.06      0.00     0.05     0.06       3000
## maternalage.factor3561     0.00      0.00    -0.01     0.00       3000
## paternal_loss01            0.11      0.05     0.00     0.22       1327
## paternal_loss15            0.03      0.02     0.00     0.07       1893
## paternal_loss510          -0.02      0.01    -0.03     0.00       2062
## paternal_loss1015          0.00      0.01    -0.01     0.01       3000
## paternal_loss1520          0.00      0.00     0.00     0.01       3000
## paternal_loss2025          0.00      0.00    -0.01     0.00       3000
## paternal_loss2530          0.00      0.00     0.00     0.01       3000
## paternal_loss3035          0.00      0.00     0.00     0.00       3000
## paternal_loss3540          0.00      0.00    -0.01     0.00       3000
## paternal_loss4045         -0.01      0.00    -0.01     0.00       3000
## paternal_lossunclear      -0.06      0.00    -0.06    -0.05       3000
## maternal_loss01           -0.20      0.07    -0.34    -0.06       1665
## maternal_loss15           -0.06      0.03    -0.11     0.00       1497
## maternal_loss510          -0.01      0.01    -0.04     0.02       1759
## maternal_loss1015         -0.01      0.01    -0.02     0.01       2587
## maternal_loss1520          0.00      0.01    -0.02     0.01       2838
## maternal_loss2025          0.01      0.00     0.00     0.02       3000
## maternal_loss2530          0.01      0.00     0.00     0.01       3000
## maternal_loss3035          0.01      0.00     0.00     0.01       3000
## maternal_loss3540          0.00      0.00    -0.01     0.00       3000
## maternal_loss4045          0.00      0.00    -0.01     0.01       3000
## maternal_lossunclear      -0.02      0.00    -0.02    -0.02       3000
## older_siblings1            0.02      0.00     0.01     0.02       3000
## older_siblings2            0.02      0.00     0.01     0.02       3000
## older_siblings3            0.01      0.00     0.00     0.02       3000
## older_siblings4           -0.01      0.00    -0.02     0.00       3000
## older_siblings5P          -0.06      0.01    -0.07    -0.05       2350
## nr.siblings                0.04      0.00     0.04     0.04       2470
## last_born1                 0.01      0.00     0.01     0.01       3000
##                        Rhat
## Intercept                 1
## paternalage               1
## birth_cohort1950M1955     1
## birth_cohort1955M1960     1
## male1                     1
## maternalage.factor1420    1
## maternalage.factor3561    1
## paternal_loss01           1
## paternal_loss15           1
## paternal_loss510          1
## paternal_loss1015         1
## paternal_loss1520         1
## paternal_loss2025         1
## paternal_loss2530         1
## paternal_loss3035         1
## paternal_loss3540         1
## paternal_loss4045         1
## paternal_lossunclear      1
## maternal_loss01           1
## maternal_loss15           1
## maternal_loss510          1
## maternal_loss1015         1
## maternal_loss1520         1
## maternal_loss2025         1
## maternal_loss2530         1
## maternal_loss3035         1
## maternal_loss3540         1
## maternal_loss4045         1
## maternal_lossunclear      1
## older_siblings1           1
## older_siblings2           1
## older_siblings3           1
## older_siblings4           1
## older_siblings5P          1
## nr.siblings               1
## last_born1                1
## 
## Samples were drawn using sampling(NUTS). For each parameter, Eff.Sample 
## is a crude measure of effective sample size, and Rhat is the potential 
## scale reduction factor on split chains (at convergence, Rhat = 1).

Table of fixed effects

Estimates are exp(b). When they are referring to the hurdle (hu) component, or a dichotomous outcome, they are odds ratios, when they are referring to a Poisson component, they are hazard ratios. In both cases, they are presented with 95% credibility intervals. To see the effects on the response scale (probability or number of children), consult the marginal effect plots.

fixed_eff = data.frame(model_summary$fixed, check.names = F)
fixed_eff$Est.Error = fixed_eff$Eff.Sample = fixed_eff$Rhat = NULL
fixed_eff$`Odds/hazard ratio` = exp(fixed_eff$Estimate)
fixed_eff$`OR/HR low 95%` = exp(fixed_eff$`l-95% CI`)
fixed_eff$`OR/HR high 95%` = exp(fixed_eff$`u-95% CI`)
fixed_eff = fixed_eff %>% select(`Odds/hazard ratio`, `OR/HR low 95%`, `OR/HR high 95%`)
pander::pander(fixed_eff)
  Odds/hazard ratio OR/HR low 95% OR/HR high 95%
Intercept 2.08 2.062 2.098
paternalage 0.954 0.9517 0.9564
birth_cohort1950M1955 1.001 0.9975 1.004
birth_cohort1955M1960 1.002 0.9983 1.005
male1 0.9384 0.9361 0.9406
maternalage.factor1420 1.059 1.054 1.063
maternalage.factor3561 0.9987 0.9944 1.003
paternal_loss01 1.121 1.003 1.244
paternal_loss15 1.032 0.9957 1.07
paternal_loss510 0.9834 0.9665 1.001
paternal_loss1015 0.9961 0.9855 1.006
paternal_loss1520 1.005 0.997 1.013
paternal_loss2025 0.9983 0.9916 1.005
paternal_loss2530 1.003 0.998 1.009
paternal_loss3035 1 0.9954 1.005
paternal_loss3540 0.996 0.9915 1
paternal_loss4045 0.9915 0.9871 0.9957
paternal_lossunclear 0.9462 0.943 0.9495
maternal_loss01 0.8221 0.7097 0.9435
maternal_loss15 0.9462 0.893 0.9998
maternal_loss510 0.9905 0.9641 1.017
maternal_loss1015 0.9939 0.9786 1.01
maternal_loss1520 0.9953 0.9827 1.007
maternal_loss2025 1.009 0.9988 1.019
maternal_loss2530 1.006 0.9973 1.014
maternal_loss3035 1.006 0.9985 1.013
maternal_loss3540 0.996 0.9903 1.002
maternal_loss4045 0.9994 0.9936 1.005
maternal_lossunclear 0.9798 0.9768 0.9829
older_siblings1 1.016 1.012 1.019
older_siblings2 1.019 1.015 1.023
older_siblings3 1.011 1.004 1.017
older_siblings4 0.9943 0.9849 1.003
older_siblings5P 0.9402 0.9301 0.9506
nr.siblings 1.039 1.038 1.04
last_born1 1.011 1.009 1.014

Paternal age effect

pander::pander(paternal_age_10y_effect(model))
effect median_estimate ci_95 ci_80
estimate father 25y 1.92 [1.91;1.93] [1.92;1.93]
estimate father 35y 1.83 [1.83;1.84] [1.83;1.84]
percentage change -4.60 [-4.83;-4.36] [-4.75;-4.44]
OR/IRR 0.95 [0.95;0.96] [0.95;0.96]

Marginal effect plots

In these marginal effect plots, we set all predictors except the one shown on the X axis to their mean and in the case of factors to their reference level. We then plot the estimated association between the X axis predictor and the outcome on the response scale (e.g. probability of survival/marriage or number of children).

plot.brmsMarginalEffects_shades(
    x = marginal_effects(model, re_formula = NA, probs = c(0.025,0.975)),
    y = marginal_effects(model, re_formula = NA, probs = c(0.1,0.9)), 
    ask = FALSE)

plot of chunk unnamed-chunk-12plot of chunk unnamed-chunk-12plot of chunk unnamed-chunk-12plot of chunk unnamed-chunk-12plot of chunk unnamed-chunk-12plot of chunk unnamed-chunk-12plot of chunk unnamed-chunk-12plot of chunk unnamed-chunk-12plot of chunk unnamed-chunk-12

Coefficient plot

Here, we plotted the 95% posterior densities for the unexponentiated model coefficients (b_). The darkly shaded area represents the 50% credibility interval, the dark line represent the posterior mean estimate.

mcmc_areas(as.matrix(model$fit), regex_pars = "b_[^I]", point_est = "mean", prob = 0.50, prob_outer = 0.95) + ggtitle("Posterior densities with means and 50% intervals") + analysis_theme + theme(axis.text = element_text(size = 12), panel.grid = element_blank()) + xlab("Coefficient size")

plot of chunk unnamed-chunk-13

Diagnostics

These plots were made to diagnose misfit and nonconvergence.

Posterior predictive checks

In posterior predictive checks, we test whether we can approximately reproduce the real data distribution from our model.

brms::pp_check(model, re_formula = NA, type = "dens_overlay")

plot of chunk unnamed-chunk-14

brms::pp_check(model, re_formula = NA, type = "hist")

plot of chunk unnamed-chunk-14

Rhat

Did the 6 chains converge?

stanplot(model, pars = "^b_[^I]", type = 'rhat')

plot of chunk unnamed-chunk-15

Effective sample size over average sample size

stanplot(model, pars = "^b", type = 'neff_hist')

plot of chunk unnamed-chunk-16

Trace plots

Trace plots are only shown in the case of nonconvergence.

if(any( summary(model)$fixed[,"Rhat"] > 1.1)) { # only do traceplots if not converged
    plot(model, N = 3, ask = FALSE)
}

File/cluster script name

This model was stored in the file: coefs/swed/m1_children_linear_noranef.rds.

Click the following link to see the script used to generate this model:

opts_chunk$set(echo = FALSE)
clusterscript = str_replace(basename(model_filename), "\\.rds",".html")
cat("[Cluster script](" , clusterscript, ")", sep = "")

Cluster script

m2: Sibling comparison, no paternal age effect

Here, we compared siblings by including a random intercept for the family, but we modelled no effect for paternal age differences among siblings.

Model summary

Full summary

model_summary = summary(model, use_cache = FALSE, priors = TRUE)
print(model_summary)
##  Family: poisson(log) 
## Formula: children ~ birth_cohort + male + maternalage.factor + paternalage.mean + paternal_loss + maternal_loss + older_siblings + nr.siblings + last_born + (1 | idParents) 
##    Data: model_data (Number of observations: 1408177) 
## Samples: 6 chains, each with iter = 1500; warmup = 1000; thin = 1; 
##          total post-warmup samples = 3000
##     ICs: LOO = Not computed; WAIC = Not computed
##  
## Priors: 
## b ~ normal(0,5)
## sd ~ student_t(3, 0, 5)
## 
## Group-Level Effects: 
## ~idParents (Number of levels: 884975) 
##               Estimate Est.Error l-95% CI u-95% CI Eff.Sample Rhat
## sd(Intercept)        0         0        0     0.01         67 1.07
## 
## Population-Level Effects: 
##                        Estimate Est.Error l-95% CI u-95% CI Eff.Sample
## Intercept                  0.73      0.00     0.72     0.74       3000
## birth_cohort1950M1955      0.00      0.00     0.00     0.00       3000
## birth_cohort1955M1960      0.00      0.00     0.00     0.00       3000
## male1                     -0.06      0.00    -0.07    -0.06       3000
## maternalage.factor1420     0.06      0.00     0.05     0.06       3000
## maternalage.factor3561    -0.01      0.00    -0.01    -0.01       3000
## paternalage.mean          -0.04      0.00    -0.05    -0.04       3000
## paternal_loss01            0.11      0.05     0.02     0.21       3000
## paternal_loss15            0.03      0.02    -0.01     0.06       3000
## paternal_loss510          -0.02      0.01    -0.04     0.00       3000
## paternal_loss1015         -0.01      0.01    -0.02     0.00       3000
## paternal_loss1520          0.00      0.00    -0.01     0.01       3000
## paternal_loss2025          0.00      0.00    -0.01     0.00       3000
## paternal_loss2530          0.00      0.00     0.00     0.01       3000
## paternal_loss3035          0.00      0.00    -0.01     0.00       3000
## paternal_loss3540         -0.01      0.00    -0.01     0.00       3000
## paternal_loss4045         -0.01      0.00    -0.01     0.00       3000
## paternal_lossunclear      -0.05      0.00    -0.06    -0.05       3000
## maternal_loss01           -0.20      0.07    -0.34    -0.06       3000
## maternal_loss15           -0.06      0.03    -0.11     0.00       3000
## maternal_loss510          -0.01      0.01    -0.04     0.01       3000
## maternal_loss1015         -0.01      0.01    -0.02     0.01       3000
## maternal_loss1520          0.00      0.01    -0.02     0.01       3000
## maternal_loss2025          0.01      0.01     0.00     0.02       3000
## maternal_loss2530          0.01      0.00     0.00     0.01       3000
## maternal_loss3035          0.01      0.00     0.00     0.01       3000
## maternal_loss3540          0.00      0.00    -0.01     0.00       3000
## maternal_loss4045          0.00      0.00    -0.01     0.00       3000
## maternal_lossunclear      -0.02      0.00    -0.02    -0.02       3000
## older_siblings1            0.00      0.00     0.00     0.00       3000
## older_siblings2           -0.01      0.00    -0.02    -0.01       3000
## older_siblings3           -0.03      0.00    -0.04    -0.03       3000
## older_siblings4           -0.06      0.00    -0.07    -0.05       3000
## older_siblings5P          -0.14      0.01    -0.15    -0.12       3000
## nr.siblings                0.05      0.00     0.04     0.05       2555
## last_born1                 0.01      0.00     0.01     0.02       2871
##                        Rhat
## Intercept                 1
## birth_cohort1950M1955     1
## birth_cohort1955M1960     1
## male1                     1
## maternalage.factor1420    1
## maternalage.factor3561    1
## paternalage.mean          1
## paternal_loss01           1
## paternal_loss15           1
## paternal_loss510          1
## paternal_loss1015         1
## paternal_loss1520         1
## paternal_loss2025         1
## paternal_loss2530         1
## paternal_loss3035         1
## paternal_loss3540         1
## paternal_loss4045         1
## paternal_lossunclear      1
## maternal_loss01           1
## maternal_loss15           1
## maternal_loss510          1
## maternal_loss1015         1
## maternal_loss1520         1
## maternal_loss2025         1
## maternal_loss2530         1
## maternal_loss3035         1
## maternal_loss3540         1
## maternal_loss4045         1
## maternal_lossunclear      1
## older_siblings1           1
## older_siblings2           1
## older_siblings3           1
## older_siblings4           1
## older_siblings5P          1
## nr.siblings               1
## last_born1                1
## 
## Samples were drawn using sampling(NUTS). For each parameter, Eff.Sample 
## is a crude measure of effective sample size, and Rhat is the potential 
## scale reduction factor on split chains (at convergence, Rhat = 1).

Table of fixed effects

Estimates are exp(b). When they are referring to the hurdle (hu) component, or a dichotomous outcome, they are odds ratios, when they are referring to a Poisson component, they are hazard ratios. In both cases, they are presented with 95% credibility intervals. To see the effects on the response scale (probability or number of children), consult the marginal effect plots.

fixed_eff = data.frame(model_summary$fixed, check.names = F)
fixed_eff$Est.Error = fixed_eff$Eff.Sample = fixed_eff$Rhat = NULL
fixed_eff$`Odds/hazard ratio` = exp(fixed_eff$Estimate)
fixed_eff$`OR/HR low 95%` = exp(fixed_eff$`l-95% CI`)
fixed_eff$`OR/HR high 95%` = exp(fixed_eff$`u-95% CI`)
fixed_eff = fixed_eff %>% select(`Odds/hazard ratio`, `OR/HR low 95%`, `OR/HR high 95%`)
pander::pander(fixed_eff)
  Odds/hazard ratio OR/HR low 95% OR/HR high 95%
Intercept 2.067 2.049 2.086
birth_cohort1950M1955 0.9999 0.9968 1.003
birth_cohort1955M1960 0.9999 0.9965 1.003
male1 0.9384 0.936 0.9407
maternalage.factor1420 1.06 1.056 1.065
maternalage.factor3561 0.9907 0.9867 0.9949
paternalage.mean 0.9565 0.9541 0.9588
paternal_loss01 1.122 1.015 1.24
paternal_loss15 1.029 0.9913 1.067
paternal_loss510 0.9805 0.9632 0.9976
paternal_loss1015 0.9937 0.9832 1.004
paternal_loss1520 1.003 0.9947 1.011
paternal_loss2025 0.9964 0.99 1.003
paternal_loss2530 1.001 0.9959 1.007
paternal_loss3035 0.9989 0.9941 1.004
paternal_loss3540 0.9949 0.9905 0.9994
paternal_loss4045 0.9906 0.9863 0.9951
paternal_lossunclear 0.9469 0.9437 0.95
maternal_loss01 0.8222 0.7143 0.945
maternal_loss15 0.9445 0.8925 0.9979
maternal_loss510 0.9893 0.9641 1.015
maternal_loss1015 0.9936 0.9782 1.008
maternal_loss1520 0.9952 0.9835 1.007
maternal_loss2025 1.008 0.9979 1.019
maternal_loss2530 1.005 0.9968 1.014
maternal_loss3035 1.005 0.9981 1.012
maternal_loss3540 0.9957 0.99 1.002
maternal_loss4045 0.999 0.9936 1.004
maternal_lossunclear 0.9809 0.978 0.9838
older_siblings1 0.9999 0.9969 1.003
older_siblings2 0.9884 0.984 0.9926
older_siblings3 0.9681 0.9619 0.9742
older_siblings4 0.9417 0.9331 0.9507
older_siblings5P 0.8737 0.8643 0.8833
nr.siblings 1.046 1.045 1.047
last_born1 1.015 1.012 1.017

Paternal age effect

pander::pander(paternal_age_10y_effect(model))

Marginal effect plots

In these marginal effect plots, we set all predictors except the one shown on the X axis to their mean and in the case of factors to their reference level. We then plot the estimated association between the X axis predictor and the outcome on the response scale (e.g. probability of survival/marriage or number of children).

plot.brmsMarginalEffects_shades(
    x = marginal_effects(model, re_formula = NA, probs = c(0.025,0.975)),
    y = marginal_effects(model, re_formula = NA, probs = c(0.1,0.9)), 
    ask = FALSE)

plot of chunk unnamed-chunk-23plot of chunk unnamed-chunk-23plot of chunk unnamed-chunk-23plot of chunk unnamed-chunk-23plot of chunk unnamed-chunk-23plot of chunk unnamed-chunk-23plot of chunk unnamed-chunk-23plot of chunk unnamed-chunk-23plot of chunk unnamed-chunk-23

Coefficient plot

Here, we plotted the 95% posterior densities for the unexponentiated model coefficients (b_). The darkly shaded area represents the 50% credibility interval, the dark line represent the posterior mean estimate.

mcmc_areas(as.matrix(model$fit), regex_pars = "b_[^I]", point_est = "mean", prob = 0.50, prob_outer = 0.95) + ggtitle("Posterior densities with means and 50% intervals") + analysis_theme + theme(axis.text = element_text(size = 12), panel.grid = element_blank()) + xlab("Coefficient size")

plot of chunk unnamed-chunk-24

Diagnostics

These plots were made to diagnose misfit and nonconvergence.

Posterior predictive checks

In posterior predictive checks, we test whether we can approximately reproduce the real data distribution from our model.

brms::pp_check(model, re_formula = NA, type = "dens_overlay")

plot of chunk unnamed-chunk-25

brms::pp_check(model, re_formula = NA, type = "hist")

plot of chunk unnamed-chunk-25

Rhat

Did the 6 chains converge?

stanplot(model, pars = "^b_[^I]", type = 'rhat')

plot of chunk unnamed-chunk-26

Effective sample size over average sample size

stanplot(model, pars = "^b", type = 'neff_hist')

plot of chunk unnamed-chunk-27

Trace plots

Trace plots are only shown in the case of nonconvergence.

if(any( summary(model)$fixed[,"Rhat"] > 1.1)) { # only do traceplots if not converged
    plot(model, N = 3, ask = FALSE)
}

File/cluster script name

This model was stored in the file: coefs/swed/m2_children_linear_no_diff.rds.

Click the following link to see the script used to generate this model:

opts_chunk$set(echo = FALSE)
clusterscript = str_replace(basename(model_filename), "\\.rds",".html")
cat("[Cluster script](" , clusterscript, ")", sep = "")

Cluster script

m3: Sibling comparison, linear paternal age effect

Here, we compared siblings by including a random intercept for the family, and we modelled a linear effect for paternal age differences among siblings.

Model summary

Full summary

model_summary = summary(model, use_cache = FALSE, priors = TRUE)
## Warning: There were 2 divergent transitions after warmup. Increasing adapt_delta above 0.8 may help.
## See http://mc-stan.org/misc/warnings.html#divergent-transitions-after-warmup
print(model_summary)
##  Family: poisson(log) 
## Formula: children ~ paternalage + birth_cohort + male + maternalage.factor + paternalage.mean + paternal_loss + maternal_loss + older_siblings + nr.siblings + last_born + (1 | idParents) 
##    Data: model_data (Number of observations: 1408177) 
## Samples: 6 chains, each with iter = 800; warmup = 300; thin = 1; 
##          total post-warmup samples = 3000
##     ICs: LOO = Not computed; WAIC = Not computed
##  
## Priors: 
## b ~ normal(0,5)
## sd ~ student_t(3, 0, 5)
## 
## Group-Level Effects: 
## ~idParents (Number of levels: 884975) 
##               Estimate Est.Error l-95% CI u-95% CI Eff.Sample Rhat
## sd(Intercept)        0         0        0     0.01        169 1.02
## 
## Population-Level Effects: 
##                        Estimate Est.Error l-95% CI u-95% CI Eff.Sample
## Intercept                  0.73      0.00     0.72     0.74       3000
## paternalage               -0.05      0.00    -0.06    -0.05       3000
## birth_cohort1950M1955      0.00      0.00     0.00     0.00       3000
## birth_cohort1955M1960      0.00      0.00     0.00     0.01       3000
## male1                     -0.06      0.00    -0.07    -0.06       3000
## maternalage.factor1420     0.06      0.00     0.05     0.06       3000
## maternalage.factor3561     0.00      0.00    -0.01     0.00       3000
## paternalage.mean           0.01      0.00     0.00     0.01       3000
## paternal_loss01            0.11      0.05     0.01     0.22       3000
## paternal_loss15            0.03      0.02    -0.01     0.07       3000
## paternal_loss510          -0.02      0.01    -0.03     0.00       3000
## paternal_loss1015          0.00      0.01    -0.01     0.01       3000
## paternal_loss1520          0.00      0.00     0.00     0.01       3000
## paternal_loss2025          0.00      0.00    -0.01     0.00       3000
## paternal_loss2530          0.00      0.00     0.00     0.01       3000
## paternal_loss3035          0.00      0.00     0.00     0.01       3000
## paternal_loss3540          0.00      0.00    -0.01     0.00       3000
## paternal_loss4045         -0.01      0.00    -0.01     0.00       3000
## paternal_lossunclear      -0.06      0.00    -0.06    -0.05       3000
## maternal_loss01           -0.20      0.07    -0.34    -0.06       3000
## maternal_loss15           -0.06      0.03    -0.11     0.00       3000
## maternal_loss510          -0.01      0.01    -0.04     0.02       3000
## maternal_loss1015         -0.01      0.01    -0.02     0.01       3000
## maternal_loss1520          0.00      0.01    -0.02     0.01       3000
## maternal_loss2025          0.01      0.01     0.00     0.02       3000
## maternal_loss2530          0.01      0.00     0.00     0.01       3000
## maternal_loss3035          0.01      0.00     0.00     0.01       3000
## maternal_loss3540          0.00      0.00    -0.01     0.00       3000
## maternal_loss4045          0.00      0.00    -0.01     0.00       3000
## maternal_lossunclear      -0.02      0.00    -0.02    -0.02       3000
## older_siblings1            0.02      0.00     0.01     0.02       3000
## older_siblings2            0.02      0.00     0.02     0.03       3000
## older_siblings3            0.02      0.00     0.01     0.03       3000
## older_siblings4            0.00      0.01    -0.01     0.01       3000
## older_siblings5P          -0.05      0.01    -0.07    -0.04       3000
## nr.siblings                0.04      0.00     0.04     0.04       3000
## last_born1                 0.01      0.00     0.01     0.01       3000
##                        Rhat
## Intercept                 1
## paternalage               1
## birth_cohort1950M1955     1
## birth_cohort1955M1960     1
## male1                     1
## maternalage.factor1420    1
## maternalage.factor3561    1
## paternalage.mean          1
## paternal_loss01           1
## paternal_loss15           1
## paternal_loss510          1
## paternal_loss1015         1
## paternal_loss1520         1
## paternal_loss2025         1
## paternal_loss2530         1
## paternal_loss3035         1
## paternal_loss3540         1
## paternal_loss4045         1
## paternal_lossunclear      1
## maternal_loss01           1
## maternal_loss15           1
## maternal_loss510          1
## maternal_loss1015         1
## maternal_loss1520         1
## maternal_loss2025         1
## maternal_loss2530         1
## maternal_loss3035         1
## maternal_loss3540         1
## maternal_loss4045         1
## maternal_lossunclear      1
## older_siblings1           1
## older_siblings2           1
## older_siblings3           1
## older_siblings4           1
## older_siblings5P          1
## nr.siblings               1
## last_born1                1
## 
## Samples were drawn using sampling(NUTS). For each parameter, Eff.Sample 
## is a crude measure of effective sample size, and Rhat is the potential 
## scale reduction factor on split chains (at convergence, Rhat = 1).

Table of fixed effects

Estimates are exp(b). When they are referring to the hurdle (hu) component, or a dichotomous outcome, they are odds ratios, when they are referring to a Poisson component, they are hazard ratios. In both cases, they are presented with 95% credibility intervals. To see the effects on the response scale (probability or number of children), consult the marginal effect plots.

fixed_eff = data.frame(model_summary$fixed, check.names = F)
fixed_eff$Est.Error = fixed_eff$Eff.Sample = fixed_eff$Rhat = NULL
fixed_eff$`Odds/hazard ratio` = exp(fixed_eff$Estimate)
fixed_eff$`OR/HR low 95%` = exp(fixed_eff$`l-95% CI`)
fixed_eff$`OR/HR high 95%` = exp(fixed_eff$`u-95% CI`)
fixed_eff = fixed_eff %>% select(`Odds/hazard ratio`, `OR/HR low 95%`, `OR/HR high 95%`)
pander::pander(fixed_eff)
  Odds/hazard ratio OR/HR low 95% OR/HR high 95%
Intercept 2.077 2.06 2.096
paternalage 0.9485 0.941 0.9558
birth_cohort1950M1955 1.001 0.9977 1.004
birth_cohort1955M1960 1.002 0.9985 1.005
male1 0.9384 0.9361 0.9407
maternalage.factor1420 1.059 1.054 1.063
maternalage.factor3561 0.9994 0.9948 1.004
paternalage.mean 1.006 0.9981 1.014
paternal_loss01 1.122 1.011 1.241
paternal_loss15 1.032 0.9945 1.071
paternal_loss510 0.9835 0.9663 1.001
paternal_loss1015 0.9961 0.986 1.006
paternal_loss1520 1.005 0.9968 1.012
paternal_loss2025 0.9983 0.9919 1.004
paternal_loss2530 1.003 0.9978 1.009
paternal_loss3035 1 0.9953 1.005
paternal_loss3540 0.996 0.9916 1
paternal_loss4045 0.9915 0.9869 0.9961
paternal_lossunclear 0.9462 0.943 0.9493
maternal_loss01 0.8195 0.7099 0.9387
maternal_loss15 0.9455 0.8936 0.9998
maternal_loss510 0.9906 0.9648 1.017
maternal_loss1015 0.9942 0.9784 1.01
maternal_loss1520 0.9955 0.9837 1.007
maternal_loss2025 1.009 0.9988 1.019
maternal_loss2530 1.006 0.9972 1.014
maternal_loss3035 1.006 0.9986 1.013
maternal_loss3540 0.9961 0.9903 1.002
maternal_loss4045 0.9993 0.994 1.005
maternal_lossunclear 0.9798 0.9769 0.9826
older_siblings1 1.018 1.013 1.022
older_siblings2 1.023 1.016 1.03
older_siblings3 1.016 1.007 1.026
older_siblings4 1.001 0.988 1.014
older_siblings5P 0.949 0.9335 0.9652
nr.siblings 1.038 1.036 1.04
last_born1 1.011 1.008 1.014

Paternal age effect

pander::pander(paternal_age_10y_effect(model))
effect median_estimate ci_95 ci_80
estimate father 25y 1.93 [1.92;1.94] [1.92;1.93]
estimate father 35y 1.83 [1.82;1.84] [1.82;1.83]
percentage change -5.14 [-5.90;-4.42] [-5.63;-4.66]
OR/IRR 0.95 [0.94;0.96] [0.94;0.95]

Marginal effect plots

In these marginal effect plots, we set all predictors except the one shown on the X axis to their mean and in the case of factors to their reference level. We then plot the estimated association between the X axis predictor and the outcome on the response scale (e.g. probability of survival/marriage or number of children).

plot.brmsMarginalEffects_shades(
    x = marginal_effects(model, re_formula = NA, probs = c(0.025,0.975)),
    y = marginal_effects(model, re_formula = NA, probs = c(0.1,0.9)), 
    ask = FALSE)

plot of chunk unnamed-chunk-34plot of chunk unnamed-chunk-34plot of chunk unnamed-chunk-34plot of chunk unnamed-chunk-34plot of chunk unnamed-chunk-34plot of chunk unnamed-chunk-34plot of chunk unnamed-chunk-34plot of chunk unnamed-chunk-34plot of chunk unnamed-chunk-34plot of chunk unnamed-chunk-34

Coefficient plot

Here, we plotted the 95% posterior densities for the unexponentiated model coefficients (b_). The darkly shaded area represents the 50% credibility interval, the dark line represent the posterior mean estimate.

mcmc_areas(as.matrix(model$fit), regex_pars = "b_[^I]", point_est = "mean", prob = 0.50, prob_outer = 0.95) + ggtitle("Posterior densities with means and 50% intervals") + analysis_theme + theme(axis.text = element_text(size = 12), panel.grid = element_blank()) + xlab("Coefficient size")

plot of chunk unnamed-chunk-35

Diagnostics

These plots were made to diagnose misfit and nonconvergence.

Posterior predictive checks

In posterior predictive checks, we test whether we can approximately reproduce the real data distribution from our model.

brms::pp_check(model, re_formula = NA, type = "dens_overlay")

plot of chunk unnamed-chunk-36

brms::pp_check(model, re_formula = NA, type = "hist")

plot of chunk unnamed-chunk-36

Rhat

Did the 6 chains converge?

stanplot(model, pars = "^b_[^I]", type = 'rhat')

plot of chunk unnamed-chunk-37

Effective sample size over average sample size

stanplot(model, pars = "^b", type = 'neff_hist')

plot of chunk unnamed-chunk-38

Trace plots

Trace plots are only shown in the case of nonconvergence.

if(any( summary(model)$fixed[,"Rhat"] > 1.1)) { # only do traceplots if not converged
    plot(model, N = 3, ask = FALSE)
}

File/cluster script name

This model was stored in the file: coefs/swed/m3_children_linear.rds.

Click the following link to see the script used to generate this model:

opts_chunk$set(echo = FALSE)
clusterscript = str_replace(basename(model_filename), "\\.rds",".html")
cat("[Cluster script](" , clusterscript, ")", sep = "")

Cluster script

m4: Sibling comparison, nonlinear paternal age effect

Here, we compared siblings by including a random intercept for the family, and we modelled a possibly nonlinear effect for paternal age differences among siblings.

Model summary

Full summary

model_summary = summary(model, use_cache = FALSE, priors = TRUE)
## Warning: There were 89 divergent transitions after warmup. Increasing adapt_delta above 0.8 may help.
## See http://mc-stan.org/misc/warnings.html#divergent-transitions-after-warmup
print(model_summary)
##  Family: poisson(log) 
## Formula: children ~ s(paternalage) + birth_cohort + male + maternalage.factor + paternalage.mean + paternal_loss + maternal_loss + older_siblings + nr.siblings + last_born + (1 | idParents) 
##    Data: model_data (Number of observations: 1408177) 
## Samples: 6 chains, each with iter = 900; warmup = 450; thin = 1; 
##          total post-warmup samples = 2700
##     ICs: LOO = Not computed; WAIC = Not computed
##  
## Priors: 
## b ~ normal(0,5)
## sd ~ student_t(3, 0, 5)
## sds ~ student_t(3, 0, 10)
## 
## Smooth Terms: 
##                     Estimate Est.Error l-95% CI u-95% CI Eff.Sample Rhat
## sds(spaternalage_1)     0.13      0.06     0.06      0.3        175 1.03
## 
## Group-Level Effects: 
## ~idParents (Number of levels: 884975) 
##               Estimate Est.Error l-95% CI u-95% CI Eff.Sample Rhat
## sd(Intercept)        0         0        0     0.01        214 1.02
## 
## Population-Level Effects: 
##                        Estimate Est.Error l-95% CI u-95% CI Eff.Sample
## Intercept                  0.56      0.01     0.53     0.59       1892
## birth_cohort1950M1955      0.00      0.00     0.00     0.00       2700
## birth_cohort1955M1960      0.00      0.00     0.00     0.01       2700
## male1                     -0.06      0.00    -0.07    -0.06       1576
## maternalage.factor1420     0.05      0.00     0.04     0.05       2700
## maternalage.factor3561    -0.01      0.00    -0.01     0.00       2700
## paternalage.mean           0.01      0.00     0.00     0.02       1924
## paternal_loss01            0.10      0.05    -0.01     0.21       2700
## paternal_loss15            0.02      0.02    -0.01     0.06       2700
## paternal_loss510          -0.02      0.01    -0.04    -0.01       2700
## paternal_loss1015         -0.01      0.01    -0.02     0.00       2700
## paternal_loss1520          0.00      0.00    -0.01     0.01       2700
## paternal_loss2025         -0.01      0.00    -0.01     0.00       2700
## paternal_loss2530          0.00      0.00    -0.01     0.01       2700
## paternal_loss3035          0.00      0.00    -0.01     0.00       2700
## paternal_loss3540          0.00      0.00    -0.01     0.00       2700
## paternal_loss4045         -0.01      0.00    -0.01     0.00       2700
## paternal_lossunclear      -0.06      0.00    -0.06    -0.06       2700
## maternal_loss01           -0.20      0.07    -0.35    -0.05       2700
## maternal_loss15           -0.06      0.03    -0.12     0.00       2700
## maternal_loss510          -0.01      0.01    -0.04     0.01       2700
## maternal_loss1015         -0.01      0.01    -0.02     0.01       2700
## maternal_loss1520         -0.01      0.01    -0.02     0.01       2700
## maternal_loss2025          0.01      0.01     0.00     0.02       2700
## maternal_loss2530          0.00      0.00     0.00     0.01       2700
## maternal_loss3035          0.00      0.00     0.00     0.01       2700
## maternal_loss3540          0.00      0.00    -0.01     0.00       2700
## maternal_loss4045          0.00      0.00    -0.01     0.00       2700
## maternal_lossunclear      -0.02      0.00    -0.02    -0.02       2700
## older_siblings1            0.02      0.00     0.02     0.03       1799
## older_siblings2            0.03      0.00     0.02     0.04       1411
## older_siblings3            0.02      0.01     0.01     0.03       1872
## older_siblings4            0.01      0.01    -0.01     0.02       1342
## older_siblings5P          -0.05      0.01    -0.07    -0.03       1829
## nr.siblings                0.04      0.00     0.04     0.04       1829
## last_born1                 0.01      0.00     0.01     0.01       2166
## spaternalage_1            -0.04      0.02    -0.09     0.00        631
##                        Rhat
## Intercept              1.00
## birth_cohort1950M1955  1.00
## birth_cohort1955M1960  1.00
## male1                  1.00
## maternalage.factor1420 1.01
## maternalage.factor3561 1.00
## paternalage.mean       1.00
## paternal_loss01        1.00
## paternal_loss15        1.00
## paternal_loss510       1.00
## paternal_loss1015      1.01
## paternal_loss1520      1.01
## paternal_loss2025      1.00
## paternal_loss2530      1.00
## paternal_loss3035      1.00
## paternal_loss3540      1.00
## paternal_loss4045      1.01
## paternal_lossunclear   1.01
## maternal_loss01        1.00
## maternal_loss15        1.00
## maternal_loss510       1.00
## maternal_loss1015      1.01
## maternal_loss1520      1.00
## maternal_loss2025      1.00
## maternal_loss2530      1.00
## maternal_loss3035      1.00
## maternal_loss3540      1.00
## maternal_loss4045      1.00
## maternal_lossunclear   1.00
## older_siblings1        1.00
## older_siblings2        1.00
## older_siblings3        1.00
## older_siblings4        1.00
## older_siblings5P       1.00
## nr.siblings            1.00
## last_born1             1.00
## spaternalage_1         1.01
## 
## Samples were drawn using sampling(NUTS). For each parameter, Eff.Sample 
## is a crude measure of effective sample size, and Rhat is the potential 
## scale reduction factor on split chains (at convergence, Rhat = 1).

Table of fixed effects

Estimates are exp(b). When they are referring to the hurdle (hu) component, or a dichotomous outcome, they are odds ratios, when they are referring to a Poisson component, they are hazard ratios. In both cases, they are presented with 95% credibility intervals. To see the effects on the response scale (probability or number of children), consult the marginal effect plots.

fixed_eff = data.frame(model_summary$fixed, check.names = F)
fixed_eff$Est.Error = fixed_eff$Eff.Sample = fixed_eff$Rhat = NULL
fixed_eff$`Odds/hazard ratio` = exp(fixed_eff$Estimate)
fixed_eff$`OR/HR low 95%` = exp(fixed_eff$`l-95% CI`)
fixed_eff$`OR/HR high 95%` = exp(fixed_eff$`u-95% CI`)
fixed_eff = fixed_eff %>% select(`Odds/hazard ratio`, `OR/HR low 95%`, `OR/HR high 95%`)
pander::pander(fixed_eff)
  Odds/hazard ratio OR/HR low 95% OR/HR high 95%
Intercept 1.751 1.703 1.798
birth_cohort1950M1955 1.001 0.9982 1.004
birth_cohort1955M1960 1.003 0.9995 1.006
male1 0.9384 0.9362 0.9406
maternalage.factor1420 1.046 1.04 1.051
maternalage.factor3561 0.9919 0.9873 0.9965
paternalage.mean 1.008 0.9997 1.016
paternal_loss01 1.105 0.9922 1.233
paternal_loss15 1.025 0.9859 1.063
paternal_loss510 0.9766 0.9599 0.993
paternal_loss1015 0.9895 0.9796 1
paternal_loss1520 0.9994 0.9912 1.008
paternal_loss2025 0.994 0.9875 1.001
paternal_loss2530 1 0.9949 1.006
paternal_loss3035 0.9984 0.9935 1.003
paternal_loss3540 0.9954 0.9909 0.9996
paternal_loss4045 0.9915 0.987 0.996
paternal_lossunclear 0.9423 0.9393 0.9456
maternal_loss01 0.818 0.7053 0.9485
maternal_loss15 0.9428 0.8905 0.9974
maternal_loss510 0.9886 0.9638 1.015
maternal_loss1015 0.9935 0.9779 1.008
maternal_loss1520 0.995 0.983 1.007
maternal_loss2025 1.008 0.9977 1.018
maternal_loss2530 1.005 0.9963 1.013
maternal_loss3035 1.004 0.9972 1.011
maternal_loss3540 0.9953 0.9895 1.002
maternal_loss4045 0.9987 0.9932 1.004
maternal_lossunclear 0.9788 0.976 0.9817
older_siblings1 1.022 1.018 1.026
older_siblings2 1.029 1.021 1.036
older_siblings3 1.022 1.012 1.033
older_siblings4 1.006 0.9932 1.021
older_siblings5P 0.9521 0.9358 0.9686
nr.siblings 1.037 1.036 1.039
last_born1 1.011 1.009 1.014
spaternalage_1 0.9599 0.9169 0.9952

Paternal age effect

pander::pander(paternal_age_10y_effect(model))
effect median_estimate ci_95 ci_80
estimate father 25y 1.94 [1.93;1.95] [1.94;1.95]
estimate father 35y 1.81 [1.80;1.82] [1.80;1.82]
percentage change -6.84 [-7.66;-6.03] [-7.37;-6.31]
OR/IRR 0.96 [0.92;1.00] [0.94;0.98]

Marginal effect plots

In these marginal effect plots, we set all predictors except the one shown on the X axis to their mean and in the case of factors to their reference level. We then plot the estimated association between the X axis predictor and the outcome on the response scale (e.g. probability of survival/marriage or number of children).

plot.brmsMarginalEffects_shades(
    x = marginal_effects(model, re_formula = NA, probs = c(0.025,0.975)),
    y = marginal_effects(model, re_formula = NA, probs = c(0.1,0.9)), 
    ask = FALSE)

plot of chunk unnamed-chunk-45plot of chunk unnamed-chunk-45plot of chunk unnamed-chunk-45plot of chunk unnamed-chunk-45plot of chunk unnamed-chunk-45plot of chunk unnamed-chunk-45plot of chunk unnamed-chunk-45plot of chunk unnamed-chunk-45plot of chunk unnamed-chunk-45plot of chunk unnamed-chunk-45

Coefficient plot

Here, we plotted the 95% posterior densities for the unexponentiated model coefficients (b_). The darkly shaded area represents the 50% credibility interval, the dark line represent the posterior mean estimate.

mcmc_areas(as.matrix(model$fit), regex_pars = "b_[^I]", point_est = "mean", prob = 0.50, prob_outer = 0.95) + ggtitle("Posterior densities with means and 50% intervals") + analysis_theme + theme(axis.text = element_text(size = 12), panel.grid = element_blank()) + xlab("Coefficient size")

plot of chunk unnamed-chunk-46

Diagnostics

These plots were made to diagnose misfit and nonconvergence.

Posterior predictive checks

In posterior predictive checks, we test whether we can approximately reproduce the real data distribution from our model.

brms::pp_check(model, re_formula = NA, type = "dens_overlay")

plot of chunk unnamed-chunk-47

brms::pp_check(model, re_formula = NA, type = "hist")

plot of chunk unnamed-chunk-47

Rhat

Did the 6 chains converge?

stanplot(model, pars = "^b_[^I]", type = 'rhat')

plot of chunk unnamed-chunk-48

Effective sample size over average sample size

stanplot(model, pars = "^b", type = 'neff_hist')

plot of chunk unnamed-chunk-49

Trace plots

Trace plots are only shown in the case of nonconvergence.

if(any( summary(model)$fixed[,"Rhat"] > 1.1)) { # only do traceplots if not converged
    plot(model, N = 3, ask = FALSE)
}

File/cluster script name

This model was stored in the file: coefs/swed/m4_children_nonlinear.rds.

Click the following link to see the script used to generate this model:

opts_chunk$set(echo = FALSE)
clusterscript = str_replace(basename(model_filename), "\\.rds",".html")
cat("[Cluster script](" , clusterscript, ")", sep = "")

Cluster script

Model comparison

We compare the four models using an approximate leave-one-out cross-validation information criterion as implemented in brms and loo and the Watanabe-Akaike information criterion.

Approximate leave-one-out (LOO) cross-validation

  LOOIC SE
m1 415187 388.2
m2 415196 388
m3 415186 388.1
m4 415165 387.9
m1 - m2 -9.13 6.31
m1 - m3 0.93 0.42
m1 - m4 22.09 9.37
m2 - m3 10.07 6.47
m2 - m4 31.22 11.3
m3 - m4 21.15 9.36

Watanabe-Akaike information criterion

  WAIC SE
m1 415187 388.2
m2 415196 388
m3 415186 388.1
m4 415165 387.9
m1 - m2 -9.1 6.31
m1 - m3 0.89 0.42
m1 - m4 22.04 9.37
m2 - m3 9.98 6.47
m2 - m4 31.14 11.3
m3 - m4 21.16 9.36