This vignette shows what a codebook generated from a dataset with rich metadata looks like. The dataset contains mock data for a short German Big Five personality inventory and an age variable. It follows the format created by a survey data import, but data imported using the haven package carries similar metadata. You can also add such metadata yourself, or use the codebook package on unannotated datasets.
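For an unannotated dataset, one way to add such metadata yourself is the labelled package, whose `var_label()` is also used in the setup code of this vignette. The sketch below is illustrative only: the data frame `survey`, its columns, and the label texts are made up and are not part of the bfi data.

```r
library(labelled)

# Hypothetical unannotated data: one 5-point item and an age variable
survey <- data.frame(
  item_1 = c(1, 3, 5, 4, 2),
  age = c(23, 31, 45, 28, 36)
)

# Variable labels describe what each column measures
var_label(survey$item_1) <- "I see myself as someone who is talkative."
var_label(survey$age) <- "Age in years"

# Value labels map response codes to their wording
val_labels(survey$item_1) <- c(
  "Strongly disagree" = 1,
  "Strongly agree" = 5
)

# Inspect what was stored
var_label(survey$age)
val_labels(survey$item_1)
```

Only the end points of the response scale are labelled here, mirroring the way the items in this codebook label values 1 and 5.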

As you can see below, the codebook package automatically computes reliabilities for multi-item inventories, generates nicely labelled plots, and outputs summary statistics. The same information is also stored in a table, which you can export to various formats. Additionally, codebook can show you different kinds of (labelled) missing values and common missingness patterns. What you cannot see, but search engines can, is that the codebook package also generates JSON-LD metadata for the dataset. If you share your codebook as an HTML file online, this metadata should make it easier for others to find your data. See what Google sees here.
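As a rough illustration of exporting that table, the sketch below builds it with `codebook_table()` and writes it to CSV with base R. The output path is hypothetical, and the step that flattens list columns is a defensive assumption about the table's layout rather than something the package requires.

```r
library(codebook)

data("bfi", package = "codebook")

# One row per variable: labels, value labels, and summary statistics
tab <- as.data.frame(codebook_table(bfi))

# Defensive step (an assumption about the table's layout): flatten any
# list columns to text so that the table can be written as CSV
tab[] <- lapply(tab, function(col) {
  if (is.list(col)) {
    vapply(col, function(v) paste(v, collapse = ", "), character(1))
  } else {
    col
  }
})

# Hypothetical output path
out_file <- file.path(tempdir(), "bfi_codebook.csv")
write.csv(tab, out_file, row.names = FALSE)
```

Any other writer that accepts a data frame could replace `write.csv()` here.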

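library(codebook)  # provides the bfi data and metadata()
library(dplyr)     # provides %>%, select() and starts_with()
library(labelled)  # provides var_label()

knit_by_pkgdown <- !is.null(knitr::opts_chunk$get("fig.retina"))
knitr::opts_chunk$set(warning = FALSE, message = TRUE, error = FALSE)

data("bfi", package = 'codebook')
if (!knit_by_pkgdown) {
    # When not built by pkgdown, drop the extraversion items
    bfi <- bfi %>% select(-starts_with("BFIK_extra"))
}
# Add a simulated age variable with a German label ("Alter" means age)
bfi$age <- rpois(nrow(bfi), 30)
var_label(bfi$age) <- "Alter"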

By default, we only set the required metadata attributes, name and description, to sensible values. However, there are a number of other attributes you can set to describe the data better. Find out more.

metadata(bfi)$name <- "MOCK Big Five Inventory dataset (German metadata demo)"
metadata(bfi)$description <- "a small mock Big Five Inventory dataset"
metadata(bfi)$identifier <- "doi:10.5281/zenodo.1326520"
metadata(bfi)$datePublished <- "2016-06-01"
metadata(bfi)$creator <- list(
      "@type" = "Person",
      givenName = "Ruben", familyName = "Arslan",
      email = "", 
      affiliation = list("@type" = "Organization",
        name = "MPI Human Development, Berlin"))
metadata(bfi)$citation <- "Arslan (2016). Mock BFI data."
metadata(bfi)$url <- ""
metadata(bfi)$temporalCoverage <- "2016" 
metadata(bfi)$spatialCoverage <- "Goettingen, Germany" 
# We don't want to look at the code in the codebook.
knitr::opts_chunk$set(warning = TRUE, message = TRUE, echo = FALSE)



Dataset name: MOCK Big Five Inventory dataset (German metadata demo)

a small mock Big Five Inventory dataset

Metadata for search engines
| name | value |
| --- | --- |
| @type | Person |
| givenName | Ruben |
| familyName | Arslan |
| affiliation | Organization, MPI Human Development, Berlin |

Survey overview

28 completed rows, 28 who entered any information, 0 only viewed the first page. There are 0 expired rows (people who did not finish filling out in the requested time frame). In total, there are 28 rows including unfinished and expired rows.

There were 28 unique participants, of which 28 finished filling out at least one survey.

This survey was not repeated.

The first session started on 2016-07-08 09:54:16, the last session on 2016-11-02 21:19:50.

Figure: Starting date times

People took on average 127.36 minutes (median 1.48) to answer the survey.

Figure: Duration people took for answering the survey


Scale: BFIK_agree

Reliability: Cronbach’s α [95% CI] = 0.8 [0.68;0.92].
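To make the reliability figure more concrete, here is a minimal sketch of Cronbach's α computed from the textbook formula α = k/(k − 1) · (1 − Σ item variances / variance of the sum score), where k is the number of items. The helper `cronbach_alpha()` and the simulated data are hypothetical; this is not a claim about how the codebook package computes the value reported here.

```r
# Hypothetical helper: Cronbach's alpha for a matrix or data frame of items
cronbach_alpha <- function(items) {
  items <- as.matrix(items)
  k <- ncol(items)                      # number of items
  item_vars <- apply(items, 2, var)     # variance of each item
  total_var <- var(rowSums(items))      # variance of the sum score
  (k / (k - 1)) * (1 - sum(item_vars) / total_var)
}

# Simulated data: four items that share one underlying trait plus noise
set.seed(1)
trait <- rnorm(200)
items <- sapply(1:4, function(i) trait + rnorm(200))

alpha <- cronbach_alpha(items)
alpha
```

Items that measure the same trait with less noise push α towards 1, while unrelated items push it towards 0.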

Missing: 0.

Figure: Likert plot of scale BFIK_agree items

Figure: Distribution of scale BFIK_agree

95% Confidence Interval

| lower | estimate | upper |
| --- | --- | --- |
| 0.6815382 | 0.8005842 | 0.9196302 |

| raw_alpha | std.alpha | G6(smc) | average_r | S/N | ase | mean | sd | median_r |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 0.8005842 | 0.8032578 | 0.8025354 | 0.5051216 | 4.082794 | 0.0607377 | 3.116071 | 0.9316506 | 0.4955289 |

Reliability if an item is dropped:

|  | raw_alpha | std.alpha | G6(smc) | average_r | S/N | alpha se | var.r | med.r |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| BFIK_agree_4R | 0.7039106 | 0.7059214 | 0.6277253 | 0.4444909 | 2.400452 | 0.0922357 | 0.0093288 | 0.4566167 |
| BFIK_agree_1R | 0.7822898 | 0.7821901 | 0.7633114 | 0.5448449 | 3.591160 | 0.0731005 | 0.0413123 | 0.5344410 |
| BFIK_agree_3R | 0.6758982 | 0.6925559 | 0.6242359 | 0.4288568 | 2.252623 | 0.1072194 | 0.0212505 | 0.3469923 |
| BFIK_agree_2 | 0.8180575 | 0.8196006 | 0.7841113 | 0.6022937 | 4.543256 | 0.0568475 | 0.0219955 | 0.5971631 |

Item statistics

|  | n | raw.r | std.r | r.cor | r.drop | mean | sd |
| --- | --- | --- | --- | --- | --- | --- | --- |
| BFIK_agree_4R | 28 | 0.8471206 | 0.8503385 | 0.8291974 | 0.7057555 | 2.928571 | 1.184110 |
| BFIK_agree_1R | 28 | 0.7168171 | 0.7554255 | 0.6253942 | 0.5538583 | 3.000000 | 0.942809 |
| BFIK_agree_3R | 28 | 0.8820884 | 0.8651249 | 0.8439281 | 0.7510051 | 3.035714 | 1.290482 |
| BFIK_agree_2 | 28 | 0.7205961 | 0.7010915 | 0.5430779 | 0.4825103 | 3.500000 | 1.261980 |

Non missing response frequency for each item

|  | 1 | 2 | 3 | 4 | 5 | miss |
| --- | --- | --- | --- | --- | --- | --- |
| BFIK_agree_4R | 0.0714286 | 0.3928571 | 0.1785714 | 0.2500000 | 0.1071429 | 0 |
| BFIK_agree_1R | 0.0000000 | 0.3928571 | 0.2500000 | 0.3214286 | 0.0357143 | 0 |
| BFIK_agree_3R | 0.1071429 | 0.3214286 | 0.1428571 | 0.2857143 | 0.1428571 | 0 |
| BFIK_agree_2 | 0.0714286 | 0.1785714 | 0.1785714 | 0.3214286 | 0.2500000 | 0 |
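The interval reported for BFIK_agree is numerically consistent with a normal-approximation interval, estimate ± 1.96 × ase, built from the raw_alpha and ase values printed above. The check below is only a reading of the printed numbers, not a statement about how the interval is computed internally.

```r
raw_alpha <- 0.8005842  # raw_alpha for BFIK_agree, as printed above
ase <- 0.0607377        # its approximate standard error (ase), as printed above

# Normal-approximation 95% interval around the estimate
ci <- raw_alpha + c(lower = -1, upper = 1) * 1.96 * ase
round(ci, 2)
# lower 0.68, upper 0.92
```

With only 28 respondents the interval is wide, so the point estimate of 0.8 should be read with that uncertainty in mind.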
| name | label | type | type_options | data_type | value_labels | optional | item_order | n_missing | complete_rate | min | median | max | mean | sd | n_value_labels | hist |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| BFIK_agree_4R | Ich kann mich schroff und abweisend anderen gegenüber verhalten. | rating_button | 5 | haven_labelled | 5. 1: Trifft überhaupt nicht zu, 4. 2, 3. 3, 2. 4, 1. 5: Trifft voll und ganz zu, NA. Item was never rendered for this user. | 0 | 5 | 0 | 1 | 1 | 3 | 5 | 2.928571 | 1.184110 | 6 | ▂▇▁▃▁▅▁▂ |
| BFIK_agree_1R | Ich neige dazu, andere zu kritisieren. | rating_button | 5 | haven_labelled | 5. 1: Trifft überhaupt nicht zu, 4. 2, 3. 3, 2. 4, 1. 5: Trifft voll und ganz zu, NA. Item was never rendered for this user. | 0 | 7 | 0 | 1 | 2 | 3 | 5 | 3.000000 | 0.942809 | 6 | ▇▁▅▁▁▆▁▁ |
| BFIK_agree_3R | Ich kann mich kalt und distanziert verhalten. | rating_button | 5 | haven_labelled | 5. 1: Trifft überhaupt nicht zu, 4. 2, 3. 3, 2. 4, 1. 5: Trifft voll und ganz zu, NA. Item was never rendered for this user. | 0 | 13 | 0 | 1 | 1 | 3 | 5 | 3.035714 | 1.290482 | 6 | ▂▇▁▃▁▇▁▃ |
| BFIK_agree_2 | Ich schenke anderen leicht Vertrauen, glaube an das Gute im Menschen. | rating_button | 5 | haven_labelled | 1. 1: Trifft überhaupt nicht zu, 2. 2, 3. 3, 4. 4, 5. 5: Trifft voll und ganz zu, NA. Item was never rendered for this user. | 0 | 17 | 0 | 1 | 1 | 4 | 5 | 3.500000 | 1.261980 | 6 | ▂▅▁▅▁▇▁▆ |
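In the item table above, the items whose names end in R list their value labels in reverse order (the stored value 5 carries the label "1: Trifft überhaupt nicht zu"), which suggests that they are reverse-scored. Assuming a 1-to-5 response scale, reverse-scoring can be sketched as follows; `reverse_score()` is a hypothetical helper, not a function from the codebook package.

```r
# Hypothetical helper: flip a response around the midpoint of its scale
reverse_score <- function(x, min_value = 1, max_value = 5) {
  (max_value + min_value) - x
}

responses <- c(1, 2, 3, 4, 5, NA)
reverse_score(responses)
# 5 4 3 2 1 NA
```

Missing responses stay missing, so reversed items can be averaged together with the other items of the scale afterwards.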

Scale: BFIK_open

Reliability: Cronbach’s α [95% CI] = 0.53 [0.25;0.81].

Missing: 0.

Figure: Likert plot of scale BFIK_open items

Figure: Distribution of scale BFIK_open

95% Confidence Interval

| lower | estimate | upper |
| --- | --- | --- |
| 0.2473043 | 0.5270752 | 0.806846 |

| raw_alpha | std.alpha | G6(smc) | average_r | S/N | ase | mean | sd | median_r |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 0.5270752 | 0.5130499 | 0.5384384 | 0.2084848 | 1.053598 | 0.1427402 | 4.258929 | 0.5630692 | 0.1770952 |

Reliability if an item is dropped:

|  | raw_alpha | std.alpha | G6(smc) | average_r | S/N | alpha se | var.r | med.r |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| BFIK_open_2 | 0.5186964 | 0.5019443 | 0.5306346 | 0.2514611 | 1.0078075 | 0.1551415 | 0.0955990 | 0.2626354 |
| BFIK_open_1 | 0.5786594 | 0.5593735 | 0.5130468 | 0.2973409 | 1.2694957 | 0.1307794 | 0.0504451 | 0.1947743 |
| BFIK_open_4 | 0.4119483 | 0.4100879 | 0.3217759 | 0.1881289 | 0.6951677 | 0.1885214 | 0.0042364 | 0.1594161 |
| BFIK_open_3 | 0.2202412 | 0.2437361 | 0.2041506 | 0.0970083 | 0.3222898 | 0.2572230 | 0.0195543 | 0.1594161 |

Item statistics

|  | n | raw.r | std.r | r.cor | r.drop | mean | sd |
| --- | --- | --- | --- | --- | --- | --- | --- |
| BFIK_open_2 | 28 | 0.5298378 | 0.5869036 | 0.3017297 | 0.2317746 | 4.214286 | 0.7382232 |
| BFIK_open_1 | 28 | 0.5062740 | 0.5329245 | 0.2760932 | 0.1568781 | 4.392857 | 0.8317445 |
| BFIK_open_4 | 28 | 0.7010199 | 0.6614159 | 0.5679139 | 0.3611976 | 4.214286 | 0.9567361 |
| BFIK_open_3 | 28 | 0.8041472 | 0.7686222 | 0.7253133 | 0.5379725 | 4.214286 | 0.9567361 |

Non missing response frequency for each item

|  | 1 | 2 | 3 | 4 | 5 | miss |
| --- | --- | --- | --- | --- | --- | --- |
| BFIK_open_2 | 0.0000000 | 0.0357143 | 0.0714286 | 0.5357143 | 0.3571429 | 0 |
| BFIK_open_1 | 0.0000000 | 0.0357143 | 0.1071429 | 0.2857143 | 0.5714286 | 0 |
| BFIK_open_4 | 0.0357143 | 0.0000000 | 0.1428571 | 0.3571429 | 0.4642857 | 0 |
| BFIK_open_3 | 0.0000000 | 0.0714286 | 0.1428571 | 0.2857143 | 0.5000000 | 0 |
| name | label | type | type_options | data_type | value_labels | optional | item_order | n_missing | complete_rate | min | median | max | mean | sd | n_value_labels | hist |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| BFIK_open_2 | Ich bin tiefsinnig, denke gerne über Sachen nach. | rating_button | 5 | haven_labelled | 1. 1: Trifft überhaupt nicht zu, 2. 2, 3. 3, 4. 4, 5. 5: Trifft voll und ganz zu, NA. Item was never rendered for this user. | 0 | 4 | 0 | 1 | 2 | 4.0 | 5 | 4.214286 | 0.7382232 | 6 | ▁▁▁▁▁▇▁▅ |
| BFIK_open_1 | Ich bin vielseitig interessiert. | rating_button | 5 | haven_labelled | 1. 1: Trifft überhaupt nicht zu, 2. 2, 3. 3, 4. 4, 5. 5: Trifft voll und ganz zu, NA. Item was never rendered for this user. | 0 | 8 | 0 | 1 | 2 | 5.0 | 5 | 4.392857 | 0.8317445 | 6 | ▁▁▂▁▁▃▁▇ |
| BFIK_open_4 | Ich schätze künstlerische und ästhetische Eindrücke. | rating_button | 5 | haven_labelled | 1. 1: Trifft überhaupt nicht zu, 2. 2, 3. 3, 4. 4, 5. 5: Trifft voll und ganz zu, NA. Item was never rendered for this user. | 0 | 19 | 0 | 1 | 1 | 4.0 | 5 | 4.214286 | 0.9567361 | 6 | ▁▁▁▂▁▆▁▇ |
| BFIK_open_3 | Ich habe eine aktive Vorstellungskraft, bin phantasievoll. | rating_button | 5 | haven_labelled | 1. 1: Trifft überhaupt nicht zu, 2. 2, 3. 3, 4. 4, 5. 5: Trifft voll und ganz zu, NA. Item was never rendered for this user. | 0 | 22 | 0 | 1 | 2 | 4.5 | 5 | 4.214286 | 0.9567361 | 6 | ▁▁▂▁▁▅▁▇ |

Scale: BFIK_consc

Reliability: Cronbach’s α [95% CI] = 0.78 [0.66;0.9].

Missing: 0.

Figure: Likert plot of scale BFIK_consc items

Figure: Distribution of scale BFIK_consc

95% Confidence Interval

| lower | estimate | upper |
| --- | --- | --- |
| 0.6572704 | 0.7796983 | 0.9021261 |

| raw_alpha | std.alpha | G6(smc) | average_r | S/N | ase | mean | sd | median_r |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 0.7796983 | 0.7870619 | 0.7839117 | 0.480263 | 3.6962 | 0.0624632 | 3.651786 | 0.7915622 | 0.4590337 |

Reliability if an item is dropped:

|  | raw_alpha | std.alpha | G6(smc) | average_r | S/N | alpha se | var.r | med.r |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| BFIK_consc_3 | 0.5979912 | 0.6118484 | 0.5247890 | 0.3444504 | 1.576313 | 0.1176371 | 0.0092555 | 0.3982579 |
| BFIK_consc_4 | 0.7626290 | 0.7705758 | 0.7158716 | 0.5282082 | 3.358737 | 0.0739794 | 0.0185680 | 0.5163541 |
| BFIK_consc_2R | 0.7283814 | 0.7272275 | 0.7111827 | 0.4705314 | 2.666058 | 0.0849270 | 0.0474736 | 0.5163541 |
| BFIK_consc_1 | 0.7813268 | 0.8041781 | 0.7720873 | 0.5778620 | 4.106681 | 0.0663018 | 0.0232879 | 0.6618601 |

Item statistics

|  | n | raw.r | std.r | r.cor | r.drop | mean | sd |
| --- | --- | --- | --- | --- | --- | --- | --- |
| BFIK_consc_3 | 28 | 0.9085942 | 0.9115478 | 0.9168579 | 0.8120889 | 3.500000 | 1.0363755 |
| BFIK_consc_4 | 28 | 0.6874998 | 0.7351180 | 0.6337064 | 0.5256863 | 3.857143 | 0.7559289 |
| BFIK_consc_2R | 28 | 0.8411080 | 0.7904948 | 0.6957375 | 0.6208811 | 3.178571 | 1.3067792 |
| BFIK_consc_1 | 28 | 0.6732655 | 0.6874444 | 0.5201855 | 0.4656927 | 4.071429 | 0.8997354 |

Non missing response frequency for each item

|  | 1 | 2 | 3 | 4 | 5 | miss |
| --- | --- | --- | --- | --- | --- | --- |
| BFIK_consc_3 | 0.0357143 | 0.1428571 | 0.2500000 | 0.4285714 | 0.1428571 | 0 |
| BFIK_consc_4 | 0.0000000 | 0.0357143 | 0.2500000 | 0.5357143 | 0.1785714 | 0 |
| BFIK_consc_2R | 0.1785714 | 0.1071429 | 0.1785714 | 0.4285714 | 0.1071429 | 0 |
| BFIK_consc_1 | 0.0000000 | 0.0714286 | 0.1428571 | 0.4285714 | 0.3571429 | 0 |
| name | label | type | type_options | data_type | value_labels | optional | item_order | n_missing | complete_rate | min | median | max | mean | sd | n_value_labels | hist |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| BFIK_consc_3 | Ich bin tüchtig und arbeite flott. | rating_button | 5 | haven_labelled | 1. 1: Trifft überhaupt nicht zu, 2. 2, 3. 3, 4. 4, 5. 5: Trifft voll und ganz zu, NA. Item was never rendered for this user. | 0 | 10 | 0 | 1 | 1 | 4 | 5 | 3.500000 | 1.0363755 | 6 | ▁▂▁▅▁▇▁▂ |
| BFIK_consc_4 | Ich mache Pläne und führe sie auch durch. | rating_button | 5 | haven_labelled | 1. 1: Trifft überhaupt nicht zu, 2. 2, 3. 3, 4. 4, 5. 5: Trifft voll und ganz zu, NA. Item was never rendered for this user. | 0 | 11 | 0 | 1 | 2 | 4 | 5 | 3.857143 | 0.7559289 | 6 | ▁▁▃▁▁▇▁▂ |
| BFIK_consc_2R | Ich bin bequem, neige zur Faulheit. | rating_button | 5 | haven_labelled | 5. 1: Trifft überhaupt nicht zu, 4. 2, 3. 3, 2. 4, 1. 5: Trifft voll und ganz zu, NA. Item was never rendered for this user. | 0 | 12 | 0 | 1 | 1 | 4 | 5 | 3.178571 | 1.3067792 | 6 | ▃▂▁▃▁▇▁▂ |
| BFIK_consc_1 | Ich erledige Aufgaben gründlich. | rating_button | 5 | haven_labelled | 1. 1: Trifft überhaupt nicht zu, 2. 2, 3. 3, 4. 4, 5. 5: Trifft voll und ganz zu, NA. Item was never rendered for this user. | 0 | 18 | 0 | 1 | 2 | 4 | 5 | 4.071429 | 0.8997354 | 6 | ▁▁▂▁▁▇▁▇ |

Scale: BFIK_extra

Reliability: Cronbach’s α [95% CI] = 0.9 [0.84;0.96].

Missing: 0.

Figure: Likert plot of scale BFIK_extra items

Figure: Distribution of scale BFIK_extra

95% Confidence Interval

| lower | estimate | upper |
| --- | --- | --- |
| 0.8368641 | 0.8992625 | 0.9616609 |

| raw_alpha | std.alpha | G6(smc) | average_r | S/N | ase | mean | sd | median_r |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 0.8992625 | 0.8991277 | 0.9103612 | 0.6902472 | 8.913524 | 0.0318359 | 3.848214 | 1.009995 | 0.693212 |

Reliability if an item is dropped:

|  | raw_alpha | std.alpha | G6(smc) | average_r | S/N | alpha se | var.r | med.r |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| BFIK_extra_2 | 0.9049618 | 0.9049007 | 0.8810063 | 0.7602940 | 9.515329 | 0.0315012 | 0.0075063 | 0.7906075 |
| BFIK_extra_3R | 0.8656522 | 0.8666622 | 0.8458840 | 0.6842023 | 6.499751 | 0.0453083 | 0.0171133 | 0.7238529 |
| BFIK_extra_4 | 0.8526404 | 0.8505466 | 0.8225061 | 0.6548173 | 5.691049 | 0.0487973 | 0.0233309 | 0.5986022 |
| BFIK_extra_1R | 0.8524132 | 0.8543807 | 0.8025601 | 0.6616754 | 5.867224 | 0.0487065 | 0.0039225 | 0.6625711 |

Item statistics

|  | n | raw.r | std.r | r.cor | r.drop | mean | sd |
| --- | --- | --- | --- | --- | --- | --- | --- |
| BFIK_extra_2 | 28 | 0.8073670 | 0.8162172 | 0.7344684 | 0.6733826 | 4.178571 | 1.090483 |
| BFIK_extra_3R | 28 | 0.8877202 | 0.8813510 | 0.8432063 | 0.7880184 | 3.750000 | 1.205696 |
| BFIK_extra_4 | 28 | 0.9027707 | 0.9065043 | 0.8787875 | 0.8247657 | 3.857143 | 1.112697 |
| BFIK_extra_1R | 28 | 0.9062901 | 0.9006338 | 0.8874781 | 0.8219851 | 3.607143 | 1.196888 |

Non missing response frequency for each item

|  | 1 | 2 | 3 | 4 | 5 | miss |
| --- | --- | --- | --- | --- | --- | --- |
| BFIK_extra_2 | 0.0714286 | 0.0000000 | 0.0714286 | 0.3928571 | 0.4642857 | 0 |
| BFIK_extra_3R | 0.0714286 | 0.0714286 | 0.2142857 | 0.3214286 | 0.3214286 | 0 |
| BFIK_extra_4 | 0.0357143 | 0.1071429 | 0.1428571 | 0.3928571 | 0.3214286 | 0 |
| BFIK_extra_1R | 0.0357143 | 0.1785714 | 0.2142857 | 0.2857143 | 0.2857143 | 0 |
| name | label | type | type_options | data_type | value_labels | optional | item_order | n_missing | complete_rate | min | median | max | mean | sd | n_value_labels | hist |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| BFIK_extra_2 | Ich bin begeisterungsfähig und kann andere leicht mitreißen. | rating_button | 5 | haven_labelled | 1. 1: Trifft überhaupt nicht zu, 2. 2, 3. 3, 4. 4, 5. 5: Trifft voll und ganz zu, NA. Item was never rendered for this user. | 0 | 6 | 0 | 1 | 1 | 4 | 5 | 4.178571 | 1.090483 | 6 | ▁▁▁▁▁▇▁▇ |
| BFIK_extra_3R | Ich bin eher der “stille Typ”, wortkarg. | rating_button | 5 | haven_labelled | 5. 1: Trifft überhaupt nicht zu, 4. 2, 3. 3, 2. 4, 1. 5: Trifft voll und ganz zu, NA. Item was never rendered for this user. | 0 | 14 | 0 | 1 | 1 | 4 | 5 | 3.750000 | 1.205696 | 6 | ▂▂▁▅▁▇▁▇ |
| BFIK_extra_4 | Ich gehe aus mir heraus, bin gesellig. | rating_button | 5 | haven_labelled | 1. 1: Trifft überhaupt nicht zu, 2. 2, 3. 3, 4. 4, 5. 5: Trifft voll und ganz zu, NA. Item was never rendered for this user. | 0 | 20 | 0 | 1 | 1 | 4 | 5 | 3.857143 | 1.112697 | 6 | ▁▂▁▃▁▇▁▆ |
| BFIK_extra_1R | Ich bin eher zurückhaltend, reserviert. | rating_button | 5 | haven_labelled | 5. 1: Trifft überhaupt nicht zu, 4. 2, 3. 3, 2. 4, 1. 5: Trifft voll und ganz zu, NA. Item was never rendered for this user. | 0 | 21 | 0 | 1 | 1 | 4 | 5 | 3.607143 | 1.196888 | 6 | ▁▅▁▆▁▇▁▇ |

Scale: BFIK_neuro

Reliability: Cronbach’s α [95% CI] = 0.75 [0.61;0.9].

Missing: 0.

Figure: Likert plot of scale BFIK_neuro items

Figure: Distribution of scale BFIK_neuro

95% Confidence Interval

| lower | estimate | upper |
| --- | --- | --- |
| 0.6080638 | 0.7537326 | 0.8994015 |

| raw_alpha | std.alpha | G6(smc) | average_r | S/N | ase | mean | sd | median_r |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 0.7537326 | 0.7476172 | 0.7145893 | 0.496833 | 2.962235 | 0.0743208 | 2.892857 | 0.9254231 | 0.440167 |

Reliability if an item is dropped:

|  | raw_alpha | std.alpha | G6(smc) | average_r | S/N | alpha se | var.r | med.r |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| BFIK_neuro_2R | 0.8400000 | 0.8408387 | 0.7253854 | 0.7253854 | 5.2829328 | 0.0602555 | NA | 0.7253854 |
| BFIK_neuro_3 | 0.5904682 | 0.6112721 | 0.4401670 | 0.4401670 | 1.5724937 | 0.1456732 | NA | 0.4401670 |
| BFIK_neuro_4 | 0.4653928 | 0.4905053 | 0.3249467 | 0.3249467 | 0.9627289 | 0.1871605 | NA | 0.3249467 |

Item statistics

|  | n | raw.r | std.r | r.cor | r.drop | mean | sd |
| --- | --- | --- | --- | --- | --- | --- | --- |
| BFIK_neuro_2R | 28 | 0.6549435 | 0.7217484 | 0.4638430 | 0.4100297 | 3.107143 | 0.8751417 |
| BFIK_neuro_3 | 28 | 0.8755182 | 0.8383732 | 0.7625955 | 0.6528603 | 3.071429 | 1.2744954 |
| BFIK_neuro_4 | 28 | 0.9046526 | 0.8854863 | 0.8409166 | 0.7420620 | 2.500000 | 1.2018504 |

Non missing response frequency for each item

|  | 1 | 2 | 3 | 4 | 5 | miss |
| --- | --- | --- | --- | --- | --- | --- |
| BFIK_neuro_2R | 0.0000000 | 0.2857143 | 0.3571429 | 0.3214286 | 0.0357143 | 0 |
| BFIK_neuro_3 | 0.1071429 | 0.2500000 | 0.2857143 | 0.1785714 | 0.1785714 | 0 |
| BFIK_neuro_4 | 0.2500000 | 0.3214286 | 0.1071429 | 0.3214286 | 0.0000000 | 0 |
| name | label | type | type_options | data_type | value_labels | optional | item_order | n_missing | complete_rate | min | median | max | mean | sd | n_value_labels | hist |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| BFIK_neuro_2R | Ich bin entspannt, lasse mich durch Stress nicht aus der Ruhe bringen. | rating_button | 5 | haven_labelled | 5. 1: Trifft überhaupt nicht zu, 4. 2, 3. 3, 2. 4, 1. 5: Trifft voll und ganz zu, NA. Item was never rendered for this user. | 0 | 9 | 0 | 1 | 2 | 3 | 5 | 3.107143 | 0.8751417 | 6 | ▆▁▇▁▁▇▁▁ |
| BFIK_neuro_3 | Ich mache mir viele Sorgen. | rating_button | 5 | haven_labelled | 1. 1: Trifft überhaupt nicht zu, 2. 2, 3. 3, 4. 4, 5. 5: Trifft voll und ganz zu, NA. Item was never rendered for this user. | 0 | 15 | 0 | 1 | 1 | 3 | 5 | 3.071429 | 1.2744954 | 6 | ▃▇▁▇▁▅▁▅ |
| BFIK_neuro_4 | Ich werde leicht nervös und unsicher. | rating_button | 5 | haven_labelled | 1. 1: Trifft überhaupt nicht zu, 2. 2, 3. 3, 4. 4, 5. 5: Trifft voll und ganz zu, NA. Item was never rendered for this user. | 0 | 16 | 0 | 1 | 1 | 2 | 4 | 2.500000 | 1.2018504 | 6 | ▆▁▇▁▁▂▁▇ |



Figure: Distribution of values for age

0 missing values.

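| name | label | data_type | n_missing | complete_rate | min | median | max | mean | sd | hist |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| age | Alter | numeric | 0 | 1 | 19 | 32 | 38 | 30.5 | 4.670633 | ▂▂▇▇▅ |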

Missingness report

Missing values in 1 variables 0 1 28
Missing values per variable 28 28 28
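The idea behind such a report can be sketched in base R: count missing values per variable, then tabulate which combinations of variables are missing together. The toy data frame `df` below is hypothetical, and the code is not the routine the codebook package uses.

```r
# Hypothetical toy data with some missing values
df <- data.frame(
  a = c(1, NA, 3, 4),
  b = c(NA, NA, "x", "y"),
  c = c(TRUE, FALSE, TRUE, TRUE)
)

# Missing values per variable
n_missing <- colSums(is.na(df))

# Missingness pattern per row: which variables are missing together
pattern <- apply(is.na(df), 1, function(row) {
  missing_vars <- names(df)[row]
  if (length(missing_vars) == 0) "none" else paste(missing_vars, collapse = ", ")
})

# How often each pattern occurs, most frequent first
pattern_counts <- sort(table(pattern), decreasing = TRUE)

n_missing
pattern_counts
```

In this toy example two rows are complete, one row misses only `b`, and one row misses both `a` and `b`.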

Codebook table

JSON-LD metadata

The following JSON-LD can be found by search engines, if you share this codebook publicly on the web.
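To give a rough idea of what such JSON-LD might look like, the sketch below serialises the metadata set earlier in this vignette with the jsonlite package. The `@context` value and the top-level `@type` of `Dataset` are assumptions based on schema.org, and the result is not necessarily identical to what the codebook package emits.

```r
library(jsonlite)

# Fields copied from the metadata set earlier in this vignette;
# "@context" and the top-level "@type" are assumptions based on schema.org
dataset_meta <- list(
  "@context" = "https://schema.org/",
  "@type" = "Dataset",
  name = "MOCK Big Five Inventory dataset (German metadata demo)",
  description = "a small mock Big Five Inventory dataset",
  identifier = "doi:10.5281/zenodo.1326520",
  datePublished = "2016-06-01",
  creator = list(
    "@type" = "Person",
    givenName = "Ruben",
    familyName = "Arslan",
    affiliation = list(
      "@type" = "Organization",
      name = "MPI Human Development, Berlin"
    )
  ),
  citation = "Arslan (2016). Mock BFI data.",
  temporalCoverage = "2016",
  spatialCoverage = "Goettingen, Germany"
)

# auto_unbox writes length-one vectors as scalars instead of arrays
json_ld <- toJSON(dataset_meta, auto_unbox = TRUE, pretty = TRUE)
cat(json_ld)
```

On a web page, JSON-LD of this kind is usually embedded in a `script` element of type `application/ld+json`.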

