psyctests_info <- readRDS("../sober_rubric/raw_data/psyctests_info.rds")

The treemaps below show each test as a tile whose area is proportional to its usage frequency. Tests that were rarely used according to APA PsycInfo are grouped into bins (“used 1-5 times”, “used 6-20 times”, “used 21-50 times”), mainly so that these individually small tiles remain visible despite their low frequency.

By comparing across the subdisciplines, we can see what higher- and lower-entropy fields look like visually. High entropy shows up as strong fragmentation: there are many tiles of similar size, and the “used 1-5 times” and “used 6-20 times” tiles are prominent. Low entropy shows up when a few large tiles for individual measures, such as the Beck Depression Inventory, dominate a field.
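To make the entropy comparison concrete, here is a minimal sketch of how a normalized Shannon entropy could be computed from usage counts. It assumes the common definition \(\eta(X) = H(X) / \log k\), where \(k\) is the number of distinct tests and \(H(X) = -\sum_i p_i \log p_i\) is computed on the usage shares \(p_i\). The function name `norm_entropy_sketch` is hypothetical; the `calc_norm_entropy()` helper used in this report is defined elsewhere and may differ in details such as zero handling.

```r
# Hypothetical helper for illustration only; not the report's calc_norm_entropy().
norm_entropy_sketch <- function(counts) {
  counts <- counts[!is.na(counts) & counts > 0]  # drop missing and zero counts
  k <- length(counts)
  if (k < 2) return(0)        # a single test carries no uncertainty
  p <- counts / sum(counts)   # usage shares
  h <- -sum(p * log(p))       # Shannon entropy (natural log)
  h / log(k)                  # divide by the maximum possible entropy, log(k)
}

even_field      <- norm_entropy_sketch(c(10, 10, 10, 10))   # perfectly even usage
dominated_field <- norm_entropy_sketch(c(970, 10, 10, 10))  # one dominant measure
```

Under these assumptions, a field where all tests are used equally often reaches the maximum of 1, while a field dominated by a single measure scores much closer to 0.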

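# Total usage per test; tests used 50 times or fewer are assigned to a frequency bin
tests <- psyctests_info %>% 
  group_by(DOI, Name) %>% 
  summarise(n = sum(usage_count, na.rm = TRUE),
            parent = case_when(
                    n > 50 ~ "",
                    n > 20 ~ "used 21-50 times",
                    n > 5 ~ "used 6-20 times",
                    TRUE ~ "used 1-5 times"),
            .groups = "drop")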
# Entropy is computed on the individual tests, before the bin rows are added
entropy <- entropy(tests$n)
norm_entropy <- calc_norm_entropy(tests$n)

# Add one zero-valued top-level row per bin so that each group has a parent node in the treemap
tests <- bind_rows(tests, 
                   tests %>% filter(parent != "") %>% 
                     ungroup() %>% 
                     select(Name = parent) %>% distinct() %>% mutate(n = 0, parent = ""))

Overall

Normalized Shannon entropy \(\eta(X) = 0.71\)

fig <- plot_ly(
  type = "treemap",
  labels = tests$Name,
  parents = tests$parent,
  values = tests$n,
  text = tests$DOI,
  tiling = list(packing = "squarify", squarifyratio = (1 + sqrt(5)) / 2),
  hoverinfo = "label+value+text",
  textinfo = "label+value")

fig %>% layout(
  autosize = TRUE,
  margin = list(l = 0, t = 0, r = 0, b = 0)
)