One lives only to make blunders: Mistakes

Elson, Alsalti, Hussey, & Arslan (2023)

Project: Elson, Alsalti, Hussey, & Arslan (2023). Psychological measures aren’t toothbrushes.

Timing: After publication.

Reporter: I found this myself.

Type of mistake: Coding error that changes a conclusion in the manuscript.

Description: I incorrectly implemented the normalisation of Shannon entropy. We submitted the following correction notice: A previous version incorrectly normalised Shannon entropy by the log of the number of times measures were used rather than by the number of measures that were used at least once. After correcting this mistake, we see a less pronounced increase in normalised entropy over time and overall higher levels of entropy (i.e. higher fragmentation).

Delay: After finding the error, I took some time to revisit the other code and asked Taym Alsalti to also review my code. We were working on a follow-up project. After this process concluded, we submitted a correction notice on March 20, 2024. I’m posting this note March 22, 2024.

Solution: Normalize by the log of the number of measures that were used at least once. The figure shows the difference (dotted lines are the incorrect, published numbers).

Corrected graph. Dotted lines are the incorrectly normalized, published numbers.
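To make the two denominators concrete, here is a minimal sketch. It is not the code from the paper: the function names and the usage counts are made up for illustration, and it assumes the natural log throughout.

```python
import math


def shannon_entropy(counts):
    """Shannon entropy (natural log) of a list of usage counts."""
    total = sum(counts)
    return -sum((c / total) * math.log(c / total) for c in counts if c > 0)


def normalised_by_measures(counts):
    """Entropy divided by the log of the number of measures used at least once."""
    used = sum(1 for c in counts if c > 0)
    if used < 2:
        return 0.0  # log(1) == 0, so the ratio is undefined; report 0 here
    return shannon_entropy(counts) / math.log(used)


def normalised_by_uses(counts):
    """Entropy divided by the log of the total number of uses."""
    total = sum(counts)
    if total < 2:
        return 0.0
    return shannon_entropy(counts) / math.log(total)


# Hypothetical data: four measures, used 40, 30, 20 and 10 times.
usage = [40, 30, 20, 10]
print(normalised_by_measures(usage))
print(normalised_by_uses(usage))
```

With more total uses than distinct measures, the second denominator is larger, so the second ratio is smaller. That matches the direction of the correction described above (overall higher entropy after the change).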

Future Measures: More code review. Always write and test a function when hand-coding a metric. Also, avoid hand-coding metrics; I evidently suck at it :-)

Bounty: I found this myself, so I guess the bounty doesn’t apply. I donate to Against Malaria anyway. If the sum of my errors this year exceeds what I usually donate, I’ll have to consider increasing my donation.

Follow-up: While we incorrectly described what we did in the paper, this mistake caused me to revisit this issue and collaborate with an ecologist, Zach Marion, to understand the benefits and drawbacks of different indices for diversity/entropy. As it turns out, ecologists see a problem with normalising diversity indices using the number of distinct species found. The problem is that the number of distinct species found is itself censored and could be low because of low diversity or because of low sampling effort. Hence, other methods normalize instead using the number of individuals found (which is the method we originally used, but incorrectly described). The number of individuals found more directly reflects sampling effort. So, in the follow-up paper, among other things, we better explain our rationale for our diversity index, our normalization procedure and the inferential problems involved. It’s all documented here.
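The censoring problem can be illustrated with a small simulation. This is only a sketch under made-up assumptions: 50 hypothetical measures, a skewed usage distribution, and an arbitrary seed. None of it comes from the follow-up paper.

```python
import random


def observed_richness(sample):
    """Number of distinct measures seen in a sample of uses."""
    return len(set(sample))


def richness_by_effort(n_measures=50, efforts=(10, 100, 1000), seed=1):
    """Draw one sequence of uses and count the distinct measures in growing prefixes."""
    rng = random.Random(seed)
    measures = list(range(n_measures))
    # Assumed skew: measure k is used with weight 1 / (k + 1).
    weights = [1 / (k + 1) for k in measures]
    draws = rng.choices(measures, weights=weights, k=max(efforts))
    return {n: observed_richness(draws[:n]) for n in efforts}


result = richness_by_effort()
for effort, richness in result.items():
    print(f"{effort} uses sampled -> {richness} distinct measures observed")
```

The underlying population is the same at every sampling effort, yet the number of distinct measures observed can only stay the same or grow as more uses are sampled. A normaliser based on that count therefore mixes diversity with sampling effort.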

Holzleitner et al., 2021, preprint revision

Project: Holzleitner et al., 2021. No evidence that inbreeding avoidance is up-regulated during the ovulatory phase of the menstrual cycle.

Timing: Before publication of preprint, while preparing a revision. The linked preprint still shows the earlier version, but the updated code is available at https://rubenarslan.github.io/fertility_and_kin/.

Reporter: Steve Gangestad

Type of mistake: Coding error that changes a number reported in the supplement

Description: I ran a number of robustness checks and automated the steps. When using the update.formula syntax, I mistakenly removed not only the term I wanted to drop from the regression but also a main effect I wanted to keep, so an interaction term was included without its corresponding main effect.
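A guard against this kind of slip could look like the sketch below. It is written in Python rather than R, the function name and variable names are hypothetical, and it only checks main effects, not lower-order interactions. In R, the term labels could be obtained with attr(terms(formula), "term.labels").

```python
def missing_main_effects(term_labels):
    """Map each interaction term to the main effects it lacks in the model.

    Terms are strings such as "a" or "a:b" (R-style term labels).
    """
    terms = {t.strip() for t in term_labels}
    missing = {}
    for term in sorted(terms):
        parts = term.split(":")
        if len(parts) < 2:
            continue  # a main effect, nothing to check
        absent = [p for p in parts if p not in terms]
        if absent:
            missing[term] = absent
    return missing


# Hypothetical model after a faulty update: "relative" was dropped by accident.
after_update = ["fertile", "partner", "fertile:relative"]
problems = missing_main_effects(after_update)
if problems:
    print("Interactions without their main effects:", problems)
```

Running such a check after every automated formula update would flag the model before it is fitted.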

Delay: 11 days. Reported February 8, 2021, corrected February 19, 2021.

Solution: I re-added the main effect. The numbers changed only slightly; the interpretation was not affected.

Future Measures: I’m considering writing a better-tested set of functions to run my robustness checks, or looking for existing well-tested solutions, to reduce the chance of errors. However, I think that the probability of errors goes up with the number of robustness checks, and I do not want to respond by running fewer of them.

Bounty: 10€ paid to Against Malaria

Arslan et al., 2018

Project: Arslan et al., 2018. Using 26,000 diary entries to show ovulatory changes in sexual desire and behavior.

Timing: After publication.

Reporters: Steve Gangestad, Dan Engber, Mike Wood

Type of mistake: Several.

Description: See the correction at the journal (JPSP), the extended blog post, and the follow-up criticism by Gangestad & Dinh, which reports an uncorrected error. We wrote a response to the follow-up critique. Please note that the publicly visible portion of the correction on PsycNet is truncated, so you actually have to download the PDF.

Delay: Some immediate fixes were released via Twitter; the correction was published in June 2019. The correction did not fix one further error, which the critical reanalysis, published online in May 2020, reported. Our response to it was published only in October 2021.

Solution: See links.

Future Measures: I increased the time spent on data cleaning and tried to standardize my procedures more (by using my codebook package and better naming schemes for variables). I plan to reproducibly generate figures, tables, and results paragraphs in the future to avoid transcription errors. I introduced my bug bounty policy as a result.

Bounty: Before BB policy.

Arslan et al., 2017

Project: Arslan et al., 2018. Relaxed selection and mutation accumulation are best studied empirically: reply to Woodley of Menie et al.

Timing: After publication.

Reporter: Anonymous via my Tell Me I’m Wrong form.

Type of mistake: Misquotes.

Description: The journal sent us a comment on our paper to respond to, which we did in peer review and in a published response. However, the published comment was a later version than the last one we were sent. As a result, three quotes we highlight do not occur in the published comment.

Delay: The journal only allowed the publication of a very brief erratum and massively delayed the response. Our response was published online in February 2018. I heard about the misquotes almost immediately and got in touch with the journal in February. The cut erratum appeared online only in August 2018.

Solution: A footnote describes the ordeal. The uncut erratum is here, the published erratum is here.

Future Measures: I resolved to always insist on the final version of an article I’m responding to.

Bounty: Before BB policy.

Minor stuff

To remind me who I owe beer.

https://twitter.com/rubenarslan/status/1277637487624675328

https://twitter.com/LL_Wieczorek/status/1117093149250289664

https://twitter.com/rubenarslan/status/1057365698698268672
