A critique of Leach et al. 2024
"Response of Northern bobwhite (Colinus virginianus) and two parasitic nematode populations in western Oklahoma to anthelmintic supplemental feed"
The paper can be found here: https://doi.org/10.1016/j.ijppaw.2024.101001.
Preface
Do not take my criticisms of the paper as a criticism of the authors themselves. R.T. Bakker allegedly said, “Be ruthless to theories, kind to colleagues.” I don’t recall ever meeting any of the authors—I am sure they are fine folks.
Relatedly, do not take my criticisms of the scientific approach as reflective of my stance on the effects of parasites on bobwhites, or of medicated feed on parasites and/or bobwhites. In fact, I am a coauthor on a paper that indicates some effect of parasites on NOBO.
There is no such thing as a perfect study.
Even a study that justifiably recommends applying a medicine/chemical to the environment carries severe risk, because of possible effects on non-target organisms and the buildup of resistance in parasites. A study that unjustifiably does so carries far greater risk, because in addition to those harms it diverts important resources from bobwhite restoration. For these reasons, we must be exceptionally critical of studies on this topic and hold them to the highest standards of rigor.
The Meat
No spatial replication.
“No one would now dream of testing the response to a treatment by comparing two plots, one treated and the other untreated.”
R.A. Fisher and J. Wishart (1930), as cited in Hurlbert (1984).
In a causal inference context, the lack of replication severely compromises the ability to attribute causality to the treatment, which is the primary purpose of a study like this. Replication is a cornerstone of causal inference and experimental design. In this study, there was only one experimental unit for the medicated feed treatment and one for the control—there was no replication.
Replication is critical because it reduces the likelihood of random chance occurrences (what Hurlbert referred to as "nondemonic intrusion") and minimizes the risk of systematic bias introduced by confounders. Without replication, the results are far more susceptible to being influenced by unaccounted-for variables.
A key concern in this study is the strong potential for confounding effects between the two sites, which were inherently different in nature. The reference site was a state-owned wildlife area, while the treatment site was a privately owned ranch. These differences likely introduced substantial confounding effects unrelated to the treatment itself.
For instance, harvest pressure immediately comes to mind as a possible confounder. It is reasonable to suspect that the reference site, being more publicly accessible, experienced greater hunting pressure compared to the privately owned treatment site—although this is speculative on my part. Such differences, if unaccounted for, undermine the ability to attribute observed effects to the treatment rather than to site-specific factors.
In a statistical context, the lack of replication is referred to as pseudoreplication, which essentially means using the wrong error term in a statistical test. The authors acknowledged that this was a pseudoreplicated study, which is commendable, but the risks associated with the lack of replication remain severe.
The authors stated that “95% confidence intervals based on a normal distribution were used to distinguish differences between the treatment and reference site and between years for each site with respect to bobwhite abundance in lieu of formal statistical analysis.” However, recall how a 95% confidence interval is calculated under a normality assumption: mean ± 1.96 × (s / √n), where s is the sample standard deviation and n is the sample size. The authors incorrectly used the subsample size as n in their calculations. This approach is flawed because the subsamples are not independent, which invalidates the informal test. Consequently, the confidence intervals are artificially narrow, making the results appear more precise than they actually are.
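To make the arithmetic concrete, here is a minimal sketch with invented numbers (not the paper's data) showing how plugging the subsample count in as n produces a deceptively tight interval, even though the subsamples all come from a single experimental unit:

```python
import math

# Hypothetical illustration: suppose one site has 20 point-count
# subsamples. These counts are invented, not taken from the paper.
subsample_counts = [4, 6, 5, 7, 3, 5, 6, 4, 5, 6,
                    5, 4, 7, 6, 5, 4, 6, 5, 5, 6]

n = len(subsample_counts)
mean = sum(subsample_counts) / n
sd = math.sqrt(sum((x - mean) ** 2 for x in subsample_counts) / (n - 1))

# CI half-width computed the way the paper did: n = number of
# subsamples. This is wrong because the subsamples are not independent
# replicates of the treatment; they share one experimental unit.
half_width_wrong = 1.96 * sd / math.sqrt(n)

# With one experimental unit per treatment, the honest n for the
# treatment effect is 1, and no standard error can be computed at all.
print(f"mean = {mean:.2f}, subsample-based 95% CI half-width = {half_width_wrong:.2f}")
```

The half-width shrinks with √n, so every non-independent subsample added makes the interval look more precise without adding any real information about the treatment effect.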
To address some of the statistical issues stemming from pseudoreplication, rigorous modeling approaches could have been employed. For instance, spatially weighting the subsamples might have mitigated some of the statistical shortcomings. However, even with improved statistical techniques, the fundamental causal inference issues—stemming from the lack of replication and confounding effects—would have persisted.
No randomization. The authors did not mention whether treatment was randomly assigned to one of the two properties. While subsampling locations were randomly assigned within each experimental unit, the lack of random assignment of the treatment and control to the properties is a notable omission.
In this case, randomization would not have eliminated bias entirely due to the lack of replication, but it would have been a step toward reducing selection bias in assigning the treatment area. Randomization is a fundamental principle in experimental design that helps ensure treatment and control groups are comparable at the outset, minimizing systematic differences that could influence the results.
Lacking an important treatment/control. The medicated feed ostensibly (the description of the feed itself is scant) contained grains and other ingredients in addition to the fenbendazole. A treatment arm with non-medicated feed would have allowed isolation of the fenbendazole effect from the nutritional boost. The treatment effect suggests that the medicated feed did reduce parasite burdens, but the results are not compelling. See the next bullet point.
Wrong hypothesis/statistical test for effect on helminths. Yes, the helminth counts appeared lower on the treated site, but like the control, the helminth counts increased over time and it appears to be at the same rate for both sites. The hypothesis that should have been tested is whether or not the helminth count decreased over time on the treatment area relative to the control. Based on Figure 3 the helminth counts paralleled each other for treatment and control. Pre-treatment data on the sites would have provided much needed context.
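The comparison argued for above is essentially a difference-in-differences: did the change over time on the treated site differ from the change on the control? A hedged sketch, with all numbers invented and not drawn from the paper:

```python
# Hypothetical mean helminth counts (invented, not from the paper).
# The treated site sits at a lower level throughout, but if the two
# trajectories are parallel the difference-in-differences is ~0, i.e.
# no evidence the treatment changed the trend.
control_early, control_late = 40.0, 70.0   # mean worms/bird, invented
treated_early, treated_late = 25.0, 55.0   # lower level, same slope

did = (treated_late - treated_early) - (control_late - control_early)
print(f"difference-in-differences = {did:.1f}")  # 0.0 here: no divergence
```

A level difference between sites can reflect inherent site effects; only a divergence in trajectories (a nonzero difference-in-differences) would point toward a treatment effect, and even that still rests on the parallel-trends assumption that pre-treatment data would be needed to check.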
Bobwhite sampling and modeling was improper. Similar to the issues mentioned with the helminth counts, the bobwhite count data indicate that the treatment and control follow a similar pattern. Yes, they appear different, but there is no way to attribute the difference to the treatment. It could entirely be the inherent site effects or some confounding variable. The first few years of count data for the reference site are not reported, further adding to the inability to discern the effect of the treatment.
The lack of accounting for detection differences among sites/observers/etc., or of including calling availability in the modeling, is a critical flaw. Observation error alone could explain the differences between treatment and control. There is no need to go into the issues with using raw count data, because that practice fell out of favor long ago.
The log3 transformation of the count data is problematic too. A substantial literature from the past 20 years strongly discourages the use of log transformations on count data. See this paper for a start.

Use of age ratios was improper. Guthery’s (1997) theoretical curve should rarely, if ever, be used. The assumption of stationarity is rarely met, especially among different populations. The authors acknowledge that it was not met for the treatment site; I would argue it was not met for the control either.
Age ratios from the control were heavily biased toward juveniles, and the authors use this information to argue that survival was better on the treatment site. Assume for a moment that this is true; under Guthery’s approach it would equally suggest that reproduction was greater on the control. Which is more important? Hard to say, but past research such as McConnell et al. and Lewis et al. points to reproduction being more important.
Using age ratios from hunter harvest without accounting for harvest bias is poor practice. Sample sizes were also very low. And I am not sure how a J:A ratio of >20 can be achieved when only juveniles were donated. The math doesn’t math.
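For illustration only (the counts below are hypothetical, not the paper's data): a juvenile:adult ratio is juveniles divided by adults, so a finite ratio above 20 requires at least one adult in the sample, and a sample of only juveniles yields no ratio at all.

```python
# Counts are hypothetical, for arithmetic illustration only.
def age_ratio(juveniles: int, adults: int) -> float:
    """Juvenile:adult ratio; undefined when the sample has no adults."""
    if adults == 0:
        raise ZeroDivisionError("ratio undefined with zero adults")
    return juveniles / adults

# A finite J:A ratio > 20 needs at least one adult in the denominator:
print(age_ratio(21, 1))  # 21.0
# With only juveniles donated, the ratio cannot be computed at all.
```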
Conclusion
This study has several critical and, in my opinion, fatal flaws. It should not have been published. There is a real need to understand the effects of parasites on bobwhite demography and the effect of treatments to alleviate any negative effects, but this study falls short.
Thank you to Drs. Mark McConnell, Becky Ruzicka, and Dwayne Elmore for their feedback on my review. All opinions and errors are mine.
James,
Thanks for this review. I agree that we must always be willing to take a hard look at our scientific methods and findings. If we can't, we are not really serious scientists.
I look forward to seeing the thoughts and potential counter arguments as we work to better understand NOBO.