Two researchers at the University of Louisville have revealed a fatal flaw in a paper out of Stanton Glantz’s shop and called for its retraction. Glantz and his coauthors analyzed a survey of American teenagers and concluded that the results suggest that vaping causes teens who have tried a cigarette to progress to being smokers. In an online comment at the journal, Brad Rodu and Nantaporn Plurphanswat identified a variable in the dataset that, when controlled for (as it clearly should have been), makes most of the reported association disappear. Rodu then published two blog posts explaining the flaw in the Glantz study and calling for it to be retracted.
[Disclosure: Professor Rodu asked me to assess the online comment and its implications, and he incorporated my analysis into those blog posts.]
The study looked at teens who reported having taken at least one puff on a cigarette but not having smoked 100 cigarettes total, measuring the association between whether they had tried vaping and whether they had become smokers a year later. They found that subjects who had tried vaping were substantially more likely, compared to those who had never vaped, to report having smoked in the last 30 days at the one-year follow-up, or that their lifetime total consumption of cigarettes had reached 100, or both. This (obviously inadequate) observation was the basis for the authors’ conclusion that vaping was causing smoking.
The study design was actually better than most of the studies that (falsely) claim to show a “gateway effect” from vaping to smoking. For most such studies, the confounding created by subjects who would never consider trying either product guarantees there will be a strong association between vaping and smoking, however it is measured. That is, anyone who vapes is more likely to start smoking than average, simply because that average includes the majority who are averse to trying either product. Similarly, anyone who smokes is more likely to start vaping. The same is true for any pattern of using both products, even if there is zero causal relationship between the two. Studying only teenagers who had tried a puff of a cigarette eliminates the worst of this problem. It is still not adequate to fully control for the propensities — inclinations to smoke more and also to vape — that create confounding. But it is better than the previous junk research by Glantz and others.
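The confounding mechanism described above can be illustrated with a toy simulation (all numbers here are invented for illustration, not taken from the survey): even when trying vaping and trying smoking are statistically independent among teens willing to try either product, pooling in the averse majority manufactures a strong association, and the artifact shrinks away once the analysis is restricted to the willing group.

```python
import random

random.seed(0)
N = 100_000
rows = []
for _ in range(N):
    averse = random.random() < 0.7  # hypothetical: 70% would never try either product
    if averse:
        vaped, smoked = False, False
    else:
        vaped = random.random() < 0.5   # independent coin flips for the willing group:
        smoked = random.random() < 0.5  # zero causal link between the two behaviors
    rows.append((averse, vaped, smoked))

def smoking_rate(subset, vaped_value):
    group = [s for _, v, s in subset if v == vaped_value]
    return sum(group) / len(group)

# Whole population: vapers look far more likely to smoke, purely by construction
rr_pooled = smoking_rate(rows, True) / smoking_rate(rows, False)

# Restricting to teens open to trying the products removes most of the artifact
open_rows = [r for r in rows if not r[0]]
rr_open = smoking_rate(open_rows, True) / smoking_rate(open_rows, False)

print(f"pooled risk ratio: {rr_pooled:.2f}")      # well above 1 despite no causation
print(f"restricted risk ratio: {rr_open:.2f}")    # close to 1
```

The restriction in the simulation plays the same role as the study’s restriction to teens who had already taken a puff: it removes the never-triers whose presence guarantees a spurious association.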
However, someone who has tried just one cigarette (and perhaps stopped because she decided she does not like nicotine) is very different from someone who has smoked 90 cigarettes (and perhaps is already a committed smoker, but has not been at it long enough to hit 100). It turns out that the survey did not just measure whether someone was in the “between one puff and 99 cigarettes” category at the time of the initial survey, but actually measured where in that range she was. Rodu and Plurphanswat noticed this and reran the statistical analysis from the paper controlling for this variable: how many cigarettes the subject had already smoked at the time of the initial survey. As they reported in their posted comment, this made most of the association between vaping and progressing to one of the measures of “being a smoker” disappear.
This is sufficient to show that the original paper is invalid for two reasons. First, even if there were some legitimate argument that the original statistical model is better than Rodu and Plurphanswat’s, the instability demonstrated by the alternative model means that the calculation is not informative. That is, if a result is radically changed by including an additional variable that is reasonable to include, then that result is meaningless. The estimated associations are just artifacts of particular modeling choices. If two different methods for measuring what is supposedly the same phenomenon produce different results, then obviously we do not know the true value, and should not pretend otherwise.
Second, including the additional variable is not only a reasonable alternative in this case; it is mandatory. No honest researcher would leave it out. It is obviously a crucial measure of whether someone is more likely to progress to smoking, regardless of vaping status. If it affects the association with vaping, then it shows there was confounding. Thus, the Rodu and Plurphanswat result is clearly more valid. Their model is still not adequate to fully correct for confounding, but by correcting for just some of it they almost eliminated the association.
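To see concretely why the baseline variable matters, here is a minimal simulation under invented numbers (a crude two-level stand-in for the baseline cigarette count, not the actual survey data): progression is driven entirely by how far along a teen already was at baseline, vaping has no effect at all, and yet the crude comparison shows vapers “progressing” more often. Stratifying on the baseline measure, the simple analogue of controlling for it in a regression, makes the association disappear.

```python
import random

random.seed(1)
N = 100_000
data = []
for _ in range(N):
    # Hypothetical baseline: cigarettes already smoked at the first survey,
    # collapsed to a two-level measure for simplicity
    heavy = random.randint(1, 99) >= 50
    p_vape = 0.6 if heavy else 0.2       # heavier triers are likelier to have vaped...
    p_progress = 0.5 if heavy else 0.05  # ...and likelier to progress, vaping aside
    vaped = random.random() < p_vape
    progressed = random.random() < p_progress  # does not depend on vaped at all
    data.append((heavy, vaped, progressed))

def progression_rate(subset, vaped_value):
    group = [p for _, v, p in subset if v == vaped_value]
    return sum(group) / len(group)

# Crude analysis (no baseline control): vapers appear to progress far more often
crude_rr = progression_rate(data, True) / progression_rate(data, False)

# Controlling for baseline by stratifying: the association vanishes in each stratum
light = [d for d in data if not d[0]]
heavy_stratum = [d for d in data if d[0]]
light_rr = progression_rate(light, True) / progression_rate(light, False)
heavy_rr = progression_rate(heavy_stratum, True) / progression_rate(heavy_stratum, False)

print(f"crude RR: {crude_rr:.2f}")                     # well above 1, all confounding
print(f"stratum RRs: {light_rr:.2f}, {heavy_rr:.2f}")  # both near 1
```

The crude ratio here is entirely an artifact of baseline differences, which is exactly the structure Rodu and Plurphanswat’s reanalysis exposed.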
Benjamin W. Chaffee, the first author of the original paper and one of Glantz’s minions, posted a truly bizarre response to the Rodu and Plurphanswat comment. He suggested that including the measure of total cigarette consumption in the model was a case of including a form of the outcome variable as a predictor variable, an error that occasionally sneaks into statistical analyses. He is completely wrong, garbling the principle he is trying to cite. To explain, if there were a variable for “smoked in the last 50 days at follow-up” in the dataset, and that were included in a model where the outcome was “smoked in the last 30 days at follow-up,” that would be an error. The 50-day measure is an alternative measure of basically the same outcome. As such, it would be such a good predictor of the outcome that all other associations would disappear. However, including a measure of how close someone already was to the outcome at baseline (i.e., the first survey, not the follow-up) is not only fine, but it should definitely be done.
Indeed, Rodu presents a graph that illustrates the confounding. It shows that subjects who vaped were much further along the path to smoking, at baseline, than those who did not. Everyone was somewhere in the “one puff to 99 cigarettes” range, but at very different points in it. Perhaps those who had smoked more were already trying to switch to a low-risk alternative, and thus the smoking caused the vaping rather than vice versa. Perhaps they just liked nicotine a lot and were trying multiple products. Either way, it was this difference, not the vaping, that caused the association. This is a bit technical (Rodu offers an alternative attempt to explain it for those who did not follow the last two paragraphs), but the bottom line is that Chaffee and Glantz’s claim is wrong, and examining why it is wrong further illustrates why the Rodu and Plurphanswat model is obviously better. Indeed, Rodu suggests that Chaffee’s claim effectively validates their criticism.
Interestingly, this is the second case in two months of Glantz and his minions garbling basic principles of epidemiology statistics in a failed attempt to defend a huge flaw in their research. In both cases, they recited words that vaguely resemble something taught to first-year epidemiology students, but got it totally wrong. It appears that the understanding of epidemiology among University of California San Francisco faculty is at the level of the word salad one might see when a failing first-year student tries to answer an exam question. Alternatively, part of Glantz’s con game might be to intentionally trick inexpert readers with sciencey-sounding phrases, and he is training his minions to do the same. Either way, it is a pattern.
It is not plausible that this was an honest mistake. For one thing, honest authors would indeed have issued a correction when other researchers informed them about the problem. In reality, the original authors undoubtedly already knew what happens when the missing variable is included. No one publishes an analysis like this without looking at the results of many different models. (It would perhaps be better if they did, given that the standard practice is to “model shop,” that is, to make a biased choice of models to get the results they like best.) The authors may not have tried the exact model Rodu and Plurphanswat ran, but they undoubtedly tried something similar and so noticed that their result was really an artifact of uncontrolled confounding, which they could have and should have controlled for.
As for the call that the paper be retracted, it is both valid and quixotic. This blatant model shopping is indeed unethical junk science. Publishing results from an unstable model as if they were meaningful is similarly bad. But if such practices resulted in retractions, it would not only eliminate over 90 percent of tobacco control epidemiology, but about half of all public health epidemiology and economics (including, it should be noted, roughly that portion of pro-vaping research). This would be good for science, but researchers and journals are not going to endorse it for obvious reasons.
We might hope that cases like this, where there is a clear exposition of a simple error, would be the exception. At the very least, we would hope that the results would be ignored after they were so clearly demonstrated to be wrong. But those hopes would only be fulfilled if tobacco controllers and their pet journals (in this case, “Pediatrics”) were pursuing science rather than acting as marketing hucksters who intentionally garble the trappings of science.