Another Journal Misinterprets Randomized Vaping Trial
Imagine someone doing a clinical trial to test whether antibiotics work. Not to measure the effect of a particular antibiotic regimen on a particular disease, but just to answer the question “do antibiotics work?” A group of people with various health conditions is selected, and some are randomized into a group that gets an antibiotic. It turns out that the rate of recovery from their ailments is barely higher than in the group that was given only “standard” treatments, and so the conclusion is that antibiotics do not “work.” Except there are a few problems: There was no attempt to optimize the antibiotic for each individual. Instead, each was given a single dose of penicillin (once the best available drug but long since obsolete). Worse, in this hypothetical, say that few subjects actually suffered from an antibiotic-treatable infection; the intervention had little chance of curing their viral infections or fixing their broken bones.
If you can imagine that study, you understand clinical trials of vaping as a smoking cessation method. This is not to suggest that smoking is a disease and vaping is a medicine, of course. Rather, it points out how the study methods would still be fatally flawed even if a medical mindset were appropriate, and the interpretations of the results even more so. The failure to address aspects of vaping that are entirely unlike medicine — social support, learning, heterogeneous preferences — makes the methods and interpretations far worse still.
The latest entry in the collection of worthless research, which tobacco controllers (including the study authors) have touted as showing vaping does not “work,” is “A Pragmatic Trial of E-Cigarettes, Incentives, and Drugs for Smoking Cessation” by Scott D. Halpern et al. The authors do not explain what they mean by “pragmatic.” Since the word normally means focusing on practical reality rather than theory, it describes roughly the opposite of what was done.
The researchers recruited about 6,000 enrollees from an employer-based health insurance plan who had recently indicated they smoke, and randomized them into five groups. Their smoking status was assessed at 1, 3, and 6 months after the start of the trial, with self-reports of smoking abstinence verified with urine and blood samples. The null-treatment group (amusingly called “usual care”) was just told to look at the insurer’s smoking cessation webpages and some other useless websites. Unsurprisingly, only 0.1% of them were smoking abstinent at all three follow-up dates. Those few had probably already quit by the time the clock started running, rather than actually quitting during that first month (a possibility the authors apparently ignored).
The real baseline group were those who were offered free NRT, bupropion (Zyban), or varenicline (Chantix). Since these pharmaceuticals were presumably already available at relatively little cost to this population and are not actually very useful, the offer basically served as a focusing event (a concept the study authors do not seem to understand). Someone who was seriously planning to quit and perfectly capable, but had not yet decided it was the right day to start, might have taken their NRT bonanza as a reason to choose that month to finally do it. The offer turned out to be a weak focusing event, however, seemingly not even as good as a New Year’s resolution. Only 0.5% of the subjects in this group were abstinent through 6 months. (Note that many quits triggered by a focusing event will be a “harvesting effect,” simply moving up in time something that would have happened soon anyway. Quitting smoking sooner rather than in a few months has substantial health benefits, but it is not the same as causing a quit that would not have soon happened anyway.)
Part of the explanation for the meager focusing effect is that the study population were random smokers. This is not the typical population for smoking cessation studies, which enroll people who express an interest in quitting or actively seek to participate. Presumably few subjects were seriously interested in quitting (notwithstanding the silly rhetoric that most smokers want to quit). It is also fairly clear, reading between the lines a bit, that many subjects did not really notice they were in a study. People were contacted, and those who did not actively refuse to participate (only 125 out of over 6000) were included in the study and notified of what intervention they were eligible for. It would not be surprising if half of them never even read the letter to find that out.
Those in the vaping arm of the trial, if it can even be called that, were offered a single variety of an obsolete low-nicotine cigalike. They were not, apparently, offered any advice about vaping, let alone the instruction, “if you find this at all appealing, go out and buy something good and try that instead.” It is rather impressive that their smoking abstinence prevalence was double that from the focusing event alone (plus whatever small aid the pharmaceuticals might have provided), with 1.0% abstinent through 6 months. Given just how bad the intervention was — poor product, no variety or individual optimization, no education, smokers who were mostly not interested in vaping (not seeking to quit; many had already tried vaping and decided against it), and with few even taking delivery of the offered product (see below) — this is frankly a vaping miracle. It is as if the dose of penicillin in the cartoon hypothetical had doubled the rate of recovery from diseases.
Of course, tobacco controllers did not spin it this way. Because the difference between those results was not “statistically significant,” they claimed that there was “no difference.” They can get away with this because this is a common (grossly incorrect) misinterpretation of “not statistically significant.” The fact is there was an impressive difference, one that may well be real despite not making the arbitrary “significance” cut. If you do a trial where there are only about 10 positive outcomes in each trial arm, the differences in results are unlikely to be statistically significant. That does not mean differences do not exist.
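The arithmetic behind this point is easy to check. Assuming roughly equal arms of about 1,200 subjects each (6,000 split five ways — the report’s exact arm sizes are not quoted here), 0.5% versus 1.0% abstinence means about 6 versus 12 quitters, and a standard two-proportion z-test on those counts comes nowhere near the 0.05 threshold even though the rate doubles:

```python
# Sketch under stated assumptions: ~1,200 subjects per arm (6,000 split
# five ways), so 0.5% vs 1.0% abstinence is roughly 6 vs 12 quitters.
# A pooled two-proportion z-test shows the doubling is not
# "statistically significant" at the conventional 0.05 level.
import math

def two_proportion_p(x1, n1, x2, n2):
    """Two-sided p-value for the pooled two-proportion z-test."""
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se
    # Two-sided p-value from the standard normal CDF (via erf).
    return 2 * (1 - 0.5 * (1 + math.erf(abs(z) / math.sqrt(2))))

p = two_proportion_p(12, 1200, 6, 1200)
print(f"quit-rate ratio: {(12/1200)/(6/1200):.1f}x, p = {p:.2f}")
# The quit rate doubles, yet p is roughly 0.16 -- well above 0.05.
# "Not significant" here means the study is too small to rule out
# chance, not that no difference exists.
```

The same calculation run in reverse shows why: with only a handful of events per arm, even a genuine doubling of effect is swamped by sampling noise.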
Indeed, if someone insists on using that naive interpretation of random error statistics, they would have to say this study was “incapable of detecting” a doubling of the chance of quitting compared to pharmaceutical products. Saying “the study found there was no difference” is simply lying. Even worse is saying “vaping does not work,” as if offering uninterested smokers a single cigalike is a measure of “vaping.” That conclusion is as absurd as “antibiotics do not work.” Yet that is how the usual dishonest commentators are spinning this study.
The other two interventions were not implemented quite as badly, but failed for a different reason. Participants were offered the pharmaceutical products and up to $600, getting paid for being abstinent from smoking at each of the three follow-up times. The difference between the two groups was that one was told they were being given money for each stage of abstinence, while the other was told that they had an account with the $600 in it, receivable at the end, but the sum would be reduced if they did not test smoking abstinent. These were functionally (pragmatically!) exactly the same payoff structure, but the latter tries to play on the tendency known as “loss aversion” or “the endowment effect,” which causes people to react more strongly to losing what they have than to forgoing an equivalent gain. Of course, a pretend account with $600 is not any more something they have than an equivalent potential to earn $600, but even this simple illusion of gains versus losses created a difference: 2.9% of the “account” group tested smoking-abstinent throughout follow-up, versus 2.0% of the “gains” group.
Before drawing the conclusion that the cash incentive was much more successful than the cigalike or pharmaceuticals alone, however, it is important to note (as the authors apparently failed to understand) that these payoffs create an incentive to cheat. The methods of the study made cheating easy: A subject needed only assert smoking abstinence and use their free NRT rather than smoking for a day or two before their scheduled biological tests. The NRT would explain the nicotine metabolites in their system, and the effects of carbon monoxide exposure on blood chemistry (the one marker that could distinguish smoking from NRT use) would be temporarily gone. Frankly it is a bit sad that not even a few percent of the subjects took advantage of this scam. Since these were beneficiaries of good employer-funded health insurance, $600 was probably not worth the effort for most of them. And again, many did not pay enough attention to the study to even think it through. If this intervention, which the authors proclaim is wonderfully “cost effective,” were attempted more widely, it seems safe to assume the scam would be more widely practiced.
In a feeble attempt to deal with the fact that few subjects actually paid any attention to the study, the authors defined an “engaged” group, the one-fifth of the subjects who ever logged onto the study website. The measured success rates were higher for this subgroup, of course. Any reader who pauses to think will accurately predict the rates were all approximately five times higher. “Engaging” is a necessary step on the path to being identified as a success; subjects who quit (or had already quit) without ever visiting the website presumably could not be identified or have their biological tests counted. Thus the subgroup includes all the successes but removes most of the denominator. It would have been useful to know what portion of the “subjects” even realized they were in a study and cared. But the authors apparently did not attempt to assess that, and could only identify the subset who went so far as to sign in.
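The fivefold inflation is purely mechanical, as a back-of-envelope sketch shows (the arm size of 1,200 is an assumption for illustration; the effect holds for any numbers where all counted successes fall in the engaged fifth):

```python
# Illustrative selection-bias arithmetic: if only one-fifth of an arm
# "engages" (ever logs onto the study website), but every *counted*
# success necessarily engaged, the engaged-subgroup success rate is
# automatically five times the whole-arm rate.
arm_size = 1200                 # assumed arm size for illustration
successes = 12                  # all of these must have engaged to be counted
engaged = arm_size // 5         # one-fifth of subjects ever logged on

rate_full = successes / arm_size       # whole-arm rate
rate_engaged = successes / engaged     # engaged-subgroup rate

print(f"whole arm: {rate_full:.1%}, engaged subgroup: {rate_engaged:.1%}")
print(f"inflation: {rate_engaged / rate_full:.0f}x")
# The subgroup keeps every success but drops 80% of the denominator,
# so the higher rate among the "engaged" is a selection artifact,
# not evidence the interventions worked better for attentive subjects.
```

Nothing about the interventions changes between the two rates; only the denominator does.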
What the authors did know, but chose to deceptively bury in a table in the supplemental appendix, is the portion of subjects in the e-cigarette arm who ordered the proffered cigalikes. It was 12%. It is bad enough when a supposed test of the effectiveness of vaping just hands smokers some poor-quality e-cigarettes, but in this case they did not even manage to hand out the product 88% of the time. This makes the small increase in abstinence in the e-cigarette group look better still. This was not a failure of vaping to help people quit smoking. It was a failure of the study methods to encourage smokers to try vaping.
The appendix material, which few readers will ever look at, offers a few additional insights.
Almost no one “failed” the biological tests. That is, almost everyone who bothered with the urine and blood tests did so because they knew they would pass. This is despite the fact that minor financial incentives were offered for getting the tests, further indicating that this particular population (unlike others) was not inclined to game the system for a few dollars. Overall, fewer than 10% of the subjects asked for their free NRT or cigalikes (which could have been sold or traded if unwanted), further evidence of the same. These relatively wealthy smokers could not be bothered to grab these freebies. This explains the low rate of gaming the system to get the $600 payoff, something that would change if these methods were expanded to people who do not shrug off a few hundred dollars.
The appendix also shows that the rate of abstinence at 12 months, unsurprisingly, dropped substantially from the 6-month mark at which success was measured. This is the universal pattern for weak smoking cessation interventions: temporary abstinence is not really quitting. It is a further argument in favor of offering vaping (with good products) or other low-risk alternatives as a satisfying permanent substitute. Sadly, the advantage of the cigalikes compared to pharmaceuticals disappeared at 12 months. Perhaps more of the temporary vapers would have stayed smoking-abstinent if they had been offered better products or advice, but that was not part of the study. Instead, some may have been permanently soured on the idea of vaping.
The cash payment arms dropped even more dramatically — not surprising since there was no money on the line at 12 months to either earn or cheat for — though they remained a bit higher than for the other interventions. Perhaps this means that 1% of smokers really are on the cusp of quitting, such that a focusing event is not enough but a $600 incentive pushes them over the edge. That might be an interesting result, but buried in this poorly conducted and reported study, it is hard to make much of it.
This report was published in The New England Journal of Medicine, which naive readers might take to be an endorsement of the quality. But when I was teaching, NEJM was always my go-to source for bad population-based or behavioral studies. It is a hugely profitable magazine, meaning that it is well-edited and the stupid sloppy problems with most health journals are stripped away, making plain the real scientific failures. I found it was safe to pick a random NEJM paper and assign my students to identify the fatal flaws; there were always fatal flaws. This study was definitely no exception.