Not all evidence carries the same weight. A 2,000-year history of traditional use tells you something. A randomized controlled trial published last year tells you something else. Understanding where different types of evidence sit on the evidence pyramid, and what each layer can and can’t tell you, is fundamental to making informed decisions about supplements and herbal medicine.

The pyramid, from bottom to top

Traditional use. Centuries or millennia of traditional use is real information. It tells you that generations of practitioners found a substance useful enough to keep teaching its use, and that it’s unlikely to be acutely toxic at typical doses. What traditional use does not tell you is whether the substance works for a specific condition by modern diagnostic standards, whether it works better than placebo, or what dose and preparation are optimal. Traditional use is a reason to study something. It is not proof that it works.

Turmeric is the illustrative case. It has been used in Ayurvedic medicine for roughly 4,000 years, primarily for digestive complaints, wound healing, and inflammatory conditions. That long history is meaningful. It tells you turmeric at culinary doses is safe for most people, and that many generations of practitioners found it useful enough to return to. But it doesn’t tell you whether curcumin at 500 mg twice daily reduces osteoarthritis pain better than acetaminophen. That requires clinical trials.

The trials that exist show a mixed but directionally positive picture. A 2016 meta-analysis in the Journal of Medicinal Food pooled data from eight randomized controlled trials of turmeric extracts for arthritis and found statistically significant improvements in pain and physical function compared to placebo. The effect sizes were comparable to some NSAIDs in the included trials. A 2021 systematic review in BMJ Open Sport and Exercise Medicine found that curcumin reduced pain and improved function in knee osteoarthritis, though heterogeneity was high: some trials showed substantial benefit, others showed none.

The catch is absorption. Curcumin is poorly absorbed on its own. Most of what you swallow is metabolized before reaching the bloodstream. Supplement manufacturers address this by combining curcumin with piperine from black pepper, which can increase absorption twentyfold, or by creating phospholipid complexes and nanoparticle formulations. The clinical trial results depend heavily on which formulation was tested. The turmeric a traditional Ayurvedic practitioner would have used (ground rhizome in food) delivers minimal systemic curcumin. The traditional preparation hasn’t been tested in large RCTs for any condition. This is the bridge between traditional use and modern evidence: tradition provides the hypothesis, and research tests it, sometimes confirming parts of it and often adding nuance that tradition alone couldn’t supply.

Mechanistic studies. Laboratory research explains how a substance might work at the molecular level. Curcumin inhibits COX-2 and suppresses NF-κB, a master regulator of inflammation. Resveratrol activates sirtuins, proteins associated with longevity in simple organisms. Berberine activates AMPK, the same metabolic sensor engaged by metformin and by exercise. These mechanisms are real, and they provide biological plausibility: a reason to think the substance might do something useful in humans.

A plausible mechanism is not a clinical result. Biology operates through networks of interacting pathways, not single switches. Resveratrol is the cautionary example. It attracted enormous scientific and media attention after lifespan-extension studies in yeast, worms, and fish. The mechanism was intriguing: resveratrol activated sirtuins, which appeared to mimic some effects of caloric restriction. But when researchers tested resveratrol in human trials, the results underwhelmed. A 2014 randomized trial of resveratrol in healthy older adults, published in Cell Metabolism, found no significant effects on metabolic function, inflammatory markers, or any longevity-related endpoint. A separate trial in obese men found no improvement in insulin sensitivity or lipid profiles. The mechanism was genuine. The translation to human health was not straightforward.

Berberine tells a somewhat more successful story. The AMPK activation mechanism suggested metabolic benefits, and multiple RCTs have confirmed that berberine lowers blood glucose and improves lipid profiles. A 2015 meta-analysis in the Journal of Ethnopharmacology that pooled 27 randomized trials found berberine reduced HbA1c by roughly 0.7 percentage points, comparable to some first-line diabetes medications. The mechanism predicted the clinical effect. It doesn’t always.

Observational studies. These examine populations and find statistical associations. People who drink green tea have lower rates of certain cancers. People with higher vitamin D blood levels have fewer cardiovascular events. People who eat more fish have less cognitive decline. These associations are useful for generating hypotheses. They’re poor evidence of causation because they’re shot through with confounding.

The “healthy user effect” is the most relevant confounder for supplement research. People who take supplements systematically differ from people who don’t. They exercise more, smoke less, eat more vegetables, drink less alcohol, and have higher incomes and more access to healthcare. When an observational study finds that supplement users have better health outcomes, the supplement may not be the cause. It may simply be a marker of the kind of person who takes supplements.

Vitamin E is the classic case study. Observational studies throughout the 1990s repeatedly found that people with higher vitamin E intake had lower rates of cardiovascular disease and certain cancers. The biological plausibility was strong: vitamin E is a fat-soluble antioxidant, and oxidative damage to LDL cholesterol contributes to atherosclerosis. Supplement sales surged. But when the large randomized trials were completed, including the HOPE trial (9,541 patients, 2000), the Women’s Health Study (39,876 women, 2005), and the Physicians’ Health Study II (14,641 physicians, 2008), none found cardiovascular benefit. The SELECT trial of 35,533 men was stopped early in 2008 for lack of benefit, and extended follow-up published in JAMA in 2011 found that vitamin E increased prostate cancer risk by 17%. The observational signal was robust. It was also spurious. The people taking vitamin E were healthier for reasons unrelated to the vitamin.
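The healthy user effect can be made concrete with a little arithmetic. The Python sketch below builds an invented cohort in which the supplement does nothing at all, yet supplement users still look healthier in a naive comparison because people with healthy lifestyles are more likely to take it. Every number, name, and group here is a hypothetical chosen for illustration, not data from any study mentioned above.

```python
# Hypothetical cohort in which the supplement has NO effect on disease risk.
# All numbers are invented for illustration.
SUPPLEMENT_RISK_MULTIPLIER = 1.0  # 1.0 = supplement does nothing

# lifestyle group: (people, share who take the supplement, baseline disease risk)
COHORT = {
    "healthy lifestyle": (10_000, 0.60, 0.05),
    "unhealthy lifestyle": (10_000, 0.20, 0.15),
}


def group_counts(people, share_users, baseline_risk):
    """Return (users, user_cases, nonusers, nonuser_cases) for one group."""
    users = people * share_users
    nonusers = people - users
    user_cases = users * baseline_risk * SUPPLEMENT_RISK_MULTIPLIER
    nonuser_cases = nonusers * baseline_risk
    return users, user_cases, nonusers, nonuser_cases


def crude_risk_ratio(cohort):
    """Risk in users / risk in non-users, ignoring lifestyle (the naive comparison)."""
    totals = [0.0, 0.0, 0.0, 0.0]
    for group in cohort.values():
        for i, value in enumerate(group_counts(*group)):
            totals[i] += value
    users, user_cases, nonusers, nonuser_cases = totals
    return (user_cases / users) / (nonuser_cases / nonusers)


def stratified_risk_ratios(cohort):
    """Risk ratio within each lifestyle group, comparing like with like."""
    ratios = {}
    for name, group in cohort.items():
        users, user_cases, nonusers, nonuser_cases = group_counts(*group)
        ratios[name] = (user_cases / users) / (nonuser_cases / nonusers)
    return ratios


print(f"crude risk ratio: {crude_risk_ratio(COHORT):.2f}")
for name, ratio in stratified_risk_ratios(COHORT).items():
    print(f"{name}: {ratio:.2f}")
```

With these made-up inputs the crude risk ratio comes out near 0.64, which reads like a 36% risk reduction, while the ratio inside each lifestyle group is 1.00. Randomization aims to remove exactly this kind of imbalance, including for confounders nobody thought to measure.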

Randomized controlled trials. The RCT is the gold standard for determining whether a treatment works. By randomly assigning participants to treatment or placebo and blinding both participants and investigators to the assignment, RCTs control for the placebo effect, expectation bias, and both measured and unmeasured confounding variables. A well-conducted RCT provides the strongest single piece of evidence for or against efficacy.

Echinacea illustrates how RCT evidence evolves, and how it can remain stubbornly inconclusive. Echinacea has centuries of traditional use by Native American healers for infections and wounds. It became one of the most popular supplements in the United States for preventing and treating colds. But the RCT evidence has been persistently inconsistent.

A 2007 meta-analysis in The Lancet Infectious Diseases pooled data from 14 studies and reported that echinacea reduced cold incidence by 58% and cold duration by 1.4 days. The authors emphasized, however, that the included trials used different echinacea species (E. purpurea, E. angustifolia, E. pallida), different plant parts (root, aerial parts, whole plant), and different extraction methods. It was impossible to know which specific product produced the effect because the products weren’t the same product.

A 2014 Cochrane review of 24 trials found that some echinacea preparations might have a modest preventive effect, while others showed no benefit at all. A large, well-designed 2005 trial published in the New England Journal of Medicine found no effect of an E. angustifolia root extract on either cold prevention or symptom severity. The echinacea evidence doesn’t say the herb doesn’t work. It says the evidence is genuinely mixed, the effect, if real, is probably small, and the product variability is so extreme that one bottle of echinacea is not pharmacologically equivalent to another. The echinacea you buy at the pharmacy may not be the same echinacea that showed a signal in a positive trial.

Systematic reviews and meta-analyses. These pool data from multiple RCTs, increasing statistical power and providing a more reliable estimate of effect size than any single trial. They also examine consistency: does the intervention work across different populations, doses, and study designs, or does it only work in one specific context? A Cochrane review represents the most rigorous form of systematic review, with standardized methodology and conflict-of-interest policies that exclude authors with financial ties to the products being reviewed.
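As a rough sketch of what pooling means in practice, the Python snippet below applies inverse-variance (fixed-effect) weighting to four invented trial results and computes the I² heterogeneity statistic. The effect sizes and standard errors are hypothetical, not drawn from any review cited here, and published reviews often use random-effects models and more careful methods than this.

```python
import math


def pool_fixed_effect(effects, std_errors):
    """Inverse-variance fixed-effect pooling.

    Returns (pooled_effect, pooled_std_error, i_squared_percent).
    """
    weights = [1.0 / se**2 for se in std_errors]  # more precise trials count more
    total_weight = sum(weights)
    pooled = sum(w * e for w, e in zip(weights, effects)) / total_weight
    pooled_se = math.sqrt(1.0 / total_weight)

    # Cochran's Q: weighted squared deviations of each trial from the pooled effect.
    q = sum(w * (e - pooled) ** 2 for w, e in zip(weights, effects))
    degrees_of_freedom = len(effects) - 1
    # I^2: share of total variation attributed to between-trial differences.
    i_squared = max(0.0, (q - degrees_of_freedom) / q) * 100 if q > 0 else 0.0
    return pooled, pooled_se, i_squared


# Hypothetical standardized mean differences in pain score (negative = less pain)
trial_effects = [-0.50, -0.10, -0.60, 0.00]
trial_std_errors = [0.20, 0.10, 0.25, 0.15]

pooled, pooled_se, i_squared = pool_fixed_effect(trial_effects, trial_std_errors)
print(f"pooled effect: {pooled:.2f} (SE {pooled_se:.2f}), I^2: {i_squared:.0f}%")
```

In this toy example the pooled standard error is smaller than that of any single trial, which is the gain in statistical power, while an I² of roughly 60% flags that the trials disagree with each other more than chance alone would explain.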

When a Cochrane review concludes that a supplement has evidence of benefit, that conclusion carries real weight. Examples include ginger for pregnancy-related nausea (a 2014 Cochrane review of 12 trials found ginger significantly reduced nausea compared to placebo), probiotics for prevention of antibiotic-associated diarrhea (a 2019 review of 39 trials found a 60% relative risk reduction), and cranberry for prevention of recurrent urinary tract infections (a 2012 review of 24 trials found no significant reduction in UTI recurrence, but a 2023 update that added newer trials found that cranberry products reduced UTI risk in women with recurrent infections).

Clinical practice guidelines. When a major medical organization evaluates the full body of evidence and issues a treatment recommendation, that represents the highest tier of evidence synthesis. Examples include the American Academy of Neurology recommending magnesium, riboflavin, and coenzyme Q10 for migraine prevention, and the American College of Gastroenterology recommending peppermint oil for irritable bowel syndrome. These guideline-level recommendations reflect systematic review by experts who have weighed the full evidence base.

Where most supplements fall

Most supplements sit in the lower half of the pyramid. They have traditional use and mechanistic plausibility. Some have observational data. A smaller number have been tested in one or two RCTs. Only a fraction (magnesium for migraine, ginger for nausea, probiotics for specific GI conditions, fish oil for hypertriglyceridemia) have accumulated enough high-quality evidence to appear in clinical practice guidelines.

This is not a judgment that supplements without RCT evidence are worthless. It is a statement about the state of the evidence. A person who takes turmeric for arthritis pain based on traditional use and mixed clinical trial data is making a different kind of decision than someone taking magnesium for migraine prevention where the evidence is strong enough for neurologists to include it in their practice recommendations. Both decisions are defensible. They’re not equivalent.

The “but people have used this for centuries” argument

This is the most common defense of supplements with weak clinical evidence, and it deserves a serious answer rather than dismissal. The argument conflates two different questions: is something safe, and does it work for a specific purpose.

Centuries of traditional use is reasonable evidence of safety at traditional doses and with traditional preparations. If millions of people consumed a substance over hundreds of years without documented serious harm, it’s unlikely to be acutely toxic. This is genuinely useful information. What traditional use is not, despite the intuition that it should be, is evidence of efficacy for a specific condition by modern diagnostic standards.

Aspirin is the instructive counterexample. Willow bark was used for pain and fever for thousands of years, documented by Hippocrates in the 5th century BCE. The active compound, salicin, was isolated in 1828. Acetylsalicylic acid was synthesized by Bayer in 1897. When tested in clinical trials, aspirin worked: for pain, for fever, and for heart attack prevention, with effect sizes that are large and reproducible. The traditional use pointed in the right direction.

For every aspirin there are dozens of traditional remedies that didn’t survive modern testing. Bloodletting was practiced for over 2,000 years based on humoral theory. Mercury was used for syphilis into the 20th century. Countless herbal preparations turned out to be inert, toxic, or effective only at doses that also caused unacceptable side effects. Long use doesn’t guarantee effectiveness. It guarantees that a lot of people believed something for a long time, which is a different thing.

The appropriate posture toward traditional use is respectful but not deferential. This has been used for a long time, which means it’s worth studying. What do the studies say?

The safety exception

The evidence pyramid works differently for safety than for efficacy. Long traditional use provides meaningful safety data because very large populations over very long time periods are a reasonable screen for acute toxicity. But traditional use has blind spots. It can miss rare adverse events: if a reaction occurs in 1 in 10,000 people, it takes a lot of exposures to detect. It can miss long-term effects that develop slowly. And it completely misses interactions with modern medications that didn’t exist when the traditional use was established.
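To put a number on "a lot of exposures," the short Python sketch below computes the chance that a 1-in-10,000 reaction shows up at least once among a given number of people, assuming each exposure is independent. The exposure counts are arbitrary illustrations, and seeing a case is only a first step: someone still has to connect it to the substance, which is harder without systematic reporting.

```python
import math


def prob_at_least_one_event(event_rate, exposures):
    """Chance that at least one adverse event occurs among independent exposures."""
    return 1.0 - (1.0 - event_rate) ** exposures


def exposures_for_detection(event_rate, confidence=0.95):
    """Smallest number of exposures giving the requested chance of >= 1 event."""
    return math.ceil(math.log(1.0 - confidence) / math.log(1.0 - event_rate))


RATE = 1 / 10_000  # the 1-in-10,000 reaction from the text

for people in (500, 5_000, 50_000):  # hypothetical exposure counts
    chance = prob_at_least_one_event(RATE, people)
    print(f"{people:>6} people exposed: {chance:.1%} chance of seeing any case")

print("exposures needed for 95%:", exposures_for_detection(RATE))
```

Under these assumptions a 500-person group has only about a 5% chance of containing even one case, and just under 30,000 exposures are needed before a case is 95% likely to appear at all.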

Kava kava is the instructive safety case. Kava was used ceremonially and medicinally in the South Pacific for centuries. Water-based extracts of the root were consumed socially and for anxiety without documented liver toxicity. But when kava was introduced to Western markets in the 1990s, concentrated extracts (often made with acetone or ethanol rather than water, and sometimes using stems and leaves in addition to the root) were associated with roughly 100 cases of severe liver injury, including several requiring liver transplantation. The traditional preparation was apparently safe. The commercial preparation, different in extraction method and plant parts, was not. Safety data from traditional use didn’t transfer to a different product even though it contained the same herb.

Bottom line

When you encounter a supplement claim, locate it on the evidence pyramid. “There’s a plausible mechanism” is a starting point, not an endpoint. “There’s a 500-person randomized trial” is actual evidence. “There’s a Cochrane review showing consistent benefit across multiple independent trials” is as close to certainty as supplement evidence gets. Know which one you’re looking at. And understand that most supplements operate at the lower levels of the pyramid, not because they don’t work, but because the rigorous studies simply haven’t been conducted.