Making Observational Studies More Reliable

— Harlan Krumholz, MD, discusses a new path to less bias, more transparency

by Emily Hutto, Associate Video Producer July 17, 2022

In this video, Harlan Krumholz, MD, director of the Center for Outcomes Research & Evaluation at Yale University and Yale New Haven Hospital in Connecticut, discusses an innovation that could significantly reduce bias and confounding in observational studies.

The following is a transcript of his remarks:

My name is Harlan Krumholz. I'm a professor of medicine at Yale University and a cardiologist.

I'm here today to talk about a study that recently came out -- well, a description of a study that came out -- that's going to be looking at the comparative effectiveness of different approaches in diabetes.

But I want to talk for a minute just about observational studies. Observational studies: We can't live with them, we can't live without them. Look, a lot of the journals won't even let us make any causal inferences when we use observational studies. They make us say, "There's an association, there's not a causation associated with this study." We can't make firm conclusions. They shouldn't be used in order to make decisions that ought to be placed into practice.

And yet, a lot of our understanding of the way the world works is based on observational studies. There's never been a randomized trial of smoking, for example, and yet we accept as fact that smoking can cause lung cancer.

The problem is that observational studies are a great hodgepodge of methods and approaches and cause all sorts of different problems and are susceptible to great amounts of bias. There's a need for us to differentiate between very good observational studies and studies that represent a fishing expedition that are just putting out one result that actually doesn't really merit our confidence in the finding.

So a group, in disclosure, a group that I've had an opportunity to be part of, is trying to do things a different way. They've created a series of studies called LEGEND, leveraging large-scale, real-world observational studies to provide evidence on head-to-head comparisons of drugs. And in this case, the study I'm going to discuss is around diabetes.

LEGEND is part of the Observational Health Data Sciences and Informatics community, OHDSI, a group of people around the world who are working together to improve observational research and to produce knowledge that will help people to have better evidence and that evidence to generate better outcomes.

But like I said, we can't live without these observational studies. Why? Because we simply can't do enough randomized trials, enough experiments, to provide the evidence we need in clinical practice. There are too many different comparisons that are necessary; there are too many different patient types that we see.

Now, I would love a world where we do RCTs [randomized clinical trials] with more enthusiasm, that there are many more of them. We need to get to a world where it's easier to do these kinds of experiments and produce this kind of information. But I believe that there will always be a need to supplement those experiments with observational studies, which provide us evidence to fill in the cracks, to fill in those areas that we don't know about.

Honestly, right now, those aren't even cracks. The vast majority of decisions that are being made in clinical practice are being made in situations where there are no [randomized trials]. So we need to move towards reliable evidence generation.

One of the problems is that the traditional approach examines one comparison at a time. Researchers don't necessarily use appropriate methods to control for bias. And they modify the design, or the choice of comparators, often until they get a result that they think will impress the journals and maybe impress the field.

By the way, we've all been susceptible to this. I don't want to cast aspersions on anyone in particular. It's just to say that observational studies, because we don't register them in advance, tend to be done iteratively. Then ultimately, there is one approach with one result that ends up getting into the journal.

This notion of large-scale evidence generation across a network of databases is something different, where everything needs to be pre-specified. There's fixed design and dissemination of the results. There's a promise that no matter what is found, it's going to be pushed out. All the research questions are clearly articulated. The code is shared. The data is clear and transparent, and every single comparison is looked at. So there's not a chance for people to -- we talk about p-hacking, where people are looking at different comparisons and one of them turns out to be quite interesting and intriguing, and that becomes the focus of the study. In this case, we're talking about just showing everyone all the results. So if there could be 120 different comparisons, show them all. This isn't fishing. It's not fishing if all the results are shown. Everyone can see for themselves what's there.
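To make the multiple-comparisons point concrete, here is a minimal Python sketch of the arithmetic. The 120 comparisons come from the remarks above; the 0.05 significance threshold and the assumption that the comparisons are independent are illustrative choices, not details taken from the LEGEND protocol.

```python
# Illustrative arithmetic only. The alpha threshold and the independence
# of comparisons are assumptions made for this sketch.

def expected_false_positives(n_comparisons, alpha=0.05):
    """Expected number of 'significant' results if no true effect exists."""
    return n_comparisons * alpha


def prob_at_least_one_false_positive(n_comparisons, alpha=0.05):
    """Chance that at least one comparison looks significant by chance alone,
    assuming the comparisons are independent."""
    return 1.0 - (1.0 - alpha) ** n_comparisons


n = 120  # the number of comparisons mentioned in the remarks
print("Expected chance findings:", expected_false_positives(n))
print("Probability of at least one:", prob_at_least_one_false_positive(n))
```

Under these assumptions, a handful of the 120 comparisons would be expected to look significant even if no drug differed from another, which is why reporting only the most striking one is misleading and reporting all of them is not.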

Then, it's basically using best practices to try to mitigate any sources of confounding, all sources of bias, and try to make this work. And then there are even negative controls: You pick outcomes that you don't think would be affected by the exposure in the research question and see whether any of those come out positive. Is that showing you that maybe what you found was found by chance? So this is kind of a new way of doing observational research: more disciplined, pre-specified, more open, and with a commitment to share all the comparisons.
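The negative-control idea can be sketched in a few lines of Python. This is a simplified check under stated assumptions, not the procedure the LEGEND studies use: the function names are hypothetical, the estimates are made-up numbers for illustration, and a 95% confidence interval is assumed.

```python
# Simplified negative-control check. All estimates below are invented for
# illustration; they are not results from any study.

def ci_excludes_null(log_hr, se, z=1.96):
    """True if the 95% confidence interval for a log hazard ratio excludes 0
    (that is, the hazard ratio interval excludes 1)."""
    lower = log_hr - z * se
    upper = log_hr + z * se
    return lower > 0.0 or upper < 0.0


def negative_control_positive_rate(estimates):
    """Fraction of negative-control outcomes that look 'significant' even
    though no true effect is expected for any of them."""
    flagged = sum(ci_excludes_null(log_hr, se) for log_hr, se in estimates)
    return flagged / len(estimates)


# Hypothetical negative controls: (log hazard ratio, standard error)
negative_controls = [
    (0.02, 0.10), (-0.05, 0.12), (0.30, 0.10), (0.08, 0.09),
    (-0.01, 0.15), (0.25, 0.11), (0.04, 0.08), (-0.10, 0.10),
]

rate = negative_control_positive_rate(negative_controls)
print(f"Negative controls flagged as significant: {rate:.0%}")
```

If far more than about 5% of negative controls come out positive, that suggests residual bias or confounding in the design rather than chance alone, and the main findings deserve correspondingly less confidence.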

Again, many of us think this is going to represent a big advance. Rohan Khera published a protocol paper that I think is worthy of your attention, because it lays this out for a study of a network of databases for type 2 diabetes comparisons. This is going to be a multinational comparison with large data to make comparisons particularly across SGLT2 inhibitors and GLP-1 agonists, as well as several other traditional treatments, in order to make inferences about whether in the real world there's evidence of the benefit that we've seen in trials. What about populations that go beyond those in trials? What about safety when these drugs are used in the real world?

It's going to employ all of these different methods to really strengthen observational studies and minimize the bias. It's going to look at a whole range of agents within the class. So often we're just looking at one drug within a class and trying to make a generalization across the whole class. It's going to be very clear about the eligibility, the exposure, what kind of adjustments are made within this study.

It's in BMJ Open. If you check this out, you will see the wide range of databases that are going to be used, again, to see whether there's consistency across these databases, and a large number of strategies to control and organize the data. Then the number of outcomes is going to be large.

Again, some people will say, "Well, aren't you fishing? Aren't you just looking for what the positive result is?" It's not fishing if all the results are going to be shared back. You're going to see, if there's just one result that's impressive, but none of the others are and they're not consistent across, you'll see it. Sometimes in a usual study, people will take that one result and that'll be the centerpiece of the study. In studies like this, there's a commitment to share all the information.

So, I think we're on the cusp of a different era in observational studies. And I'm saying, if you're going to read an observational study, read the methods. A fishing expedition, something that hasn't really minimized bias, is very, very different from something that is taking a systematic, comprehensive approach and is going to share all the data back.

We ought not to be thinking about observational studies with a broad brush, but what we should be doing is differentiating, determining the ones that are worthy of our attention and ones that aren't. Then saying, how strong is this evidence? Some evidence will be very strong. Some will be very weak.

Then, ultimately, we need to be able to do this to fill in those areas where we don't have RCTs. And we need this kind of evidence from observational studies to ensure that we can practice medicine that's scientifically based, that has the best possible evidence behind it, so that we can share that with our patients, and together we can make the best decisions possible.

Emily Hutto is an Associate Video Producer & Editor for 名媛直播. She is based in Manhattan.
