
The Shibboleth: The Hidden Crux of the Professional Interview

— Milton Packer recalls when he was asked to prove his worthiness

MedpageToday
A male physician shakes the hand of another physician during an interview

I have been analyzing clinical research data for 45 years, and I have become adept at understanding their strengths and limitations. The principles that govern the unbiased analysis of information take a long time to comprehend and master.

I learned how to analyze data by listening to thousands of discussions among highly knowledgeable people. But most of what I know came from my own experience, which informed me not only about the reasoning processes that worked but also those that did not. I have had the opportunity and privilege to analyze hundreds of clinical research studies, and I applied this experience to the trials that I designed and led.

The ultimate goal of analyzing data is to discern whether an apparent pattern represents a replicable truth -- or alternatively, whether it reflects an intriguing coincidence, which needs to be set aside or tested further.

The principles of data analysis cannot be memorized, and they cannot be approximated by the application of statistics. Statistics is merely a methodological tool; in contrast, effective data analysis represents a way of disciplined thinking.

Here is just one example. Many physicians believe that we distinguish truth from chance in the analysis of clinical trials by calculating P values. This is false, yet the falsehood is so deeply entrenched in our thinking that it represents the single greatest impediment to understanding the results of clinical research studies.

Most of the P values that we calculate in clinical research do not mean what many of us think they mean. Many P values less than 0.05 are due to the play of chance, and many P values greater than 0.05 reflect true findings of great value.

Sadly, many physicians, epidemiologists, and clinical researchers do not understand this. They view a P value of 0.05 as representing the "holy grail." So they do everything they possibly can to achieve some result with a P value of less than 0.05.

The easiest way to achieve some result with a P value of less than 0.05 is to simply analyze data to the point of exhaustion. If you slice and dice the data enough times, you will certainly find a difference that is accompanied by a P<0.05. These massive fishing expeditions may yield a few tiny fish, but they are destined to rot quickly.
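The arithmetic behind such fishing expeditions can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions that are mine, not drawn from any particular trial: every comparison is independent, no true effect exists anywhere, and P values under the null are uniformly distributed between 0 and 1. The function names are hypothetical.

```python
import random


def chance_of_false_positive(num_tests, alpha=0.05):
    """Probability that at least one of num_tests independent comparisons
    with no true effect yields P < alpha."""
    return 1.0 - (1.0 - alpha) ** num_tests


def simulate_fishing(num_tests, num_datasets=10_000, alpha=0.05, seed=0):
    """Fraction of simulated null datasets in which at least one of
    num_tests comparisons yields P < alpha.

    Assumption: under a true null hypothesis each P value is uniform on
    [0, 1], so each one is drawn directly instead of simulating patients.
    """
    rng = random.Random(seed)
    hits = 0
    for _ in range(num_datasets):
        if any(rng.random() < alpha for _ in range(num_tests)):
            hits += 1
    return hits / num_datasets


if __name__ == "__main__":
    for k in (1, 5, 20, 100):
        exact = chance_of_false_positive(k)
        simulated = simulate_fishing(k)
        print(f"{k:>3} comparisons: exact {exact:.3f}, simulated {simulated:.3f}")
```

Under these assumptions, 20 looks at data containing no real effect produce at least one P<0.05 roughly 64% of the time, and 100 looks make it a near certainty. Real subgroup analyses are usually correlated, so the exact figures will differ, but the direction of the problem does not.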

Conversely, many trials reveal distinctive patterns that are highly replicable. In many instances, the replication is evident within the trial itself. But in other instances, the patterns seen in a single trial are strongly reinforced by nearly identical patterns seen in other trials. The ongoing drumbeat of replication reveals a truth, even if none of the individual analyses yield a P value of less than 0.05.
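A rough numerical sketch shows how consistent findings can add up even when no single analysis crosses the conventional threshold. The three trials below are entirely hypothetical (invented log hazard ratios and standard errors), and the fixed-effect inverse-variance pooling is one common textbook approach chosen for illustration, not a description of how any particular dataset was analyzed. The pooled number is no substitute for judging whether the trials are truly comparable.

```python
import math


def two_sided_p(z):
    """Two-sided P value for a standard normal test statistic z."""
    return math.erfc(abs(z) / math.sqrt(2.0))


def pool_fixed_effect(estimates, std_errors):
    """Inverse-variance (fixed-effect) pooled estimate and its standard error."""
    weights = [1.0 / se ** 2 for se in std_errors]
    total = sum(weights)
    pooled = sum(w * e for w, e in zip(weights, estimates)) / total
    return pooled, math.sqrt(1.0 / total)


# Hypothetical data: (log hazard ratio, standard error) for three trials.
HYPOTHETICAL_TRIALS = [(-0.18, 0.12), (-0.22, 0.13), (-0.20, 0.11)]

if __name__ == "__main__":
    for i, (est, se) in enumerate(HYPOTHETICAL_TRIALS, start=1):
        print(f"trial {i}: estimate {est:+.2f}, P = {two_sided_p(est / se):.3f}")

    estimates = [est for est, _ in HYPOTHETICAL_TRIALS]
    std_errors = [se for _, se in HYPOTHETICAL_TRIALS]
    pooled, pooled_se = pool_fixed_effect(estimates, std_errors)
    low, high = pooled - 1.96 * pooled_se, pooled + 1.96 * pooled_se
    print(f"pooled: estimate {pooled:+.3f} (95% CI {low:+.3f} to {high:+.3f}), "
          f"P = {two_sided_p(pooled / pooled_se):.4f}")
```

With these invented numbers, each trial taken alone has a P value above 0.05, yet the three nearly identical estimates together give an interval that excludes no effect. The persuasive part is the consistency of the pattern across trials, which the pooled figure merely summarizes.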

In truth, the selection of 0.05 as a threshold for a P value has a checkered history. There is nothing magical or important about a false positive error rate of 1 in 20. The P value of 0.05 was driven by the egocentricity of Ronald Fisher, the father of modern statistics, who sought to challenge the work of archcompetitors who had advocated different approaches to the interpretation of P values.

Interestingly, Ronald Fisher was primarily known as a geneticist, who appears to have developed statistical methods primarily as a means of supporting his strong racist views and his . He to his doctrinaire principles. Yet, his own advocacy of a threshold of 0.05 was .

In truth, P values were never designed to represent a decision-making tool. Most physicians have great difficulty understanding that, and amazingly, many statisticians also struggle with the real meaning of a P value. But recently, that they would like P values to simply disappear.

Unfortunately, discerning meaningful patterns in the data takes a lot of work. It requires a real understanding of the clinical question, the study design, and the methods used to collect the information. Using P values as a decision tool represents an ineffective short cut to avoid all of this intellectual effort.

A nominal P value is a finding of unearned convenience, or as Ralph Waldo Emerson wrote, "a foolish consistency [that] is the hobgoblin of little minds." By relying on it, "a great soul has simply nothing to do. He may as well concern himself with his shadow on the wall."

About 15 years ago, my view of P values was directly tested in a surprising manner. A series of clinical trials had yielded data that were exceptionally difficult to understand. The sponsor of the trials could not make sense of them, and thus, it decided to invite two outside experts to review the data and make recommendations. We would be provided with unlimited access to the raw data and could perform any analyses that we wished. However, for us to have any chance of success, the two of us needed to be able to work together with a high level of effectiveness, since our two complementary skillsets were considered essential to the success of the project.

I was asked to be one of the experts. The other invited expert was a world-famous statistician, who was known for his undisputed brilliance in mathematical methods. He had written hundreds of landmark papers in statistics, whereas I had never taken a single course in statistics. If the prerequisite for my participation had been that I had training in statistical methods, I would have been eliminated from contention immediately. But I had been asked to become involved not because of my mathematical skills, but because of my experience in framing the analysis of clinical data in order to facilitate their interpretability.

The master statistician had never met me and did not know anything about me, and he was very skeptical that a clinical cardiologist could meet his exacting standards. So to determine if the relationship would work, the master statistician suggested that he and I meet one-on-one in a small room for 5 minutes, during which time he would ask me one question. Based on my answer, he would decide whether he and I could work together.

Essentially, the master statistician had devised a "shibboleth," a test that would determine if he and I were philosophically of the same mind.

The term comes from a famous biblical story (Book of Judges, chapter 12), which recounts the battle between the Ephraimites and Gileadites. The Ephraimites had invaded Gilead, but the Gileadites (who spoke Hebrew) had cut the Ephraimites off from their home base. When the Ephraimites sought to cross the Jordan River to return home, each was asked to pronounce the word "shibboleth." The "sh" sound did not exist in the Ephraimite dialect, and thus, the Ephraimites pronounced the word in a way that, to Gileadites, sounded like "sibboleth." According to Judges 12:6 in the King James Version, "Say now Shibboleth: and he said Sibboleth: for he could not frame to pronounce it right. Then they took him, and slew him at the passages of Jordan: and there fell at that time of the Ephraimites forty and two thousand." Shibboleth has come to refer to a password or belief that identifies people with a common mind.

My shibboleth was intended to reveal my analytical philosophy. During our meeting, the master statistician looked deeply into my eyes and asked:

"Dr. Packer, what does a P value mean to you?"

The question was an exceptionally clever device. He was not asking for a definition. There was no possibility of a right or wrong answer. He was asking me to define my philosophical temperament, in a way that could not be practiced or rehearsed.

I had no idea what he was looking for. I had read several of his papers, but I could not recall anything he had ever written that would tell me what he might be thinking. I had only my own sense of self to guide me in my response.

I also had a strong sense that a long-winded answer would serve me poorly. I needed to crystallize my entire approach to data analysis in as few words as possible. And so I answered:

"A P value is an artificial device for those who are not willing to make a real effort to think about the data and to understand what it does or does not reveal. I do not know how you feel about P values. But regardless of your views, I can assure you that I hate them more than you do."

He smiled and extended his hand to mine. Over the next four years, the two of us worked hand-in-hand, and together, we solved a puzzle that many thought would be impossible to resolve.

Like it or not, many professional interviews in medicine have their shibboleths. They are an ancient form of a password, but not in a form that you can learn, memorize, or repeat on command. They represent , because they are intended to reveal who you really are and how you really think.

If you think that shibboleths are unfair, you may be right, but you are missing the whole point of why they exist. They do not exist to determine if a person is qualified. They exist entirely to determine if a relationship (if pursued) is likely to work.

If you fail your shibboleth, do not fret. Groucho Marx famously said that he would refuse to join any club that would have him as a member.

Disclosures

Packer has recently consulted for Amarin, AstraZeneca, Boehringer Ingelheim, Novartis, and Relypsa on issues unrelated to COVID-19. Novartis is one of several companies that manufacture hydroxychloroquine, and is conducting clinical trials with the drug for COVID-19. Packer has no financial relationship with Gilead Sciences, which is developing remdesivir for COVID-19.