Dr. OpenAI Lied to Me

— AI platform has great potential for use in medicine, but huge pitfalls, says Jeremy Faust, MD

by Emily Hutto, Associate Video Producer, January 20, 2023

Emily Hutto is an Associate Video Producer & Editor for MedPage Today. She is based in Manhattan.

In this video, Jeremy Faust, MD, editor-in-chief of MedPage Today, discusses the impressive but concerning results after using OpenAI to create medical charts and obtain diagnoses for hypothetical patients.

The following is a transcript of his remarks:

Hello, it's Jeremy Faust, medical editor-in-chief of MedPage Today. Thank you for joining me. I'm talking today about an Inside Medicine newsletter I wrote over at Substack called "."

OpenAI is an amazing platform, where you can sign up for free, which I did recently. It basically gives you a playground or a chat bot to play with, and you can ask it questions and have conversations. Essentially it's like Google, but it's got AI, and it can be more conversant and interactive. But I really wanted to see what it could do with medicine and medical charting, and I think you'll find the results are pretty astounding.

I actually asked OpenAI to write me an ER medical chart about a patient with a cough. I gave some demographic information, I said that the x-rays are negative, we did a COVID test but the results aren't back yet, vitals are normal; he's a smoker by the way, and he'll be safe for discharge: please write me that chart.

As you can see, it very quickly -- in about 8 seconds -- came up with this amazing medical chart that I think would pass muster. It's certainly better than a lot of medical students I've seen, because at some point you gotta go to med school. It looks like Dr. OpenAI, as I call it, has gone to medical school. So, a really lovely job. It comes up with an assessment and a plan based on what I told it. It even threw in a couple of little details like in addition to discharging the patient with further PCP [primary care physician] evaluation if needed, they should quit smoking and they should wear a mask until they get their COVID test back.

I just thought that was really great and amazing, and I was blown away.

I've messed around with this platform a lot now and I see some really impressive things about it and some concerning things. I want to walk you through what I did.

I wrote in medical jargon, as you can see, "35f no pmh, p/w cp which is pleuritic. She takes OCPs. What's the most likely diagnosis?"

Now of course, many of us who are in healthcare will know that means age 35, female, no past medical history, presents with chest pain which is pleuritic -- worse with breathing -- and she takes oral contraception pills. What's the most likely diagnosis? And OpenAI comes out with costochondritis, inflammation of the cartilage connecting the ribs to the breast bone. Then it says, and we'll come back to this: "Typically caused by trauma or overuse and is exacerbated by the use of oral contraceptive pills."

Now, this is impressive. First of all, everyone who read that prompt, 35, no past medical history with chest pain that's pleuritic, a lot of us are thinking, "Oh, a pulmonary embolism, a blood clot. That's what that is going to be." Because on the Boards, that's what that would be, right?

But in fact, OpenAI is correct. The most likely diagnosis is costochondritis -- because so many people have costochondritis, that the most common thing is that somebody has costochondritis with symptoms that happen to look a little bit like a classic pulmonary embolism. So OpenAI was quite literally correct, and I thought that was pretty neat.

But we'll come back to that oral contraceptive pill correlation, because that's not true. That's made up. And that's bothersome.

But I wanted to ask OpenAI a little more about this case. So I asked, "What's the ddx?" What's the differential diagnosis? It spit out the differential diagnosis, as you can see, led by costochondritis. It did include a rib fracture, pneumonia, but it also mentioned things like pulmonary embolism and pericarditis and other things. Pretty good differential diagnosis for the minimal information that I gave the computer.

Then I said to Dr. OpenAI, "What's the most important condition to rule out?" Which is different from what's the most likely diagnosis. What's the most dangerous condition I've got to worry about? And it very unequivocally said, pulmonary embolism. Because given this little mini clinical vignette, this is what we're thinking about, and it got it. I thought that was interesting.

I wanted to go back and ask OpenAI, what was that whole thing about costochondritis being made more likely by taking oral contraceptive pills? What's the evidence for that, please? Because I'd never heard of that. It's always possible there's something that I didn't see, or there's some bad study in the literature.

OpenAI came up with this study in the European Journal of Internal Medicine that was supposedly saying that. I went on Google and I couldn't find it. I went on PubMed and I couldn't find it. I asked OpenAI to give me a reference for that, and it spits out what looks like a reference. I look up that, and it's made up. That's not a real paper.

It took a real journal, the European Journal of Internal Medicine. It took the last names and first names, I think, of authors who have published in said journal. And it confabulated out of thin air a study that would apparently support this viewpoint.

It must have picked up the idea that if you look up pulmonary embolism on those webpages, whether it's a webpage on the CDC website or whether it's a webpage on the Mayo Clinic or whoever it might be, that OCPs, oral contraceptives, show up on the same page as chest pain causes. So it sort of started to figure out that maybe costochondritis and oral contraceptives are related, when in fact that's a red herring. It's really that people who are taking oral contraceptives have a higher risk of a pulmonary embolism, and those travel together on internet pages, and OpenAI got fooled.

But rather than admit that, I asked OpenAI for links and asked, are you sure you're not wrong? It stood its ground.

So, I was blown away by the accuracy of so much of what I did with the platform OpenAI, but I was also scared that it was willing to lie to me to make up something to support a contention that was not real.

So, we have to proceed with caution. This is an amazing tool, but it's also one that, as we all know, is going to set us up for some tricky situations.

I think you'll all enjoy playing around with the platform, and I'm curious what nooks and crannies you may find. I've been using it to teach my residents and students, and I've been using it to do shortcuts on things like running a few math problems, that kind of thing. There's great, huge potential and great, huge pitfalls.

Leave your questions and comments below. Thanks for joining us.