03/27/26

Move over, vibe-coding. Vibe-proving is here for math

When ChatGPT first came onto the scene, it wowed users with its writing abilities, but drew laughs for generating images of seven-fingered hands and struggling with basic math, where 2+2 didn’t always equal 4. But more recently, things have changed: Google and OpenAI’s models bagged gold medals at the International Mathematical Olympiad last year, and now some experts say AI could pose an existential threat to the field of mathematics itself.

Mathematicians Emily Riehl and Daniel Litt join Host Flora Lichtman to explore how this technology could change the way math discoveries are made—and what could be lost if things go too far.



Segment Guests

Emily Riehl

Emily Riehl is a professor in the Department of Mathematics at Johns Hopkins University. She’s based in Baltimore, Maryland.

Daniel Litt

Dr. Daniel Litt is an associate professor of mathematics at the University of Toronto.

Segment Transcript

[MUSIC PLAYING] FLORA LICHTMAN: Hey, I’m Flora Lichtman, and you’re listening to Science Friday. When ChatGPT came out a few years ago, one of its most impressive features was its writing abilities. But something it was often ridiculed for, you may remember, other than making images of seven-fingered hands, was its inability to do basic math. 2 plus 2 did not always equal 4. But recently, things have changed.

Last year, Google and OpenAI’s models bagged gold medals at the International Mathematical Olympiad, edging out some of the best high school math athletes in the world. And now some experts say AI is getting so good, it could pose an existential threat to the field. Here with a perspective are two mathematicians who have thought a lot about this: Dr. Emily Riehl from Johns Hopkins University, and Dr. Daniel Litt from the University of Toronto. Emily, Daniel, welcome to Science Friday.

DR. DANIEL LITT: Hey.

DR. EMILY RIEHL: Thank you.

FLORA LICHTMAN: All right. We have seen, at least in the nerdy media that I consume, these very big claims that AI suddenly is revolutionizing math. It’s fundamentally changing what it means to be a mathematician. I want your takes on that. Emily, let’s start with you.

DR. EMILY RIEHL: I think from the perspective of a professional mathematician, we’re trying to evaluate where AI is on the trajectory of a mathematician’s life. We first encounter mathematics at school, where you’re solving problems that have a numerical answer, maybe involving some geometric figures, maybe involving some arithmetic or some algebra. As you mentioned already, AIs used to be very bad at those problems, and they’re much better now.

More recently, AIs have been doing well with more advanced contest-level problems, contests for undergraduate mathematics majors, where they’re writing proofs, and what’s exciting and perhaps scary in this past year is they’re starting to be tested on more research-level problems, so problems that are of interest to people who get paid to do mathematics.

FLORA LICHTMAN: Daniel, what do you think?

DR. DANIEL LITT: Yeah, so the specific claim that AI tools are revolutionizing mathematics definitely has not come true yet. I think there are leading indicators that it will be very significant. It’s definitely changing the way professional mathematicians, including me, work. But in the cases Emily mentioned of AIs being tested on research-level mathematics, there are a few success stories. I think those are exciting leading indicators.

There’s not any example of anything that I would consider to be revolutionary yet. Who knows what the next year or two will bring? So I’m excited about that.

DR. EMILY RIEHL: And I think an important part of the story is, mathematics is not just about solving problems. Even research mathematics is not just about solving problems. Solving problems can be beautiful, but actually stating the problems, figuring out what is an interesting theorem to try and prove is equally important– really what drives the field forward. And we haven’t seen any examples there where AIs are coming up with problem statements or new mathematical universes to explore.

FLORA LICHTMAN: Like picking the things that we should be trying to solve.

DR. EMILY RIEHL: Yes.

FLORA LICHTMAN: Is AI solving problems that people haven’t been able to solve? Has that happened? Does it seem like it’s on the horizon?

DR. EMILY RIEHL: One way to answer that is that there is so much math that still needs to be discovered, and there are a lot of mathematicians working together and individually all around the world, trying to make new discoveries and prove new theorems. But there aren’t enough of us to solve every problem. And so in particular, there’s a famous database of problems tied to the mathematician Paul Erdős that has been collected online. And some of them are quite famous and have received a lot of attention. But others, we don’t actually know if anybody’s given a serious effort to them.

DR. DANIEL LITT: So I’m aware of I think maybe three Erdős problems like Emily mentioned, that have been solved fully autonomously, and maybe another six or seven without prior solutions in the literature that were solved by a human with AI help, and then a number where solutions were extracted from the literature by AI tools. I think that’s a really exciting development. That said, we should view this as an exciting sign that soon the tools will be useful in helping us solve maybe more interesting questions that people have devoted real effort to trying to answer.

FLORA LICHTMAN: It’s interesting that you use the word “exciting.” Just as a contrast, we had mathematician Steve Strogatz on last year, and he said something I want your thoughts on.

STEVEN STROGATZ: I think the days when we will understand math may be numbered, that it will not be far in the future when computers are producing really impressive math that we will not understand. And it will be correct, but it will be like their oracles just telling us the truth, and we can’t understand it. We’ll just be sitting there with our mouth open.

FLORA LICHTMAN: Emily?

DR. EMILY RIEHL: I agree 100% with Steve that if we had an oracle to tell us which theorems are true and which theorems are false, that would be unsatisfactory to mathematicians. We do like it when a proof tells us something new: that something is true or something is false. But if the proof doesn’t also explain something deep to justify the new conclusion, then mathematicians will go and search for a different proof that does. People get celebrated for reproving old theorems almost as much as for proving new theorems.

What I would imagine, if an AI discovers a proof of a theorem that humans have difficulty understanding, is that humans will then try to prove it themselves, and may discover a new proof that way. But I also want to bring into the conversation the fact that there are two different modes in which an AI can produce a proof. If it’s really a proof that is beyond the level of human understanding, and it’s communicated just in natural language, as the output of a ChatGPT session, there’s an argument that mathematicians should just throw it in the garbage and ask instead for an AI to produce a proof that is formalized.

FLORA LICHTMAN: Formalized like in math– in the language of math. Is that right?

DR. EMILY RIEHL: Right. So a thing that’s exciting about the possibility of generative AI for mathematics, that I think doesn’t exist in quite the same way in other fields, is that there are these trusted software programs called computer proof assistants that are engineered by human experts, so not produced with AI in any way, that will take a very precisely written mathematical proof and check the logic line by line. So a concern that many mathematicians share is that we’re going to get a lot of AI slop, a lot of new preprints purporting to prove very famous theorems. But they’re long and complicated and require a ton of expertise to evaluate. And humans will just never be able to judge these proofs for accuracy.

But just as a large language model can produce Python code, a large language model can be asked to write the proof not in English, but in the language of these computer proof assistants. And then at least some aspect of the refereeing can be automated, can be outsourced to a computer. And I think that’s a more productive realm to interact with AI than just the natural language realm.
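[To make “checking the logic line by line” concrete, here is a minimal sketch of what a machine-checked proof looks like in the Lean proof assistant. The toy lemma, that the sum of two even natural numbers is even, is an illustrative example, not one discussed in the segment.]

```lean
-- A toy lemma that the Lean proof assistant checks line by line:
-- the sum of two even natural numbers is even.
theorem even_add_even (m n : Nat)
    (hm : ∃ k, m = 2 * k) (hn : ∃ k, n = 2 * k) :
    ∃ k, m + n = 2 * k := by
  obtain ⟨a, ha⟩ := hm   -- unpack the witness: m = 2 * a
  obtain ⟨b, hb⟩ := hn   -- unpack the witness: n = 2 * b
  -- the sum is 2 * (a + b), by the distributive law
  exact ⟨a + b, by rw [ha, hb, Nat.mul_add]⟩
```

[If any step were missing or unjustified, Lean would reject the proof and report exactly which gap remains, which is what makes this kind of output easier to referee than free-form prose.]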

FLORA LICHTMAN: Daniel, I want to hear your thoughts on that, and also on this idea that AI will become oracles for us in math.

DR. DANIEL LITT: Yeah, so I definitely think it’s possible that in the long term, AI tools will become better at proving mathematical statements than human beings. We’re not there yet, but it could happen. As Emily suggests, that would be a little bit unsatisfying when my goal as a mathematician is to understand mathematics. That said, it’s not clear to me that AI tools producing amazing and long proofs of mathematical statements would mean that we couldn’t understand them. Why wouldn’t we expect an AI tool able to prove an awesome theorem also to be able to explain it to us really well?

So yeah, I feel confident that if I’m sufficiently motivated to understand a piece of mathematics produced by a computer, that I’ll be able to do it. And that’s kind of my goal.

DR. EMILY RIEHL: But I guess there’s a big problem– kind of a moral problem– with AI-generated mathematics versus human-generated mathematics. There’s a very strong norm in the mathematics community that you do not post a proof to the internet, to the arXiv, which is our mathematics preprint server, unless you believe that it’s true. And we can’t ask the same thing of large language models because they’re completely untrustworthy. They don’t have beliefs. I don’t really have a theory of mind of large language models, so these sorts of metaphors are imperfect.

So I think we should demand more of a proof written by an AI than a proof written by a human because we don’t have this sort of norm of trust, this norm of belief. So if an AI gives us a large proof that is difficult to understand, as Daniel said, the AI has not finished its job. It should also give us an explanation. But maybe that explanation should also be in this formal realm, so we can use computer tools to start the verification process, which will be very long.

FLORA LICHTMAN: We have to take a break, but when we come back, I want to talk about vibe proofing. You’ve heard of vibe coding. Have you heard of vibe proofing? You will. Stay with us.

[MUSIC PLAYING]

Hey, it’s Flora. I know that you have heard this before, but it’s not a line. Science Friday really can only continue with support from you, our listeners. I love making and telling these stories. And if you like listening to them, please go to sciencefriday.com/donate to make a donation. Join us. Stand up for the value of Science Friday and public media, and help us continue to spark curiosity and spread the joy of science. Donations are fast, easy, and secure. Just go to sciencefriday.com/donate, and thank you.

[MUSIC PLAYING]

FLORA LICHTMAN: OK, so we’ve heard of vibe coding. Is there an equivalent in math? Could I ask it? Could I be like, I don’t really understand how to do this? Can you solve this problem?

DR. DANIEL LITT: Yeah, you could certainly ask that. And then, well, sometimes it would give you the right answer, and sometimes it would give you a wrong answer. There’s still a lot of work to do for a human, which is you have to check the answer is correct. That’s hard. So you can vibe proof something, and then, well, often it will be vibe wrong. But sometimes it’ll be correct, and sometimes it’ll be really useful.

So yeah, I think this is starting to be a part of the mathematical workflow. It’s still important at the end of the day that a human is there taking responsibility for the correctness of the results. And I think we can see some examples of this going wrong. In September of last year, I counted the number of papers put on the arXiv, which is the main math preprint server, with the words “Hodge conjecture” in the title or abstract. The Hodge conjecture is one of the Millennium Prize Problems. I think there were about 12 total, and as far as I could tell, 11 of them were nonsense generated by AI tools.

FLORA LICHTMAN: Like hallucinations? Is that what we’re talking about?

DR. DANIEL LITT: Well, I think it was a human who wanted to prove the Hodge conjecture, because there’s a $1 million prize associated to it. So they said, Claude, prove the Hodge conjecture. Don’t make any mistakes. And then they posted whatever came out of that to the arXiv, and it wasn’t correct.

DR. EMILY RIEHL: Right. And what Daniel is alluding to is that there’s a danger with vibe proving, just like with vibe coding. Vibe coding for an expert programmer can be an enhancement of their workflow, but it can also allow somebody who’s a total novice– and I’ll put my hand up here in the programming context– to think that they can achieve more than they actually can with computer programming. It requires a lot of expertise to know when it’s bluffing and when it’s sound.

DR. DANIEL LITT: One thought on this is that actually, the ways the models bluff are remarkably similar to how humans bluff in a proof.

FLORA LICHTMAN: Really? Say more about that.

DR. DANIEL LITT: Yeah, so first of all, you can ask an undergrad to prove something on their homework. And often they’ll leave out some things that they don’t know how to do and hope to get full marks. And sometimes that works out. Even if you ask a professional mathematician just off the top of their head, prove something, they might have some kind of general idea and not think through all the details. And sometimes that’ll be right and sometimes not. So I think it’s actually remarkable to watch the models doing math and see in some ways how human they are.

FLORA LICHTMAN: Can the proof assistants sort of root out the AI math slop?

DR. EMILY RIEHL: Yes, I think so. So there are some caveats there. Firstly, it’s a lot more exacting. So you can’t wave your hands and skip a few steps in a proof assistant. It will not accept it. It’ll make a note of exactly what has not yet been justified. So I do think it can help weed out some slop. And in particular, there are some AI startup companies that are aiming to get better at writing formal proofs. And they are training by using the feedback from the proof assistant to evaluate partial proofs.

The proof assistant is meant to be used interactively, and will give you some information about whether each line of your proof is correct or incorrect. And so that information could be integrated into an AI workflow to improve the search for proofs, for instance.

FLORA LICHTMAN: Do these tools change what skills are required to be a great mathematician?

DR. EMILY RIEHL: I think one of the things that’s most challenging about being a mathematician today is just the breadth of the field. So Daniel and I are in different enough research areas that I think we would have difficulty understanding each other’s most recent papers. And I also know that it’s very difficult to keep on top of the literature, even in my own subject area, because there’s a lot of exciting work that gets finished every month. And I don’t read at the pace that I would like to. So a thing that I’m optimistic about is that these tools will help me stay more on top of the literature.

DR. DANIEL LITT: Yeah, I’ve found them really useful to learn things that I needed. Over the summer, I learned from ChatGPT’s o3 model about hyperkähler geometry, which is something I could have pestered a human expert about, but they would have gotten annoyed with me much more rapidly than the AI tool did. I think that anytime a new technology is developed, it obviates, by automating them, some skills that humans previously needed.

And then it also opens up some new capabilities. And I absolutely expect that to happen with AI tools for mathematics. Right now, I think it’s sort of too early to figure out exactly what skills will become less necessary and what new capabilities will open up. But I’m hopeful that it will open up a lot of new possibilities.

DR. EMILY RIEHL: I think there’s a lot of misconceptions about what a great mathematician looks like. There’s some stereotypes that this is something that’s identifiable when you’re very young. And in fact, maybe when you’re my age– I’m 41, which means I’m no longer eligible for the Fields Medal. But I’m a much better mathematician now than I was 20 years ago, because I have spent so much time thinking about mathematics in the last 20 years. And some of that is reading the literature. Some of that is attempting to solve a problem and failing, but failing in a way that when a similar problem comes up later, I will have better intuition about what avenues might be productive and what might be not productive.

So I really think what makes a great mathematician is time and dedication to mathematics. Anybody who falls in love with a subject and is fortunate enough to be able to devote themselves to it is going to be able to achieve some pretty cool things.

DR. DANIEL LITT: Yeah, so what makes a great mathematician? So certainly, I think loving the subject, motivation, like Emily was talking about, are very important. You definitely need some technical ability. And I think that’s what people usually think about when they think about strong mathematics. Can you solve this problem? Can you prove this theorem? But I think in practice, the most influential mathematics is not actually about solving problems or proving a theorem. It’s about something that’s a little bit less tangible, like some kind of philosophy or mysticism, even.

You’re trying to develop a theory or figure out why something is true. Often, that involves discovering a new structure. And that structure might not even be something that is really precise. You spread a philosophy to a community of other mathematicians and give them a way of understanding new mathematical objects. It’s not a purely formal thing, and you don’t necessarily think of that as being mathematics when you’re in high school, or middle school, or college, or whatever.

But yeah, the most important parts of mathematics are actually closer to philosophy, I think, than science.

DR. EMILY RIEHL: Well, and the final thing to highlight is creativity. And where great ideas often arise is really in conversation between mathematicians. We travel all the time to speak to each other in person at a chalkboard about some mathematical ideas. And it’s amazing in a collaboration that a new direction can just originate spontaneously from a bringing together of multiple minds.

FLORA LICHTMAN: I love that. Emily and Daniel, thank you for taking the time to talk to us today.

DR. EMILY RIEHL: Thanks for having us.

DR. DANIEL LITT: Thank you. Yeah, that was fun.

FLORA LICHTMAN: Dr. Emily Riehl, mathematician at Johns Hopkins University, and Dr. Daniel Litt, mathematician at the University of Toronto. This episode was produced by Dee Peterschmidt. If you’re surprised to find that you got all the feels from a conversation about math, please leave a review and tap, follow, or subscribe wherever you listen. Thank you for listening. I’m Flora Lichtman.

[MUSIC PLAYING]

Copyright © 2026 Science Friday Initiative. All rights reserved. Science Friday transcripts are produced on a tight deadline by 3Play Media. Fidelity to the original aired/published audio or video file might vary, and text might be updated or amended in the future. For the authoritative record of Science Friday’s programming, please visit the original aired/published recording. For terms of use and more information, visit our policies pages at http://www.sciencefriday.com/about/policies/

Meet the Producers and Host

About Flora Lichtman

Flora Lichtman is a host of Science Friday. In a previous life, she lived on a research ship where apertivi were served on the top deck, hoisted there via pulley by the ship’s chef.

About Dee Peterschmidt

Dee Peterschmidt is Science Friday’s audio production manager, hosted the podcast Universe of Art, and composes music for Science Friday’s podcasts. Their D&D character is a clumsy bard named Chip Chap Chopman.
