03/27/26

Move over, vibe-coding. Vibe-proving is here for math

When ChatGPT first came onto the scene, it wowed users with its writing abilities, but drew laughs for generating images of seven-fingered hands and struggling with basic math, where 2+2 didn’t always equal 4. But more recently, things have changed: Google and OpenAI’s models bagged gold medals at the International Mathematical Olympiad last year, and now some experts say AI could pose an existential threat to the field of mathematics itself.

Mathematicians Emily Riehl and Daniel Litt join Host Flora Lichtman to explore how this technology could change the way math discoveries are made—and what could be lost if things go too far.



Segment Guests

Emily Riehl

Emily Riehl is a professor in the Department of Mathematics at Johns Hopkins University. She’s based in Baltimore, Maryland.

Daniel Litt

Dr. Daniel Litt is an associate professor of mathematics at the University of Toronto.

Segment Transcript

[MUSIC PLAYING] FLORA LICHTMAN: Hey, I’m Flora Lichtman, and you’re listening to Science Friday. When ChatGPT came out a few years ago, one of its most impressive features was its writing abilities. But something it was often ridiculed for, you may remember, other than making images of seven-fingered hands, was its inability to do basic math. 2 plus 2 did not always equal 4. But recently, things have changed.

Last year, Google and OpenAI’s models bagged gold medals at the International Mathematical Olympiad, edging out some of the best high school math athletes in the world. And now some experts say AI is getting so good, it could pose an existential threat to the field. Here with a perspective are two mathematicians who have thought a lot about this: Dr. Emily Riehl from Johns Hopkins University, and Dr. Daniel Litt from the University of Toronto. Emily, Daniel, welcome to Science Friday.

DR. DANIEL LITT: Hey.

DR. EMILY RIEHL: Thank you.

FLORA LICHTMAN: All right. We have seen, at least in the nerdy media that I consume, these very big claims that AI suddenly is revolutionizing math. It’s fundamentally changing what it means to be a mathematician. I want your takes on that. Emily, let’s start with you.

DR. EMILY RIEHL: I think from the perspective of a professional mathematician, we’re trying to evaluate where AI is on the trajectory of a mathematician’s life. We first encounter mathematics at school, where you’re solving problems that have a numerical answer, maybe involving some geometric figures, maybe involving some arithmetic or some algebra. As you mentioned already, AIs used to be very bad at those problems, and they’re much better now.

More recently, AIs have been doing well with more advanced contest-level problems, contests for undergraduate mathematics majors, where they’re writing proofs, and what’s exciting and perhaps scary in this past year is they’re starting to be tested on more research-level problems, so problems that are of interest to people who get paid to do mathematics.

FLORA LICHTMAN: Daniel, what do you think?

DR. DANIEL LITT: Yeah, so the specific claim that AI tools are revolutionizing mathematics definitely has not come true yet. I think there are leading indicators that it will be very significant. It’s definitely changing the way professional mathematicians, including me, work. But in the cases Emily mentioned of AIs being tested on research-level mathematics, there are a few success stories. I think those are exciting leading indicators.

There’s not any example of anything that I would consider to be revolutionary yet. Who knows what the next year or two will bring? So I’m excited about that.

DR. EMILY RIEHL: And I think an important part of the story is, mathematics is not just about solving problems. Even research mathematics is not just about solving problems. Solving problems can be beautiful, but actually stating the problems, figuring out what is an interesting theorem to try and prove is equally important– really what drives the field forward. And we haven’t seen any examples there where AIs are coming up with problem statements or new mathematical universes to explore.

FLORA LICHTMAN: Like picking the things that we should be trying to solve.

DR. EMILY RIEHL: Yes.

FLORA LICHTMAN: Is AI solving problems that people haven’t been able to solve? Has that happened? Does it seem like it’s on the horizon?

DR. EMILY RIEHL: One way to answer that is that there is so much math that still needs to be discovered, and there are a lot of mathematicians working together and individually all around the world, trying to make new discoveries and prove new theorems. But there aren’t enough of us to solve every problem. And so in particular, there’s a famous database of problems tied to the mathematician Paul Erdős that has been collected online. And some of them are quite famous and have received a lot of attention. But others, we don’t actually know if anybody’s given a serious effort to them.

DR. DANIEL LITT: So I’m aware of I think maybe three Erdős problems like Emily mentioned, that have been solved fully autonomously, and maybe another six or seven without prior solutions in the literature that were solved by a human with AI help, and then a number where solutions were extracted from the literature by AI tools. I think that’s a really exciting development. That said, we should view this as an exciting sign that soon the tools will be useful in helping us solve maybe more interesting questions that people have devoted real effort to trying to answer.

FLORA LICHTMAN: It’s interesting that you use the word “exciting.” Just as a contrast, we had mathematician Steve Strogatz on last year, and he said something I want your thoughts on.

STEVEN STROGATZ: I think the days when we will understand math may be numbered, that it will not be far in the future when computers are producing really impressive math that we will not understand. And it will be correct, but it will be like their oracles just telling us the truth, and we can’t understand it. We’ll just be sitting there with our mouth open.

FLORA LICHTMAN: Emily?

DR. EMILY RIEHL: I agree 100% with Steve that if we had an oracle to tell us which theorems are true and which theorems are false, that would be unsatisfactory to mathematicians. We do like it when a proof tells us something new: that something is true or something is false. But if the proof doesn’t also explain something deep to justify the new conclusion, then mathematicians will go and search for a different proof that does. People get celebrated for reproving old theorems almost as much as for proving new theorems.

What I would imagine, if an AI discovers a proof of a theorem that humans have difficulty understanding, is that humans will then try to prove it themselves, and may discover a new proof that way. But I also want to bring into the conversation the fact that there are two different modes in which an AI can produce a proof. If it’s really a proof that is beyond the level of human understanding, and it’s communicated just in natural language, as the output of a ChatGPT session, there’s an argument that mathematicians should just throw it in the garbage and ask instead for an AI to produce a proof that is formalized.

FLORA LICHTMAN: Formalized like in math– in the language of math. Is that right?

DR. EMILY RIEHL: Right. So a thing that’s exciting about the possibility of generative AI for mathematics, that I think doesn’t exist in quite the same way in other fields, is that there are these trusted software programs called computer proof assistants that are engineered by human experts, so not produced with AI in any way, that will take a very precisely written mathematical proof and check the logic line by line. So a concern that many mathematicians share is that we’re going to get a lot of AI slop, a lot of new preprints purporting to prove very famous theorems. But they’re long and complicated and require a ton of expertise to evaluate. And humans will just never be able to judge these proofs for accuracy.

But just as a large language model can produce Python code, a large language model can be asked to write the proof not in English, but in the language of these computer proof assistants. And then at least some aspect of the refereeing can be automated, can be outsourced to a computer. And I think that’s a more productive realm to interact with AI than just the natural language realm.
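[To make “checking the logic line by line” concrete, here is a minimal sketch of what a machine-checked proof looks like in the Lean proof assistant. The toy lemma, that the sum of two even natural numbers is even, is an illustrative example, not one discussed in the segment.]

```lean
-- A toy lemma that the Lean proof assistant checks line by line:
-- the sum of two even natural numbers is even.
theorem even_add_even (m n : Nat)
    (hm : ∃ k, m = 2 * k) (hn : ∃ k, n = 2 * k) :
    ∃ k, m + n = 2 * k := by
  obtain ⟨a, ha⟩ := hm   -- unpack the witness: m = 2 * a
  obtain ⟨b, hb⟩ := hn   -- unpack the witness: n = 2 * b
  -- the sum is 2 * (a + b), by the distributive law
  exact ⟨a + b, by rw [ha, hb, Nat.mul_add]⟩
```

[If any step were missing or unjustified, Lean would reject the proof and report exactly which gap remains, which is what makes this kind of output easier to referee than free-form prose.]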

FLORA LICHTMAN: Daniel, I want to hear your thoughts on that, and also on this idea that AI will become oracles for us in math.

DR. DANIEL LITT: Yeah, so I definitely think it’s possible that in the long term, AI tools will become better at proving mathematical statements than human beings. We’re not there yet, but it could happen. As Emily suggests, that would be a little bit unsatisfying when my goal as a mathematician is to understand mathematics. That said, it’s not clear to me that AI tools producing amazing and long proofs of mathematical statements would mean that we couldn’t understand them. Why wouldn’t we expect an AI tool able to prove an awesome theorem also to be able to explain it to us really well?

So yeah, I feel confident that if I’m sufficiently motivated to understand a piece of mathematics produced by a computer, that I’ll be able to do it. And that’s kind of my goal.

DR. EMILY RIEHL: But I guess there’s a big problem– kind of a moral problem– with AI-generated mathematics versus human-generated mathematics. There’s a very strong norm in the mathematics community that you do not post a proof to the internet, to the arXiv, which is our mathematics preprint server, unless you believe that it’s true. And we can’t ask the same thing of large language models because they’re completely untrustworthy. They don’t have beliefs. I don’t really have a theory of mind of large language models, so these sorts of metaphors are imperfect.

So I think we should demand more of a proof written by an AI than a proof written by a human because we don’t have this sort of norm of trust, this norm of belief. So if an AI gives us a large proof that is difficult to understand, as Daniel said, the AI has not finished its job. It should also give us an explanation. But maybe that explanation should also be in this formal realm, so we can use computer tools to start the verification process, which will be very long.

FLORA LICHTMAN: We have to take a break, but when we come back, I want to talk about vibe proofing. You’ve heard of vibe coding. Have you heard of vibe proofing? You will. Stay with us.

[MUSIC PLAYING]

Hey, it’s Flora. I know that you have heard this before, but it’s not a line. Science Friday really can only continue with support from you, our listeners. I love making and telling these stories. And if you like listening to them, please go to sciencefriday.com/donate to make a donation. Join us. Stand up for the value of Science Friday and public media, and help us continue to spark curiosity and spread the joy of science. Donations are fast, easy, and secure. Just go to sciencefriday.com/donate, and thank you.

[MUSIC PLAYING]

FLORA LICHTMAN: OK, so we’ve heard of vibe coding. Is there an equivalent in math? Could I ask it? Could I be like, I don’t really understand how to do this? Can you solve this problem?

DR. DANIEL LITT: Yeah, you could certainly ask that. And then, well, sometimes it would give you the right answer, and sometimes it would give you a wrong answer. There’s still a lot of work to do for a human, which is you have to check the answer is correct. That’s hard. So you can vibe proof something, and then, well, often it will be vibe wrong. But sometimes it’ll be correct, and sometimes it’ll be really useful.

So yeah, I think this is starting to be a part of the mathematical workflow. It’s still important at the end of the day that a human is there taking responsibility for the correctness of the results. And I think we can see some examples of this going wrong. In September of last year, I counted the number of papers put on the arXiv, which is the main math preprint server, with the words “Hodge conjecture” in the title or abstract. The Hodge conjecture is one of the Millennium Prize Problems. I think there were about 12 total, and as far as I could tell, 11 of them were nonsense generated by AI tools.

FLORA LICHTMAN: Like hallucinations? Is that what we’re talking about?

DR. DANIEL LITT: Well, I think it was a human who wanted to prove the Hodge conjecture, because there’s a $1 million prize associated to it. So they said, Claude, prove the Hodge conjecture. Don’t make any mistakes. And then they posted whatever came out of that to the arXiv, and it wasn’t correct.

DR. EMILY RIEHL: Right. And what Daniel is alluding to is that there’s a danger with vibe proving, just like with vibe coding. Vibe coding for an expert programmer can be an enhancement of their workflow, but it can also allow somebody who’s a total novice– and I’ll put my hand up here in the programming context– to think that they can achieve more than they actually can with computer programming. It requires a lot of expertise to know when it’s bluffing and when it’s sound.

DR. DANIEL LITT: One thought on this is that actually, the ways the models bluff are remarkably similar to how humans bluff in a proof.

FLORA LICHTMAN: Really? Say more about that.

DR. DANIEL LITT: Yeah, so first of all, you can ask an undergrad to prove something on their homework. And often they’ll leave out some things that they don’t know how to do and hope to get full marks. And sometimes that works out. Even if you ask a professional mathematician just off the top of their head, prove something, they might have some kind of general idea and not think through all the details. And sometimes that’ll be right and sometimes not. So I think it’s actually remarkable to watch the models doing math and see in some ways how human they are.

FLORA LICHTMAN: Can the proof assistants sort of root out the AI math slop?

DR. EMILY RIEHL: Yes, I think so. So there are some caveats there. Firstly, it’s a lot more exacting. So you can’t wave your hands and skip a few steps in a proof assistant. It will not accept it. It’ll make a note of exactly what has not yet been justified. So I do think it can help weed out some slop. And in particular, there are some AI startup companies that are aiming to get better at writing formal proofs. And they are training by using the feedback from the proof assistant to evaluate partial proofs.

The proof assistant is meant to be used interactively, and will give you some information about whether each line of your proof is correct or incorrect. And so that information could be integrated into an AI workflow to improve the search for proofs, for instance.

FLORA LICHTMAN: Do these tools change what skills are required to be a great mathematician?

DR. EMILY RIEHL: I think one of the things that’s most challenging about being a mathematician today is just the breadth of the field. So Daniel and I are in different enough research areas that I think we would have difficulty understanding each other’s most recent papers. And I also know that it’s very difficult to keep on top of the literature, even in my own subject area, because there’s a lot of exciting work that gets finished every month. And I don’t read at the pace that I would like to. So a thing that I’m optimistic about is that these tools will help me stay more on top of the literature.

DR. DANIEL LITT: Yeah, I’ve found them really useful to learn things that I needed. Over the summer, I learned from ChatGPT’s o3 model about hyperkähler geometry, which is something I could have pestered a human expert about, but they would have gotten annoyed with me much more rapidly than the AI tool did. I think that anytime a new technology is developed, it obviates, by automating them, some skills that humans previously needed.

And then it also opens up some new capabilities. And I absolutely expect that to happen with AI tools for mathematics. Right now, I think it’s sort of too early to figure out exactly what skills will become less necessary and what new capabilities will open up. But I’m hopeful that it will open up a lot of new possibilities.

DR. EMILY RIEHL: I think there’s a lot of misconceptions about what a great mathematician looks like. There’s some stereotypes that this is something that’s identifiable when you’re very young. And in fact, maybe when you’re my age– I’m 41, which means I’m no longer eligible for the Fields Medal. But I’m a much better mathematician now than I was 20 years ago, because I have spent so much time thinking about mathematics in the last 20 years. And some of that is reading the literature. Some of that is attempting to solve a problem and failing, but failing in a way that when a similar problem comes up later, I will have better intuition about what avenues might be productive and what might be not productive.

So I really think what makes a great mathematician is time and dedication to mathematics. Anybody who falls in love with a subject and is fortunate enough to be able to devote themselves to it is going to be able to achieve some pretty cool things.

DR. DANIEL LITT: Yeah, so what makes a great mathematician? So certainly, I think loving the subject, motivation, like Emily was talking about, are very important. You definitely need some technical ability. And I think that’s what people usually think about when they think about strong mathematics. Can you solve this problem? Can you prove this theorem? But I think in practice, the most influential mathematics is not actually about solving problems or proving a theorem. It’s about something that’s a little bit less tangible, like some kind of philosophy or mysticism, even.

You’re trying to develop a theory or figure out why something is true. Often, that involves discovering a new structure. And that structure might not even be something that is really precise. You spread a philosophy to a community of other mathematicians and give them a way of understanding new mathematical objects. It’s not a purely formal thing, and you don’t necessarily think of that as being mathematics when you’re in high school, or middle school, or college, or whatever.

But yeah, the most important parts of mathematics are actually closer to philosophy, I think, than science.

DR. EMILY RIEHL: Well, and the final thing to highlight is creativity. And where great ideas often arise is really in conversation between mathematicians. We travel all the time to speak to each other in person at a chalkboard about some mathematical ideas. And it’s amazing in a collaboration that a new direction can just originate spontaneously from a bringing together of multiple minds.

FLORA LICHTMAN: I love that. Emily and Daniel, thank you for taking the time to talk to us today.

DR. EMILY RIEHL: Thanks for having us.

DR. DANIEL LITT: Thank you. Yeah, that was fun.

FLORA LICHTMAN: Dr. Emily Riehl, mathematician at Johns Hopkins University, and Dr. Daniel Litt, mathematician at the University of Toronto. This episode was produced by Dee Peterschmidt. If you’re surprised to find that you got all the feels from a conversation about math, please leave a review and tap, follow, or subscribe wherever you listen. Thank you for listening. I’m Flora Lichtman.

[MUSIC PLAYING]

Copyright © 2026 Science Friday Initiative. All rights reserved. Science Friday transcripts are produced on a tight deadline by 3Play Media. Fidelity to the original aired/published audio or video file might vary, and text might be updated or amended in the future. For the authoritative record of Science Friday’s programming, please visit the original aired/published recording. For terms of use and more information, visit our policies pages at http://www.sciencefriday.com/about/policies/

Meet the Producers and Host

About Flora Lichtman

Flora Lichtman is a host of Science Friday. In a previous life, she lived on a research ship where apertivi were served on the top deck, hoisted there via pulley by the ship’s chef.

About Dee Peterschmidt

Dee Peterschmidt is Science Friday’s audio production manager, hosted the podcast Universe of Art, and composes music for Science Friday’s podcasts. Their D&D character is a clumsy bard named Chip Chap Chopman.
