This Computer Won The 2021 American Crossword Puzzle Tournament
In 2012, a computer program named Dr. Fill placed 141st out of some 660 entries in that year’s American Crossword Puzzle Tournament, a competition for elite crossword puzzle solvers. This year, the algorithm beat the human competition, completing the final playoff puzzle in just 49 seconds.
The A.I. relies on a collection of different techniques to make sense of a puzzle. Sometimes, a simple fact is needed—who was the First Lady before Eleanor Roosevelt? (Lou Henry Hoover.) More often, however, crossword puzzle solutions rely not just on factual knowledge, but also on an ability to recognize themes that puzzle constructors have embedded in the crosswords, along with an understanding of puns, homonyms, and word play. (Think: Five letters, “dining table leaves”—SALAD!) The program makes a series of statistical calculations about likely answers, then tries to fit those possibilities into the puzzle squares.
This year, researchers from the Berkeley Natural Language Processing group added their expertise to Dr. Fill’s algorithms—a contribution that may have helped push Dr. Fill to its crowning victory.
But the program isn’t infallible. This year, it made three mistakes solving puzzles during the tournament, while some human solvers completed the puzzles perfectly. And it remains prone to errors on any novel puzzle form it has never seen before.
Matt Ginsberg, the computer programmer behind Dr. Fill, joins Ira to talk about the competition and the advances his program has made over the years.
Matt Ginsberg is a computer scientist, crossword constructor, and co-founder of Connected Signals in Eugene, Oregon.
IRA FLATOW: For the rest of the hour, an update on a story we first told you about back in 2014– a computer program called Dr. Fill that could solve crossword puzzles. Last month, for the first time, the program unofficially beat human competition in the American Crossword Puzzle Tournament, solving the playoff puzzle in just– get this– 49 seconds. Oh. Joining me now is Matt Ginsberg, the computer programmer who developed Dr. Fill. Welcome back to Science Friday.
MATT GINSBERG: Thanks, Ira.
IRA FLATOW: I want to tell everybody that we have a video demo of how the program tackled some of this year’s competition puzzles up on our website, sciencefriday.com/crossword. OK, first of all, describe the format of the competition for us. What do you have to do?
MATT GINSBERG: You have to solve seven crosswords. Six of them are on Saturday, and one is Sunday morning. And then there’s a championship puzzle on Sunday afternoon for the people who did the best.
It’s timed. The puzzles vary in difficulty. The fifth puzzle is actually far and away the hardest. The first is the easiest, to sort of give you a feeling of confidence going in. The sixth is easy so you don’t feel so bad about what happened to you on the fifth puzzle.
And the seventh on Sunday is– it’s like a Sunday-size puzzle. It’s not overwhelmingly difficult. And people solve them flawlessly in a few minutes.
IRA FLATOW: Wow. And how well did your program do in this recent tournament?
MATT GINSBERG: Dr. Fill made three mistakes, but it was so much faster than the humans– typically solving the puzzles in less than a minute– that based on the scoring system in use, it came out a tiny bit ahead of the top human.
IRA FLATOW: Did it rattle the humans there?
MATT GINSBERG: It was a virtual tournament this year, so it couldn’t do anything. But when the tournament is live, Will Shortz, who runs it, typically reports Dr. Fill’s results as he goes. So when he starts puzzle 2, he says, on puzzle 1, Dr. Fill did thus and so. And when Dr. Fill does poorly, everybody applauds. And when Dr. Fill does well, everybody boos. But it’s sort of good-natured competition between man and machine.
IRA FLATOW: Well, we actually reached out to puzzle master Will Shortz for his thoughts, and he said, when it comes to something new and never seen before, humans still have the advantage in figuring it out. Oh, fighting words.
MATT GINSBERG: No, it’s absolutely true. One of the puzzles this year was very clever. One of the clues, for example, was “crazed,” and the answer was “mannequin.” And the next clue over was “deduced,” and the answer was “fur.” And that makes no sense at all until you realize that if you say it—“mannequin fur”—you’re actually saying “manic” for “crazed” and “infer” for “deduced.”
Dr. Fill didn’t understand it at all. It still managed to do pretty well on that puzzle by solving from the down clues, though it made one mistake. But it had no clue what was going on.
I think next year– I’m excited about next year. We’re going to work very hard on Dr. Fill. And by we, I mean me and the Berkeley Natural Language Processing group, who helped this year. We’re going to work very hard on making Dr. Fill better. And I think the constructors are going to work very hard on making Dr. Fill worse. So we’ll see next year who wins that little battle.
IRA FLATOW: Let’s talk a bit about how it solves a puzzle. What are the steps it works through?
MATT GINSBERG: So it doesn’t solve the puzzle like we do. So when I solve a puzzle– when I try. I’m terrible. When I try and solve a puzzle, I say, oh, here’s a clue. My level of clue is, “Scooby blank,” three letters. So I say, oh, that’s “doo,” and I put it in.
When Dr. Fill solves a puzzle, it actually doesn’t write anything in. It makes giant lists of every possible word in every possible slot and how good it feels about putting that word in that position. And then, armed with those lists, it looks at all these combinations. Well, I can put this word here and that word there, and how do I feel about the combination? And it’s doing a ton of search over possible ways to fill in the puzzle. And it has algorithms that are designed to help it find the overall fill that it feels the best about.
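The approach Ginsberg describes can be sketched in miniature: score every candidate word for every slot, then search the combinations for the overall fill the solver “feels best about.” This is a toy illustration only; the slot names, word lists, scores, and crossing rule are all invented, and the real program uses far more sophisticated search than brute-force enumeration.

```python
from itertools import product

# Invented candidate lists: each slot maps to (word, confidence) pairs.
candidates = {
    "1-Across": [("DOO", 0.9), ("DOG", 0.2)],
    "1-Down":   [("DAB", 0.6), ("DUB", 0.5)],
}

def compatible(across, down):
    # Toy crossing constraint: both entries start in the same square,
    # so their first letters must agree.
    return across[0] == down[0]

def best_fill(candidates):
    """Search all combinations, keeping the compatible fill with the
    highest total confidence score."""
    best, best_score = None, float("-inf")
    slots = list(candidates)
    for combo in product(*candidates.values()):
        words = [w for w, _ in combo]
        score = sum(s for _, s in combo)
        if compatible(words[0], words[1]) and score > best_score:
            best, best_score = dict(zip(slots, words)), score
    return best, best_score

fill, score = best_fill(candidates)
```

Here the search is exhaustive because the grid is tiny; on a real puzzle, with thousands of candidates per slot, this is where the clever search algorithms Ginsberg mentions come in.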
And what the Berkeley guys did is they brought in work that made the program better calibrated: when it thought an answer was correct, it more often actually was correct. So that made it much more likely, when it was done with the puzzle, that it actually had gotten all the right answers in all the various places.
IRA FLATOW: Puzzle constructors sometimes reuse clues. Does your program have a database of past answers that it tries?
MATT GINSBERG: It does. It’s very happy when it finds a clue that’s been used before. So when I make a puzzle, for example, I like looking for clues that have been used before to clue a different word of that length, because then you’re throwing a little trick at the solver. And Dr. Fill knows about that. And it says, you know, I’ve seen this before, so probably it’s the same answer as before, but not for sure. But I’ll be pretty happy if I can put that answer in.
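The reuse of past clues Ginsberg describes amounts to indexing previous (clue, answer-length) pairs and boosting, rather than locking in, a previously seen answer. A minimal sketch, with an invented clue database, scoring function, and boost value:

```python
# Hypothetical database of previously seen clues, keyed by (clue, length).
past_clues = {
    ("Scooby ___", 3): "DOO",
    ("Crazed", 5): "MANIC",
}

def score_candidate(clue, answer, base_score):
    """Boost a candidate's score if this clue has produced this answer
    before at the same length -- probably, but not certainly, right."""
    seen = past_clues.get((clue, len(answer)))
    if seen == answer:
        return base_score + 0.5  # invented boost for illustration
    return base_score
```

The key design point matches what Ginsberg says: a match raises confidence but never makes the answer certain, since constructors deliberately reuse old clues for new answers.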
IRA FLATOW: You know, to be successful, you need to know a lot of random facts, like who were the Academy Award winners in a certain year. Where is it getting its information? Is it doing live look-ups of Wikipedia for some of the clues as it’s solving the puzzle?
MATT GINSBERG: It can. So one of the rules that Will Shortz and I agreed on was that Dr. Fill was not allowed to access the internet. So it can’t do a Google search, for example. But it’s a computer, right? So I do have a downloaded copy of Wikipedia that it can look in.
One of the things Will’s done for crosswords is that they’re not nearly as fact-based as they used to be. So yeah, occasionally you do see, you know, Eleanor’s predecessor as first lady, something along those lines. And there, you would like to be able to look it up.
But most of the crosswords these days are about common sense knowledge put in interesting ways. And there, it’s really a matter of having some understanding, you know, that “crazed” and “manic” mean the same thing. And there, Dr. Fill has a thesaurus. It has a dictionary. It’s got all these resources that it uses to try and figure out what the various words mean.
And again, this is where the Berkeley collaboration has been so great, because they are using a system that is not totally unlike when you talk to Siri. There’s a system that’s figuring out what you probably need and generating a useful response. And they’re using a system similar, at a high level, to those other systems. It’s designed specifically for crosswords, but it’s using all the work on machine learning and natural language and so on to help Dr. Fill understand a bit better what the right answer to any particular clue is likely to be.
IRA FLATOW: This is Science Friday from WNYC Studios. How well would Dr. Fill work against IBM’s Watson?
MATT GINSBERG: It’s really different domains. Jeopardy really is about facts. When the answer comes up on the Jeopardy board, it doesn’t say, 13th president of the United States, eight letters. When Dr. Fill gets a query, it knows exactly how long the answer is. It makes enormous use of the crossing words to say, oh, eight letters, and the second one is an “I.” That’s essential to what Dr. Fill is doing, and Watson has no capability like that at all.
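The use of length and crossing letters described here is essentially pattern matching over a word list. A toy sketch, assuming a small invented candidate list, where `.` stands for an unknown square:

```python
import re

def matches(pattern, words):
    """Keep only the words fitting a crossword pattern, e.g.
    '.I......' means eight letters with 'I' in the second square."""
    rx = re.compile(pattern + "$")
    return [w for w in words if rx.match(w)]

# Invented candidate list for illustration.
presidents = ["FILLMORE", "MCKINLEY", "HARRISON", "COOLIDGE"]
```

A single known crossing letter can collapse a long candidate list to one or two options, which is why Dr. Fill leans so heavily on the grid structure that Watson never had.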
On the other hand, Dr. Fill is dealing with situations that are deliberately vague, deliberately confusing. For example, in crosswords you may see a clue like “Nice flower.” And the answer is actually the river that goes through the French city of Nice. You’re supposed to parse it as “Nice ‘flow-er.'” They would never do that to you in Jeopardy, whereas crossword constructors do it to the solvers all the time.
So very, very different problem. Hard for different reasons. Easy for different reasons. And you’ve got two pieces of software that really are just good at what they do.
I mean, Dr. Fill isn’t going to do anything other than solve crosswords. It’s not designed to do anything else. And Watson’s forte is playing Jeopardy. And they feel like they’re the same, but they’re really quite different.
IRA FLATOW: Interesting. I know that crossword puzzle constructors are tricky. And in more difficult puzzles, they can build in added layers of complexity like you’re talking about, like having answers that skip a certain letter, for instance, or having answers that read backwards. How well does Dr. Fill do with these types of challenges?
MATT GINSBERG: It’s terrible.
IRA FLATOW: [LAUGHS]
MATT GINSBERG: So there was a puzzle at the tournament one year where every clue was a spoonerism. Dr. Fill can’t figure that out. It doesn’t understand. It has no way to sort of get started.
And if you’ve got these overarching themes– I made a puzzle once where every word was a homonym. But what you were supposed to enter into the grid wasn’t the actual word. It was the homonym of that word. Dr. Fill can’t do that. It just doesn’t understand; none of its rules will help it.
It does understand simple themes. So you might have a puzzle where the long entries add an “E” to a common phrase to get some wacky phrase. It’ll understand about that, and it’ll say, oh, look, the theme is add an “E.”
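A simple theme like “add an E” can be sketched as a check against a phrase list: does deleting some single E from a theme entry leave a known common phrase? Everything here, the phrase set and the entries, is invented for illustration:

```python
# Hypothetical list of known common phrases.
common_phrases = {"BARNONE", "TOPDOG"}

def is_add_an_e(entry):
    """True if removing some single 'E' from the entry yields a
    known common phrase -- i.e., the entry fits an 'add an E' theme."""
    return any(
        entry[:i] + entry[i + 1:] in common_phrases
        for i, ch in enumerate(entry)
        if ch == "E"
    )
```

So “BEARNONE” (a wacky take on “bar none”) fits the theme, while an entry with no spare E does not. Recognizing the pattern across several long entries is what lets the program say, as Ginsberg puts it, “oh, look, the theme is add an ‘E.'”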
But these overarching themes are too hard for it, because it doesn’t really know what it’s doing, right? It’s just following rules to fill letters into a grid. And when Will says that on something truly imaginative, humans have the edge, he is absolutely right.
IRA FLATOW: I’d like to thank my guest this hour, Matt Ginsberg, computer programmer. His program Dr. Fill unofficially won the most recent American Crossword Puzzle Tournament. Thank you, Matt, for taking time to be with us today.
MATT GINSBERG: Thank you very much.
IRA FLATOW: Matt is also author of the book Factor Man.