Rapper And Scholar Sammus Confronts AI In Hip-Hop

Rapper and scholar Sammus. Credit: Vrinda Jagota and Zoloo Brown

Over the last six months, there’s been a lot of movement and discussion about the effects that generative AI will have on visual art and writing. But what about its effects on music—in particular, hip-hop?

A few years ago, a deepfake of Kanye West rapping a verse from “Bohemian Rhapsody” by Queen went viral. It was created with just a few clicks using the program Uberduck, which can output AI-generated raps from text of the user’s choice. And it turns out that the rhythmic qualities that make hip-hop performers’ verses so spellbinding are exactly what make them easier to mimic in deepfakes than verses in other genres of music.

Guest host Regina Barber talks with rapper and music, science, and technology scholar Dr. Enongo Lumumba-Kasongo, also known as Sammus, about the unexpected crossovers between hip-hop and the growing field of generative AI. Lumumba-Kasongo is also an assistant professor of music at Brown University in Providence, Rhode Island.

Segment Guests

Enongo Lumumba-Kasongo

Dr. Enongo Lumumba-Kasongo is an assistant professor of Music at Brown University in Providence, Rhode Island.

Segment Transcript

JOHN DANKOSKY: This is Science Friday. I’m John Dankosky.

REGINA BARBER: And I’m Regina Barber. On Science Friday, we’ve been following closely the explosion of new artificial intelligence tools available to the public and the ethical implications in using AI within creative mediums. This week, we’re focusing on how AI intersects with the most popular music genre in the United States, hip hop.

Take a listen to this viral deepfake of Kanye West rapping a verse from Bohemian Rhapsody by Queen.

RAPPER (ON RECORDING): This is this the real life. This is this just fantasy caught in a landslide no escape from reality opinion. I looked up to the skies and see I’m just a poor boy need no sympathy.

REGINA BARBER: It was created with just a few clicks using the program Uberduck, and it turns out that what makes hip hop performers’ verses so spellbinding is what makes them easier to mimic in deepfakes than other genres of music.

Joining me now to talk more about the role of AI and the future of hip hop is my guest, the rapper and music, science, and technology scholar Dr. Enongo Lumumba-Kasongo, also known as SAMMUS. She’s an assistant professor of music at Brown University based in Providence, Rhode Island. Welcome to Science Friday, Dr. Lumumba-Kasongo.

ENONGO LUMUMBA-KASONGO: Hi, this is so exciting.

REGINA BARBER: Let’s talk about that Kanye deepfake. What was your reaction when you first heard it?

ENONGO LUMUMBA-KASONGO: Oh, I was horrified. I was totally horrified, in part, just because of the space that Kanye West occupies culturally right now.

REGINA BARBER: Right.

ENONGO LUMUMBA-KASONGO: But then also, of course, I immediately– my brain started spinning and thinking about what are the implications of this thing? How is this going to change the landscape, soundscape of the art form in which I work and the art form that I’ve fallen in love with?

REGINA BARBER: Is the ability to create convincing deepfakes unique to hip hop? I imagine you could also create a deepfake of, say, somebody like Taylor Swift or something.

ENONGO LUMUMBA-KASONGO: Yeah, it’s totally possible to create deepfakes in other genres. As long as you have the sonic data, then you’re able to manipulate it and create a pool from which to draw material. The thing that’s interesting and troubling about the development within hip hop is that there’s so much speech data.

There’s so much word data because, as folks who are familiar with the form understand, and according to the words of Adam Bradley, who is a scholar and thinker of hip hop poetics, there’s more speech data per line in a rap song than there is in a song in another genre.

REGINA BARBER: So other than the words per line, there’s also a rapper’s unique style. So actually, does that make it easier to mimic?

ENONGO LUMUMBA-KASONGO: That’s a really great question. Part of my research and interest in this has led me down these interesting wormholes in the world of rap generators. So there are online platforms for folks to quote unquote “craft” verses and, in some cases, craft verses in the style of a particular artist.

So what that means is, for that particular rap generator, the pool of words and phrases are coming directly from a specific artist’s catalog. So there’s a generator for the rapper MF DOOM that’s amazing and totally nonsensical and very silly to play with, which was developed by Nabil Hassein, who’s a technologist.

And he developed this tool to generate rhymes that are in the quote unquote “style” of MF DOOM. So it’s like what does that mean? Well, what that means is that his entire pool of verses has been mined for different rhyme sounds, different combinations of words. And those are the types of bars that are served up to the quote unquote “writer.”

REGINA BARBER: And we’ll get to how good those generators are, but let’s first talk about how you’d written a piece for Public Books arguing that this ability for AI to convincingly impersonate Black artists is part of a legacy of white people impersonating Black performers. Can you unpack that a little bit for us?

ENONGO LUMUMBA-KASONGO: Absolutely, so folks might be familiar with terms like high-tech blackface and digital blackface, which have emerged in popular conversations recently and were coming from the world of critical media studies to talk about the new tools of the digital age that allow non-Black people to adopt Black personhood through their avatars and across different platforms.

Many folks will be familiar with TikTok and the way that it allows folks to mime particular artists. And it’s interesting to think about what circulates, what kinds of clips achieve virality.

And so when we talk about digital blackface, the term blackface itself refers to blackface minstrelsy, which is a racist theater and musical form that emerged in the early 19th century. And it became America’s first national form of entertainment. This practice has been further understood, not just as a space for mockery and disgust, but also as a reflection of white fascination with the other, with the risk of entering into that space, becoming this other group, right?

Becoming the other allows for some kind of transcendence or connection with a quote unquote “primal self.” These kinds of ideas are what undergirded my understanding of what was happening with tools like Uberduck, what the potential is for tools like this to perpetuate some of these racist practices.

REGINA BARBER: Right, and because none of this is new, this kind of appropriation, how could AI remove accountability for appropriating Black artists’ music?

ENONGO LUMUMBA-KASONGO: Yeah, so I think this can happen in a number of ways. Primarily, I think part of how appropriation in the digital age has functioned is that the distance between creator and appropriator, for lack of a better term, has been widened, right? The gulf between where the material is coming from and the person who’s able to co-opt that has really widened in some ways.

And so folks are much less– phrases and ideas will pop up on our feeds, right? And we might have no relationship with where that thing came from. And so in so many cases, we’ve seen African-American Vernacular English phrases that are coming specifically from Black folks that then make their way through the ecosystem of the internet.

And next thing we know, it’s the catch phrase of a brand, right? A brand is utilizing this phrase for monetary gain, or a particular artist or influencer who’s not at all connected with the space from which these ideas and creations emerge.

And so I think with AI specifically, the gulf is further widened because we have this generalizable pool, right? There’s a rapper from nowhere, essentially. We’re able to create a voice from quote unquote “nowhere,” especially when the pool from which these words and phrases and ideas come is not known to the listener and/or writer. And so that’s what really concerns me because I worry there’s no way to account for credit in this system.

REGINA BARBER: I want to circle back to talk more about using the AI. Remember, we said is it good? You have some firsthand experience in working to develop a rap generator for a video game based on the HBO TV series Insecure. Congratulations. That sounds amazing.

[LAUGHTER]

ENONGO LUMUMBA-KASONGO: Thank you.

REGINA BARBER: In the research process you tried out some of the online lyric generators. What did you find? What were their limitations? What were their benefits?

ENONGO LUMUMBA-KASONGO: Well, so in terms of the limitations, one of the things that I found was that, for the most part, the lyrics are kind of nonsensical. You’ll type in a particular prompt, and I picked the word anything.

[LAUGHTER]

So yeah, I want to rap about anything. So tell me how to do that. So I plugged that in. And the verse that came back to me was fairly incoherent as a narrative. But one thing that was interesting as I continued to utilize the tool is I would see the same kinds of lyrics, language that could be coded as misogynist and/or queer antagonistic kept popping up no matter what I was trying to rap about.

And so I wondered, is this a reflection of the pool of words and lyrics that the developers decided to use? Or is this a reflection of the biases of the developers to make sure that those kinds of phrases and framings were showing up in every single iteration of any kind of rap verse I wanted to develop, which I think reflects a broader problem around how people view and listen to hip hop music more broadly.

I know through my own experience as a writer and producer and performer how vast and incredible and incredibly creative and ingenious emcees can be. And yet, in these tools, in the tool that I was using I was so limited in the ways that I could speak about the world. It kept reflecting the tired ideas about hip hop as uniquely perverse or uniquely invested in misogyny or queer antagonism.

REGINA BARBER: It just makes me think of that Outkast line. You thought hip hop was only guns and alcohol.

ENONGO LUMUMBA-KASONGO: Yeah, period.

[LAUGHTER]

REGINA BARBER: You also decided to put some guardrails on what words players could and could not say in the game. How do you go about making those decisions?

ENONGO LUMUMBA-KASONGO: Yeah, ooh, that was so complicated. And I think that’s part of it. I think that the community aspect of developing a tool is critical to its success and critical to its engagement with the communities for whom the thing ostensibly is supposed to serve, right?

I have my ideas about what counts as a dope verse. Or I have my ideas about what’s offensive or what’s interesting. But as a group, as a game studio, we would regularly have conversations about particular words, not running away from words that could be perceived as harmful, but thinking about where the harm is coming from and how to subvert that harm.

So the game allows folks to use the b-word, for example, and you can use that word but only to refer to yourself affirmatively. And so that was a caveat that we threw into the game process because we didn’t want it to be abused or used as a way of demeaning somebody.

But there were certain other words that we excised, the n-word, we made sure that that wasn’t accessible because of the racial politics of the moment. We didn’t have the capacity to figure out an artful way to engage with that, not knowing who was going to be playing the game.

REGINA BARBER: I’m Regina Barber, and this is Science Friday from WNYC Studios. You recently also tested ChatGPT to see how well it might write some of your verses. Were they any good, better than the previous lyric generators you’ve tested before?

ENONGO LUMUMBA-KASONGO: So the verses were definitely light years ahead of the other generators that I tried just in terms of coherence. So I plugged myself in, and the verse that’s supposedly coming from me says, I don’t fit the mold. That’s for sure. I’m not just another rapper. I’m something more. I spit fire on the mic like a dragon’s roar, and I’m not afraid to speak up. That’s what I’m here for.

[LAUGHTER]

REGINA BARBER: OK.

ENONGO LUMUMBA-KASONGO: Yeah, so it’s getting at the things that are important for that particular rapper.

REGINA BARBER: Like dragons.

ENONGO LUMUMBA-KASONGO: I love dragons.

[LAUGHTER]

So that’s one. But also this idea of not fitting the mold, that is actually something that I’m invested in as an emcee. And it understands what we’re about but cannot approach the way that that emerges as a kind of sonic representation or as a written verse, that the complexity of how that’s expressed through each individual artist, the richness of that is completely cut from the picture.

REGINA BARBER: One of the things that ChatGPT isn’t really great at is slant rhymes. Can you explain what a slant rhyme is?

ENONGO LUMUMBA-KASONGO: There are perfect rhymes, and there are slant rhymes. So a perfect rhyme is when the rhyme sounds are exactly the same, so like car and bar, right? And a slant rhyme is when the vowel sound is similar or shared, but the actual makeup of the word is not totally the same.

So the first few lines of a song I have called 1080p say, I’m kind of scared of the Academy. I think that my parents are proud of me. I just wish I knew how to be comfortable here. I never feel like I’m allowed to breathe. So Academy, proud of me, allowed to breathe, they have similar rhyme sounds. And through the magic of rap vocalizing, you can bend them to sound more similar than they would in speech. But that’s an example of how slant rhymes emerge in the rap space.

REGINA BARBER: For one, I love 1080p. But now, let’s listen to an example of another good slant rhyme of yours. Here’s a bit from your verse on Open Mike Eagle’s track Hymnal.

[AUDIO PLAYBACK]

[OPEN MIKE EAGLE, “HYMNAL”] (RAPPING) I’d rather be hiding alone like some Ewoks, up in treetops, creeping around like on T-Boz, steeping the grounds of my teapots, but I’m Steve Jobs, on my Apple updating my eshops, eat an Apple a day, take a brief pause–

REGINA BARBER: Yeah, your job is not being taken any time soon.

[LAUGHTER]

It’s just– it’s not. You’re amazing.

ENONGO LUMUMBA-KASONGO: Thank you. Thank you. It was great to listen to that again.

REGINA BARBER: We’ve just run through all the negatives, most of the negatives of AI and its interaction with hip hop. Do you see any possibility for something good to come out of using AI in hip hop?

ENONGO LUMUMBA-KASONGO: Yeah, so something interesting that came up when I was using ChatGPT is I typed in something like write a rhyme in the style of Jay-Z. And the first thing that popped up was this note.

“I do not have the ability to rap like Jay-Z or any other artist. However, I can suggest that if you want to learn how to rap like Jay-Z, you might consider studying his music and style, practicing freestyling and writing lyrics, and working with a vocal coach or other experienced rapper to develop your skills. Remember that becoming a skilled rapper takes time, dedication, and practice. So don’t get discouraged if it takes a while to achieve your goals. Good luck.”

And I thought that was really amazing because one of the things that I feel like all of these rap generators have missed is an understanding of the complexity of the form. This is really incredible, poetic, and sonic work. And the last thing that I think could be interesting is that there could be a space in which we start building our own libraries and sharing that with folks and having that become an interesting art form.

So I’m thinking about the words of Alexis Andre, who works for Sony Computer Science Laboratories. And I was on a panel with him about ethics and aesthetics last year. And he brought up this really provocative idea about the data itself representing a kind of art form or asset.

And so it’s like here’s the SAMMUS library, right? Here’s the library of common phrases and terms and ideas that come up in this artist’s work. And people might be able to develop that for themselves, which I think could be interesting, could be fun and exciting.

REGINA BARBER: Dr. Lumumba-Kasongo, SAMMUS, thank you so much for this great conversation. It was wonderful.

ENONGO LUMUMBA-KASONGO: Thank you. This was awesome.

REGINA BARBER: Dr. Enongo Lumumba-Kasongo, also known as SAMMUS, is an assistant professor of music at Brown University based in Providence, Rhode Island.

Copyright © 2023 Science Friday Initiative. All rights reserved. Science Friday transcripts are produced on a tight deadline by 3Play Media. Fidelity to the original aired/published audio or video file might vary, and text might be updated or amended in the future. For the authoritative record of Science Friday’s programming, please visit the original aired/published recording. For terms of use and more information, visit our policies pages at http://www.sciencefriday.com/about/policies/.

Meet the Producers and Host

About Shoshannah Buxbaum

Shoshannah Buxbaum is a producer for Science Friday. She’s particularly drawn to stories about health, psychology, and the environment. She’s a proud New Jersey native and will happily share her opinions on why the state is deserving of a little more love.

About Regina G. Barber

@ScienceRegina

Regina G. Barber is a scientist in residence at Short Wave, from NPR.
