02/24/2023

Rapper And Scholar Sammus Confronts AI In Hip-Hop

17:07 minutes

Rapper and scholar Sammus. Credit: Vrinda Jagota and Zoloo Brown

Over the last six months, there’s been a lot of movement and discussion about the effects that generative AI will have on visual art and writing. But what about its effects on music—in particular, hip-hop?

A few years ago, a deepfake of Kanye West rapping a verse from “Bohemian Rhapsody” by Queen went viral. It was created with just a few clicks using the program Uberduck, which can output AI-generated raps from text of the user’s choice. And it turns out that the rhythmic qualities that make hip-hop performers’ verses so spellbinding are exactly what make them easier to mimic in deepfakes than verses in other genres of music.

Guest host Regina Barber talks with rapper and music, science, and technology scholar Dr. Enongo Lumumba-Kasongo, also known as Sammus, about the unexpected crossovers between hip-hop and the growing field of generative AI. She is also an assistant professor of music at Brown University in Providence, Rhode Island.


Segment Guests

Enongo Lumumba-Kasongo

Dr. Enongo Lumumba-Kasongo is an assistant professor of Music at Brown University in Providence, Rhode Island.

Segment Transcript

JOHN DANKOSKY: This is Science Friday. I’m John Dankosky.

REGINA BARBER: And I’m Regina Barber. On Science Friday, we’ve been following closely the explosion of new artificial intelligence tools available to the public and the ethical implications in using AI within creative mediums. This week, we’re focusing on how AI intersects with the most popular music genre in the United States, hip hop.

Take a listen to this viral deepfake of Kanye West rapping a verse from Bohemian Rhapsody by Queen.

RAPPER (ON RECORDING): Is this the real life? Is this just fantasy? Caught in a landslide, no escape from reality. Open your eyes, look up to the skies and see. I’m just a poor boy, I need no sympathy.

REGINA BARBER: It was created with just a few clicks using the program Uberduck, and it turns out that what makes hip hop performers’ verses so spellbinding is what makes them easier to mimic in deepfakes than other genres of music.

Joining me now to talk more about the role of AI and the future of hip hop is my guest, the rapper and music, science, and technology scholar Dr. Enongo Lumumba-Kasongo, also known as SAMMUS. She’s an assistant professor of music at Brown University based in Providence, Rhode Island. Welcome to Science Friday, Dr. Lumumba-Kasongo.

ENONGO LUMUMBA-KASONGO: Hi, this is so exciting.

REGINA BARBER: Let’s talk about that Kanye deepfake. What was your reaction when you first heard it?

ENONGO LUMUMBA-KASONGO: Oh, I was horrified. I was totally horrified, in part, just because of the space that Kanye West occupies culturally right now.

REGINA BARBER: Right.

ENONGO LUMUMBA-KASONGO: But then also, of course, I immediately– my brain started spinning and thinking about what are the implications of this thing? How is this going to change the landscape, soundscape of the art form in which I work and the art form that I’ve fallen in love with?

REGINA BARBER: Is the ability to create convincing deepfakes unique to hip hop? I imagine you could also create a deepfake of, say, somebody like Taylor Swift or something.

ENONGO LUMUMBA-KASONGO: Yeah, it’s totally possible to create deepfakes in other genres. As long as you have the sonic data, then you’re able to manipulate it and create a pool from which to draw material. The thing that’s interesting and troubling about the development within hip hop is that there’s so much speech data.

There’s so much word data because, as folks who are familiar with the form understand, and according to the words of Adam Bradley, who is a scholar and thinker of hip hop poetics, there’s more speech data per line in a rap song than there is in a song in another genre.

REGINA BARBER: So other than the words per line, there’s also a rapper’s unique style. So actually, does that make it easier to mimic?

ENONGO LUMUMBA-KASONGO: That’s a really great question. Part of my research and interest in this has led me down these interesting wormholes in the world of rap generators. So there are online platforms for folks to quote unquote “craft” verses and, in some cases, craft verses in the style of a particular artist.

So what that means is, for that particular rap generator, the pool of words and phrases are coming directly from a specific artist’s catalog. So there’s a generator for the rapper MF DOOM that’s amazing and totally nonsensical and very silly to play with, which was developed by Nabil Hassein, who’s a technologist.

And he developed this tool to generate rhymes that are in the quote unquote “style” of MF DOOM. So it’s like what does that mean? Well, what that means is that his entire pool of verses has been mined for different rhyme sounds, different combinations of words. And those are the types of bars that are served up to the quote unquote “writer.”

REGINA BARBER: And we’ll get to how good those generators are, but let’s first talk about how you’d written in a piece for Public Books that this ability for AI to convincingly impersonate Black artists is part of a legacy of white people impersonating Black performers. Can you unpack that a little bit for us?

ENONGO LUMUMBA-KASONGO: Absolutely, so folks might be familiar with terms like high-tech blackface and digital blackface, which have emerged in popular conversations recently and were coming from the world of critical media studies to talk about the new tools of the digital age that allow non-Black people to adopt Black personhood through their avatars and across different platforms.

Many folks will be familiar with TikTok and the way that it allows folks to mime particular artists. And it’s interesting to think about what circulates, what kinds of clips achieve virality.

And so when we talk about digital blackface, the term blackface itself refers to blackface minstrelsy, which is a racist theater and musical form that emerged in the early 19th century. And it became America’s first national form of entertainment. This practice has been further understood, not just as a space for mockery and disgust, but also as a reflection of white fascination with the other, with the risk of entering into that space, becoming this other group, right?

Becoming the other allows for some kind of transcendence or connection with a quote unquote “primal self.” These kinds of ideas are what undergirded my understanding of what was happening with tools like Uberduck, what the potential is for tools like this to perpetuate some of these racist practices.

REGINA BARBER: Right, and because none of this is new, this kind of appropriation, how could AI remove accountability for appropriating Black artists’ music?

ENONGO LUMUMBA-KASONGO: Yeah, so I think this can happen in a number of ways. Primarily, I think part of how appropriation in the digital age has functioned is that the distance between creator and appropriator, for lack of a better term, has been widened, right? The gulf between where the material is coming from and the person who’s able to co-opt that has really widened in some ways.

And so folks are much less– phrases and ideas will pop up on our feeds, right? And we might have no relationship with where that thing came from. And so in so many cases, we’ve seen African-American Vernacular English phrases that are coming specifically from Black folks that then make their way through the ecosystem of the internet.

And next thing we know, it’s the catch phrase of a brand, right? A brand is utilizing this phrase for monetary gain, or a particular artist or influencer who’s not at all connected with the space from which these ideas and creations emerge.

And so I think with AI specifically, the gulf is further widened because we have this generalizable pool, right? There’s a rapper from nowhere, essentially. We’re able to create a voice from quote unquote “nowhere,” especially when the pool from which these words and phrases and ideas come is not known to the listener and/or writer. And so that’s what really concerns me because I worry there’s no way to account for credit in this system.

REGINA BARBER: I want to circle back to talk more about using the AI. Remember, we said is it good? You have some firsthand experience in working to develop a rap generator for a video game based on the HBO TV series Insecure. Congratulations. That sounds amazing.

[LAUGHTER]

ENONGO LUMUMBA-KASONGO: Thank you.

REGINA BARBER: In the research process you tried out some of the online lyric generators. What did you find? What were their limitations? What were their benefits?

ENONGO LUMUMBA-KASONGO: Well, so in terms of the limitations, one of the things that I found was that, for the most part, the lyrics are kind of nonsensical. You’ll type in a particular prompt, and I picked the word anything.

[LAUGHTER]

So yeah, I want to rap about anything. So tell me how to do that. So I plugged that in. And the verse that came back to me was fairly incoherent as a narrative. But one thing that was interesting as I continued to utilize the tool is I would see the same kinds of lyrics, language that could be coded as misogynist and/or queer antagonistic kept popping up no matter what I was trying to rap about.

And so I wondered, is this a reflection of the pool of words and lyrics that the developers decided to use? Or is this a reflection of the biases of the developers to make sure that those kinds of phrases and framings were showing up in every single iteration of any kind of rap verse I wanted to develop, which I think reflects a broader problem around how people view and listen to hip hop music more broadly.

I know through my own experience as a writer and producer and performer how vast and incredible and incredibly creative and ingenious emcees can be. And yet, in these tools, in the tool that I was using I was so limited in the ways that I could speak about the world. It kept reflecting the tired ideas about hip hop as uniquely perverse or uniquely invested in misogyny or queer antagonism.

REGINA BARBER: It just makes me think of that Outkast line. You thought hip hop was only guns and alcohol.

ENONGO LUMUMBA-KASONGO: Yeah, period.

[LAUGHTER]

REGINA BARBER: You also decided to put some guardrails on what words players could and could not say in the game. How do you go about making those decisions?

ENONGO LUMUMBA-KASONGO: Yeah, ooh, that was so complicated. And I think that’s part of it. I think that the community aspect of developing a tool is critical to its success and critical to its engagement with the communities for whom the thing ostensibly is supposed to serve, right?

I have my ideas about what counts as a dope verse. Or I have my ideas about what’s offensive or what’s interesting. But as a group, as a game studio, we would regularly have conversations about particular words, not running away from words that could be perceived as harmful, but thinking about where the harm is coming from and how to subvert that harm.

So the game allows folks to use the b-word, for example, and you can use that word but only to refer to yourself affirmatively. And so that was a caveat that we threw into the game process because we didn’t want it to be abused or used as a way of demeaning somebody.

But there were certain other words that we excised, the n-word, we made sure that that wasn’t accessible because of the racial politics of the moment. We didn’t have the capacity to figure out an artful way to engage with that, not knowing who was going to be playing the game.

REGINA BARBER: I’m Regina Barber, and this is Science Friday from WNYC Studios. You recently also tested ChatGPT to see how well it might write some of your verses. Were they any good, better than the previous lyric generators you’ve tested before?

ENONGO LUMUMBA-KASONGO: So the verses were definitely light years ahead of the other generators that I tried just in terms of coherence. So I plugged myself in, and the verse that’s supposedly coming from me says, I don’t fit the mold. That’s for sure. I’m not just another rapper. I’m something more. I spit fire on the mic like a dragon’s roar, and I’m not afraid to speak up. That’s what I’m here for.

[LAUGHTER]

REGINA BARBER: OK.

ENONGO LUMUMBA-KASONGO: Yeah, so it’s getting at the things that are important for that particular rapper.

REGINA BARBER: Like dragons.

ENONGO LUMUMBA-KASONGO: I love dragons.

[LAUGHTER]

So that’s one. But also, this idea of not fitting the mold, that is actually something that I’m invested in as an emcee. And it understands what we’re about but cannot approach the way that that emerges as a kind of sonic representation or as a written verse. The complexity of how that’s expressed through each individual artist, the richness of that, is completely cut from the picture.

REGINA BARBER: One of the things that ChatGPT isn’t really great at is slant rhymes. Can you explain what a slant rhyme is?

ENONGO LUMUMBA-KASONGO: There are perfect rhymes, and there are slant rhymes. So a perfect rhyme is when the rhyme sounds are exactly the same, so like car and bar, right? And a slant rhyme is when the vowel sound is similar or shared, but the actual makeup of the word is not totally the same.

So the first few lines of a song I have called 1080p say, I’m kind of scared of the Academy. I think that my parents are proud of me. I just wish I knew how to be comfortable here. I never feel like I’m allowed to breathe. So Academy, proud of me, allowed to breathe, they have similar rhyme sounds. And through the magic of rap vocalizing, you can bend them to sound more similar than they would in speech. But that’s an example of how slant rhymes emerge in the rap space.

REGINA BARBER: For one, I love 1080p. But now, let’s listen to an example of another good slant rhyme of yours. Here’s a bit from your verse on Open Mike Eagle’s track Hymnal.

[AUDIO PLAYBACK]

[OPEN MIKE EAGLE, “HYMNAL”] (RAPPING) I’d rather be hiding alone like some Ewoks, up in treetops, creeping around like on T-Boz, steeping the grounds of my teapots, but I’m Steve Jobs, on my Apple updating my eshops, eat an Apple a day, take a brief pause–

REGINA BARBER: Yeah, your job is not being taken any time soon.

[LAUGHTER]

It’s just– it’s not. You’re amazing.

ENONGO LUMUMBA-KASONGO: Thank you. Thank you. It was great to listen to that again.

REGINA BARBER: We’ve just run through all the negatives, most of the negatives of AI and its interaction with hip hop. Do you see any possibility for something good to come out of using AI in hip hop?

ENONGO LUMUMBA-KASONGO: Yeah, so something interesting that came up when I was using ChatGPT is I typed in something like write a rhyme in the style of Jay-Z. And the first thing that popped up was this note.

“I do not have the ability to rap like Jay-Z or any other artist. However, I can suggest that if you want to learn how to rap like Jay-Z, you might consider studying his music and style, practicing freestyling and writing lyrics, and working with a vocal coach or other experienced rapper to develop your skills. Remember that becoming a skilled rapper takes time, dedication, and practice. So don’t get discouraged if it takes a while to achieve your goals. Good luck.”

And I thought that was really amazing because one of the things that I feel like all of these rap generators have missed is an understanding of the complexity of the form. This is really incredible, poetic, and sonic work. And the last thing that I think could be interesting is that there could be a space in which we start building our own libraries and sharing that with folks and having that become an interesting art form.

So I’m thinking about the words of Alexis Andre, who works for Sony Computer Science Laboratories. And I was on a panel with him about ethics and aesthetics last year. And he brought up this really provocative idea about the data itself representing a kind of art form or asset.

And so it’s like here’s the SAMMUS library, right? Here’s the library of common phrases and terms and ideas that come up in this artist’s work. And people might be able to develop that for themselves, which I think could be interesting, could be fun and exciting.

REGINA BARBER: Dr. Lumumba-Kasongo, SAMMUS, thank you so much for this great conversation. It was wonderful.

ENONGO LUMUMBA-KASONGO: Thank you. This was awesome.

REGINA BARBER: Dr. Enongo Lumumba-Kasongo, also known as SAMMUS, is an assistant professor of music at Brown University based in Providence, Rhode Island.

Copyright © 2023 Science Friday Initiative. All rights reserved. Science Friday transcripts are produced on a tight deadline by 3Play Media. Fidelity to the original aired/published audio or video file might vary, and text might be updated or amended in the future. For the authoritative record of Science Friday’s programming, please visit the original aired/published recording. For terms of use and more information, visit our policies pages at http://www.sciencefriday.com/about/policies/.

Meet the Producers and Host

About Shoshannah Buxbaum

Shoshannah Buxbaum is a producer for Science Friday. She’s particularly drawn to stories about health, psychology, and the environment. She’s a proud New Jersey native and will happily share her opinions on why the state is deserving of a little more love.

About Regina G. Barber

Regina G. Barber is a scientist in residence at Short Wave, from NPR.
