SciFri Findings is a series that explores how we understand the impact of science journalism, media and programming on our audiences. Sign up for our newsletter to get the latest reports!
Getting feedback from those we serve has always been an integral part of Science Friday’s objective of making science more accessible. Audience research is no different: I am always curious to know what value our work provides our audiences. Where can we make things better? How can we have deeper engagement and impact? In June 2023 we launched an audience survey across multiple platforms (radio, social media, newsletters, donors, etc.). The survey was informed by in-depth interviews on radio programming conducted in Fall 2022. I waited excitedly as the survey went out into the world, hoping 200-300 people would care enough to complete it.
The joy and the surprise as the numbers started trickling, then monsooning, in: 1, then 100, 500, 2,500, all the way up to over 6,800! Our Director of Audience, Ariel Zych, and I started playing with the data. We wanted to use ChatGPT to theme and summarize some of the qualitative data, so we copied some open responses into the chat and began noticing something odd here and there in the data we were pasting over.
There, in our own raw data, were a damning number of clearly AI-generated responses, shamelessly self-disclosing with responses to open preference questions like “As an AI language model, I do not have personal preferences…” Scrolling more through the data it became clear…AI bots had struck our study, HARD! We looked at each other and couldn’t help but laugh at the irony. Here we share some lessons learned (and some we had forgotten) after we wiped away our tears and started cleanup.
Tips for Your Next Online Survey
Use survey software that has a CAPTCHA: A CAPTCHA (“Completely Automated Public Turing test to tell Computers and Humans Apart”) is a challenge we have all likely seen. These programs differentiate human respondents from bots. Many online survey companies provide CAPTCHA options, but often only with paid subscriptions.
No CAPTCHA? Trap ‘em: You may not have the budget for licensing survey tools with a CAPTCHA feature. Trap questions are an alternative that can provide some coverage against bots. They are used to identify respondents who are not paying attention to survey questions (e.g., someone choosing “Strongly Agree” for every question). A trap question can take many forms, including a question to identify an object in an embedded picture, a prompt to type specific words into a text box, etc. Once the data is collected, you can filter out any respondents with incorrect answers. Trap questions not only protect against bots, but also against bad actors such as trolls with an agenda, or people who don’t actually know your product or program but want the cash incentive. By including a small number of trap questions, you can ensure your target audiences are the ones providing you with good data, and eliminate the rest.
We incorporated a trap question during the design phase of our audience survey. Participants were asked to identify Science Friday’s host, Ira Flatow. The answer choices included only other male science journalists and communicators, so every option was plausible, limiting the number of bots and bad actors in the data. We used this type of trap question because we wanted to survey existing audiences who should know the host, not new audiences. This one step eliminated almost 20% (N=1,357) of our sample!
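The trap-question filter above amounts to a single pass over the data. A minimal sketch in Python, assuming a hypothetical dataset with a `host_answer` column (the field names and sample rows are illustrative, not Science Friday’s actual schema):

```python
# Hypothetical survey rows; only the trap-question field matters here.
responses = [
    {"id": 1, "host_answer": "Ira Flatow", "comment": "Love the show"},
    {"id": 2, "host_answer": "Bill Nye", "comment": "Great content"},
    {"id": 3, "host_answer": "Ira Flatow", "comment": "Listen every week"},
]

CORRECT_HOST = "Ira Flatow"

# Keep only respondents who answered the trap question correctly.
kept = [r for r in responses if r["host_answer"] == CORRECT_HOST]
removed = len(responses) - len(kept)
```

Running the real dataset through a filter like this is what produced the ~20% drop described above.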
More money, more bots, more problems: Cash is king in the survey world. Participants are often rewarded with cash or gift cards for each completed survey, and even the chance at a lottery incentive has been shown to increase response rates for online research. We chose to offer a $50 e-gift card lottery incentive to balance the length of the survey and motivate more audience members to complete it. Money is great, but with more money comes a stronger incentive for bot creators, bad actors, and trolls to participate for the cash alone. We quickly realized $50 was a lot to offer for a ~12-13 minute survey. It made me think: How can I value participants’ time while still getting the information I need? Next time, we will consider lowering our cash incentive. Perhaps it could have been limited to $25 instead? If that didn’t yield enough participants, maybe a second recruitment wave would be in order? In the future, particularly for audience surveys, we might consider offering other things of value, such as merchandise or free event tickets. Non-cash offers might reduce the number of people interested only in being paid for survey completion, and they can also give participants tangible materials and/or deeper engagement with your organization.
Segment audiences: Whenever feasible, use different UTM or referral links for each recruitment pathway in your surveys. We used a different link for each platform (e.g., Twitter, newsletters, donors) to understand where traffic was coming from, look for differences in preferences between audiences, and estimate the possible universe size for our sample. More than half of our respondents came from Facebook, which is disproportionately higher than we usually see for surveys. Generally, our radio audiences are the largest source of referrals, so seeing so many come from Facebook was a red flag. Segmenting audiences can also surface strange patterns in the data. For example, if you have previously surveyed your audiences, you may already have demographic data to check new data against. If you know your organization primarily serves older adults, and your survey consists only of young participants, the data may be compromised. Consider whether the cause could be the topic of the survey, the recruitment, or a potential bot.
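Once each recruitment link carries its own UTM tag, spotting a disproportionate referral source is a simple tally. A sketch, with made-up source values standing in for whatever your UTM parameters record:

```python
from collections import Counter

# Hypothetical per-response referral sources, as captured from
# the UTM parameter baked into each recruitment link.
sources = ["facebook", "newsletter", "facebook", "radio", "facebook"]

tally = Counter(sources)
# A platform dominating far beyond its usual share of your audience
# (as Facebook did for us) is worth investigating.
top_source, top_count = tally.most_common(1)[0]
```

Comparing this tally against the referral mix from past surveys is what flagged Facebook as anomalous in our case.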
Cleaning Up The Data
After a few laughs and tears, I had the task of figuring out exactly how to clean up the jumbled mess of data we had. With a filtered dataset (thanks, trap question!), I started cleaning the data using the following criteria:
Impossible timestamps: Responses submitted within the same second as each other were removed. Many of the most suspicious responses were submitted with nearly identical timestamps late at night (12-3 am) or early in the morning (4-7 am), unlikely times for our US-based audiences to complete surveys.
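The same-second check can be done by counting how many responses share each timestamp. A minimal sketch, with invented response IDs and timestamps (the real data would come out of the survey export):

```python
from collections import Counter
from datetime import datetime

# Hypothetical (response_id, submission_time) pairs.
stamps = [
    ("r1", "2023-06-15 02:14:09"),
    ("r2", "2023-06-15 02:14:09"),  # same second as r1 -- suspicious
    ("r3", "2023-06-15 14:30:22"),
]

parsed = [(rid, datetime.strptime(ts, "%Y-%m-%d %H:%M:%S"))
          for rid, ts in stamps]

# Drop every response whose exact submission second is shared
# with another response.
counts = Counter(ts for _, ts in parsed)
kept = [rid for rid, ts in parsed if counts[ts] == 1]
```

Flagging odd-hours submissions would be a second pass over the parsed `datetime` values (e.g., checking `ts.hour`).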
Obvious AI language: I had a number of open-ended questions for the survey. Any responses that had very obvious language (“As an AI language model, I do not have personal preferences…”) were removed.
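Self-disclosing AI boilerplate is easy to screen for with a small phrase list. A sketch of that filter; the phrase list is an assumption based on the responses we saw, and would grow as you find new tells:

```python
# Phrases that strongly suggest a machine-generated response.
AI_TELLS = [
    "as an ai language model",
    "i do not have personal preferences",
]

def looks_like_ai(text: str) -> bool:
    """Return True if the response contains a known AI self-disclosure."""
    lowered = text.lower()
    return any(phrase in lowered for phrase in AI_TELLS)

answers = [
    "As an AI language model, I do not have personal preferences...",
    "I love the live call-in segments on Fridays!",
]

kept = [a for a in answers if not looks_like_ai(a)]
```

A phrase list only catches the laziest bots, of course, which is why the subtler checks below were still needed.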
Non-human-sounding responses: Some of our open-ended questions asked why participants preferred certain broadcast formats. We eliminated any responses that didn’t sound authentic to an audience voice. For example: “Live call can increase the audience’s sense of participation and loyalty…” It is doubtful that an audience member would be discussing loyalty.
Human-sounding, but identical, open responses: Some responses repeated often, including phrases like “It can create memorable moments for both the host and the audience” and “Maintained the authenticity of the program.” It was highly unlikely that multiple individual respondents used the exact same phrasing.
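Verbatim duplicates across supposedly independent respondents can be flagged by normalizing each open response and counting repeats. A sketch with illustrative answers:

```python
from collections import Counter

# Hypothetical open-ended responses from different respondents.
answers = [
    "Maintained the authenticity of the program",
    "I like hearing listener questions live.",
    "Maintained the authenticity of the program",
]

# Normalize lightly (strip whitespace, lowercase) so trivial
# variations still count as duplicates.
counts = Counter(a.strip().lower() for a in answers)

# Keep only responses whose normalized text appears exactly once.
kept = [a for a in answers if counts[a.strip().lower()] == 1]
```

In practice you might review the flagged duplicates by hand before deleting, since short generic answers ("Great show!") can legitimately repeat.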
Designing audience centered content is an inherently inclusive process. Audience surveys are an opportunity to listen to the needs and concerns of our audiences. Surveys are just one tool we use to help gather audience feedback at Science Friday. When all the cleaning was said and done, we were still left with 1200+ survey participants in our sample! This was significantly higher than the 200-300 we initially anticipated. As online research continues to grow, so does the potential for AI bots. I am appreciative of having discovered new ways to improve my practice even if it cost me hours of work and some new gray hairs.
Nahima Ahmed is Science Friday’s Manager of Impact Strategy. She is a researcher who loves to cook curry, discuss identity, and helps the team understand how stories can shape audiences’ access to and interest in science.