Tracking Tweets To Forecast Smoky Skies
The Rocky Fire and the Jerusalem Fire scorched nearly 100,000 acres in northern California in July and August of 2015… and when the prevailing winds were right, smoke drifted all the way down into the San Francisco Bay Area.
That’s when locals began tweeting their observations:
— Patricia Britton (@geewhizpat) August 15, 2015
Over last 72 hours, ER visits from cough up 90%, other resp. ER visits up 411%, according to Fresno Cty Dept of Public Health #RoughFire
— Kerry Klein (@EineKleineKerry) September 11, 2015
Now, scientists at the U.S. Forest Service have analyzed 39,000 tweets like these from the 2015 wildfire season, and found that social media data can be a reliable way to augment existing air quality monitoring data in predicting the extent—and the public health effects—of wildfire smoke.
The researchers presented the findings at a conference in Denmark earlier this month, and study author Sonya Sachdeva joins Science Friday to talk about how tweets can be a useful tool to learn about air quality and people’s perspectives on nearby wildfires.
Sonya Sachdeva is a research social scientist with the US Forest Service Northern Research Station in Chicago, Illinois.
JOHN DANKOSKY: This is Science Friday. I’m John Dankosky.
The West is burning, again. New fires exploded up and down the state of California this week, killing several people. And the Ferguson fire, which forced Yosemite Valley to close, is only a quarter contained.
As the expression goes, where there’s fire, there’s smoke, and a lot of it. It can sometimes be hard for authorities to track where all that smoke is going and what the public health effects might be, depending on where the winds blow. So they’re looking to a new source of data on that smoky air pollution– your tweets.
Joining me now to talk about using social media data for air quality prediction is Sonya Sachdeva, who is a research social scientist with the US Forest Service Northern Research Station in Chicago. And she joins us from WBEZ.
Sonya, welcome to Science Friday. Thanks so much for being here.
SONYA SACHDEVA: Thank you so much for having me, John.
JOHN DANKOSKY: So first of all, how do we currently track wildfire smoke? Monitoring stations, that sort of thing?
SONYA SACHDEVA: Yeah, right now the most robust estimates of air quality come to us from the thousands of monitoring stations that the EPA maintains across the country. And usually, those estimates tend to be gathered on a daily basis.
JOHN DANKOSKY: So that’s the way we’re doing it now. So what are you doing here? How do you think Twitter can improve our air quality forecasting?
SONYA SACHDEVA: So what we found in two consecutive studies is that the frequency with which users post content about wildfire, and specifically wildfire and smoke, tends to be a pretty good estimator of air quality impacts in the regions surrounding wildfires. And it gives us a pretty dynamic, real-time estimate of how the smoke is moving across a region.
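The idea described here, that the number of smoke-related posts per day tracks measured air quality, can be illustrated with a minimal Python sketch. It is not the researchers' actual model: the sample tweets, the `monitor_pm25` readings, the keyword list, and the function names are all invented for illustration, and a simple Pearson correlation stands in for whatever statistical method the study used.

```python
from collections import Counter
from math import sqrt


def daily_counts(tweets, keywords=("smoke", "smoky")):
    """Count, per date, the tweets whose text mentions any keyword."""
    counts = Counter()
    for date, text in tweets:
        lowered = text.lower()
        if any(keyword in lowered for keyword in keywords):
            counts[date] += 1
    return counts


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length series."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


# Hypothetical (date, text) pairs, made up for this example only.
tweets = [
    ("2015-08-14", "Clear skies this morning"),
    ("2015-08-15", "So much smoke in the air today"),
    ("2015-08-15", "Smoky sunset over the bay"),
    ("2015-08-16", "Smoke everywhere, can't see the hills"),
    ("2015-08-16", "Still smoky downtown"),
    ("2015-08-16", "Smoke is getting worse"),
]

# Hypothetical daily monitor readings (not real measurements).
monitor_pm25 = {"2015-08-14": 8.0, "2015-08-15": 30.0, "2015-08-16": 55.0}

counts = daily_counts(tweets)
dates = sorted(monitor_pm25)
tweet_series = [counts.get(date, 0) for date in dates]
pm_series = [monitor_pm25[date] for date in dates]
r = pearson(tweet_series, pm_series)
print(tweet_series, round(r, 3))
```

A correlation near 1 on real data would suggest that tweet volume rises and falls with the monitor readings; with only three invented days, the number here demonstrates the mechanics and nothing more.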
JOHN DANKOSKY: How did you think to do this?
SONYA SACHDEVA: Well, there have actually been a lot of researchers– mostly in China– who have used Weibo, which is, I think, a Chinese version of Twitter or a pretty good similar version of that. And they’ve been using Weibo to look at whether posts on Weibo can actually estimate air quality, independent of wildfires, in China. And they’ve found that in the absence of monitoring stations or reliable data from monitoring stations, their Weibo posts can actually be a good estimator.
JOHN DANKOSKY: Can you give a sense of what people are thinking about a fire from their tweets? About how concerned they are for their safety? How close the fire might be– that sort of thing?
SONYA SACHDEVA: Yeah, and as social scientists, that was actually one of the most interesting aspects of this project for my co-author Sarah McCaffrey and I. We were able to build a conceptual model of how people are actually reacting to and adapting to wildfire smoke in their regions.
JOHN DANKOSKY: When people express something about a fire, obviously there’s a lot of fear, probably. I’m wondering how their opinions, as communicated on social media, could be viewed as interesting data, but maybe they can help you predict a little bit more. Tell me more about the content that people are actually providing in these tweets.
SONYA SACHDEVA: And it’s interesting that you bring up the fear. That was actually a topic we did not see emerging in the automated text analysis that we did of these tweets.
What we saw was people were obviously very concerned about their air quality impacts. So there’d be a lot of tweets about, I can’t breathe today. The air is thick with smoke. And what we found was that when we added that semantic content into our predictor models of air quality, we were actually able to increase the predictive power of our models by about 50%.
In addition to those topics about smoke and air quality, we also saw that people were tweeting about offering their gratitude and thanks to the firefighters who are putting themselves in harm’s way. They were also really concerned about their own safety and asking the world to pray for them. And especially with some of the more recent fires, like the Carr fire, I have, once again, seen those types of topics emerge.
JOHN DANKOSKY: We had asked our listeners to tweet us about what they’re seeing. [? Bonnie ?] on Twitter said, it’s a little smoky in Reno, Nevada today from the Ferguson fire near Yosemite. We’re getting forecasts for smoke, along with our local weather forecasts.
What she’s telling me is the wildfire smoke is traveling quite a way. Reno’s not real close to Yosemite. You’re seeing some of the smoke go a long way.
SONYA SACHDEVA: Hundreds of miles, in fact. And actually, one analysis that I saw that was done by the NRDC suggests that about 2/3 of the counties in the US are affected by wildfire smoke to some extent.
JOHN DANKOSKY: That’s amazing.
What you did here is an analysis of tweets after the fact. Is this something that’s almost ready to go? Could you use this in real time in some way?
SONYA SACHDEVA: That is our eventual goal. And right now, one of the biggest problems that we see with that is, in order to get the most high quality tweets, we have to rely on the hashtags. So it’s #carrfire, #fergusonfire, and tweets gleaned on those keywords tend to give us the highest quality tweets.
But of course, if you’re building something like this in real time, you can’t really rely on those hashtags because in a lot of cases, they haven’t even been developed yet. So we’re trying to figure out, what are the best keywords to give us some of those really high quality data points?
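The trade-off between hashtag-based and keyword-based collection can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the sample tweets, the fallback keyword set, and the function names are hypothetical, and only the two hashtags come from the conversation above.

```python
import re

HASHTAG_RE = re.compile(r"#(\w+)")
WORD_RE = re.compile(r"[a-z']+")


def hashtags(text):
    """Return the lowercased hashtags found in a tweet, without the '#'."""
    return {tag.lower() for tag in HASHTAG_RE.findall(text)}


def matches_hashtag(text, fire_tags):
    """True if the tweet carries one of the known fire hashtags."""
    return bool(hashtags(text) & fire_tags)


def matches_keywords(text, keywords):
    """True if the tweet contains any generic keyword as a whole word."""
    words = set(WORD_RE.findall(text.lower()))
    return bool(words & keywords)


# Hashtags mentioned in the interview; they only exist once a fire is named.
fire_tags = {"carrfire", "fergusonfire"}

# Hypothetical generic keywords that could work before a hashtag exists.
smoke_keywords = {"smoke", "smoky", "wildfire"}

# Invented sample tweets for illustration.
samples = [
    "Ash falling on my car #CarrFire",
    "The air is thick with smoke today",
    "Great coffee this morning",
]

for text in samples:
    by_tag = matches_hashtag(text, fire_tags)
    by_keyword = matches_keywords(text, smoke_keywords)
    print(f"{text!r}: hashtag={by_tag}, keyword={by_keyword}")
```

The hashtag filter is precise but misses the second tweet; the keyword filter catches it but would also pick up unrelated uses of "smoke," which is the data-quality problem described above.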
JOHN DANKOSKY: But I’m wondering, here you are on a national radio program and you’re telling people about this, is there a way to get people to think about Twitter as a way to help you get information? Like if everyone used the hashtag and everyone gave you certain types of information about their location and the type of smoke they were seeing, you might get a lot more data than you ever had before.
SONYA SACHDEVA: That is an excellent point and we are hoping to work with local managers in promoting those types of norms. And actually, it’s funny that you mentioned it, but just this morning, I was on Twitter and I was looking at some of the posts around the Carr fire. And I noticed that there were users policing other users and asking them not to conflate multiple fires in a single post.
And I found that to be so interesting that these norms are evolving and are being self-policed by users on Twitter.
JOHN DANKOSKY: How much do you have to cancel out the noise, though, that you get with any sorts of large information coming from sources that might not be reliable? I think of myself as a news person who uses Twitter a lot, but every time there’s a breaking news story, you have to look at things with a little bit of a grain of salt. Is there a lot of data that you’re getting that maybe isn’t all that useful?
SONYA SACHDEVA: Yes. As with any data source– as you pointed out– there’s going to be some level of noise. And with social media sources just because we’re reaching millions of people, there might be a lot more noise. And there is an intensive data cleaning process that we go through and as we’re building a real-time monitoring system for fire, that is something that we have to be very aware of.
JOHN DANKOSKY: There’s a couple of other sources that are out there right now. There’s an account on Twitter @wildfiresignal and the EPA has an app called Smoke Sense. Can you tell us about those and how they might interact with some of the work that you’re doing?
SONYA SACHDEVA: Yeah, I think it’s a really great time to be using social media data and basically, just crowdsourced data as a whole in all of these multifaceted ways to target this specific issue. In terms of the Twitter bot @wildfiresignal, it’s a little bit different than what we’re doing because what Descartes Labs has actually done is they identify active fires on the basis of a governmental database. And then, they use a satellite to take pictures of where the smoke is coming, so it’s a more distant perspective on how smoke is traveling.
And Smoke Sense– as you mentioned, the app by EPA– is another great example of bi-directional communication that can be done via the use of smartphones. However, they’re not really able to harness social media conversations, which tend to be much more prolific than users of a particular app.
JOHN DANKOSKY: Wildfires seem to just be getting worse. Is this something that in the next couple of years you think you can operationalize at a higher level than you are right now?
SONYA SACHDEVA: Yeah, we’re definitely hoping to scale this project out. Not just in the US, but wildfires are becoming a global concern or have been a global concern. And particularly in regions where monitoring stations are sparse, but social media use is widespread, we’re hoping this could be a really influential tool.
JOHN DANKOSKY: Sonya Sachdeva is a research social scientist with the US Forest Service Northern Research Station in Chicago. Thank you so much for joining me. I really appreciate it.
SONYA SACHDEVA: Thank you.