I spent the majority of the most recent presidential debate thinking of a fun #rstats project. I had three objectives for the post: (1) that it be fun to do; (2) that it showcase some functionality of rcicero or rcanvas; and (3) that it be interesting to the public. The third objective was admittedly negotiable.

In the next two posts, I’m going to analyze the tweets from our US senators in response to #debate. Part I will show how to obtain each senator’s twitter account with the rcicero package, and Part II will perform some basic text analysis on the corpus of tweets with the awesome rtweet package.

Prep Work

rcicero requires some prep work before use which I’ve spelled out on the package’s GitHub README. Let’s jump right in:

library(rcicero)
library(tidyverse)
set_token_and_user("Your_Cicero_Account_Name", "Your_Cicero_Account_Password")
Your Cicero API user and token options have been set.

To get the twitter accounts, I needed a pair of longitude and latitude coordinates from each state, so I just grabbed some from the zipcodes package:

library(zipcodes)
data(zipcode)
coords <- zipcode %>%
group_by(state) %>%
slice(1) %>%
filter(!is.na(latitude))

Get the Twitter Handles

Now I need to pass each pair of coordinates through the Cicero API with a custom function that gets each senator’s twitter account, and, just for fun, their party:

get_party_and_twitter_account <- function(lat, lon, district_type) {
x <- get_official(lat = lat, lon = lon, district_type = district_type) #FYI get_official() returns a list of data.frames
twitters <- x$identifiers %>%
select(last_name, first_name, identifier, identifier_type) %>%
filter(identifier_type == "TWITTER")
parties <- x$gen_info %>%
select(last_name, first_name, party)
left_join(twitters, parties)
}

Iterate

With our custom function in hand, it’s time to iterate with the purrr package which is so beautiful and elegant, I can’t even bear to look at some of my old code. Warning: the code below is fairly expensive–it will run you about 59 Cicero API credits.

safe_function <- possibly(get_party_and_twitter_account, NULL)
args <- list(coords$latitude,
coords$longitude,
"NATIONAL_UPPER")
twitter_accounts <- args %>%
pmap(safe_function) %>% #FYI we're passing three arguments into the safe function here
bind_rows()

Voila! Our twitter_accounts data.frame now has each senator’s last_name, first_name, twitter handle, and party:

head(twitter_accounts)
Source: local data frame [6 x 5]

last_name first_name identifier identifier_type party
<chr> <chr> <chr> <chr> <chr>
1 Murkowski Lisa lisamurkowski TWITTER Republican
2 Sullivan Daniel SenDanSullivan TWITTER Republican
3 Sessions Jefferson senatorsessions TWITTER Republican
4 Shelby Richard SenShelby TWITTER Republican
5 Boozman John JohnBoozman TWITTER Republican
6 Cotton Tom tomcottonar TWITTER Republican

Stay tuned for Part II.