This work was inspired by the “Anatomy of the Osama Tweet”, which traced how the news of Osama Bin Laden’s death spread just before it was officially announced by U.S. President Obama. Naturally, the original speculative tweet spread virally through Twitter. My colleague Andrew Deacon and I have undertaken a similar analysis of information flows during an incident which occurred at the University of Cape Town (UCT) yesterday. With the proliferation and adoption of social networks, information tends to spread faster and more virally. Because this now happens electronically, we can retrospectively analyze how it all plays out. But it takes the right kind of incident to actually warrant the viral spread of information. As this article highlights, most of the information posted to popular social networks actually goes nowhere in terms of amplification or repetition (93% of tweets go nowhere, for example).
So when I heard about the crash landing of a helicopter at UCT I knew we would have a viral situation on our hands. In fact, on the way into the office just after the incident, I was already hearing students chatting about it on the city campus. People chat in corridors, SMS messages are sent, phones ring, BBMs buzz, WhatsApps beep, Facebook statuses are updated, and tweets are posted to Twitter. The last of these, Twitter, is arguably the most open and accessible network for analysis. We collect tweets from Twitter on an ongoing basis whenever someone tweets about #UCT, and our dataset is slowly growing. This incident prompted many people from all over South Africa and the world to tweet about UCT and the helicopter crash. So whereas only a few years ago one might have first heard about this incident on the six o’clock news, people all over the world were already discovering the story through social networks.
We harvested 1168 nodes and 1681 edges worth of data from Twitter at about 3pm on August 3rd, 2011. Tweets had to contain the words “UCT” and “helicopter” to be included. That means our analysis includes 1168 distinct people who tweeted about the incident, with 1681 connections between them. A connection may be a retweet (amplification) or a mention (reference). The lines show these connections between the nodes (people) tweeting about the incident. Zoom in with your mouse or the + button in the image to explore.
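To make the node and edge idea concrete, here is a minimal sketch of how retweets and mentions could be turned into a network. It is not the NodeXL process itself; the sample tweets, the usernames and the build_edges helper are all made up for illustration, and it assumes retweets use the old "RT @user" convention.

```python
import re

# Hypothetical sample tweets as (author, text); not taken from the real dataset.
sample_tweets = [
    ("alice", "Helicopter down at UCT!"),
    ("bob", "RT @alice: Helicopter down at UCT!"),
    ("carol", "@bob is the UCT helicopter pilot ok?"),
]

MENTION = re.compile(r"@(\w+)")

def build_edges(tweets):
    """Return (nodes, edges); each edge is (source, target, kind)."""
    nodes, edges = set(), []
    for author, text in tweets:
        nodes.add(author)
        # Assumption: a retweet starts with "RT @user".
        is_retweet = text.startswith("RT @")
        for i, target in enumerate(MENTION.findall(text)):
            nodes.add(target)
            kind = "retweet" if is_retweet and i == 0 else "mention"
            edges.append((author, target, kind))
    return nodes, edges

nodes, edges = build_edges(sample_tweets)
print(len(nodes), "nodes,", len(edges), "edges")
```

Each person becomes a node once, no matter how often they tweet, while every retweet or mention adds an edge.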
One can see quite clearly in this visual the twits that were most responsible for the spread of information via Twitter. Mr_capeTown and GarethCliff are shown as very central nodes in the diagram, as are the media outlets Radio702, 945Kfm, MyNews24 and UctRadio. There are also a host of satellite networks around the outside of the diagram which are not connected to the central network.
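The split between the central network and the satellites is what graph people call connected components. As a rough sketch (Gephi has its own tooling for this), the groups could be found with a plain breadth-first search. The edge list and usernames below are invented, and edge direction is ignored.

```python
from collections import defaultdict, deque

# Hypothetical edge list (source, target); usernames are made up.
edges = [("a", "hub"), ("b", "hub"), ("c", "b"), ("x", "y")]

def connected_components(edges):
    """Group users into components, ignoring edge direction."""
    neighbours = defaultdict(set)
    for src, dst in edges:
        neighbours[src].add(dst)
        neighbours[dst].add(src)
    seen, components = set(), []
    for start in list(neighbours):
        if start in seen:
            continue
        seen.add(start)
        queue, members = deque([start]), set()
        while queue:
            user = queue.popleft()
            members.add(user)
            for other in neighbours[user]:
                if other not in seen:
                    seen.add(other)
                    queue.append(other)
        components.append(members)
    # Largest component first: the "central network", then the satellites.
    return sorted(components, key=len, reverse=True)

components = connected_components(edges)
print([sorted(group) for group in components])
```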
If you want to explore this social network graph in greater detail, download this PDF file. Within the PDF you can zoom in ultra close and even search to find your Twitter user name.
The most frequently retweeted or mentioned users are shown in the graph below.
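Counting who gets retweeted or mentioned most amounts to counting incoming edges per user. A small sketch with an invented edge list (we did our own summaries in Excel, so this is only an illustration):

```python
from collections import Counter

# Hypothetical (source, target) pairs: source retweeted or mentioned target.
edges = [
    ("bob", "alice"),
    ("carol", "alice"),
    ("dave", "bob"),
    ("erin", "alice"),
]

# In-degree: how many times each user was retweeted or mentioned.
in_degree = Counter(target for _, target in edges)
top_users = in_degree.most_common(2)
print(top_users)
```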
The incident was said to have happened around 9:30 am, and the tweets started pouring in fast. The image below shows the number of tweets per five-minute interval following the incident.
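The five-minute counts are simple to reproduce: round each tweet's timestamp down to the start of its five-minute window and tally. The timestamps below are invented for the example, and bin_start is a hypothetical helper name.

```python
from collections import Counter
from datetime import datetime

# Hypothetical tweet timestamps; not the real data.
timestamps = [
    datetime(2011, 8, 3, 9, 31, 10),
    datetime(2011, 8, 3, 9, 33, 59),
    datetime(2011, 8, 3, 9, 35, 0),
    datetime(2011, 8, 3, 9, 47, 30),
]

def bin_start(ts, minutes=5):
    """Round a timestamp down to the start of its interval."""
    return ts.replace(minute=ts.minute - ts.minute % minutes,
                      second=0, microsecond=0)

counts = Counter(bin_start(ts) for ts in timestamps)
for start in sorted(counts):
    print(start.strftime("%H:%M"), counts[start])
```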
The first photos were posted by folks who captured the scene on their cell phone cameras. The two most popular images are still available here and here. Gradually, news teams arrived and the following video by official media outlet News24 was posted to YouTube.
Only a portion of users' tweets included location data. There is an option in Twitter to have your coordinates included with each tweet. Based on the data we had available we came up with the following view of tweets originating in South Africa. The lines again represent retweets and mentions between locations.
Circles representing tweets in the bottom left are from Cape Town, then along the Southern Coast to Port Elizabeth, East London and Durban. The large cluster diagonally up from Cape Town would be Johannesburg/Pretoria with Bloemfontein in between.
Globally the available location data is shown below.
We can infer that tweets went out from parts of Europe, Asia, North America and other African countries. In many cases these foreign tweeters were in some way connected to a South African tweeter, indicating a viral global spread.
There are a number of limitations to this small-scale real-time study. In the first place, we have only captured the conversations which were happening within Twitter. Furthermore, we only captured tweets which include the words “UCT” and “helicopter”, thus missing tweets which use abbreviations such as “heli”. We are also missing the conversations which took place within the other social networks. I am sure that the conversations which happened in Facebook, a much more widely adopted social network in South Africa, would also be incredibly interesting to analyze. We have additionally noticed activity streams about the event in LinkedIn.
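The keyword limitation is easy to demonstrate. The sketch below compares a strict filter (both full words required) with a slightly broader one that also accepts “heli”. Both functions and the sample tweets are hypothetical; the actual matching was done by the Twitter search, not by code like this.

```python
import re

def matches_strict(text):
    """Both 'uct' and the full word 'helicopter' must appear."""
    lowered = text.lower()
    return "uct" in lowered and "helicopter" in lowered

# Assumption: a broader pattern that also accepts "heli" and plurals.
UCT = re.compile(r"\buct\b", re.IGNORECASE)
HELI = re.compile(r"\bheli(copter)?s?\b", re.IGNORECASE)

def matches_broad(text):
    return bool(UCT.search(text) and HELI.search(text))

samples = [
    "Helicopter down at UCT",
    "Heli crash at #UCT",
    "Traffic is bad today",
]
for text in samples:
    print(text, "|", matches_strict(text), matches_broad(text))
```

The second sample is exactly the kind of tweet the strict filter drops.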
One of the toughest current hurdles in doing such an analysis is actually getting at the data you want. In this case we used NodeXL to extract the data from Twitter using the Twitter API. We then built the network graphs in Gephi while using Excel to summarize the data. One of the reasons we were able to get access to this data was that the event happened suddenly and quickly and was not too massive. Larger events which unfold over time seem to be more difficult to gather data for at this point.
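For anyone without NodeXL, one possible route into Gephi is a plain CSV edge list. The sketch below assumes Gephi's spreadsheet import picks up columns named Source and Target (worth checking against the version you have); the edges and the extra Kind column are invented for illustration.

```python
import csv
import io

# Hypothetical edges as (source, target, kind).
edges = [("bob", "alice", "retweet"), ("carol", "bob", "mention")]

def edges_to_csv(edges):
    """Serialise edges to CSV text with a header row."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["Source", "Target", "Kind"])
    writer.writerows(edges)
    return buffer.getvalue()

csv_text = edges_to_csv(edges)
print(csv_text)
```

Writing csv_text to a file gives something that can be opened in Excel as well.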
I think it's remarkable that this event happened yesterday and we are already able to analyze the social nature of some of the information flows around the event. The next phase of this analysis is to try to replay the flow of tweets based on their timestamps. We want to examine exactly how the information flowed through the network over time. I am surprised that our current toolset does not allow this. Gephi does have a ‘Graph Streaming’ tool, but it has to be hooked up to a live feed of information using JSON. We are looking for help or suggestions as to how to use this dataset to reconstruct the social spread of information. Leave us a comment if you have any tips!
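One simple starting point for the replay, sketched here under the assumption that each edge carries the timestamp of the tweet that created it: sort the edges by time and watch the network grow step by step. The data and the replay helper are hypothetical, and this does not touch Gephi's streaming tool at all.

```python
from datetime import datetime

# Hypothetical timestamped edges: (time, source, target).
timed_edges = [
    (datetime(2011, 8, 3, 9, 40), "carol", "bob"),
    (datetime(2011, 8, 3, 9, 32), "bob", "alice"),
    (datetime(2011, 8, 3, 9, 36), "dave", "alice"),
]

def replay(edges):
    """Yield (timestamp, users_so_far, edges_so_far) in time order."""
    seen = set()
    for count, (ts, src, dst) in enumerate(sorted(edges), start=1):
        seen.update((src, dst))
        yield ts, len(seen), count

timeline = list(replay(timed_edges))
for ts, users, links in timeline:
    print(ts.strftime("%H:%M"), users, "users,", links, "edges")
```

Each step of the timeline could then be drawn as one frame of an animation.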