How the narrative changed after the Boomers' win

On the 24th of August, 2019, the Australian Men’s Basketball team, the Boomers, created history when they enacted their own David and Goliath moment, taking down Team USA for the first time ever. More impressive was the fact that the US hadn’t lost a game in 13 years. It was a marvelous moment in Australian basketball history, and while the US fielded what is widely considered one of their weaker teams, it was a moment for us all to savour. The fact that we were able to bring the best country in the world to play some exhibition games should have been celebrated, regardless of the result.

That was not to be the case…

After the historic first men’s basketball game at Marvel Stadium, the narrative from the media appeared to be full of negativity, predominantly focusing on the expensive seating with poor views for fans.

I suspect that this happens everywhere, but we Aussies seem to let our opinions swing more than most - we’re more than happy to rag on individuals or teams when things aren’t tracking so well, but flip pretty quickly when they are (Nick Kyrgios says hi).

This analysis will look at the Twitter activity around the time of both games in an attempt to prove or disprove the narrative change.

To collect tweet data, the rtweet and ROAuth packages were used. Tweets between the 22nd of August and 2pm on the 25th of August were collected for the analysis below. Tweets using the official hashtag (#BoomersUSA) or referencing @BasketballAus were scraped.

The full code base and data for this project can be found here

If a tweet was recorded after 2019-08-24 04:00:00 UTC (tip-off time for game 2), it was classed as a “Game2 Starts” tweet. This allows us to compare tweets leading into the historic second game with those that occurred during and after the magical night.
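
The original analysis was done in R; as a language-neutral illustration, here is a minimal Python sketch of the split described above. The tip-off time and the “Game2 Starts” label come from the post, while the “Before Game2” label, the function name and the sample timestamps are assumptions made up for the example.

```python
from datetime import datetime, timezone

# Tip-off time for game 2, as stated in the post (UTC).
GAME2_TIPOFF = datetime(2019, 8, 24, 4, 0, 0, tzinfo=timezone.utc)


def classify_period(created_at):
    """Label a tweet by whether it was posted after game 2 tipped off."""
    # "Before Game2" is a hypothetical label; the post only names "Game2 Starts".
    return "Game2 Starts" if created_at > GAME2_TIPOFF else "Before Game2"


# Hypothetical timestamps, for illustration only.
examples = [
    datetime(2019, 8, 22, 10, 30, tzinfo=timezone.utc),
    datetime(2019, 8, 24, 4, 0, 1, tzinfo=timezone.utc),
]
labels = [classify_period(t) for t in examples]
print(labels)
```

Using timezone-aware timestamps avoids accidentally comparing local Melbourne time against the UTC cut-off.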

Tweet Analysis

Looking at all tweets since game 1, we can see that game 2 had more tweets per hour. No doubt the shock result played a massive part in this.
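
One simple way to put a number on “tweets per hour” is to count the tweets that fall inside a time window and divide by the window’s length. The Python sketch below assumes that definition; the function name, the two-hour window and the sample timestamps are all hypothetical, and the post’s own figure may have been produced differently.

```python
from datetime import datetime, timedelta, timezone


def tweets_per_hour(timestamps, start, end):
    """Average number of tweets per hour within the window [start, end)."""
    hours = (end - start).total_seconds() / 3600
    if hours <= 0:
        raise ValueError("end must be after start")
    in_window = sum(1 for t in timestamps if start <= t < end)
    return in_window / hours


# Hypothetical data: four tweets, one of which falls outside the window.
start = datetime(2019, 8, 24, 4, 0, tzinfo=timezone.utc)
end = start + timedelta(hours=2)
sample = [start + timedelta(minutes=m) for m in (5, 50, 110, 130)]
rate = tweets_per_hour(sample, start, end)
print(rate)
```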

With 4019 ‘favourites’, the following tweet from NBATV was the most favourited:

“It feels awesome. … I hope we can all build on this.” @Patty_Mills describes his emotions after leading @BasketballAus to its first ever win over the U.S.

The most retweeted tweet from the period analysed came from NBL and had 833 retweets. The tweet was:

Patty Mills in the 4th quarter is a piece of art 🔥 #BoomersUSA @SBSVICELAND @SBSOnDemand

Tweet Words Used

Before we can measure the sentiments of tweets, the tweet strings need to be split into ‘tokens’ (or individual words).

Once these tokens have been unnested (split out), we can plot the most frequently used words. Importantly, stop-words and other words we don’t want in our analysis have been omitted. Stop-words include “and”, “the”, “a”, etc - words that don’t add a lot to a sentiment analysis. Additionally, “BoomersUSA” was removed, as this was the hashtag for the game and was mentioned in almost all tweets.
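
To make the tokenising and filtering step concrete, here is a rough Python sketch (the original work was done in R, so this is not the author’s code). The stop-word list is a tiny stand-in for a real one, and the two sample tweets are invented for illustration.

```python
import re
from collections import Counter

# A tiny illustrative stop-word list; a real analysis would use a full one.
STOP_WORDS = {"and", "the", "a", "to", "of", "in", "for", "is", "at", "were"}
# The game hashtag appears in almost every tweet, so it is dropped as well.
CUSTOM_DROP = {"boomersusa"}


def tokenize(text):
    """Lowercase a tweet and split it into word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def top_words(tweets, n=20):
    """Return the n most frequent tokens after removing unwanted words."""
    counts = Counter(
        tok
        for tweet in tweets
        for tok in tokenize(tweet)
        if tok not in STOP_WORDS and tok not in CUSTOM_DROP
    )
    return counts.most_common(n)


# Hypothetical tweets, for illustration only.
tweets = [
    "The seats at the stadium were plastic #BoomersUSA",
    "Plastic seats for a game of basketball #BoomersUSA",
]
print(top_words(tweets, n=2))
```

Lowercasing before counting means “Plastic” and “plastic” are treated as the same word, which is usually what you want for a frequency plot.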

The 20 most frequently used words for tweets that occurred before and after game 2 are plotted below.

As expected, “seats”, “plastic” and “seating” were words that frequently appeared in tweets before Game 2, whereas only “seats” appeared in tweets after Game 2 started. For tweets after game 2 started, “history”, “Patty Mills”, “awesome” and “love” all appeared frequently - very soft and mushy, hey?

The plot below allows us to get an even better look at the differences between the words used before and after Game 2. Words to the bottom right of the diagonal line running through the plot indicate words more frequently used in tweets before game two, while words above the line were more frequently used during and after Game 2.
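
The idea behind this kind of plot can be sketched by comparing each word’s share of all words in the two groups. The Python below is a hedged illustration, assuming the horizontal axis is the “before” proportion and the vertical axis the “after” proportion; the function names and token lists are made up for the example.

```python
from collections import Counter


def proportions(tokens):
    """Share of all tokens accounted for by each distinct word."""
    counts = Counter(tokens)
    total = sum(counts.values())
    return {word: count / total for word, count in counts.items()}


def compare(before_tokens, after_tokens):
    """For each word, report which period it was relatively more common in."""
    before = proportions(before_tokens)
    after = proportions(after_tokens)
    sides = {}
    for word in set(before) | set(after):
        b, a = before.get(word, 0.0), after.get(word, 0.0)
        # a > b puts the word above the diagonal; b > a puts it below.
        sides[word] = "after" if a > b else "before" if b > a else "equal"
    return sides


# Hypothetical tokens, for illustration only.
before_tokens = ["seats", "plastic", "seats", "history"]
after_tokens = ["history", "awesome", "love", "seats"]
sides = compare(before_tokens, after_tokens)
print(sorted(sides.items()))
```

Comparing proportions rather than raw counts matters here, because the two periods do not contain the same number of tweets.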

Looks like we were a happy bunch finally…

Tweet Sentiment Analysis

Once the tokens have been separated, a sentiment score can be calculated.

The method that will be used is the common lexicon for sentiment analysis created by Finn Årup Nielsen (http://www2.imm.dtu.dk/pubdb/views/publication_details.php?id=6010), called the AFINN lexicon. Words that are more positive (“awesome”, for example) are given scores further above zero, while more negative words (like “devastated”) are given scores further below zero.
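
As a rough illustration of lexicon-based scoring, the Python sketch below assumes a tweet’s score is the sum of the scores of its words, with words missing from the lexicon counting as zero. The lexicon here is a toy: its values sit on AFINN’s -5 to +5 scale but are not copied from the real AFINN list, and the sample sentences are invented.

```python
import re

# Illustrative scores only, on AFINN's -5..+5 scale; NOT the real lexicon values.
TOY_LEXICON = {
    "awesome": 4,
    "wonderful": 4,
    "love": 3,
    "devastated": -2,
    "broke": -1,
}


def tweet_score(text, lexicon=TOY_LEXICON):
    """Sum the lexicon scores of every token; unknown words contribute 0."""
    tokens = re.findall(r"[a-z']+", text.lower())
    return sum(lexicon.get(tok, 0) for tok in tokens)


# Hypothetical sentences, for illustration only.
positive = tweet_score("Awesome win, wonderful team effort")
negative = tweet_score("Devastated, the stadium is broke")
print(positive, negative)
```

Because scores are summed, longer tweets packed with emotive words can reach totals well outside the per-word range.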

To get a feel for the power of sentiment analysis, the following tweet was the most positive tweet, with a positivity index of 17:

Totally thrilled with the Boomers win today, and Patty Mills was just brilliant when it counted. Really wonderful team effort. Bring on the World Cup! #BoomersUSA

At the (complete) opposite end of the spectrum, the following tweet was the most negative, with a score of -18 (sorry for the profanities, I’ve done my best to clean them out):

Who the f!&% hired these people to organise this event, what a f!&%ing bs stich up. $ chairs from Bunnings does that come with a forking snag c!&%. Like WTF surely the company who funding to set up the stadium is not that broke.#BoomersUSA

No surprises, the most positive tweet was after we won, the most negative after Game 1.

Plotting the distribution of sentiment scores for each tweet, we can clearly see that the tweets after game 2 started became considerably more positive - the median positivity score (the ratio of positive to negative words) for these tweets was 0.95, over double the 0.43 median for tweets prior to game 2.
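
Comparing medians across the two periods is straightforward once every tweet has a period label and a score. The Python sketch below takes whatever per-tweet score has already been computed and does not try to reproduce the post’s exact positivity measure; the labels, scores and function name are hypothetical.

```python
from statistics import median


def median_by_period(scored_tweets):
    """Group (period, score) pairs by period and take the median of each group."""
    groups = {}
    for period, score in scored_tweets:
        groups.setdefault(period, []).append(score)
    return {period: median(scores) for period, scores in groups.items()}


# Hypothetical (period, score) pairs, for illustration only.
scored = [
    ("Before Game2", -3), ("Before Game2", 0), ("Before Game2", 2),
    ("Game2 Starts", 1), ("Game2 Starts", 4), ("Game2 Starts", 5),
]
medians = median_by_period(scored)
print(medians)
```

The median is a sensible summary here because a handful of extreme tweets would drag a mean around far more.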

As I suspected, we became much happier after our historic win… almost to the point where we’d forgotten about the seating “debacle”, even though the second game was played at the exact same venue, with the exact same seating arrangement… very strange.

I might put this theory of us Aussies to the ultimate test one day and see how we respond to Kyrgios’ success… if he ever tastes the ultimate success!

Feel free to leave some feedback if you like in the comments below.

Jason Zivkovic
Data Scientist

A sports mad Data Scientist just having some fun.