Twitch Live Dashboard – Accessing Twitch API

What is Twitch?

Twitch is the world’s largest live streaming platform, aimed primarily at gamers and e-sports fans. With an average of 15 million unique daily viewers, the world spent a whopping 560 billion minutes watching content on Twitch in 2018. [1]
We decided to take a look at the Top Games and Top Streamers being watched on Twitch. Twitch makes this easy by providing a ton of developer tools, including relevant data accessible through the Twitch API.

Here’s a pretty simple dashboard we created on Tableau

You can take a look at an interactive version published on my Tableau Public profile. This dashboard shows the Top Games streamed and the Top Streamers based on the number of viewers. If you select a game, you can see the Top Streamers for that game. Tableau Public only takes data extracts, so this public dashboard is not pulling live data from Twitch.

Link to the GitHub repo: https://github.com/kaivalyapowale/Twitch-Dashboard/

Team project by Vardayini Sharma, Maddhujeet Chandra, and Kaivalya Powale.

Tutorial

I have outlined a complete tutorial, from accessing the Twitch API with a Python script to the final Tableau dashboard design.
(For our real-time dashboard we used an Amazon Web Services EC2 instance to run our Python script and AWS RDS to store the live data. I am not going over those in this tutorial. Instead, I am using a locally stored CSV file which my Python script updates automatically every 60 seconds.)

1. Create a Twitch Developer account at https://dev.twitch.tv/. Go to your Dashboard and ‘Register a New Application’. You will need this to get your Client ID and Client Secret.
Use localhost as the redirect URL. You will get your Client ID and Client Secret on this page.
2. Check out the Twitch API documentation to fully understand the API calls, what data they offer, and what authentication they require for the data you need.
The Twitch API has a lot of available requests. Use: https://dev.twitch.tv/docs/api/reference
3. Authenticate to get the access token.

I used a Jupyter Notebook to write my Python script. You can use whatever you are comfortable with. For beginners, I’d recommend Jupyter Notebooks. (Do some research to see what best fits your skill level.)

Begin by importing the necessary libraries:

import json
import requests
import pandas as pd
from pandas import json_normalize  # pandas >= 1.0; the older pandas.io.json import path is deprecated
import time
import threading

Now move on to authentication to get the access token:

# Client ID: –***—
# Client Secret: —***—
# Client ID and Client Secret are sensitive and you should not share them
client_id = '<yourclientid>'
client_secret = '<yourclientsecret>'
# Request the access token using the requests library
# I have chosen this method of authentication (client credentials) with my goal in mind
access_code = requests.post('https://id.twitch.tv/oauth2/token?client_id='+str(client_id)+'&client_secret='+str(client_secret)+'&grant_type=client_credentials')
# The response is a JSON-encoded app access token
access_token = json.loads(access_code.text)
access_token = access_token['access_token']
#Sample response is
"""
{
"access_token": "prau3ol6mg5glgek8m89ec2s9q5i3i",
"refresh_token": "",
"expires_in": 3600,
"scope": [],
"token_type": "bearer"
}
"""
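
The sample response above includes an `expires_in` field, so a script that keeps running will eventually hold an expired token. Here is a minimal sketch of one way to handle that, not part of the original script: a small cache that asks for a new token shortly before the old one expires. The class name `AppTokenCache`, the `fetch_token` callable, and the 60-second safety margin are all assumptions made for illustration; `fetch_token` is expected to return the parsed token JSON as a dict.

```python
import time


class AppTokenCache:
    """Caches an app access token and re-fetches it shortly before expiry."""

    def __init__(self, fetch_token, margin_seconds=60, clock=time.time):
        # fetch_token is a hypothetical callable that returns the token JSON as a dict,
        # for example a wrapper around the requests.post call shown above.
        self._fetch_token = fetch_token
        self._margin = margin_seconds
        self._clock = clock
        self._token = None
        self._expires_at = 0.0

    def get(self):
        """Return a cached token, requesting a new one if it is missing or about to expire."""
        now = self._clock()
        if self._token is None or now >= self._expires_at - self._margin:
            payload = self._fetch_token()
            self._token = payload["access_token"]
            self._expires_at = now + payload["expires_in"]
        return self._token
```

With something like this in place, each API call could build its Authorization header from `cache.get()` instead of a token fetched once at startup.
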
4. We need two types of API calls for this dashboard: Get Top Games and Get Streams.

We will first access the API for Top 100 Games by number of viewers. Using the Game IDs for these games, we will get Stream data for them.

# Getting data for Top 100 Games by number of viewers
# Default response is for 20 games so you will have to set the parameter 'first' to 100
headers = {
    'Authorization' : 'Bearer '+str(access_token),
    'Client-Id' : str(client_id),  # Helix requests also expect the Client ID header alongside the token
}
games_response = requests.get('https://api.twitch.tv/helix/games/top?first=100', headers=headers)
# The response will be a JSON which will include the response data and the pagination cursor
# We need to extract the data from the JSON and convert it into a pandas dataframe
games_response_json = json.loads(games_response.text)
topgames_data = games_response_json['data']
# Converting to a pandas dataframe
topgames_df = pd.DataFrame.from_dict(json_normalize(topgames_data), orient='columns')
# See the first few lines. The response includes id, name, and box art url for the game
topgames_df.head()
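
The comments above mention that the response carries a pagination cursor. The dashboard only needs the first page, but as a rough sketch of how the cursor could be followed: this assumes the cursor sits under `pagination.cursor` in the response JSON and is sent back as the `after` query parameter. `collect_pages` and `get_page` are hypothetical names; `get_page` stands in for the `requests.get` call and returns the parsed JSON for a given cursor.

```python
def collect_pages(get_page, max_pages=3):
    """Follow Helix-style pagination cursors and return all rows found under 'data'."""
    rows = []
    cursor = None
    for _ in range(max_pages):
        # get_page is a hypothetical callable: it takes a cursor (None for the
        # first page) and returns the parsed JSON response as a dict.
        payload = get_page(cursor)
        rows.extend(payload.get("data", []))
        cursor = payload.get("pagination", {}).get("cursor")
        if not cursor:
            break  # no cursor means there are no further pages
    return rows
```

The `max_pages` cap is there so a sketch like this cannot loop indefinitely on a long result set.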

To get the Top Streams for these games we will have to pass the game IDs as strings in the API call one at a time. For this, we need to create a FOR loop to get data for all the Games.

# I am getting only the top 25 streamers for the first game
headers = {
    'Authorization' : 'Bearer '+str(access_token),
    'Client-Id' : str(client_id),  # Helix requests also expect the Client ID header alongside the token
}
topstreamsforgame_response = requests.get('https://api.twitch.tv/helix/streams?game_id='+str(topgames_df['id'][0])+'&first=25', headers=headers)
# Load the JSON
topstreamsforgame_response_json = json.loads(topstreamsforgame_response.text)
# Extracting data from the JSON
topstreamsforgame_data = topstreamsforgame_response_json['data']
# Converting into a DataFrame
topstreamsforgame_df = pd.DataFrame.from_dict(json_normalize(topstreamsforgame_data), orient='columns')
# FOR loop to get top 25 streamers for rest of the games in our list
# To keep the dashboard lightweight and relevant, I am using only the Top 20 Games and Top 25 Streamers per game
# range(1, 20) covers indices 1 to 19, which together with the first game above gives 20 games
for i in range(1, 20):
    topstreamsforgame_response = requests.get('https://api.twitch.tv/helix/streams?game_id='+str(topgames_df['id'][i])+'&first=25', headers=headers)
    topstreamsforgame_response_json = json.loads(topstreamsforgame_response.text)
    topstreamsforgame_data = topstreamsforgame_response_json['data']
    topstreamsforgame_df_temp = pd.DataFrame.from_dict(json_normalize(topstreamsforgame_data), orient='columns')
    # Append this game's streams to the running DataFrame
    frames = [topstreamsforgame_df, topstreamsforgame_df_temp]
    topstreamsforgame_df = pd.concat(frames, ignore_index=True)
# Look at the data we retrieved
topstreamsforgame_df.info()
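
The loop above grows the DataFrame one game at a time. As an alternative sketch, the per-game frames can be collected in a list and concatenated once at the end. This assumes pandas 1.0 or later for `pd.json_normalize`. The names `streams_for_games` and `fetch_streams` are hypothetical; `fetch_streams` stands in for the `requests.get` call and returns the `data` list for one game ID.

```python
import pandas as pd


def streams_for_games(game_ids, fetch_streams):
    """Build one DataFrame holding the streams of every game in game_ids."""
    # fetch_streams is a hypothetical callable: it takes a game ID and returns
    # the list found under 'data' in the Get Streams response.
    frames = [pd.json_normalize(fetch_streams(game_id)) for game_id in game_ids]
    if not frames:
        return pd.DataFrame()  # pd.concat raises on an empty list
    return pd.concat(frames, ignore_index=True)
```

Concatenating once avoids copying the accumulated rows on every iteration, which matters little for 20 games but keeps the code shorter.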

Now, for the final trick, we will define a function which will enclose all our code and put a Timer so that it pulls the data every 60 seconds.

def twitch():
    # Schedule the next run in 60 seconds, then do this run's work
    threading.Timer(60.0, twitch).start()
    headers = {
        'Authorization' : 'Bearer '+str(access_token),
        'Client-Id' : str(client_id),  # Helix requests also expect the Client ID header alongside the token
    }
    # Top Games
    games_response = requests.get('https://api.twitch.tv/helix/games/top?first=100', headers=headers)
    games_response_json = json.loads(games_response.text)
    topgames_data = games_response_json['data']
    topgames_df = pd.DataFrame.from_dict(json_normalize(topgames_data), orient='columns')
    # Top Streamers for the first game
    topstreamsforgame_response = requests.get('https://api.twitch.tv/helix/streams?game_id='+str(topgames_df['id'][0])+'&first=25', headers=headers)
    topstreamsforgame_response_json = json.loads(topstreamsforgame_response.text)
    topstreamsforgame_data = topstreamsforgame_response_json['data']
    topstreamsforgame_df = pd.DataFrame.from_dict(json_normalize(topstreamsforgame_data), orient='columns')
    # Top Streamers for the remaining games (indices 1 to 19, so 20 games in total)
    for i in range(1, 20):
        topstreamsforgame_response = requests.get('https://api.twitch.tv/helix/streams?game_id='+str(topgames_df['id'][i])+'&first=25', headers=headers)
        topstreamsforgame_response_json = json.loads(topstreamsforgame_response.text)
        topstreamsforgame_data = topstreamsforgame_response_json['data']
        topstreamsforgame_df_temp = pd.DataFrame.from_dict(json_normalize(topstreamsforgame_data), orient='columns')
        frames = [topstreamsforgame_df, topstreamsforgame_df_temp]
        topstreamsforgame_df = pd.concat(frames, ignore_index=True)
    # Now that the FOR loop is exited and we have all our data, we export it into a csv
    # Use a different path for each file. Don't forget to add '.csv' at the end of the path
    topgames_df.to_csv(r'<filepath>.csv', index=None, header=True)
    topstreamsforgame_df.to_csv(r'<filepath>.csv', index=None, header=True)
# Our function is defined and it overwrites the CSV every 60 seconds. Now, we call it.
twitch()

As you can see, I have exported the csv within the function. So, it updates automatically every 60 seconds. Now, we have to connect it to Tableau and make our Dashboard.
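
Because the CSV is overwritten while Tableau may be reading it, one possible refinement (an assumption on my part, not something the original script does) is to write to a temporary file first and then swap it into place. The helper name `write_csv_atomically` is hypothetical; `df` is anything with a pandas-style `to_csv` method.

```python
import os
import tempfile


def write_csv_atomically(df, path):
    """Write df to a temporary file in the target folder, then move it over path."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(suffix=".csv", dir=directory)
    os.close(fd)  # to_csv reopens the file by name
    try:
        df.to_csv(tmp_path, index=False)
        # os.replace swaps the file in one step, so a reader never sees a half-written CSV
        os.replace(tmp_path, path)
    except BaseException:
        # Clean up the temporary file if writing or replacing failed
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise
```

On Windows the replace step can fail with a PermissionError if another program holds the target file open, so this is worth testing against your own Tableau setup before relying on it.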

5. Creating the Dashboard on Tableau

Open Tableau and connect the CSV file for Top Games. Pull in the Top Streams CSV and create an inner join on ‘Id = Game Id’. Ensure that you have a live connection with the data source.
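
If you want to preview what this join produces before opening Tableau, here is a small pandas sketch of the same inner join. It assumes the raw column names `id` and `name` in the games data and `game_id` in the streams data (Tableau shows them as Id and Game Id); the function name and the renamed `game_name` column are my own choices for illustration.

```python
import pandas as pd


def join_games_and_streams(topgames_df, streams_df):
    """Inner-join streams to games on games.id = streams.game_id."""
    # Rename the game columns so they do not clash with the stream's own 'id'
    games = topgames_df[["id", "name"]].rename(
        columns={"id": "game_id", "name": "game_name"}
    )
    # how="inner" keeps only streams whose game appears in the games table
    return streams_df.merge(games, on="game_id", how="inner")
```

Streams whose game is missing from the Top Games file drop out of the result, which is the same behavior the inner join gives you in Tableau.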

Now that the connection is made, go over to Sheet 1. Pull the Viewer Count from the Measures into the sheet and pull the Name into the Rows section. Make it a bar chart and sort it by Viewer Count.
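
As a sanity check on the numbers this bar chart should show, the same aggregation can be sketched in pandas. This assumes a joined table with a `viewer_count` column and a game name column, here hypothetically called `game_name`; the function name is also made up for this example.

```python
import pandas as pd


def viewers_per_game(joined_df, name_column="game_name"):
    """Sum viewer_count per game and sort from most to least watched."""
    totals = joined_df.groupby(name_column)["viewer_count"].sum()
    return totals.sort_values(ascending=False)
```

Keep in mind that with only the top 25 streams per game in the data, these totals cover those 25 streams rather than every viewer of the game.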

Make another Sheet for top Streamers.

For the Dashboard, pull in both sheets and set the Games sheet as a filter for the Streamers sheet. This will show you the overall Top Streamers and the Top Streamers for each game.

This is our final dashboard. After this, I formatted it to make it look prettier and match Twitch’s design guide.
Set the background to black (HEX #000000) and the bar colors to Twitch purple (HEX #6441a5).

Remember, our data is updating live so we need to set the dashboard to auto-refresh every 60 seconds or refresh it manually.

6. Setting auto-refresh

I set the auto-refresh to update the dashboard every 60 seconds. This can be done using an auto-refresh script or a tool like AutoHotkey. Use Cmd+R to refresh the data source on Mac and F5 on Windows. I have attached a file for auto-refresh on Windows.

That’s it. Hope this was helpful. Please get in touch if you have any recommendations or doubts. Thanks for reading through!

References:
[1] https://www.businessofapps.com/data/twitch-statistics/
