Analyze Twitch streamers and their audiences with Neo4j APOC and Graph Data Science
Twitch is one of the newer social network platforms. The idea behind the platform is that it allows anyone to stream or broadcast content. Other users can then support streamers through subscriptions and donations. A couple of months ago, I wrote a series of blog posts on constructing the Twitch social network and then performing network analysis on top of it.
Neo4j Twitch Sandbox
Today, I am happy to announce that the Twitch dataset was introduced into the hall of Neo4j Sandbox fame.
What does that mean for you?
You can dive into the basics of network analysis without having to download, install, and configure a Neo4j environment. Not only that, but after you open the Twitch project in Neo4j Sandbox, you’ll have an interactive browser guide waiting for you that will help you get started with graph analysis and algorithms.
A sandbox lasts for three days but can be extended to a maximum of 10 days.
By walking through the browser guide, you’ll learn how to use the Cypher query language to evaluate overall network statistics and how to use PageRank to determine the most influential streamers. In the last part, you’ll use the Node Similarity algorithm to analyze which streamers have the highest overlap of viewer audiences.
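To give a feel for what these two algorithms compute, here is a minimal plain-Python sketch. It is not the Graph Data Science implementation and it does not use the Twitch dataset: the streamer names, viewer sets, and edges are hypothetical toy data, the PageRank shown is the textbook power-iteration formulation (the library may scale its scores differently), and the audience overlap is measured with the Jaccard coefficient, which I am assuming as the similarity metric here.

```python
from itertools import combinations

# Hypothetical toy data (not taken from the Twitch dataset):
# each streamer maps to the set of viewers seen in their chat.
audiences = {
    "streamer_a": {"u1", "u2", "u3", "u4"},
    "streamer_b": {"u3", "u4", "u5"},
    "streamer_c": {"u6"},
}


def jaccard(a, b):
    """Shared viewers divided by all distinct viewers of both streamers."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def audience_overlap(audiences):
    """Return (streamer, streamer, score) tuples with overlap, highest first."""
    pairs = []
    for s1, s2 in combinations(sorted(audiences), 2):
        score = jaccard(audiences[s1], audiences[s2])
        if score > 0:
            pairs.append((s1, s2, score))
    return sorted(pairs, key=lambda p: p[2], reverse=True)


def pagerank(edges, damping=0.85, iterations=50):
    """Textbook PageRank by power iteration over directed (source, target) edges."""
    nodes = sorted({n for edge in edges for n in edge})
    out_links = {n: [] for n in nodes}
    for src, dst in edges:
        out_links[src].append(dst)

    rank = {n: 1.0 / len(nodes) for n in nodes}
    for _ in range(iterations):
        # Rank held by nodes without outgoing edges is spread evenly.
        dangling = sum(rank[n] for n in nodes if not out_links[n])
        base = (1 - damping) / len(nodes) + damping * dangling / len(nodes)
        new_rank = {n: base for n in nodes}
        for src in nodes:
            for dst in out_links[src]:
                new_rank[dst] += damping * rank[src] / len(out_links[src])
        rank = new_rank
    return rank


# Hypothetical edges: an edge (x, y) means x's audience also shows up at y.
edges = [("a", "c"), ("b", "c"), ("c", "a")]
ranks = pagerank(edges)
overlaps = audience_overlap(audiences)
```

In this toy example, `overlaps` contains a single pair, because only `streamer_a` and `streamer_b` share viewers, and `ranks` gives the highest score to `c`, the node that receives the most incoming weight. In the sandbox, the browser guide runs the equivalent steps directly against the graph, so you do not have to write any of this yourself.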
You can then also use Neo4j Bloom (as in the image above) in Sandbox to visualize your algorithmic results and to color, style, and size your nodes and relationships based on the computed metrics.
If you have ideas for analyzing the data that are not covered in the guide, you can experiment on your own and come up with new insights.
Whether you already have some experience with Neo4j and graphs or this is your first contact with them, the Twitch sandbox is the perfect first step on your network analysis journey.