Neo4j Live: Construct the Matrix interaction network based on the movie script

In this stream, we will show how to combine web scraping, OCR, and NLP techniques to construct the Matrix interaction network.
– Scraping the Matrix fandom page with Selenium
– Using PyTesseract to read the Matrix movie script PDF
– Extracting the characters in each scene with spaCy's rule-based matcher
– Constructing and analyzing the character co-occurrence network in Neo4j
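
The last step above can be sketched in a few lines of Python. This is a minimal illustration, not the code from the notebook: the scene data is made up, and the `Character` label, the `INTERACTS` relationship type, and the `weight` property in the Cypher string are assumed names chosen for the example. Two characters are linked whenever they appear in the same scene, and the weight counts how many scenes they share.

```python
from collections import Counter
from itertools import combinations

# Hypothetical sample data: the characters detected in each scene.
# In the stream, these lists would come from the NLP step.
scenes = [
    ["Neo", "Trinity", "Morpheus"],
    ["Neo", "Morpheus"],
    ["Agent Smith", "Neo"],
    ["Trinity"],
]


def cooccurrence_counts(scenes):
    """Count how many scenes each pair of characters shares."""
    counts = Counter()
    for characters in scenes:
        # Deduplicate and sort so each pair has one canonical order.
        unique = sorted(set(characters))
        for a, b in combinations(unique, 2):
            counts[(a, b)] += 1
    return counts


weights = cooccurrence_counts(scenes)

# Rows shaped for a parameterized Cypher import.
rows = [
    {"source": a, "target": b, "weight": w}
    for (a, b), w in sorted(weights.items())
]

# Assumed graph model (label, relationship type, and property are illustrative).
IMPORT_QUERY = """
UNWIND $rows AS row
MERGE (s:Character {name: row.source})
MERGE (t:Character {name: row.target})
MERGE (s)-[r:INTERACTS]-(t)
SET r.weight = row.weight
"""

for row in rows:
    print(row)
```

The `rows` list could then be passed as the `$rows` parameter when running `IMPORT_QUERY` against a Neo4j Sandbox instance through the Python driver.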

Blog: https://towardsdatascience.com/construct-the-matrix-interaction-network-based-on-the-movie-script-738b4fa9b46d
Neo4j Sandbox: https://dev.neo4j.com/try
Colab Notebook: https://github.com/tomasonjo/blogs/blob/master/matrix/MatrixNLP.ipynb
Matrix Characters: https://matrix.fandom.com/wiki/Category:Characters_in_The_Matrix

Follow Tomaz: https://twitter.com/tb_tomaz
Graph Algorithms for Data Science: https://www.manning.com/books/graph-algorithms-for-data-science (use code au35bra for a 35% discount)