Neo4j Live: Construct the Matrix interaction network based on the movie script

10 Feb, 2022



In this stream, we show how to combine web scraping, OCR, and NLP techniques to construct the Matrix interaction network:

- Scraping the Matrix fandom page with Selenium
- Using PyTesseract to read the Matrix movie script PDF
- Extracting the characters in each scene with spaCy's rule-based matcher
- Constructing and analyzing the character co-occurrence network in Neo4j

Blog: https://towardsdatascience.com/construct-the-matrix-interaction-network-based-on-the-movie-script-738b4fa9b46d
Neo4j Sandbox: https://dev.neo4j.com/try
Colab Notebook: https://github.com/tomasonjo/blogs/blob/master/matrix/MatrixNLP.ipynb
Matrix Characters: https://matrix.fandom.com/wiki/Category:Characters_in_The_Matrix
Follow Tomaz: https://twitter.com/tb_tomaz
Graph Algorithms for Data Science: https://www.manning.com/books/graph-algorithms-for-data-science (use code au35bra for a 35% discount)
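To make the co-occurrence step concrete, here is a minimal sketch in plain Python. It is not the code from the notebook: the function name `cooccurrence_counts` and the sample scene data are hypothetical, and it assumes the characters in each scene have already been extracted. It treats two characters as interacting when they appear in the same scene and counts how many scenes each pair shares.

```python
from collections import Counter
from itertools import combinations


def cooccurrence_counts(scenes):
    """Count how many scenes each pair of characters shares."""
    counts = Counter()
    for characters in scenes:
        # Deduplicate and sort so (a, b) and (b, a) map to the same key.
        for pair in combinations(sorted(set(characters)), 2):
            counts[pair] += 1
    return counts


# Hypothetical per-scene character lists (illustrative only, not real script data).
scenes = [
    ["Neo", "Trinity"],
    ["Neo", "Morpheus", "Trinity"],
    ["Morpheus", "Neo"],
    ["Agent Smith"],  # a single character produces no pairs
]

edges = cooccurrence_counts(scenes)
for (source, target), weight in sorted(edges.items()):
    print(source, target, weight)
```

Each resulting pair could then be stored in Neo4j as a weighted relationship between two character nodes, with the scene count as the weight.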
