In this stream, we will show how to combine web scraping, OCR, and NLP techniques to construct the Matrix interaction network.
– Scraping the Matrix fandom page with Selenium
– Using PyTesseract to read the Matrix movie script PDF
– Extracting the characters in each scene with spaCy’s rule-based matcher
– Constructing and analyzing the character co-occurrence network in Neo4j
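
To give a feel for the last two steps, here is a minimal sketch of how scene-level character mentions could be turned into a weighted co-occurrence edge list. It is not the code from the stream: the scene texts and character list are made-up sample data, the names `SCENES`, `CHARACTERS`, `characters_in_scene`, and `cooccurrence_counts` are hypothetical, and a plain regular expression stands in for spaCy’s rule-based matcher so the example runs with the standard library only.

```python
import re
from collections import Counter
from itertools import combinations

# Hypothetical sample data, NOT taken from the real movie script.
SCENES = {
    1: "Trinity waits in the dark. Agent Smith arrives.",
    2: "Neo meets Trinity at the club.",
    3: "Morpheus offers Neo the pills while Trinity watches.",
}

# Hypothetical character list; assumed to come from the scraped fandom page.
CHARACTERS = ["Neo", "Trinity", "Morpheus", "Agent Smith"]


def characters_in_scene(text, characters):
    """Return the set of known character names mentioned in a scene."""
    found = set()
    for name in characters:
        # Whole-word, case-insensitive match; a regex stand-in for a
        # rule-based matcher, not the spaCy API itself.
        pattern = r"\b" + re.escape(name) + r"\b"
        if re.search(pattern, text, flags=re.IGNORECASE):
            found.add(name)
    return found


def cooccurrence_counts(scenes, characters):
    """Count how many scenes each unordered pair of characters shares."""
    counts = Counter()
    for text in scenes.values():
        # Sort so each pair is always stored in the same order.
        present = sorted(characters_in_scene(text, characters))
        for a, b in combinations(present, 2):
            counts[(a, b)] += 1
    return counts


edges = cooccurrence_counts(SCENES, CHARACTERS)
for (a, b), weight in sorted(edges.items()):
    print(f"{a} -- {b}: {weight}")
```

Each resulting pair and its count can be read as a weighted, undirected edge between two characters, which is the shape of data one would then load into a graph database such as Neo4j for analysis.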
Blog: https://towardsdatascience.com/construct-the-matrix-interaction-network-based-on-the-movie-script-738b4fa9b46d
Follow Tomaz: https://twitter.com/tb_tomaz