apoc.nlp.azure.entities.stream

Procedure APOC Full

Provides a entity analysis for provided text

Signature

apoc.nlp.azure.entities.stream(source :: ANY?, config = {} :: MAP?) :: (node :: NODE?, value :: MAP?, error :: MAP?)

Input parameters

Name Type Default

source

ANY?

null

config

MAP?

{}

Output parameters

Name Type

node

NODE?

value

MAP?

error

MAP?

Install Dependencies

The NLP procedures have dependencies on Kotlin and client libraries that are not included in the APOC Library.

These dependencies are included in apoc-nlp-dependencies-4.4.0.34.jar, which can be downloaded from the releases page. Once that file is downloaded, it should be placed in the plugins directory and the Neo4j Server restarted.

Setting up API Key

We can generate an API key and URL by following the instructions in the Quickstart: Use the Text Analytics client library article. Once we’ve done that, we should be able to see a page listing our credentials, similar to the screenshot below:

azure text analytics keys
Figure 1. Azure Text Analytics credentials

In this case our API URL is https://neo4j-nlp-text-analytics.cognitiveservices.azure.com/, and we can use either of the hidden keys.

Let’s populate and execute the following commands to create parameters that contains these details.

The following define the apiKey and apiSecret parameters
:param apiKey => ("<api-key-here>");
:param apiUrl => ("<api-url-here>");

Alternatively we can add these credentials to apoc.conf and retrieve them using the static value storage functions. See Static Value Storage

apoc.conf
apoc.static.azure.apiKey=<api-key-here>
apoc.static.azure.apiUrl=<api-url-here>
The following retrieves AWS credentials from apoc.conf
RETURN apoc.static.getAll("azure") AS azure;
Table 1. Results
azure

{apiKey: "<api-key-here>", apiUrl: "<api-url-here>"}

Usage Examples

The examples in this section are based on the following sample graph:

CREATE (:Article {
  uri: "https://neo4j.com/blog/pokegraph-gotta-graph-em-all/",
  body: "These days I’m rarely more than a few feet away from my Nintendo Switch and I play board games, card games and role playing games with friends at least once or twice a week. I’ve even organised lunch-time Mario Kart 8 tournaments between the Neo4j European offices!"
});

CREATE (:Article {
  uri: "https://en.wikipedia.org/wiki/Nintendo_Switch",
  body: "The Nintendo Switch is a video game console developed by Nintendo, released worldwide in most regions on March 3, 2017. It is a hybrid console that can be used as a home console and portable device. The Nintendo Switch was unveiled on October 20, 2016. Nintendo offers a Joy-Con Wheel, a small steering wheel-like unit that a Joy-Con can slot into, allowing it to be used for racing games such as Mario Kart 8."
});

We can use this procedure to extract the entities from the Article node. The text that we want to analyze is stored in the body property of the node, so we’ll need to specify that via the nodeProperty configuration parameter.

The following streams the entities for the Pokemon article:

MATCH (a:Article {uri: "https://neo4j.com/blog/pokegraph-gotta-graph-em-all/"})
CALL apoc.nlp.azure.entities.stream(a, {
  key: $apiKey,
  url: $apiUrl,
  nodeProperty: "body"
})
YIELD value
UNWIND value.entities AS entity
RETURN entity;
Table 2. Results
entity

{name: "Nintendo Switch", wikipediaId: "Nintendo Switch", type: "Other", matches: [{length: 15, text: "Nintendo Switch", wikipediaScore: 0.8339868065025469, offset: 56}], bingId: "b3d617ef-81fc-4188-9a2b-a5cf1f8534b5", wikipediaLanguage: "en", wikipediaUrl: "https://en.wikipedia.org/wiki/Nintendo_Switch"}

{name: "Nintendo Switch", type: "Organization", matches: [{length: 15, entityTypeScore: 0.94, text: "Nintendo Switch", offset: 56}]}

{name: "Oberon Media", wikipediaId: "Oberon Media", type: "Organization", matches: [{length: 6, text: "I play", wikipediaScore: 0.032446316016667254, offset: 76}], bingId: "166f6e0f-33b7-8707-bb8b-5a932c498333", wikipediaLanguage: "en", wikipediaUrl: "https://en.wikipedia.org/wiki/Oberon_Media"}

{name: "a week", subType: "Duration", type: "DateTime", matches: [{length: 6, entityTypeScore: 0.8, text: "a week", offset: 166}]}

{name: "Mario Kart 8", wikipediaId: "Mario Kart 8", type: "Other", matches: [{length: 12, text: "Mario Kart 8", wikipediaScore: 0.7802000593632747, offset: 205}], bingId: "ce6f55ec-d3d7-032a-0bf8-15ad3e8df3f4", wikipediaLanguage: "en", wikipediaUrl: "https://en.wikipedia.org/wiki/Mario_Kart_8"}

{name: "Mario Kart", type: "Organization", matches: [{length: 10, entityTypeScore: 0.72, text: "Mario Kart", offset: 205}]}

{name: "8", subType: "Number", type: "Quantity", matches: [{length: 1, entityTypeScore: 0.8, text: "8", offset: 216}]}

{name: "Neo4j", wikipediaId: "Neo4j", type: "Other", matches: [{length: 5, text: "Neo4j", wikipediaScore: 0.8150388253887939, offset: 242}], bingId: "bc2f436b-8edd-6ba6-b2d3-69901348d653", wikipediaLanguage: "en", wikipediaUrl: "https://en.wikipedia.org/wiki/Neo4j"}

{name: "Europe", wikipediaId: "Europe", type: "Location", matches: [{length: 8, text: "European", wikipediaScore: 0.00591759926701263, offset: 248}], bingId: "501457aa-5b70-cfba-cfd8-be882b4bac1e", wikipediaLanguage: "en", wikipediaUrl: "https://en.wikipedia.org/wiki/Europe"}

We get back 9 different entities, although we can see that some of them are referring to the same things, albeit with different type values. We could then apply a Cypher statement that creates one node per entity and an ENTITY relationship from each of those nodes back to the Article node.

The following streams the entities for the Pokemon article and then creates nodes for each entity:

MATCH (a:Article {uri: "https://neo4j.com/blog/pokegraph-gotta-graph-em-all/"})
CALL apoc.nlp.azure.entities.stream(a, {
  key: $apiKey,
  url: $apiUrl,
  nodeProperty: "body"
})
YIELD value
UNWIND value.entities AS entity
WITH a, entity.name AS entity, collect(entity.type) AS types
MERGE (e:Entity {name: entity})
SET e.type = types
MERGE (a)-[:ENTITY]->(e);

If we want to automatically create an entity graph, see apoc.nlp.azure.entities.graph.