8.1. One Hot Encoding

This section describes the One Hot Encoding function in the Neo4j Graph Algorithms library.

The One Hot Encoding function converts categorical data into a numerical format that can be used by machine learning libraries.

8.1.1. One Hot Encoding sample

One hot encoding returns a list whose length equals the number of available values. Each position in the list corresponds to the available value at the same index: selected values are represented by 1, and unselected values are represented by 0.

The following will run the function on hardcoded lists:

RETURN algo.ml.oneHotEncoding(["Chinese", "Indian", "Italian"], ["Italian"]) AS vector

Table 8.1. Results
vector

[0,0,1]
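
As a further illustration of the rule described above, the following sketch selects more than one value. The input lists are made up for this example. Going by the description in this section, the expected vector is [1,0,1], since "Chinese" and "Italian" occupy the first and third positions of the available values.

```cypher
// Two of the three available values are selected.
// Each position in the result corresponds to the available value at the same index.
RETURN algo.ml.oneHotEncoding(["Chinese", "Indian", "Italian"], ["Chinese", "Italian"]) AS vector
```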

The following will create a sample graph: 

MERGE (french:Cuisine {name:'French'})
MERGE (italian:Cuisine {name:'Italian'})
MERGE (indian:Cuisine {name:'Indian'})

MERGE (zhen:Person {name: "Zhen"})
MERGE (praveena:Person {name: "Praveena"})
MERGE (michael:Person {name: "Michael"})
MERGE (arya:Person {name: "Arya"})

MERGE (praveena)-[:LIKES]->(indian)
MERGE (zhen)-[:LIKES]->(french)
MERGE (michael)-[:LIKES]->(french)
MERGE (michael)-[:LIKES]->(italian);

The following will return a one hot encoding for each person, based on the types of cuisine that they like:

MATCH (cuisine:Cuisine)
WITH cuisine ORDER BY cuisine.name
WITH collect(cuisine) AS cuisines
MATCH (p:Person)
RETURN p.name AS person,
       algo.ml.oneHotEncoding(cuisines, [(p)-[:LIKES]->(cuisine) | cuisine]) AS encoding
ORDER BY person

Table 8.2. Results
person     encoding

Arya       [0,0,0]
Michael    [1,0,1]
Praveena   [0,1,0]
Zhen       [1,0,0]
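
The positions in each encoding follow the order of the available values, which the query above sorts by cuisine name. A minimal sketch for listing that order, assuming the sample graph created earlier in this section, is shown below. The column name `positions` is chosen only for this illustration. With the sample graph it should list French, Indian and Italian, in that order, so the first element of each encoding refers to French, the second to Indian and the third to Italian.

```cypher
// Collect the cuisine names in the same order used to build the encodings.
MATCH (cuisine:Cuisine)
WITH cuisine ORDER BY cuisine.name
RETURN collect(cuisine.name) AS positions
```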

Table 8.3. Parameters
Name             Type  Default  Optional  Description

availableValues  list  null     yes       The available values. If null, the function will return an empty list.
selectedValues   list  null     yes       The selected values. If null, the function will return a list of all 0s.
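
The null handling described in the parameter table can be tried out with a query along the following lines. This is a sketch only: the column names `noAvailableValues` and `noSelectedValues` are made up for this example. Going by the parameter descriptions, the first column should be an empty list and the second should be [0,0,0].

```cypher
// availableValues is null: an empty list is expected.
// selectedValues is null: a list of 0s, one per available value, is expected.
RETURN algo.ml.oneHotEncoding(null, ["Italian"]) AS noAvailableValues,
       algo.ml.oneHotEncoding(["Chinese", "Indian", "Italian"], null) AS noSelectedValues
```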

Table 8.4. Results
Type  Description

list  One hot encoding of the selected values.