One Hot Encoding

The One Hot Encoding function converts categorical data into a numerical format that machine learning libraries can use.

This feature is in the alpha tier. For more information on feature tiers, see API Tiers.

One Hot Encoding sample

One hot encoding returns a list whose length equals the number of available values. Each position in the list corresponds to one available value, in order: selected values are represented by 1, and unselected values are represented by 0.
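The behaviour described above can be sketched in a few lines of Python. This is only an illustration of the semantics, not the library's implementation; the name `one_hot_encode` is hypothetical, and the sketch assumes the values are hashable (for example, strings).

```python
def one_hot_encode(available_values, selected_values):
    """Return one entry per available value: 1 if it is selected, otherwise 0."""
    selected = set(selected_values)  # assumption: values are hashable
    return [1 if value in selected else 0 for value in available_values]


# Three available values, one of them selected.
print(one_hot_encode(["Chinese", "Indian", "Italian"], ["Italian"]))  # [0, 0, 1]
```

The output list always has the same length as the list of available values, so encodings built from the same available values can be compared position by position.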

The following will run the function on hardcoded lists:
RETURN gds.alpha.ml.oneHotEncoding(['Chinese', 'Indian', 'Italian'], ['Italian']) AS value
Table 1. Results
| value |
| --- |
| [0,0,1] |

The following will create a sample graph:
CREATE (french:Cuisine {name:'French'}),
       (italian:Cuisine {name:'Italian'}),
       (indian:Cuisine {name:'Indian'}),

       (zhen:Person {name: "Zhen"}),
       (praveena:Person {name: "Praveena"}),
       (michael:Person {name: "Michael"}),
       (arya:Person {name: "Arya"}),

       (praveena)-[:LIKES]->(indian),
       (zhen)-[:LIKES]->(french),
       (michael)-[:LIKES]->(french),
       (michael)-[:LIKES]->(italian)
The following will return a one hot encoding for each person, based on the types of cuisine that they like:
MATCH (cuisine:Cuisine)
WITH cuisine
  ORDER BY cuisine.name
WITH collect(cuisine) AS cuisines
MATCH (p:Person)
RETURN p.name AS name, gds.alpha.ml.oneHotEncoding(cuisines, [(p)-[:LIKES]->(cuisine) | cuisine]) AS value
  ORDER BY name
Table 2. Results
| name | value |
| --- | --- |
| Arya | [0,0,0] |
| Michael | [1,0,1] |
| Praveena | [0,1,0] |
| Zhen | [1,0,0] |
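To see why each position means what it does, the same result can be reproduced outside the database. The following Python sketch mirrors the sample graph with plain data structures; the names `cuisines`, `likes`, and `one_hot_encode` are hypothetical stand-ins for the query's collected cuisines, the LIKES relationships, and the function. Sorting the cuisine names gives French, Indian, Italian, which is the order of the positions in each encoding.

```python
def one_hot_encode(available_values, selected_values):
    """Return one entry per available value: 1 if it is selected, otherwise 0."""
    selected = set(selected_values)
    return [1 if value in selected else 0 for value in available_values]


# Stand-in for: MATCH (cuisine:Cuisine) ... ORDER BY cuisine.name
cuisines = sorted(["French", "Italian", "Indian"])  # French, Indian, Italian

# Stand-in for the LIKES relationships in the sample graph.
likes = {
    "Zhen": ["French"],
    "Praveena": ["Indian"],
    "Michael": ["French", "Italian"],
    "Arya": [],
}

# Stand-in for: RETURN p.name AS name, ... ORDER BY name
encodings = {name: one_hot_encode(cuisines, likes[name]) for name in sorted(likes)}
for name, value in encodings.items():
    print(name, value)
```

Arya likes no cuisine, so every position is 0. Michael likes two cuisines, so two positions are 1.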

Table 3. Parameters
| Name | Type | Default | Optional | Description |
| --- | --- | --- | --- | --- |
| availableValues | list | null | yes | The available values. If null, the function will return an empty list. |
| selectedValues | list | null | yes | The selected values. If null, the function will return a list of all zeros. |
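The null handling listed in the parameters table can be modelled as follows. This is a hedged Python sketch of the documented behaviour, with `None` standing in for null and a hypothetical function name; it is not the library's code. The table does not say what happens when a selected value is missing from the available values, so the sketch assumes such values are simply ignored.

```python
def one_hot_encode(available_values=None, selected_values=None):
    """Model of the documented defaults; None plays the role of null."""
    if available_values is None:
        # Documented: null available values give an empty list.
        return []
    if selected_values is None:
        # Documented: null selected values give a list of all zeros.
        return [0] * len(available_values)
    selected = set(selected_values)
    # Assumption: selected values not present in available_values are ignored.
    return [1 if value in selected else 0 for value in available_values]


print(one_hot_encode(None, ["Italian"]))         # []
print(one_hot_encode(["French", "Indian"], None))  # [0, 0]
```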

Table 4. Results
| Type | Description |
| --- | --- |
| list | One hot encoding of the selected values. |