GraphGists

Idea

The aim of this work is to show that it is possible to build an interesting service (at least from my point of view :) ) with open data retrieved from public institutions. This example service facilitates travel. It uses public data about countries and cities provided by the United Nations Economic Commission for Europe (UNECE), enriched with information about restaurants and hostels retrieved from OpenStreetMap.

Today, even if you are not very rich, you can afford to travel the world: there are cheap airlines, you can sleep on someone's couch for free, and the Internet is full of useful tips on how to see the world without going broke. The real problem may be deciding where to go for a weekend, for a short trip or for a once-in-a-lifetime adventure, as the choice of possibilities is huge. The Travel Helper service is meant to recommend travel destinations and facilities.

Data

To provide the basic functionality of the service described above, the following data is needed:

  • data concerning different places: countries, regions, cities. In this case open data was retrieved from the United Nations Economic Commission for Europe (UNECE).

  • data about points of interest (POI): hotels and restaurants, which can be fetched from OpenStreetMap. This data is available under the Open Database Licence and is the property of © OpenStreetMap contributors.

  • information about people, their travels to various places and their visits to restaurants and bars, which should enrich everything above. I believe this missing data can be collected by the Travel Helper service itself.

Data Model

The designed data model is shown in the figure below:

[Figure: data model diagram]

The nodes and relationships of the model together with appropriate examples are shown in the following tables:

Table 1. Domains
Domain | Attributes | Example
Person | name, age, blog address | Bill, age 32
Place:Country | name, code | Poland
Place:Region | name, code, type | Provence (France)
Place:City | name, code, coordinates | Warsaw
PlaceToSleep | name, website, address | Hilton Hotel in New York, Camp in Scottish Highlands
Sustenance | name, website, address, cuisine | Creperie in Paris, Burger Bar in Chicago
Trip | type, duration, season of the year | Trip around Europe, weekend in London

Table 2. Relationships
Start Node | Relationship | End Node | Example
Place | BELONGS_TO | Place | Warsaw BELONGS_TO Poland
Person | LIVES_IN (from, to) | Place | Kate LIVES_IN Moscow (from 2014)
Person | WENT_FOR | Trip | Bob WENT_FOR a trip around the world
Trip | TO (transportation) | Place | trip TO London (transportation = plane)
Trip | STARTS_FROM | Place | trip to London STARTS_FROM Berlin
Trip | IS_PART_OF (order no) | Trip | trip to London IS_PART_OF trip around the world (order_no = 1)
Trip | STAYED_AT (rate, avg price per night) | PlaceToSleep | during the trip to London Kate STAYED_AT Hilton Hotel (rate = 5, avg price per night = 1000)
Trip | WENT_TO (rate, avg money spent) | Sustenance | during the trip to London Kate WENT_TO Dawsan Restaurant (rate = 5, avg money spent = 1000)
Sustenance | IS_LOCATED_IN | Place | Dawsan Restaurant IS_LOCATED_IN London
PlaceToSleep | IS_LOCATED_IN | Place | Hilton Hotel IS_LOCATED_IN London
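To make the tables more concrete, the London examples from Table 2 could be written down as a tiny standalone graph. This is only a sketch: the property names rate, avg_price_per_night, order_no and transportation follow the queries used later in this document, while avg_money_spent and the trip name are assumptions made for illustration.

```cypher
// Standalone illustration: run it in an empty scratch database,
// because it creates its own London node instead of reusing the UNECE data.
CREATE (london:Place:City {name:'London'}),
       (hilton:PlaceToSleep {name:'Hilton Hotel'})-[:IS_LOCATED_IN]->(london),
       (dawsan:Sustenance {name:'Dawsan Restaurant'})-[:IS_LOCATED_IN]->(london),
       (kate:Person {name:'Kate'}),
       (trip:Trip {name:'Trip to London'}),        // trip name is an assumption
       (kate)-[:WENT_FOR]->(trip),
       (trip)-[:TO {transportation:'plane'}]->(london),
       (trip)-[:STAYED_AT {rate:5, avg_price_per_night:1000}]->(hilton),
       // avg_money_spent is an assumed property name for "avg money spent"
       (trip)-[:WENT_TO {rate:5, avg_money_spent:1000}]->(dawsan);
```

Note that the ratings and prices live on the relationships, not on the hotel or restaurant nodes, so every visit can carry its own values.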

Graph data upload

First, the test data is added to the database.

The uploaded data consists of information about people, places and trips of these people to various places:

Places:

  • Poland : Warsaw, Cracow, Zakopane, Torun, Gdansk, Poznan;

  • France : Paris, Nice, Avignon, Lyon, Marseille, Perpignan;

  • Italy : Rome, Milan, Palermo, Naples, Bari

  • Spain : Barcelona, Madrid, Seville, Bilbao

  • Portugal : Porto, Lisbon, Cascais, Faro

  • United Kingdom: London, Glasgow, Manchester, Cardiff

  • USA : Chicago, New York, Boston, Philadelphia, Washington, Seattle, San Francisco, San Jose, Monterey, Santa Barbara, Los Angeles, Las Vegas

The data about those places comes from UNECE sources. The CSV files containing countries, regions and cities have been trimmed so that they contain only data for the countries, regions and cities listed above. This keeps the graph small enough to be rendered here. The data is uploaded as follows:

create index on :Place(name);
create index on :Country(name);
create index on :Region(name);
create index on :City(name);
create index on :Place(code);
create index on :Country(code);
create index on :Region(code);
create index on :City(code);

load csv with headers from
'https://gist.githubusercontent.com/justynaGithub/45be86f418c009f0dcaf/raw/f8666bcc7cd9a9e8a0e191f148c67b88b6b58d06/countries.csv' as line fieldterminator ','
WITH line.CountryCode as CountryCode, line.CountryName as CountryName
CREATE (p:Place:Country{code:CountryCode, name:CountryName});

load csv with headers from
'https://gist.githubusercontent.com/justynaGithub/ce3bc36eb55c71a7931a/raw/fd9962071e13b1db7ace1cb2b971c150c91cda50/subdiv.csv' as line fieldterminator ','
WITH line.CountryCode as CountryCode, line.RegionCode as RegionCode, line.RegionName as RegionName, line.RegionType as RegionType
MATCH (country:Country {code:CountryCode})
CREATE (p:Place:Region{code:RegionCode, name:RegionName, type:RegionType})-[:BELONGS_TO]->(country);

load csv with headers from
'https://gist.githubusercontent.com/justynaGithub/d7708b8cd2891f876199/raw/e4a64ab07772452b9a23f48adbab16dd7213d522/cities.csv' as line fieldterminator ','
WITH line.CountryCode as CountryCode, line.CityCode as CityCode, line.CityNameNoSpecialChars as CityName, line.RegionCode as RegionCode, line.Coordinates as Coordinates
MATCH (country:Country {code:CountryCode})
// Attach each city to its region when one is found, otherwise directly to the country.
OPTIONAL MATCH (country)<-[:BELONGS_TO]-(region:Region{code:RegionCode})
FOREACH (o IN CASE WHEN region IS NOT NULL THEN [1] ELSE [] END |
	CREATE (c:Place:City{code:CityCode, name:CityName, coordinates:Coordinates})-[:BELONGS_TO]->(region)
)
FOREACH (o IN CASE WHEN region IS NULL THEN [1] ELSE [] END |
	CREATE (c:Place:City{code:CityCode, name:CityName, coordinates:Coordinates})-[:BELONGS_TO]->(country)
);

The graph now contains the chosen test countries together with the regions and cities that belong to them.
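A quick way to check the upload could be to count the cities attached to each country. The sketch below assumes, as in the load statements above, that a city belongs either directly to a country or to a region of that country, so at most two BELONGS_TO hops are needed.

```cypher
// Count the cities attached to each country, directly or through a region.
MATCH (city:City)-[:BELONGS_TO*1..2]->(country:Country)
RETURN country.name AS country, count(DISTINCT city) AS cities
ORDER BY cities DESC;
```

A country from the list above that is missing from the result would suggest that its country or region codes did not match between the CSV files.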

The next step is to upload the data about restaurants and hotels in Warsaw. Only this one city has been chosen to demonstrate the use of this data. The data was retrieved from OpenStreetMap using https://overpass-turbo.eu/s/e6d and converted to a CSV file.

//restaurants
load csv with headers from
'https://gist.githubusercontent.com/justynaGithub/a5fdb93fc28988d03eb8/raw/554fd7f02a5e57b819533bbb618e0774c6a1755b/restaurantsWarsaw.csv' as line fieldterminator ','
WITH line.name as Name, line.lon as Lon, line.lat as Lat, line.cuisine as Cuisine, line.addr_city as City, line.addr_street as Street, line.addr_housenumber as Housenumber, line.website as Website
MATCH (warsaw:City{name:'Warszawa'})
CREATE (:Sustenance:Restaurant{name:Name, lon:Lon, lat:Lat,city: City,street:Street, housenumber:Housenumber, cuisine:Cuisine, website:Website})-[:IS_LOCATED_IN]->(warsaw);

//hotels
load csv with headers from
'https://gist.githubusercontent.com/justynaGithub/ee34f74812779b2b692d/raw/2509cd53639b209987a26590cf776ee563679d57/hotelsWarsaw.csv' as line fieldterminator ','
WITH line.name as Name, line.lon as Lon, line.lat as Lat, line.addr_city as City, line.addr_street as Street, line.addr_housenumber as Housenumber, line.website as Website
MATCH (warsaw:City{name:'Warszawa'})
CREATE (:PlaceToSleep:Hotel{name:Name, lon:Lon, lat:Lat, city:City, street:Street, housenumber:Housenumber, website:Website})-[:IS_LOCATED_IN]->(warsaw);

Restaurants and hotels in Warsaw:

MATCH (a)-[r:IS_LOCATED_IN]->(warsaw:City{name:'Warszawa'})
RETURN a, r, warsaw
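Since the restaurants carry the cuisine property taken from OpenStreetMap, the same data can also be narrowed down by cuisine. The following is a minimal sketch; the value 'italian' is an assumption and should be replaced with a value that actually occurs in the CSV file.

```cypher
// List the Warsaw restaurants of one cuisine together with their address data.
MATCH (r:Restaurant)-[:IS_LOCATED_IN]->(:City {name:'Warszawa'})
WHERE r.cuisine = 'italian'   // assumed value, pick one present in the data
RETURN r.name AS name, r.street AS street, r.housenumber AS housenumber, r.website AS website
ORDER BY name;
```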

The next step is to add some example people and their trips to various places.

People:

  • Kate, age: 30, from Madrid in Spain, traveled around USA, went to Barcelona

  • Ben, age: 56, from London in UK, went to USA

  • Tom, age: 40, from Madrid in Spain, spent a weekend in London

  • John, age: 34, from Madrid in Spain, spent a weekend in Barcelona

  • Claudia, age: 26, from Lisbon in Portugal, traveled around Poland

  • Norah, age: 18, from Chicago in USA, traveled around Poland

  • Lucas, age: 30, from Warsaw in Poland, traveled around Europe

  • Pedro, age: 32, from Rome in Italy, traveled around Poland

  • Pierre, age: 40, from Nice in France, traveled around Poland

  • Laura, age: 31, from Madrid in Spain, looking for an inspiration for traveling
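One entry from this list, Tom's weekend in London, could be stored roughly as shown below. This is a sketch and not the statement actually used to load the test data: the structure (a main trip that starts from a place, with parts that go TO places) and the property names duration, order_no and transportation are inferred from the queries in the use cases, while the trip names and the value 'plane' are assumptions.

```cypher
// Assumes exactly one City named 'Madrid' and one named 'London' already exist;
// with several matches the pattern would be created once per combination.
MATCH (madrid:City {name:'Madrid'}), (london:City {name:'London'})
CREATE (tom:Person {name:'Tom', age:40})-[:LIVES_IN]->(madrid),
       (tom)-[:WENT_FOR]->(weekend:Trip {name:'Weekend in London', duration:2}),
       (weekend)-[:STARTS_FROM]->(madrid),
       // a single leg, linked to the weekend trip as its first part
       (leg:Trip {name:'Madrid to London'})-[:IS_PART_OF {order_no:1}]->(weekend),
       (leg)-[:TO {transportation:'plane'}]->(london);
```

Modelling even a one-stop weekend as a main trip with a single part keeps it compatible with the queries that walk the IS_PART_OF relationship.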

Use cases

With data collected about people and their travels to various places, one can recommend places and facilities that better suit people's needs. The example use cases fall into two groups: using TravelHelper to plan holidays in advance, and using TravelHelper for ad-hoc help during holidays. The examples below assume that the person looking for help is Laura, age 31, from Madrid in Spain.

1. How to use TravelHelper to plan holidays?

1a. I am Laura, 31, from Madrid. Where can I go for a weekend?

MATCH (weekend:Trip{duration:2})-[:STARTS_FROM]->(madrid:Place{name:'Madrid'}),
(trip:Trip)-[:IS_PART_OF]->(weekend),
(trip)-[:TO]->(place:Place)
WHERE place.name <> 'Madrid'
WITH place.name as placeName, count(place) as counts
RETURN placeName
ORDER BY counts DESC

1b. I am Laura, 31, from Madrid. I am planning to go to USA for one month. I want to see as many places as possible. Show me how people travel there.

MATCH (shortTrip:Trip)-[:TO]->(:Place)-[:BELONGS_TO*]->(:Country{code:'US'}),
(shortTrip)-[:IS_PART_OF]->(usaTrip:Trip)-[:STARTS_FROM]->(start_place:Place)
WHERE usaTrip.duration<32
WITH DISTINCT usaTrip, start_place.name as start_place
MATCH (:Country{code:'US'})<-[:BELONGS_TO*]-(city:Place)<-[to:TO]-(shortTrip:Trip)-[part:IS_PART_OF]->(usaTrip)
WITH usaTrip.name as tripName, start_place, city.name as name, part.order_no as order_no, to.transportation as by
ORDER BY order_no
WITH tripName, start_place, collect({order_no:order_no, to:name, by:by}) as cities
WITH tripName, start_place, cities, size(cities) as nbrOfCities
RETURN tripName, start_place, cities
ORDER BY nbrOfCities DESC

1c. I am Laura, 31, from Madrid. I need inspiration for a long trip. I want to see as many places as possible. Show me the travels of other people.

MATCH (:Trip)-[:IS_PART_OF]->(longTrip:Trip)-[:STARTS_FROM]->(start_place:Place)
WITH DISTINCT longTrip, start_place.name as start_place
MATCH (city:Place)<-[to:TO]-(shortTrip:Trip)-[part:IS_PART_OF]->(longTrip)
WITH longTrip.name as tripName, start_place, city.name as name, part.order_no as order_no, to.transportation as by
ORDER BY order_no
WITH tripName, start_place, collect({order_no:order_no, to:name, by:by}) as cities
WITH tripName, start_place, cities, size(cities) as nbrOfCities
RETURN tripName, start_place, cities
ORDER BY nbrOfCities DESC

2. How to use TravelHelper ad-hoc during holidays?

2a. Restaurants in Warsaw, ranked by the average rating given by visitors aged 26 to 35 (Laura's age group):

MATCH (restaurant:Sustenance)-[:IS_LOCATED_IN]->(:Place{name:'Warszawa'}),
(client:Person)-[:WENT_FOR]->(:Trip)-[meal:WENT_TO]->(restaurant)
WHERE client.age>25 AND client.age<36
WITH restaurant.name as resto, avg(meal.rate) as avg_rate
RETURN resto, avg_rate
ORDER BY avg_rate DESC

2b. Hotels in Warsaw with an average price per night below 200, cheapest first:

MATCH (hotel:PlaceToSleep)-[:IS_LOCATED_IN]->(:Place{name:'Warszawa'}),
(client:Person)-[:WENT_FOR]->(:Trip)-[stay:STAYED_AT]->(hotel)
WITH hotel.name as hotel, hotel.website as website, avg(stay.avg_price_per_night) as avg_price
WHERE avg_price<200
RETURN hotel, website, avg_price
ORDER BY avg_price

2c. I am Laura, 31, from Madrid. Currently visiting Warsaw in Poland. I want to spend more time in Poland than I planned previously. Where can I go next?

MATCH (warsawTrip:Trip)-[:TO]->(place:Place{name:'Warszawa'}),
(warsawTrip)-[warsawPart:IS_PART_OF]->(longTrip:Trip),
(previousPlace:Place)<-[:TO]-(previousTrip)-[previousPart:IS_PART_OF]->(longTrip),
(place)-[:BELONGS_TO*]->(:Country{name:'Poland'})<-[:BELONGS_TO*]-(previousPlace)
WHERE previousPart.order_no = warsawPart.order_no -1
RETURN previousPlace.name as place
UNION
MATCH (warsawTrip:Trip)-[:TO]->(place:Place{name:'Warszawa'}),
(warsawTrip)-[warsawPart:IS_PART_OF]->(longTrip:Trip),
(nextPlace:Place)<-[:TO]-(nextTrip)-[nextPart:IS_PART_OF]->(longTrip),
(place)-[:BELONGS_TO*]->(:Country{name:'Poland'})<-[:BELONGS_TO*]-(nextPlace)
WHERE nextPart.order_no = warsawPart.order_no +1
RETURN nextPlace.name as place

Summary

The presented model already enables various recommendations, as shown in the use cases, and it has the potential to be expanded further. The model can be enriched with additional relationships (for example, a Person can FOLLOW another Person, or a Place IS CLOSE to another Place) and with additional labels for places, such as Island or Continent. These extra relationships and labels could help to improve the personalization of travel recommendations.
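As an illustration of the first suggested extension, the sketch below adds a follow relationship between two of the test people and uses it to rank destinations. The relationship type FOLLOWS is a hypothetical name, it is not part of the model presented above, and the query assumes that trips are split into parts as in the use cases.

```cypher
// Hypothetical extension: FOLLOWS does not exist in the model above.
MATCH (laura:Person {name:'Laura'}), (kate:Person {name:'Kate'})
CREATE (laura)-[:FOLLOWS]->(kate);

// Places visited on the trips of people Laura follows, most visited first.
MATCH (:Person {name:'Laura'})-[:FOLLOWS]->(:Person)-[:WENT_FOR]->(:Trip)<-[:IS_PART_OF]-(:Trip)-[:TO]->(place:Place)
RETURN place.name AS place, count(*) AS visits
ORDER BY visits DESC;
```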