Cypher Sleuthing: Dealing with Dates, Part 4


Image by Markus Winkler, Unsplash
My previous part 1, part 2, and part 3 posts on this topic introduced Cypher dates, translated formats to the Cypher-supported ISO 8601 format, calculated durations, accessed components, and translated durations to certain values. If you read those, then this post is the next one in the series showing you how to convert durations across component categories. If you haven’t read the previous posts leading up to this one, feel free to catch up (recommend at least Part 3 as this post’s prequel) — though it isn’t required. 🙂 This is a five-part series, with each part focusing on a different piece of Cypher dates – part1, part2, part3, part4 (this one), part5. In our Part 3 post, we covered three duration categories (and related functions) for converting duration values to whole values — months, days, seconds. We also looked at components and how to use them to access other values outside of those months, days, and seconds. In this post, we will combine the three duration functions along with various components to translate values from a unit in one category to a unit in another category. Then, we will take a look at a couple of fun puzzles having to do with Cypher and dates to wrap up.

Rewind: Quick Catchup

Let’s review a couple of principles on duration categories and components from our last post before we dive into using them together. There are 3 component groups (shown again in the screenshot below), and these form the foundation for duration units and conversions.
The table can be found in the Cypher manual section.
As a reminder: the column on the right is the key — values in one column can be converted to any other unit in that same cell, but not another row. For instance, I could convert a duration from days to weeks, but not to months or minutes. Notice, also, that these categories correspond to our duration functions of inMonths(), inDays(), and inSeconds(), which we will discuss here! Finally, here are the rules I discovered for the Cypher component groups.
  1. You can only convert among units in a component grouping, not across groups.
  2. There must be whole values to convert to larger units. Components do not mix whole and fraction values (i.e. 36 hours -> 1.5 days). There are specific components that handle whole values of the unit, and there are separate components that handle only fractions of the unit.
Photo by Jon Tyson on Unsplash
Now, per our rules, we are limited in expressing certain durations as numbers that aren’t easy to understand or read. For instance, saying that my flight leaves in 28 hours (while precise) probably triggers a mental calculation to 1 day 4 hours. Because our brains are used to allocating resources into the largest buckets first, then remainders into smaller categories, our perception of time and planning seems to operate better in these formats. As another example, most things operate on a 12-hour clock (3:00pm), except for where precision matters. In those cases, you will see a 24-hour clock (15:00). But telling a random person on the street that the time is fifteen hundred will most likely return confused looks. Think smaller numbers, larger units (1 day vs 24 hours, 2 months vs 60 days, etc). OK, so how can we translate some of these larger-number-smaller-unit values into something more easily understood? Earlier, I mentioned that the duration categories (Months, Days, Seconds) align with the duration functions (inMonths(), inDays(), inSeconds()). This is to allow conversions across component groups!

Combining Duration Functions and Components

Using durations functions with components means we can convert our duration values from one component category into another, and then translate among the components within that group.
Note that precision of conversions may not always be 100% accurate, as certain conversions are not consistent throughout the year. However, the conversions use general standards (30 days in a month, 24 hours in a day). Let’s expand the power of converting durations across categories by combining functions and components! Example 1: Translate duration from months/days to weeks/days
MATCH (v:Vacation)
RETURN
duration.between(date(),v.startDate) as preciseDuration,
duration.inDays(date(),v.startDate).weeks as weeks, duration.inDays(date(),v.startDate).daysOfWeek as daysOfWeek;
In our example above, we are translating a duration in months and days into days and weeks. First, we need to calculate the entire value into days using the duration.inDays() method. Then, we can access the weeks and days components from that. Note that, depending on the input dates (and year), the results can vary — i.e. durations with those same dates but in a leap year or from June 30-Sept 1 that includes two consecutive months with 31 days each. Let’s take another example! Example 2: Calculate projected hours a resource works on a project
MATCH (:Employee)-[rel:ASSIGNED]-(p:Project)
RETURN 
duration.between(rel.startDate, p.expectedEndDate) as lengthAssigned, 
duration.inSeconds(rel.startDate, p.expectedEndDate).hours as lengthInHours;
You might respond that this assumes an employee spent 24 hours per day and seven days per week on a project, right? You would be correct! Let’s correct this, since our lengthInHours value is now just a number and not a duration value.
MATCH (:Employee)-[rel:ASSIGNED]-(p:Project)
RETURN duration.between(rel.startDate, p.expectedEndDate) as lengthAssigned, 
duration.inDays(rel.startDate, p.expectedEndDate).weeks as lengthInWeeks, 
duration.inDays(rel.startDate, p.expectedEndDate).weeks * 40 as projectHours;
Notice that in order to get the project hours, I took the approach to find the total number of weeks for the project, then calculate the number of hours per week worked (assuming a 40 hour work week and 100% allocation of the resource). Now, this number may not be the actual total of the completed project, but it would definitely be a good estimate number. Switching gears just a bit, infant age is always confusing to me because some people go by weeks, months, or years (e.g. baby is 14 months old… cue math calculation in brain). Let’s let Cypher do the work.   Example 3: Calculate age of infant
MERGE (b:Baby)
 SET b.dateOfBirth = date(‘2021–02–28’)
RETURN b.dateOfBirth, 
duration.between(b.dateOfBirth, date()) as age, 
duration.inMonths(b.dateOfBirth, date()).months as months, 
duration.inDays(date(‘2021–02–28’),date()).weeks as weeks;
In the code above, we set the baby’s birthdate as February 28, 2021, then in the return statement, calculate the baby’s age by precise duration, months, and weeks. In order to get months, we calculate the duration directly to months with the inMonths() function. Then, to get weeks, we first need to convert to the days/weeks category using the inDays() function, then grab the component for weeks. No more mental calculation! We could be even more specific with the components to preserve remainder values.   Example 4: Calculate age of infant with remainder units
MATCH (b:Baby)
RETURN b.dateOfBirth, 
duration.between(b.dateOfBirth, date()) as age, 
duration.inDays(b.dateOfBirth,date()).weeks as weeks, 
duration.inDays(b.dateOfBirth,date()).daysOfWeek as daysOfWeek;
We could use this in a baby tracker app to tell us that a baby is exactly 19 weeks and 1 day old. This could also be used for more precision on a product/application being live, amount of time without incidents, or many other use cases!

Cypher Puzzles

In the past couple of weeks, I have come across a couple of fun puzzles with Cypher dates that I’d like to share with you. I will include answers, but I’ll post those at the bottom, so that those who want to challenge themself without peeking first can solve the puzzles.

Postgres SQL Ranges

The first challenge is a calendar appointment query. Postgres received an update, which improves queries in SQL for range data. This gives us a fun opportunity to see what Cypher’s version of this looks like. Let’s take a look at the question.
  • Available dates in the next month (source)
Updated SQL solution:
SELECT datemultirange(daterange(‘2021–06–01’, ‘2021–06–30’, ‘[]’)) 
       - range_agg(appointment_dates) AS availability
FROM appointments
WHERE appointment_dates && 
      daterange(‘2021–06–01’, ‘2021–06–30’, ‘[]’);
Solve away! Answer will be posted at the bottom of this post.

Weekly Progress of Year

Our second challenge is to write a Cypher query that visualizes progress through the year on a weekly basis. We will stick to using regular characters to visualize the progress, so it won’t be anything fancy and no extra tools will be needed. Let’s see our task.
  • Create a progress bar for how many weeks of the year have passed (include percentage, too)

Example characters for progress visualization:

28*’#’ + 22*’-’
Have at it! Answer will be posted at the bottom of this post.

Kudos

Quick shout-out to my colleague Michael Hunger who suggested both of these challenges and provided far cleaner and efficient solutions than those I was able to draft. 😁 Michael is widely revered as the founder of the APOC library, contributor of many other core aspects of Neo4j, and guru on Cypher. He may very well be the cornerstone of Neo4j developers and maintains an impressive presence on all content platforms, so if you needed help on anything Neo4j-related over the years, there’s a high probability that you have run into him.

Solution: Postgres SQL Ranges

There are several ways you can write this query, and even more when you start considering different data models. However, we will see two solutions that work, then I’ll include a brief explanation of the logic. Option 1:
WITH date(‘2021–06–01’) as start, date(‘2021–06–30’) as end
UNWIND [days IN range(0,duration.between(start,end).days) | 
         start + duration({days:days})] as day
OPTIONAL MATCH (a:Appointment {date:day}) 
WITH * WHERE a IS NULL
RETURN day;
Option 2:
WITH date(‘2021–06–01’) as start, date(‘2021–06–30’) as end
UNWIND [days IN range(0,duration.between(start,end).days) | 
         start + duration({days:days})] as day
WITH * WHERE NOT EXISTS { (:Appointment {date:day}) }
RETURN day;
In both of our solutions, we are first setting a start and end date of the month (you could choose any, but we just picked last month) and unwinding the days between those two dates (from start to end of the duration between) as each day. Then, we take all those days and see if there is an appointment that already exists on any of them and return only the remaining days.

Solution: Weekly Progress of Year

Just as with the first challenge’s solution, there are some different ways to tackle this one. You could use a variety of characters and tools to create something intriguing, but we are keeping it simple, clean, and efficient. Option 1:
WITH datetime().week as week, 52 as weeks
RETURN reduce(r=’’, w in range(1,weeks) | 
        r + CASE WHEN w < week THEN ‘#’ ELSE ‘-’ END) 
        + ‘ ‘ + (100*week/weeks) +’%’ as progress;
Option 2:
WITH datetime().week as week, 52 as weeks
RETURN reduce(r=’’, w in range(1,weeks) | 
        r + CASE WHEN w < week THEN ‘>’ 
                 WHEN w=week THEN ‘8’ 
                 ELSE ‘<’ END) 
        + ‘ ‘ + (100*week/weeks) +’%’ as progress;
The solutions to this problem might appear a bit more complicated, but we can break them down into manageable pieces. First, we need to find out what week of the year is the current week and note the total weeks in the year as our starting and end points. The next line returns the calculation using a reduce() function, which hops through a list of items and aggregates the current item to the current aggregate-value of all previous items. For example, if I had a list of 1,2,3,then reduce with a sum would have final results of 6 (1+2+3). Inside reduce(), we establish a result variable, and loop through each week in the number of weeks in the year (range(1,weeks)). On the right side of the pipe character, we then have our expression to aggregate our variable at each item in the list. We evaluate r — when the week number from our loop is less than the current week (in the past), we use one character; (solution 2) when the week is current week, we use another character; and when the week is greater than current week (yet-to-come), then we use a different character. Finally, we attach a percentage to the end of the output by calculating the current week number divided by total weeks (28/52) and multiplying the resulting fraction by 100 for the result.

Wrapping Up!

In this post (Part 4!), we have seen how to combine duration functions and components in order to translate durations in one component category to another one. Then, we put our new skills to the test with a couple of Cypher date challenges, stating the problems to solve, and then walking through some solutions and their logic. If you’d like to see some more Cypher sleuthing, I presented a session at NODES (Neo4j’s online developer conference) that covered a high-level overview of these date concepts and other gotchas. The recording is now available on YouTube! In the next post, we will step through the date procedures/functions that the APOC library offers and discuss which ones are replaceable with built-in Cypher functionality or still required to accomplish specific tasks related to temporal data. Until next time, happy coding! To read the next part of this series, read part 5.

Resources


Cypher Sleuthing: Dealing with Dates Part 4 was originally published in Neo4j Developer Blog on Medium, where people are continuing the conversation by highlighting and responding to this story.