Expressing Recurring Events

Paul Ganssle
Bloomberg


dateutil on GitHub
GitHub repo for this talk: pganssle-talks/boston-python-rrules

dateutil.rrule

rrule is an implementation of recurrence rules.

RFC 5545: "Internet Calendaring and Scheduling Core Object Specification"

Top level ToC for RFC 5545, illustrating that it's 170 pages!

Pat Morita's Birthday

Pat Morita and Ron Howard, Public Domain photo
In [6]:
rr = rrule(freq=YEARLY, bymonth=6, bymonthday=28,  # June 28 every year
           dtstart=datetime(1932, 6, 28),          # Starting in 1932
           until=datetime(2005, 11, 24))           # Ending in 2005
print_rr_elems(rr)
1932-06-28 00:00:00, 1933-06-28 00:00:00, ...
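
The cells in this deck rely on imports and a print_rr_elems helper that are not shown. A minimal self-contained sketch of the same rule follows, assuming the helper simply prints the first couple of occurrences followed by an ellipsis. The name first_elems and its n parameter are hypothetical stand-ins, not the helper used in the talk.

```python
from datetime import datetime
from itertools import islice

from dateutil.rrule import rrule, YEARLY


def first_elems(rr, n=2):
    """Hypothetical stand-in: format the first n occurrences of a rule."""
    shown = ", ".join(str(dt) for dt in islice(rr, n))
    return shown + ", ..."


rr = rrule(freq=YEARLY, bymonth=6, bymonthday=28,  # June 28 every year
           dtstart=datetime(1932, 6, 28),
           until=datetime(2005, 11, 24))
print(first_elems(rr))
```

Because the rule carries an until, it is finite and could also be materialized with list(rr); islice just avoids generating every occurrence when only a preview is wanted.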

... falling on a Monday

In [7]:
rr = rr.replace(byweekday=MO)
print_rr_elems(rr)
1937-06-28 00:00:00, 1943-06-28 00:00:00, ...

... during the Vietnam War

In [8]:
rr.between(datetime(1955, 11, 1),
           datetime(1975, 4, 30))
Out[8]:
[datetime.datetime(1965, 6, 28, 0, 0), datetime.datetime(1971, 6, 28, 0, 0)]

RRULE components

Fundamental elements of an rrule are:

  • dtstart: The start point of the recurrence (this is similar to a phase)
  • freq: The units of the fundamental frequency of the recurrence. It takes the values YEARLY, MONTHLY, WEEKLY, DAILY, HOURLY, MINUTELY, SECONDLY
  • interval: The number of freq units between successive recurrences (interval=2 with freq=HOURLY means every other hour). If unspecified, this is 1.
In [10]:
hourly = rrule(freq=HOURLY, interval=1, dtstart=datetime(2016, 7, 18, 9),  count=3)
interval_2 = hourly.replace(interval=2)
dtstart_rr = hourly.replace(dtstart=datetime(2016, 7, 18, 10))

print_rrs([hourly, dtstart_rr, interval_2], ['Hourly', 'dtstart', 'interval=2'])
          Hourly          |         dtstart          |        interval=2        
--------------------------------------------------------------------------------
     2016-07-18 09:00     |     2016-07-18 10:00     |     2016-07-18 09:00     
     2016-07-18 10:00     |     2016-07-18 11:00     |     2016-07-18 11:00     
     2016-07-18 11:00     |     2016-07-18 12:00     |     2016-07-18 13:00     

byxxx rules

byxxx rules serve to modify the frequency of the recurrence in some way. The supported rules are bymonth, bymonthday, byyearday, byweekno, byweekday, byhour, byminute, bysecond, bysetpos and byeaster.

  • byxxx rules greater than or equal to freq are constraints and (generally) reduce the frequency of the recurrence:
In [11]:
# Base is DAILY, but restricted to Tuesdays in November
list(rrule(DAILY, bymonth=11, byweekday=(TU, ),
           dtstart=datetime(2015, 1, 1, 12), count=5))
Out[11]:
[datetime.datetime(2015, 11, 3, 12, 0),
 datetime.datetime(2015, 11, 10, 12, 0),
 datetime.datetime(2015, 11, 17, 12, 0),
 datetime.datetime(2015, 11, 24, 12, 0),
 datetime.datetime(2016, 11, 1, 12, 0)]
  • byxxx rules less than freq will generally increase the frequency of the recurrence:
In [12]:
list(rrule(MONTHLY, bymonthday=(1, 15, 30),
           dtstart=datetime(2015, 1, 16, 12, 15), count=4))
Out[12]:
[datetime.datetime(2015, 1, 30, 12, 15),
 datetime.datetime(2015, 2, 1, 12, 15),
 datetime.datetime(2015, 2, 15, 12, 15),
 datetime.datetime(2015, 3, 1, 12, 15)]
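
bysetpos is listed among the supported rules but is not demonstrated. It picks the n-th element out of the set of occurrences produced within each freq period, so "the last weekday of every month" could be sketched roughly as below. The dates are chosen purely for illustration.

```python
from datetime import datetime

from dateutil.rrule import rrule, MONTHLY, MO, TU, WE, TH, FR

# Within each month, byweekday expands to every Monday-Friday;
# bysetpos=-1 then keeps only the last element of that monthly set.
last_weekday = rrule(MONTHLY, byweekday=(MO, TU, WE, TH, FR), bysetpos=-1,
                     dtstart=datetime(2019, 1, 1), count=3)
result = list(last_weekday)
print(result)
```

March 2019 ends on a Sunday, so the third occurrence falls on Friday the 29th rather than on the 31st.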

Limiting rules

If otherwise unspecified, recurrences can be generated to infinity (or at least until Python can't represent the date anymore). The two ways to specify a termination point as part of the rule are with the mutually exclusive count and until arguments.

  • count terminates the rule after a specific number of instances have been generated:
In [13]:
# The next 2 instances where the 4th of July falls on a Friday
list(rrule(YEARLY, bymonth=7, bymonthday=4, byweekday=FR,
           dtstart=datetime(2016, 7, 5), count=2))
Out[13]:
[datetime.datetime(2025, 7, 4, 0, 0), datetime.datetime(2031, 7, 4, 0, 0)]
  • until terminates the rule on a specific date:
In [14]:
# The Friday the 13ths before January 1st, 2018
list(rrule(MONTHLY, bymonthday=13, byweekday=FR,
           dtstart=datetime(2016, 7, 17, 12), until=datetime(2018, 1, 1)))
Out[14]:
[datetime.datetime(2017, 1, 13, 12, 0), datetime.datetime(2017, 10, 13, 12, 0)]
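
Two details of limiting are easy to miss: until is inclusive, and a rule with neither count nor until must not be passed to list(). A small sketch, with illustrative dates, that shows both:

```python
from datetime import datetime
from itertools import islice

from dateutil.rrule import rrule, DAILY

start = datetime(2019, 1, 1)

# until is inclusive: an occurrence landing exactly on it is still generated
bounded = list(rrule(DAILY, dtstart=start, until=datetime(2019, 1, 3)))

# No count or until: the rule is unbounded, so slice it lazily instead
unbounded = rrule(DAILY, dtstart=start)
first_three = list(islice(unbounded, 3))

print(bounded == first_three)
```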

Using rrules

It is also possible to retrieve specific subsets of the recurrence, e.g. the first recurrence after a given date:

In [15]:
rr = rrule(DAILY, byhour=9, byweekday=range(0, 5), dtstart=datetime(2016, 7, 1))

rr.after(datetime.now())      # The beginning of the next weekday
Out[15]:
datetime.datetime(2019, 1, 24, 9, 0)

You can retrieve the most recent recurrence before a given date:

In [16]:
rr.before(datetime(2017, 3, 14))   # A Tuesday, so this gives Monday at 9:00
Out[16]:
datetime.datetime(2017, 3, 13, 9, 0)

You can also get all the recurrences between two dates:

In [17]:
# byeaster is a non-standard extension in dateutil that calculates a day
# offset from easter. This rule generates all the easters between 1995 and 2000.
rr = rrule(YEARLY, byeaster=0, dtstart=datetime(1990, 1, 1))

rr.between(datetime(1995, 1, 1), datetime(2000, 1, 1))
Out[17]:
[datetime.datetime(1995, 4, 16, 0, 0),
 datetime.datetime(1996, 4, 7, 0, 0),
 datetime.datetime(1997, 3, 30, 0, 0),
 datetime.datetime(1998, 4, 12, 0, 0),
 datetime.datetime(1999, 4, 4, 0, 0)]
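
By default after, before and between all exclude the boundary datetimes themselves; each accepts an inc flag to make the comparison inclusive. A short sketch with illustrative dates, querying exactly on an occurrence:

```python
from datetime import datetime

from dateutil.rrule import rrule, DAILY

rr = rrule(DAILY, byhour=9, dtstart=datetime(2016, 7, 1))
edge = datetime(2016, 7, 4, 9)               # falls exactly on an occurrence

strictly_after = rr.after(edge)              # next occurrence, edge excluded
at_or_after = rr.after(edge, inc=True)       # edge itself qualifies
strictly_before = rr.before(edge)            # previous occurrence, edge excluded

open_range = rr.between(edge, datetime(2016, 7, 6, 9))
closed_range = rr.between(edge, datetime(2016, 7, 6, 9), inc=True)

print(len(open_range), len(closed_range))
```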

Missing datetimes

In [18]:
rr = rrule(freq=MONTHLY, dtstart=datetime(2019, 1, 1), bymonthday=31, count=5)
for dt in rr:
    print(dt)
2019-01-31 00:00:00
2019-03-31 00:00:00
2019-05-31 00:00:00
2019-07-31 00:00:00
2019-08-31 00:00:00

RFC 5545, §3.3.10:

Recurrence rules may generate recurrence instances with an invalid date (e.g., February 30)
or nonexistent local time (e.g., 1:30 AM on a day where the local time is moved forward by
an hour at 1:00 AM). Such recurrence instances MUST be ignored and MUST NOT be counted as
part of the recurrence set.
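
If the intent behind bymonthday=31 was really "the last day of every month", a negative bymonthday expresses that without skipping short months. A minimal sketch:

```python
from datetime import datetime

from dateutil.rrule import rrule, MONTHLY

# bymonthday=-1 counts from the end of the month, so every month matches
month_ends = list(rrule(MONTHLY, bymonthday=-1,
                        dtstart=datetime(2019, 1, 1), count=4))
for dt in month_ends:
    print(dt)
```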

SKIP option

RFC 7529: "Non-Gregorian Recurrence Rules in the Internet Calendaring and Scheduling Core Object Specification"

Section 4.1 adds:

  • SKIP=OMIT - Ignore the invalid date and move on to the next instance (default)
  • SKIP=BACKWARD - Fall back to the last valid date (doesn't have to match the rule)
  • SKIP=FORWARD - Return the next valid date

Example:

DTSTART:20141231T000000
RRULE:FREQ=MONTHLY;COUNT=4;RSCALE=GREGORIAN;SKIP=BACKWARD

Generates: 2014-12-31, 2015-01-31, 2015-02-28, 2015-03-31

Pull request implementing Skip option
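
Where SKIP is not available in rrule, one rough way to approximate SKIP=BACKWARD for a simple monthly rule is dateutil's relativedelta, which clamps an out-of-range day to the last valid day of the target month. This is a workaround sketch that reproduces the example above, not the behavior or API of the pull request.

```python
from datetime import datetime

from dateutil.relativedelta import relativedelta

dtstart = datetime(2014, 12, 31)

# Offset from dtstart each time rather than chaining additions, so the
# clamp to February 28 does not stick for the months that follow.
occurrences = [dtstart + relativedelta(months=i) for i in range(4)]
for dt in occurrences:
    print(dt.date())
```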

rruleset

Some recurrences cannot be expressed in a single rrule. rruleset allows you to combine rrules and datetimes to generate an arbitrary recurrence schedule.

rruleset interface:

  • rruleset.rrule(): Add a recurrence rule to the set
  • rruleset.exrule(): Subtract a recurrence rule from the set
  • rruleset.rdate(): Add a specific datetime to the set
  • rruleset.exdate(): Subtract a specific datetime from the set
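
Before the larger example, here is a minimal sketch of three of these methods working together, using made-up dates: a daily 9:00 event for five days, with one day removed and a one-off replacement added.

```python
from datetime import datetime

from dateutil.rrule import rrule, rruleset, DAILY

rset = rruleset()
rset.rrule(rrule(DAILY, dtstart=datetime(2019, 1, 1, 9), count=5))  # Jan 1-5 at 09:00
rset.exdate(datetime(2019, 1, 3, 9))     # must match the occurrence exactly
rset.rdate(datetime(2019, 1, 3, 14))     # one-off replacement at 14:00

days = list(rset)
for dt in days:
    print(dt)
```

The set yields its members in chronological order regardless of the order in which rules and dates were added.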

Bus schedule

In [22]:
dtstart = datetime(2016, 11, 1, 0, 0)    # The base date
WEEKDAYS = (MO, TU, WE, TH, FR);    WEEKENDS = (SA, SU)
bus_schedule = rruleset()
In [23]:
# During the week, it comes every hour on the :37 from 6:37AM to 9:37PM...
weekday_schedule = rrule(DAILY, byweekday=WEEKDAYS,
                         byhour=range(6, 22), byminute=37, dtstart=dtstart)
bus_schedule.rrule(weekday_schedule)       # Add an rrule to the rule set
In [24]:
# ..except after 6, when it comes every other hour - so exclude 7:37PM and 9:37PM!
weeknight_schedule = weekday_schedule.replace(byhour=(19, 21))
bus_schedule.exrule(weeknight_schedule)
In [25]:
# During the weekend, it comes every hour on the :07, from 8AM to 7PM
weekend_schedule = rrule(DAILY, byweekday=WEEKENDS,
                         byhour=range(8, 20), byminute=7, dtstart=dtstart)
bus_schedule.rrule(weekend_schedule)

rdate and exdate

In [26]:
# But on November 8th, 2016, politicians have arranged for busses to undergo
# "service", so the normal bus schedule is canceled that day
exdates = bus_schedule.between(datetime(2016, 11, 8, 0), datetime(2016, 11, 9))
for exdate in exdates:
    bus_schedule.exdate(exdate)
In [27]:
# And in its place they've added one bus at 4:37 AM
bus_schedule.rdate(datetime(2016, 11, 8, 4, 37))

# And one at 7:49 PM
bus_schedule.rdate(datetime(2016, 11, 8, 19, 49))

And display the schedule:

In [29]:
bus_list = bus_schedule.between(datetime(2016, 11, 7), datetime(2016, 11, 14))
o = print_bus_schedule(bus_list)
HTML(o)
Out[29]:
2016-11-07  2016-11-08  2016-11-09  2016-11-10  2016-11-11  2016-11-12  2016-11-13
Mon         Tue         Wed         Thu         Fri         Sat         Sun
06:37:00    04:37:00    06:37:00    06:37:00    06:37:00    08:07:00    08:07:00
07:37:00    19:49:00    07:37:00    07:37:00    07:37:00    09:07:00    09:07:00
08:37:00    None        08:37:00    08:37:00    08:37:00    10:07:00    10:07:00
09:37:00    None        09:37:00    09:37:00    09:37:00    11:07:00    11:07:00
10:37:00    None        10:37:00    10:37:00    10:37:00    12:07:00    12:07:00
11:37:00    None        11:37:00    11:37:00    11:37:00    13:07:00    13:07:00
12:37:00    None        12:37:00    12:37:00    12:37:00    14:07:00    14:07:00
13:37:00    None        13:37:00    13:37:00    13:37:00    15:07:00    15:07:00
14:37:00    None        14:37:00    14:37:00    14:37:00    16:07:00    16:07:00
15:37:00    None        15:37:00    15:37:00    15:37:00    17:07:00    17:07:00
16:37:00    None        16:37:00    16:37:00    16:37:00    18:07:00    18:07:00
17:37:00    None        17:37:00    17:37:00    17:37:00    19:07:00    19:07:00
18:37:00    None        18:37:00    18:37:00    18:37:00    None        None
20:37:00    None        20:37:00    20:37:00    20:37:00    None        None

rrulestr

The iCalendar spec defines a specific string format for recurrence rules.

rrules can also be generated from these strings using rrulestr:

In [31]:
# DST start and stop transition rules for Pacific Time
dst_start = rrulestr('DTSTART:19671029T020000\n'
                     'RRULE:FREQ=YEARLY;BYDAY=1SU;BYMONTH=4')
dst_end = rrulestr('DTSTART:19671029T020000\n'
                   'RRULE:FREQ=YEARLY;BYDAY=-1SU;BYMONTH=10')
rrset = rruleset()
rrset.rrule(dst_start)
rrset.rrule(dst_end)

rrset.between(datetime(2016, 1, 1), datetime(2018, 1, 1))
Out[31]:
[datetime.datetime(2016, 4, 3, 2, 0),
 datetime.datetime(2016, 10, 30, 2, 0),
 datetime.datetime(2017, 4, 2, 2, 0),
 datetime.datetime(2017, 10, 29, 2, 0)]
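
rrulestr also accepts a bare RRULE value with no DTSTART line, in which case the start can be passed as a keyword argument. The sketch below, with an illustrative rule, additionally uses forceset=True to get an rruleset back even though the string contains a single rule.

```python
from datetime import datetime

from dateutil.rrule import rrulestr, rruleset

rule_text = 'FREQ=WEEKLY;BYDAY=MO;COUNT=3'

# No DTSTART in the string, so supply it explicitly
mondays = rrulestr(rule_text, dtstart=datetime(2019, 1, 1))

# forceset=True returns an rruleset, ready for rdate()/exdate() calls
as_set = rrulestr(rule_text, dtstart=datetime(2019, 1, 1), forceset=True)
as_set.exdate(datetime(2019, 1, 14))

print(list(mondays))
print(list(as_set))
```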

str(rrule)

You can generate RRULE strings from rrule objects as well:

In [32]:
# This string should be compatible with other applications using the iCalendar spec
print(rrule(YEARLY, byyearday=180, byhour=(1, 4, 12), dtstart=datetime(2014, 9, 13)))
DTSTART:20140913T000000
RRULE:FREQ=YEARLY;BYYEARDAY=180;BYHOUR=1,4,12
In [33]:
# Note that the BYEASTER directive is the only RFC-incompatible output
print(rrule(YEARLY, byeaster=0, dtstart=datetime(1990, 1, 1), count=14))
DTSTART:19900101T000000
RRULE:FREQ=YEARLY;COUNT=14;BYEASTER=0
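
Since str() and rrulestr are inverses for RFC-compatible rules, a rule can be serialized, stored, and rebuilt later. A small round-trip sketch, reusing the first rule above with a count added so that it is finite:

```python
from datetime import datetime

from dateutil.rrule import rrule, rrulestr, YEARLY

original = rrule(YEARLY, byyearday=180, byhour=(1, 4, 12),
                 dtstart=datetime(2014, 9, 13), count=6)

serialized = str(original)        # DTSTART line followed by an RRULE line
restored = rrulestr(serialized)   # parse it back into an rrule

print(serialized)
print(list(original) == list(restored))
```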