R Language Parsing dates and datetimes from strings with lubridate


Example

The lubridate package provides convenient functions to parse character strings into date and datetime objects. The function names are permutations of the following letters, each standing for one element of the date or time:

| Letter | Element to parse | Base R equivalent |
|---|---|---|
| y | year | %y, %Y |
| m (with y and d) | month | %m, %b, %h, %B |
| d | day | %d, %e |
| h | hour | %H, %I%p |
| m (with h and s) | minute | %M |
| s | seconds | %S |

For example, ymd() parses a date written as year, month, day, such as "2016-07-22", and ymd_hms() parses a datetime written as year, month, day, hours, minutes, seconds, such as "2016-07-22 13:04:47".

The functions are able to recognize most separators (such as /, -, and whitespace) without additional arguments. They also work with inconsistent separators.
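
As a brief illustration of how the function name encodes the component order, the sketch below parses the same calendar date written three different ways, each with a different separator. The input strings are made up for this example.

```r
library(lubridate)

# The same date in three component orders, each with a different separator
# (illustrative strings, not taken from any real data set).
d1 <- ymd("2016-07-22")     # year, month, day
d2 <- dmy("22/07/2016")     # day, month, year
d3 <- mdy("07.22.2016")     # month, day, year

# All three should compare equal as Date values.
d1 == d2 && d2 == d3

# A datetime variant: day, month, year followed by hours and minutes.
dt <- dmy_hm("22.07.2016 13:04")
dt
```

Choosing the function whose name matches the order in the data is the only decision required here; the separators do not need to be described.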


Dates

The date functions return an object of class Date.

library(lubridate) 

mdy(c(' 07/02/2016 ', '7 / 03 / 2016', ' 7 / 4 / 16 '))
## [1] "2016-07-02" "2016-07-03" "2016-07-04"

ymd(c("20160724","2016/07/23","2016-07-25"))    # inconsistent separators
## [1] "2016-07-24" "2016-07-23" "2016-07-25"
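
The following is a small sketch of what the date functions return in a few edge cases: the class of the result, what happens to a string that cannot be parsed, and the effect of supplying tz. The inputs are invented for illustration.

```r
library(lubridate)

good <- ymd("2016-07-22")
class(good)                       # "Date"

# Strings that cannot be parsed become NA (with a warning by default);
# quiet = TRUE suppresses the warning message.
bad <- ymd(c("2016-07-22", "not a date"), quiet = TRUE)
is.na(bad)                        # FALSE TRUE

# Supplying tz makes a date function return POSIXct instead of Date.
with_tz <- ymd("2016-07-22", tz = "UTC")
class(with_tz)
```

Checking for NA after parsing is a simple way to spot values that did not match the expected order.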

Datetimes

Utility functions

Datetimes can be parsed with ymd_hms and its variants, including ymd_hm and ymd_h. All datetime functions accept a tz argument like that of as.POSIXct or strptime, but it defaults to "UTC" rather than the local time zone.

The datetime functions return an object of class POSIXct.

x <- c("20160724 130102","2016/07/23 14:02:01","2016-07-25 15:03:00")
ymd_hms(x, tz="EST")
## [1] "2016-07-24 13:01:02 EST" "2016-07-23 14:02:01 EST"
## [3] "2016-07-25 15:03:00 EST"

ymd_hms(x)
## [1] "2016-07-24 13:01:02 UTC" "2016-07-23 14:02:01 UTC"
## [3] "2016-07-25 15:03:00 UTC"
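
To make the truncated variants and the tz default more concrete, here is a minimal sketch. The zone name "America/New_York" is only an example of an IANA time zone name and assumes the system's time zone database is available.

```r
library(lubridate)

# Truncated variants: hours and minutes only, or hours only.
a <- ymd_hm("2016-07-22 13:04")
b <- ymd_h("2016-07-22 13")

tz(a)          # the time zone defaults to "UTC"

# The same clock time read in a named zone is a different instant.
# "America/New_York" is just an example zone name.
ny <- ymd_hm("2016-07-22 13:04", tz = "America/New_York")

# New York was 4 hours behind UTC on this date, so the gap is 4 hours.
difftime(ny, a, units = "hours")
```

The tz argument controls how the clock time in the string is interpreted; it does not convert an already parsed value.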

Parser functions

lubridate also includes three functions that parse datetimes using a format string, as as.POSIXct and strptime do:

| Function | Output class | Formatting strings accepted |
|---|---|---|
| parse_date_time | POSIXct | Flexible. Will accept strptime-style with % or lubridate datetime function name style, e.g. "ymd hms". Will accept a vector of orders for heterogeneous data and guess which is appropriate. |
| parse_date_time2 | Default POSIXct; if lt = TRUE, POSIXlt | Strict. Accepts only strptime tokens (with or without %) from a limited set. |
| fast_strptime | Default POSIXlt; if lt = FALSE, POSIXct | Strict. Accepts only %-delimited strptime tokens with delimiters (-, /, :, etc.) from a limited set. |

x <- c('2016-07-22 13:04:47', '07/22/2016 1:04:47 pm')

parse_date_time(x, orders = c('mdy Imsp', 'ymd hms'))
## [1] "2016-07-22 13:04:47 UTC" "2016-07-22 13:04:47 UTC"

x <- c('2016-07-22 13:04:47', '2016-07-22 14:47:58')

parse_date_time2(x, orders = 'Ymd HMS')
## [1] "2016-07-22 13:04:47 UTC" "2016-07-22 14:47:58 UTC"

fast_strptime(x, format = '%Y-%m-%d %H:%M:%S')
## [1] "2016-07-22 13:04:47 UTC" "2016-07-22 14:47:58 UTC"

parse_date_time2 and fast_strptime use a fast C parser for efficiency.
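
The output classes of the two fast parsers can be checked with a short sketch like the one below, which toggles the lt argument on each function. The variable names are chosen for this example only.

```r
library(lubridate)

x <- c("2016-07-22 13:04:47", "2016-07-22 14:47:58")
fmt <- "%Y-%m-%d %H:%M:%S"

# parse_date_time2: POSIXct by default, POSIXlt on request.
ct1 <- parse_date_time2(x, orders = "Ymd HMS")
lt1 <- parse_date_time2(x, orders = "Ymd HMS", lt = TRUE)

# fast_strptime: POSIXlt by default, POSIXct on request.
lt2 <- fast_strptime(x, format = fmt)
ct2 <- fast_strptime(x, format = fmt, lt = FALSE)

class(ct1); class(lt1); class(lt2); class(ct2)

# Both routes should agree on the parsed instants.
all(ct1 == ct2)
```

POSIXct is usually the more convenient class for data frame columns, which is one reason to pass lt = FALSE to fast_strptime.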

See ?parse_date_time for formatting tokens.