Reading dates and times in GAUSS

Introduction

Time series data with inconsistently formatted dates and times can make your work frustrating. Dates and times are often stored as strings or text data and converting to a consistent, numeric format might seem like a daunting task.

Fortunately, GAUSS includes an easy tool for loading and converting dates and times – the date keyword.

Using the date keyword

The date keyword is used with GAUSS's formula string syntax to indicate to GAUSS that a variable is a date. The rest of the work is done internally and GAUSS automatically converts dates and times from strings to the POSIX format.

Consider the file bike_count.xlsx which contains string dates in a fairly standard Month/Day/Year HH:MM:SS format:

Excel time and dates loaded with the GAUSS date keyword.

Two steps must be taken to load the dates using the GAUSS loadd function:

  • A $ in front of the variable name tells GAUSS that the variable is a string.
  • The date operator tells GAUSS that the variable contains dates.
data = loadd("bike_count.xlsx", "date($Time) + Count");

The POSIX date format

The date keyword converts string formatted dates into the POSIX date and time format. This format represents a date as a scalar number reflecting the seconds that have passed since midnight on January 1st, 1970.

Looking at the data we loaded earlier, the POSIX dates are stored in the first column:

Excel time and dates loaded with the GAUSS date keyword.

Date formats

One of the most convenient features of the date keyword is that it internally detects a wide variety of string date and time formats. For example, dates stored as

"10/31/2018 15:00:00"

or

"31-10-2018 15:00:00"

or

"2018-10-31 15:00:00"

can all be automatically detected and converted to POSIX time using the date keyword. A full list of acceptable date formats is found at the bottom of this blog.

Loading non-standard formats

Though GAUSS can recognize a large number of date and time formats, you still may come across non-standard dates and times outside of the scope of the date function.

Consider a file named temperature.csv with the following contents:

"day","Phoenix","Seattle"
"November-8-2018",81,51
"November-9-2018",79,45
"November-10-2018",79,50
"November-11-2018",24,50
"November-12-2018",66,59
"November-13-2018",65,56
"November-14-2018",70,55

The date format shown above is not one that GAUSS will automatically recognize. In these cases, it is helpful to write a custom GAUSS function which uses the GAUSS string to POSIX conversion function, strctoposix. The strctoposix function uses format specifiers to load custom dates.

Below is a GAUSS procedure which will take dates in the format shown above (i.e. November-8-2018) and convert it to POSIX time.

proc (1) = loadMyDates(str);
    local fmt;

    // %B - full month name
    // %e - day of month
    // %Y - 4 digit year
    fmt = "%B-%e-%Y";

    retp(strctoposix(str, fmt));
endp;

This function can now be used in place of the date keyword to load your dates as shown below.

temp_data = loadd("temperature.csv", "loadMyDates($day) + Phoenix + Seattle");

This will create a GAUSS matrix named temp_data with the contents shown below:

GAUSS matrix containing dates.

Conclusion

In this blog you've learned:

  • How to load date and time data using the date keyword.
  • What date formats are compatible with the date keyword.
  • What format the date keyword converts string dates to.
  • How to use a custom function to load non-standard dates.

Code and data from this blog can be found here.

Recognizable date formats

20181031
31-10-2018
10-31-2018
2018-10-31
10/31/2018
2018/10/31
31 October 2018
201810311830
20181031 1830
31-10-2018 18:30
2018-10-31 18:30
10/31/2018 18:30
2018/10/31 18:30
31 October 2018 18:30
20181031183000
20181031 183000
18:30:00
31-10-2018 18:30:00
2018-10-31 18:30:00
10/31/2018 18:30:00
2018/10/31 18:30:00
10/31/2018 18:30:00 PM
2018/10/31 18:30:00.000
10/31/2018 18:30:00.000
31 October 2018 18:30:00
2018-10-31T18:30
2018-10-31T18:30T
2018-10-31T18:30:00
2018-10-31T18:30:00
2018-10-31T18:30:00.000
2018-10-31T18:30:00.000
Leave a Reply