Reading dates and times in GAUSS

Introduction

Time series data with inconsistently formatted dates and times can make your work frustrating. Dates and times are often stored as strings or text data and converting to a consistent, numeric format might seem like a daunting task.

Fortunately, GAUSS includes easy tools for loading and converting dates and times.

Loading Dates Into GAUSS

One of the most convenient things about GAUSS dataframes, is that they allow you load different data types, including dates with little or no additional steps. Fore example, Many date formats are automatically identified by GAUSS, and will load using the GAUSS loadd function, without any additional steps.

Consider the file bike_count.xlsx which contains string dates in a fairly standard Month/Day/Year HH:MM:SS format:

Excel time and dates loaded with the GAUSS date keyword.

Because the date column is stored in a common format, GAUSS will automatically detect that it should be loaded as a date.

// Load all data from bike_count.xlsx
data = loadd("bike_count.xlsx");

// Check column types
getColTypes(data);

The column types confirm that the first column we've loaded is a date column:

            type
            date
          number

The GAUSS Date Variable

The GAUSS date variable allows us to display dates in recognizable form while storing related POSIX date and time format under the hood. This makes them easy to read, while keeping them compatible with GAUSS date and time functions.

For example, we could find the time difference in hours between the second and first observation of time.

// Find difference in hours
timediffposix(data[2, "Time"], data[1, "Time"], "hours")

This tells us that one hour has passed between the first observation and the second:

1

Loading Non-Standard Formats

Though GAUSS can recognize a large number of date and time formats, you may come across non-standard dates and times. These can still be loaded as date variables but you will need to use:

  • The date keyword to indicate that you are loading a date variable.
  • The correct BSD specifier format string to tell GAUSS the format of your dates.

Consider a file named temperature.csv with the following contents:

"day","Phoenix","Seattle"
"2018, November-08",81,51
"2018, November-09",79,45
"2018, November-10",79,50
"2018, November-11",24,50
"2018, November-12",66,59
"2018, November-13",65,56
"2018, November-14",70,55

The date format shown above is not one that GAUSS will automatically recognize but we can load it as a date variable using a BSD specifier.

The date format 2021, June-09 is described by the BSD specifier format string, %Y, %B-%d:

Original contentsDescriptionTypeFormat string contents
2021The year with century as a decimal number.Date%Y
,A comma.Literal,
JuneThe full month name.Date%B
-A dash.Literal-
31The day of month as a number (01-31).Date%d

There are a few helpful things to note about our BSD specifiers:

  • They are enclosed within single ticks.
  • Any literal symbol can be added into the specifiers, as we did with the comma and dash. For example, if our dates look like 2021 # May-31 our BSD specifiers should be %Y # %b-%d.

Now we can load our temperature data using loadd:

temp_data = loadd("temperature.csv", "date(day, '%Y, %B-%d') + Phoenix + Seattle");

Conclusion

In this blog you've learned:

  • How to load common date and time data using loadd.
  • What the GAUSS date variable is.
  • How to load non-standard dates using the date keyword.

Further Reading

1.Dates and Times Made Easy
2.What Is A GAUSS Dataframe And Why Should You Care?

Recognizable date formats

20181031
31-10-2018
10-31-2018
2018-10-31
10/31/2018
2018/10/31
31 October 2018
201810311830
20181031 1830
31-10-2018 18:30
2018-10-31 18:30
10/31/2018 18:30
2018/10/31 18:30
31 October 2018 18:30
20181031183000
20181031 183000
18:30:00
31-10-2018 18:30:00
2018-10-31 18:30:00
10/31/2018 18:30:00
2018/10/31 18:30:00
10/31/2018 18:30:00 PM
2018/10/31 18:30:00.000
10/31/2018 18:30:00.000
31 October 2018 18:30:00
2018-10-31T18:30
2018-10-31T18:30T
2018-10-31T18:30:00
2018-10-31T18:30:00
2018-10-31T18:30:00.000
2018-10-31T18:30:00.000
Leave a Reply