GAUSS 23

Introduction

GAUSS 23 is the most practical GAUSS yet. It is built to save you time on everyday research tasks such as finding, importing, and modeling data.


Data at Your Fingertips

Load data from DBnomics and FRED directly into GAUSS.

  • Access millions of global economic and financial data series with FRED and DBnomics integration.
  • Aggregate, filter, sort, and transform FRED data series during import.
  • Search FRED series from GAUSS.
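As a rough sketch of what the FRED integration can look like in practice, the snippet below searches for a series, loads it, and then loads it again at quarterly frequency. It assumes a FRED API key has already been configured for GAUSS. The search text, the UNRATE series ID, and the fred_set parameter values are illustrative choices, so check the fred_search, fred_load, and fred_set documentation for the options that are available.

```gauss
// Search FRED for series matching a text query
// (the query string is only an example)
results = fred_search("unemployment rate");
head(results);

// Load the monthly U.S. unemployment rate by its series ID
unemp = fred_load("UNRATE");

// Request quarterly data, aggregated with end-of-period values,
// during import (parameter names are assumed to follow the FRED API)
params = fred_set("frequency", "q", "aggregation_method", "eop");
unemp_q = fred_load("UNRATE", params);

head(unemp_q);
```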

Load Data from Anywhere on the Internet

// Load an Excel file from the Aptech website
file_url = "https://www.aptech.com/wp-content/uploads/2019/03/skincancer2.xlsx";
skin_cancer = loadd(file_url);

// Print the first 5 rows of the dataframe
head(skin_cancer);
       State       Lat       Mort      Ocean        Long
     Alabama        33        219          1          87
     Arizona      34.5        160          0         112
    Arkansas        35        170          0        92.5
  California      37.5        182          1       119.5
    Colorado        39        149          0       105.5

Simplified Data Loading with...

Automatic Type Detection

Previous versions required formula strings with keywords to specify date, string, and categorical variables from some file types.


Smart data type detection in GAUSS 23 figures out each variable's type so you do not have to specify it manually. It also automatically detects nearly 40 popular date formats.
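To make the difference concrete, here is a sketch using a hypothetical file, sales.csv, with hypothetical columns order_date, customer, region, and amount. The first call uses the formula string keywords (date, str, cat) that earlier versions needed for some file types. The second relies on automatic detection in GAUSS 23.

```gauss
// Hypothetical file and variable names, for illustration only
fname = "sales.csv";

// Earlier versions: declare each non-numeric type with a keyword
sales = loadd(fname, "date($order_date) + str(customer) + cat(region) + amount");

// GAUSS 23: variable types are detected automatically
sales = loadd(fname);
```

Formula strings remain useful when you want to load only a subset of the variables in a file.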

Automatic Header and Delimiter Detection

Replace old code like this:

load X[127,4] = mydata.txt;

with

X = loadd("mydata.txt");

Automatically handles

  • Present or absent header row.
  • Delimiter (tab, comma, semicolon, or space).
  • Number of rows and columns.
  • Variable types.

First-Class Dataframe Storage


There is no new code to learn: just use the .gdat file extension with loadd and saved to load and store your dataframes.
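A minimal round trip might look like the sketch below. It reuses the skin cancer data loaded earlier in this article. The output file name is an arbitrary choice, and the reloaded dataframe is expected, rather than guaranteed here, to keep its variable names and types.

```gauss
// Load the example data used earlier in this article
file_url = "https://www.aptech.com/wp-content/uploads/2019/03/skincancer2.xlsx";
skin_cancer = loadd(file_url);

// Save the dataframe; the .gdat extension selects the storage format
// (the file name is an arbitrary choice)
call saved(skin_cancer, "skin_cancer.gdat");

// Reload the dataframe later from the .gdat file
skin_cancer2 = loadd("skin_cancer.gdat");
head(skin_cancer2);
```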




Expanded Quantile Regressions


hitters = loadd("islr_hitters.xlsx"); 

tau = 0.90;

call quantileFit(hitters, "ln(salary) ~ AtBat + Hits + HmRun", tau);
Linear quantile regression

===============================================================================
Valid cases:                 263            Dependent variable:     ln_salary_
Missing cases:                 0               Deletion method:           None
Number variables:              3                       DF model              3
DF residuals                 259
===============================================================================
Name           Coeff.   Standard    t-value     P >|t|         lb         ub
                           Error
-------------------------------------------------------------------------------
Tau = 0.90
CONSTANT        6.285      0.194     32.433     0.0000      5.905      6.664
AtBat          -0.001      0.002     -0.737     0.4621     -0.004      0.002
Hits            0.008      0.005      1.526     0.1281     -0.002      0.018
HmRun           0.017      0.009      1.951     0.0521     -0.000      0.034

  • New kernel estimated variance-covariance matrix.
  • Up to 4x speed improvement.
  • Expanded model diagnostics including pseudo R-squared, coefficient t-statistics and p-values, and degrees of freedom.

Kernel Density Estimations

  • Estimate unknown probability functions with 13 available kernels.
  • Automatic or user-specified bandwidth.
  • Kernel density plots with easy-to-use options for customization.


Improved Covariance Computations

// Load data
fname = getGAUSShome("examples/auto2.dta");
auto = loadd(fname);

// Declare control structure
struct olsmtControl ctl;
ctl = olsmtControlCreate();

// Turn on residuals
ctl.res = 1;

// Turn on HAC errors
ctl.cov = "hac";

call olsmt(auto, "mpg ~ weight + foreign", ctl); 
Valid cases:                    74      Dependent variable:                 mpg
Missing cases:                   0      Deletion method:                   None
Total SS:                 2443.459      Degrees of freedom:                  71
R-squared:                   0.663      Rbar-squared:                     0.653
Residual SS:               824.172      Std error of est:                 3.407
F(2,71):                    69.748      Probability of F:                 0.000
Durbin-Watson:               2.421

                                  Std                 Prob      Std    Cor with
Variable            Estimate     Error     t-value    >|t|      Est    Dep Var
-------------------------------------------------------------------------------

CONSTANT             41.6797    1.8989     21.95     0.000      ---         ---
weight              -0.00659    0.0006    -11.99     0.000   -0.885   -0.807175
foreign: Foreign    -1.65003    0.9071    -1.819     0.073   -0.131    0.393397

Note: HAC robust standard errors reported
  • New procedure for computing Newey-West HAC robust standard errors.
  • All robust covariance procedures now include the option to turn off small sample corrections.
  • Expanded dataframe and formula string compatibility.

New Functions for Data Cleaning and Exploration

between

Returns a binary vector indicating which observations fall in a specified range. It can be used with selif to select rows. Dates and ordinal categorical columns are supported.

// Return a 1 if the observation is between the listed dates
match = between(unemp[.,"DATE"], "2020-03", "2020-08");

// Select the matching observations
unemp = selif(unemp, match);
            DATE        UNRATE
      2020-03-01        4.4000
      2020-04-01        14.700
      2020-05-01        13.200
      2020-06-01        11.000
      2020-07-01        10.200
      2020-08-01        8.4000

where

Provides a convenient and intuitive way to combine or modify data. It returns elements from either a or b depending upon condition.

// Daily hotel room price
hotel_price = { 238, 405, 405, 329, 238 };

// Daily temperature forecast
temperature = { 89, 94, 110, 103, 97 };

// Decrease the price by 10% if the
// temperature will be more than 100 degrees
new_price = where(temperature .> 100,
                hotel_price .* 0.9,
                hotel_price);
new_price = 238 405 364.50 296.10 238
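The two functions also compose. As a sketch under an invented scenario (the 5% surcharge and the 90 to 100 degree range are assumptions made only for illustration), between can build the condition that where then acts on:

```gauss
// Daily hotel room price
hotel_price = { 238, 405, 405, 329, 238 };

// Daily temperature forecast
temperature = { 89, 94, 110, 103, 97 };

// 1 where the forecast falls in the 90 to 100 degree range, 0 otherwise
mild = between(temperature, 90, 100);

// Hypothetical rule: add a 5% surcharge on the mild days,
// leave the price unchanged on all other days
new_price = where(mild, hotel_price .* 1.05, hotel_price);

print new_price;
```

Here only the second and fifth days (94 and 97 degrees) fall in the range, so only those two prices change.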

Speed-ups and Efficiency Improvements

  • Up to 10x speed-up and 50% decrease in memory usage for lag creation with shiftc and lagn.
  • Up to 2x speed-up (or more for large data) and 50% decrease in memory usage for miss, missrv.
  • Up to 2x speed-up (or more for large data) and 50% decrease in memory usage for element-by-element mathematical (+, -, .*, ./), relational (.>, .<, .>=, .<=, .==, .!=) and logical (.and, .not, .or, .xor) operators.
  • Up to 100x speed-up for some cases with indsav.
  • Up to 40% speed-up for reclassify.
  • Up to 3x speed-up for loading Excel® files with loadd and the Data Import Window.
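The size of these gains depends on your data and hardware. One simple way to check them on your own machine is to time an operation with hsec, as in the sketch below. The simulated data and its size are arbitrary assumptions, and running the same lines in an earlier GAUSS version gives a basis for comparison.

```gauss
// Simulated data; the size is an arbitrary choice
x = rndn(1e6, 1);

// hsec returns the time since midnight in hundredths of a second
t0 = hsec;
x_lag = lagn(x, 1);
elapsed = (hsec - t0) / 100;

print "Seconds to create one lag:" elapsed;
```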

Conclusion

For a complete list of everything GAUSS 23 offers, please see the full changelog.



 