GAUSS 23 | Aptech

by aptech · Published December 7, 2022 · Updated December 7, 2022

Introduction

GAUSS 23 is the most practical GAUSS release yet! It is built to save you time on everyday research tasks such as finding, importing, and modeling data.

Data at Your Fingertips

Access millions of global economic and financial data series with FRED and DBnomics integration.
Aggregate, filter, sort, and transform FRED data series during import.
Search FRED series from GAUSS.
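
As a rough sketch of how the FRED workflow above might look in code: the procedure names fred_search, fred_load, and fred_set, and the option names passed to fred_set, are written from memory of the GAUSS FRED interface and should be checked against the command reference. The example also assumes a valid FRED API key has already been configured for GAUSS.

```gauss
// ASSUMPTION: a FRED API key is already configured for GAUSS.
// ASSUMPTION: procedure and option names below follow the
// GAUSS FRED interface as recalled; verify before relying on them.

// Search FRED for series matching a text query
results = fred_search("unemployment rate");
head(results);

// Load the full monthly unemployment rate series
unrate = fred_load("UNRATE");

// Ask FRED to aggregate to annual averages during import
params = fred_set("frequency", "a", "aggregation_method", "avg");
unrate_annual = fred_load("UNRATE", params);

// Print the last few rows of the aggregated dataframe
tail(unrate_annual);
```

Doing the aggregation at import time means the annual series arrives ready to use, with no separate resampling step in your program.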

Load Data from Anywhere on the Internet


// Load an Excel file from the aptech website
file_url = "https://www.aptech.com/wp-content/uploads/2019/03/skincancer2.xlsx";
skin_cancer = loadd(file_url);
 
// Print the first 5 rows of the dataframe
head(skin_cancer);

       State       Lat       Mort      Ocean        Long
     Alabama        33        219          1          87
     Arizona      34.5        160          0         112
    Arkansas        35        170          0        92.5
  California      37.5        182          1       119.5
    Colorado        39        149          0       105.5

Simplified Data Loading with...

Automatic Type Detection

Previous versions required formula strings with keywords to specify date, string, and categorical variables from some file types.

Smart data type detection in GAUSS 23 figures out the variable type so you do not have to specify it manually. It also automatically detects nearly 40 popular date formats.
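
To make the difference concrete, here is a minimal sketch. The file sales.csv and its column names are hypothetical, and the keyword syntax in the first call reflects the older formula-string style as best recalled, so treat it as illustrative rather than exact.

```gauss
// HYPOTHETICAL file and column names, for illustration only.

// Older style: keywords in the formula string tell loadd that
// order_date is a date, region is categorical, customer is a string
sales_old = loadd("sales.csv", "date($order_date) + cat(region) + str(customer) + amount");

// GAUSS 23 style: let loadd detect the variable types
sales = loadd("sales.csv");

// Check the result
head(sales);
```

Formula strings are still useful when you want to load only a subset of columns or override a detected type.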

Automatic Header and Delimiter Detection

Replace old code like this:


load X[127,4] = mydata.txt;

with


X = loadd("mydata.txt");

loadd automatically handles:

Present or absent header row.
Delimiter (tab, comma, semi-colon or space).
Number of rows and columns.
Variable types.
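
If you want to confirm what was detected, a quick check along these lines may help. It reuses the mydata.txt file name from the example above and assumes the file is in your working directory.

```gauss
// Load the file and let loadd work out the layout
X = loadd("mydata.txt");

// Dimensions that previously had to be hard-coded as X[127,4]
print "Rows:    " rows(X);
print "Columns: " cols(X);

// Variable names taken from the header row
// (or generated if no header was found)
print getColNames(X);

// Preview the first few rows
head(X);
```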

First-Class Dataframe Storage

There is no new code to learn. Just use the .gdat file extension with loadd and saved to load and store your dataframes.
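
A minimal round trip might look like the sketch below. It uses the auto2.dta example file that ships with GAUSS; the output file name auto2.gdat is arbitrary, and calling saved with only two inputs assumes the variable-names argument is optional for dataframes.

```gauss
// Load an example dataset into a dataframe
fname = getGAUSShome("examples/auto2.dta");
auto = loadd(fname);

// Store the dataframe in the working directory
// ASSUMPTION: the variable-names input can be omitted for dataframes
call saved(auto, "auto2.gdat");

// Load it back and preview it
auto_copy = loadd("auto2.gdat");
head(auto_copy);
```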

Expanded Quantile Regressions


hitters = loadd("islr_hitters.xlsx"); 
 
tau = 0.90;
 
call quantileFit(hitters, "ln(salary) ~ AtBat + Hits + HmRun", tau);

Linear quantile regression

===============================================================================
Valid cases:                 263            Dependent variable:     ln_salary_
Missing cases:                 0               Deletion method:           None
Number variables:              3                       DF model              3
DF residuals                 259

===============================================================================

                 Name    Coeff.  Standard   t-value  P >|t|      lb       ub
                                  Error

-------------------------------------------------------------------------------
Tau = 0.90


              CONSTANT    6.285    0.194   32.433    0.0000    5.905    6.664
                 AtBat   -0.001    0.002   -0.737    0.4621   -0.004    0.002
                  Hits    0.008    0.005    1.526    0.1281   -0.002    0.018
                 HmRun    0.017    0.009    1.951    0.0521   -0.000    0.034

New kernel-estimated variance-covariance matrix.
Up to 4x speed improvement.
Expanded model diagnostics including pseudo R-squared, coefficient t-statistics and p-values, and degrees of freedom.
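
For intuition about what tau controls, the sketch below minimizes the check (pinball) loss over a constant, which is the simplest case of the problem a quantile regression solves. It is an illustration on simulated data, not the algorithm quantileFit uses internally.

```gauss
// Simulated data, for illustration only
rndseed 2023;
y = rndn(1000, 1);
tau = 0.90;

// Candidate constants from -3 to 3
q_grid = seqa(-3, 0.01, 601);

// Residuals for every candidate: a 1000 x 601 matrix
u = y - q_grid';

// Check loss summed over observations, one value per candidate
loss = sumc(u .* (tau - (u .< 0)));

// The minimizer should sit close to the 0.90 sample quantile
q_hat = q_grid[minindc(loss)];
print "Check-loss minimizer: " q_hat;
print "Sample 0.90 quantile: " quantile(y, tau);
```

With tau = 0.90, positive residuals are penalized nine times as heavily as negative ones, which pushes the fitted value up toward the 90th percentile.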

Kernel Density Estimations

Estimate unknown probability density functions with 13 available kernels.
Automatic or user-specified bandwidth.
Kernel density plots with easy-to-use options for customization.
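
To show what a kernel density estimate computes, here is a hand-rolled Gaussian version on simulated data. It is a sketch of the technique, not the new GAUSS 23 procedure, and the rule-of-thumb bandwidth is one common choice among several.

```gauss
// Simulated sample, for illustration only
rndseed 8675;
x = rndn(500, 1);
n = rows(x);

// Rule-of-thumb bandwidth for a Gaussian kernel
h = 1.06 * stdc(x) * n^(-1/5);

// Evaluation points from -4 to 4
x_grid = seqa(-4, 0.1, 81);

// Scaled distances between every observation and every
// evaluation point: a 500 x 81 matrix
u = (x_grid' - x) ./ h;

// Average the kernel weights over observations: 81 x 1
f_hat = meanc(pdfn(u)) ./ h;

// Plot the estimated density
plotXY(x_grid, f_hat);
```

A smaller h gives a wigglier estimate and a larger h a smoother one, which is why both automatic and user-specified bandwidths are useful.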

Improved Covariance Computations


// Load data
fname = getGAUSShome("examples/auto2.dta");
auto = loadd(fname);
 
// Declare control structure
struct olsmtControl ctl;
ctl = olsmtControlCreate();
 
// Turn on residuals
ctl.res = 1;
 
// Turn on HAC errors
ctl.cov = "hac";
 
call olsmt(auto, "mpg ~ weight + foreign", ctl);

Valid cases:                    74      Dependent variable:                 mpg
Missing cases:                   0      Deletion method:                   None
Total SS:                 2443.459      Degrees of freedom:                  71
R-squared:                   0.663      Rbar-squared:                     0.653
Residual SS:               824.172      Std error of est:                 3.407
F(2,71):                    69.748      Probability of F:                 0.000
Durbin-Watson:               2.421

                                  Std                 Prob      Std    Cor with
Variable            Estimate     Error     t-value    >|t|      Est    Dep Var
-------------------------------------------------------------------------------

CONSTANT             41.6797    1.8989     21.95     0.000      ---         ---
weight              -0.00659    0.0006    -11.99     0.000   -0.885   -0.807175
foreign: Foreign    -1.65003    0.9071    -1.819     0.073   -0.131    0.393397

Note: HAC robust standard errors reported

New procedure for computing Newey-West HAC robust standard errors.
All robust covariance procedures now include the option to turn off small sample corrections.
Expanded dataframe and formula string compatibility.

New Functions for Data Cleaning and Exploration

between

Returns a binary vector indicating which observations fall in a specified range. It can be used with selif to select rows. Dates and ordinal categorical columns are supported.


// Return a 1 if the observation is between the listed dates
match = between(unemp[.,"DATE"], "2020-03", "2020-08");
 
// Select the matching observations
unemp = selif(unemp, match);

            DATE        UNRATE
      2020-03-01        4.4000
      2020-04-01        14.700
      2020-05-01        13.200
      2020-06-01        11.000
      2020-07-01        10.200
      2020-08-01        8.4000
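
For numeric data, the same selection can be written with relational operators. The sketch below assumes between treats both bounds as inclusive, as the date output above suggests; the values are made up for illustration.

```gauss
// Illustrative numeric data
x = { 1, 3, 5, 7, 9 };
lower = 3;
upper = 7;

// Range mask written out by hand: 0, 1, 1, 1, 0
mask_manual = (x .>= lower) .and (x .<= upper);

// The same mask using between
// ASSUMPTION: both bounds are inclusive
mask = between(x, lower, upper);

// Keep only the matching rows: 3, 5, 7
x_sel = selif(x, mask_manual);
print x_sel;
```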

where

Provides a convenient and intuitive way to combine or modify data. Called as where(condition, a, b), it returns elements from a where condition is true and from b otherwise.


// Daily hotel room price
hotel_price = { 238, 405, 405, 329, 238 };
 
// Daily temperature forecast
temperature = { 89, 94, 110, 103, 97 };
 
// Decrease the price by 10% if the
// temperature will be more than 100 degrees
new_price = where(temperature .> 100,
                hotel_price .* 0.9,
                hotel_price);

new_price = 238 405 364.50 296.10 238
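
For comparison, the same result can be built with a binary mask and arithmetic, which is roughly what you might have written before where was available.

```gauss
// Daily hotel room price
hotel_price = { 238, 405, 405, 329, 238 };

// Daily temperature forecast
temperature = { 89, 94, 110, 103, 97 };

// 1 where the forecast exceeds 100 degrees, 0 otherwise
hot = temperature .> 100;

// Blend the discounted and original prices with the mask
new_price_manual = hot .* (hotel_price .* 0.9) + (1 - hot) .* hotel_price;

// 238, 405, 364.5, 296.1, 238
print new_price_manual;
```

The mask version evaluates both branches for every row, so a missing value in either input propagates into the result even where that branch is not selected.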

Explore sample symmetry and tails with skewness and kurtosis functions.
Test for normality using the new JarqueBera function.
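
As a reference for what these statistics measure, the sketch below computes moment-based skewness, kurtosis, and the Jarque-Bera statistic by hand on simulated data. The built-in functions may use different conventions (for example, sample corrections or excess kurtosis), so small differences from this version would not be surprising.

```gauss
// Simulated sample, for illustration only
rndseed 1234;
x = rndn(1000, 1);
n = rows(x);

// Central moments
d = x - meanc(x);
m2 = meanc(d^2);
m3 = meanc(d^3);
m4 = meanc(d^4);

// Moment-based skewness and (non-excess) kurtosis
skew_x = m3 / m2^1.5;
kurt_x = m4 / m2^2;

// Jarque-Bera statistic and its chi-square(2) p-value
jb_stat = n/6 * (skew_x^2 + (kurt_x - 3)^2 / 4);
p_val = cdfChic(jb_stat, 2);

print "Skewness:    " skew_x;
print "Kurtosis:    " kurt_x;
print "Jarque-Bera: " jb_stat;
print "p-value:     " p_val;
```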

Speed-ups and Efficiency Improvements

Up to 10x speed-up and 50% decrease in memory usage for lag creation with shiftc and lagn.
Up to 2x speed-up (or more for large data) and 50% decrease in memory usage for miss and missrv.
Up to 2x speed-up (or more for large data) and 50% decrease in memory usage for element-by-element mathematical (+, -, .*, ./), relational (.>, .<, .>=, .<=, .==, .!=) and logical (.and, .not, .or, .xor) operators.
Up to 100x speed-up for some cases with indsav.
Up to 40% speed-up for reclassify.
Up to 3x speed-up for loading Excel® files with loadd and the Data Import Window.
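
If some of these functions are new to you, the short sketch below shows what lagn, missrv, and miss do on a small vector. It illustrates behavior only and makes no claim about timings.

```gauss
// Small example series
x = { 1, 2, 3, 4, 5 };

// Lag by one period: missing, 1, 2, 3, 4
x_lag1 = lagn(x, 1);

// Replace missing values with 0: 0, 1, 2, 3, 4
x_zero = missrv(x_lag1, 0);

// Convert zeros back to missing values: missing, 1, 2, 3, 4
x_miss = miss(x_zero, 0);

// Print the three versions side by side
print x_lag1~x_zero~x_miss;
```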

Conclusion

For a complete list of everything GAUSS 23 offers, please see the changelog.

