Recent Posts

Classification with Regularized Logistic Regression

Logistic regression has long been a popular tool for modeling categorical outcomes. It’s widely used across fields like epidemiology, finance, and econometrics. In today’s blog, we’ll look at the fundamentals of logistic regression. Using real-world survey data, we’ll provide a step-by-step guide to implementing your own regularized logistic regression models with the GAUSS Machine Learning library, including:
  1. Data preparation.
  2. Model fitting.
  3. Classification predictions.
  4. Evaluating predictions and model fit.

Machine Learning With Real-World Data

If you’ve ever done empirical work, you know that real-world data rarely, if ever, arrives clean and ready for modeling. No data analysis project consists solely of fitting a model and making predictions. In today’s blog, we walk through a machine learning project from start to finish. We’ll give you a foundation for completing your own machine learning project in GAUSS, working through:
  • Data exploration and cleaning.
  • Splitting data for training and testing.
  • Model fitting and prediction.

Understanding Cross-Validation

If you’ve explored machine learning models, you’ve most likely encountered the term “cross-validation” at some point. Cross-validation is an important step for training robust and reliable machine learning models. In this blog, we’ll break cross-validation into simple terms. Using a practical demonstration, we’ll equip you with the knowledge to confidently use cross-validation in your machine learning projects.

Fundamentals of Tuning Machine Learning Hyperparameters

Machine learning algorithms often rely on hyperparameters that can impact the performance of the models. These hyperparameters are external to the data and are part of the modeling choices that practitioners must make. An important step in machine learning modeling is optimizing model hyperparameters to improve prediction accuracy. In today’s blog, we will cover some fundamentals of parameter tuning and will look more specifically at fine-tuning our previous decision forest model.

Managing String Data with GAUSS Dataframes

Working with strings hasn’t always been easy in GAUSS. In the past, the only option was to store strings separately from numeric data, which made it difficult to work with datasets that contained mixed types. With the introduction of GAUSS dataframes in GAUSS 21 and the enhanced string capabilities of GAUSS 23, that has all changed! I would argue that GAUSS now offers one of the best environments for managing and cleaning mixed-type data. I recently used GAUSS to perform the very practical task of creating an email list from a string-heavy dataset, something I never would have chosen GAUSS for in the past. In this blog, we walk through this data cleaning task, highlighting several key features for handling strings.

Applications of Principal Components Analysis in Finance

Principal components analysis (PCA) is a useful tool that can help practitioners streamline data while retaining most of its information. In today’s blog, we’ll examine the use of principal components analysis in finance using an empirical example. We’ll look more closely at:
  • What PCA is.
  • How PCA works.
  • How to use the GAUSS Machine Learning library to perform PCA.
  • How to interpret PCA results.

Predicting Recessions with Machine Learning Techniques

Forecasts have become a valuable commodity in today’s data-driven world. Unfortunately, not all forecasting models are of equal caliber, and incorrect predictions can lead to costly decisions. Today we will compare the performance of several models for predicting recessions. In particular, we’ll look at how a traditional baseline econometric model compares to machine learning models. Our models will include:
  • A baseline probit model.
  • K-nearest neighbors.
  • Decision forests.
  • Ridge classification.

The Fundamentals of Kernel Density Estimation

Today’s blog looks closely at the fundamentals of kernel density estimation. After reading this blog you should have an understanding of:
  • What kernel density estimation is.
  • How kernel density estimation works.
  • How to perform kernel density estimation in GAUSS.

Importing FRED Data to GAUSS

The GAUSS FRED database integration, introduced in GAUSS 23, is a time-saving feature that allows you to import FRED data directly into GAUSS. This means you have thousands of datasets at your fingertips without ever leaving GAUSS. These tools also ensure that FRED data is imported directly into a GAUSS dataframe format, which can eliminate hours of data cleaning and the headaches that come with it. In today’s blog, we will learn how to use the FRED import tools to:
  • Search for a FRED data series.
  • Import FRED data to GAUSS, including merging multiple series.
  • Use advanced import tools to perform data transformations.
