Hi there,
How can I apply the "meanc" function while ignoring missing values or other specific values?
Thank you
2 Answers
There are three ways to handle missing values, depending on what you want to do.
The first option is to remove those rows with the packr (stands for "pack rows") function, like this:
x = { 1, 3, ., 6, 2, ., 2 };
// Remove missing values
x = packr(x);
// Compute mean of remaining rows
x_mu = meanc(x);
The second option is to replace the missing values with another number. Here we will replace them with zero.
x = { 1, 3, ., 6, 2, ., 2 };
// Replace missing values with zero
x = missrv(x,0);
// Compute mean
x_mu = meanc(x);
The third option is to impute the missing values with the GAUSS impute function. Here are a couple of examples.
x = { 1, 3, ., 6, 2, ., 2 };
// Impute with the mean of the other values (default)
x1 = impute(x);
// Compute mean
x_mu_1 = meanc(x1);
// Impute with the most common value
x2 = impute(x, "mode");
// Compute mean
x_mu_2 = meanc(x2);
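The question also mentions ignoring specific values. One possible approach, sketched here under the assumption that the unwanted value is a sentinel such as -999 (a made-up example, not something from the question), is to convert that value to a missing value with miss and then drop it with packr as in the first option:

```gauss
// Hypothetical data: -999 is an assumed sentinel meaning "no observation"
x = { 1, 3, -999, 6, 2, -999, 2 };

// Convert every -999 to the GAUSS missing value code
x = miss(x, -999);

// Remove the rows that are now missing, then average the rest
x_mu = meanc(packr(x));
```

With this data the result should match the first option, since the same five values remain after the sentinel rows are removed.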
The second and third options differ from the first in the number of elements you divide by: they keep all of the elements in the vector, while the first divides only by the number of non-missing elements. To me the first seems the most appropriate, since you divide by the number of values that are actually observed.
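To make the difference in divisors concrete, here is a small sketch using the same vector as in the answer above. The variable names mu_packr and mu_zero are just illustrative:

```gauss
x = { 1, 3, ., 6, 2, ., 2 };

// Option 1: drop the missing rows, so the sum 14 is divided by 5
mu_packr = meanc(packr(x));

// Option 2: replace missing values with zero, so the sum 14 is divided by 7
mu_zero = meanc(missrv(x, 0));

print "mean after packr:       " mu_packr;
print "mean after missrv(x,0): " mu_zero;
```

The first mean works out to 14 / 5 = 2.8 and the second to 14 / 7 = 2, so replacing missing values with zero pulls the mean down.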