Hi there,
I'm trying to write code that returns which rows of a column vector match a certain value. Consider the following example:
x = { 10 10 20 20 30 40 30 20 20 10 20 30 30 20 40 };
x = x';
my_rows = find_rows(40,x);
And I would expect that the new variable "my_rows" would contain the values 6 and 15 (because the number 40 was in rows 6 and 15). Does this find/index function exist?
I saw that the function indnv gets close, but it only returns the first index value: 6.
Is there a way to find this quickly without having to build a for-loop with an if statement?
I ask because my dataset is quite large and even though it's a simple operation, the size of my original dataset (230,000 obs) makes it take quite a while to process.
Thank you!
4 Answers
accepted
The function indexcat does exactly what you are looking for.
// Use commas to create a column vector
x = { 10, 10, 20, 20, 30, 40, 30, 20, 20, 10, 20, 30, 30, 20, 40 };
// Find the indices of all rows containing 40
my_rows = indexcat(x, 40);
This will set my_rows equal to a 2x1 vector containing 6 and 15.
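If you'd like a cross-check that doesn't rely on indexcat, here is a minimal sketch of the same lookup using a 0/1 mask and selif. The variable name mask is just for illustration, and I'm assuming x is a column vector as above.

```
// Column vector from the example above
x = { 10, 10, 20, 20, 30, 40, 30, 20, 20, 10, 20, 30, 30, 20, 40 };

// 1 where x equals 40, 0 elsewhere
mask = x .== 40;

// Keep only the row numbers whose mask entry is 1
my_rows = selif(seqa(1, 1, rows(x)), mask);

print my_rows;
```

For this x, my_rows should again hold 6 and 15. Both versions are vectorized, so neither needs a for-loop.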
Thank you!
And is there a slightly different version, where I find all rows that are NOT equal to a certain value?
So, in the example I gave, it would look like this:
x = { 10, 10, 20, 20, 30, 40, 30, 20, 20, 10, 20, 30, 30, 20, 40 };
my_rows = indexcat(x,40);
other_rows = opposite_indexcat(x,40);
And then my_rows contains {6, 15} while other_rows contains {1,2,3,4,5,7,8,9,10,11,12,13,14}.
Does this exist too?
Thanks again!
I don't believe there is a function that is the exact opposite of indexcat, but you can find the rows that do not match a certain value by combining a couple of GAUSS functions.
Method 1
// Create a column vector
x = { 10, 10, 20, 20, 30, 40, 30, 20, 20, 10, 20, 30, 30, 20, 40 };
// Create the sequence 1, 2, 3...rows(x)
idx = seqa(1, 1, rows(x));
// Remove the rows of 'idx' which correspond
// to the rows of 'x' that equal 40
// Note that the `.==` operator will return a vector
// of 0's and 1's
idx_2 = delif(idx, x .== 40);
Method 2
// Create a column vector
x = { 10, 10, 20, 20, 30, 40, 30, 20, 20, 10, 20, 30, 30, 20, 40 };
// 1. Find the indices of 'x' which equal 40.
// 2. Remove the rows found in step 1, from
// the sequence 1, 2, 3...rows(x)
idx_2 = delrows(seqa(1, 1, rows(x)), indexcat(x, 40));
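If you need this in several places, one option is to wrap Method 1 in a small procedure. This is only a sketch: opposite_indexcat is the hypothetical name from your question, not a built-in GAUSS function, and it assumes x is a column vector and v is a scalar.

```
// Hypothetical helper (not part of GAUSS): returns the indices
// of the rows of 'x' that are NOT equal to the scalar 'v'
proc (1) = opposite_indexcat(x, v);
    local idx;

    // Create the sequence 1, 2, 3...rows(x)
    idx = seqa(1, 1, rows(x));

    // Delete the row numbers where 'x' equals 'v'
    retp(delif(idx, x .== v));
endp;

x = { 10, 10, 20, 20, 30, 40, 30, 20, 20, 10, 20, 30, 30, 20, 40 };
other_rows = opposite_indexcat(x, 40);
```

With the example data, other_rows should contain every row number except 6 and 15.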
Awesome, thank you! This is much faster than my original for-loop =)