Hi,
Can anyone help me with this?
I have an NxN correlation matrix and I need to find all index vectors of length 3 or more in which every pair of variables is correlated, i.e. correlation(a,b) > c0 for every pair (a,b) in the vector.
For example, assume we have the following table of correlated pairs:
variable index variable index
4.0000000 5.0000000 (4 and 5 with correlation > c0)
4.0000000 6.0000000 (4 and 6 with correlation > c0)
4.0000000 9.0000000 (4 and 9 with correlation > c0)
5.0000000 9.0000000 (5 and 9 with correlation > c0)
5.0000000 10.000000 (5 and 10 with correlation > c0)
6.0000000 9.0000000 (6 and 9 with correlation > c0)
Assume we want index vectors of length greater than or equal to 3.
Then the index vectors are
index_vec1 = 4|6|9 and
index_vec2 = 4|5|9.
Notice that 5 and 6 do not have correlation > c0, so they are not "correlated" and cannot appear in the same index vector.
Can anyone help?
Thanks,
T.
2 Answers
1
accepted
Here is a start. I'll see if I can find some time later to make it loop over the vector to find all of the index vectors rather than just one of them.
c = { 4 5, 4 6, 4 9, 5 9, 5 10, 6 9 };

//Sort by first column, then secondarily by second column
c = sortmc(c, 1|2);

//Grab first variable
var_1 = c[1,1];

//Grab first correlating variable
var_2 = c[1,2];

//Select rows of 'var_1' except for
//first row which references 'var_2'
c_1 = selif(c, 0|(c[2:rows(c),1] .== var_1));

//Remove observations of 'var_1'
c = delif(c, c[.,1] .== var_1);

//Create vector of 'var_2's correlations
c_2 = selif(c, (c[.,1] .== var_2));

//Find variables with which 'var_1'
//and 'var_2' correlate
idx_1 = selif(c_1[.,2], sumr(c_1[.,2] .== c_2[.,2]'));

//Add 'var_1' and 'var_2' to the list
//of shared correlations
idx_1 = var_1 | var_2 | idx_1;

print "idx_1 = " idx_1;
0
I think this is similar to the maximal clique problem:
http://en.wikipedia.org/wiki/Clique_problem
Is there GAUSS code for this?
Thanks
T.
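Following up on the clique framing: if each variable is a node and each pair with correlation > c0 is an edge, then the index vectors asked for are the maximal cliques with at least 3 members. Below is a minimal sketch of the basic Bron-Kerbosch enumeration (no pivoting). It is written in Python rather than GAUSS, so it only illustrates the logic and would still need to be ported. The names maximal_cliques, expand and min_size are made up for this example. Note that the running time can grow exponentially with the number of variables in the worst case.

```python
from collections import defaultdict


def maximal_cliques(pairs, min_size=3):
    """Return all maximal cliques with at least min_size members.

    pairs: iterable of (a, b) index pairs whose correlation exceeds c0.
    This is the basic Bron-Kerbosch recursion, without pivoting.
    """
    # Build an undirected adjacency map from the list of correlated pairs.
    adj = defaultdict(set)
    for a, b in pairs:
        adj[a].add(b)
        adj[b].add(a)

    found = []

    def expand(r, p, x):
        # r: current clique, p: candidates that could extend it,
        # x: nodes already handled (used to reject non-maximal cliques).
        if not p and not x:
            if len(r) >= min_size:
                found.append(sorted(r))
            return
        for v in sorted(p):
            expand(r | {v}, p & adj[v], x & adj[v])
            p = p - {v}
            x = x | {v}

    expand(set(), set(adj), set())
    return sorted(found)


# Pairs from the example table in the question.
pairs = [(4, 5), (4, 6), (4, 9), (5, 9), (5, 10), (6, 9)]
cliques = maximal_cliques(pairs)
print(cliques)  # [[4, 5, 9], [4, 6, 9]]
```

For the example table this gives 4|5|9 and 4|6|9, matching the expected index vectors. The pair (5, 10) is dropped because it cannot be extended to three mutually correlated variables.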