I am using randomForest in R to identify which of two classes each event falls into.
Each event has 46 features.
As a sanity check, I hand the routine a data set containing one copy of 1000 events labeled class 1 and a second copy of the same events labeled class 2, so the feature values are identical in the two classes. I expect it to fail in some way, but instead it classifies every single event correctly. It also shows reasonable "importance" values for each of the 46 features. Only those 46 feature variables appear in the "importance" list, so it is apparently not using the class value of each event as a predictor.
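A minimal sketch of this kind of check, assuming random numbers as stand-in features (the real data and the exact call are not shown here). The names X, dat, rf, and resub are hypothetical, and ntree = 100 is chosen only to keep the run short. It prints both the out-of-bag confusion matrix and the predictions for the training rows themselves, since comparing the two may help narrow down where the reported accuracy comes from.

```r
library(randomForest)

set.seed(1)
n_events   <- 1000
n_features <- 46

# Assumption: random numeric features stand in for the real events.
X <- as.data.frame(matrix(rnorm(n_events * n_features), nrow = n_events))

# Two identical copies of the features, one labeled class 1, one class 2.
dat <- rbind(X, X)
dat$class <- factor(rep(c(1, 2), each = n_events))

# Fit the forest; the class label is the response, not a predictor.
rf <- randomForest(class ~ ., data = dat, ntree = 100)

# Out-of-bag confusion matrix stored in the fitted object.
print(rf$confusion)

# Predictions for the same rows the forest was trained on.
resub <- predict(rf, newdata = dat)
print(table(actual = dat$class, predicted = resub))

# Importance matrix: one row per feature variable.
imp <- importance(rf)
print(dim(imp))
```

Substituting the real feature table for X should reproduce the check described above.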
Any thoughts or suggestions would be welcome.
Thanks - Don
Don Krieger, Ph.D.; Research Scientist; University of Pittsburgh; [email protected]