NII Technical Report (NII-2015-003E)

Title Intrinsic Dimensional Outlier Detection in High-Dimensional Data
Authors Jonathan von Brünken, Michael E. Houle, and Arthur Zimek
Abstract We introduce a new method for evaluating local outliers that uses a measure of the intrinsic dimensionality in the vicinity of a test point, the continuous intrinsic dimension (ID), which has been shown to be equivalent to a measure of the discriminative power of similarity functions. Continuous ID can be regarded as an extension of Karger and Ruhl's expansion dimension to a statistical setting in which the distribution of distances to a query point is modeled in terms of a continuous random variable. The proposed local outlier score, IDOS, uses ID as a substitute for the density estimation used in classical outlier detection methods such as LOF. An experimental analysis shows that the precision of IDOS substantially improves over that of state-of-the-art outlier detection scoring methods, especially when the data sets are large and high-dimensional.
Language English
Published Mar 31, 2015
Pages 12p
PDF File 15-003E.pdf

NII Technical Reports
National Institute of Informatics