The pseudo-code of the BV-LOF approach is presented in Algorithm 1. In each ensemble iteration, the size of the feature subset (R) is drawn at random from the range between d/2 and d-1, as proposed by Lazarevic & Kumar (2005), where d is the dimensionality of the dataset. A set of features S_i of size R is then selected without replacement, and these features are used to produce a lower-dimensional representation D_i of the dataset D. The resulting representation D_i has fewer dimensions but the same number of instances as the original dataset D. BV-LOF feeds the LOF algorithm a different data representation D_i in each iteration of the ensemble. For each D_i, the LOF algorithm is run 100 times with different neighborhood size (k) settings, from 1 to 100 in increments of 1. Each run returns a vector containing a label for every instance in the dataset: instances that LOF considers outliers are marked with -1, and inliers are labeled 1. In total, 100 output vectors are generated, one for each of the 100 k values. These vectors are then combined by an inner majority voting mechanism to produce a result set with a single label for each instance. If the majority of the output vectors mark an instance as an outlier, it is labeled an outlier (-1) for the corresponding subset; otherwise, the result of the inner ensemble is 1, which indicates that the instance is an inlier. After all T subsets have their own ensemble output vectors, outer majority voting is applied in the same way to obtain the final output vector.
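
The two-level voting described above can be illustrated with a minimal Python sketch. It is not a reproduction of Algorithm 1, and it rests on several assumptions that are not stated in the text: the names `majority_vote` and `bv_lof_sketch` are hypothetical, the detector is passed in as a generic callable `detector(data, k)` that returns -1/1 labels (standing in for LOF), the lower bound d/2 is rounded down, and a tie is resolved as inlier, following the "otherwise" rule.

```python
import random
from typing import Callable, List, Sequence

Labels = List[int]  # one label per instance: -1 = outlier, 1 = inlier
Detector = Callable[[List[List[float]], int], Labels]  # (data, k) -> labels


def majority_vote(label_vectors: Sequence[Labels]) -> Labels:
    """Mark an instance -1 only if a strict majority of vectors mark it -1."""
    n_votes = len(label_vectors)
    combined = []
    for votes in zip(*label_vectors):  # iterate instance by instance
        n_outlier = sum(1 for v in votes if v == -1)
        # Assumption: a tie counts as "no majority", so the instance is an inlier.
        combined.append(-1 if 2 * n_outlier > n_votes else 1)
    return combined


def bv_lof_sketch(
    data: List[List[float]],
    detector: Detector,
    n_subsets: int,                      # T in the text
    k_values: Sequence[int] = range(1, 101),
    seed: int = None,
) -> Labels:
    rng = random.Random(seed)
    d = len(data[0])
    subset_outputs = []
    for _ in range(n_subsets):
        # Subset size R drawn from [d/2, d-1]; flooring d/2 is an assumption.
        r = rng.randint(max(1, d // 2), max(1, d - 1))
        # Select R feature indices without replacement (S_i).
        features = sorted(rng.sample(range(d), r))
        # Lower-dimensional representation D_i: same instances, fewer features.
        projected = [[row[j] for j in features] for row in data]
        # Inner ensemble: one label vector per neighborhood size k.
        inner = [detector(projected, k) for k in k_values]
        subset_outputs.append(majority_vote(inner))
    # Outer ensemble: majority vote over the T subset-level vectors.
    return majority_vote(subset_outputs)
```

Any detector that follows the same -1/1 labeling convention could be supplied for `detector`; for instance, scikit-learn's `LocalOutlierFactor(n_neighbors=k).fit_predict` uses that convention and could be wrapped in a small function that takes the projected data and k.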