Title
Nestedness and segmented nestedness
Abstract
Consider each row of a 0-1 dataset as the subset of the columns for which the row has an 1. Then a dataset is nested, if for all pairs of rows one row is either a superset or subset of the other. The concept of nestedness has its ori- gins in ecology, where approximate versions of it has been used to model the species distribution in different locations. We argue that nestedness and its extensions are interesting properties of datasets, and that they can be applied also to domains other than ecology. We first define natural measures of nestedness and study their properties. We then define the concept of k-nestedness: a dataset is (almost) k-nested if the set of columns can be partitioned to k parts so that each part is (almost) nested. We consider the algorithmic problems of computing how far a dataset is from being k-nested, and for finding a good partition of the columns into k parts. The algorithms are based on spectral partitioning, and scale to moderately large datasets. We apply the methods to real data from ecology and from other applications, and demonstrate the usefulness of the concept.
Year
DOI
Venue
2007
10.1145/1281192.1281245
KDD
Keywords
Field
DocType
good partition,segmented nestedness,nestedness,different location,large datasets,algorithmic problem,natural measure,0-1 matrices,presence/absence data,k part,approximate version,interesting property,species distribution
Row,Data mining,Subset and superset,Computer science,Partition (number theory),Nestedness
Conference
Citations 
PageRank 
References 
17
1.59
10
Authors
2
Name
Order
Citations
PageRank
Heikki Mannila165951495.69
Evimaria Terzi2158083.54