Extreme Binning: Scalable, Parallel Deduplication for Chunk-based File Backup,
Sparse Indexing: Large Scale, Inline Deduplication Using Sampling and Locality,
Content-based Document Routing and Index Partitioning for Scalable Similarity-based Searches in a Large Corpus,