I downloaded adroll/cantor to experiment with getting hash counts of dataset intersections. It uses HyperLogLog (HLL) plus MinHash, but I find the test cases insufficient. I need to know whether ALL intersection calls use MinHash, even in a minimal way. I found no blog posts showing test cases and feedback on them, and the documentation does not provide that fine-grained detail. Is anyone out there experienced with adroll/cantor? Best wishes
