If you are interested in trying out Hadoop, the easiest way is to download Cloudera's Quickstart VM, which comes with Hadoop, Impala, Hive, Pig, Oozie, Mahout, etc. already installed. The VM also includes working examples.
You will need to read the hardware requirements and make sure your laptop meets them (Mac or PC, both are fine). Here is where you can download the VM (VMware, KVM, or VirtualBox):
After gaining some experience with the Quickstart VM examples, I would recommend reading this article, which describes how to use Cloudera Impala to query GDELT (Global Database of Events, Language, and Tone): http://blog.gdelt.org/2013/11/06/fast-gdelt-queries-using-impala-and-parquet/
(Note: GDELT is suspended right now, but the data accumulated through Jan. 16, 2014 should be available to you.)
Well, what problem are you interested in solving? Before any algorithms can be suggested, there needs to be a problem to apply them to. If you would rather not name a specific one, is there a class of problems it falls into (e.g., intractable graph-theoretic problems), so that researchers can answer?
I can suggest head/tail breaks and its related algorithms; see the papers below:
Jiang B. and Ma D. (2015), Defining least community as a homogeneous group in complex networks, Physica A: Statistical Mechanics and its Applications, 428, 154-160.
Jiang B. (2015a), Head/tail breaks for visualization of city structure and dynamics, Cities, 43, 69-77.
First, you have to make sure that the data is heavy-tailed.
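For a rough idea of how head/tail breaks works, here is a minimal sketch in Python. It is my own illustration, not code from the papers: the function name `head_tail_breaks`, the `head_limit` parameter, and the 40% stopping threshold are assumptions based on how the method is commonly described (split at the mean, keep the values above it as the "head", and repeat while the head remains a minority).

```python
def head_tail_breaks(values, head_limit=0.4):
    """Return the successive mean values used as class breaks.

    head_limit is an assumed threshold: splitting stops once the head
    (values above the mean) is no longer a clear minority of the data.
    """
    breaks = []
    data = list(values)
    while len(data) > 1:
        mean = sum(data) / len(data)
        head = [v for v in data if v > mean]
        breaks.append(mean)
        # Stop when the head is too small to split again or is not a minority.
        if len(head) <= 1 or len(head) / len(data) > head_limit:
            break
        data = head
    return breaks


# A small, made-up, heavy-tailed sample: many small values, few large ones.
sample = [1] * 16 + [2] * 8 + [4] * 4 + [8] * 2 + [16]
breaks = head_tail_breaks(sample)
```

If the very first split already gives a head that is not a minority, the function returns a single break, which is a quick hint that the data may not be heavy-tailed enough for this method.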