Summary:
In this paper, the authors present YARN, the next generation of the Hadoop platform, and summarize its design and development. They describe how wide adoption and new types of applications pushed the initial architecture well beyond what it was designed to accomplish, and the architectural transformation that led to YARN. They argue that YARN provides better scalability, higher efficiency, and better cluster sharing. Their experience running YARN on all Yahoo! grids supports the claim that YARN improves efficiency.
Pros:
As with the other paper, a big plus of this paper is the strong evaluation: YARN runs on 100% of Yahoo! grids.
More generally, the paper shows that separating resource management functions from the programming model provides greater flexibility.
The discussion of the many applications and frameworks that are native to YARN or were ported to it nicely illustrates the generality of the architecture.
YARN permits simultaneous execution of a variety of programming models, including graph processing, iterative processing, and many other tasks.
Cons:
At the root of the YARN hierarchy is the ResourceManager. A failure of the ResourceManager would leave the whole system in an inconsistent state, so it is essentially a single point of failure.
In addition, YARN seemingly assumes that ApplicationMasters are buggy or even malicious and therefore treats them as unprivileged code. I’m not sure whether this should be considered a disadvantage, though.
Thoughts for further development:
One possible way to enhance the ResourceManager, which is the root of YARN, is to provide budget-based optimization to tackle bottlenecks. Good heuristics and algorithms that intelligently assign tasks under limits on memory, CPU, energy, etc. could still be interesting future work.
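As a rough illustration of the kind of heuristic I have in mind, here is a minimal sketch of a greedy multi-resource assignment. It is not YARN's scheduler: the names `Node`, `assign`, and `dominant_share`, the two resource dimensions, and the sample numbers are all my own assumptions. Tasks with the largest dominant resource share are placed first, each on the first node that still has room.

```python
from dataclasses import dataclass, field

# Assumed resource dimensions; a real cluster could add energy, disk, etc.
RESOURCES = ("memory_mb", "vcores")


@dataclass
class Node:
    """Hypothetical stand-in for a worker node and its remaining capacity."""
    name: str
    free: dict
    assigned: list = field(default_factory=list)

    def fits(self, demand):
        return all(self.free[r] >= demand[r] for r in RESOURCES)

    def place(self, task_id, demand):
        for r in RESOURCES:
            self.free[r] -= demand[r]
        self.assigned.append(task_id)


def dominant_share(demand, total):
    """Largest fraction of any cluster-wide resource that the task needs."""
    return max(demand[r] / total[r] for r in RESOURCES)


def assign(tasks, nodes):
    """Greedy first-fit, largest dominant share first.

    Assumes every resource has positive total capacity.
    Returns (placement, pending): task -> node name, and tasks that did not fit.
    """
    total = {r: sum(n.free[r] for n in nodes) for r in RESOURCES}
    ordered = sorted(
        tasks.items(),
        key=lambda item: dominant_share(item[1], total),
        reverse=True,
    )
    placement, pending = {}, []
    for task_id, demand in ordered:
        node = next((n for n in nodes if n.fits(demand)), None)
        if node is None:
            pending.append(task_id)  # no node has room; leave it queued
        else:
            node.place(task_id, demand)
            placement[task_id] = node.name
    return placement, pending


# Made-up example cluster and task demands.
nodes = [
    Node("n1", {"memory_mb": 8192, "vcores": 4}),
    Node("n2", {"memory_mb": 4096, "vcores": 8}),
]
tasks = {
    "a": {"memory_mb": 6144, "vcores": 2},
    "b": {"memory_mb": 2048, "vcores": 6},
    "c": {"memory_mb": 4096, "vcores": 2},
}
placement, pending = assign(tasks, nodes)
```

In this made-up example, task "c" stays pending because neither node has 4096 MB left after "a" and "b" are placed. A greedy pass like this is cheap but can miss better packings, which is what motivates the question about global optimization.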
Critiques/Questions:
Could it be beneficial to build a mathematical model of the resource limitations across the many resource types and solve a global optimization problem based on it? I believe the ResourceManager could use some good heuristics to allocate resources and work with the NodeManagers to start and monitor their underlying applications.
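To make the question concrete, one way such a model could be written down is as an integer program. This is only a sketch under my own assumptions, not something taken from the paper: $x_{t,n}$ indicates that task $t$ is placed on node $n$, $w_t$ is an assumed priority weight, $d_{t,r}$ is the demand of task $t$ for resource $r$, and $c_{n,r}$ is the capacity of node $n$ for resource $r$.

```latex
\begin{aligned}
\max_{x} \quad & \sum_{t} \sum_{n} w_t \, x_{t,n} \\
\text{s.t.} \quad & \sum_{t} d_{t,r} \, x_{t,n} \le c_{n,r} && \text{for every node } n \text{ and resource } r, \\
& \sum_{n} x_{t,n} \le 1 && \text{for every task } t, \\
& x_{t,n} \in \{0, 1\} && \text{for every task } t \text{ and node } n.
\end{aligned}
```

A formulation of this shape is a multi-dimensional knapsack-style problem, which is NP-hard in general, so at cluster scale it would likely have to be approximated with heuristics rather than solved exactly.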