What are cluster resource management and YARN?

YARN (Yet Another Resource Negotiator) is Hadoop's cluster resource management layer, which distributes work across a number of machines

fault tolerant

scalable

Worker nodes

Node managers

Resource manager (one active per cluster)

Applications negotiate for the resources they need to run in a distributed fashion

An app is submitted to the Resource manager

The Resource manager allocates a container for that application's application master on one of the node manager nodes

The application master then negotiates with the Resource manager for the containers its tasks run in, which determines the parallelism
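
The submission flow above can be sketched as a toy simulation. This is not the YARN API: the class names, the slot counts, and the "most free slots first" placement policy are all assumptions made up for illustration. It only shows the order of events: the app is submitted, one container goes to the application master, and the application master then asks for task containers and may be granted fewer than it wanted.

```python
from dataclasses import dataclass, field


@dataclass
class NodeManager:
    """Toy stand-in for a node manager: tracks free container slots on one worker node."""
    name: str
    free_slots: int
    running: list = field(default_factory=list)

    def launch(self, label: str) -> bool:
        # Start a container only if this node still has capacity.
        if self.free_slots == 0:
            return False
        self.free_slots -= 1
        self.running.append(label)
        return True


class ResourceManager:
    """Toy stand-in for the resource manager: the only component that hands out containers."""

    def __init__(self, nodes):
        self.nodes = nodes

    def allocate(self, label: str):
        # Hypothetical policy: place the container on the node with the most free slots.
        node = max(self.nodes, key=lambda n: n.free_slots)
        return node if node.launch(label) else None

    def submit(self, app_name: str):
        # The app is submitted, then one container is allocated for its application master.
        node = self.allocate(f"{app_name}/app-master")
        if node is None:
            raise RuntimeError("no capacity for the application master")
        return ApplicationMaster(app_name, self, node)


class ApplicationMaster:
    """Toy stand-in for the per-application master that negotiates task containers."""

    def __init__(self, app_name, rm, node):
        self.app_name = app_name
        self.rm = rm
        self.node = node

    def negotiate(self, wanted: int) -> int:
        # Ask for `wanted` task containers; the cluster may grant fewer.
        granted = 0
        for i in range(wanted):
            if self.rm.allocate(f"{self.app_name}/task-{i}") is None:
                break
            granted += 1
        return granted


def run_demo():
    nodes = [NodeManager("node-1", 2), NodeManager("node-2", 2)]
    rm = ResourceManager(nodes)
    am = rm.submit("wordcount")
    granted = am.negotiate(wanted=5)
    return am.node.name, granted, [n.free_slots for n in nodes]


if __name__ == "__main__":
    print(run_demo())
```

In this sketch the cluster has four slots in total. The application master takes one, so a request for five task containers is granted only three: the achieved parallelism is whatever the resource manager can actually hand out, not what the application asked for.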