YARN

What is YARN?

YARN stands for Yet Another Resource Negotiator. It is the resource management layer of Hadoop, and it separates resource management from data processing.


Main components of YARN architecture:

  • Client - It submits an application (for example, a MapReduce job) to the Resource Manager.

  • Resource Manager - It allocates cluster resources through two components, the Scheduler and the Application Manager.
            a) Scheduler - It allocates resources to applications based on their resource requests and the resources available in the cluster. It does not monitor or restart applications.

            b) Application Manager - It is responsible for accepting submitted applications and negotiating the first container, in which the Application Master runs. It also restarts the Application Master container on failure.
  • Application Master - It manages the life cycle of a single application by negotiating containers from the Resource Manager and asking Node Managers to launch or stop them.
  • Node Manager - It manages the containers on a single node by launching and stopping them, monitoring their resource usage, and reporting to the Resource Manager.
  • Container - It is a set of resources such as CPU cores and memory (RAM) on a single node. Containers are allocated by the Resource Manager and monitored by the Node Manager.
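
To make the relationship between containers, nodes and the scheduler more concrete, here is a minimal Python sketch. It is a toy model, not the YARN API: the names Resource, Node and allocate are hypothetical, and the first-fit placement rule is an assumption made for illustration (the real YARN schedulers use more elaborate policies).

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Resource:
    """A container-sized bundle of resources (hypothetical model)."""
    memory_mb: int
    vcores: int

    def fits_in(self, other: "Resource") -> bool:
        return self.memory_mb <= other.memory_mb and self.vcores <= other.vcores


@dataclass
class Node:
    """One machine in the cluster, as a Node Manager would see it."""
    name: str
    capacity: Resource
    used: list = field(default_factory=list)  # resources of running containers

    def available(self) -> Resource:
        return Resource(
            self.capacity.memory_mb - sum(r.memory_mb for r in self.used),
            self.capacity.vcores - sum(r.vcores for r in self.used),
        )


def allocate(nodes, request):
    """First-fit placement (an assumption): return the name of the first
    node with enough free resources for the request, or None."""
    for node in nodes:
        if request.fits_in(node.available()):
            node.used.append(request)
            return node.name
    return None


nodes = [Node("node1", Resource(4096, 4)), Node("node2", Resource(8192, 8))]
placements = [allocate(nodes, Resource(3072, 2)) for _ in range(4)]
print(placements)  # → ['node1', 'node2', 'node2', None]
```

The fourth request is refused because no single node has 3072 MB free any more, which illustrates that a container always lives on one node.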

Application workflow in YARN:



  1. The client submits an application to the Resource Manager.
  2. The Resource Manager allocates a container to start the Application Master.
  3. The Application Master registers itself with the Resource Manager.
  4. The Application Master negotiates containers from the Resource Manager.
  5. The Application Master notifies the Node Managers to launch the containers.
  6. The application code is executed in the containers.
  7. The client contacts the Resource Manager or the Application Master to monitor the application's status.
  8. Once the processing is complete, the Application Master unregisters from the Resource Manager.
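
The steps above can be sketched as a small Python simulation. This is only an illustration of the order of events, not the YARN API: the classes ResourceManager and ApplicationMaster, their methods, and the event names in the log are all hypothetical, and everything runs in a single process.

```python
class ResourceManager:
    """Toy stand-in for the Resource Manager (hypothetical, single process)."""

    def __init__(self):
        self.state = {}  # application id -> ACCEPTED / RUNNING / FINISHED
        self.log = []    # ordered record of workflow events

    def submit(self, app_id):
        # Steps 1-2: accept the application and start its Application Master.
        self.log.append("submitted")
        self.state[app_id] = "ACCEPTED"
        self.log.append("am_container_allocated")
        return ApplicationMaster(app_id, self)

    def register(self, app_id):
        self.state[app_id] = "RUNNING"
        self.log.append("am_registered")

    def grant_containers(self, count):
        self.log.append("containers_granted")
        return [f"container_{i}" for i in range(1, count + 1)]

    def unregister(self, app_id):
        self.state[app_id] = "FINISHED"
        self.log.append("am_unregistered")

    def status(self, app_id):
        # Step 7: the client can poll this at any time.
        return self.state[app_id]


class ApplicationMaster:
    """Toy stand-in for the per-application Application Master."""

    def __init__(self, app_id, rm):
        self.app_id, self.rm = app_id, rm

    def run(self, tasks):
        self.rm.register(self.app_id)                      # step 3
        containers = self.rm.grant_containers(len(tasks))  # step 4
        results = []
        for _container, task in zip(containers, tasks):
            self.rm.log.append("container_launched")       # step 5
            results.append(task())                         # step 6
        self.rm.unregister(self.app_id)                    # step 8
        return results


rm = ResourceManager()
am = rm.submit("app_1")
status_before = rm.status("app_1")
results = am.run([lambda: 1 + 1, lambda: 2 * 3])
print(status_before, results, rm.status("app_1"))  # → ACCEPTED [2, 6] FINISHED
```

In a real cluster the containers run in parallel on different nodes and the Node Managers launch them; here the loop simply runs each task in turn to keep the sequence readable.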
