Hadoop Interview questions

Total available count: 27
Subject - Apache
Subsubject - Hadoop

What is Task, with regards to Spark Job execution?

A task is the smallest unit of work that an executor runs: an individual unit of physical execution (computation) that runs on a single machine over one part of your Spark application's data. All tasks in a stage must complete before the next stage can begin. A task can therefore be seen as the computation performed in a stage, on one partition, within a given job attempt. A task belongs to exactly one stage and operates on exactly one partition (a slice of a Resilient Distributed Dataset, or RDD). For each stage, one task is spawned per data partition.
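The one-task-per-partition model above can be illustrated with a toy simulation. This is plain Python, not Spark's actual API: the thread pool stands in for executors, each function call stands in for a task, and the `pool.map` call acts as the stage barrier (all tasks finish before the stage result is available).

```python
from concurrent.futures import ThreadPoolExecutor

# Toy model (not Spark's API): an "RDD" split into 3 partitions.
partitions = [[1, 2, 3], [4, 5], [6, 7, 8, 9]]

def task(partition):
    # One unit of physical execution: compute over a single partition.
    return sum(x * 2 for x in partition)

# The pool plays the role of executors running tasks in parallel;
# pool.map acts as the stage boundary: it returns only after every
# task in the "stage" has completed.
with ThreadPoolExecutor(max_workers=3) as pool:
    stage_results = list(pool.map(task, partitions))

print(stage_results)  # one result per partition, i.e. per task
```

Each element of `stage_results` corresponds to exactly one partition, mirroring the rule that a task operates on a single partition within a single stage.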

Next 5 interview questions

What is the DAGScheduler and how does it work?
Please define executors in detail.
Please explain how workers operate when a new job is submitted to them.
What are workers?
What is the purpose of the driver in the Spark architecture?