Hadoop Interview questions


Total available count: 27
Subject - Apache
Subsubject - Hadoop

Data is spread across all the nodes of a cluster; how does Spark process this data?

By default, Spark tries to read data into an RDD from the nodes that are close to it (data locality). Because Spark usually works with distributed, partitioned data, it splits the data into partitions that hold the data chunks, so transformation operations can run on each partition in parallel.
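The partitioning idea in the answer above can be sketched in plain Python. This is an illustrative model only, not the Spark API; the helper names `make_partitions` and `map_partitions` are hypothetical:

```python
# Illustrative sketch (plain Python, not the Spark API): Spark splits a
# dataset into partitions and applies a transformation to each partition,
# preferring to schedule each task on a node that already holds the chunk.

def make_partitions(data, num_partitions):
    """Split data into roughly equal chunks, like RDD partitions."""
    size = (len(data) + num_partitions - 1) // num_partitions
    return [data[i:i + size] for i in range(0, len(data), size)]

def map_partitions(partitions, fn):
    """Apply a transformation independently to every partition,
    the way Spark runs one task per partition."""
    return [[fn(x) for x in part] for part in partitions]

data = list(range(10))
partitions = make_partitions(data, 3)   # [[0,1,2,3], [4,5,6,7], [8,9]]
doubled = map_partitions(partitions, lambda x: x * 2)
result = [x for part in doubled for x in part]  # "collect" back together
```

Here each element-wise `map` touches only its own partition, which is what makes it a narrow transformation in Spark's terms; operations that must combine data across partitions (a shuffle) are the wide transformations asked about below.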




Next 5 interview questions

1. What are wide transformations?
2. What are narrow transformations?
3. How many types of transformations exist?
4. What are preferred locations?
5. How do you define actions?