Terminologies of HDFS and MapReduce

Terminology

  • PayLoad – Applications implement the Map and the Reduce functions, and form the core of the job.
  • Mapper – Mapper maps the input key/value pairs to a set of intermediate key/value pair.
  • NamedNode – Node that manages the Hadoop Distributed File System (HDFS).
  • DataNode – Node where data is presented in advance before any processing takes place.
  • MasterNode – Node where JobTracker runs and which accepts job requests from clients.
  • SlaveNode – Node where Map and Reduce program runs.
  • JobTracker – Schedules jobs and tracks the assign jobs to Task tracker.
  • Task Tracker – Tracks the task and reports status to JobTracker.
  • Job – A program is an execution of a Mapper and Reducer across a dataset.
  • Task – An execution of a Mapper or a Reducer on a slice of data.
  • Task Attempt – A particular instance of an attempt to execute a task on a SlaveNode.