A) When the types of the reduce operation's input key and input value match the types of the reducer's output key and output value, and when the reduce operation is both commutative and associative.
B) When the signature of the reduce method matches the signature of the combine method.
C) Always. Code can be reused in Java since it is a polymorphic object-oriented programming language.
D) Always. The point of a combiner is to serve as a mini-reducer directly after the map phase to increase performance.
E) Never. Combiners and reducers must be implemented separately because they serve different purposes.
Correct Answer
verified
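The two properties named in option A, commutativity and associativity, can be illustrated with a small simulation. This is a minimal sketch in plain Python, not Hadoop code: the function names (`reduce_sum`, `reduce_mean`, `run_with_combiner`, `run_without_combiner`) and the sample data are hypothetical, and the "combiner" is modeled simply as the reduce logic applied once per map task before the final reduce.

```python
def reduce_sum(values):
    """Reduce logic: total of all values for one key."""
    return sum(values)


def reduce_mean(values):
    """Reduce logic: arithmetic mean of all values for one key."""
    values = list(values)
    return sum(values) / len(values)


def run_without_combiner(reduce_fn, map_outputs):
    """Send every map output value straight to a single reduce call."""
    all_values = [v for chunk in map_outputs for v in chunk]
    return reduce_fn(all_values)


def run_with_combiner(reduce_fn, map_outputs):
    """Reuse reduce_fn as a combiner: apply it per map task, then reduce the partial results."""
    partials = [reduce_fn(chunk) for chunk in map_outputs]
    return reduce_fn(partials)


# Hypothetical values for one key, emitted by three separate map tasks.
map_outputs = [[1, 2, 3], [10], [4, 6]]

sum_direct = run_without_combiner(reduce_sum, map_outputs)
sum_combined = run_with_combiner(reduce_sum, map_outputs)

mean_direct = run_without_combiner(reduce_mean, map_outputs)
mean_combined = run_with_combiner(reduce_mean, map_outputs)

print("sum: ", sum_direct, sum_combined)    # identical results
print("mean:", mean_direct, mean_combined)  # results differ
```

Summation gives the same result either way because partial sums can be regrouped freely. The mean does not, because a mean of partial means ignores how many values each map task contributed, so that reduce logic cannot be reused unchanged as a combiner.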
Multiple Choice
A) Have your system administrator copy the JAR to all nodes in the cluster and set its location in the HADOOP_CLASSPATH environment variable before you submit your job.
B) Have your system administrator place the JAR file on a Web server accessible to all cluster nodes and then set the HTTP_JAR_URL environment variable to its location.
C) When submitting the job on the command line, specify the -libjars option followed by the JAR file path.
D) Package your code and the Apache Commons Math library into a zip file named JobJar.zip.
Correct Answer
verified
Multiple Choice
A) Six
B) Five
C) Four
D) Two
E) One
F) Three
Correct Answer
verified
Multiple Choice
A) Health status checks (heartbeats)
B) Resource management
C) Job scheduling/monitoring
D) Job coordination between the ResourceManager and NodeManager
E) Launching tasks
F) Managing file system metadata
G) MapReduce metric reporting
H) Managing tasks
Correct Answer
verified
Multiple Choice
A) Reducers start copying intermediate key-value pairs from each Mapper as soon as it has completed. The programmer can configure in the job what percentage of the intermediate data should arrive before the reduce method begins.
B) Reducers start copying intermediate key-value pairs from each Mapper as soon as it has completed. The reduce method is called only after all intermediate data has been copied and sorted.
C) Reduce methods and map methods all start at the beginning of a job, in order to provide optimal performance for map-only or reduce-only jobs.
D) Reducers start copying intermediate key-value pairs from each Mapper as soon as it has completed. The reduce method is called as soon as the intermediate key-value pairs start to arrive.
Correct Answer
verified
Multiple Choice
A) Increase the block size on all current files in HDFS.
B) Increase the block size on your remaining files.
C) Decrease the block size on your remaining files.
D) Increase the amount of memory for the NameNode.
E) Increase the number of disks (or size) for the NameNode.
F) Decrease the block size on all current files in HDFS.
Correct Answer
verified
Multiple Choice
A) There is no difference in output between the two settings.
B) With zero reducers, no reducer runs and the job throws an exception. With one reducer, instances of matching patterns are stored in a single file on HDFS.
C) With zero reducers, all instances of matching patterns are gathered together in one file on HDFS. With one reducer, instances of matching patterns are stored in multiple files on HDFS.
D) With zero reducers, instances of matching patterns are stored in multiple files on HDFS. With one reducer, all instances of matching patterns are gathered together in one file on HDFS.
Correct Answer
verified
Multiple Choice
A) A SequenceFile contains a binary encoding of an arbitrary number of homogeneous Writable objects.
B) A SequenceFile contains a binary encoding of an arbitrary number of heterogeneous Writable objects.
C) A SequenceFile contains a binary encoding of an arbitrary number of WritableComparable objects, in sorted order.
D) A SequenceFile contains a binary encoding of an arbitrary number of key-value pairs. Each key must be the same type. Each value must be the same type.
Correct Answer
verified
Multiple Choice
A) The values are in sorted order.
B) The values are arbitrarily ordered, and the ordering may vary from run to run of the same MapReduce job.
C) The values are arbitrarily ordered, but multiple runs of the same MapReduce job will always have the same ordering.
D) Since the values come from mapper outputs, the reducers will receive contiguous sections of sorted values.
Correct Answer
verified
Multiple Choice
A) Increase the parameter that controls minimum split size in the job configuration.
B) Write a custom MapRunner that iterates over all key-value pairs in the entire file.
C) Set the number of mappers equal to the number of input files you want to process.
D) Write a custom FileInputFormat and override the method isSplitable to always return false.
Correct Answer
verified
Multiple Choice
A) The file will be marked as corrupted if data node B fails during the creation of the file.
B) Each data node locks the local file to prohibit concurrent readers and writers of the file.
C) Each data node stores a copy of the file in the local file system with the same name as the HDFS file.
D) The file can be accessed if at least one of the data nodes storing the file is available.
Correct Answer
verified
Multiple Choice
A) Implement a splittable compression algorithm.
B) Be a subclass of FileInputFormat.
C) Implement WritableComparable.
D) Override isSplitable.
E) Implement a comparator for speedy sorting.
Correct Answer
verified
Multiple Choice
A) As many final key-value pairs as desired. There are no restrictions on the types of those key-value pairs (i.e., they can be heterogeneous).
B) As many final key-value pairs as desired, but they must have the same type as the intermediate key-value pairs.
C) As many final key-value pairs as desired, as long as all the keys have the same type and all the values have the same type.
D) One final key-value pair per value associated with the key; no restrictions on the type.
E) One final key-value pair per key; no restrictions on the type.
Correct Answer
verified
Multiple Choice
A) The number of values across different keys in the iterator supplied to a single reduce method call.
B) The amount of intermediate data that must be transferred between the mapper and reducer.
C) The number of input files a mapper must process.
D) The number of output files a reducer must produce.
Correct Answer
verified
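Option B refers to the amount of intermediate data transferred between the mapper and the reducer. The sketch below is a plain-Python simulation, not Hadoop code, showing how a map-side pre-aggregation step can shrink that volume for a word count. The helper names (`map_words`, `combine`, `reduce_counts`) and the two input lines are hypothetical, and treating each line as one map task's input is an assumption made for illustration.

```python
from collections import Counter


def map_words(line):
    """Map step: emit a (word, 1) pair for every word in the line."""
    return [(word, 1) for word in line.split()]


def combine(pairs):
    """Map-side pre-aggregation: sum the counts per word within one map task."""
    counts = Counter()
    for word, n in pairs:
        counts[word] += n
    return list(counts.items())


def reduce_counts(pairs):
    """Reduce step: sum the counts per word across all map tasks."""
    totals = Counter()
    for word, n in pairs:
        totals[word] += n
    return dict(totals)


# Hypothetical input: each string stands for the input of one map task.
splits = ["to be or not to be", "to do or not to do"]

# Pairs that would be shuffled to the reducers in each scenario.
shuffled_plain = [pair for line in splits for pair in map_words(line)]
shuffled_combined = [pair for line in splits for pair in combine(map_words(line))]

print(len(shuffled_plain), "pairs shuffled without pre-aggregation")
print(len(shuffled_combined), "pairs shuffled with pre-aggregation")
print(reduce_counts(shuffled_plain) == reduce_counts(shuffled_combined))
```

The final counts are the same in both cases; only the number of pairs crossing from the map side to the reduce side changes.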
Multiple Choice
A) HDFS command
B) Pig LOAD command
C) Sqoop import
D) Hive LOAD DATA command
E) Ingest with Flume agents
F) Ingest with Hadoop Streaming
Correct Answer
verified
Multiple Choice
A) The amount of RAM installed on the TaskTracker node.
B) The amount of free disk space on the TaskTracker node.
C) The number and speed of CPU cores on the TaskTracker node.
D) The average system load on the TaskTracker node over the past fifteen (15) minutes.
E) The location of the InputSplit to be processed in relation to the location of the node.
Correct Answer
verified