
Aug 6, 2024

Data Analytics MCQ Unit-V

 

UNIT-V

 

1.      __________ is a distributed file system that stores data on commodity machines, providing very high aggregate bandwidth across the cluster.

[A]   Hadoop YARN

[B]   Hadoop Distributed File System (HDFS)

[C]   Hadoop MapReduce

[D]  None of the above

 

2.      __________ is a resource management platform responsible for managing computing resources in clusters and using them to schedule users' applications.

[A]   Hadoop Distributed File System (HDFS)

[B]   Hadoop MapReduce

[C]   Hadoop YARN

[D]  None of the above

 

3.      __________ is a query language designed for data in JSON format and for working with semi-structured data.

[A]   Mahout

[B]   Jaql

[C]   Pig

[D]  None of the above

 

4.      __________ is the first task, which takes input data and converts it into a set of data where individual elements are broken down into tuples.

[A]   The Reduce Task

[B]   The Map Task

[C]   Both of the above

[D]  None of the above

 

5.      __________ takes the output from a map task as input and combines those data tuples into a smaller set of tuples.

[A]  The Reduce Task

[B]   The Map Task

[C]   Both of the above

[D]  None of the above

 

6.      A MapReduce program is composed of a ________ method that performs filtering and sorting and a ________ method that performs a summary operation.

[A]   Reduce(), Map()

[B]   Map(), Reduce()

[C]   Both of the above

[D]  None of the above

 

7.      In the ______ step, each worker node applies the Map() function to the local data and writes the output to temporary storage.

[A]   Reduce

[B]   Map

[C]   Shuffle

[D]  None of the above

 

8.      In the ______ step, worker nodes redistribute data based on the output keys so that all data belonging to one key is located on the same worker node.

[A]   Reduce

[B]   Map

[C]   Shuffle

[D]  None of the above

 

9.      In the ______ step, worker nodes process each group of output data, per key, in parallel.

[A]  Reduce

[B]   Map

[C]   Shuffle

[D]  None of the above
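
The three steps described in questions 7 to 9 can be illustrated with a small word count in plain Python. This is only a single-process sketch, not Hadoop code: the function names `map_step`, `shuffle_step`, `reduce_step`, and `word_count` are made up for illustration, and each string in the input list stands in for the local data of one worker node.

```python
from collections import defaultdict


def map_step(chunk):
    """Emit a (word, 1) pair for every word in one worker's local chunk."""
    return [(word.lower(), 1) for word in chunk.split()]


def shuffle_step(mapped_outputs):
    """Regroup the emitted pairs so all values for one key end up together."""
    groups = defaultdict(list)
    for pairs in mapped_outputs:
        for key, value in pairs:
            groups[key].append(value)
    return dict(groups)


def reduce_step(key, values):
    """Collapse all values of one key into a single summary tuple."""
    return key, sum(values)


def word_count(chunks):
    mapped = [map_step(chunk) for chunk in chunks]   # one call per "worker"
    grouped = shuffle_step(mapped)                   # redistribute by key
    return dict(reduce_step(k, v) for k, v in grouped.items())


print(word_count(["big data big", "data lake"]))
# prints {'big': 2, 'data': 2, 'lake': 1}
```

In a real cluster the per-chunk and per-key calls would run in parallel on different nodes; here they run one after another so the data flow is easy to follow.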

 

10.  The _____________ has a record reader that translates each record in an input file and sends the parsed data to the mapper in the form of key-value pairs.

[A]   Map

[B]   Input Phase

[C]   Reducer

[D]  None of the above

 

11.  _____________ is a data warehouse infrastructure tool to process structured data in Hadoop.

[A]  Hive

[B]   Pig

[C]   Both of the above

[D]  None of the above

 

12.  ___________ stores the schema in a database and the processed data in HDFS.

[A]  Hive

[B]   Pig

[C]   Both of the above

[D]  None of the above

 

13.  ___________ is designed for OLAP.

[A]  Hive

[B]   Pig

[C]   Both of the above

[D]  None of the above

 

14.  ___________ provides an SQL-like query language called HiveQL or HQL.

[A]  Hive

[B]   Pig

[C]   Both of the above

[D]  None of the above

 

15.  ___________ is not a relational database.

[A]  Hive

[B]   Pig

[C]   Both of the above

[D]  None of the above

 

16.  ___________ is not designed for OLTP.

[A]  Hive

[B]   Pig

[C]   Both of the above

[D]  None of the above

 

17.  ___________ is not a language for real-time queries and row-level updates.

[A]  Hive

[B]   Pig

[C]   Both of the above

[D]  None of the above

 

18.  ___________ is data warehouse infrastructure software that enables interaction between the user and HDFS.

[A]  Hive

[B]   Pig

[C]   Both of the above

[D]  None of the above

 

19.  ___________ is similar to SQL and is used for querying schema information in the metastore.

[A]  Hive

[B]   Python

[C]   Both of the above

[D]  None of the above

 

20.  __________ uses lazy evaluation.

[A]   SQL

[B]   Pig

[C]   MapReduce

[D]  None of the above

 

21.  __________ uses extract, transform, and load (ETL).

[A]  Pig

[B]   SQL

[C]   Both of the above

[D]  None of the above

 

22.  __________ is able to store data at any point during a pipeline.

[A]  Pig

[B]   SQL

[C]   Both of the above

[D]  None of the above

 

23.  __________ declares execution plans.

[A]  Pig

[B]   SQL

[C]   Both of the above

[D]  None of the above

 

24.  __________ supports pipeline splits, thus allowing workflows to proceed along DAGs instead of strictly sequential pipelines.

[A]  Pig

[B]   SQL

[C]   Both of the above

[D]  None of the above

 

25.  __________ is too low-level and rigid, and leads to a great deal of custom user code that is hard to maintain and reuse.

[A]  MapReduce

[B]   Pig

[C]   SQL

[D]  None of the above

 

26.  __________ is a high-level programming language useful for analyzing large data sets.

[A]  Pig

[B]   MapReduce

[C]   Both of the above

[D]  None of the above

 

27.  ___________ is a query language based on SQL.

[A]   Pig

[B]   MapReduce

[C]   HiveQL

[D]  None of the above

 

28.  ___________ is a data flow scripting language.

[A]   MapReduce

[B]   HiveQL

[C]   Pig

[D]  None of the above

29.  Joins are simple to achieve in _________.

[A]  Pig

[B]   MapReduce

[C]   Both of the above

[D]  None of the above

 

30.  In ____________, performing joins between data sets is very difficult.

[A]   HiveQL

[B]   MapReduce

[C]   Pig

[D]  None of the above
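
To see why questions 29 and 30 contrast the two approaches, it may help to look at how much bookkeeping a join needs when it is written by hand in the map, shuffle, reduce style. The following is a minimal single-process sketch of a reduce-side inner join in plain Python. The function name `reduce_side_join` and the `users` and `orders` tables are hypothetical and exist only for illustration.

```python
from collections import defaultdict


def reduce_side_join(users, orders):
    """Inner-join two lists of (user_id, value) pairs on user_id."""
    # Map: tag each record with its source table, keyed by user_id.
    tagged = [(uid, ("user", name)) for uid, name in users]
    tagged += [(uid, ("order", item)) for uid, item in orders]

    # Shuffle: bring all records that share a key together.
    groups = defaultdict(list)
    for key, record in tagged:
        groups[key].append(record)

    # Reduce: for each key, pair every user record with every order record.
    joined = []
    for key in sorted(groups):
        names = [value for tag, value in groups[key] if tag == "user"]
        items = [value for tag, value in groups[key] if tag == "order"]
        joined.extend((key, name, item) for name in names for item in items)
    return joined


users = [(1, "asha"), (2, "ravi")]
orders = [(1, "book"), (1, "pen"), (3, "lamp")]
print(reduce_side_join(users, orders))
# prints [(1, 'asha', 'book'), (1, 'asha', 'pen')]
```

The tagging, grouping, and pairing all have to be coded manually here, whereas a data flow language with a built-in JOIN operator expresses the same operation in a single statement.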
