Nitin Yashwant Suryawanshi : Data Analytics MCQ Unit- V

UNIT- V

1. __________ is a distributed file system that stores data on commodity machines, providing very high aggregate bandwidth across the cluster.

[A] Hadoop YARN
[B] Hadoop Distributed File System (HDFS)
[C] Hadoop MapReduce
[D] None of the above

2. __________ is a resource management platform responsible for managing computing resources in clusters and using them to schedule users' applications.

[A] Hadoop Distributed File System (HDFS)
[B] Hadoop MapReduce
[C] Hadoop YARN
[D] None of the above

3. __________ is a query language designed for use with data in JSON format and for working with semi-structured data.

[A]   Mahout
[B]   Jaql
[C]   Pig
[D] None of the above

4. __________ is the first task, which takes input data and converts it into a set of data where individual elements are broken down into tuples.

[A]   The Reduce Task
[B]   The Map Task
[C]   Both of the above
[D] None of the above

5. __________ takes the output from a map task as input and combines those data tuples into a smaller set of tuples.

[A] The Reduce Task
[B] The Map Task
[C] Both of the above
[D] None of the above

6. A MapReduce program is composed of a ________ method that performs filtering and sorting and a ________ method that performs a summary operation.

[A]   Reduce(), Map()
[B]   Map(), Reduce()
[C]   Both of the above
[D] None of the above

7. In the ______ step, each worker node applies the Map() function to its local data and writes the output to temporary storage.

[A]   Reduce
[B]   Map
[C]   Shuffle
[D] None of the above

8. In the ______ step, worker nodes redistribute data based on the output keys so that all data belonging to one key is located on the same worker node.

[A]   Reduce
[B]   Map
[C]   Shuffle
[D] None of the above

9. In the ______ step, worker nodes process each group of output data, per key, in parallel.

[A] Reduce
[B] Map
[C] Shuffle
[D] None of the above
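
The Map, Shuffle, and Reduce steps named in the options above can be illustrated with a minimal single-process sketch. It is a plain Python word count, not Hadoop code: the function names, the two sample "worker" chunks, and the choice of word counting are all assumptions made for illustration.

```python
from collections import defaultdict

def map_step(chunk):
    """Map: emit a (word, 1) pair for every word in one worker's local data."""
    return [(word.lower(), 1) for line in chunk for word in line.split()]

def shuffle_step(mapped_outputs):
    """Shuffle: regroup pairs by key so all values for one key sit together."""
    groups = defaultdict(list)
    for pairs in mapped_outputs:
        for key, value in pairs:
            groups[key].append(value)
    return groups

def reduce_step(groups):
    """Reduce: summarise each key's group independently (here, a sum)."""
    return {key: sum(values) for key, values in groups.items()}

# Two hypothetical "worker nodes", each holding its own local lines.
chunks = [["big data big cluster"], ["data node"]]
word_counts = reduce_step(shuffle_step([map_step(c) for c in chunks]))
print(word_counts)
```

In a real cluster each step runs on many machines at once; here the list of chunks only stands in for data that would be spread across worker nodes.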

10. _____________ has a record reader that translates each record in an input file and sends the parsed data to the mapper in the form of key-value pairs.

[A]   Map
[B]   Input Phase
[C]   Reducer
[D] None of the above

11. _____________ is a data warehouse infrastructure tool to process structured data in Hadoop.

[A] Hive
[B] Pig
[C] Both of the above
[D] None of the above

12. ___________ stores the schema in a database and the processed data in HDFS.

[A] Hive
[B] Pig
[C] Both of the above
[D] None of the above

13. ___________ is designed for OLAP.

[A] Hive
[B] Pig
[C] Both of the above
[D] None of the above

14. ___________ provides an SQL-like query language called HiveQL or HQL.

[A] Hive
[B] Pig
[C] Both of the above
[D] None of the above

15. ___________ is not a relational database.

[A] Hive
[B] Pig
[C] Both of the above
[D] None of the above

16. ___________ is not designed for OLTP.

[A] Hive
[B] Pig
[C] Both of the above
[D] None of the above

17. ___________ is not a language for real-time queries and row-level updates.

[A] Hive
[B] Pig
[C] Both of the above
[D] None of the above

18. ___________ is data warehouse infrastructure software that enables interaction between the user and HDFS.

[A] Hive
[B] Pig
[C] Both of the above
[D] None of the above

19. ___________ is similar to SQL for querying schema information in the metastore.

[A] Hive
[B] Python
[C] Both of the above
[D] None of the above

20. __________ uses lazy evaluation.

[A]   SQL
[B]   Pig
[C]   MapReduce
[D] None of the above
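
Lazy evaluation means that defining the steps of a pipeline does no work until some output is requested. The sketch below shows the idea with Python generators. It is only an analogy under stated assumptions, not how any Hadoop tool is implemented; the names `load`, `keep_large`, and `executed` are hypothetical.

```python
executed = []  # records which steps have actually run

def load(rows):
    """Yield rows one at a time, noting each time a row is really read."""
    for row in rows:
        executed.append("load")
        yield row

def keep_large(rows, threshold):
    """Pass through only the rows above the threshold."""
    for row in rows:
        executed.append("filter")
        if row > threshold:
            yield row

# Building the pipeline only describes the work; nothing runs yet.
pipeline = keep_large(load([5, 50, 500]), threshold=10)
steps_before_output = len(executed)

# Requesting the output is what triggers every step.
result = list(pipeline)
print(steps_before_output, result)
```

The same pattern applies to a data flow script: statements build up a plan, and execution starts only when a result is dumped or stored.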

21. __________ uses extract, transform, and load (ETL).

[A] Pig
[B] SQL
[C] Both of the above
[D] None of the above

22. __________ is able to store data at any point during a pipeline.

[A] Pig
[B] SQL
[C] Both of the above
[D] None of the above

23. __________ declares execution plans.

[A] Pig
[B] SQL
[C] Both of the above
[D] None of the above

24. __________ supports pipeline splits, thus allowing workflows to proceed along DAGs instead of strictly sequential pipelines.

[A] Pig
[B] SQL
[C] Both of the above
[D] None of the above

25. __________ is too low-level and rigid, and leads to a great deal of custom user code that is hard to maintain and reuse.

[A] MapReduce
[B] Pig
[C] SQL
[D] None of the above

26. __________ is a high-level programming language useful for analyzing large data sets.

[A] Pig
[B] MapReduce
[C] Both of the above
[D] None of the above

27. ___________ is a query language based on SQL.

[A]   Pig
[B]   MapReduce
[C]   HiveQL
[D] None of the above

28. ___________ is a data flow scripting language.

[A]   MapReduce
[B]   HiveQL
[C]   Pig
[D] None of the above

29. Joins are simple to achieve in _________.

[A] Pig
[B] MapReduce
[C] Both of the above
[D] None of the above

30. In ____________, performing joins between data sets is very difficult.

[A]   HiveQL
[B]   MapReduce
[C]   Pig
[D] None of the above
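
To see why a join written directly as map and reduce functions takes real effort, here is a minimal sketch of a reduce-side join in plain Python. It is a single-process illustration, not Hadoop code; the sample `users` and `orders` data and all function names are assumptions.

```python
from collections import defaultdict

# Hypothetical sample data: (user_id, name) and (user_id, amount).
users = [(1, "Asha"), (2, "Ravi")]
orders = [(1, 250), (1, 40), (2, 75)]

def map_tagged(records, tag):
    """Map: emit (join_key, (tag, value)) so the reducer can tell sources apart."""
    return [(key, (tag, value)) for key, value in records]

def shuffle(pairs):
    """Shuffle: bring every tagged value for one join key together."""
    groups = defaultdict(list)
    for key, tagged in pairs:
        groups[key].append(tagged)
    return groups

def reduce_join(groups):
    """Reduce: per key, pair each user name with each order amount."""
    joined = []
    for key in sorted(groups):
        names = [v for tag, v in groups[key] if tag == "user"]
        amounts = [v for tag, v in groups[key] if tag == "order"]
        joined.extend((key, name, amount) for name in names for amount in amounts)
    return joined

tagged = map_tagged(users, "user") + map_tagged(orders, "order")
joined_rows = reduce_join(shuffle(tagged))
print(joined_rows)
```

Tagging records by source, grouping them, and pairing them up all have to be written by hand here, whereas a higher-level language expresses the same join in a single statement.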

Aug 6, 2024
