Sample MCQs
Please select the most appropriate answer and write it on the answer booklet.
1. In SQL, a -------------------- subquery always returns two or more columns and a single row.
a. row b. scalar c. table d. none of the above
3. The scaling out approach to increasing the capacity of relational DBMSs has the following advantages over
the scaling up approach: -------------------.
a. Data will be resilient b. More concurrency c. More processing & storage capacity for the same cost d. none of the above
4. Spearman’s ρ is equivalent to Pearson’s r measure of correlation between two continuous variables but
applies to -------------------- data.
a. categorical b. ordinal c. interval d. ratio
6. The ---------- stage of the MapReduce programming pattern operates on the partial results produced by the
preceding stage.
a. reduce b. map c. design d. none of the above
7. In contrast with operational systems, data warehousing systems – where responsiveness of large queries is
the critical design factor – use the ------------------------------- data model.
a. un-normalized multidimensional model b. normalized multidimensional model c. the un-normalized single dimensional d. all of the above
8. Additive measures in a data cube are numeric values associated with facts that -------------.
a. can be meaningfully combined along any dimension b. cannot be meaningfully combined along arbitrary dimensions c. cannot be meaningfully combined along any dimensions d. none of the above
10. Relying on Euclidean distance to determine document similarity is -------------------- especially in very high
dimensionality.
a. very unreliable b. very reliable c. the preferred approach d. none of the above
Question Two:
Please select the best and most appropriate answer and write it on the answer booklet.
1. All SQL transaction isolation levels guarantee that the _____________ problem will never occur.
a. Lost update b. Dirty read c. Non-repeatable read d. none of the above
2. In SQL, a _________ subquery always returns a single column and a single row, that is, a single
value.
a. row b. table c. scalar d. none of the above
3. According to Rob Kitchin's characterization of data, the structure of data can be: ____________.
a. structured b. semi-structured c. unstructured d. any of the above
4. A non-empty cell in a cube is called a fact and can contain several ________________
a. cubes b. tables c. measures d. relations
5. Spreadsheets are easily shared, possibly outside the business for which they were created, which
creates a problem of data ____________.
a. cleansing b. sharing c. control d. acquisition
6. Replication is mostly concerned with data resilience whereas ________ is concerned with
capacity and performance.
a. sharding b. centralisation c. the user interface d. none of the above
8. Classification tasks like identifying email spam and classifying credit applications, which
use labelled training data to try to classify unseen instances, are known as
_______________ tasks.
a. regression b. unsupervised learning c. supervised learning d. clustering
9. A relation is in First Normal Form (1NF) if each attribute contains only ______ values,
that is, it has no repeating groups of values.
a. atomic b. numerical c. string d. none of the above
10. Which of the following descriptive statistical techniques requires no training data?
a. classification b. clustering c. KNN d. all of the above
Question Four:
Please select the best and most appropriate answer and write it on the answer booklet.
1. Which data quality problem is being exhibited by a data set that contains dates using
different calendars in the same column?
a. lack of completeness b. lack of uniformity c. lack of validity d. lack of accuracy
2. In SQL, a _________ subquery always returns a single column and a single row, that is, a single
value.
a. row b. table c. scalar d. none of the above
3. The primary responsibilities for processing personal data include: __________________.
a. identity b. accuracy c. security d. all of the above
4. A non-empty cell in a cube is called a fact and can contain several ________________
a. cubes b. tables c. measures d. relations
7. __________ is concerned with segmenting a diverse group of data into a number of similar
subgroups.
a. clustering b. correlation c. combination d. regression
8. __________ is part of data integration and is concerned with ensuring domain consistency.
10. Which of the following descriptive statistical techniques requires no training data?
a. list b. dict c. dataframe d. none of the above