Question 01
What is data science about?
Data science is fundamentally about extracting meaningful insights from large data sets (B) to support smarter decisions; in its broadest sense it also covers using any data to make better decisions (E). The remaining options are either too narrow (statistics only, visualization tools) or restrict the field to small datasets. Data science typically deals with large, complex data and aims to create value from it.
✓ Correct answers: B & E
Question 02
Which of the following is/are NOT task(s) included in using data?
The core lifecycle tasks for using data are Collect → Store → Manage → Analyze → Use. Sampling (D) is a statistical technique applied within analysis, not a top-level data task. Serialization (E) is a programming/storage encoding concept (for example, JSON or Python's pickle format), not a standard data workflow step. Both are real operations, but neither belongs to the canonical task taxonomy taught in this course.
✓ Correct answers: D & E
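As a side illustration of why serialization (E) is an encoding detail rather than a data task, here is a minimal sketch using Python's standard json module. The record and its field names are hypothetical, chosen only for illustration.

```python
import json

# Hypothetical record; the field names are assumptions for this example.
record = {"sensor_id": "s-01", "reading": 21.5}

# Serialization: turn the in-memory object into a text encoding.
encoded = json.dumps(record)

# Deserialization: rebuild an equivalent object from that text.
decoded = json.loads(encoded)
```

The round trip changes how the data is represented, not what is collected, stored, managed, analyzed, or used.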
Question 03
Which of the following describes "Machine-generated" data?
Machine-generated data comes directly from automated systems without direct human authorship. Software logs, hardware device outputs (B), and IoT sensor readings (E) are classic examples — they're produced continuously and automatically. Social media, emails, and messages are created by humans (even if delivered digitally), so they're classified as human-generated data.
✓ Correct answers: B & E
Question 04
Which of the following is an example of unstructured data?
Unstructured data lacks a predefined schema or model. BLOBs (Binary Large OBjects) stored in a database (D) — such as images, PDFs, videos, or raw text — have no inherent row/column structure. All other options (relational tables, banking records, EHRs, DBMS-managed datasets) have well-defined schemas that make them structured or semi-structured data.
✓ Correct answer: D
Question 05
What are the 5 V's of Big Data?
The canonical 5 V's framework characterises Big Data challenges: Volume (scale of data), Velocity (speed of generation), Variety (different types/formats), Veracity (trustworthiness/quality of data), and Value (the business worth extracted). Words like "Versatility" or "Validity" don't belong to the standard framework — watch out for these distractors.
✓ Correct answer: A
Question 06
Which of the following is a task of Prescriptive Analytics?
Prescriptive analytics answers "What should we do?": it recommends optimal actions. Linear programming (D) is a classic optimisation technique used to prescribe best decisions. Graph-theoretic computations (C), such as shortest path or network flow, are used to prescribe routing or resource allocation decisions. Linear regression and hypothesis testing are predictive/descriptive, and sequence rule mining is predictive/pattern-discovery.
✓ Correct answers: C & D
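A minimal sketch of a graph-theoretic computation used prescriptively, assuming a toy delivery network. The node names and edge costs are made up, and Dijkstra's algorithm is just one common shortest-path technique, not necessarily the one the course uses.

```python
import heapq

def shortest_path_cost(graph, source, target):
    """Return the cheapest total cost from source to target, or None if unreachable."""
    dist = {source: 0}
    heap = [(0, source)]
    while heap:
        d, node = heapq.heappop(heap)
        if node == target:
            return d
        if d > dist.get(node, float("inf")):
            continue  # stale queue entry; a cheaper route was already found
        for neighbor, weight in graph.get(node, {}).items():
            nd = d + weight
            if nd < dist.get(neighbor, float("inf")):
                dist[neighbor] = nd
                heapq.heappush(heap, (nd, neighbor))
    return None

# Hypothetical network: edge weights are travel costs.
routes = {
    "depot": {"a": 4, "b": 1},
    "b": {"a": 2, "customer": 7},
    "a": {"customer": 3},
}

best = shortest_path_cost(routes, "depot", "customer")
```

Here the cheapest route is depot → b → a → customer with a cost of 6, which is the kind of recommended action prescriptive analytics produces.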
Question 07
What does association rule mining aim to discover?
Association rule mining (e.g., the Apriori algorithm) finds co-occurrence relationships between items, that is, things that tend to appear together. The classic example: "customers who buy bread and butter also tend to buy milk." This is expressed as a rule such as {bread, butter} → {milk} and evaluated with metrics like support (the fraction of transactions that contain every item in the rule) and confidence (the fraction of transactions containing the left-hand side that also contain the right-hand side). It doesn't predict sequences (D, which is sequence mining) or cluster points (E, which is clustering).
✓ Correct answer: C
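To make the support and confidence metrics from this question concrete, here is a minimal sketch that computes them for the rule {bread, butter} → {milk}. The five transactions are hypothetical toy data, and the function names are illustrative, not taken from any library.

```python
# Hypothetical toy transactions (one set of items per basket).
transactions = [
    {"bread", "butter", "milk"},
    {"bread", "butter"},
    {"bread", "milk"},
    {"butter", "milk"},
    {"bread", "butter", "milk"},
]

def support(itemset, transactions):
    """Fraction of transactions that contain every item in itemset."""
    itemset = set(itemset)
    return sum(itemset <= t for t in transactions) / len(transactions)

def confidence(antecedent, consequent, transactions):
    """Support of the full rule divided by support of the antecedent."""
    both = set(antecedent) | set(consequent)
    return support(both, transactions) / support(antecedent, transactions)

rule_support = support({"bread", "butter", "milk"}, transactions)
rule_confidence = confidence({"bread", "butter"}, {"milk"}, transactions)
```

With this data, two of the five baskets contain all three items (support 0.4), and two of the three baskets containing bread and butter also contain milk (confidence about 0.67). Algorithms such as Apriori add an efficient search over candidate itemsets, which this sketch does not attempt.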
Question 08
What is the main goal of descriptive analytics?
The four analytics types each answer a different question: Descriptive → "What happened?" (D), Diagnostic → "Why did it happen?" (A), Predictive → "What will happen?" (B), Prescriptive → "What should we do?" (C). Descriptive analytics uses tools like dashboards, reports, and summary statistics to characterise historical data — it's the foundation for all other analytics types.
✓ Correct answer: D
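As a small illustration of the "What happened?" question, the sketch below summarises a week of historical values with Python's standard statistics module. The daily_sales figures are invented for the example.

```python
import statistics

# Hypothetical historical data: units sold on each of seven days.
daily_sales = [120, 135, 128, 150, 142, 138, 160]

# Summary statistics characterise the past; they predict and recommend nothing.
summary = {
    "count": len(daily_sales),
    "mean": statistics.mean(daily_sales),
    "median": statistics.median(daily_sales),
    "min": min(daily_sales),
    "max": max(daily_sales),
    "stdev": statistics.stdev(daily_sales),
}
```

A dashboard or report would typically present figures like these; explaining why they changed or forecasting next week belongs to the diagnostic and predictive types.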
Question 09
What type of data is generated from social media interactions?
Social media content (posts, comments, likes, photos, videos) is created by humans (B), making it human-generated. It is also predominantly unstructured (D) — free-form text, images, and multimedia that don't fit neatly into rows and columns. While social platforms store some structured metadata (timestamps, user IDs), the content itself is unstructured. It is not machine-generated (A), which would imply automated/sensor origin.
✓ Correct answers: B & D
Question 10
Which of the following questions is NOT an example of asking good questions from the data?
Good data science questions are framed around what you want to discover or achieve. Questions A, B, C, and E are all goal-oriented: finding patterns, understanding user needs, sourcing the right data, or grouping data. Option D — "How to ignore irrelevant data?" — is a technical preprocessing step, not a question that guides the analytics process. Good questions should drive what you're investigating, not pre-emptively filter what you'll look at.
✓ Correct answer: D