King's College London

Introduction to Data Science

Week 1 — Multiple Choice Quiz  ·  Dr Grigorios Loukides

Questions: 10
Type: Multiple select & single answer
Topic: Data Science Fundamentals
Question 01
What is data science about?
A. Analyzing small data sets to improve personal decision-making
B. Extraction of useful information and knowledge from large volumes of data to improve business decision-making
C. Collecting data for statistical analysis only
D. Creating data visualization tools
E. Using data to make informed decisions
Data science is fundamentally about extracting useful information and knowledge from large volumes of data (B) to support better decisions; at its broadest, it also covers using any data to make informed decisions (E). Options A, C, and D are too narrow: A restricts the field to small data sets and personal decisions, C to statistical analysis alone, and D to building visualization tools. Data science typically deals with large, complex data and aims to create value from it.
✓ Correct answers: B & E
Question 02
Which of the following is/are NOT task(s) included in using data?
A. Collect
B. Store
C. Manage
D. Sample
E. Serialize
The core lifecycle tasks for using data are Collect → Store → Manage → Analyze → Use. Sampling (D) is a statistical technique applied within analysis, not a top-level data task. Serialization (E) is a programming/storage encoding concept (like JSON or Pickle), not a standard data workflow step. Both are real operations but don't belong to the canonical task taxonomy taught in this course.
✓ Correct answers: D & E
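To make the point about serialization concrete, here is a minimal sketch of a JSON round trip using Python's standard `json` module. The record below is hypothetical and invented purely for illustration; it shows that serialization is an encoding step applied to data that already exists, which is why it sits outside the Collect, Store, Manage, Analyze, Use lifecycle.

```python
import json

# Hypothetical record, invented for illustration only.
record = {"sensor_id": 7, "reading": 21.5, "tags": ["lab", "week1"]}

# Serialize: Python object -> JSON text (an encoding step, not a lifecycle task).
encoded = json.dumps(record, sort_keys=True)

# Deserialize: JSON text -> Python object.
decoded = json.loads(encoded)

print(encoded)
print(decoded == record)  # the round trip preserves the content
```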
Question 03
Which of the following describes "Machine-generated" data?
A. Data generated from human interactions with systems
B. Data generated from software systems and hardware devices
C. Data generated from social media interactions
D. Data generated from emails and messages
E. Data generated from sensors
Machine-generated data comes directly from automated systems without direct human authorship. Software logs, hardware device outputs (B), and IoT sensor readings (E) are classic examples — they're produced continuously and automatically. Social media, emails, and messages are created by humans (even if delivered digitally), so they're classified as human-generated data.
✓ Correct answers: B & E
Question 04
Which of the following is an example of unstructured data?
A. Relational database tables
B. Banking transactions
C. Electronic health records
D. Textual or binary data stored as BLOBs in a DBMS
E. Structured datasets managed using a DBMS
Unstructured data lacks a predefined schema or model. BLOBs (Binary Large OBjects) stored in a database (D) — such as images, PDFs, videos, or raw text — have no inherent row/column structure. All other options (relational tables, banking records, EHRs, DBMS-managed datasets) have well-defined schemas that make them structured or semi-structured data.
✓ Correct answer: D
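As a rough illustration of the BLOB idea, the sketch below stores arbitrary bytes in an in-memory SQLite database using Python's standard `sqlite3` module. The table name, column names, file name, and byte values are all assumptions made up for this example. The DBMS knows the row's structure (id, name, payload), but the payload itself is an opaque byte string with no row/column structure of its own.

```python
import sqlite3

# In-memory database; table and column names are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE documents (id INTEGER PRIMARY KEY, name TEXT, payload BLOB)"
)

# Arbitrary bytes standing in for an image, PDF, or video file.
raw_bytes = bytes([0x89, 0x50, 0x4E, 0x47, 0x00, 0xFF])
conn.execute(
    "INSERT INTO documents (name, payload) VALUES (?, ?)",
    ("scan.bin", raw_bytes),
)

# The DBMS returns the payload as an uninterpreted byte string.
stored_name, stored_payload = conn.execute(
    "SELECT name, payload FROM documents"
).fetchone()
conn.close()

print(stored_name, len(stored_payload))
```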
Question 05
What are the 5 V's of Big Data?
A. Volume, Velocity, Variety, Veracity, Value
B. Volume, Versatility, Variety, Validity, Value
C. Velocity, Versatility, Veracity, Value, Variety
D. Volume, Versatility, Validity, Value, Veracity
E. Volume, Velocity, Versatility, Variety, Value
The canonical 5 V's framework characterises Big Data challenges: Volume (scale of data), Velocity (speed of generation), Variety (different types/formats), Veracity (trustworthiness/quality of data), and Value (the business worth extracted). Words like "Versatility" or "Validity" don't belong to the standard framework — watch out for these distractors.
✓ Correct answer: A
Question 06
Which of the following is/are task(s) of Prescriptive Analytics?
A. Linear Regression
B. Statistical hypothesis testing
C. Graph-theoretic computations
D. Linear programming
E. Sequence rule mining
Prescriptive analytics answers "What should we do?" by recommending optimal actions. Linear programming (D) is a classic optimisation technique for prescribing the best decision under constraints. Graph-theoretic computations (C), such as shortest path or network flow, prescribe routing or resource-allocation decisions. Linear regression (A) and statistical hypothesis testing (B) are predictive or descriptive techniques, and sequence rule mining (E) is a predictive, pattern-discovery technique.
✓ Correct answers: C & D
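To show what "prescribing" a decision looks like, here is a small sketch of a two-variable linear program solved in pure Python. The objective and constraints are a textbook-style example chosen for illustration, not taken from the quiz. The sketch enumerates the corner points of the feasible region and picks the best one, which works for tiny bounded problems; real solvers use methods such as simplex instead.

```python
from fractions import Fraction
from itertools import combinations

# Each constraint is (a, b, c), meaning a*x + b*y <= c.
# Illustrative numbers only.
constraints = [
    (1, 0, 4),    # x <= 4
    (0, 2, 12),   # 2y <= 12
    (3, 2, 18),   # 3x + 2y <= 18
    (-1, 0, 0),   # x >= 0
    (0, -1, 0),   # y >= 0
]


def objective(x, y):
    """Quantity to maximise: 3x + 5y."""
    return 3 * x + 5 * y


def is_feasible(x, y):
    """True when the point satisfies every constraint."""
    return all(a * x + b * y <= c for a, b, c in constraints)


# An optimum of a bounded LP lies at a vertex, i.e. where two
# constraint boundaries intersect and all constraints still hold.
vertices = []
for (a1, b1, c1), (a2, b2, c2) in combinations(constraints, 2):
    det = a1 * b2 - a2 * b1
    if det == 0:
        continue  # parallel boundaries never intersect
    x = Fraction(c1 * b2 - c2 * b1, det)  # Cramer's rule
    y = Fraction(a1 * c2 - a2 * c1, det)
    if is_feasible(x, y):
        vertices.append((x, y))

best_point = max(vertices, key=lambda p: objective(*p))
best_value = objective(*best_point)
print(best_point, best_value)
```

For these numbers the recommended decision is x = 2, y = 6, giving an objective value of 36.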
Question 07
What does association rule mining aim to discover?
A. Predictive models for future trends
B. Statistical significance in datasets
C. Relationships between items in a dataset
D. Sequence patterns in data
E. Clustering of similar data points
Association rule mining (e.g., the Apriori algorithm) finds co-occurrence relationships between items — things that tend to appear together. The classic example: "customers who buy bread and butter also tend to buy milk." This is expressed as rules like bread, butter → milk, with metrics like support and confidence. It doesn't predict sequences (D, which is sequence mining) or cluster points (E, which is clustering).
✓ Correct answer: C
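The support and confidence metrics mentioned above can be sketched in a few lines of Python. The five transactions below are made up for illustration, and this is not the Apriori algorithm itself: it only evaluates the single rule {bread, butter} → {milk} rather than searching for frequent itemsets.

```python
from fractions import Fraction

# Hypothetical market-basket data, invented for illustration.
transactions = [
    {"bread", "butter", "milk"},
    {"bread", "butter"},
    {"bread", "milk"},
    {"butter", "milk"},
    {"bread", "butter", "milk"},
]


def support(itemset):
    """Fraction of transactions that contain every item in itemset."""
    hits = sum(1 for t in transactions if itemset <= t)
    return Fraction(hits, len(transactions))


antecedent = {"bread", "butter"}
consequent = {"milk"}

# support(rule) = support(antecedent and consequent together)
rule_support = support(antecedent | consequent)

# confidence(rule) = support(antecedent and consequent) / support(antecedent)
rule_confidence = rule_support / support(antecedent)

print(rule_support, rule_confidence)
```

Here the rule has support 2/5 (two of the five baskets contain all three items) and confidence 2/3 (of the three baskets with bread and butter, two also contain milk).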
Question 08
What is the main goal of descriptive analytics?
A. To answer why something happened
B. To predict what will happen in the future
C. To determine what actions to take
D. To summarize and describe past data
E. To optimize data processing techniques
The four analytics types each answer a different question: Descriptive → "What happened?" (D), Diagnostic → "Why did it happen?" (A), Predictive → "What will happen?" (B), Prescriptive → "What should we do?" (C). Descriptive analytics uses tools like dashboards, reports, and summary statistics to characterise historical data — it's the foundation for all other analytics types.
✓ Correct answer: D
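As a small illustration of the summary statistics that descriptive analytics relies on, the sketch below uses Python's standard `statistics` module. The daily sales figures are hypothetical. Note that every value it reports describes what already happened; nothing here predicts or recommends.

```python
import statistics

# Hypothetical daily sales for one working week, invented for illustration.
daily_sales = [120, 135, 150, 110, 135]

# Descriptive analytics: summarise the historical data.
summary = {
    "count": len(daily_sales),
    "min": min(daily_sales),
    "max": max(daily_sales),
    "mean": statistics.mean(daily_sales),
    "median": statistics.median(daily_sales),
    "mode": statistics.mode(daily_sales),
}

for name, value in summary.items():
    print(f"{name}: {value}")
```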
Question 09
What type of data is generated from social media interactions?
A. Machine-generated data
B. Human-generated data
C. Structured data
D. Unstructured data
E. Metadata
Social media content (posts, comments, likes, photos, videos) is created by humans (B), making it human-generated. It is also predominantly unstructured (D) — free-form text, images, and multimedia that don't fit neatly into rows and columns. While social platforms store some structured metadata (timestamps, user IDs), the content itself is unstructured. It is not machine-generated (A), which would imply automated/sensor origin.
✓ Correct answers: B & D
Question 10
Which of the following questions is NOT an example of asking good questions from the data?
A. What patterns can you learn from a given dataset?
B. What do people really want to know?
C. What datasets might get you to your answers?
D. How to ignore irrelevant data?
E. How to group similar data points together?
Good data science questions are framed around what you want to discover or achieve. Questions A, B, C, and E are all goal-oriented: finding patterns, understanding user needs, sourcing the right data, or grouping data. Option D — "How to ignore irrelevant data?" — is a technical preprocessing step, not a question that guides the analytics process. Good questions should drive what you're investigating, not pre-emptively filter what you'll look at.
✓ Correct answer: D