What is Nominal data?

Nominal data has categories with no natural order. Examples include blood type (A, B, O), favorite color, or city names. You can count occurrences and find the mode, but you cannot calculate a mean or arrange the categories in a meaningful order.

What is Ordinal data?

Ordinal data has categories that can be ranked or ordered. Examples include satisfaction ratings (Poor to Excellent), education levels, or T-shirt sizes. The order matters, but gaps between categories aren't necessarily equal.

What is Discrete data?

Discrete data consists of countable whole numbers with gaps between possible values. Examples include number of students (can't have 32.5 students), goals scored, or number of siblings. It is obtained by counting, not measuring.

What is Continuous data?

Continuous data can take any value, including decimals, within a range. Examples include height (165.7 cm), weight (58.5 kg), or temperature (28.3°C). It is obtained by measuring, with infinitely many possible values between any two points.

Why are phone numbers considered Nominal data?

Phone numbers use digits but serve as labels, not quantities. You can't add phone numbers or calculate their average meaningfully. They identify people, not measure anything. Similarly, PIN codes and jersey numbers are nominal despite containing digits.

What is data processing?

Data processing transforms raw data into a usable format through cleaning (fixing errors), handling missing values, standardizing formats (Male/M/male → Male), removing duplicates, and validating accuracy. Clean data is essential for accurate AI models.

What is one-hot encoding in AI?

One-hot encoding converts categorical data into numerical format for AI. Each category becomes a separate column with 0 or 1 values. For colors (Red, Blue), it creates Is_Red and Is_Blue columns. This allows AI to process categorical data mathematically.

Types of Data in AI: How to Acquire, Process and Interpret Data (Qualitative vs Quantitative for Class 9)

Q: What is the difference between Qualitative and Quantitative data?

Qualitative data describes categories or qualities (colors, names, opinions) and answers 'what kind?' Quantitative data describes quantities with numbers (age, height, temperature) and answers 'how many?' or 'how much?' Different analysis methods apply to each.

What Will You Learn?

By the end of this lesson, you will be able to:

Distinguish between qualitative and quantitative data
Identify different subtypes of data (nominal, ordinal, discrete, continuous)
Understand how to acquire data from various sources
Learn methods to process and clean raw data
Interpret data correctly for AI applications

Imagine you’re conducting a survey about your school’s canteen. You might ask questions like:

“How would you rate the food quality?” (Excellent, Good, Average, Poor)
“What’s your favorite dish?” (Samosa, Sandwich, Biryani, Pasta)
“How many times do you eat at the canteen per week?” (1, 2, 3, 4, 5)
“How much do you spend on average?” (₹20, ₹35, ₹50, ₹75)

Notice that some answers are categories (words), while others are numbers. Some numbers can only be whole values, while others could be decimals. These differences matter a lot in AI.

Understanding types of data is like understanding different ingredients in cooking. You wouldn’t treat salt the same as flour just because both are white, right? Similarly, AI treats different types of data differently.

Let’s explore this fascinating world of data types.

The Two Main Types of Data

All data falls into one of two broad categories:

Type	Also Called	What It Is	Example
Qualitative	Categorical	Describes qualities or categories of the data	Colors, names, opinions
Quantitative	Numerical	Describes quantities with numbers	Age, height, temperature

Think of it this way:

Qualitative answers “What kind?” or “Which category?”
Quantitative answers “How many?” or “How much?”

💡 Key Insight

The type of data determines what analysis methods you can use. You can calculate the average height (quantitative), but you can’t calculate the “average color” (qualitative).

Qualitative Data: Categories and Qualities

Qualitative data describes characteristics that cannot be measured with numbers. It categorizes or labels items, focusing on attributes, perceptions, and descriptive qualities that help you understand the nature of what you’re studying. This type of data often captures opinions, behaviors, or categories that add context to numerical findings. It is especially useful when you want to explore motivations, patterns, or meanings behind an outcome.

Types of Qualitative Data

1. Nominal Data

What it is: Categories with no natural order or ranking.

Characteristics:

Just labels or names
No category is “higher” or “better” than another
Cannot be arranged in meaningful order

Examples:

Category	Possible Values
Gender	Male, Female, Other
Blood type	A, B, AB, O
Favorite color	Red, Blue, Green, Yellow
City of residence	Delhi, Mumbai, Chennai, Kolkata
Programming language	Python, Java, C++, JavaScript

What you CAN do: Count occurrences, find the mode (most common)
What you CANNOT do: Calculate mean, find median, arrange in order
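
The two operations that do work on nominal data, counting and finding the mode, can be sketched in a few lines of Python. This is only an illustration: the list of blood types below is made up, and `collections.Counter` is just one convenient way to do the counting.

```python
from collections import Counter

# Hypothetical survey answers: blood types are nominal labels.
blood_types = ["A", "O", "B", "O", "AB", "O", "A"]

# Count how many times each category appears.
counts = Counter(blood_types)

# The mode is the category with the highest count.
mode, mode_count = counts.most_common(1)[0]

print(counts)
print(mode, mode_count)  # O 3
```

Notice that nothing here adds or averages the labels. The code only counts them, which is all nominal data allows.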

2. Ordinal Data

What it is: Categories that have a natural order or ranking.

Characteristics:

Categories can be ranked (1st, 2nd, 3rd…)
The gaps between ranks may not be equal
Order matters, but differences can’t be measured precisely

Examples:

Category	Ordered Values
Education level	Primary < Secondary < Graduate < Postgraduate
Customer satisfaction	Very Dissatisfied < Dissatisfied < Neutral < Satisfied < Very Satisfied
T-shirt size	XS < S < M < L < XL < XXL
Star rating	⭐ < ⭐⭐ < ⭐⭐⭐ < ⭐⭐⭐⭐ < ⭐⭐⭐⭐⭐
Spice level	Mild < Medium < Hot < Extra Hot

What you CAN do: Count, find mode, find median, compare rankings
What you CANNOT do: Calculate mean (the “average” of Good and Excellent isn’t meaningful)

🧪 Think About It

Is the difference between “Satisfied” and “Very Satisfied” the same as between “Neutral” and “Satisfied”? We don’t know — that’s why ordinal data is tricky!
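
To see why the median works for ordinal data, here is a small Python sketch. It assumes the spice-level order from the table above and a made-up list of five responses. The key idea is to sort by position in the ranking, not alphabetically.

```python
# Category order, lowest to highest (from the spice-level example).
SPICE_ORDER = ["Mild", "Medium", "Hot", "Extra Hot"]

# Hypothetical responses from five people.
responses = ["Hot", "Mild", "Medium", "Hot", "Extra Hot"]

# Sort by rank in SPICE_ORDER, not by spelling.
ranked = sorted(responses, key=SPICE_ORDER.index)

# With an odd number of responses, the median is the middle item.
median = ranked[len(ranked) // 2]

print(ranked)
print(median)  # Hot
```

The median only needs the order of the categories. It never assumes that the gap between "Mild" and "Medium" equals the gap between "Hot" and "Extra Hot".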

Quantitative Data: Numbers and Measurements

Quantitative data represents quantities that can be measured and expressed as numbers.

Types of Quantitative Data

1. Discrete Data

What it is: Countable values, usually whole numbers.

Characteristics:

Can only take specific, separate values
Usually obtained by counting
Cannot have fractions or decimals (in context)
There are gaps between possible values

Examples:

Measurement	Why It’s Discrete
Number of students in class	Can’t have 32.5 students
Number of cars in parking	Can’t have 2.7 cars
Goals scored in a match	Can’t score 3.5 goals
Number of siblings	Can’t have 1.5 siblings
Eggs in a basket	Can’t have half an egg (in counting)

What you CAN do: Count, mean, median, mode, range, all mathematical operations

2. Continuous Data

What it is: Measurable values that can take any value within a range.

Characteristics:

Can take any value (including decimals)
Usually obtained by measuring
Infinite possible values between any two points
No gaps between possible values

Examples:

Measurement	Why It’s Continuous
Height	Can be 165.7 cm, 165.73 cm, etc.
Weight	Can be 58.5 kg, 58.52 kg, etc.
Temperature	Can be 28.3°C, 28.35°C, etc.
Time taken	Can be 45.7 seconds
Distance	Can be 5.25 km

What you CAN do: All mathematical operations, precise measurements, detailed analysis

Quick Comparison: All Four Types

Type	Category	Order	Math Operations	Example
Nominal	Qualitative	No order	Count, mode only	Blood type (A, B, O)
Ordinal	Qualitative	Has order	Count, mode, median	Rating (Poor to Excellent)
Discrete	Quantitative	Has order	All operations	Number of children
Continuous	Quantitative	Has order	All operations	Height in cm

Visual Summary

                    DATA
                      │
          ┌───────────┴───────────┐
          │                       │
     QUALITATIVE             QUANTITATIVE
     (Categories)              (Numbers)
          │                       │
     ┌────┴────┐             ┌────┴────┐
     │         │             │         │
  NOMINAL   ORDINAL      DISCRETE  CONTINUOUS
  (No order) (Ordered)   (Counted)  (Measured)

Identifying Data Types: Practice

Let’s practice identifying data types:

Data Example	Type	Reasoning
PIN code: 110001	Nominal	Numbers used as labels, not for math
Rank in class: 1st, 2nd, 3rd	Ordinal	Ordered categories
Marks obtained: 78, 85, 92	Discrete	Countable, whole numbers
Body temperature: 98.6°F	Continuous	Measurable, can have decimals
Movie genre: Action, Comedy	Nominal	Categories, no order
Pain level: 1-10 scale	Ordinal	Ordered rating scale
Number of pages in book	Discrete	Countable
Time to complete task	Continuous	Measurable duration

⚠️ Common Confusion

Just because something is a number doesn’t make it quantitative! Phone numbers, PIN codes, and jersey numbers are actually nominal — they’re labels, not quantities. You wouldn’t calculate the “average phone number”!

Data Acquisition: Gathering Your Data

Now that we understand data types, let’s learn how to acquire (collect) data.

Primary Data Collection Methods

Method	Description	Best For	Data Types
Surveys/Questionnaires	Asking people questions	Opinions, preferences	All types
Observations	Watching and recording	Behavior, events	All types
Experiments	Controlled testing	Cause-effect relationships	Quantitative
Interviews	In-depth conversations	Detailed insights	Mostly qualitative
Sensors/Devices	Automatic measurement	Physical measurements	Continuous

Secondary Data Sources

Source	Examples	Considerations
Government portals	data.gov.in, census data	Reliable, may be outdated
Research databases	Kaggle, UCI Repository	Clean, documented
Company records	Sales data, HR records	Need permission
Published reports	Industry reports, studies	May have bias
Web scraping	Social media, websites	Legal and ethical concerns

Designing Good Survey Questions

Question Type	Example	Data Type Generated
Multiple choice (single)	“What is your gender?”	Nominal
Multiple choice (ranked)	“Rate your satisfaction (1-5)”	Ordinal
Numeric input	“How old are you?”	Discrete
Scale/slider	“Rate your pain level”	Ordinal/Continuous
Open-ended	“Describe your experience”	Qualitative (text)

Tips for good questions:

Be clear and specific
Avoid leading questions
Provide appropriate options
Consider the data type you need

The Three Strategies for Data Acquisition

Beyond knowing where to collect data (primary or secondary), it is equally important to understand how data acquisition happens in practice. The CBSE AI curriculum identifies three distinct strategies:

1. Discovery

Discovery means finding and using data that already exists. You search for datasets that have already been collected — in government portals, research repositories, company databases, or published studies — and use them directly for your AI project.

Example: An AI project to predict crop yield could use rainfall and temperature data already published by the India Meteorological Department, rather than setting up new weather stations.

When to use it: When sufficient data already exists and collecting new data would be expensive or time-consuming.

2. Augmentation

Augmentation means enhancing an existing dataset to make it larger, more balanced, or more representative. You take data that already exists and systematically expand it.

Example: If you only have 200 photos of diseased leaves for training an AI plant disease detector, you can augment the dataset by rotating, flipping, and slightly adjusting the brightness of each image — creating 2,000 training examples from the original 200.

When to use it: When your existing data is too small, unbalanced, or lacks variety — and collecting entirely new data is not feasible.

3. Generation

Generation means creating entirely new, artificial (synthetic) data that did not previously exist. This is especially useful when real data is scarce, sensitive, or impossible to collect.

Example: Medical AI systems often lack enough examples of rare diseases. Researchers can generate synthetic patient records that mimic real data patterns but contain no actual patient information — solving the privacy problem while increasing training data.

When to use it: When real data cannot be obtained due to privacy concerns, cost, rarity of events, or ethical restrictions.

Strategy	What You Do	Example
Discovery	Find and use existing data	Government agricultural databases
Augmentation	Expand and enhance existing data	Image flipping and rotation
Generation	Create synthetic data from scratch	Simulated patient records
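
The augmentation strategy can be sketched in plain Python. This is a toy illustration, not how real image libraries work: the "image" is a tiny grid of brightness values (0 to 255), and the helper names `flip_horizontal`, `rotate_90`, and `adjust_brightness` are invented for this example.

```python
def flip_horizontal(image):
    """Mirror each row left to right."""
    return [row[::-1] for row in image]

def rotate_90(image):
    """Rotate the grid 90 degrees clockwise."""
    return [list(row) for row in zip(*image[::-1])]

def adjust_brightness(image, delta):
    """Add delta to every pixel, keeping values within 0-255."""
    return [[max(0, min(255, p + delta)) for p in row] for row in image]

# A made-up 2x3 grayscale "image".
original = [[10, 20, 30],
            [40, 50, 60]]

# One original image becomes four training examples.
augmented = [
    original,
    flip_horizontal(original),
    rotate_90(original),
    adjust_brightness(original, 200),
]

print(len(augmented))  # 4
```

Each new version shows the same content in a slightly different way, which is how a small dataset can be expanded without collecting new photos.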

Data Processing: Cleaning and Preparing

Raw data is rarely ready for use. Data processing transforms raw data into a usable format.

Common Data Problems

Problem	Example	Solution
Missing values	Age: , 25, 30, , 28	Fill with average, remove, or flag
Inconsistent formats	“Male”, “M”, “male”, “MALE”	Standardize to one format
Outliers	Heights: 165, 170, 168, 950, 172	Investigate: error or genuine?
Duplicates	Same person entered twice	Remove duplicates
Wrong data types	Age stored as “Twenty-five”	Convert to number (25)
Invalid values	Age: -5 or 500	Validate against possible range
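
Two of the fixes in the table, standardizing formats and checking values against a possible range, can be sketched as follows. The mapping `GENDER_MAP` and the height limits are assumptions chosen for this example; a real project would pick them to suit its own data.

```python
# Assumed mapping from messy spellings to one standard value.
GENDER_MAP = {"male": "Male", "m": "Male"}

raw_genders = ["Male", "M", "male", "MALE"]

# Lowercase and trim each entry, then look up the standard form.
standardized = [GENDER_MAP.get(g.strip().lower(), g) for g in raw_genders]
print(standardized)  # ['Male', 'Male', 'Male', 'Male']

# Heights from the outlier example in the table.
heights_cm = [165, 170, 168, 950, 172]

# Assumed plausible range for a person's height in cm.
MIN_CM, MAX_CM = 50, 250

# Flag values outside the range so a person can investigate them.
suspicious = [h for h in heights_cm if not MIN_CM <= h <= MAX_CM]
print(suspicious)  # [950]
```

The code flags the suspicious value rather than deleting it, because deciding whether it is an error or a genuine measurement is a human judgment.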

Data Processing Steps

Step 1: COLLECT raw data
               ↓
Step 2: INSPECT for problems
               ↓
Step 3: CLEAN (fix errors, fill gaps)
               ↓
Step 4: TRANSFORM (convert formats, create categories)
               ↓
Step 5: VALIDATE (check everything is correct)
               ↓
Step 6: STORE in organized format

Example: Processing Survey Data

Raw data collection

Name	Age	City	Rating
Rahul	15	delhi	Good
Priya	sixteen	Delhi	EXCELLENT
Amit	14	Delhi	good
Rahul	15	delhi	Good
Neha		Mumbai	Average

Problems identified:

Inconsistent capitalization (delhi/Delhi, Good/good/EXCELLENT)
Age as text (“sixteen”)
Missing value (Neha’s age)
Duplicate entry (Rahul appears twice)

After processing:

Name	Age	City	Rating
Rahul	15	Delhi	Good
Priya	16	Delhi	Excellent
Amit	14	Delhi	Good
Neha	15*	Mumbai	Average

*Filled with average age
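
The cleaning shown above could be carried out with a short Python sketch like the one below. It is one possible approach under simple assumptions: the rows are stored as dictionaries, the helper `parse_age` and the lookup `WORD_TO_NUMBER` are invented for this example, and the lookup only covers the words this small table needs.

```python
raw_rows = [
    {"name": "Rahul", "age": "15", "city": "delhi", "rating": "Good"},
    {"name": "Priya", "age": "sixteen", "city": "Delhi", "rating": "EXCELLENT"},
    {"name": "Amit", "age": "14", "city": "Delhi", "rating": "good"},
    {"name": "Rahul", "age": "15", "city": "delhi", "rating": "Good"},
    {"name": "Neha", "age": "", "city": "Mumbai", "rating": "Average"},
]

# Tiny lookup for ages written as words (illustrative only).
WORD_TO_NUMBER = {"fourteen": 14, "fifteen": 15, "sixteen": 16}

def parse_age(text):
    """Return the age as an int, or None if it is missing or unknown."""
    text = text.strip().lower()
    if not text:
        return None
    if text.isdigit():
        return int(text)
    return WORD_TO_NUMBER.get(text)

cleaned, seen = [], set()
for row in raw_rows:
    record = {
        "name": row["name"].strip(),
        "age": parse_age(row["age"]),
        "city": row["city"].strip().title(),         # delhi -> Delhi
        "rating": row["rating"].strip().capitalize(),  # EXCELLENT -> Excellent
    }
    key = tuple(record.values())
    if key in seen:  # skip exact duplicates (after standardizing)
        continue
    seen.add(key)
    cleaned.append(record)

# Fill missing ages with the rounded average of the known ages.
known_ages = [r["age"] for r in cleaned if r["age"] is not None]
average_age = round(sum(known_ages) / len(known_ages))
for r in cleaned:
    if r["age"] is None:
        r["age"] = average_age

for r in cleaned:
    print(r)
```

Duplicates are detected only after standardizing, so that two rows differing only in capitalization would still count as the same record.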

Three Qualities of Well-Processed Data: Structure, Cleanliness, and Accuracy

Once data has been collected, processing it is not just about fixing obvious errors. Good data processing ensures three fundamental qualities that make data genuinely ready for AI training.

Structure

Structured data is organised in a consistent, machine-readable format. Every column has one type of value, every row represents one record, and the format is uniform throughout.

Example of unstructured raw response: “My age is fifteen and I live in Mumbai.”

After structuring: Age = 15, City = Mumbai (two separate, typed fields).

AI models cannot learn from disorganised text that mixes different kinds of information. Structure is the foundation.

Cleanliness

Clean data is free from errors, duplicates, missing values, and noise. A single dirty entry can skew the AI’s learning if not corrected.

Signs of unclean data: Missing values, inconsistent spelling (“Delhi” vs “delhi”), impossible values (age = 500), duplicate rows.

Cleanliness is not just about removing errors — it is about ensuring every data point truly represents reality.

Accuracy

Accurate data correctly reflects the real-world facts it is supposed to capture. Data can be structured and clean but still inaccurate — for example, if survey respondents answered dishonestly, or if sensors were incorrectly calibrated.

Example: A dataset of students’ study hours might be structured and clean, but if students overreported their hours, the data is inaccurate and will produce a misleading model.

Quality	Question to Ask	What Goes Wrong Without It
Structure	Is the data organised consistently?	AI cannot parse or learn from it
Cleanliness	Is the data free of errors and noise?	Errors bias the model
Accuracy	Does the data reflect reality?	Model learns wrong patterns

Independent and Dependent Features

When preparing data for an AI model, it is essential to understand the two types of features (columns/variables) in your dataset:

Independent features are the input variables: the pieces of information the AI uses to make a prediction. They are called independent because they are the starting conditions and are not determined by the output.

The dependent feature is the output variable: the thing the AI is trying to predict or classify. It is called dependent because its value depends on the pattern of the input features.

Example: Predicting Student Performance

Independent Features (Inputs)	Dependent Feature (Output)
Study hours per day	Final exam score
Attendance percentage
Assignment submission rate
Previous test scores

The AI is given the independent features and must learn to predict the dependent feature. Getting this separation right is critical: if you accidentally include the output as one of the inputs, the AI will appear to perform perfectly during training (because you gave it the answer) but will fail in the real world.

Another example — Disease Prediction:

Independent features: Age, blood pressure, cholesterol, weight, smoking habits
Dependent feature: Has heart disease (Yes/No)

💡 Key Insight

In machine learning, independent features are also called predictors or input variables, and the dependent feature is also called the label, target, or output variable. Understanding which is which is the first step in setting up any AI model correctly.
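
Separating inputs from the output might look like this in Python. The student records and column names below are made up for illustration. The names `X` and `y` follow a common machine learning convention for inputs and target.

```python
# Hypothetical student records; the numbers are invented.
records = [
    {"study_hours": 2.0, "attendance_pct": 85, "final_score": 72},
    {"study_hours": 3.5, "attendance_pct": 92, "final_score": 88},
    {"study_hours": 1.0, "attendance_pct": 70, "final_score": 55},
]

TARGET = "final_score"  # the dependent feature we want to predict

# X holds only the independent features (everything except the target).
X = [[value for key, value in r.items() if key != TARGET] for r in records]

# y holds the dependent feature for each record.
y = [r[TARGET] for r in records]

print(X)  # [[2.0, 85], [3.5, 92], [1.0, 70]]
print(y)  # [72, 88, 55]
```

Because the target column is filtered out when building `X`, the output cannot accidentally leak into the inputs.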

Data Interpretation: Making Sense of Data

Data interpretation is extracting meaning from processed data.

Textual, Tabular, and Graphical Interpretation

Textual interpretation means expressing findings in written sentences and paragraphs. It is best for summary conclusions, narrative explanations, or when the insight is qualitative.

Example: “The survey shows that most students prefer samosas, with biryani as a close second. Students from higher grades tend to spend more per visit.”

Tabular interpretation means organising data into rows and columns for precise comparison. Tables are best when you need to show exact numbers across multiple categories and allow the reader to look up specific values.

Example:

Dish	Votes	Percentage
Samosa	80	40%
Biryani	60	30%
Sandwich	40	20%
Pasta	20	10%
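
The percentage column in a table like this comes from dividing each count by the total. A minimal Python sketch, using the vote counts from the table above:

```python
votes = {"Samosa": 80, "Biryani": 60, "Sandwich": 40, "Pasta": 20}

total = sum(votes.values())  # 200 votes in all

# Percentage = (count / total) * 100
percentages = {dish: 100 * n / total for dish, n in votes.items()}

# Print the results as a simple aligned table.
for dish, n in votes.items():
    print(f"{dish:<10}{n:>5}{percentages[dish]:>6.0f}%")
```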

Graphical interpretation means representing data visually — through bar charts, pie charts, line graphs, scatter plots, and other visuals. Graphs are best when you want to show trends, comparisons, or distributions at a glance, without requiring the reader to study numbers closely.

Form	Best Used For	Limitation
Textual	Summaries, qualitative insights, narrative conclusions	Hard to compare many values
Tabular	Precise comparisons, looking up specific values	Takes effort to spot trends
Graphical	Trends, distributions, patterns at a glance	Less precise than a table

A well-prepared data report typically combines all three: a graph to show the pattern, a table for precise numbers, and text to explain what it means.

Interpreting Different Data Types

Data Type	Interpretation Methods
Nominal	Frequency counts, mode, percentage distribution
Ordinal	Median, percentiles, ranking analysis
Discrete	Mean, median, mode, frequency distribution
Continuous	Mean, median, standard deviation, range

Example: Interpreting Survey Results

Question: “How satisfied are you with school facilities?”

Results (200 students):

Rating	Count	Percentage
Very Dissatisfied	10	5%
Dissatisfied	30	15%
Neutral	40	20%
Satisfied	80	40%
Very Satisfied	40	20%

Interpretation:

Mode: Satisfied (most common response)
Median: Satisfied (middle value when ordered)
Positive responses: 60% (Satisfied + Very Satisfied)
Negative responses: 20% (Dissatisfied + Very Dissatisfied)
Insight: Most students are satisfied, but 20% are unhappy — worth investigating why.
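
The figures in this interpretation can be checked with a short Python sketch. It uses the counts from the results table and finds the median by walking up the ordered categories until half of the responses are covered. That simple rule is enough here, though a careful version would handle the case where the halfway point falls exactly between two categories.

```python
# Ratings in order from lowest to highest, with their counts.
ratings = [
    ("Very Dissatisfied", 10),
    ("Dissatisfied", 30),
    ("Neutral", 40),
    ("Satisfied", 80),
    ("Very Satisfied", 40),
]

total = sum(count for _, count in ratings)

# Mode: the rating with the largest count.
mode = max(ratings, key=lambda pair: pair[1])[0]

# Median: first category where the running total reaches half.
running = 0
for label, count in ratings:
    running += count
    if running >= total / 2:
        median = label
        break

counts = dict(ratings)
positive_pct = 100 * (counts["Satisfied"] + counts["Very Satisfied"]) / total
negative_pct = 100 * (counts["Dissatisfied"] + counts["Very Dissatisfied"]) / total

print(mode, median)                # Satisfied Satisfied
print(positive_pct, negative_pct)  # 60.0 20.0
```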

Avoiding Interpretation Mistakes

Mistake	Example	Problem
Treating ordinal as continuous	“Average satisfaction is 3.7”	Gaps between categories aren’t equal
Ignoring sample size	“100% satisfied!” (based on 2 responses)	Too small to be meaningful
Confusing correlation with causation	“Ice cream sales and drowning both increase in summer, so ice cream causes drowning”	Both caused by a third factor (heat)
Cherry-picking data	Showing only favorable results	Misleading conclusions

Data Types in AI Applications

Different AI applications need different data types:

AI Application	Primary Data Types	Example Data
Image Classification	Nominal (labels) + Continuous (pixels)	“Cat” or “Dog” labels on images
Sentiment Analysis	Ordinal (sentiment scores)	Negative/Neutral/Positive ratings
Price Prediction	Continuous	House prices, stock prices
Customer Segmentation	Mixed	Demographics (nominal) + Spending (continuous)
Recommendation Systems	Ordinal (ratings) + Nominal (categories)	Movie ratings, genre preferences
Medical Diagnosis	Mixed	Symptoms (nominal), test results (continuous)

How AI Handles Different Data Types

Data Type	AI Treatment
Nominal	One-hot encoding (converting to binary columns)
Ordinal	Label encoding (converting to ordered numbers)
Discrete	Direct use or normalization
Continuous	Normalization or standardization

Example: One-Hot Encoding for Colors

Original: Color = [Red, Blue, Green, Red, Blue]

Encoded:

Is_Red	Is_Blue	Is_Green
1	0	0
0	1	0
0	0	1
1	0	0
0	1	0

This allows AI to work with categorical data mathematically.
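
The encoded table above can be reproduced with a few lines of plain Python. This is a minimal sketch written without any library. Real projects usually rely on a data library for this step.

```python
colors = ["Red", "Blue", "Green", "Red", "Blue"]

# One column per category, in the same order as the table.
categories = ["Red", "Blue", "Green"]

# Each row gets a 1 in the column matching its color and 0 elsewhere.
encoded = [[1 if color == cat else 0 for cat in categories] for color in colors]

print(["Is_" + cat for cat in categories])
for row in encoded:
    print(row)
```

Every row contains exactly one 1, which is where the name "one-hot" comes from.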

Activity: Classify and Plan

Part A: Data Type Classification

Classify each as Nominal, Ordinal, Discrete, or Continuous:

Number of WhatsApp messages sent today
Your blood group
Temperature in your city
Your position in a race (1st, 2nd, 3rd)
Number of pets you have
Your favorite sport
Your height
Customer review stars (1-5)
Number of Instagram followers
Your mood today (Happy, Sad, Neutral)

Part B: Data Collection Planning

You want to understand Class 9 students’ study habits. Design a survey with:

2 nominal questions
2 ordinal questions
2 discrete questions
1 continuous question

(Answers in Answer Key)

Quick Recap

Qualitative data describes categories (Nominal: no order; Ordinal: has order).
Quantitative data describes numbers (Discrete: counted; Continuous: measured).
Nominal data includes categories like colors, names, and types — no natural order.
Ordinal data includes rankings and ratings — order matters but gaps aren’t equal.
Discrete data includes countable values like number of students — whole numbers.
Continuous data includes measurements like height and temperature — any value possible.
Data acquisition involves collecting data through surveys, observations, experiments, or secondary sources.
Data processing cleans and prepares raw data by fixing errors, filling gaps, and standardizing formats.
Data interpretation extracts meaning using appropriate methods for each data type.
AI handles different data types differently — nominal needs encoding, continuous needs normalization.

EXERCISES

A. Fill in the Blanks

Data that describes categories or qualities is called __________________________ data.
Data that describes quantities with numbers is called __________________________ data.
Nominal data has categories with __________________________ natural order.
Ordinal data has categories that can be ____________________________.
Discrete data is obtained by __________________________ (counting/measuring).
Continuous data is obtained by __________________________ (counting/measuring).
Phone numbers and PIN codes are examples of __________________________ data, not quantitative.
The process of fixing errors and standardizing formats is called data __________________________.
Converting categorical data into numerical format for AI is called __________________________.
____________________________.gov.in is India’s official open data portal.

B. Multiple Choice Questions

1. Which is an example of qualitative data?

(a) Height: 165 cm
(b) Age: 15 years
(c) Favorite color: Blue
(d) Temperature: 28°C

2. Ordinal data differs from nominal data because:

(a) It uses numbers
(b) It has a natural order
(c) It can be measured
(d) It has no categories

3. “Number of students in a class” is what type of data?

(a) Nominal
(b) Ordinal
(c) Discrete
(d) Continuous

4. Body temperature (98.6°F) is an example of:

(a) Nominal data
(b) Ordinal data
(c) Discrete data
(d) Continuous data

5. Which operation is NOT valid for nominal data?

(a) Counting occurrences
(b) Finding the mode
(c) Calculating the mean
(d) Finding percentages

6. A customer satisfaction rating of “Excellent, Good, Average, Poor” is:

(a) Nominal
(b) Ordinal
(c) Discrete
(d) Continuous

7. Which is a primary data collection method?

(a) Using government databases
(b) Conducting surveys
(c) Downloading from Kaggle
(d) Reading research reports

8. One-hot encoding is used for:

(a) Continuous data
(b) Discrete data
(c) Nominal data
(d) Ordinal data

9. “Age: Twenty-five” instead of “25” is an example of:

(a) Missing value
(b) Wrong data type
(c) Duplicate entry
(d) Outlier

10. Jersey numbers on sports uniforms are:

(a) Nominal data
(b) Ordinal data
(c) Discrete data
(d) Continuous data

C. True or False

Qualitative data can always be measured with numbers. (__)
Ordinal data has categories that can be ranked. (__)
You can calculate the average of nominal data. (__)
Discrete data can have decimal values. (__)
Height and weight are examples of continuous data. (__)
PIN codes are quantitative data because they contain numbers. (__)
Data processing includes cleaning and standardizing data. (__)
The mode can be found for all types of data. (__)
Surveys can collect both qualitative and quantitative data. (__)
Correlation always means causation. (__)

D. Define the Following (30-40 words each)

Qualitative Data
Quantitative Data
Nominal Data
Ordinal Data
Discrete Data
Continuous Data
Data Processing

E. Very Short Answer Questions (40-50 words each)

What is the main difference between qualitative and quantitative data?
Explain the difference between nominal and ordinal data with examples.
How is discrete data different from continuous data?
Why are phone numbers considered nominal data even though they contain digits?
What are three common problems found in raw data?
Name three primary methods of data collection.
What is one-hot encoding and why is it used?
Why can’t you calculate the mean of ordinal data?
Give two examples each of discrete and continuous data.
What should you check when interpreting data to avoid mistakes?

F. Long Answer Questions (75-100 words each)

Explain the four types of data (nominal, ordinal, discrete, continuous) with two examples each.
You’re collecting data about students’ mobile phone usage. What type of data would each of the following generate: (a) Brand of phone, (b) Hours of daily usage, (c) Number of apps installed, (d) Satisfaction rating?
Describe the steps involved in data processing. Why is each step important?
What are the different methods of data acquisition? Compare primary and secondary data sources.
Explain how AI handles different types of data. Why does nominal data need special treatment?
A survey about canteen food collected these responses for “food quality”: Excellent, Good, Good, Average, Excellent, Poor, Good. Analyze this data appropriately.
Design a data collection plan to understand exercise habits of Class 9 students. Include questions that generate all four data types.

ANSWER KEY

A. Fill in the Blanks – Answers

qualitative — Qualitative data describes categories.
quantitative — Quantitative data describes numbers.
no — Nominal categories have no natural order.
ranked — Ordinal categories can be ordered.
counting — Discrete data is counted.
measuring — Continuous data is measured.
nominal — Phone numbers are labels, not quantities.
processing/cleaning — Processing fixes data issues.
encoding — Encoding converts categories to numbers.
data — data.gov.in is India’s open data portal.

B. Multiple Choice Questions – Answers

(c) Favorite color: Blue — Colors are categories, not numbers.
(b) It has a natural order — Ordinal data can be ranked.
(c) Discrete — Students are counted as whole numbers.
(d) Continuous data — Temperature can have any decimal value.
(c) Calculating the mean — Can’t average categories.
(b) Ordinal — Ratings have a natural order.
(b) Conducting surveys — Primary = collecting yourself.
(c) Nominal data — Converts categories to binary columns.
(b) Wrong data type — Text instead of number.
(a) Nominal data — Jersey numbers are labels, not quantities.

C. True or False – Answers

False — Qualitative data describes qualities, not measured numbers.
True — Ordinal categories have a natural ranking.
False — Cannot calculate average of categories like colors.
False — Discrete data is whole numbers only.
True — Both can have any decimal value.
False — PIN codes are labels (nominal), not quantities.
True — Processing includes cleaning and standardizing.
True — Mode (most common) works for all types.
True — Surveys can include various question types.
False — Correlation does not imply causation.

D. Definitions – Answers

1. Qualitative Data: Data that describes qualities or characteristics using categories rather than numbers. It answers “what kind?” and includes types like colors, names, and opinions.

2. Quantitative Data: Data that describes quantities using numbers and measurements. It answers “how many?” or “how much?” and includes values like height, age, and count.

3. Nominal Data: A type of qualitative data where categories have no natural order or ranking. Examples include blood type, gender, and favorite color.

4. Ordinal Data: A type of qualitative data where categories have a natural order but the gaps between them aren’t necessarily equal. Examples include satisfaction ratings and education levels.

5. Discrete Data: A type of quantitative data with countable, separate values, usually whole numbers. Cannot have fractions in context. Examples: number of children, goals scored.

6. Continuous Data: A type of quantitative data that can take any value within a range, including decimals. Obtained by measuring. Examples: height, weight, temperature.

7. Data Processing: The steps of transforming raw data into a usable format, including cleaning errors, handling missing values, standardizing formats, removing duplicates, and validating accuracy.

E. Very Short Answer Questions – Answers

1. Qualitative vs quantitative difference:
Qualitative data describes categories or qualities (colors, names, opinions) answering “what kind?” Quantitative data describes quantities with numbers (height, age, count) answering “how many?” or “how much?”

2. Nominal vs ordinal with examples:
Nominal: categories with no order (blood types A, B, O — no type is “higher”). Ordinal: categories with order (education levels: Primary < Secondary < Graduate — there’s a ranking but gaps aren’t equal).

3. Discrete vs continuous difference:
Discrete data is countable with separate whole values (students in class: 32, not 32.5). Continuous data is measurable with any value possible (height: 165.7 cm, can be any decimal).

4. Phone numbers as nominal:
Phone numbers use digits but are labels/identifiers, not quantities. You wouldn’t add phone numbers or calculate their average. The digits don’t represent amounts — they’re just identification codes.

5. Three raw data problems:
Missing values (empty cells), inconsistent formats (Male/M/male), and outliers (height: 950 cm — likely error). Others include duplicates and wrong data types.

6. Three primary collection methods:
Surveys/questionnaires (asking questions), observations (watching and recording), and experiments (controlled testing). Also interviews and sensor measurements.

7. One-hot encoding:
Converting nominal categories into binary columns. Example: Color (Red, Blue) becomes Is_Red (1/0) and Is_Blue (1/0). Used because AI algorithms need numerical inputs to perform calculations.

8. Why no mean for ordinal:
The gaps between ordinal categories aren’t equal. The difference between “Good” and “Excellent” may not equal the difference between “Poor” and “Average.” Mean assumes equal spacing.
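
A small experiment makes this concrete. The sketch below uses two invented classes (`class_a`, `class_b`) and two invented number codings; both codings respect the order Poor < Average < Good < Excellent, yet they disagree about which class has the higher "mean rating". That disagreement is the reason the mean is not trusted for ordinal data.

```python
from statistics import mean

# Hypothetical ratings from two classes.
class_a = ["Good", "Good", "Good"]
class_b = ["Average", "Average", "Excellent"]

# Two codings that both keep the order Poor < Average < Good < Excellent.
equal_steps = {"Poor": 1, "Average": 2, "Good": 3, "Excellent": 4}
stretched = {"Poor": 1, "Average": 2, "Good": 3, "Excellent": 10}


def mean_score(ratings, coding):
    """Average of the numeric codes assigned to each rating."""
    return mean(coding[r] for r in ratings)


for name, coding in [("equal_steps", equal_steps), ("stretched", stretched)]:
    a = mean_score(class_a, coding)
    b = mean_score(class_b, coding)
    print(name, "-> class A higher" if a > b else "-> class B higher")
```

With `equal_steps`, class A comes out higher; with `stretched`, class B does. The ratings did not change, only the assumed spacing between categories.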

9. Discrete and continuous examples:
Discrete: Number of siblings (0, 1, 2…), goals in a match (0, 1, 2…). Continuous: Height (165.5 cm), weight (58.3 kg), temperature (28.7°C).

10. Avoiding interpretation mistakes:
Check sample size (is it large enough?), don’t confuse correlation with causation, use appropriate methods for each data type, don’t cherry-pick favorable results, consider context.

F. Long Answer Questions – Answers

1. Four data types with examples:
Nominal: Categories without order — Blood type (A, B, O, AB), Favorite color (Red, Blue, Green). Ordinal: Ranked categories — Education (Primary < Secondary < Graduate), Star rating (1-5 stars). Discrete: Countable whole numbers — Number of siblings (0, 1, 2), Books read this year (5, 10, 15). Continuous: Measurable any-value — Height (165.7 cm), Temperature (28.5°C). Each type requires different analysis methods.

2. Mobile phone usage data types:
(a) Brand of phone: Nominal — categories like Apple, Samsung, OnePlus with no natural order. (b) Hours of daily usage: Continuous — can be 2.5 hours, 3.7 hours, any decimal value. (c) Number of apps installed: Discrete — whole numbers only (25, 30, 45 apps). (d) Satisfaction rating: Ordinal — ranked categories (Very Satisfied > Satisfied > Neutral, etc.).

3. Data processing steps:
Collect: Gather raw data from sources. Inspect: Check for problems (missing values, errors, duplicates). Clean: Fix errors, fill gaps, remove duplicates. Transform: Standardize formats, convert types, create categories. Validate: Verify everything is correct and consistent. Store: Organize in proper format. Each step matters because an error introduced at any stage carries forward and corrupts the final analysis.
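
The Inspect, Clean, Transform and Validate steps can be sketched in plain Python. Everything here is illustrative: the records in `raw`, the names `GENDER_MAP`, `clean` and `validate`, and the plausible height range of 50 to 250 cm are all assumptions made for this example, not fixed rules.

```python
# Hypothetical raw survey records showing common problems.
raw = [
    {"name": "Asha", "gender": "F", "height_cm": 158.0},
    {"name": "Ravi", "gender": "male", "height_cm": 950.0},    # impossible height
    {"name": "Ravi", "gender": "male", "height_cm": 950.0},    # exact duplicate
    {"name": "Meena", "gender": "Female", "height_cm": None},  # missing value
    {"name": "Kabir", "gender": "M", "height_cm": 171.5},
]

# Maps every spelling we expect to one standard label.
GENDER_MAP = {"m": "Male", "male": "Male", "f": "Female", "female": "Female"}


def clean(records, min_cm=50.0, max_cm=250.0):
    """Drop exact duplicates, standardize gender, blank out impossible heights."""
    seen = set()
    cleaned = []
    for rec in records:
        key = (rec["name"], rec["gender"], rec["height_cm"])
        if key in seen:  # Clean: skip rows already seen
            continue
        seen.add(key)
        height = rec["height_cm"]
        # Clean: mark an out-of-range height as missing instead of guessing.
        if height is not None and not (min_cm <= height <= max_cm):
            height = None
        cleaned.append({
            "name": rec["name"],
            # Transform: standardize Male/M/male style variants.
            "gender": GENDER_MAP.get(rec["gender"].strip().lower(), "Unknown"),
            "height_cm": height,
        })
    return cleaned


def validate(records):
    """Validate: every gender label must be one of the standard values."""
    return all(r["gender"] in {"Male", "Female", "Unknown"} for r in records)


cleaned = clean(raw)
for row in cleaned:
    print(row)
print("valid:", validate(cleaned))
```

In this sketch the impossible height is replaced with a missing value rather than a guessed number; whether to fill, drop, or re-collect such values is a separate decision that depends on the survey.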

4. Data acquisition methods:
Primary sources: Collecting data yourself through surveys, observations, experiments, or interviews. The data is tailored to your needs, but collecting it is time-consuming. Secondary sources: Using existing data from government portals, research databases, or company records. This saves time, but the data may not fit your needs exactly. Primary gives control over quality; secondary provides larger datasets quickly.

5. AI and data types:
AI algorithms work with numbers, so categorical data needs conversion. Nominal data uses one-hot encoding (color → Is_Red, Is_Blue columns with 0/1). Ordinal data uses label encoding (Poor=1, Average=2, Good=3). Continuous data often needs normalization (scaling to 0-1 range). Discrete data may be used directly or normalized. Without proper handling, AI can’t process categories mathematically.
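
Label encoding and normalization can be sketched as follows. The helper names `label_encode` and `min_max_scale` and the sample values are made up for this illustration; min-max scaling is used here as one common way to reach a 0-1 range, not the only one.

```python
RATING_ORDER = ["Poor", "Average", "Good"]  # lowest to highest


def label_encode(values, order):
    """Replace each ordinal label with its rank (1 = lowest)."""
    codes = {label: i + 1 for i, label in enumerate(order)}
    return [codes[v] for v in values]


def min_max_scale(values):
    """Rescale numbers so the smallest becomes 0 and the largest becomes 1."""
    lo, hi = min(values), max(values)
    if hi == lo:  # all values equal: avoid dividing by zero
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


ratings = ["Good", "Poor", "Average"]  # hypothetical ordinal data
heights_cm = [150.0, 160.0, 170.0]     # hypothetical continuous data

print(label_encode(ratings, RATING_ORDER))  # [3, 1, 2]
print(min_max_scale(heights_cm))            # [0.0, 0.5, 1.0]
```

The codes 1, 2, 3 preserve the ranking only; they do not claim that the gaps between Poor, Average and Good are equal.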

6. Canteen survey analysis:
Data: Excellent, Good, Good, Average, Excellent, Poor, Good (n=7, ordinal data). Frequency: Excellent-2, Good-3, Average-1, Poor-1. Mode: Good (most common). Median: Good (the 4th of 7 values when ordered from Poor to Excellent). Percentage: Positive (Excellent+Good): 5 of 7, about 71%; Negative (Poor): 1 of 7, about 14%. Interpretation: Most students rate the food quality positively, but the one Poor rating is worth investigating. The mean cannot be calculated because the gaps between ordinal categories aren’t equal.
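
The same figures can be checked with a short Python sketch. The variable names are chosen for this example only; the responses are the seven from the question.

```python
from collections import Counter

ORDER = ["Poor", "Average", "Good", "Excellent"]  # lowest to highest
responses = ["Excellent", "Good", "Good", "Average", "Excellent", "Poor", "Good"]

n = len(responses)
counts = Counter(responses)

# Mode: the most frequent category.
mode = counts.most_common(1)[0][0]

# Median: sort by rank, then take the middle value
# (the lower middle if n were even, so the result is always a real category).
ranked = sorted(responses, key=ORDER.index)
median = ranked[(n - 1) // 2]

# Percentages, rounded to whole numbers.
positive_pct = round(100 * (counts["Excellent"] + counts["Good"]) / n)
negative_pct = round(100 * counts["Poor"] / n)

print(dict(counts))
print("mode:", mode)                  # Good
print("median:", median)              # Good
print("positive %:", positive_pct)    # 71
print("negative %:", negative_pct)    # 14
```

Only counting and ordering are used here, which is why these measures are safe for ordinal data.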

7. Exercise habits data collection:
Nominal questions: “What type of exercise do you prefer?” (Running, Swimming, Gym, Yoga), “Where do you exercise?” (Home, Park, Gym, School). Ordinal questions: “How would you rate your fitness level?” (Poor to Excellent), “How motivated are you to exercise?” (1-5 scale). Discrete questions: “How many days per week do you exercise?” (0-7), “How many push-ups can you do?” (number). Continuous question: “How many minutes do you exercise per session?” (can be 25.5 minutes).

Activity Answers

Part A: Data Type Classification

Number of WhatsApp messages — Discrete (countable whole numbers)
Blood group — Nominal (categories, no order)
Temperature — Continuous (measurable, decimals possible)
Position in race — Ordinal (ranked categories)
Number of pets — Discrete (countable whole numbers)
Favorite sport — Nominal (categories, no order)
Height — Continuous (measurable, decimals possible)
Customer review stars — Ordinal (ranked scale)
Instagram followers — Discrete (countable whole numbers)
Mood today — Nominal (categories, no inherent order) or Ordinal (if treated as scale)

Part B: Survey Design (Sample)

Nominal questions:

“What is your preferred study location?” (Home, Library, Classroom, Café)
“Which subject do you find most interesting?” (Math, Science, English, Social Studies)

Ordinal questions:

“How would you rate your study habits?” (Excellent, Good, Average, Poor)
“How stressed do you feel about exams?” (Not at all, Slightly, Moderately, Very, Extremely)

Discrete questions:

“How many days per week do you study at home?” (0, 1, 2, 3, 4, 5, 6, 7)
“How many subjects do you need extra help with?” (0, 1, 2, 3, 4+)

Continuous question:

“On average, how many minutes do you spend on homework daily?” (Open numeric response)
