CAIE Statistics 1 Data Measures Guide

Uploaded by

Ankur Saha

Statistics 1 (PAPER 5)

Chapter 1, 2, 3
Representation and Measures of Data
SUPPORTING MATERIAL

EXAMPLES

1. This stem-and-leaf diagram shows the number of employees at 20 companies.

1 | 0 8 8 8 8 9 9        Key: 1 | 0 represents 10 employees
2 | 0 5 6 6 7 7 8 9
3 | 0 1 1 2 9

a) What is the most common number of employees?
b) How many of the companies have fewer than 25 employees?
c) What percentage of the companies have more than 30 employees?
d) Determine which of the three rows in the stem-and-leaf diagram contains the smallest number of:
i) Companies
ii) Employees
e) Find the mean number of employees in the companies.

[S1 book; Ex 1A; No. 3 modified]

2. A family has 38 films on DVD with a mean playing time of 1 hour 32 minutes. They also have 26 films on video
cassette, with a mean playing time of 2 hours 4 minutes. Find the mean playing time of all the films in their
collection.
[S1 book; Worked Example 2.6; Page 32]
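
A combined mean such as the one in Example 2 weights each group's mean by the number of items in that group. The following is a minimal sketch of that idea, assuming playing times are first converted to minutes; the helper name `combined_mean` is illustrative and does not come from the source material.

```python
def combined_mean(groups):
    """Weighted mean of several groups given as (count, mean) pairs."""
    total = sum(n * m for n, m in groups)   # sum of all individual values
    count = sum(n for n, _ in groups)       # total number of values
    return total / count

# Example 2: 38 DVDs averaging 92 minutes, 26 cassettes averaging 124 minutes.
mean_minutes = combined_mean([(38, 92), (26, 124)])
hours, minutes = divmod(round(mean_minutes), 60)
print(f"{hours} h {minutes} min")  # prints "1 h 45 min"
```

The same function handles Example 3 once the number of girls (50 − 22 = 28) has been worked out.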

3. An examination was taken by 50 students. The 22 boys scored a mean of 71% and the girls scored a mean of 76%.
Find the mean score of all the students.
[S1 book; Ex 2B; No. 7]

4. A company employs 12 drivers. Their mean monthly salary is $1950. A new driver is employed and the mean
monthly salary falls by $8. Find the monthly salary of the new driver.
[S1 book; Ex 2B; No. 8]
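
When one value joins a group and the new mean is known, the added value is the difference between the new total and the old total. A short sketch under that reasoning follows; `added_value` is a hypothetical helper name used only for illustration.

```python
def added_value(old_count, old_mean, new_mean):
    """Value that must be added to move the mean from old_mean to new_mean."""
    new_total = (old_count + 1) * new_mean
    old_total = old_count * old_mean
    return new_total - old_total

# Example 4: 12 drivers with mean $1950; the mean falls by $8 with the 13th driver.
salary = added_value(12, 1950, 1950 - 8)
print(salary)  # prints 1846
```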

5. The number of patients treated each day by a dentist during a 20-day period is shown in the following stem-and-
leaf diagram.

0 | 4 4 4 5 6 6 6 7        Key: 1 | 5 represents 15 patients
1 | 4 5 5 6 6 7 7 8 8 9
2 | 0 1
Find the median and modal number of patients.

[S1 book; Ex 2E; No. 1 modified]

6. Li Wei records the shirt collar size, x, of the male students in his year. The results are shown in the table.

For these data, find:


(a) The mode
(b) The median
(c) The mean
(d) Explain why a shirt manufacturer might use the mode when planning production numbers.
CAIE STATISTICS 1 SUPPORTING MATERIAL
7. The number of patients treated each day by a dentist during a 20-day period is shown in the following stem-and-
leaf diagram.

0 | 4 4 4 5 6 6 6 7        Key: 1 | 5 represents 15 patients
1 | 4 5 5 6 6 7 7 8 8 9
2 | 0 1

a) Find the median number of patients.


b) On eight of these 20 days, the dentist arrived late to collect their son from school. If they decide to use their
average number of patients as a reason for arriving late, would they use the median or the mean? Explain your
answer.
c) Describe a situation in which it would be to the dentist’s advantage to use a mode as the average.
[S1 book; Ex 2E; No. 1]

8. A small workshop records how long it takes, in minutes, for each of their workers to make a certain item. The
times are shown in the table.

(a) Write down the mode for these data.


(b) Calculate the mean for these data.
(c) Find the median for these data.
(d) The manager wants to give the workers an idea of the average time they took. Write down, with a reason,
which of the answers to (a), (b) and (c) she should use.

9.
a) Find the median for the values of 𝑡 given in the following table.

𝑡 7 8 9 10 11 12 13
𝑓 4 7 9 14 16 41 9

b) What feature of the data suggests that the mean, 𝑡̅, is less than the median? Confirm whether or not this is the case.
[S1 book; Ex 2E; No. 2]

10.
a) Find the median and the mode for the values of 𝑥 given in the following table.
𝑥 4 5 6 7 8
𝑓 14 13 4 12 15
b) Give one positive and one negative aspect of using each of the median and the mode as the average value for
𝑥.
c) Some values in the table have been incorrectly recorded as 8 instead of 4. Find the number of incorrectly
recorded values, given that the true median of 𝑥 is 5.5.

[S1 book; Ex 2E; No. 3]

11. A teacher recorded the quiz marks of eight students as 11, 13, 15, 15, 17, 18, 19 and 20.
They later realized that there was a typing error, so they changed the mark of 11 to 1.
Investigate what effect this change has on the mode, mean and median of the students’ marks.
[S1 book; Ex 2E; No. 6]

12. Homes in a certain neighbourhood have recently sold for $220 000, $242 000, $236 000 and $3 500 000.
A potential buyer wants to know the average selling price in the neighbourhood. Which of the mean, median or
mode would be more helpful? Explain your answer.
[S1 book; Ex 2E; No. 9]

13. The frequency table shows the number of breakdowns, b, per month recorded by a lorry firm over a certain period
of time.

(a) Write down the modal number of breakdowns.


(b) Find the median number of breakdowns.
(c) Calculate the mean number of breakdowns.
(d) In a brochure about how many loads reach their destination on time, the firm quotes one of the answers to (a),
(b) or (c) as the number of breakdowns per month for its vehicles. Write down which of the three answers the
firm should quote in the brochure.

14. Forty values of x are coded in the following table.

Calculate an estimate of the mean value of x.

[S1 book; Worked Example 2.10; Page 38]

15. The lengths of 2500 bolts, 𝑥 mm, are summarized by ∑(𝑥 − 40) = 875. Find the mean length of the bolts.
[S1 book; Ex 2C; No. 4]
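
For data summarized as ∑(𝑥 − 𝑎), the mean follows from 𝑥̅ = 𝑎 + ∑(𝑥 − 𝑎)/𝑛. The sketch below applies this to Example 15; the function name `mean_from_coded_sum` is an assumption made for illustration.

```python
def mean_from_coded_sum(coded_sum, n, shift):
    """Mean of x given the total of (x - shift) over n values."""
    return shift + coded_sum / n

# Example 15: 2500 bolts with sum of (x - 40) equal to 875.
mean_length = mean_from_coded_sum(875, 2500, 40)
print(round(mean_length, 2))  # prints 40.35
```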

16. The exact age of an individual boy is denoted by b, and the exact age of an individual girl is denoted by g.
Exactly 5 years ago, the sum of the ages of 10 boys was 127.0 years, so ∑(𝑏 − 5) = 127.0.
In exactly 5 years’ time, the sum of the ages of 15 girls will be 351.0 years, so ∑(𝑔 + 5) = 351.0.
Find the mean age today of
(a) The 10 boys
(b) The 15 girls
(c) The 10 boys and 15 girls combined.
[S1 book; Worked Example 2.9; Page 38]

17. For the 20 values of x summarized by ∑(2𝑥 − 3) = 104, find 𝑥̅.


[S1 book; Worked Example 2.12; Page 40]

18. For 20 values of 𝑥, it is given that ∑(𝑎𝑥 − 𝑏) = 400 and ∑(𝑏𝑥 − 𝑎) = 545. Given also that 𝑥̅ = 6.25, find the
value of 𝑎 and of 𝑏.
[S1 book; Ex 2D; No. 7]

19. Twenty-five values of 𝑧 are such that ∑(𝑧 − 7) = 275. Find 𝑧̅.


[S1 book; Ex 2C; No. 2]

20. Given 𝑞̅ = 22 and ∑(𝑞 − 4) = 3672, find the number of values of 𝑞.


[S1 book; Ex 2C; No. 3]
21. A dataset of 20 values is denoted by 𝑥 where ∑(𝑥 − 1) = 58. Another dataset of 30 values is denoted by 𝑦 where
∑(𝑦 − 2) = 36. Find the mean of the 50 values of 𝑥 and 𝑦.
[S1 book; Ex 2C; No. 10]

22. Given that 15 values of 𝑥 are such that ∑(3𝑥 − 2) = 528, find 𝑥̅ and find the value of 𝑏 such that
∑(0.5𝑥 − 𝑏) = 138.
[S1 book; Ex 2D; No. 6]

23. Find the range and the interquartile range of the following sets of data.
a) 5, 8, 13, 17, 22, 25, 30
b) 7, 13, 21, 2, 37, 28, 17, 11, 2
c) 42, 47, 39, 51, 73, 18, 83, 29, 41, 64
d) 113, 97, 36, 81, 49, 41, 20, 66, 28, 32, 17, 107
e) 4.6, 0, −2.6, 0.8, −1.9, −3.3, 5.2, −3.2
[S1 book; Ex 3A; No. 1]
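
Quartiles of ungrouped data depend on the convention used. The sketch below assumes the median-of-halves convention (the median is left out of both halves when the number of values is odd), which reproduces the quartile answers quoted later in this document; other textbooks and software may use a different rule. The function names are illustrative.

```python
from statistics import median

def quartiles(values):
    """Return (Q1, median, Q3) using the median-of-halves convention (an assumption)."""
    data = sorted(values)
    if len(data) < 2:
        raise ValueError("need at least two values")
    half = len(data) // 2  # size of each half; the middle value is excluded when n is odd
    return median(data[:half]), median(data), median(data[-half:])

def range_and_iqr(values):
    """Return (range, interquartile range) of the data."""
    q1, _, q3 = quartiles(values)
    return max(values) - min(values), q3 - q1

# Part (a) of Example 23.
spread_a = range_and_iqr([5, 8, 13, 17, 22, 25, 30])
print(spread_a)  # prints (25, 17)
```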

24. The heights, x cm, of 10 boys are summarized by ∑𝑥 = 1650 and ∑𝑥² = 275 490.
The heights, y cm, of 15 girls are summarized by ∑𝑦 = 2370 and ∑𝑦² = 377 835.
Calculate, to 3 significant figures, the standard deviation of the heights of all 25 children together.
[S1 book; Worked Example 3.10; Page 72]

25. The totals ∑𝑥² = 7931, ∑𝑥 = 397 and ∑𝑦 = 499 are given by 29 values of 𝑥 and 𝑛 values of 𝑦. All the values
of 𝑥 and 𝑦 together have a variance of 52.
a) Express ∑𝑦² in terms of 𝑛.
b) Find the value of 𝑛 for which ∑𝑦² − ∑𝑥² = 10.
[S1 book; Ex 3C; No. 6]

26. A building is occupied by 𝑛 companies. The number of people employed by these companies is denoted by 𝑥.
Find the mean number of employees, given that ∑𝑥² = 8900, ∑𝑥 = 220 and that the standard deviation of 𝑥 is
18.
[S1 book; Ex 3C; No. 2]

27. Eight students’ heights (h cm) are measured. They are as follows:
165 170 190 180 175 185 176 184

(a) Work out the mean height of the students.

(b) Given ∑ℎ² = 254 307, work out the variance. Show all your working.

(c) Work out the standard deviation.

28. Nahab asks the students in his year group how much allowance they get per week. The results, rounded to the
nearest Omani Riyals, are shown in the table.

(a) Work out the mean and standard deviation of the allowance. Give units with your answer.
(b) How many students received an allowance amount more than one standard deviation above the mean?

29. A certain type of machine contained a part that tended to wear out after different amounts of time. The time it took
for 50 of the parts to wear out was recorded. The results are shown in the table.

The manufacturer makes the following claim:

Comment on the accuracy of the manufacturer’s claim, giving relevant numerical evidence.

30. The daily mean wind speed, x (knots) in Chicago is recorded. The summary data are:
∑𝑥 = 243 and ∑𝑥² = 2317
(a) Work out the mean and the standard deviation of the daily mean wind speed.

The highest recorded wind speed was 17 knots and the lowest recorded wind speed was 4 knots.

(b) Estimate the number of days in which the wind speed was greater than one standard deviation above the mean.
(c) State one assumption you have made in making this estimate.

31. The lifetime, x, in hours, of 70 light bulbs is shown in the table below.

The data are coded using 𝑦 = (𝑥 − 1)/20.
(a) Estimate the mean of the coded values 𝑦̅.
(b) Hence find an estimate for the mean lifetime of the light bulbs, 𝑥̅ .
(c) Estimate the standard deviation of the lifetimes of the light bulbs.

32. The weekly income, i, of 100 workers was recorded.


The data were coded using 𝑦 = (𝑖 − 90)/100 and the following summations were obtained:
∑𝑦 = 131 and ∑𝑦² = 176.84
Estimate the standard deviation of the actual workers’ weekly income.

33. It is known that 20 girls each have at least one brother. The number of brothers that they have is denoted by x.
Information about the values of 𝑥 − 1 is given in the following table.

Use the coded value to calculate the standard deviation of the number of brothers, to 3 decimal places.
[S1 book; Worked Example 3.13; Page 76]

34. Eight values of x are summarized by the totals ∑(𝑥 − 10)² = 1490 and ∑(𝑥 − 10) = 100.
Twelve values of y are summarized by the totals ∑(𝑦 + 5)² = 5139 and ∑(𝑦 + 5) = 234.
Find the variance of the 20 values of x and y together.
[S1 book; Worked Example 3.12; Page 76]
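
One way to approach Example 34 is to undo the coding first, using ∑𝑥 = ∑(𝑥 − 𝑎) + 𝑛𝑎 and ∑𝑥² = ∑(𝑥 − 𝑎)² + 2𝑎∑𝑥 − 𝑛𝑎², and then pool the totals. The sketch below follows that route; it is one possible method, not necessarily the one used in the textbook solution, and the function names are illustrative.

```python
def decode_sums(n, coded_sum, coded_sum_sq, shift):
    """Convert the totals of (x - shift) and (x - shift)^2 into the totals of x and x^2."""
    sum_x = coded_sum + n * shift
    sum_x_sq = coded_sum_sq + 2 * shift * sum_x - n * shift ** 2
    return sum_x, sum_x_sq

def combined_variance(groups):
    """Variance of pooled data; groups is a list of (n, sum_x, sum_x_sq) tuples."""
    n = sum(g[0] for g in groups)
    total = sum(g[1] for g in groups)
    total_sq = sum(g[2] for g in groups)
    return total_sq / n - (total / n) ** 2

x_sums = decode_sums(8, 100, 1490, 10)    # x is coded as x - 10
y_sums = decode_sums(12, 234, 5139, -5)   # y is coded as y + 5, so shift = -5
variance = combined_variance([(8, *x_sums), (12, *y_sums)])
print(x_sums, y_sums, round(variance, 2))  # prints (180, 4290) (174, 3099) 56.16
```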

35. Twenty readings of 𝑦 are summarized by the totals ∑(𝑦 − 5)² = 890 and ∑(𝑦 − 5) = 130. Find the standard
deviation of 𝑦.
[S1 book; Ex 3D; No. 2]

36. Readings from a device, denoted by 𝑦, are such that ∑(𝑦 − 3)² = 2775, ∑𝑦 = 105 and the standard deviation of
𝑦 is 13. Find the number of readings that were taken.
[S1 book; Ex 3D; No. 5]

37. Twenty values of 𝑥 are summarized by ∑(𝑥 − 1)² = 132 and ∑(𝑥 − 1) = 44.
Eighty values of 𝑦 are summarized by ∑(𝑦 + 1)² = 17704 and ∑(𝑦 + 1) = 1184.
a) Show that ∑𝑥 = 64 and that ∑𝑥² = 240.
b) Calculate the value of ∑𝑦 and of ∑𝑦².
c) Find the exact variance of the 100 values of 𝑥 and 𝑦 combined.
[S1 book; Ex 3D; No. 11]

38. The heights, 𝑥 cm, of 200 boys and the heights, 𝑦 cm, of 300 girls are summarized by the following totals:
∑(𝑥 − 160)² = 18240, ∑(𝑥 − 160) = 1820, ∑(𝑦 − 150)² = 20100, ∑(𝑦 − 150) = 2250.
a) Find the mean height of these 500 children.
b) By first evaluating ∑𝑥² and ∑𝑦², find the variance of the heights of the 500 children, including
appropriate units with your answer.
[S1 book; Ex 3D; No. 12]

39. Given that ∑(3𝑥 − 1)² = 9136, ∑(3𝑥 − 1) = 53 and n = 10, find the value of ∑𝑥².
[S1 book; Worked Example 3.15; Page 80]

40. Find the standard deviation of x, given that ∑4𝑥² = 14600, ∑2𝑥 = 420 and n = 20.
[S1 book; Ex 3E; No. 2]

41. Temperatures in degrees Celsius (°C) can be converted to temperatures in degrees Fahrenheit (°F) using the
formula F = 1.8C + 32.
(a) The temperatures yesterday had a range of 15°C. Express this range in degrees Fahrenheit.
(b) Temperatures elsewhere were recorded at hourly intervals in degrees Fahrenheit and were found to have mean
54.5 and variance 65.61. Find the mean and standard deviation of these temperatures in degrees Celsius.
[S1 book; Ex 3E; No. 6]
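
Under a linear change of units such as F = 1.8C + 32, measures of spread (range, standard deviation) are scaled by 1.8 only, while the mean is both scaled and shifted. The sketch below works through Example 41 on that basis; the variable names are illustrative.

```python
from math import sqrt

SCALE, OFFSET = 1.8, 32  # F = 1.8C + 32, as given in Example 41

# Part (a): a range is a difference, so the +32 cancels and only the scale applies.
range_f = SCALE * 15

# Part (b): invert the formula for the mean; divide the standard deviation by the scale.
mean_c = (54.5 - OFFSET) / SCALE
sd_c = sqrt(65.61) / SCALE

print(round(range_f, 1), round(mean_c, 1), round(sd_c, 1))  # prints 27.0 12.5 4.5
```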

42. The temperatures, T °C, at seven locations in the Central Kalahari Game Reserve were recorded at 4 pm one
January afternoon. The values of T, correct to 1 decimal place, were:
32.1, 31.7, 31.2, 31.5, 31.9, 32.2 and 32.7.
(a) Evaluate ∑10(𝑇 − 30) and ∑100(𝑇 − 30)².
(b) Use your answers to part (a) to calculate the standard deviation of T.
(c) By 5 pm the temperature at each location had dropped by exactly 0.75°C. Find the variance of the temperatures
at 5 pm.
[S1 book; Ex 3E; No. 4]

43. Each of the 70 trainees at a secretarial college was asked to type a copy of a particular document. The times taken
are shown, correct to the nearest 0.1 minutes, in the following table.

(a) Explain why the interval for the first class has a width of 0.3 minutes.
(b) Represent the times taken in a histogram.

(c) Estimate, to the nearest second, the upper boundary of the times taken by the fastest 10 typists.
(d) It is given that 15 trainees took between 3.15 and b minutes. Calculate an estimate for the value of b when:
(i) 𝑏 > 3.15
(ii) 𝑏 < 3.15
[S1 book; Ex 1B; No. 5]

44. The heights of 600 saplings are shown in the following table.

(a) Suggest a suitable value for u, the upper boundary of the data.
(b) Illustrate the data in a histogram.
(c) Calculate an estimate of the number of saplings with heights that are:
(i) Less than 25 cm
(ii) Between 7.5 and 19.5 cm
[S1 book; Ex 1B; No. 4]

45. A university investigated how much space on its computers’ hard drives is used for data storage. The results are
shown below. It is given that 40 hard drives use less than 20 GB for data storage.

(a) Find the total number of hard drives represented.


(b) Calculate an estimate of the number of hard drives that use less than 50 GB.
(c) Estimate the value of k, if 25% of the hard drives use k GB or more.
(d) Estimate the value of m, if 40% of the hard drives use m GB or less.
[S1 book; Ex 1B; No. 7 modified]

46. The following table shows the lengths of 80 leaves from a particular tree, given to the nearest centimeter.

(a) Draw a cumulative frequency curve and a cumulative frequency polygon.


(b) Use each of the graphs to estimate:
(i) The number of leaves that are less than 3.7 cm long
(ii) The lower boundary of the lengths of the longest 22 leaves.
(c) Use the cumulative frequency curve to find
(i) Median
(ii) Lower and upper quartiles and interquartile range
[S1 book; Worked Example 1.4; Page-14; modified]

47. The following table shows the widths of the 70 books in one section of a library, given to the nearest centimeter.

(a) Given that the upper boundary of the first class is 14.5 cm, write down the upper boundary of the second class.

(b) Draw up a cumulative frequency table for the data and construct a cumulative frequency graph.
(c) Use your graph to estimate:
(i) The number of books that have widths of less than 27 cm
(ii) The widths of the widest 20 books
[S1 book; Ex 1C; No.2]

48. The following cumulative frequency graph shows the masses, 𝑚 grams, of 152 uncut diamonds.

a) Estimate the number of uncut diamonds with masses such that:


i) 9 ≤ 𝑚 < 17
ii) 7.2 ≤ 𝑚 < 15.6.
b) The lightest 40 diamonds are classified as small. The heaviest 40 diamonds are classified as large. Estimate
the difference between the mass of the heaviest small diamond and the lightest large diamond.
c) The point marked at 𝑃(24, 152) on the graph indicates that the 152 uncut diamonds all have masses of less
than 24 grams.
Each diamond is now cut into two parts of equal mass. Assuming that there is no wastage of material, write
down the coordinates of the point corresponding to 𝑃 on a cumulative frequency graph representing the
masses of these cut diamonds.
[S1 book; Ex 1C; No. 5]

49. The following graph illustrates the times, in minutes, taken by 500 people to complete a task.

Use the graph to find an estimate of:
(a) The greatest possible range
(b) The interquartile range
(c) The 95th percentile
[S1 book; Worked Example 3.5; Page 59]

50. The following graph illustrates the times taken by 112 people to complete a puzzle.

a) Estimate the median time taken.


b) The median is used to divide these people into two groups. Find the median time taken by each of the groups.
[S1 book; Ex 2E; No. 4]

51. The resistances, in ohms (Ω), of 100 conductors are represented in the following graph.

Find, to an appropriate degree of accuracy, an estimate of:
a) the interquartile range
b) the 90th percentile
c) the percentile that is equal to 0.192 Ω
d) the range of the middle 40% of the resistances.

[S1 book; Ex 3A; No. 8]

52. A group of students took a test. The summary data are shown in the table.

Given that there were no outliers, draw a box plot to illustrate these data.

53. Aaron and Bassam decided to go on a touring holiday in Europe for the whole of July. They recorded the distance
they drove, in kilometres, each day:

(a) Find 𝑄1 , 𝑄2 and 𝑄3 .

Outliers are values that lie below 𝑄1 − 1.5(𝑄3 − 𝑄1 ) or above 𝑄3 + 1.5(𝑄3 − 𝑄1 ).

(b) Find any outliers.


(c) Draw a box plot of these data.
(d) Comment on the skewness of the distribution.

54. The masses of male and female turtles are given in grams. The data are summarized in the box plots.

(a) Compare and contrast the masses of the male and female turtles.
(b) A turtle was found to have a mass of 330 grams. State whether it is likely to be a male or a female. Give a
reason for your answer.
(c) Write down the size of the largest female turtle.

55. University students measured the heights of the 54 trees in the grounds of a primary school. As part of a talk on
conservation at a school assembly, the students have decided to present their data using one of the following
diagrams.

a) Give one disadvantage of using each of the representations shown.


b) Name and describe a different type of representation that would be appropriate for the audience, and that has
none of the disadvantages given in part a).
[S1 book; Ex 1D; No. 5]

56. The following table shows the focal lengths, 𝑙 mm, of the 84 zoom lenses sold by a shop. For example, there are
18 zoom lenses that can be set to any focal length between 24 and 50 mm.

a) What feature of the data does not allow them to be displayed in a histogram?
b) What type of diagram could you use to illustrate the data? Explain clearly how you would do this.
[S1 book; Ex 1D; No. 8]

TOPICAL QUESTION PAPER
Topics 7 and 8: Representation and Measures of Data
Note:
These problems are taken from Topics 7 and 8 of Topical QP Book.
Questions are numbered serially (Topic 7 has 26 questions and Topic 8 has 11 questions, so 37
questions in total).

1) The salaries, in thousands of dollars, of 11 people, chosen at random in a certain office, were found to be:
40, 42, 45, 41, 352, 40, 50, 48, 51, 49, 47
Choose and calculate an appropriate measure of central tendency (mean, mode or median) to summarise these
salaries. Explain briefly why the other measures are not suitable.
[J06/S1/Q1]

2) Each father in a random sample of fathers was asked how old he was when his first child was born. The following
histogram represents the information.

i) What is the modal age group?


ii) How many fathers were between 25 and 30 years old when their first child was born?
iii) How many fathers were in the sample?
iv) Find the probability that a father, chosen at random from the group, was between 25 and 30 years old
when his first child was born, given that he was older than 25 years.
[J06/S1/Q5]
Answer:
i) 30 – 35 years
ii) 24 fathers were 25 to 30 years when their 1st child was born.
iii) = 4 + 18 + 24 + 28 + 26 + 10 = 110
iv) = 0.2727 ≈ 0.273

3) 32 teams enter for a knockout competition, in which each match results in one team winning and the other team
losing. After each match the winning team goes on to the next round, and the losing team takes no further part in
the competition. Thus 16 teams play in the second round, 8 teams play in the third round, and so on, until 2 teams
play in the final round.

i) How many teams play in only 1 match?


ii) How many teams play in exactly 2 matches?
iii) Draw up a frequency table for the numbers of matches which the teams play.
iv) Calculate the mean and variance of the number of matches which the teams play.
[J06/S1/Q6]

Answer:
i) 16
ii) 8
iv) 1.94, 1.43
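
The answers to part iv) can be checked from a frequency table of matches played. The table used below is worked out here from the question (16 teams go out after 1 match, 8 after 2, 4 after 3, 2 after 4, and the 2 finalists play 5); it is not quoted from the source. The function name is illustrative.

```python
def mean_and_variance(freq_table):
    """Mean and variance of discrete data given as {value: frequency}."""
    n = sum(freq_table.values())
    mean = sum(x * f for x, f in freq_table.items()) / n
    mean_of_squares = sum(x * x * f for x, f in freq_table.items()) / n
    return mean, mean_of_squares - mean ** 2

# Matches played -> number of teams, for a 32-team knockout.
matches = {1: 16, 2: 8, 3: 4, 4: 2, 5: 2}
mean, variance = mean_and_variance(matches)
print(round(mean, 2), round(variance, 2))  # prints 1.94 1.43
```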

4) The weights of 30 children in a class, to the nearest kilogram, were as follows.

50 45 61 53 55 47 52 49 46 51
60 52 54 47 57 59 42 46 51 53
56 48 50 51 44 52 49 58 55 45
Construct a grouped frequency table for these data such that there are five equal class intervals with the first class
having a lower boundary of 41.5 kg and the fifth class having an upper boundary of 61.5 kg.
[N06/S1/Q1]

5) In a survey, people were asked how long they took to travel to and from work, on average. The median time was 3
hours 36 minutes, the upper quartile was 4 hours 42 minutes and the interquartile range was 3 hours 48 minutes.
The longest time taken was 5 hours 12 minutes and the shortest time was 30 minutes.
i) Find the lower quartile.
ii) Represent the information by a box-and-whisker plot, using a scale of 2 cm to represent 60 minutes.
[N06/S1/Q03]
Answer:
i) = 0.9 hours

6) The lengths of time in minutes to swim a certain distance by the members of a class of twelve 9-year-olds and by
the members of a class of eight 16-year-olds are shown below.

9-year-olds: 13.0 16.1 16.0 14.4 15.9 15.1 14.2 13.7 16.7 16.4 15.0 13.2
16-year-olds: 14.8 13.0 11.4 11.7 16.5 13.7 12.8 12.9

i) Draw a back-to-back stem-and-leaf diagram to represent the information above.


ii) A new pupil joined the 16-year-old class and swam the distance. The mean time for the class of nine
pupils was now 13.6 minutes. Find the new pupil’s time to swim the distance.
[J07/S1/Q4]
Answer:
ii) 15.6 min

7) The arrival times of 204 trains were noted and the number of minutes, 𝑡, that each train was late was recorded.
The results are summarized in the table.
Number of minutes late (𝑡)    −2 ≤ 𝑡 < 0    0 ≤ 𝑡 < 2    2 ≤ 𝑡 < 4    4 ≤ 𝑡 < 6    6 ≤ 𝑡 < 10
Number of trains              43            51           69           22           19

i) Explain what −2 ≤ 𝑡 < 0 means about the arrival times of trains.


ii) Draw a cumulative frequency graph, and from it estimate the median and the interquartile range of the
number of minutes late of these trains.
[N07/S1/Q5]

8) The stem-and-leaf diagram below represents data collected for the number of hits on an internet site on each day
in March 2007. There is one missing value, denoted by 𝑥.
0 0 1 5 6
1 1 3 5 6 6 8
2 1 1 2 3 4 4 4 8 9
3 1 2 2 2 𝑥 8 9
4 2 5 6 7 9

Key: 1 | 5 represents 15 hits

i) Find the median and lower quartile for the number of hits each day.
ii) The interquartile range is 19. Find the value of 𝑥.
[J08/S1/Q1]
Answer:
i) 24, 16
ii) 𝑥=5

9) As part of a data collection exercise, members of a certain school year group were asked how long they spent on
their Mathematics homework during one particular week. The times are given to the nearest 0.1 hour. The results
are displayed in the following table.
Time spent (𝑡 hours) 0.1 ≤ 𝑡 ≤ 0.5 0.6 ≤ 𝑡 ≤ 1.0 1.1 ≤ 𝑡 ≤ 2.0 2.1 ≤ 𝑡 ≤ 3.0 3.1 ≤ 𝑡 ≤ 4.5
Frequency 11 15 18 30 21

i) Draw, on graph paper, a histogram to illustrate this information.


ii) Calculate an estimate of the mean time spent on their Mathematics homework by members of this year
group.
[J08/S1/Q5]
Answer:

ii) 2.1 hours

10) The pulse rates, in beats per minute, of a random sample of 15 small animals are shown in the following table.
115 120 158 132 125
104 142 160 145 104
162 117 109 124 134

i) Draw a stem-and-leaf diagram to represent the data.


ii) Find the median and the quartiles.
iii) On graph paper, using a scale of 2 cm to represent 10 beats per minute, draw a box-and-whisker plot of
the data.
[N08/S1/Q5]
Answer:

ii) 125, 115 and 145

11) During January the numbers of people entering a store during the first hour after opening were as follows.
Time after opening, 𝑥 minutes    Frequency    Cumulative frequency
0 < 𝑥 ≤ 10                       210          210
10 < 𝑥 ≤ 20                      134          344
20 < 𝑥 ≤ 30                      78           422
30 < 𝑥 ≤ 40                      72           𝑎
40 < 𝑥 ≤ 60                      𝑏            540
i) Find the values of 𝑎 and 𝑏.
ii) Draw a cumulative frequency graph to represent this information. Take a scale of 2 cm for 10 minutes
on the horizontal axis and 2 cm for 50 people on the vertical axis.
iii) Use your graph to estimate the median time after opening that people entered the store.
iv) Calculate estimates of the mean, 𝑚 minutes, and standard deviation, 𝑠 minutes, of the time after opening
that people entered the store.
v) Use your graph to estimate the number of people entering the store between (𝑚 − ½𝑠) and (𝑚 + ½𝑠)
minutes after opening.
[09/S1/Q6]

Answer:
i) 494, 46
ii)
iii) 14
iv) 18.2 minutes, 14.2 minutes
v) 155
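
The estimates in part iv) come from treating every value in a class as if it sat at the class mid-value. The sketch below reproduces that calculation, using the class frequencies from the table with 𝑏 = 46 from part i); the function name `grouped_mean_sd` is illustrative.

```python
from math import sqrt

def grouped_mean_sd(classes):
    """Estimate mean and standard deviation from (lower, upper, frequency) classes."""
    n = sum(f for _, _, f in classes)
    sum_fx = sum(f * (lo + hi) / 2 for lo, hi, f in classes)          # uses mid-values
    sum_fx2 = sum(f * ((lo + hi) / 2) ** 2 for lo, hi, f in classes)
    mean = sum_fx / n
    return mean, sqrt(sum_fx2 / n - mean ** 2)

store = [(0, 10, 210), (10, 20, 134), (20, 30, 78), (30, 40, 72), (40, 60, 46)]
m, s = grouped_mean_sd(store)
print(round(m, 1), round(s, 1))  # prints 18.2 14.2
```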

12) A library has many identical shelves. All the shelves are full and the numbers of books on each shelf in a certain
section are summarized by the following stem and leaf diagram.

3 3699
4 67
5 0122
6 0011234444455667889
7 113335667899
8 0245568
9 001244445567788999

Key: 3 | 6 represents 36 books

i) Find the number of shelves in this section of the library.


ii) Draw a box and whisker plot to represent the data.
In another section all the shelves are full and the numbers of books on each shelf are summarized by the following
stem and leaf diagram.

2 1222334566679
3 011234456677788
4 22357789

Key: 3 | 6 represents 36 books


iii) There are fewer books in this section than in the previous section. State one other difference between
the books in this section and the books in the previous section.
[N09/S1/Q4]

Answer:
i) 67
iii) Books in the second section are thicker or contain more pages.

13) The birth weights of random samples of 900 babies born in country A and 900 babies born in country B
are illustrated in the cumulative frequency graphs. Use suitable data from these graphs to compare the
central tendency and spread of the birth weights of the two sets of babies.
[N10/S1/Q3]

14) The weights in kilograms of 11 bags of sugar and 7 bags of flour are as follows.
Sugar: 1.961 1.983 2.008 2.014 1.968 1.994 2.011 2.017 1.977 1.984 1.989
Flour: 1.945 1.962 1.949 1.977 1.964 1.941 1.953

i) Represent this information on a back to back stem and leaf diagram with sugar on the left hand side.
ii) Find the median and interquartile range of the weights of the bags of sugar.

[N10/S1/Q4]
Answer:

ii) 1.989 kg, 0.034 kg

15) A hotel has 90 rooms. The table summarises information about the number of rooms occupied each day for a
period of 200 days.
Number of rooms occupied    1 − 20    21 − 40    41 − 50    51 − 60    61 − 70    71 − 90
Frequency                   10        32         62         50         28         18

i) Draw a cumulative frequency graph on graph paper to illustrate this information.


ii) Estimate the number of days when over 30 rooms were occupied.
iii) On 75% of the days at most 𝑛 rooms were occupied. Estimate the value of 𝑛.
[J11/S1/Q5]
Answer:
i)
ii) 180
iii) 𝑛 = 59 rooms

16) The weights of 220 sausages are summarised in the following table.
Weight (grams) < 20 < 30 < 40 < 45 < 50 < 60 < 70
Cumulative frequency 0 20 50 100 160 210 220

i) State which interval the median weight lies in.


ii) Find the smallest possible value and the largest possible value for the interquartile range.
iii) State how many sausages weighed between 50 g and 60 g.
iv) On graph paper, draw a histogram to represent the weights of the sausages.
[N11/S1/Q4]
Answer:
i) 45 < 𝑥 < 50.
ii) 5, 20
iii) 50

17) The back to back stem and leaf diagram shows the values taken by two variables A and B.
A B
(3) 3 1 0 15 1 3 3 5 (4)
(2) 4 1 16 2 2 3 4 4 5 7 7 7 8 (10)
(3) 8 3 3 17 0 1 3 3 3 4 6 6 7 9 9 (11)
(12) 9 8 8 6 5 5 4 3 2 1 1 0 18 2 4 7 (3)
(8) 9 9 8 8 6 5 4 2 19 1 5 (2)
(5) 9 8 7 1 0 20 4 (1)

Key: 4 16 7 means A = 0.164 and B = 0.167.

i) Find the median and the interquartile range for variable A.


ii) You are given that, for variable B, the median is 0.171, the upper quartile is 0.179 and the lower quartile
is 0.164. Draw box-and-whisker plots for A and B in a single diagram on graph paper.
[J12/S1/Q4]
Answer:
i) = 0.186, 0.019
ii)

18) The table summarizes the times that 112 people took to travel to work on a particular day.
Time to travel to work (𝑡 minutes)    0 < 𝑡 ≤ 10    10 < 𝑡 ≤ 15    15 < 𝑡 ≤ 20    20 < 𝑡 ≤ 25    25 < 𝑡 ≤ 40    40 < 𝑡 ≤ 60
Frequency                             19            12             28             22             18             13

i) State which time interval in the table contains the median and which time interval contains the upper
quartile.
ii) On graph paper, draw a histogram to represent the data.
iii) Calculate an estimate of the mean time to travel to work.
[N12/S1/Q3]
Answer:
i) 15 < 𝑡 ≤ 20, 25 < 𝑡 ≤ 40
iii) 22.0 minutes

19) The following are the annual amounts of money spent on clothes, to the nearest $10, by 27 people.

10 40 60 80 100 130 140 140 140
150 150 150 160 160 160 160 170 180
180 200 210 250 270 280 310 450 570

i) Construct a stem-and-leaf diagram for the data.


ii) Find the median and the interquartile range of the data.
An ‘outlier’ is defined as any data value which is more than 1.5 times the interquartile range above the upper
quartile, or more than 1.5 times the interquartile range below the lower quartile.
iii) List the outliers.
[J13/S1/Q5]
Answer:
i)
ii) 160, 70
iii) Outliers = 10, 450, 570.
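
The outlier rule quoted in this question can be checked mechanically. The sketch below assumes quartiles are found by the median-of-halves convention (which gives the quartiles 140 and 210 behind the interquartile range of 70 quoted above) and flags values beyond 1.5 interquartile ranges from the quartiles; the function name `outliers` is illustrative.

```python
from statistics import median

def outliers(values, k=1.5):
    """Values lying more than k * IQR below Q1 or above Q3."""
    data = sorted(values)
    half = len(data) // 2  # the middle value is excluded from both halves when n is odd
    q1, q3 = median(data[:half]), median(data[-half:])
    iqr = q3 - q1
    low_fence, high_fence = q1 - k * iqr, q3 + k * iqr
    return [x for x in data if x < low_fence or x > high_fence]

spending = [10, 40, 60, 80, 100, 130, 140, 140, 140,
            150, 150, 150, 160, 160, 160, 160, 170, 180,
            180, 200, 210, 250, 270, 280, 310, 450, 570]
print(outliers(spending))  # prints [10, 450, 570]
```

Here the fences are 140 − 105 = 35 and 210 + 105 = 315, so 310 is not an outlier.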

20) The following histogram summarises the times, in minutes, taken by 190 people to complete a race.

i) Show that 75 people took between 200 and 250 minutes to complete the race.
ii) Calculate estimates of the mean and standard deviation of the times of the 190 people.
iii) Explain why your answers to part (ii) are estimates.

[N13/S1/Q4]
Answer:
i) 75 (shown)
ii) 213, 46.5
iii) Because the mid-value of each interval is used in place of the actual (unknown) times when
calculating the mean and standard deviation.

21) The times taken by 57 athletes to run 100 metres are summarized in the following cumulative frequency table.

Time (seconds) < 10.0 < 10.5 < 11.0 < 12.0 < 12.5 < 13.5
Cumulative frequency 0 4 10 40 49 57

i) State how many athletes ran 100 metres in a time between 10.5 and 11.0 seconds.
ii) Draw a histogram on graph paper to represent the times taken by these athletes to run 100 metres.
iii) Calculate estimates of the mean and variance of the times taken by these athletes.
[J14/S1/Q6]

Answer:
i) 10 − 4 = 6
ii)
iii) 0.547
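
A cumulative frequency table has to be turned back into class frequencies before a histogram or grouped estimates can be produced. The sketch below does this for the table in this question, then computes frequency densities (frequency divided by class width) and mid-value estimates of the mean and variance. The variable names are illustrative; under this method the variance comes out at about 0.547, which matches the single figure quoted in the answer.

```python
bounds = [10.0, 10.5, 11.0, 12.0, 12.5, 13.5]   # upper limits from the table
cumulative = [0, 4, 10, 40, 49, 57]             # cumulative frequencies

# Class frequencies are differences of consecutive cumulative frequencies.
freqs = [b - a for a, b in zip(cumulative, cumulative[1:])]
widths = [hi - lo for lo, hi in zip(bounds, bounds[1:])]
mids = [(lo + hi) / 2 for lo, hi in zip(bounds, bounds[1:])]

# Histogram bar heights.
densities = [f / w for f, w in zip(freqs, widths)]

# Mid-value estimates of the mean and variance.
n = cumulative[-1]
mean = sum(f * m for f, m in zip(freqs, mids)) / n
variance = sum(f * m * m for f, m in zip(freqs, mids)) / n - mean ** 2

print(freqs, densities)
print(round(mean, 1), round(variance, 3))
```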

22) On a certain day in spring, the heights of 200 daffodils are measured, correct to the nearest centimeter. The
frequency distribution is given below.

Height (cm) 4 − 10 11 − 15 16 − 20 21 − 25 26 − 30
Frequency 22 32 78 40 28

i) Draw a cumulative frequency graph to illustrate the data.


ii) 28% of these daffodils are of height ℎ cm or more. Estimate ℎ.
iii) You are given that the estimate of the mean height of these daffodils, calculated from the table, is 18.39
cm. Calculate an estimate of the standard deviation of the heights of these daffodils.
[N14/S1/Q6]
Answer:
i)
ii) ℎ = 21.7
iii) 6.01

23) In an open-plan office there are 88 computers. The times taken by these 88 computers to access a
particular web page are represented in the cumulative frequency diagram.
i) On graph paper draw a box-and-whisker plot to summarise this information.
An ‘outlier’ is defined as any data value which is more than 1.5 times the interquartile range above the
upper quartile, or more than 1.5 times the interquartile range below the lower quartile.

ii) Show that there are no outliers.


[J15/S1/Q3]

24) The weights, in kilograms, of the 15 rugby players of two teams, 𝐴 and 𝐵, are shown below.
Team A 97 98 104 84 100 109 115 99 122 82 116 96 84 107 91
Team B 75 79 94 101 96 77 111 108 83 84 86 115 82 113 95

i) Represent the data by drawing a back to back stem and leaf diagram with team 𝐴 on the left hand side of
the diagram and team 𝐵 on the right hand side.
ii) Find the interquartile range of the weights of the players in team 𝐴.
iii) A new player joins team 𝐵 as a substitute. The mean weight of the 16 players in team 𝐵 is now 93.9 kg.
Find the weight of the new player.
[N15/S1/Q5]
Answer:
i)
ii) 18 kg
iii) 103.4 kg
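
Parts (ii) and (iii) can be checked as follows. This is a sketch under one assumption: the quartiles are taken as the medians of the lower and upper halves of the ordered data, leaving out the middle value when 𝑛 is odd. Other quartile conventions exist and may give slightly different values. The function names are illustrative.

```python
team_a = [97, 98, 104, 84, 100, 109, 115, 99, 122, 82, 116, 96, 84, 107, 91]
team_b = [75, 79, 94, 101, 96, 77, 111, 108, 83, 84, 86, 115, 82, 113, 95]

def median(values):
    """Middle value of the ordered data (mean of the two middle values if n is even)."""
    s = sorted(values)
    m = len(s) // 2
    return s[m] if len(s) % 2 else (s[m - 1] + s[m]) / 2

def quartiles(values):
    """Lower and upper quartiles as medians of the two halves of the ordered data."""
    s = sorted(values)
    half = len(s) // 2
    lower = s[:half]                 # values below the median position
    upper = s[half + len(s) % 2:]    # values above the median position
    return median(lower), median(upper)

q1, q3 = quartiles(team_a)
iqr = q3 - q1

# Total weight of 16 players at the new mean, minus the total of the original 15.
new_player = round(16 * 93.9 - sum(team_b), 1)

print(q1, q3, iqr)
print(new_player)
```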

25) The following are the maximum daily wind speeds in kilometres per hour for the first two weeks in April for
two towns, Bronlea and Rogate.

Bronlea 21 45 6 33 27 3 32 14 28 24 13 17 25 22
Rogate 7 5 4 15 23 7 11 13 26 18 23 16 10 34

i) Draw a back to back stem and leaf diagram to represent this information.
ii) Write down the median of the maximum wind speeds for Bronlea and find the interquartile range for
Rogate.
iii) Use your diagram to make one comparison between the maximum wind speeds in the two towns.
[J16/S1/Q5]
Answer:
i)
ii) Median for Bronlea = 23 km/h; interquartile range for Rogate = 16 km/h
iii) The diagram shows that the maximum wind speeds in Rogate are generally lower than those in
Bronlea.
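
The diagram for part (i) is not reproduced above. As a rough sketch, a back to back stem and leaf diagram for these data can be printed with the script below. The layout, column width and key wording are choices made here, not taken from the mark scheme.

```python
from collections import defaultdict

bronlea = [21, 45, 6, 33, 27, 3, 32, 14, 28, 24, 13, 17, 25, 22]
rogate = [7, 5, 4, 15, 23, 7, 11, 13, 26, 18, 23, 16, 10, 34]

def leaves_by_stem(values):
    """Group the units digits (leaves) of the ordered data by tens digit (stem)."""
    groups = defaultdict(list)
    for v in sorted(values):
        groups[v // 10].append(v % 10)
    return groups

left, right = leaves_by_stem(bronlea), leaves_by_stem(rogate)
stems = range(0, max(max(left), max(right)) + 1)

rows = []
for stem in stems:
    # Left-hand leaves are reversed so that they increase away from the stem.
    left_leaves = " ".join(str(d) for d in reversed(left.get(stem, [])))
    right_leaves = " ".join(str(d) for d in right.get(stem, []))
    rows.append(f"{left_leaves:>12} | {stem} | {right_leaves}")

print("     Bronlea |   | Rogate")
print("\n".join(rows))
print("Key: 1 | 2 | 3 means 21 km/h in Bronlea and 23 km/h in Rogate")
```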

26) The number of people a football stadium can hold is called the ‘capacity’. The capacities of 130 football
stadiums in the UK, to the nearest thousand, are summarized in the table.

Capacity               3000-7000   8000-12000   13000-22000   23000-42000   43000-82000
Number of stadiums         40           30            18            34             8

i) On graph paper, draw a histogram to represent this information. Use a scale of 2 cm for a capacity of
10 000 on the horizontal axis.
ii) Calculate an estimate of the mean capacity of these 130 stadiums.
iii) Find which class in the table contains the median and which contains the lower quartile.
[N16/S1/Q5]
Answer:
ii) 18600
iii) Median: 8000 − 12000; lower quartile: 3000 − 7000
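
The following sketch checks parts (ii) and (iii) and also gives the frequency densities needed for the histogram in part (i). It assumes that, because capacities are rounded to the nearest thousand, each class extends 500 beyond its stated limits. The densities are expressed per 1000 of capacity, which is a choice of units made here.

```python
classes = [(3000, 7000), (8000, 12000), (13000, 22000), (23000, 42000), (43000, 82000)]
freqs = [40, 30, 18, 34, 8]

n = sum(freqs)

# The midpoint is the same whether or not the limits are widened by 500 on each side.
mids = [(lo + hi) / 2 for lo, hi in classes]
mean = sum(f * m for f, m in zip(freqs, mids)) / n

# Class widths use the true boundaries, e.g. 2500 to 7500 has width 5000.
widths = [(hi + 500) - (lo - 500) for lo, hi in classes]
densities = [f / (w / 1000) for f, w in zip(freqs, widths)]  # frequency per 1000 capacity

def class_containing(position):
    """Return the first class whose cumulative frequency reaches the given position."""
    running = 0
    for cls, f in zip(classes, freqs):
        running += f
        if running >= position:
            return cls

median_class = class_containing(n / 2)
lower_quartile_class = class_containing(n / 4)

print(round(mean, -2), densities)
print(median_class, lower_quartile_class)
```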

27) The length of time, 𝑡 minutes, taken to do the crossword in a certain newspaper was observed on 12 occasions.
The results are summarized below.
∑(𝑡 − 35) = −15
∑(𝑡 − 35)² = 82.23
Calculate the mean and standard deviation of these times taken to do the crossword.
[J07/S1/Q1]
Answer:
33.8 min, 2.3 min
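
A minimal sketch of the coded-data method used here: with 𝑑 = 𝑡 − 35, the mean is 35 plus the mean of 𝑑, and the standard deviation of 𝑡 equals that of 𝑑. The function name and its parameters are illustrative.

```python
from math import sqrt

def stats_from_coded_sums(n, a, sum_d, sum_d2):
    """Mean and standard deviation of x, given the sums of d = x - a and of d squared."""
    mean_d = sum_d / n
    variance = sum_d2 / n - mean_d ** 2   # subtracting a constant does not change the spread
    return a + mean_d, sqrt(variance)

mean, sd = stats_from_coded_sums(n=12, a=35, sum_d=-15, sum_d2=82.23)
print(round(mean, 2), round(sd, 2))
```

The same function applies to questions 32 and 35 once their own values of 𝑛, 𝑎 and the two sums are substituted.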

28) A summary of 24 observations of 𝑥 gave the following information:
∑(𝑥 − 𝑎) = −73.2 and ∑(𝑥 − 𝑎)² = 2115.
The mean of these values of 𝑥 is 8.95.
i) Find the value of constant 𝑎.
ii) Find the standard deviation of these values of 𝑥.
[’07/S1/Q1]
Answer:
i) 𝑎 = 12
ii) 8.88

29) Rachel measured the lengths in millimetres of some of the leaves on a tree. Her results are recorded below.
32 35 45 37 38 44 33 39 36 45
Find the mean and standard deviation of the lengths of these leaves.

[N08/S1/Q1]

Answer: 38.4 mm, 4.57 mm
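
For raw data such as these, Python's standard `statistics` module can be used as a check. Note that `pstdev` divides by 𝑛, which matches the S1 definition of standard deviation, whereas `stdev` divides by 𝑛 − 1 and would give a larger value.

```python
import statistics

lengths = [32, 35, 45, 37, 38, 44, 33, 39, 36, 45]

mean = statistics.mean(lengths)
sd = statistics.pstdev(lengths)   # population standard deviation (divisor n)

print(round(mean, 1), round(sd, 2))
```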

30) The times in minutes for seven students to become proficient at a new computer game were measured. The results
are shown below.
15 10 48 10 19 14 16

i) Find the mean and standard deviation of these times.


ii) State which of the mean, median or mode you consider would be most appropriate to use as a measure of
central tendency to represent the data in this case.
iii) For each of the two measures of average you did not choose in part (ii), give a reason why you consider it
inappropriate.
[N10/S1/Q1]
Answer:
i) 18.9, 12.3
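ii) The median would be the most appropriate measure of central tendency.
iii) The mean is distorted by the extreme value 48. The mode, 10, is the lowest value and so is not
representative of the data.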

31) Esme noted the test marks, 𝑥, of 16 people in a class. She found that ∑ 𝑥 = 824 and that the standard deviation of
𝑥 was 6.5.
i) Calculate ∑(𝑥 − 50) and ∑(𝑥 − 50)².
ii) One person did the test later and her mark was 72. Calculate the new mean and standard deviation of the
marks of all 17 people.
[N10/S1/Q2]
Answer:
i) 24, 712
ii) 52.7, 7.94
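
One way to organise the working for both parts is sketched below. It recovers the coded sums from ∑𝑥 and the standard deviation, then adds the new mark to each sum. The variable names are illustrative.

```python
from math import sqrt

n, a = 16, 50
sum_x, sd = 824, 6.5

# Part (i): coded sums with d = x - 50.
sum_d = sum_x - n * a
mean_d = sum_d / n
sum_d2 = n * (sd ** 2 + mean_d ** 2)   # rearranged from sd^2 = sum_d2/n - mean_d^2

# Part (ii): include the extra mark of 72 in both sums.
new_mark = 72
n_new = n + 1
sum_d_new = sum_d + (new_mark - a)
sum_d2_new = sum_d2 + (new_mark - a) ** 2

new_mean = a + sum_d_new / n_new
new_sd = sqrt(sum_d2_new / n_new - (sum_d_new / n_new) ** 2)

print(sum_d, sum_d2)
print(round(new_mean, 1), round(new_sd, 2))
```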

32) A sample of 36 data values, 𝑥, gave ∑(𝑥 − 45) = −148 and ∑(𝑥 − 45)² = 3089.
i) Find the mean and standard deviation of the 36 values.
ii) One extra data value of 29 was added to the sample. Find the standard deviation of all 37 values.
[J11/S1/Q3]
Answer:
i) 40.9, 8.30
ii) 8.41

33) The following are the times, in minutes, taken by 11 runners to complete a 10 km run.
48.3 55.2 59.9 67.7 60.5 75.6 62.5 57.4 53.4 49.2 64.1
Find the mean and standard deviation of these times.
[N11/S1/Q1]
Answer: 59.4, 7.69

34) The ages, 𝑥 years, of 150 cars are summarized by ∑𝑥 = 645 and ∑𝑥² = 8287.5. Find ∑(𝑥 − 𝑥̅)², where 𝑥̅
denotes the mean of 𝑥.
[J12/S1/Q1]
Answer: 5514
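
A possible line of working, using the standard identity for the sum of squared deviations about the mean:

```latex
\sum (x - \bar{x})^2 = \sum x^2 - \frac{\left(\sum x\right)^2}{n}
                     = 8287.5 - \frac{645^2}{150}
                     = 8287.5 - 2773.5
                     = 5514
```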

35) A summary of the speeds, 𝑥 kilometres per hour, of 22 cars passing a certain point gave the following information.
∑(𝑥 − 50) = 81.4 and ∑(𝑥 − 50)² = 671.0.
Find the variance of the speeds and hence find the value of ∑𝑥².
[J13/S1/Q2]
Answer: 16.81, 63811
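
One way to set out the working is shown below. The variance comes from the coded sums, and ∑𝑥² then follows by rearranging the variance formula for the uncoded data.

```latex
\text{Variance} = \frac{\sum (x-50)^2}{n} - \left(\frac{\sum (x-50)}{n}\right)^2
                = \frac{671.0}{22} - \left(\frac{81.4}{22}\right)^2
                = 30.5 - 3.7^2 = 16.81

\bar{x} = 50 + 3.7 = 53.7

\sum x^2 = n\left(\text{Variance} + \bar{x}^2\right)
         = 22\,(16.81 + 53.7^2)
         = 22 \times 2900.5 = 63811
```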

36) 120 people were asked to read an article in a newspaper. The times taken, to the nearest second, by the people to
read the article are summarized in the following table.

Time (seconds) 1 − 25 26 − 35 36 − 45 46 − 55 56 − 90
Number of people 4 24 38 34 20

Calculate estimates of the mean and standard deviation of the reading times.
[J15/S1/Q2]
Answer: 45.8, 14.9

37) For 𝑛 values of the variable 𝑥, it is given that ∑(𝑥 − 100) = 216 and ∑ 𝑥 = 2416.
Find the value of 𝑛.
[N15/S1/Q1]
Answer: 𝑛 = 22
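
The value of 𝑛 follows from expanding the coded sum:

```latex
\sum (x - 100) = \sum x - 100n
\quad\Longrightarrow\quad 216 = 2416 - 100n
\quad\Longrightarrow\quad n = \frac{2416 - 216}{100} = 22
```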
