CAIE Statistics 1 Data Measures Guide
Chapters 1, 2, 3
Representation and Measures of Data
SUPPORTING MATERIAL
EXAMPLES
2. A family has 38 films on DVD with a mean playing time of 1 hour 32 minutes. They also have 26 films on video
cassette, with a mean playing time of 2 hours 4 minutes. Find the mean playing time of all the films in their
collection.
[S1 book; Worked Example 2.6; Page 32]
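A minimal sketch of how the combined mean in question 2 could be checked. It assumes all times are converted to minutes first; the variable names are illustrative and not taken from the book.

```python
# Combined mean of two groups, working in minutes.
# Group sizes and means come from question 2; names are illustrative.
dvd_count, dvd_mean = 38, 1 * 60 + 32        # 1 h 32 min = 92 minutes
video_count, video_mean = 26, 2 * 60 + 4     # 2 h 4 min = 124 minutes

# Total playing time = sum over groups of (group size x group mean).
total_time = dvd_count * dvd_mean + video_count * video_mean
combined_mean = total_time / (dvd_count + video_count)

# Convert back to hours and minutes for the final answer.
hours, minutes = divmod(round(combined_mean), 60)
print(f"{combined_mean} minutes = {hours} h {minutes} min")
```

The key step is weighting each group mean by its group size, rather than averaging the two means directly.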
3. An examination was taken by 50 students. The 22 boys scored a mean of 71% and the girls scored a mean of 76%.
Find the mean score of all the students.
[S1 book; Ex 2B; No. 7]
4. A company employs 12 drivers. Their mean monthly salary is $1950. A new driver is employed and the mean
monthly salary falls by $8. Find the monthly salary of the new driver.
[S1 book; Ex 2B; No. 8]
5. The number of patients treated each day by a dentist during a 20-day period is shown in the following stem-and-
leaf diagram.
0 | 4 4 4 5 6 6 6 7        Key: 1 | 5 represents 15 patients
1 | 4 5 5 6 6 7 7 8 8 9
2 | 0 1
Find the median and modal number of patients.
6. Li Wei records the shirt collar size, x, of the male students in his year. The results are shown in the table.
8. A small workshop records how long it takes, in minutes, for each of their workers to make a certain item. The
times are shown in the table.
9.
a) Find the median for the values of 𝑡 given in the following table.
𝑡 7 8 9 10 11 12 13
𝑓 4 7 9 14 16 41 9
b) What feature of the data suggests that the mean of 𝑡 is less than the median? Confirm whether or not this is the case.
[S1 book; Ex 2E; No. 2]
10.
a) Find the median and the mode for the values of 𝑥 given in the following table.
𝑥 4 5 6 7 8
𝑓 14 13 4 12 15
b) Give one positive and one negative aspect of using each of the median and the mode as the average value for
𝑥.
c) Some values in the table have been incorrectly recorded as 8 instead of 4. Find the number of incorrectly
recorded values, given that the true median of 𝑥 is 5.5.
11. A teacher recorded the quiz marks of eight students as 11, 13, 15, 15, 17, 18, 19 and 20.
They later realized that there was a typing error, so they changed the mark of 11 to 1.
Investigate what effect this change has on the mode, mean and median of the students’ marks.
[S1 book; Ex 2E; No. 6]
13. The frequency table shows the number of breakdowns, b, per month recorded by a lorry firm over a certain period
of time.
15. The lengths of 2500 bolts, 𝑥 mm, are summarized by ∑(𝑥 − 40) = 875. Find the mean length of the bolts.
[S1 book; Ex 2C; No. 4]
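As a hedged illustration of question 15, the sketch below undoes the coding by adding the shift back to the mean of the coded values. The names `shift` and `sum_coded` are assumptions made for this example.

```python
# Mean from coded totals: if sum(x - a) is known for n values,
# then mean(x) = a + sum(x - a) / n.
n = 2500            # number of bolts
shift = 40          # the constant a subtracted from every length (mm)
sum_coded = 875     # given total of (x - 40)

mean_length = shift + sum_coded / n
print(round(mean_length, 2), "mm")
```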
16. The exact age of an individual boy is denoted by b, and the exact age of an individual girl is denoted by g.
Exactly 5 years ago, the sum of the ages of 10 boys was 127.0 years, so ∑(𝑏 − 5) = 127.0.
In exactly 5 years’ time, the sum of the ages of 15 girls will be 351.0 years, so ∑(𝑔 + 5) = 351.0.
Find the mean age today of
(a) The 10 boys
(b) The 15 girls
(c) The 10 boys and 15 girls combined.
[S1 book; Worked Example 2.9; Page 38]
18. For 20 values of 𝑥, it is given that ∑(𝑎𝑥 − 𝑏) = 400 and ∑(𝑏𝑥 − 𝑎) = 545. Given also that 𝑥̅ = 6.25, find the
value of 𝑎 and of 𝑏.
[S1 book; Ex 2D; No. 7]
22. Given that 15 values of 𝑥 are such that ∑(3𝑥 − 2) = 528, find 𝑥̅ and find the value of 𝑏 such that ∑(0.5𝑥 − 𝑏) =
138.
[S1 book; Ex 2D; No. 6]
23. Find the range and the interquartile range of the following sets of data.
a) 5, 8, 13, 17, 22, 25, 30
b) 7, 13, 21, 2, 37, 28, 17, 11, 2
c) 42, 47, 39, 51, 73, 18, 83, 29, 41, 64
d) 113, 97, 36, 81, 49, 41, 20, 66, 28, 32, 17, 107
e) 4.6, 0, − 2.6, 0.8, −1.9, −3.3, 5.2, −3.2
[S1 book; Ex 3A; No. 1]
24. The heights, x cm, of 10 boys are summarized by ∑ 𝑥 = 1650 and ∑ 𝑥² = 275 490.
The heights, y cm, of 15 girls are summarized by ∑ 𝑦 = 2370 and ∑ 𝑦² = 377 835.
Calculate, to 3 significant figures, the standard deviation of the heights of all 25 children together.
[S1 book; Worked Example 3.10; Page 72]
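One way question 24 might be checked numerically is sketched below. It assumes the population form of the variance, (∑x²)/n − (mean)², which is the form used throughout this material; the variable names are illustrative.

```python
from math import sqrt

# Summary totals from question 24.
n_boys, sum_x, sum_x2 = 10, 1650, 275_490
n_girls, sum_y, sum_y2 = 15, 2370, 377_835

# Pool the two groups by adding the counts, the sums and the sums of squares.
n = n_boys + n_girls
mean = (sum_x + sum_y) / n
variance = (sum_x2 + sum_y2) / n - mean ** 2
sd = sqrt(variance)

print(f"{sd:.3g} cm")   # standard deviation to 3 significant figures
```

Note that the two standard deviations are never combined directly; only the totals are pooled.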
25. The totals ∑ 𝑥² = 7931, ∑ 𝑥 = 397 and ∑ 𝑦 = 499 are given by 29 values of 𝑥 and 𝑛 values of 𝑦. All the values
of 𝑥 and 𝑦 together have a variance of 52.
a) Express ∑ 𝑦² in terms of 𝑛.
b) Find the value of 𝑛 for which ∑ 𝑦² − ∑ 𝑥² = 10.
[S1 book; Ex 3C; No. 6]
26. A building is occupied by 𝑛 companies. The number of people employed by these companies is denoted by 𝑥.
Find the mean number of employees, given that ∑ 𝑥² = 8900, ∑ 𝑥 = 220 and that the standard deviation of 𝑥 is
18.
[S1 book; Ex 3C; No. 2]
27. Eight students’ heights (h cm) are measured. They are as follows:
165 170 190 180 175 185 176 184
(b) Given ∑ ℎ² = 254 307, work out the variance. Show all your working.
28. Nahab asks the students in his year group how much allowance they get per week. The results, rounded to the
nearest Omani Riyals, are shown in the table.
(a) Work out the mean and standard deviation of the allowance. Give units with your answer.
(b) How many students received an allowance amount more than one standard deviation above the mean?
Comment on the accuracy of the manufacturer’s claim, giving relevant numerical evidence.
30. The daily mean wind speed, x (knots) in Chicago is recorded. The summary data are:
∑ 𝑥 = 243 and ∑ 𝑥² = 2317
(a) Work out the mean and the standard deviation of the daily mean wind speed.
The highest recorded wind speed was 17 knots and the lowest recorded wind speed was 4 knots.
(b) Estimate the number of days in which the wind speed was greater than one standard deviation above the mean.
(c) State one assumption you have made in making this estimate.
31. The lifetime, x, in hours, of 70 light bulbs is shown in the table below.
The data are coded using 𝑦 = (𝑥 − 1)/20.
(a) Estimate the mean of the coded values 𝑦̅.
(b) Hence find an estimate for the mean lifetime of the light bulbs, 𝑥̅ .
(c) Estimate the standard deviation of the lifetimes of the light bulbs.
33. It is known that 20 girls each have at least one brother. The number of brothers that they have is denoted by x.
Information about the values of 𝑥 − 1 is given in the following table.
Use the coded value to calculate the standard deviation of the number of brothers, to 3 decimal places.
[S1 book; Worked Example 3.13; Page 76]
34. Eight values of x are summarized by the totals ∑(𝑥 − 10)² = 1490 and ∑(𝑥 − 10) = 100.
Twelve values of y are summarized by the totals ∑(𝑦 + 5)² = 5139 and ∑(𝑦 + 5) = 234.
Find the variance of the 20 values of x and y together.
[S1 book; Worked Example 3.12; Page 76]
36. Readings from a device, denoted by 𝑦, are such that ∑(𝑦 − 3)² = 2775, ∑ 𝑦 = 105 and the standard deviation of
𝑦 is 13. Find the number of readings that were taken.
[S1 book; Ex 3D; No. 5]
37. Twenty values of 𝑥 are summarized by ∑(𝑥 − 1)² = 132 and ∑(𝑥 − 1) = 44.
Eighty values of 𝑦 are summarized by ∑(𝑦 + 1)² = 17704 and ∑(𝑦 + 1) = 1184.
a) Show that ∑ 𝑥 = 64 and that ∑ 𝑥² = 240.
b) Calculate the value of ∑ 𝑦 and of ∑ 𝑦².
c) Find the exact variance of the 100 values of 𝑥 and 𝑦 combined.
[S1 book; Ex 3D; No. 11]
38. The heights, 𝑥 cm, of 200 boys and the heights, 𝑦 cm, of 300 girls are summarized by the following totals:
∑(𝑥 − 160)² = 18240, ∑(𝑥 − 160) = 1820, ∑(𝑦 − 150)² = 20100, ∑(𝑦 − 150) = 2250.
a) Find the mean height of these 500 children.
b) By first evaluating ∑ 𝑥² and ∑ 𝑦², find the variance of the heights of the 500 children, including
appropriate units with your answer.
[S1 book; Ex 3D; No. 12]
39. Given that ∑(3𝑥 − 1)² = 9136, ∑(3𝑥 − 1) = 53 and n = 10, find the value of ∑ 𝑥².
[S1 book; Worked Example 3.15; Page 80]
40. Find the standard deviation of x, given that ∑ 4𝑥² = 14600, ∑ 2𝑥 = 420 and n = 20.
[S1 book; Ex 3E; No. 2]
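A short sketch of one possible approach to question 40: divide out the constant multipliers to recover ∑x² and ∑x, then apply the usual variance formula. The names are illustrative.

```python
from math import sqrt

n = 20
sum_4x2 = 14_600   # given total of 4x^2
sum_2x = 420       # given total of 2x

# Constants factor out of a sum: sum(4x^2) = 4 * sum(x^2), sum(2x) = 2 * sum(x).
sum_x2 = sum_4x2 / 4
sum_x = sum_2x / 2

variance = sum_x2 / n - (sum_x / n) ** 2
sd = sqrt(variance)
print(sd)
```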
41. Temperatures in degrees Celsius (°C) can be converted to temperatures in degrees Fahrenheit (°F) using the
formula F = 1.8C + 32.
(a) The temperatures yesterday had a range of 15°C. Express this range in degrees Fahrenheit.
(b) Temperatures elsewhere were recorded at hourly intervals in degrees Fahrenheit and were found to have mean
54.5 and variance 65.61. Find the mean and standard deviation of these temperatures in degrees Celsius.
[S1 book; Ex 3E; No. 6]
42. The temperatures, T °C, at seven locations in the Central Kalahari Game Reserve were recorded at 4 pm one
January afternoon. The values of T, correct to 1 decimal place, were:
32.1, 31.7, 31.2, 31.5, 31.9, 32.2 and 32.7.
(a) Evaluate ∑ 10(𝑇 − 30) and ∑ 100(𝑇 − 30)².
(b) Use your answers to part (a) to calculate the standard deviation of T.
(c) By 5 pm the temperature at each location had dropped by exactly 0.75°C. Find the variance of the temperatures
at 5 pm.
[S1 book; Ex 3E; No. 4]
43. Each of the 70 trainees at a secretarial college was asked to type a copy of a particular document. The times taken
are shown, correct to the nearest 0.1 minutes, in the following table.
(a) Explain why the interval for the first class has a width of 0.3 minutes.
(b) Represent the times taken in a histogram.
44. The heights of 600 saplings are shown in the following table.
(a) Suggest a suitable value for u, the upper boundary of the data.
(b) Illustrate the data in a histogram.
(c) Calculate an estimate of the number of saplings with heights that are:
(i) Less than 25 cm
(ii) Between 7.5 and 19.5 cm
[S1 book; Ex 1B; No. 4]
45. A university investigated how much space on its computers’ hard drives is used for data storage. The results are
shown below. It is given that 40 hard drives use less than 20 GB for data storage.
46. The following table shows the lengths of 80 leaves from a particular tree, given to the nearest centimeter.
47. The following table shows the widths of the 70 books in one section of a library, given to the nearest centimeter.
(a) Given that the upper boundary of the first class is 14.5 cm, write down the upper boundary of the second class.
48. The following cumulative frequency graph shows the masses, 𝑚 grams, of 152 uncut diamonds.
49. The following graph illustrates the times, in minutes, taken by 500 people to complete a task.
50. The following graph illustrates the times taken by 112 people to complete a puzzle.
51. The resistances, in ohms (Ω), of 100 conductors are represented in the following graph.
52. A group of students took a test. The summary data are shown in the table.
Given that there were no outliers, draw a box plot to illustrate these data.
54. The masses of male and female turtles are given in grams. The data are summarized in the box plots.
(a) Compare and contrast the masses of the male and female turtles.
(b) A turtle was found to have a mass of 330 grams. State whether it is likely to be a male or a female. Give a
reason for your answer.
(c) Write down the size of the largest female turtle.
55. University students measured the heights of the 54 trees in the grounds of a primary school. As part of a talk on
conservation at a school assembly, the students have decided to present their data using one of the following
diagrams.
56. The following table shows the focal lengths, 𝑙 mm, of the 84 zoom lenses sold by a shop. For example, there are
18 zoom lenses that can be set to any focal length between 24 and 50 mm.
a) What feature of the data does not allow them to be displayed in a histogram?
b) What type of diagram could you use to illustrate the data? Explain clearly how you would do this.
[S1 book; Ex 1D; No. 8]
1) The salaries, in thousands of dollars, of 11 people, chosen at random in a certain office, were found to be:
40, 42, 45, 41, 352, 40, 50, 48, 51, 49, 47
Choose and calculate an appropriate measure of central tendency (mean, mode or median) to summarise these
salaries. Explain briefly why the other measures are not suitable.
[J06/S1/Q1]
2) Each father in a random sample of fathers was asked how old he was when his first child was born. The following
histogram represents the information.
3) 32 teams enter for a knockout competition, in which each match results in one team winning and the other team
losing. After each match the winning team goes on to the next round, and the losing team takes no further part in
the competition. Thus 16 teams play in the second round, 8 teams play in the third round, and so on, until 2 teams
play in the final round.
50 45 61 53 55 47 52 49 46 51
60 52 54 47 57 59 42 46 51 53
56 48 50 51 44 52 49 58 55 45
Construct a grouped frequency table for these data such that there are five equal class intervals with the first class
having a lower boundary of 41.5 kg and the fifth class having an upper boundary of 61.5 kg.
[N06/S1/Q1]
5) In a survey, people were asked how long they took to travel to and from work, on average. The median time was 3
hours 36 minutes, the upper quartile was 4 hours 42 minutes and the interquartile range was 3 hours 48 minutes.
The longest time taken was 5 hours 12 minutes and the shortest time was 30 minutes.
i) Find the lower quartile.
ii) Represent the information by a box-and-whisker plot, using a scale of 2 cm to represent 60 minutes.
[N06/S1/Q03]
Answer:
i) 0.9 hours (54 minutes)
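The lower quartile in part i) follows from rearranging IQR = upper quartile − lower quartile. The sketch below works in minutes; the helper `to_minutes` is a hypothetical name introduced only for this example.

```python
# Times are held in minutes to avoid mixing hours and minutes.
def to_minutes(hours, minutes):
    return hours * 60 + minutes

upper_quartile = to_minutes(4, 42)
interquartile_range = to_minutes(3, 48)

# IQR = upper quartile - lower quartile, so rearrange for the lower quartile.
lower_quartile = upper_quartile - interquartile_range
print(lower_quartile, "minutes =", lower_quartile / 60, "hours")
```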
6) The lengths of time in minutes to swim a certain distance by the members of a class of twelve 9-year-olds and by
the members of a class of eight 16-year-olds are shown below.
9-year-olds: 13.0 16.1 16.0 14.4 15.9 15.1 14.2 13.7 16.7 16.4 15.0 13.2
16-year-olds: 14.8 13.0 11.4 11.7 16.5 13.7 12.8 12.9
7) The arrival times of 204 trains were noted and the number of minutes, 𝑡, that each train was late was recorded.
The results are summarized in the table.
Number of minutes late (𝑡)   −2 ≤ 𝑡 < 0   0 ≤ 𝑡 < 2   2 ≤ 𝑡 < 4   4 ≤ 𝑡 < 6   6 ≤ 𝑡 < 10
Number of trains             43           51          69          22          19
8) The stem-and-leaf diagram below represents data collected for the number of hits on an internet site on each day
in March 2007. There is one missing value, denoted by 𝑥.
0 0 1 5 6
1 1 3 5 6 6 8
2 1 1 2 3 4 4 4 8 9
3 1 2 2 2 𝑥 8 9
4 2 5 6 7 9
9) As part of a data collection exercise, members of a certain school year group were asked how long they spent on
their Mathematics homework during one particular week. The times are given to the nearest 0.1 hour. The results
are displayed in the following table.
Time spent (𝑡 hours) 0.1 ≤ 𝑡 ≤ 0.5 0.6 ≤ 𝑡 ≤ 1.0 1.1 ≤ 𝑡 ≤ 2.0 2.1 ≤ 𝑡 ≤ 3.0 3.1 ≤ 𝑡 ≤ 4.5
Frequency 11 15 18 30 21
10) The pulse rates, in beats per minutes, of a random sample of 15 small animals are shown in the following table.
115 120 158 132 125
104 142 160 145 104
162 117 109 124 134
11) During January the numbers of people entering a store during the first hour after opening were as follows.
Time after opening, 𝑥 minutes   Frequency   Cumulative frequency
0 < 𝑥 ≤ 10                      210         210
10 < 𝑥 ≤ 20                     134         344
20 < 𝑥 ≤ 30                     78          422
30 < 𝑥 ≤ 40                     72          𝑎
40 < 𝑥 ≤ 60                     𝑏           540
i) Find the values of 𝑎 and 𝑏.
ii) Draw a cumulative frequency graph to represent this information. Take a scale of 2 cm for 10 minutes
on the horizontal axis and 2 cm for 50 people on the vertical axis.
iii) Use your graph to estimate the median time after opening that people entered the store.
iv) Calculate estimates of the mean, 𝑚 minutes, and standard deviation, 𝑠 minutes, of the time after opening
that people entered the store.
v) Use your graph to estimate the number of people entering the store between (𝑚 − ½𝑠) and (𝑚 + ½𝑠)
minutes after opening.
[09/S1/Q6]
12) A library has many identical shelves. All the shelves are full and the numbers of books on each shelf in a certain
section are summarized by the following stem and leaf diagram.
3 3699
4 67
5 0122
6 0011234444455667889
7 113335667899
8 0245568
9 001244445567788999
2 1222334566679
3 011234456677788
4 22357789
Answer:
i) 67
iii) Books in the second section are thicker or contain more pages.
i) Represent this information on a back to back stem and leaf diagram with sugar on the left hand side.
ii) Find the median and interquartile range of the weights of the bags of sugar.
[N10/S1/Q4]
Answer:
15) A hotel has 90 rooms. The table summarises information about the number of rooms occupied each day for a
period of 200 days.
Number of rooms occupied 1 − 20 21 − 40 41 − 50 51 − 60 61 − 70 71 − 90
Frequency 10 32 62 50 28 18
16) The weights of 220 sausages are summarised in the following table.
Weight (grams) < 20 < 30 < 40 < 45 < 50 < 60 < 70
Cumulative frequency 0 20 50 100 160 210 220
17) The back to back stem and leaf diagram shows the values taken by two variables A and B.
                           A |    | B
(3)                    3 1 0 | 15 | 1 3 3 5                (4)
(2)                      4 1 | 16 | 2 2 3 4 4 5 7 7 7 8    (10)
(3)                    8 3 3 | 17 | 0 1 3 3 3 4 6 6 7 9 9  (11)
(12) 9 8 8 6 5 5 4 3 2 1 1 0 | 18 | 2 4 7                  (3)
(8)          9 9 8 8 6 5 4 2 | 19 | 1 5                    (2)
(5)                9 8 7 1 0 | 20 | 4                      (1)
18) The table summarizes the times that 112 people took to travel to work on a particular day.
Time to travel to work (𝑡 minutes)   0 < 𝑡 ≤ 10   10 < 𝑡 ≤ 15   15 < 𝑡 ≤ 20   20 < 𝑡 ≤ 25   25 < 𝑡 ≤ 40   40 < 𝑡 ≤ 60
Frequency                            19           12            28            22            18            13
i) State which time interval in the table contains the median and which time interval contains the upper
quartile.
ii) On graph paper, draw a histogram to represent the data.
iii) Calculate an estimate of the mean time to travel to work.
[N12/S1/Q3]
Answer:
i) 15 < 𝑡 ≤ 20, 25 < 𝑡 ≤ 40
iii) 22.0 minutes
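The answers to parts i) and iii) of question 18 can be reproduced with a sketch like the one below. It assumes the median and upper quartile are located at the n/2 and 3n/4 positions of the cumulative frequency, and that every value in a class sits at its midpoint; `class_containing` is a hypothetical helper.

```python
# Classes as (lower bound, upper bound, frequency), copied from the table.
classes = [(0, 10, 19), (10, 15, 12), (15, 20, 28),
           (20, 25, 22), (25, 40, 18), (40, 60, 13)]

n = sum(f for _, _, f in classes)

# Estimated mean: each value is treated as lying at its class midpoint.
mean = sum((lo + hi) / 2 * f for lo, hi, f in classes) / n

def class_containing(position):
    """Return the class in which the cumulative frequency first reaches `position`."""
    running = 0
    for lo, hi, f in classes:
        running += f
        if running >= position:
            return (lo, hi)
    raise ValueError("position exceeds the total frequency")

median_class = class_containing(n / 2)
upper_quartile_class = class_containing(3 * n / 4)
print(round(mean, 1), median_class, upper_quartile_class)
```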
19) The following are the annual amounts of money spent on clothes, to the nearest $10, by 27 people.
[N13/S1/Q4]
Answer:
i) 75 (shown)
ii) 213, 46.5
iii) Because the mid-values of the intervals were used
to estimate the mean and standard deviation, the
results are only estimates.
Time (seconds) < 10.0 < 10.5 < 11.0 < 12.0 < 12.5 < 13.5
Cumulative frequency 0 4 10 40 49 57
i) State how many athletes ran 100 metres in a time between 10.5 and 11.0 seconds.
ii) Draw a histogram on graph paper to represent the times taken by these athletes to run 100 metres.
iii) Calculate estimates of the mean and variance of the times taken by these athletes.
[J14/S1/Q6]
Answer:
i) 10 − 4 = 6
ii)
iii) 0.547
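As a hedged check on the variance quoted in part iii), the sketch below first turns the cumulative frequencies into class frequencies and then uses class midpoints. The list names are illustrative, and the midpoint assumption is the usual one for grouped estimates.

```python
# Upper class boundaries (seconds) and cumulative frequencies from the table.
bounds = [10.0, 10.5, 11.0, 12.0, 12.5, 13.5]
cumulative = [0, 4, 10, 40, 49, 57]

# Class frequency = difference between successive cumulative frequencies.
frequencies = [b - a for a, b in zip(cumulative, cumulative[1:])]
midpoints = [(lo + hi) / 2 for lo, hi in zip(bounds, bounds[1:])]

n = cumulative[-1]
mean = sum(f * m for f, m in zip(frequencies, midpoints)) / n
variance = sum(f * m * m for f, m in zip(frequencies, midpoints)) / n - mean ** 2

print(frequencies, round(mean, 2), round(variance, 3))
```

The second entry of `frequencies` is the count asked for in part i).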
22) On a certain day in spring, the heights of 200 daffodils are measured, correct to the nearest centimeter. The
frequency distribution is given below.
Height (cm) 4 − 10 11 − 15 16 − 20 21 − 25 26 − 30
Frequency 22 32 78 40 28
24) The weights, in kilograms, of the 15 rugby players of two teams, 𝐴 and 𝐵, are shown below.
Team A 97 98 104 84 100 109 115 99 122 82 116 96 84 107 91
Team B 75 79 94 101 96 77 111 108 83 84 86 115 82 113 95
i) Represent the data by drawing a back to back stem and leaf diagram with team 𝐴 on the left hand side of
the diagram and team 𝐵 on the right hand side.
ii) Find the interquartile range of the weights of the players in team 𝐴.
iii) A new player joins team 𝐵 as a substitute. The mean weight of the 16 players in team 𝐵 is now 93.9 kg.
Find the weight of the new player.
[N15/S1/Q5]
Answer:
i)
ii) 18 kg
iii) 103.4 kg
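Parts ii) and iii) of question 24 can be checked with the sketch below. It assumes quartiles are taken as the medians of the lower and upper halves of the ordered data, with the middle value left out of both halves when the count is odd; other quartile conventions can give different values. The function `quartiles` is a hypothetical helper.

```python
from statistics import median

team_a = [97, 98, 104, 84, 100, 109, 115, 99, 122, 82, 116, 96, 84, 107, 91]
team_b = [75, 79, 94, 101, 96, 77, 111, 108, 83, 84, 86, 115, 82, 113, 95]

def quartiles(values):
    """Lower and upper quartiles as medians of the lower and upper halves.
    For an odd number of values the middle value is left out of both halves."""
    data = sorted(values)
    half = len(data) // 2
    return median(data[:half]), median(data[-half:])

q1, q3 = quartiles(team_a)
iqr_a = q3 - q1

# New total for 16 players minus the old total gives the new player's weight.
new_player = 16 * 93.9 - sum(team_b)
print(iqr_a, round(new_player, 1))
```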
25) The following are the maximum daily wind speeds in kilometers per hour for the first two weeks in April for
two towns, Bronlea and Rogate.
Bronlea 21 45 6 33 27 3 32 14 28 24 13 17 25 22
Rogate 7 5 4 15 23 7 11 13 26 18 23 16 10 34
i) Draw a back to back stem and leaf diagram to represent this information.
ii) Write down the median of the maximum wind speeds for Bronlea and find the interquartile range for
Rogate.
iii) Use your diagram to make one comparison between the maximum wind speeds in the two towns.
[J16/S1/Q5]
Answer:
i)
ii) 23 km/h, 16
iii) The stem and leaf diagram shows that the maximum wind speeds in Rogate are generally lower than
those in Bronlea.
26) The number of people a football stadium can hold is called the ‘capacity’. The capacities of 130 football
stadiums in the UK, to the nearest thousand, are summarized in the table.
i) On graph paper, draw a histogram to represent this information. Use a scale of 2 cm for a capacity of
10 000 on the horizontal axis.
ii) Calculate an estimate of the mean capacity of these 130 stadiums.
iii) Find which class in the table contains the median and which contains the lower quartile.
[N16/S1/Q5]
Answer:
ii) 18600
iii) 8000 – 12000, 3000 − 7000
27) The length of time, 𝑡 minutes, taken to do the crossword in a certain newspaper was observed on 12 occasions.
The results are summarized below.
∑(𝑡 − 35) = −15
∑(𝑡 − 35)² = 82.23
Calculate the mean and standard deviation of these times taken to do the crossword.
[J07/S1/Q1]
Answer:
33.8 min, 2.3 min
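The quoted answer can be reproduced from the coded totals, as in the sketch below. It relies on the fact that subtracting a constant shifts the mean but leaves the standard deviation unchanged; the names `sum_d` and `sum_d2` are illustrative.

```python
from math import sqrt

n = 12
shift = 35
sum_d = -15       # given total of (t - 35)
sum_d2 = 82.23    # given total of (t - 35)^2

# The mean is shifted back; the spread is unaffected by the coding.
mean_t = shift + sum_d / n
variance = sum_d2 / n - (sum_d / n) ** 2
sd = sqrt(variance)

print(round(mean_t, 1), round(sd, 1))
```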
29) Rachel measured the lengths in millimeters of some of the leaves on a tree. Her results are recorded below.
32 35 45 37 38 44 33 39 36 45
Find the mean and standard deviation of the lengths of these leaves.
[N08/S1/Q1]
30) The times in minutes for seven students to become proficient at a new computer game were measured. The results
are shown below.
15 10 48 10 19 14 16
31) Esme noted the test marks, 𝑥, of 16 people in a class. She found that ∑ 𝑥 = 824 and that the standard deviation of
𝑥 was 6.5.
i) Calculate ∑(𝑥 − 50) and ∑(𝑥 − 50)².
ii) One person did the test later and her mark was 72. Calculate the new mean and standard deviation of the
marks of all 17 people.
[N10/S1/Q2]
Answer:
i) 24, 712
ii) 52.7, 7.94
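A sketch of one route to the answers for question 31, assuming the coded totals are built first and then updated for the extra mark. All variable names are illustrative.

```python
from math import sqrt

n, sum_x, sd = 16, 824, 6.5
a = 50   # coding constant used in part i)

# Part i): sum(x - a) = sum(x) - n*a, and
# sd^2 = sum((x - a)^2)/n - (sum(x - a)/n)^2, rearranged for the squared total.
sum_d = sum_x - n * a
sum_d2 = n * (sd ** 2 + (sum_d / n) ** 2)

# Part ii): add the late mark of 72 to both coded totals.
late_mark = 72
n_new = n + 1
sum_d_new = sum_d + (late_mark - a)
sum_d2_new = sum_d2 + (late_mark - a) ** 2

new_mean = a + sum_d_new / n_new
new_sd = sqrt(sum_d2_new / n_new - (sum_d_new / n_new) ** 2)
print(sum_d, sum_d2, round(new_mean, 1), round(new_sd, 2))
```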
32) A sample of 36 data values, 𝑥, gave ∑(𝑥 − 45) = −148 and ∑(𝑥 − 45)² = 3089.
i) Find the mean and standard deviation of the 36 values.
ii) One extra data value of 29 was added to the sample. Find the standard deviation of all 37 values.
[J11/S1/Q3]
Answer:
i) 40.9, 8.30
ii) 8.41
33) The following are the times, in minutes, taken by 11 runners to complete a 10 km run.
48.3 55.2 59.9 67.7 60.5 75.6 62.5 57.4 53.4 49.2 64.1
Find the mean and standard deviation of these times.
[N11/S1/Q1]
Answer: 59.4, 7.69
35) A summary of the speeds, 𝑥 kilometres per hour, of 22 cars passing a certain point gave the following information.
∑(𝑥 − 50) = 81.4 and ∑(𝑥 − 50)² = 671.0.
Find the variance of the speeds and hence find the value of ∑ 𝑥².
[J13/S1/Q2]
Answer: 16.81, 63811
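The sketch below checks both parts of the answer to question 35. It finds ∑x² by expanding ∑(x − a)² = ∑x² − 2a∑x + na², which is one of several equivalent routes; the names are illustrative.

```python
n = 22
a = 50
sum_d = 81.4      # given total of (x - 50)
sum_d2 = 671.0    # given total of (x - 50)^2

# The variance is unchanged by the coding.
variance = sum_d2 / n - (sum_d / n) ** 2

# Undo the coding: sum(x) = sum(x - a) + n*a, then
# sum(x^2) = sum((x - a)^2) + 2a*sum(x) - n*a^2.
sum_x = sum_d + n * a
sum_x2 = sum_d2 + 2 * a * sum_x - n * a ** 2

print(round(variance, 2), round(sum_x2))
```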
36) 120 people were asked to read an article in a newspaper. The times taken, to the nearest second, by the people to
read the article are summarized in the following table.
Time (seconds) 1 − 25 26 − 35 36 − 45 46 − 55 56 − 90
Number of people 4 24 38 34 20
Calculate estimates of the mean and standard deviation of the reading times.
[J15/S1/Q2]
Answer: 45.8, 14.9
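Because the times in question 36 are recorded to the nearest second, the sketch below assumes each class runs from 0.5 below its lowest recorded value to 0.5 above its highest, and that values sit at the class midpoint. The list name `recorded` is illustrative.

```python
from math import sqrt

# (lowest recorded value, highest recorded value, frequency) for each class.
recorded = [(1, 25, 4), (26, 35, 24), (36, 45, 38), (46, 55, 34), (56, 90, 20)]

n = sum(f for _, _, f in recorded)

# True class boundaries are half a second either side of the recorded limits.
midpoints = [((lo - 0.5) + (hi + 0.5)) / 2 for lo, hi, _ in recorded]
freqs = [f for _, _, f in recorded]

mean = sum(f * m for f, m in zip(freqs, midpoints)) / n
sd = sqrt(sum(f * m * m for f, m in zip(freqs, midpoints)) / n - mean ** 2)

print(round(mean, 1), round(sd, 1))
```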
37) For 𝑛 values of the variable 𝑥, it is given that ∑(𝑥 − 100) = 216 and ∑ 𝑥 = 2416.
Find the value of 𝑛.
[N15/S1/Q1]
Answer: 𝑛 = 22