Statistics

by Roger Porkess, Sophie Goldie

This brand new series has been written for the University of Cambridge International Examinations course for AS and A Level Mathematics (9709). This title covers the requirements of S1 and S2. The authors are experienced examiners and teachers who have written extensively at this level, so they have ensured that all mathematical concepts are explained using language and terminology appropriate for students across the world. Students are provided with clear and detailed worked examples and questions.

Publisher : Hodder Education

Author : Roger Porkess, Sophie Goldie

ISBN : 9781444146509

Year : 2012

Language: en

Category : Science Math

Cambridge

International AS and A Level Mathematics

Statistics
Sophie Goldie
Series Editor: Roger Porkess

Questions from the Cambridge International Examinations AS and A Level Mathematics papers
are reproduced by permission of University of Cambridge International Examinations.
Questions from the MEI AS and A Level Mathematics papers are reproduced by permission of OCR.
We are grateful to the following companies, institutions and individuals who have given permission
to reproduce photographs in this book.
Photo credits: page 3 © Artur Shevel / Fotolia; page 77 © Luminis / Fotolia; page 105 © Ivan Kuzmin / Alamy; page 123
© S. Ferguson; page 134 © Peter Küng / Fotolia; page 141 © Mathematics in Education and Industry; p.192 © Claudia
Paulussen / Fotolia.com; page 202 © Ingram Publishing Limited; page 210 © Peter Titmuss / Alamy; page 216 © Monkey
Business / Fotolia; page 233 © StockHouse / Fotolia; page 236 © Ingram Publishing Limited / Ingram Image Library
500-Animals; page 256 © Kevin Peterson / Photodisc / Getty Images; page 277 © Charlie Edwards / Getty Images;
page 285 © Stuart Miles / Fotolia.com
All designated trademarks and brands are protected by their respective trademarks.
Every effort has been made to trace and acknowledge ownership of copyright. The publishers will be
glad to make suitable arrangements with any copyright holders whom it has not been possible to contact.
Hachette UK’s policy is to use papers that are natural, renewable and recyclable products and
made from wood grown in sustainable forests. The logging and manufacturing processes are
expected to conform to the environmental regulations of the country of origin.
Orders: please contact Bookpoint Ltd, 130 Milton Park, Abingdon, Oxon OX14 4SB.
Telephone: (44) 01235 827720. Fax: (44) 01235 400454. Lines are open 9.00–5.00, Monday
to Saturday, with a 24-hour message answering service. Visit our website at www.hoddereducation.co.uk
Much of the material in this book was published originally as part of the MEI Structured
Mathematics series. It has been carefully adapted for the Cambridge International AS and A Level
Mathematics syllabus.
The original MEI author team for Statistics comprised Michael Davies, Ray Dunnett, Anthony Eccles,
Bob Francis, Bill Gibson, Gerald Goddall, Alan Graham, Nigel Green and Roger Porkess.
Copyright in this format © Roger Porkess and Sophie Goldie, 2012
First published in 2012 by
Hodder Education, an Hachette UK company,
338 Euston Road
London NW1 3BH
Impression number 5 4 3 2 1
Year 2016 2015 2014 2013 2012
All rights reserved. Apart from any use permitted under UK copyright law, no part of this
publication may be reproduced or transmitted in any form or by any means, electronic or
mechanical, including photocopying and recording, or held within any information storage
and retrieval system, without permission in writing from the publisher or under licence from
the Copyright Licensing Agency Limited. Further details of such licences (for reprographic
reproduction) may be obtained from the Copyright Licensing Agency Limited, Saffron
House, 6–10 Kirby Street, London EC1N 8TS.
Cover photo © Kaz Chiba/Photodisc/Getty Images/Natural Patterns BS13
Illustrations by Pantek Media, Maidstone, Kent
Typeset in 10.5pt Minion by Pantek Media, Maidstone, Kent
Printed in Dubai
A catalogue record for this title is available from the British Library
ISBN 978 1444 14650 9

Contents

Key to symbols in this book
Introduction
The Cambridge International AS and A Level Mathematics syllabus

S1 Statistics 1

Chapter 1 Exploring data
    Looking at the data
    Stem-and-leaf diagrams
    Categorical or qualitative data
    Numerical or quantitative data
    Measures of central tendency
    Frequency distributions
    Grouped data
    Measures of spread (variation)
    Working with an assumed mean

Chapter 2 Representing and interpreting data
    Histograms
    Measures of central tendency and of spread using quartiles
    Cumulative frequency curves

Chapter 3 Probability
    Measuring probability
    Estimating probability
    Expectation
    The probability of either one event or another
    Independent and dependent events
    Conditional probability

Chapter 4 Discrete random variables
    Discrete random variables
    Expectation and variance

Chapter 5 Permutations and combinations
    Factorials
    Permutations
    Combinations
    The binomial coefficients
    Using binomial coefficients to calculate probabilities

Chapter 6 The binomial distribution
    The binomial distribution
    The expectation and variance of B(n, p)
    Using the binomial distribution

Chapter 7 The normal distribution
    Using normal distribution tables
    The normal curve
    Modelling discrete situations
    Using the normal distribution as an approximation for the binomial distribution

S2 Statistics 2

Chapter 8 Hypothesis testing using the binomial distribution
    Defining terms
    Hypothesis testing checklist
    Choosing the significance level
    Critical values and critical (rejection) regions
    One-tail and two-tail tests
    Type I and Type II errors

Chapter 9 The Poisson distribution
    The Poisson distribution
    Modelling with a Poisson distribution
    The sum of two or more Poisson distributions
    The Poisson approximation to the binomial distribution
    Using the normal distribution as an approximation for the Poisson distribution

Chapter 10 Continuous random variables
    Probability density function
    Mean and variance
    The median
    The mode
    The uniform (rectangular) distribution

Chapter 11 Linear combinations of random variables
    The expectation (mean) of a function of X, E(g[X])
    Expectation: algebraic results
    The sums and differences of independent random variables
    More than two independent random variables

Chapter 12 Sampling
    Terms and notation
    Sampling
    Sampling techniques

Chapter 13 Hypothesis testing and confidence intervals using the normal distribution
    Interpreting sample data using the normal distribution
    The Central Limit Theorem
    Confidence intervals
    How large a sample do you need?
    Confidence intervals for a proportion

Answers
Index

Key to symbols in this book
? This symbol means that you may want to discuss a point with your teacher. If
you are working on your own there are answers in the back of the book. It is
important, however, that you have a go at answering the questions before looking
up the answers if you are to understand the mathematics fully.

! This is a warning sign. It is used where a common mistake, misunderstanding or
tricky point is being described.

This is the ICT icon. It indicates where you could use a graphic calculator or a
computer. Graphic calculators and computers are not permitted in any of the
examinations for the Cambridge International AS and A Level Mathematics 9709
syllabus, however, so these activities are optional.

This symbol and a dotted line down the right-hand side of the page indicate
material which is beyond the syllabus for the unit but which is included for
completeness.

Introduction
This is part of a series of books for the University of Cambridge International
Examinations syllabus for Cambridge International AS and A Level Mathematics
9709. There are thirteen chapters in this book; the first seven cover Statistics 1
and the remaining six Statistics 2. The series also includes two books for pure
mathematics and one for mechanics.
These books are based on the highly successful series for the Mathematics in
Education and Industry (MEI) syllabus in the UK but they have been redesigned
for Cambridge international students; where appropriate, new material has been
written and the exercises contain many past Cambridge examination questions.
An overview of the units making up the Cambridge international syllabus is given
in the diagram on the next page.
Throughout the series the emphasis is on understanding the mathematics as well
as routine calculations. The various exercises provide plenty of scope for practising
basic techniques; they also contain many typical examination questions.
An important feature of this series is the electronic support. There is an
accompanying disc containing two types of Personal Tutor presentation:
examination-style questions, in which the solutions are written out, step by step,
with an accompanying verbal explanation, and test-yourself questions; these are
multiple-choice with explanations of the mistakes that lead to the wrong answers
as well as full solutions for the correct ones. In addition, extensive online support
is available via the MEI website, www.mei.org.uk.
The books are written on the assumption that students have covered and
understood the work in the Cambridge IGCSE® syllabus. However, some
of the early material is designed to provide an overlap and this is designated
‘Background’. There are also places where the books show how the ideas can be
taken further or where fundamental underpinning work is explored and such
work is marked as ‘Extension’.
The original MEI author team would like to thank Sophie Goldie who has carried
out the extensive task of presenting their work in a suitable form for Cambridge
international students and for her original contributions. They would also like to
thank University of Cambridge International Examinations for their detailed advice
in preparing the books and for permission to use many past examination questions.
Roger Porkess
Series Editor

The Cambridge International AS and A Level Mathematics syllabus

[Diagram: the units P1, P2, P3, M1, M2, S1 and S2, leading from Cambridge IGCSE
Mathematics to AS Level Mathematics and A Level Mathematics]

Statistics 1

Chapter 1 Exploring data
A judicious man looks at statistics, not to get knowledge but to save
himself from having ignorance foisted on him.
Carlyle

Source: The Times 2012

The cuttings on page 2 all appeared in one newspaper on one day. Some of them
give data as figures, others display them as diagrams.
How do you interpret this information? Which data do you take seriously and
which do you dismiss as being insignificant or even misleading?

To answer these questions fully you need to understand how data are collected
and analysed before they are presented to you, and how you should evaluate what
you are given to read (or see on the television). This is an important part of the
subject of statistics.

In this book, many of the examples are set as stories from fictional websites.
Some of them are written as articles or blogs; others are presented from the
journalists’ viewpoint as they sort through data trying to write an interesting
story. As you work through the book, look too at the ways you are given such
information in your everyday life.

bikingtoday.com
Another cyclist seriously hurt. Will you be next?
On her way back home from school on
Wednesday afternoon, little Rita Roy
was knocked off her bicycle and taken to
hospital with suspected concussion.
Rita was struck by a Ford Transit van, only
50 metres from her own house.
Rita is the fourth child from the Nelson
Mandela estate to be involved in a serious
cycling accident this year.

The busy road where Rita Roy was
knocked off her bicycle yesterday.

After reading the blog, the editor of a local newspaper commissioned one of the
paper’s reporters to investigate the situation and write a leading article for the
paper on it. She explained to the reporter that there was growing concern locally
about cycling accidents involving children. She emphasised the need to collect
good quality data to support presentations to the paper’s readers.

? Is the aim of the investigation clear?
Is the investigation worth carrying out?
What makes good quality data?

The reporter started by collecting data from two sources. He went through back
numbers of the newspaper for the previous two years, finding all the reports of
cycling accidents. He also asked an assistant to carry out a survey of the ages of
local cyclists; he wanted to know whether most cyclists were children, young
adults or whatever.

? Are the reporter’s data sources appropriate?

Before starting to write his article, the reporter needed to make sense of the data
for himself. He then had to decide how he was going to present the information
to his readers. These are the sorts of data he had to work with.
Name            Age  Distance from home  Cause           Injuries            Treatment
Rahim Khan      45   3 km                skid            Concussion          Hospital outpatient
Debbie Lane     5    75 km               hit kerb        Broken arm          Hospital outpatient
Arvinder Sethi  12   1200 m              lorry           Multiple fractures  Hospital 3 weeks
Husna Mahar     8    300 m               hit each other  Bruising            Hospital outpatient
David Huker     8    50 m                hit each other  Concussion          Hospital outpatient

There were 92 accidents listed in the reporter’s table.
Ages of cyclists (from survey)
66  6  62 19 20 15 21  8 21 63 44 10 44 34 18
35 26  61 13 61 28 21  7 10 52 13 52 20 17 26
64 11  39 22  9 13  9 17 64 32  8  9 31 19 22
37 18 138 16 67 45 10 55 14 66 67 14 62 28 36
 9 23  12  9 37  7 36  9 88 46 12 59 61 22 49
18 20  11 25  7 42 29  6 60 60 16 50 16 34 14
18 15

This information is described as raw data, which means that no attempt has yet
been made to organise it in order to look for any patterns.

Looking at the data

4

At the moment the arrangement of the ages of the 92 cyclists tells you very little
at all. Clearly these data must be organised so as to reveal the underlying shape,
the distribution. The figures need to be ranked according to size and preferably
grouped as well. The reporter had asked an assistant to collect the information
and this was the order in which she presented it.

Tally

Tallying is a quick, straightforward way of grouping data into suitable intervals.
You have probably met it already.
Stated age (years)  Tally                              Frequency
0–9                 ||||| ||||| |||                    13
10–19               ||||| ||||| ||||| ||||| ||||| |    26
20–29               ||||| ||||| ||||| |                16
30–39               ||||| |||||                        10
40–49               ||||| |                            6
50–59               |||||                              5
60–69               ||||| ||||| ||||                   14
70–79                                                  0
80–89               |                                  1
...
130–139             |                                  1
Total                                                  92
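
The tallying described above can also be sketched in a few lines of Python. This is only an illustration, not part of the original text: the function name `tally` and the choice of 10-year intervals starting at 0 are assumptions made to match the table, and the ages are the 92 survey values listed earlier.

```python
from collections import Counter

# The 92 stated ages from the survey (raw data, including the 138).
ages = [
    66, 6, 62, 19, 20, 15, 21, 8, 21, 63, 44, 10, 44, 34, 18,
    35, 26, 61, 13, 61, 28, 21, 7, 10, 52, 13, 52, 20, 17, 26,
    64, 11, 39, 22, 9, 13, 9, 17, 64, 32, 8, 9, 31, 19, 22,
    37, 18, 138, 16, 67, 45, 10, 55, 14, 66, 67, 14, 62, 28, 36,
    9, 23, 12, 9, 37, 7, 36, 9, 88, 46, 12, 59, 61, 22, 49,
    18, 20, 11, 25, 7, 42, 29, 6, 60, 60, 16, 50, 16, 34, 14,
    18, 15,
]


def tally(values, width=10):
    """Count how many values fall in each interval of the given width."""
    # Integer division maps each age to the lower bound of its interval,
    # for example 19 -> 10 and 138 -> 130.
    return Counter((v // width) * width for v in values)


counts = tally(ages)

# Only non-empty intervals are stored, so 70-79 does not appear in the loop;
# looking it up directly (counts[70]) gives 0.
for lower in sorted(counts):
    print(f"{lower}-{lower + 9}: {counts[lower]}")
print("Total:", sum(counts.values()))
```

The frequencies printed should agree with the table, which is a useful check that no value has been missed or counted twice.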

Extreme values

A tally immediately shows up any extreme values, that is values which are far
away from the rest. In this case there are two extreme values, usually referred to
as outliers: 88 and 138. Before doing anything else you must investigate these.
In this case the 88 is genuine, the age of Millie Smith, who is a familiar sight
cycling to the shops.
The 138, needless to say, is not genuine. It was the written response of a man who
was insulted at being asked his age. Since no other information about him is
available, this figure is best ignored and the sample size reduced from 92 to 91.
You should always try to understand an outlier before deciding to ignore it; it
may be giving you important information.

! Practical statisticians are frequently faced with the problem of outlying
observations, observations that depart in some way from the general pattern of
a data set. What they, and you, have to decide is whether any such observations
belong to the data set or not. In the above example the data value 88 is a genuine
member of the data set and is retained. The data value 138 is not a member of the
data set and is therefore rejected.
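
As a rough illustration of how a tally shows up extreme values, the sketch below flags any non-empty interval that lies beyond a gap in the table. The rule and the name `upper_tail_candidates` are hypothetical, chosen only to mimic what the eye does; they are not a method given in the text. The code only suggests which values to investigate. The decision to keep 88 and reject 138 still comes from finding out what lies behind each value.

```python
# Frequencies copied from the tally table, keyed by the lower bound
# of each 10-year interval.
frequencies = {0: 13, 10: 26, 20: 16, 30: 10, 40: 6, 50: 5,
               60: 14, 70: 0, 80: 1, 130: 1}


def upper_tail_candidates(freqs, width=10):
    """Return the non-empty intervals lying beyond the first gap.

    Hypothetical rule: a gap is one or more empty intervals between
    two non-empty ones. Every non-empty interval above the first gap
    is flagged for investigation.
    """
    occupied = sorted(lower for lower, f in freqs.items() if f > 0)
    for i in range(1, len(occupied)):
        if occupied[i] - occupied[i - 1] > width:
            return occupied[i:]
    return []


candidates = upper_tail_candidates(frequencies)
print(candidates)  # [80, 130]

# After investigation: 88 is genuine and is retained, 138 is rejected.
rejected = [138]
sample_size = sum(frequencies.values()) - len(rejected)
print(sample_size)  # 91
```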

Describing the shape of a distribution

An obvious benefit of using a tally is that it shows the overall shape of the
distribution.

[Histogram: vertical axis frequency density (people/10 years), marked from 0 to 30;
horizontal axis age (years), marked from 0 to 90]

Figure 1.1  Histogram to show the ages of people involved in cycling accidents

You can now see that a large proportion (more than a quarter) of the sample are
in the 10 to 19 year age range. This is the modal group as it is the one with the
most members. The single value with the most members is called the mode, in
this case age 9.
You will also see that there is a second peak among those in their sixties; so this
distribution is called bimodal, even though the frequency in the interval 10–19 is
greater than the frequency in the interval 60–69.
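
The mode, the modal group and the "more than a quarter" claim can be checked with a short Python sketch. It is an illustration under stated assumptions: the 138 is removed first, as decided above, and the 10-year grouping matches the tally. The variable names are chosen for this example only.

```python
from collections import Counter

# The 92 stated ages from the survey.
ages = [
    66, 6, 62, 19, 20, 15, 21, 8, 21, 63, 44, 10, 44, 34, 18,
    35, 26, 61, 13, 61, 28, 21, 7, 10, 52, 13, 52, 20, 17, 26,
    64, 11, 39, 22, 9, 13, 9, 17, 64, 32, 8, 9, 31, 19, 22,
    37, 18, 138, 16, 67, 45, 10, 55, 14, 66, 67, 14, 62, 28, 36,
    9, 23, 12, 9, 37, 7, 36, 9, 88, 46, 12, 59, 61, 22, 49,
    18, 20, 11, 25, 7, 42, 29, 6, 60, 60, 16, 50, 16, 34, 14,
    18, 15,
]

# Reject the value 138, reducing the sample size from 92 to 91.
ages = [a for a in ages if a != 138]

# Mode: the single value that occurs most often.
mode, mode_frequency = Counter(ages).most_common(1)[0]

# Modal group: the 10-year interval with the most members.
groups = Counter((a // 10) * 10 for a in ages)
modal_lower, modal_frequency = groups.most_common(1)[0]

# Proportion of the sample in the modal group.
proportion = modal_frequency / len(ages)

print("mode:", mode, "with frequency", mode_frequency)
print(f"modal group: {modal_lower}-{modal_lower + 9}")
print(f"proportion in modal group: {proportion:.3f}")
```

With these data the mode is 9 and the modal group is 10 to 19, containing 26 of the 91 people, which is a little over a quarter.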
Different types of distribution are described in terms of the position of their
modes or modal groups, see figure 1.2.

Figure 1.2  Distribution shapes: (a) unimodal and symmetrical (b) uniform (no
mode but symmetrical) (c) bimodal

When the mode is off to one side the distribution is said to be skewed. If the
mode is to the left with a long tail to the right the distribution has positive (or
right) skewness; if the long tail is to the left the distribution has negative (or left)
skewness. These two cases are shown in figure 1.3.
