Probability & Statistics - MAT00035I

Department: Mathematics
Credit value: 40 credits
Credit level: I
Academic year of delivery: 2022-23

Related modules

Module will run

Occurrence	Teaching period
A	Autumn Term 2022-23 to Summer Term 2022-23

Module aims

The Probability and Statistics Module in Stage 2 aims to provide students with a thorough grounding in statistical methodology, an awareness of the scope, achievements and possibilities of using statistics, and confidence in the use of appropriate statistical and computational tools, techniques and methodologies for solving and analysing a range of practical problems. This module leads on to the study of more advanced and specialised statistics, probability and financial mathematics in Stages 3 and 4.

As part of these broad aims, this module has the following component parts:

Probability and Statistical Inference 1 (Autumn) will give students a theoretical and mathematically formal framework for joint and conditional distributions of random variables and for studying the asymptotic behaviour of sequences of random variables. Moreover, the statistical inference technique of parameter (point) estimation will be discussed.
Statistical Inference 2 (Spring) will extend the knowledge about statistical inference concepts and techniques to (confidence) interval estimation and hypothesis testing. In addition to a variety of standard statistical tests, also simple analysis-of-variance models will be introduced.
Linear Models (Spring/Summer) will introduce the students to more complex statistical models like the simple and multiple linear regression models. These models will not only be theoretically discussed in detail, but also applied to real-world problems using the statistical software R.
Applied Probability (Spring) will show students how probability theory may be used to model a variety of random processes (discrete in both time and space).

Studying these components alongside each other during the course of the year will allow students to see the many connections across different areas of Probability and Statistics; understanding these connections and being able to use ideas and techniques across many contexts is an essential part of the modern mathematician or statistician’s toolkit.

Module learning outcomes

Subject content

Probability and Statistical Inference 1

Understand the concepts of joint and conditional distributions. Be able to compute conditional expectations.
Understand the role and use of moment generating functions and be able to use them to compute the expectation and variance of standard distributions.
Understand different modes of convergence of sequences of random variables.
Be able to apply various limit theorems to prove convergence in probability or in distribution of a sequence of random variables.
Understand important limit theorems in Statistics such as the Weak Law of Large Numbers and the Central Limit Theorem. Be able to prove the Central Limit Theorem.
Be able to estimate parameters of standard distributions following the Maximum Likelihood approach.
Understand estimators as functions of random variables and be able to assess their properties such as unbiasedness, consistency and asymptotic normality.
Be able to compare different estimators taking into account the Mean Squared Error and asymptotic properties.

Statistical Inference 2

Be able to derive a confidence interval, exact and/or approximate, for parameters of probability distributions.
Be able to identify all the elements of a hypothesis test, carry it out, and interpret the result.
Be able to establish a relationship between confidence intervals and hypothesis testing.
Understand and be able to carry out a one-way Analysis of Variance (ANOVA).
Be able to apply confidence intervals, hypothesis testing techniques and ANOVA models to solve a variety of real-life problems.
Be competent at executing and interpreting R commands that facilitate the course's inference calculations.

Linear Models

Understand the theoretical framework of linear regression models: standard model assumptions, Least Squares estimators for the model parameters and their properties, inference techniques for the model parameters.
Be able to derive Maximum Likelihood estimators for the parameters in a Gaussian linear regression model.
Be competent in applying linear regression models for data analysis, using the statistical software R to practically estimate the models.
Be able to assess the adequacy of a linear regression model for data analysis and to select an adequate set of covariates in the model.
Be able to draw conclusions about real-life problems using linear regression models.

Applied Probability
Understand the probability generating function of a discrete random variable, and know how to interpret and apply it.
Understand elementary aspects of simple branching processes, and be able to calculate extinction probabilities.
Describe and calculate with discrete time/space Markov chains, including the calculation of absorption probabilities and stationary distributions.

Module content

[n] lectures allocated to a given subsection.

Probability and Statistical Inference 1

1. Background [3]

Probability and random variables, continuous and discrete
Density, probability mass and distribution functions
Moments of random variables
Joint and marginal distributions of two or more random variables
Independence of random variables
Covariance and correlation

2. Further characteristics of probability distributions. [5]

Moment generating functions
Conditional distributions of jointly discrete and continuous random variables
Multivariate normal, chi square and t distributions

3. Limit theorems. [5]

Modes of convergence for sequences of random variables: in distribution, in probability
Weak Law of Large Numbers
Continuous Mapping Theorem, Slutsky’s Theorem
Continuity Theorem (for moment generating functions)
Central Limit Theorem (including proof)

4. Point estimation of parameters. [6]

Statistical models and estimators
Method of Moments and Maximum Likelihood estimation of parameters
Properties of estimators: unbiasedness, consistency, asymptotic normality of estimators
Mean Squared Error of an estimator
Cramér-Rao bound and efficiency
General properties of Maximum Likelihood estimators

Statistical Inference 2

Confidence intervals. [5]

Interval estimation as a statistical inference method
Confidence intervals for means, proportions, difference in means and variances
Sample size calculations

2. Hypothesis testing. [7]

Elements of a statistical test: null and alternative hypothesis, test statistics, critical value, size, power
Hypothesis tests for means, proportions, variances, comparison of means
Choice of a hypothesis test: size vs. power
Power and sample size calculations

3. General hypothesis testing techniques. [3]

Neyman-Pearson Lemma
Likelihood Ratio tests
Goodness-of-fit tests and analysis of n x m contingency tables.

4. One-way Analysis of Variance (ANOVA). [4]

Standard ANOVA model and F test

Linear Models

1. Statistical inference using a single covariate. [5]

Correlation analysis: correlation coefficient, interpretation and least squares line
Simple linear regression: model, estimation and inference
Assessing the goodness of fit of the simple linear regression model
Prediction

2. Multiple linear regression models. [9]

Estimation of model parameters: Least Squares estimators and Maximum Likelihood estimators under normality
Inference on model parameters: confidence intervals and significance tests

3. Model diagnostics and choice. [5]

Residual analysis and model diagnostic checks
Collinearity
Variable selection

Applied Probability

1. Generating functions and applications. [4]

deriving and calculating with the probability generating function (PGF) of a non-negative integer-valued random variable
using PGFs to derive mean/variance of random variables
using PGFs to work with sums (including random sums) of random variables

2. Branching processes. [5]

Galton-Watson branching processes
calculation of extinction probability
notions of subcritical, critical, and supercritical processes
total number of descendants

3. Markov chains. [9]

transition matrix and state-space diagram
Chapman-Kolmogorov equations
first-step decomposition for hitting probabilities
fundamental matrix calculations, with application to absorption probabilities
stationary distribution (existence and calculation)
limiting occupation probabilities

Indicative assessment

Task	% of module mark
Closed/in-person Exam (Centrally scheduled)	25
Closed/in-person Exam (Centrally scheduled)	25
Closed/in-person Exam (Centrally scheduled)	50

Special assessment rules

None

Additional assessment information

The intention is that the different components will “speak” to each other throughout the year. However, to accommodate the needs of the various combined programmes these components will also service, the assessments will be attached to the components as indicated.

Students only resit components which they have failed.

Indicative reassessment

Task	% of module mark
Closed/in-person Exam (Centrally scheduled)	25
Closed/in-person Exam (Centrally scheduled)	25
Closed/in-person Exam (Centrally scheduled)	50

Module feedback

Current Department policy on feedback is available in the undergraduate student handbook. Coursework and examinations will be marked and returned in accordance with this policy.

Indicative reading

M DeGroot and M Schervish (2012), Probability and Statistics (4th edition), Pearson.
R V Hogg, J W McKean and A T Craig (2013), Introduction to Mathematical Statistics (7th edition), Pearson.

R V Hogg and E A Tanis, Probability and Statistical Inference (4th edition), Macmillan.
J J Faraway (2005), Linear Models with R, Chapman & Hall/CRC.
G A F Seber and A J Lee (2012), Linear Regression Analysis (2nd edition), J. Wiley.
G Grimmett and D Stirzaker (2001), Probability and Random Processes, OUP.
J R Norris (1997), Markov Chains, CUP.