Bayesian Statistics
An Introduction
Fourth Edition
PETER M. LEE
(ISBN 978-1-118-33257-3)
Table of Contents
- Preface
- Preface to the First Edition
- Preliminaries
- Probability and Bayes’ Theorem
- Notation
- Axioms for probability
- ‘Unconditional’ probability
- Odds
- Independence
- Some simple consequences of the axioms; Bayes’ Theorem
- Examples on Bayes’ Theorem
- The Biology of Twins
- A political example
- A warning
- Random variables
- Discrete random variables
- The binomial distribution
- Continuous random variables
- The normal distribution
- Mixed random variables
- Several random variables
- Two discrete random variables
- Two continuous random variables
- Bayes’ Theorem for random variables
- Example
- One discrete variable and one continuous variable
- Independent random variables
- Means and variances
- Expectations
- The expectation of a sum and of a product
- Variance, precision and standard deviation
- Examples
- Variance of a sum; covariance and correlation
- Approximations to the mean and variance of a function of a random variable
- Conditional expectations and variances
- Medians and modes
- Exercises on Chapter 1
- Bayesian Inference for the Normal Distribution
- Nature of Bayesian inference
- Preliminary remarks
- Post is prior times likelihood
- Likelihood can be multiplied by any constant
- Sequential use of Bayes’ Theorem
- The predictive distribution
- A warning
- Normal prior and likelihood
- Posterior from a normal prior and likelihood
- Example
- Predictive distribution
- The nature of the assumptions made
- Several normal observations with a normal prior
- Posterior distribution
- Example
- Predictive distribution
- Robustness
- Dominant likelihoods
- Improper priors
- Approximation of proper priors by improper priors
- Locally uniform priors
- Bayes’ postulate
- Data translated likelihoods
- Transformation of unknown parameters
- Highest density regions (HDRs)
- Need for summaries of posterior information
- Relation to classical statistics
- Normal variance
- A suitable prior for the normal variance
- Reference prior for the normal variance
- HDRs for the normal variance
- What distribution should we be considering?
- Example
- The role of sufficiency
- Definition of sufficiency
- Neyman’s Factorization Theorem
- Sufficiency Principle
- Examples
- Order Statistics and Minimal Sufficient Statistics
- Examples on minimal sufficiency
- Conjugate prior distributions
- Definition and difficulties
- Examples
- Mixtures of conjugate densities
- Is your prior really conjugate?
- The exponential family
- Definition
- Examples
- Conjugate densities
- Two-parameter exponential family
- Normal mean and variance both unknown
- Formulation of the problem
- Marginal distribution of the mean
- Example of the posterior density for the mean
- Marginal distribution of the variance
- Example of the posterior density of the variance
- Conditional density of the mean for given variance
- Conjugate joint prior for the normal
- The form of the conjugate prior
- Derivation of the posterior
- Example
- Concluding remarks
- Exercises on Chapter 2
- Some Other Common Distributions
- The binomial distribution
- Conjugate prior
- Odds and log-odds
- Highest density regions
- Example
- Predictive distribution
- Reference prior for the binomial likelihood
- Bayes’ postulate
- Haldane’s prior
- The arc-sine distribution
- Conclusion
- Jeffreys’ rule
- Fisher’s information
- The information from several observations
- Jeffreys’ prior
- Examples
- Warning
- Several unknown parameters
- Example
- The Poisson distribution
- Conjugate prior
- Reference prior
- Example
- Predictive distribution
- The uniform distribution
- Preliminary definitions
- Uniform distribution with a fixed lower endpoint
- The general uniform distribution
- Examples
- Reference prior for the uniform distribution
- Lower limit of the interval fixed
- Example
- Both limits unknown
- The tramcar problem
- The discrete uniform distribution
- The first digit problem; invariant priors
- A prior in search of an explanation
- The problem
- A solution
- Haar priors
- The circular normal distribution
- Distributions on the circle
- Example
- Construction of an HDR by numerical integration
- Remarks
- Approximations based on the likelihood
- Maximum likelihood
- Iterative methods
- Approximation to the posterior density
- Examples
- Extension to more than one parameter
- Example
- Reference Posterior Distributions
- The information provided by an experiment
- Reference priors under asymptotic normality
- Uniform distribution of unit length
- Normal mean and variance
- Technical complications
- Exercises on Chapter 3
- Hypothesis Testing
- Hypothesis testing
- Introduction
- Classical hypothesis testing
- Difficulties with the classical approach
- The Bayesian approach
- Example
- Comment
- One-sided hypothesis tests
- Definition
- P-values
- Lindley’s method
- A compromise with classical statistics
- Example
- Discussion
- Point null hypotheses with prior information
- When are point null hypotheses reasonable?
- A case of nearly constant likelihood
- The Bayesian method for point null hypotheses
- Sufficient statistics
- Point null hypotheses (normal case)
- Calculation of the Bayes’ factor
- Numerical examples
- Lindley’s paradox
- A bound which does not depend on the prior distribution
- The case of an unknown variance
- The Doogian philosophy
- Description of the method
- Numerical example
- Exercises on Chapter 4
- Two-sample Problems
- Two-sample problems: both variances unknown
- The problem of two normal samples
- Paired comparisons
- Example of a paired comparison problem
- The case where both variances are known
- Example
- Non-trivial prior information
- Variances unknown but equal
- Solution using reference priors
- Example
- Non-trivial prior information
- Variances unknown and unequal (Behrens-Fisher problem)
- Formulation of the problem
- Patil’s approximation
- Example
- Substantial prior information
- The Behrens-Fisher controversy
- The Behrens-Fisher problem from a classical standpoint
- Example
- The controversy
- Inferences concerning a variance ratio
- Statement of the problem
- Derivation of the F distribution
- Example
- Comparison of two proportions; the 2 × 2 table
- Methods based on the log odds-ratio
- Example
- The inverse root-sine transformation
- Other methods
- Exercises on Chapter 5
- Correlation, Regression and ANOVA
- Theory of the correlation coefficient
- Definitions
- Approximate posterior distribution of the correlation coefficient
- The hyperbolic tangent substitution
- Reference prior
- Incorporation of prior information
- Examples on correlation
- Use of the hyperbolic tangent transformation
- Combination of several correlation coefficients
- The squared correlation coefficient
- Regression and the bivariate normal model
- The model
- Bivariate linear regression
- Example
- Case of known variance
- The mean value at a given value of the explanatory variable
- Prediction of observations at a given value of the explanatory variable
- Continuation of the example
- Multiple regression
- Polynomial regression
- Conjugate prior for bivariate regression
- The problem of updating a regression line
- Formulae for recursive construction of a regression line
- Finding an appropriate prior
- Comparison of several means: the one way model
- Description of the one way layout
- Integration over the nuisance parameters
- Derivation of the F distribution
- Relationship to the analysis of variance
- Example
- Relationship to a simple linear regression model
- Investigation of contrasts
- The two way layout
- Notation
- Marginal posterior distributions
- Analysis of variance
- The general linear model
- Formulation of the general linear model
- Derivation of the posterior
- Inference for a subset of the parameters
- Application to bivariate linear regression
- Exercises on Chapter 6
- Other Topics
- The likelihood principle
- Introduction
- The conditionality principle
- The sufficiency principle
- The likelihood principle
- Discussion
- The stopping rule principle
- Definitions
- Examples
- The stopping rule principle
- Discussion
- Informative stopping rules
- An example on capture and recapture of fish
- Choice of prior and derivation of posterior
- The maximum likelihood estimator
- Numerical example
- The likelihood principle and reference priors
- The case of Bernoulli trials and its general implications
- Conclusion
- Bayesian decision theory
- The elements of game theory
- Point estimators resulting from quadratic loss
- Particular cases of quadratic loss
- Weighted quadratic loss
- Absolute error loss
- Zero-one loss
- General discussion of point estimation
- Bayes linear methods
- Methodology
- Some simple examples
- Extensions
- Decision theory and hypothesis testing
- Decision theory and classical hypothesis testing
- Composite hypotheses
- Empirical Bayes methods
- Von Mises’ example
- The Poisson case
- Exercises on Chapter 7
- Hierarchical Models
- The idea of a hierarchical model
- Definition
- Examples
- Objectives of a Hierarchical Analysis
- More on Empirical Bayes Methods
- The hierarchical normal model
- The model
- The Bayesian analysis for known overall mean
- The empirical Bayes approach
- The baseball example
- The Stein estimator
- Evaluation of the risk of the James-Stein estimator
- Bayesian analysis for an unknown overall mean
- Derivation of the posterior
- The general linear model revisited
- An informative prior for the general linear model
- Ridge regression
- A further stage to the general linear model
- The one way model
- Posterior variances of the estimators
- Exercises on Chapter 8
- The Gibbs Sampler
- Introduction to numerical methods
- Monte Carlo methods
- Markov chains
- The EM algorithm
- The idea of the EM algorithm
- Why the EM algorithm works
- Semi-conjugate prior with a normal likelihood
- The EM algorithm for the hierarchical normal model
- A particular case of the hierarchical normal model
- Data augmentation by Monte Carlo
- The genetic linkage example revisited
- Use of R
- The Genetic Linkage Example in R
- Other possible uses for data augmentation
- The Gibbs sampler
- Chained data augmentation
- An example with observed data
- More on the semi-conjugate prior with a normal likelihood
- The Gibbs sampler as an extension of chained data augmentation
- An application to change-point analysis
- Other uses of the Gibbs sampler
- More about convergence
- Rejection sampling
- Description
- Example
- Rejection sampling for log-concave distributions
- A practical example
- The Metropolis-Hastings Algorithm
- Finding an invariant distribution
- The Metropolis-Hastings algorithm
- Choice of a candidate density
- Example
- More Realistic Examples
- Gibbs as a special case of Metropolis-Hastings
- Metropolis within Gibbs
- Introduction to WinBUGS and OpenBUGS
- Information about WinBUGS and OpenBUGS
- Distributions in WinBUGS and OpenBUGS
- A Simple Example using WinBUGS
- The Pump Failure Example Revisited
- DoodleBUGS
- coda
- R2WinBUGS and R2OpenBUGS
- Generalized Linear Models
- Logistic Regression
- A general framework
- Exercises on Chapter 9
- Some Approximate Methods
- Bayesian importance sampling
- Importance Sampling to find HDRs
- Sampling importance resampling (SIR)
- Multidimensional applications
- Variational Bayesian methods: simple case
- Independent Parameters
- Application to the normal distribution
- Updating the mean
- Updating the variance
- Iteration
- Numerical example
- Variational Bayesian methods: general case
- A mixture of multivariate normals
- ABC: Approximate Bayesian Computation
- The ABC rejection algorithm
- The genetic linkage example
- The ABC Markov Chain Monte Carlo algorithm
- The ABC Sequential Monte Carlo algorithm
- The ABC local linear regression algorithm
- Other variants of ABC
- Reversible Jump Markov Chain Monte Carlo
- Exercises on Chapter 10
- Appendix A: Common Statistical Distributions
- Normal distribution
- Chi-squared distribution
- Normal approximation to chi-squared
- Gamma distribution
- Inverse chi-squared distribution
- Inverse chi distribution
- Log chi-squared distribution
- Student’s t distribution
- Normal/chi-squared distribution
- Beta distribution
- Binomial distribution
- Poisson distribution
- Negative binomial distribution
- Hypergeometric distribution
- Uniform distribution
- Pareto distribution
- Circular normal distribution
- Behrens’ distribution
- Snedecor’s F distribution
- Fisher’s z distribution
- Cauchy distribution
- Difference of beta variables
- Bivariate normal distribution
- Multivariate normal distribution
- Distribution of the correlation coefficient
- Tables
- Percentage points of the Behrens-Fisher distribution
- HDRs for the chi-squared distribution
- HDRs for the inverse chi-squared distribution
- Chi-squared corresponding to HDRs for log chi-squared
- Values of F corresponding to HDRs for log F
- R Programs
- Further Reading
- Robustness
- Nonparametric methods
- Multivariate estimation
- Time series and forecasting
- Sequential methods
- Numerical methods
- Bayesian Networks
- General reading
Peter M. Lee
2 July 2012