# A new method for forecasting population size

Dr Han Lin Shang, ANU College of Business and Economics, and colleagues have used a cohort-component projection model to forecast the population size of the United Kingdom in research recently published in the *International Journal of Forecasting*. A summary of the research is included below.

**A multilevel functional data method for forecasting population, with an application to the United Kingdom**

Shang, H.L., Smith, P.W.F., Bijak, J. and Wisniowski, A.

Recent decades have seen considerable development in the stochastic modelling and forecasting of population. Cohort component projection models are often used to model the evolution of age-specific population, and are particularly useful for highlighting which demographic component contributes the most to population change. Many methods have been proposed in the demographic forecasting literature to forecast the four demographic components, namely mortality, fertility, emigration and immigration. However, some of the existing methods take a deterministic viewpoint, which can be quite restrictive. The statistical method proposed is a multilevel functional data analytic approach, in which the age-specific mortality and migration for females and males are modelled and forecasted jointly. The forecast uncertainty associated with each component is incorporated through parametric bootstrapping.

The authors consider a functional data analytic approach for forecasting population. As a generalisation of the Lee-Carter method, functional data analysis views each component of population from a functional perspective, combining ideas from nonparametric smoothing, functional principal component regression and differential equations, to name only a few. It has several salient features. Data can be pre-processed before performing functional principal component analysis in order to reduce the effect of noisy and missing observations. Each component of population is modelled as a continuous function of age, so that patterns of variation among years are captured by the functional principal components (which are data-driven basis functions) and their associated scores. By drawing random samples from normal distributions, the simulated principal component scores can be forecasted using a univariate time-series technique for each replication. Conditional on the estimated mean and functional principal components, the probabilistic forecasts of future realisations can be obtained through bootstrapping.
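The score-forecasting step can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a random walk with drift as the univariate time-series technique, fits a normal distribution to the historical first differences, and uses made-up score values. The function names `bootstrap_score_forecasts` and `interval` are hypothetical.

```python
import random
import statistics


def bootstrap_score_forecasts(scores, horizon, n_boot=1000, seed=0):
    """Simulate future paths of one principal component score series.

    Assumption: the scores follow a random walk with drift, and the
    innovations are normal with the standard deviation of the
    historical first differences.
    """
    rng = random.Random(seed)
    diffs = [b - a for a, b in zip(scores, scores[1:])]
    drift = statistics.mean(diffs)
    sigma = statistics.stdev(diffs)

    paths = []
    for _ in range(n_boot):
        level = scores[-1]
        path = []
        for _ in range(horizon):
            # Each replication draws its own normal innovations.
            level += drift + rng.gauss(0.0, sigma)
            path.append(level)
        paths.append(path)
    return paths


def interval(paths, step, lower=0.1, upper=0.9):
    """Return (lower quantile, median, upper quantile) at one forecast step."""
    values = sorted(path[step] for path in paths)
    n = len(values)
    return (
        values[int(lower * (n - 1))],
        statistics.median(values),
        values[int(upper * (n - 1))],
    )


# Hypothetical historical scores for one principal component.
scores = [5.0, 4.2, 3.9, 3.1, 2.4, 2.0, 1.1]
paths = bootstrap_score_forecasts(scores, horizon=5, n_boot=200)
low, median, high = interval(paths, step=4)
```

In the functional approach, each simulated score path would then be multiplied by its principal component and added to the mean function to give one simulated age profile per future year.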

The historical UK population data include observations from 1975 to 2009, from which the authors aim to forecast population by age and sex from 2010 to 2030. The fertility data were obtained from the Human Fertility Database (2013), while the mortality data were obtained from the Human Mortality Database (2013). The emigration and immigration counts were obtained directly from the Office for National Statistics. The UK population data were also obtained from the Human Mortality Database (2013).

Using multilevel functional principal component regression, the authors forecast each component of population in a stochastic manner. They find that the age-specific mortality rates are likely to decline, especially for elderly people. The decline seems to be more rapid for males than for females. As a result of decreases in mortality, they observe an increase in life expectancy at birth for both females and males. For age-specific fertility rates, the greatest forecast change is a continuing decrease in fertility for ages between 17 and 30, but a continuing increase in fertility for ages between 30 and 40. From the forecasted age-specific fertility, they obtain total fertility rates, which are likely to decrease until 2015 and increase thereafter. The median total fertility rates are likely to fluctuate at about two children per woman. For age-specific emigration and immigration, the greatest forecast change is a continuing increase in both for ages between 20 and 45.
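As a small worked example of how a total fertility rate follows from an age profile: under the standard definition, it is the sum of the age-specific fertility rates (births per woman per year) over the reproductive ages, multiplied by the width of each age group. The rates below are invented for illustration and are not the UK figures used in the paper; `total_fertility_rate` is a hypothetical helper.

```python
def total_fertility_rate(asfr_by_age, age_width=1):
    """Sum age-specific fertility rates into a total fertility rate.

    asfr_by_age maps the starting age of each group to births per woman
    per year; age_width is the number of years each group spans.
    """
    return age_width * sum(asfr_by_age.values())


# Hypothetical rates for five-year age groups (15-19, 20-24, ..., 45-49).
asfr = {
    15: 0.020,
    20: 0.070,
    25: 0.100,
    30: 0.110,
    35: 0.060,
    40: 0.013,
    45: 0.001,
}
tfr = total_fertility_rate(asfr, age_width=5)
```

Applying the same function to each simulated fertility profile would give a distribution of total fertility rates, from which a median and prediction interval can be read off.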

With the forecasted age-specific mortality, fertility, emigration and immigration, the authors obtain the forecasted population via a cohort component projection model. They find that the age profile of the population in 2030 is mainly driven by future migration and, to some extent, fertility. The largest uncertainties are associated with the number of newborns, as well as the population at ages between 20 and 45, for both males and females. The number of older people is also expected to increase by 2030, as is the working age group between 20 and 45. The total population of the UK is forecast to exceed 70 million by the middle of 2029.
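The bookkeeping behind a cohort component projection can be sketched as a single one-year step. This is a deliberately simplified illustration rather than the model in the paper. It assumes a female-only population, fertility rates applied to the start-of-year population, no mortality among newborns or migrants within the year, and an assumed share of female births. All inputs and the function name `project_one_year` are hypothetical.

```python
def project_one_year(females, survival, fertility, immigrants, emigrants,
                     female_birth_share=0.5):
    """Advance a female population by one year.

    All list arguments are indexed by age; the last index is an
    open-ended age group (for example 100+).
    """
    n = len(females)
    projected = [0.0] * n

    # Survivors move up one age; the open-ended group also keeps its own survivors.
    for age in range(n - 1):
        projected[age + 1] += females[age] * survival[age]
    projected[n - 1] += females[n - 1] * survival[n - 1]

    # Births from age-specific fertility rates; only female births enter age 0.
    births = sum(rate * count for rate, count in zip(fertility, females))
    projected[0] += female_birth_share * births

    # Net migration is added age by age.
    for age in range(n):
        projected[age] += immigrants[age] - emigrants[age]

    return projected


# Toy example with three age groups.
next_year = project_one_year(
    females=[100.0, 100.0, 100.0],
    survival=[1.0, 0.9, 0.5],
    fertility=[0.0, 0.2, 0.0],
    immigrants=[1.0, 2.0, 0.0],
    emigrants=[0.0, 1.0, 0.0],
)
```

Repeating this step for each year of the horizon, once per simulated set of mortality, fertility and migration forecasts, would produce a distribution of future populations by age.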

In this paper, the authors present the functional data analytic approach for estimating and forecasting age profiles of the four demographic components of population change in the UK. They combine the forecasts of mortality, fertility, emigration and immigration into a forecast of population through a cohort component projection model. The advantages of the functional models can be attributed to: (1) the use of a smoothing technique to smooth out noisy or missing observations; (2) the use of higher order functional principal components to extract patterns in the data; (3) accounting for the uncertainties embedded in mortality, fertility and migration for each age and gender. The advantage of the multilevel functional data model is that it incorporates the correlation between the two genders and thus allows each component of population to be modelled jointly.

An earlier version of the paper was published as a working paper by the Centre for Population Change, University of Southampton.