Age × stage‐classified demographic analysis: a comprehensive approach

Abstract This paper presents a comprehensive theory for the demographic analysis of populations in which individuals are classified by both age and stage. The earliest demographic models were age classified. Ecologists adopted methods developed by human demographers and used life tables to quantify survivorship and fertility of cohorts and the growth rates and structures of populations. Later, motivated by studies of plants and insects, matrix population models structured by size or stage were developed. The theory of these models has been extended to cover all the aspects of age‐classified demography and more. It is a natural development to consider populations classified by both age and stage. A steady trickle of results has appeared since the 1960s, analyzing one or another aspect of age × stage‐classified populations, in both ecology and human demography. Here, we use the vec‐permutation formulation of multistate matrix population models to incorporate age‐ and stage‐specific vital rates into demographic analysis. We present cohort results for the life table functions (survivorship, mortality, and fertility), the dynamics of intra‐cohort selection, the statistics of longevity, the joint distribution of age and stage at death, and the statistics of life disparity. Combining transitions and fertility yields a complete set of population dynamic results, including population growth rates and structures, net reproductive rate, the statistics of lifetime reproduction, and measures of generation time. We present a complete analysis of a hypothetical model species, inspired by poecilogonous marine invertebrates that produce two kinds of larval offspring. Given the joint effects of age and stage, many familiar demographic results become multidimensional, so calculations of marginal and mixture distributions are an important tool. From an age‐classified point of view, stage structure is a form of unobserved heterogeneity. From a stage‐classified point of view, age structure is unobserved heterogeneity. In an age × stage‐classified model, variance in demographic outcomes can be partitioned into contributions from both sources. Because these models are formulated as matrices, they are amenable to a complete sensitivity analysis. As more detailed and longer longitudinal studies are developed, age × stage‐classified demography will become more common and more important.


INTRODUCTION
As demographic data become more detailed, they reveal more, and more diverse, differences among individuals. These types of heterogeneity led to the development of increasingly complicated demographic models. Demographic analysis in ecology can be divided into three periods. The first, building on the work of Lotka and many others, used age-classified models and life table functions to compute survival, fertility, life expectancy, intrinsic rates of increase, stable age distributions, and reproductive values. It became a fully developed population theory by the mid-20th century (e.g., Lotka 1939, Keyfitz 1968, Coale 1972. It was quickly adopted by population biologists and ecologists, from the mortality studies of Pearl and Miner (1935) and the comparative life table studies of Deevey (1947) to the use of the Euler-Lotka equation to calculate population growth rates (e.g., Birch 1948, Leslie andPark 1949). Fisher (1930), Norton (1928), and Charlesworth (1994), among others, applied age-classified theory to evolutionary questions, and it became the basis of evolutionary demography and life history theory (e.g., Cole 1954, Lewontin 1965, Hamilton 1966, Stearns 1992.
The second period saw the widespread adoption of stageclassified methods based largely on matrix population models classified by stage (Lefkovitch 1965, Werner andCaswell 1977) where stage might refer to size, instar, developmental stage, physiological condition, behavioral status, etc. Plant ecologists were at the forefront of this development because the modular construction and plastic growth of plants tend to make age less suitable as a state variable in many cases. Today, age-classified models are still more common among animal models and stage-classified models more common among plant models (Salguero-G omez et al. 2015(Salguero-G omez et al. , 2016.
Stage-classified matrix models made it possible to compute population growth rates and structures, sensitivity, and elasticity analyses, and their extensions to periodic and stochastic environments, as well as to develop sophisticated treatments of nonlinear dynamics. This led eventually to a fully developed theory of stage-classified demography in a variety of mathematical frameworks (Nisbet and Gurney 1982, Metz and Diekmann 1986, Tuljapurkar 1990, Tuljapurkar and Caswell 1997, Easterling et al. 2000, Caswell 2001, Ellner et al. 2016. Now, increasingly extensive and detailed individual data are being collected and it is becoming apparent that combining age-and stage-classification is sometimes useful. This is accompanied, in both ecology and human demography, by an interest in incorporating more biological or social detail into the analyses. Our goal here is to present a comprehensive methodological treatment of age 9 stage-classified matrix population models. The methods will be presented in a way that can be applied to any species, with any life cycle, described by any pattern of transitions among any set of stages, with any type of reproduction of any kind(s) of offspring. We will provide a hypothetical example to suggest how a particular species might be analyzed.

Explicit and implicit age dependence
As a first step, we recognize that age dependence may be either implicit or explicit. Any stage-classified model produces implicit age-classified results because, even though age has no effect on the vital rates, an individual still becomes one unit older with the passage of one unit of time. Age-specific results (e.g., life expectancy) computed from such a model are implicit in the stage structure of the life cycle. Methods based on absorbing Markov chains are widely used to extract implicit age dependence from stageclassified models (e.g., Feichtinger 1971, Cochran and Ellner 1992, Caswell 2001, 2009, Steinsaltz and Evans 2004, Tuljapurkar and Horvitz 2006, Horvitz and Tuljapurkar 2008). These are not, however, age 9 stageclassified models and the results do not reveal anything about the interaction of age-specific and stage-specific rates.
This paper addresses explicit age dependence, in which individuals are classified by their age and stage, and the vital rates depend on both variables. The literature on explicit age 9 stage demography is scattered, with fragmentary results appearing in the ecological and demographic literatures. Goodman (1969) first introduced the matrix formulation of age-stage models. Even before that, a continuoustime age 9 size model written as a partial differential equation was presented by Sinko and Streifer (1967). Csetenyi and Logofet (1989) and Logofet (2002Logofet ( , 2013 explored the graph-theoretic implications of the block structure appearing in age 9 stage models, in relation to the irreducibility and primitivity of the resulting model. Recent papers (Caswell 2012, Caswell andShyu 2012) have developed matrix models and their sensitivity analysis, for both cohort properties and population growth rate, for age 9 stage-classified models. These were used in an evolutionary analysis of the selection gradients on senescence, as a function of age and of stage, by Caswell and Salguero-G omez (2013).  explored some aspects of longevity and later (Steiner et al. 2014) developed a powerful age 9 stage-classified analysis of the relation between generation time and population growth rate, with applications to life history theory (generation time will be considered here in the section entitled Population dynamics: generation time).
Multiregional models are an important special case of age 9 stage-classified models. They classify individuals by age and spatial location, and were one of the earliest applications of matrix population models (Rogers 1968) and have been extensively developed in human demography (e.g., Rogers 1975) and ecology (Lebreton 1996, 2005, Lebreton et al. 2000. There is a rich literature in health demography of multistate models (e.g., Willekens 2014) in which individuals are classified by age and health status (see, e.g., Wu et al. [2006] for colorectal cancer, Zhou et al. (2016) for dementia, or Honeycutt et al. (2003) for diabetes). These models focus on longevity and occupancy times in various age-stage combinations, and include only the survival and transition portion of the life cycle. Some epidemiological models have combined age and infection status in nonlinear matrix models that include the infection process (e.g., Caswell 2011, Metcalf et al. 2012).
Perhaps the human demographic usage closest in conceptual structure to ecological practice are multistate population projections, which are used to forecast the short-term, transient dynamics of populations under scenarios of changing mortality, fertility, and migration rates. The basic agespecific projections are sometimes augmented by a stage classification to examine the joint age 9 stage dynamics. The flavor of these efforts is shown by studies on, e.g., the interaction of age with citizenship status (S anchez Gassen 2014), disability (Van der Gaag et al. 2015), or education level (Loichinger 2015).
Ecological examples of explicit age 9 stage models are not common. Law (1983) considered the theoretical problem of age 9 stage models but presented only a hypothetical example. De Roos (2008) analyzes a continuous-time model in which age and stage can be combined. More empirical studies can be found in, e.g., van Groenendael and Slim (1988) and Zuidema et al. (2009). 5. Characteristics of population dynamics, including the population growth rate, stable structure, reproductive value, net reproductive rate, and measures of generation time. 6. Sensitivity and elasticity analysis, showing a completely general formula for the effects of changes in any parameter[s] affecting any of the age-specific or stage-specific rates on any quantity calculated from the model. Table 1 is a list of all the demographic results that will be presented, keyed to the equations in which they are derived.
Notation.-Matrices are denoted by uppercase boldface letters (e.g., U), and vectors by lowercase boldface letters (e.g., n). Block-diagonal matrices are denoted by blackboard font (e.g., U). Matrices and vectors associated with the full age 9 stage model are denoted, e.g.,Ũ,ñ; these matrices are block structured and contain entries for all combinations of age classes and stages. The number of age classes is x and the number of stages is s. The notation for matrices used in model construction is tabulated for easy reference in Table 2.
The unit vector e i is a vector with a 1 in the ith entry and zeros elsewhere. The unit matrix E ij is a matrix with a 1 in the (i,j) entry and zeros elsewhere; the dimensions will be indicated if not clear from the context. The dimensions of certain matrices and vectors are denoted by subscripts; I s is an identity matrix of order s and 1 s is a s 9 1 vector of ones. When convenient, MATLAB (MathWorks, Natick, MA, USA) notation will be used to refer to rows and columns of matrices; thus X (i,:) is the ith row and X(:,j) the jth column of X. The diagonal matrix with x on the diagonal and zeros elsewhere is denoted DðxÞ. The symbol°denotes the Hadamard, or element-by-element product; the symbol ⊗ denotes the Kronecker product. The vec operator transforms a matrix to a vector by stacking the columns on top of each other. The symbol ‖x‖ denotes the 1-norm of the vector x. The transpose of the matrix X is X ⊤ . The matrix K is the vec-permutation matrix (Henderson and Searle 1981); see Box 2. The mean and variance are denoted by E(Á) and V(Á) respectively.

The population vector
Each individual is jointly classified by its age and its stage; the population composition at any time can be written as where rows correspond to stages (1, . . ., s) and columns to age classes (1, . . ., x). The population vectorñ is obtained from N asñ . . .
Inñ, stages are grouped within age classes. The vector can be transformed so that age classes are grouped within stages,  Vol. 88,No. 4 vec where K s,x is the vec permutation matrix (Magnus andNeudecker 1979, Henderson andSearle 1981). This transformation is essential to the analysis of the model, as will be seen below. An explicit formula for K s,x is given in Box 2. All results can be obtained using this arrangement, by properly reformulating the relevant matrices.
Age-and stage-specific demography The influence of age and stage on the vital rates is captured in four sets of matrices: 1. Matrices U j , for j ¼ 1; . . .; x (dimension s 9 s). These matrices contain stage transitions, including mortality, of living individuals in each age class. Because the entries of the U j refer to stages, and each matrix corresponds to an age class, the set includes arbitrarily complicated interactions of age and stage in determining mortality and transitions. 2. Matrices F j , for j ¼ 1; . . .; x (dimension s 9 s). These matrices contain fertilities, describing the stage-specific per capita production of new individuals by reproduction, for individuals in each age class. Because the entries of the F j refer to stages, and each matrix corresponds to an age class, the set includes arbitrarily complicated interactions of age and stage in determining fertility. 3. Matrices D i , for i ¼ 1; . . .; s (dimension x 9 x). These matrices contain age transitions for individuals in each stage. 4. Matrices H i , for i ¼ 1; . . .; s (dimension x 9 x). These matrices assign newly produced offspring, of parents in stage i, to an initial age class (usually the first).
The matrices U j and F j correspond to the familiar decomposition of a projection matrix A into components due to transitions and survival of extant individuals (U) and due to fertility (F) Now, however, the matrices are defined for each of the x age classes. This flexible formulation permits age to influence any of the stage-specific vital rates, in any way, described in either parametric or nonparametric terms. Similarly, it permits stages to influence age-specific rates in any way.
The matrix F j describes reproduction by all stages in age class j. An age 9 stage model must account for the possibility that multiple types of offspring may be produced (e.g. a plant may produce seedlings of different sizes). In the special case where all offspring are of the same stage, F j will have positive entries in only one row, usually chosen to be the first.
The matrices U j and F j move individuals among stages, account for survival, and create new offspring of various stages, at rates that depend on the current age. Following these processes, individuals must be allocated to their next age class. The matrices D i advance surviving individuals from one age class to the next; the simplest such model contains ones on the subdiagonal and the x, x corner; e.g., (for x = 3) The x, x entry converts the final age class into an openended category. Setting it to 0 would kill all individuals at age x.
As written in Eq. 6, advancement in age involves no deaths; all mortality is included in the U j . This need not be the case. If desired, an additional age-specific mortality hazard, affecting stage i could be incorporated by replacing the ones in D i by survival probabilities <1. Such mortality can always be incorporated into the U j , so we will not explore its incorporation into the D i here.

CONCEPTS & SYNTHESIS
The matrix H i allocates the newborn individuals, regardless of the age of their parent, into age class 1. Thus (e.g., for x ¼ 3) Impossible age 9 stage combinations.-An operational decision is required to deal with "impossible" age-stage combinations. For example, if stages are size classes, perhaps large individuals never occur in young age classes, and small individuals never occur in old age classes. In such cases, we set the rows and columns of the U j corresponding to these impossible combinations to 0

Box 1. Creating marginals and mixtures
The Kronecker product expressions for computing marginals (e.g., Eqs. 26 and 27) and mixtures [e.g., equation (19)] may appear confusing at first sight. There is a simple trick to deriving them, which we will reveal here. It relies on the properties of the vec operator and on Roth's theorem that, for any matrices X, Y and Z, vecðXYZÞ ¼ ðZ > XÞvec Y (Roth 1934). In this paper, we have agreed to organizeñ by grouping stages within age classes; that is, by applying the vec operator to the array N in Eq. 1, in which stages appear as rows and ages as columns Matrices (e.g.,Ũ,Ñ) and vectors (e.g.,w,ṽ) inherit the same block structure. Any linear combination of rows of N (i.e., combinations of stages) can be written as a matrix R, and any linear combination columns of N (i.e., of ages) can be written as a matrix C, and applied to N as

RN C:
Applying the vec operator to this gives ðC > RÞvecN : Thus (C ⊤ ⊗ R) operating on the rows of any age 9 stage block-structured matrix or vector captures the operations implied by R and C operating on N , e.g., adding all stages to get a marginal age structure e.g., adding all ages to get a marginal stage distribution e.g., a mixture, defined by a vector p s , all of age class 1 These matrices operate on arrays with the structure of vecN ; i.e., on the rows of an object in which columns have stages arranged within ages. To operate on columns, for example on the columns of the fundamental matrixÑ, the arrays must be transposed (the transpose of a Kronecker product is the product of the transposes). Thus the mixture in Eq. A.7 applied to the columns ofÑ would be ðvecN Þ > ðe 1 p s Þ: Vol. 88,No. 4 stage i and age j is impossible ¼)

564
Because no offspring can be produced by a parent of an impossible age-stage combination stage i and age j is impossible ¼) The age 9 stage-classified projection matrices To construct the age 9 stage model using the vec-permutation matrix formulation, we first construct block diagonal matrices U, F, D, and H, which contain the matrices U j , F j , D i , and H i , respectively, on the diagonal. That is with similar construction for the others. The block diagonal matrices are all of dimension sx 9 sx. The age 9 stage-classified projection matrices are defined in terms of the block-diagonal matrices as A ¼Ũ þF (13) where K ¼ K s;x is the vec-permutation matrix. From right to left in Eq. 11, the matrix U first moves individuals among stages within their age classes. The vec-permutation matrix K rearranges age classes within stages, the matrix D advances individuals to the next age class, and the matrix K ⊤ returns the vector to its original arrangement. In Eq. 12, the matrix F produces new offspring, then K rearranges age classes within stages, and the matrix H assigns all newborn individuals to the first age class. A little manipulation of these matrices reveals thatÃ has a block-Leslie form (e.g., for x ¼ 3) where the 0 matrices are of dimension s 9 s. The formulation ofñ in Eq. 3, in which stages are grouped within age classes, leads to this familiar structure (e.g., Goodman 1969, Feeney 1970, Lebreton 1996. The formulation ofñ with age classes grouped within stages, whereñ ¼ vecðN > Þ, rearranges the blocks in Eq. 14 (e.g., Rogers 1968, Cohen 1982see Caswell 2001: Section 4.3). The systematic construction ofÃ by the vec-permutation matrix algorithm in Eqs. 11 and 12 is enormously beneficial. The biological content of the model, and hence the data collection and data analysis effort, comes from assembling the demographic information in the U j and F j . Of the s 2 x 2 entries ofÃ, at most only 2s 2 x entries contain demographic information. The expressions forŨ andF isolate these components and make it more efficient to conduct analyses, especially sensitivity analyses.
In a later section (A model species example), we develop a hypothetical example to demonstrate the calculations, for a life cycle including two types of larvae and two types of adults.

LIFE TABLE FUNCTIONS
The classical life table functions are the survivorship '(x), the mortality rate l(x), and the distribution of age at death b(x), where x represents age. Together, these functions are a time-honored way to characterize age-specific demography, longevity, and life histories (e.g., Pearl and Miner 1935, Dublin and Lotka 1937, Deevey 1947, Slobodkin 1961, Hutchinson 1978. As is well known, any one of them suffices to calculate any of the others.
In this section, we derive the life table functions for age 9 stage-classified models. The extra dimension of stage classification adds a rich set of additional life table perspectives. In an age-classified model, for example, survival to some age x either happens or it does not; if it does, it follows a defined pathway of ages 1, . . ., x. In an age 9 stage-classified model, survival to a given age can take place via a potentially infinite set of developmental pathways through the combinations of ages and life history stages. Each pathway has its own probability of occurrence, and the life table functions must integrate over all those pathways and probabilities. In addition, if there are multiple types of offspring, a cohort may start off with a mixture of different stages at birth, which will affect the survival and mortality results. Fortunately, the matrix formulation makes these calculations possible.

Survivorship and mortality
Survivorship '(x) is the probability that an individual survives from age 0 to age x, 'ðxÞ ¼ Pðsurvival from age 0 to age xÞ 1 Â 1: In an age-classified model, '(x) is calculated by projecting an initial age 0 cohort of size 1; the resulting values give the proportion of the cohort surviving.
In an age 9 stage-classified model, a survivorship vector is obtained by projecting a cohort usingŨ The survivorship vector'ðxÞ, of dimension sx Â 1, inherits the block age-stage structure ofŨ; its entries give the probability of survival to age x of an individual starting in every age-stage combination. We extract the vector of survivorship starting from the first age class by where e 1 is a unit vector of length x with 1 in the first position and zeros elsewhere. The ith entry of 'ðxÞ gives the survivorship to age x of an individual starting life in stage i, accounting for the joint effects of age and stage on survival, November 2018 AGE 9 STAGE DEMOGRAPHIC ANALYSIS 565

CONCEPTS & SYNTHESIS
integrating over all age-stage developmental pathways between birth and age x.
If the cohort begins as a mixture of stages specified by a mixing distribution p, the resulting survivorship function is Note that our convention (Eq. 8) that transitions involving impossible age-stage combinations are set to 0 implies that ' i ðxÞ ¼ 0 if stage i does not occur among the types of offspring.
In age-classified models, the mortality rate, or hazard function, l(x) is defined by the relationship so that That is, mortality rates are given by the slope (with the sign reversed) of the log of the survivorship function. In the age 9 stage-classified model, using the vector 'ðxÞ in Eq. 17, we obtain a vector of mortality rates as a function of birth stage as where 'ðxÞ is given by Eq. 17 and the log function is applied elementwise. The mortality schedule for birth stage i, l i (x) is undefined if stage i and age class 1 is an impossible combination. The apparent mortality schedule (i.e., what would appear to an observer ignorant of the mixed composition of the cohort) for a mixed cohort with mixing distribution p, is where '(x | p) is given by (19). This schedule is "apparent" because it results from the mixture of different initial stages; no individual actually experiences l(x | p). Eqs. 22 and 23 provide mortality schedules resulting from the full age-stage dynamics; l(x) and l(x | p) capture the full complexity of age-and stage-specific development. We will return briefly to the survivorship function, and present an alternative method of computation, in the section entitled The fundamental matrix and longevity.
Intra-cohort selection.-In an age-classified model, as a cohort ages, it shrinks as individuals die. In an age 9 stageclassified model, a cohort shrinks and its stage composition changes. Stages with higher mortality rates and/or shorter stage durations decrease in relative frequency due to intracohort selection.
To analyze intra-cohort selection, define an initial cohort n 0 and project it forward bỹ nðxÞ ¼Ũ xñ 0 : The projected cohort vectorñðxÞ gives the complete joint distribution of abundance by age and stage at age x, dependent on the initial cohortñ 0 . NormalizingñðxÞ to sum to 1 gives the joint age-stage frequency distribution, from which the marginal age and stage distributions can be calculated as: The general approach to the computation of such mixture and marginal quantities is outlined in Box 1. In general, mðxÞ will converge, as x increases, to the right eigenvector corresponding to the dominant eigenvalue ofŨ (Horvitz and Tuljapurkar 2008).

Box 2. Computing the vec-permutation matrix
The vec-permutation matrix (Magnus andNeudecker 1979, Henderson andSearle 1981) connects the vec operator and the matrix transpose. If X is a m Â n matrix, then vec X > ¼ K m;n vecX: The matrix can be calculated as where E ij is a matrix, of dimension m Â n, with a 1 in the (i,j) entry and zeros elsewhere. The choice ofñ 0 depends on the question of interest. By definition, a cohort consists of individuals of the same age, which we can take as age class 1. Define p 0 as the distribution of stages within this age class. Theñ where e 1 is a vector of length x with a 1 in the first entry and zeros elsewhere. If the goal is to analyze a cohort all members of which start in stage j, then p 0 is a unit vector of length s with a 1 in the jth location and zeros elsewhere.
The fertility function.-In an age-classified model, the (scalar) fertility function f(x) gives the mean number of offspring produced, per unit time, by a parent aged x. In an age 9 stage-classified model the fertility matrices F x , for x ¼ 1; . . .; x, are the multivariate analogue of f(x). The (i, j) entry of F x is the expected number of offspring of type i produced, per unit time, by a parent of type j and age class x.
The fertility matrix F x can be simplified to obtain three different age-specific fertility measures.
1. Weighted offspring production. Multiple offspring types, when they exist, are combined into a weighted sum, with weights defined by, e.g., body size, parental investment, reproductive value, etc. Let c be such a vector of weights (s 9 1). The vector of weighted fertility at age x is The jth entry of f weighted (x) is the weighted mean number of offspring produced by a parent of stage j at age x. 2. Mixed offspring production. The vector giving the offspring, of all types, produced at age x by a mixture of stages given by a mixing distribution p(x), is One source for p(x) is the stage structure of a cohort of individuals of age x. Given an initial cohort compositioñ n 0 containing only individuals in age class 1, the appropriate mixture is given by Eq. 26. The jth entry of f mixed (x) is the number of type j offspring produced per individual in a mixed cohort at age x, with a specified composition. 3. Mixed and weighted offspring production. The scalar fertility function, giving the weighted number of offspring produced by a mixed cohort at age x, is If only one type of offspring exists, and if stages do not differ in their fertility, f(x) reduces to the familiar agespecific fertility function.
The functions F x , f weighted (x), f mixed (x), and f(x) capture different aspects of the (possibly complicated) age-and stage-dependence of reproduction.

COHORT DYNAMICS: LONGEVITY STATISTICS
Longevity is the age at death of an individual. Statistics of longevity are frequently used to summarize mortality schedules. The most familiar such statistic is the life expectancy, or mean age at death. Measures of variation (the variance, standard deviation, coefficient of variation, and life discrepancy) provide information on stochastic variation in longevity among individuals. For example, many human populations have recently exhibited declines in variance in longevity, accompanying increases in life expectancy (e.g., Edwards and Tuljapurkar 2005, Tuljapurkar and Edwards 2011, Vaupel et al. 2011, Van Raalte and Caswell 2013, Engelman et al. 2014).
In age 9 stage-classified models, longevity depends on how mortality varies over age and among stages, and we now turn to the calculation of these statistics.

The fundamental matrix and longevity
Longevity statistics are obtained by treatingŨ as the transient matrix of an age 9 stage-classified absorbing Markov chain (Feichtinger 1971, Caswell 2001, 2009. If the states are numbered so that the transient (alive) states precede the absorbing (dead) states, the transition matrix for such a chain can be writteñ whereM is a matrix of transition probabilities from transient to absorbing states. We will considerM in detail in the section entitled Age and stage at death. The fundamental matrix for the chain is The matrixÑ inherits the block structure ofŨ, with stages arranged within age classes. The (i,j) element ofÑ is the expected time spent in transient state i by an individual starting in transient state j (see Caswell 2006Caswell , 2009Caswell , 2013 for the higher moments and variances), where i and j range over all combinations of age and stage. Longevity is measured by the time to eventual absorption in one of the states representing death. Well known results from Markov chain theory (Iosifescu 1980: Theorem 3.2) give the vectors of the mean and the second moments of time to absorption, which satisfỹ The entries ofg 1 , of dimension sx 9 1, give the life expectancy of individuals of each age-stage combination. The entries are arranged as inñ, with stages within age classes.

CONCEPTS & SYNTHESIS
The vectorg 2 contains the second moments of longevity, so the vector of variances of longevity is V ðgÞ ¼g 2 À ðg 1 g 1 Þ: This vector contains the variances in remaining longevity among individuals of every possible age-stage combination, reflecting stochasticity in advancement through age classes and transitions among stages. This variance can be decomposed into contributions within and between stages (cf. Hartemink et al. 2017), as we show in the next section.
Partitioning the variance in longevity.-In a cohort at age x, individuals will vary in longevity for two reasons: the stochasticity inherent in aging and transitions, and the heterogeneity inherent in the stage distribution at age x. To analyze the variance in remaining longevity at age x, define the vector g(x), of dimension s Â 1, whose ith entry gives the remaining longevity of individuals of stage i at age x, and the scalar g(x) that gives the remaining longevity of an individual of age x in a mixed cohort. Extract the mean and variance of remaining longevity in age class x as where e x (x) is a unit vector of length x with 1 in the xth position and zeros elsewhere, E( Á ) is the mean and V( Á ) is the variance. Let p(x), of dimension s 9 1 be the mixing distribution of stages at age x. This could be obtained as the vector m stage (x) in Eq. 26. Then the mean longevity of age class x, treated as a mixture of stages, is and the variance is partitioned into a component due to stochasticity within a given stage and a component due to heterogeneity among the stages, where the within-group variance is the expectation over p(x) of the variances, The between-group variance is the variance over p(x) of the means, This variance decomposition is well-known in probability theory (R enyi 1970: Chapter 5.6, Theorem 1), forms the basis of the analysis of variance in statistics (e.g. Kempthorne 1957), and is a familiar tool in the analysis of mixture models (Fr€ uhwirth-Schnatter 2006). It is a valuable tool in quantifying the relative contributions of heterogeneity and individual stochasticity to the variance in demographic outcomes (e.g. Caswell 2009, Edwards 2011, Hartemink et al. 2017, Jenouvrier et al. 2018, Hartemink and Caswell 2018. It is a natural tool for the analysis of age 9 stage-classified models, because the incorporation of stage-dependence is a natural way to include the effects of heterogeneity in age-specific parameters. 1. Beyond longevity.-The Expressions 43 and 45 require only the expressions for the mean and the variance of g, for each stage at each age of interest. Although we have described it in terms of variance in longevity, it can be applied to any quantity for which those quantities are available. For example, a stage-classified calculation has applied the analysis to age at first reproduction and lifetime reproductive output (Jenouvrier et al. 2018). The extension to age 9 stage-classified models will, in some cases, require an extension of occupancy time theory (Roth and Caswell 2018), in other cases the variance terms are directly available.
2. Survivorship revisited.-The fundamental matrixÑ provides an alternative way to compute survivorship, dependent on the stage of the individual at birth (Keyfitz and Caswell 2005). Starting withÑ, we first aggregate the rows by summing stages within age classes; the result gives the mean number of visits to each age class. But an age class can be visited at most once, so the mean number of visits equals the probability of ever visiting that age class, which is the survivorship. The column block corresponding to starting age class 1 contains the survivorship of each stage at that initial age; this is The matrix L, of dimension x Â s, contains as columns the survivorship functions for each stage. The survivorship for a mixture of initial stages, specified by a mixing distribution vector p s , is given by Age and stage at death In classical age-structured life tables, the probability distribution of age at death is bðxÞ ¼ 'ðxÞ À 'ðx þ 1Þ. In age 9 stage-classified analysis, there is a joint distribution of age and stage at death, for an individual of any initial age-stage combination. To obtain this joint distribution, we construct a mortality matrixM in the absorbing Markov chain (Eq. 33) such that the absorbing states are defined by the combination of age and stage at death. To do so, define q i as the vector of stage-specific mortality probabilities for age class i, obtained from U i as  Vol. 88,No. 4 Then the mortality matrixM is inheriting the block structure ofŨ. As a simple extension of the result for age-or stage-classified models (Caswell 2001(Caswell , 2009(Caswell , 2012 all the joint distributions of age and stage at death are contained in the matrix Each column ofB contains the joint distribution of age and stage at death for an individual in one of the possible agestage categories. Marginalizing by summing stages within age classes, or age classes within stages, gives the marginal age and stage distributions as the columns of the matrices The distributions resulting from a mixed cohort, with mixture distribution p sx arẽ bðpÞ ¼Bp sx sx Â 1 (53) As this collection of results makes clear, the combination of age and stage dependence of all the vital rates creates rich opportunities to examine the distributions of age and stage at death.

Life disparity: years of life lost
An important application ofB is to the calculation of the life disparity, or the mean number of life years lost to mortality, an index introduced by Vaupel and Canudas Romo (2003). In an age-classified model, an individual that dies at age x, which happens with probability b(x), loses some remaining years of life that would have been available had the individual not, in fact, died. The lost years of life are unknown, but their expectation is the remaining life expectancy at age x. Thus the expectation of the life lost due to mortality is obtained by integrating over all ages at death where e(x) is remaining life expectancy at age x.
The mean life lost plays another role: it is a measure of the variation in age at death called life disparity. If all individuals were to die at the same age, say x*, then b(x) would be a delta function at x* and e(x*) would be 0, and (Eq. 56) would give e y ¼ 0. As a measure of variability in longevity, e † is highly correlated with the standard deviation of longevity and other measures of variation. Vaupel et al. (2011) analyzed patterns of e † across countries in relation to life expectancy as a way to analyze premature death.
In an age 9 stage-classified model, we define a vector The vectorg 1 contains the expected remaining longevity of every age-stage combination; the columns ofB contain the probabilities of death at each age-stage combination. Thus the vectorg y contains the expected life lost due to mortality, for individuals of every age and stage combination. These age 9 stage-specific expectations can be combined according to a mixing distribution p (dimension sx Â 1) as The result g y ðpÞ is a scalar, the mean life lost due to mortality in a cohort composed of age-stage combinations in proportions given by p.

POPULATION DYNAMICS: GROWTH AND STRUCTURE
We turn now from probabilistic outcomes for individuals within cohorts to the dynamics of populations. Population dynamics are the outcome of survival, development, and fertility throughout the life cycle. The first two are accounted for byŨ; the third is accounted for byF. The basic projection of population dynamics is with specified initial populationñð0Þ ¼ñ 0 .

Stable population theory and demographic ergodicity
IfÃ is time-invariant, we expect the population to converge, from any non-negative and non-zero initial population, to exponential growth at a rate k given by the dominant eigenvalue ofÃ, and a structure proportional to the corresponding right eigenvectorw. The reproductive value vector will be given by the corresponding left eigenvectorṽ ofÃ.
When normalized to sum to 1, the vectorw gives the joint age 9 stage distribution in the stable population. The marginal stable age and stage distributions are given by The reproductive value vectorṽ is not a probability distribution; instead, the entries ofṽ give the asymptotic relative sizes of populations initialized with a single individual of the corresponding age-stage combination. The reproductive value of a mixture of ages and stages is where p is the mixing distribution.

CONCEPTS & SYNTHESIS
Ergodicity.-Our expectation of ergodic behavior (the convergence to exponential growth and a stable structure from any initial condition) may be disappointed, becauseÃ may not be irreducible (a matrix is irreducible if there exists a pathway from every state to every other state). The Perron-Frobenius theorem guarantees that the state space of a reducible matrix can be decomposed into subspaces, each of which leads to different asymptotic behavior (see Csetenyi and Logofet 1989, Caswell 2001: Section 4.5, Stott et al. 2010, Caswell 2012. For example, life cycles with post-reproductive age classes are reducible and the behavior of such a population depends on whether the initial population contains only post-reproductive individuals (eventual extinction) or contains some reproductive individuals (potential population growth).
The i-state space of an age 9 stage-classified model will certainly be reducible if it contains some impossible combinations, which appear as rows and columns of zeros inÃ (see Eqs. 8 and 9). These states cannot be reached from, nor do they lead to, any other states; henceÃ is reducible. However, if we make the eminently reasonable decision not to start with a population composed solely of impossible individuals, these states will have no effect on eventual dynamics.
It is more challenging to deal with reducibility generated by the complicated pathways through the life cycle produced by the interaction of age-specific and stage-specific processes. Stochastic ergodic theorems may provide some guidance (Cohen 1982). Ergodicity can, in any specific case, be determined using the patterns of zero and non-zero entries of the reproductive value vectorṽ (Caswell 2012). Ifṽ has positive entries for all non-impossible states, then population dynamics converge to k andw from any initial condition not restricted to impossible states. Ifṽ has zero entries for some non-impossible states, then an initial condition restricted to those states would not converge tow, but rather to the distribution given by a different eigenvector. This possibility can easily be checked for any specified matrix.
Net reproductive rate.-The familiar net reproductive rate in age-classified models, serves three functions (Caswell 2009): it is the mean number of (usually female) offspring produced over a lifetime, it is the per-generation growth rate of the population, and it is an indicator function that distinguishes population growth (R 0 > 1) from population decline (R 0 < 1). Cushing and Zhou (1994) generalized this concept to stage-classified models by showing that where R is the next-generation matrix, If F contains only a single type of offspring, then Eq. 65 satisfies all three functions of the net reproductive rate. But if multiple types of offspring exist, then R 0 defined by Eq. 65 satisfies the last two (per-generation growth rate and growth indicator function), but it is not the mean lifetime number of offspring. Indeed, if multiple types of offspring exist, there is no single "number of offspring" to calculate.
The Cushing-Zhou result (Eq. 65) applies equally to age 9 stage-classified models, so R 0 ¼ max eigðRÞ (67) whereR ¼FÑ. However, taking advantage of the structure of the population vector, we can carry the analysis further. Let F ij and N ij denote the age blocks inF andÑ, respectively. For example, for x ¼ 3, The matrixR is block upper-triangular, and hence its dominant eigenvalue is The matrix R 11 is of dimension s Â s; its (i,j) entry is R 11 ði; jÞ ¼ Eðlifetime stage i offspringjstarting in stage jÞ: It can be extracted easily fromR as where The remaining lifetime reproductive output at age class j is obtained by applying the following analyses to the block R 1j .
Mean lifetime reproduction.-As was the case for stage-classified models, if there exists only a single stage of offspring, then R 0 calculated from the Cushing-Zhou theorem also gives the mean lifetime reproduction. However, when there are multiple offspring stages, the matrix R 11 contains information on the production of all these types of offspring. Just as we did for the age-specific fertility function in Eqs. 29-32, 570 Vol. 88,No. 4 we can aggregate lifetime reproductive output in several ways.
1. Weighted mean lifetime reproduction. Let c s be a vector of weights, of dimension s Â 1. Then The vector r weighted is a weighted combination of the rows of R 11 , and c, a vector whose entries assign weights to the different types of offspring. The ith entry of r weighted gives the mean weighted lifetime reproduction by an individual starting life in stage i. 2. Lifetime reproduction of a mixed cohort. A cohort starting as a mixture specified by a mixing distribution p will have a lifetime reproductive output of This vector is a weighted combination of the columns of R 11 ; its ith entry is the mean lifetime production of type i offspring by the mixed cohort. 3. Mixed and weighted lifetime reproduction. A scalar measure of mean lifetime reproduction is given by In the special case where only a single type of offspring is produced, q ¼ R 0 . But in general, lifetime offspring production in age-stage models is a more diverse and nuanced concept than R 0 . Note that much more information about lifetime reproductive output, including variances, higher moments, and sensitivity analysis can be obtained using Markov chains with rewards (Caswell 2011, Caswell 2015, 2017). The application of these methods to age 9 stage-classified models will be explored elsewhere.

POPULATION DYNAMICS: GENERATION TIME
Generation time, which is defined in several ways, is an important demographic measure of the time scale on which a population operates (Gaillard et al. 2005, Lebreton 2005, Steiner et al. 2014. In conservation standards established by the International Union for the Conservation of Nature, population decline measured in units of generation time helps establish the threat level for a species.
Three measures of generation time are in common use in age-classified demography (Coale 1972, Caswell 2001: the growth rate generation time, which is the time required to grow by a factor of R 0 , the cohort generation time, which is the mean age of the production of offspring by an individual, and the stable age generation time, which is the mean age of the parents of the offspring produced in a population at the stable age distribution c(x), Of these three, the cohort generation time is the most appropriate as a measure of the time scale on which a life history operates, because it is explicitly calculated as a property of an individual over its lifetime. The cohort generation time is denoted as l by Coale (1972), but that symbol is also used for the mortality rate. It is not unusual for such notational conflicts to arise when discussing demographic calculations that span a variety of indices or processes. Three complications arise in extending the familiar calculations for age-classified demography to age 9 stage models: (1) Multiple kinds of offspring may be produced, following different age schedules, and thus each appearing to have a different generation time; (2) Individuals that start life in different offspring stages may survive, mature, and reproduce differently, and hence have different generation times; (3) Unlike age-classified models, in which a cohort is always of a fixed age, a cohort may start life as a mixture of stages.
The cohort generation time for stage-classified models was derived in (Caswell 2009: Appendix A.5). What follows is the extension to age 9 stage-classified models, in which both age and stage trajectories must be taken into account.
Begin with a cohortñ 0 (dimension sx 9 1); being a cohort, it contains only age class 1 individuals, but may have any initial stage distribution. Normalize the cohort so that kñ 0 k ¼ 1. The survivors of this cohort at age x arẽ nðxÞ ¼Ũ xñ 0 (81) and the offspring, in the first age class, produced by these survivors are where e 1 is a unit vector of length x. Summing this quantity over all ages gives the vector of total lifetime reproduction, The mean age of the production of the offspring over the lifetime is proportional to But X 1 x¼0 xŨ x ¼ÑŨÑ: November 2018 AGE 9 STAGE DEMOGRAPHIC ANALYSIS 571

CONCEPTS & SYNTHESIS
The proportionality constant is required to make the entries of / life sum to 1 as probability distributions; with this normalization the vector of cohort generation times, for each starting stage of offspring, is The ith entry of Γ is the mean age of production of offspring of type i by a cohort of individuals with initial stage distributionñ 0 , and accounting for all stage transitions tha occur as the individuals age. Entries of Γ corresponding to stages that never appear as offspring will be undefined (0/0) in Eq. 88; they should be set equal to 0.

SENSITIVITY ANALYSIS OF AGE 9 STAGE-CLASSIFIED MODELS
In this paper, each step in the construction and analysis of the age 9 stage-classified model has been written in terms of matrix operations. This is not only for notational, analytical, and computational efficiency. It also makes possible the systematic calculation of the sensitivity of any output to changes in any set of parameters, using matrix calculus. Some previous studies have applied matrix calculus to vec-permutation models that are similar or equivalent to age 9 stage-classified models, including stage-classified epidemics (Klepac and Caswell 2011), spatial models (Strasser et al. 2012), the sensitivity of population growth rate in age-stage models (Caswell 2012, Caswell and Shyu 2012, Caswell and Salguero-G omez 2013, and the effects of age and frailty (Roth and Caswell 2016). These results can be extended to a general sensitivity analysis of age 9 stage-classified models; details will be presented elsewhere.
Let ξ be any output variable (see Table 1), scalar-or vector-valued, calculated fromÃ, and let h be a vector of parameters that affect any of the U i and/or F i . In matrix calculus notation, the sensitivity of ξ to h is Each of the numbered terms in (Eq. 89) has its own function.

Term 1
Expresses the dependence of the outcome variable ξ on the matrixÃ or its componentsF andŨ. this matrix is obtained by differentiating ξ with respect toŨ andF. For example, if n ¼ k, then Term 1 is the familiar derivative of k to the entries of the matrixÃ, But suppose instead that interest focuses on the net reproductive rate, as defined by Eq. 67, and let y and x be the right and left eigenvectors ofR corresponding to R 0 , scaled so that y > x ¼ 1. Then n ¼ R 0 , and Term 1 is the derivative of the net reproductive rate with respect toÃ, which is (extending calculations from Caswell [2009] to the age 9 stage case). The point of these two examples is that Term 1 is the only part of the sensitivity calculation that depends on what the dependent variable may be (k, R 0 , or anything else).

Term 2
These are constant matrices, depending only on the vecpermutation matrix K the age transition matrix D and the age assignment matrix H. SinceŨ andF are given by Eqs. 11 and 12, the two derivatives in Term 2 are Notice that these matrices are independent of the dependent variable or on the identity of the parameters h.

Term 3
These are the derivatives of the block diagonal matrices U and F to their diagonal entries U i and F i . As such they depend only on the arrangement of the matrices on the diagonal. A new way to obtain these derivatives is based on Eqs. 14 and 15 of Caswell and van Daalen (2016). Define the matrices P i and Q i , of dimension xs Â s and s Â xs, respectively,  Vol. 88,No. 4 Then Again, note that this term is independent of the dependent variable and the parameters being perturbed.

Terms 4 and 5
These terms contain the biologically interesting part of the calculation, because it is at this point that we find the dependence of the transition and survival matrices U i and the fertility matrices F i on the parameter vector h. This is the point where the age-stage specific processes of survival, growth, development, reproduction, etc. are determined.
The possibilities are limited only by imagination. As an example, suppose that interest focuses on a parameter that imposes a multiplicative perturbation on the stage-specific survival probabilities at a specified age or ages. We can write where Σ i is a diagonal matrix with a vector of survival probabilities r on the diagonal and G i is a matrix of transition probabilities conditional on survival. From this expression, it can be shown that Now incorporate the parameter vector by writing r i ¼r i h, wherer i is the baseline vector of stage-specific survival probabilities at age class i. Then Because this perturbation is hypothesized to affect only survival, the derivatives of F i to h in Eq. 89 are all zero. This calculation is an arbitrary example. The reader can no doubt think up possible applications (e.g., h might describe harvest or bycatch impacts that reduce the survival probability of certain stages of a threatened species, or it might reflect allocation of resources to survival, with an associated cost in fertility, . . .).
The causal pathways from a parameter vector h to some demographic outcome ξ can be enormously complex in an age 9 stage-classified model. The great value of the vecpermutation formulation is that all these pathways are taken into account, and that attention can focus on the effects of parameters on the age 9 stage-specific vital rates in terms 4 and 5, and on the way that the outcome is computed from the age 9 stage matrixÃ in term 1. The connection between these two, for any outcome for any species described by any set of stages, is given by the series of constant, and easily computable, matrices in terms 2 and 3.

A MODEL SPECIES EXAMPLE
Here, we construct and analyze an age 9 stage-classified matrix model for a hypothetical model species, inspired by (although not identical to) poecilogonous marine invertebrates (Levin et al. 1987, Krug 2007, Knott and McHugh 2012. These are species that produce two dramatically different types of larval offspring. One type, so-called planktotrophic larvae, feed during the larval stage and receive only a small parental investment. The other type, lecithotrophic larvae, do not feed, and rely on a large parental investment in the form of a yolk reserve. Lecithotrophic larvae are more costly to produce and are produced in lower numbers than are planktotrophic larvae. However, lecithotrophic larvae have a higher survival probability than their planktotrophic siblings, an advantage that persists into post-larval life (Levin et al. 1987).
Our model species has four stages: large and small juveniles, and adults that began life as either large or small juveniles. We will refer to these as large and small adults, although large-born and small-born might be more accurate. MATLAB code for the calculations is available as an Supporting information.

The age 9 stage-classified model
The life cycle of our model species contains small and large juveniles and small and large adults. Stage-specific demography is defined in terms of survival probability r, maturation probability (conditional on survival) c, and fertility f. Each of these parameters is a function of both stage and age. The resulting life cycle graph is shown in Fig. 1; the model parameters are described in Table 3. The life cycle graph describes the stages; age dependence enters by making the survival, maturation, and fertility parameters in the graph into functions of age as well as stage.

CONCEPTS & SYNTHESIS
Life history description.-Because the two types of offspring differ in the resources allocated to them by the parent, offspring type has consequences throughout the life cycle. Both juvenile and adult survival probability are lower for small individuals ( Fig. 2a and b). Small juveniles also begin to mature later than large ones and their maximum maturation probability is lower (Fig. 2c). Small individuals are less costly to produce, so they are produced at a higher rate, by both small and large adults. We have incorporated a certain heritability of type, which results in large adults producing relatively more large juveniles and small adults producing relatively more small juveniles ( Fig. 2d and e).

Matrix construction
We construct the age 9 stage model from the stage transition matrices U x , the reproduction matrices F x , the age transition matrix D and the age assignment matrix H. The life cycle contains s = 4 stages (stage 1, small juveniles; 2, large juveniles; 3, small adults; and 4, large adults), and we consider x = 50 age classes.
The matrix U x describes the transition and survival probabilities for individuals in age class x, The s 9 s matrices F x describe the stage-specific per capita production of new individuals by reproduction, for individuals in age class x, The first and second rows of F x describe per capita production of small and large juveniles, respectively. The x Â x matrix D i advances the age class of surviving individuals in stage i. All of the D i are identical, given by Eq. 6. The x Â x matrices H i that assign newborn individuals to the first age class are all identical, given by Eq. 7. The matrices U; F, D, and H are created by putting U i , F i , D i , and H i on the diagonals, as in Eq. 10. Each of these matrices is of dimension sx Â sx, which for our model is 200 Â 200 . Finally, the age 9 stage-classified projection matricesŨ,F, andÃ are constructed following Eqs. 11-13.

Population growth and structure
It is convenient to begin with analyses of population growth, because the stable population structure will provide a family of mixing distributions that will be used several times in the cohort analyses to follow.
Stable population structure.-Using the parameters defined by Fig. 2, the population growth rate given by the dominant eigenvalue ofÃ is k ¼ 1:0083. The joint distribution of age and stage in the stable population is contained in the corresponding eigenvectorw, normalized to sum to 1. Fig. 3 shows the joint distribution of age and stage and the marginal age distribution w age and stage distribution w stage , calculated from Eqs. 61 and 62.
The joint distribution of age and stage in the stable population is dominated by young and small individuals (Fig. 3a). Adults (small and large) appear only at later ages. The marginal age distribution decays rapidly with age; most individuals are younger than 10. The marginal stage distribution is dominated by small juveniles (~60%). Large juveniles are much scarcer (~15%). Large adults, however, are more common (~20%) than small adults (~5%).
Even with only four stages, the interaction between age and stage in the stable population is complicated. The conditional stage distribution within each age class is shown in Fig. 3d. Up until an age of about 10, the population is dominated by small juveniles, but at later ages, it is dominated by large adults. The pattern results from the differences in survival probability and development time between large and small juveniles and adults.
A mixing distribution for cohort initiation.-Cohort dynamics begin with a group of newborn individuals, all in the first age class, but in possibly different stages. Calculations of survivorship, longevity, reproduction, and the age and stage at death need the stage distribution of the cohort as a mixing distribution. The mixing distribution p (dimension s 9 1) is obtained by extracting the entries ofw corresponding to age class 1 and normalizing so that the resulting vector sums to 1   Note: The index j refers to the age class of the individual; for reproduction it refers to the age class of the parents.

574
HAL Vol. 88,No. 4 where e 1 is a unit vector of length x. Thus about 85% of the new offspring in the stable population are small.
Reproductive value.-Reproductive value is given by the left eigenvectorṽ ofÃ (dimension sx Â 1). It is a relative measure of contribution to future population size, customarily scaled relative to that of a newborn individual. In our model species, there are two such types, so Fig. 4 shows reproductive value scaled relative to that of a small juvenile. The survival and maturation advantages of large juveniles relative to small juveniles translate into a dramatic increase in reproductive value. Juvenile reproductive value increases with age, as individuals get closer to maturation. After maturation, large adults have a reproductive advantage.

Cohort analyses
Survivorship.-The stage-specific survivorship vector 'ðxÞ is calculated by projecting a cohort with Eq. 17 and counting the proportion of survivors. Because our model species has two possible initial stages, small and large juveniles, the vector 'ðxÞ has only two non-zero entries. Fig. 5 shows the survivorship of small and large individuals. Survivorship declines rapidly at young ages, when individuals are subject to the high mortality rates of the juvenile stages, and then declines more slowly after maturation. Large individuals have a dramatic survivorship advantage. The survivorship of a mixed cohort, starting with a mixing distribution p calculated from the stable structure in Eq. 106, lies between the survivorship curves for small and large individuals, but much closer to that of small individuals. This reflects the dominance of the mixing distribution by small juveniles.
Longevity.-The remaining life expectancy (mean remaining longevity at a given age) of every stage is given by the vector g 1 , of dimension sx Â 1, calculated from the fundamental

576
HAL Vol. 88,No. 4 matrix using Eq. 35 (Fig. 5c). Life expectancy of juveniles increases with age as they approach maturity, after which they will experience higher survival probability. The life expectancy of the adult stages decreases smoothly with age. The variance in longevity, calculated using Eq. 37, decreases with age for large juveniles, but shows a peak just before the end of maturation, at which point individuals either die very soon, or mature and probably survive for quite some time, hence the large variance (Fig. 5c). Small juveniles show a similar peak in variance. For adults, variance decreases steadily with age.
The mean and variance of longevity for a mixed cohort are calculated using Eqs. 40 and 41 and the mixing distribution vector p 1 given by Eq. 106. The resulting mean and variance in longevity, at birth, are shown in Table 4.
When the variance in longevity of the mixed cohort is decomposed into that due to heterogeneity among stages and that due to stochasticity within stages, using Eqs. 43 and 45, the variance due to stochasticity was 35.2 and the variance due to heterogeneity was 17.2. Thus the percentage of the variance attributable to the heterogeneity among stages is 32.8%.
Fertility.-The entries of F x in Eq. 104 give the mean production of each type of offspring by parents of each type, at each age, as shown in Fig. 2d and e. We summarize this information with the fertility indices defined in Eqs. 29-32.  As an example of a weighted fertility schedule, we suppose that a large offspring is 10 times as costly as a small one, so we set The resulting fertility schedule f weighted is shown in Fig. 6a. The mixed fertility schedule, given by Eq. 30 is shown in Fig. 6b. And, finally, the weighted and mixed schedule from Eq. 32 is shown in Fig. 6c.
A large juvenile will, as a result of the combined age-and stage-specific life history characteristics of the model species, produce many more offspring (by a 14-fold difference for small offspring and a 78-fold difference for large offspring). The net reproductive rate, calculated as the dominant eigenvalue of R 11 , is Suppose that we simply want to count total numbers of offspring produced. Then we use a weighting vector c ¼ 1 s , and obtain r weighted ¼ 0:35 5:74 0 0 ð Þ > : We see that, in terms of lifetime number of offspring, large individuals have a 16-fold advantage over small individuals.
To compute the lifetime reproduction of a mixed cohort, we use the mixing distribution p calculated from the stable population in Eq. (106), to obtain r mixed ¼ 0:99 0:18 0 0 ð Þ > : Combining this particular choice of weighting and mixing, we obtain the mean lifetime reproduction, by a

578
HAL Vol. 88,No. 4 mixed cohort, and counting total offspring of both types, as q ¼ 1:167: The analysis of an age 9 stage-classified model untangles the interacting effects of age-and stage-specific parameters. Even in a simple model species, large-born individuals have a higher survivorship than small-born individuals, and their earlier maturation increases their survival advantage because they can escape juvenile mortality sooner. The two stages have very different lifetime reproductive output, not simply as a result of differences in production ( Fig. 2d and e), but as an interacting effect of juvenile survival, adult survival, and maturation. Our model example also demonstrates how the composition of a population will influence the demographic calculations. Mixed cohorts and populations and different choices of weighting vectors determine how a mixed population will perform in terms of survival, reproduction and population growth. These mixture calculations make possible the analysis of heterogeneity in initial population structure, and in the structure that develops as cohorts age within a population.

DISCUSSION
Our goal in this paper has been the development of a systematic methodology, of wide applicability, for the analysis of age 9 stage-classified matrix population models.

Themes
Several themes have appeared repeatedly. Because the state space of an age 9 stage-classified model is two-dimensional, results that are familar as scalars in age-classified or stage-classified models now become vectors or matrices. The operations of marginalization and mixing thus play a critical role.
Consider any demographic quantity calculated from an age 9 stage-classified model. If this quantity is a joint function of age and stage (e.g., the age and stage at death), then the marginal distributions (age at death and stage at death) are obtained as in Eqs. 51 and 52. If, on the other hand, a quantity is conditional on the age and stage of an individual (e.g., mean longevity), then the properties of this quantity for a heterogeneous cohort of individuals in different stages is calculated as a mixture, as in Eqs. 40 and 41. Using the total variance theorem, the variance in this quantity is partitioned into components due to stochasticity and to the heterogeneity created by the mixture, as in Eqs. 43 and 45. Any age 9 stage-classified analysis should keep these concepts in mind.
Another theme that appears throughout our methodology is the interaction of age 9 stage-dependence across three levels: the level of the individual (as in the life table functions), the level of the cohort (as in calculations of life expectancy and lifetime reproduction), and the level of the population. Although these levels are implicit in both population ecology and human demography, they appear explicitly here as a part of the methodology.
Variance is a recurring theme here. There is an increasing appreciation of the importance of variance created by stochastic differences between individuals (individual stochasticity). The inclusion of variance calculations in the approach here makes possible the decomposition of variances into components due to demographic differences (i.e., heterogeneity) among stages and ages, and to stochastic factors within stages.

Estimation and data
It is no surprise that age 9 stage-classified models require an extra dimension of data: age-specific rates at every stage, stage-specific rates at every age. Such detailed data are rare (but see van Groenendael and Slim 1988) but we believe they will become more common as the importance of long-term individual data is recognized (Clutton-Brock and Sheldon 2010). The appropriate estimation procedures will, as always, depend on the properties of the species, the possibilities for monitoring, logistical difficulties, and available statistical methods. For that reason, we make no assumptions about how the stages are chosen or how the matrices U i and F i are estimated.
That being said, the generality of the method suggests some promising directions for future investigation. For example, multistate capture-recapture methods (Lebreton et al. 2009) are designed to estimate survival and transition probabilities from imperfect capture histories. In human demography and biomedical survival analysis, multistate event history methods are used to estimate age-and stagespecific rates of mortality (Andersen and Keiding 2015), but the data are often less fragmentary than in capture-recapture analysis. Possible connections between these approaches might help to estimate parameters for age 9 stage-classified models.
In this context, for the special case of stages defined by continuous traits, integral projection models (IPMs) are essentially statisically sophisticated tools for estimating high-dimensional matrices. An IPM requires kernel functions for growth, survival, and fertility. If these kernels were estimated separately for age classes 1, . . ., x, the resulting model would be analogous to those considered here (Ellner et al. 2016: Chapter 6). The fertility kernel are special cases of the set of F i , and the growth and survival kernels special cases of the U i , when partitioned into G i and Σ i as in Eq. 99. When discretized for analysis, the set of IPMs would provide a set of matrices to which our methods can be applied. Note, however, that stages need not be defined by discretization of continuous variables.

Data requirements
Estimating stage-specific survival, transitions, and fertility is challenging. Obtaining these estimates as a function of age is even more so. However, we have made no assumptions about how the U i and F i vary with age, so even partial age dependence can be incorporated. Perhaps fertility varies with age but survival does not (or cannot be estimated). Or perhaps age dependence can be detected only over sets of age classes (survival and transitions described for age classes 1, 2-5, and >5, while fertility is described for age classes 1-5, 6-10, and >10). The end result in any of these cases is a set of matrices U i and F i , some of which are identical, but the analyses apply without any additional calculations.
In the limit, our method can be applied when U and F do not vary with age at all, in order to explore what would happen if age-dependent effects were introduced (e.g., Caswell and Salguero-G omez 2013).

Applications
Age 9 stage-classified matrix models are, in the end, matrix models. As such, they can be applied in the same way as age-classified or stage-classified models, to, inter alia, conservation of threatened species, control of pests, ecotoxicology, and epidemiology. These applications will require a model that specifies a management-relevant outcome from a management-relevant demographic structure. That might be a simple juvenile-adult model (as in our artificial species) or it might focus on the details of a particular species in a particular place at a particular time. In the latter case, one builds a model based on data from that place and time and applies the relevant methods to analyze it.
If both age and stage are significant components of the i-state (which is the rationale for constructing an age 9 stage-classified model in the first place), then applications will sometimes be well served by including the additional information. (But not necessarily. There is no reason to expect that the most complicated model is the most useful for any specific application.) Having a methodology for age 9 stage models that parallels that for age-and stage-classified models will facilitate model selection for applications.

Heterogeneity
The role of heterogeneity in population dynamics is a problem of increasing interest (e.g. Vindenes et al. 2008, Caswell 2014, Vindenes and Langangen 2015, Cam et al. 2016. From the perspective of an age-classified model, stage structure is a form of unobserved heterogeneity. It distorts age-specific outcomes because of intra-cohort selection (Vaupel et al. 1979, Vaupel andYashin 1983). It is an additional source of variance among individuals in demographic fate above and beyond that generated by the age schedules of fertility and mortality. From the perspective of a stage-classified model, age is a form of unobserved heterogeneity. It distorts the dynamics of stage structures and stage durations, particularly in short-term transient responses. It can also increase variance above that created just by the outcome of stage-specific processes.
The age 9 stage-classified methodology presented here, especially the variance decomposition analysis, can contribute to the analysis of heterogeneity, because it makes it possible to directly measure the consequences of any form of heterogeneity that can be incorporated in the model, for any outcome for which a variance is available. This includes, but is not limited to, the incorporation of latent, unobserved factors such as frailty (Caswell 2014, Hartemink et al. 2017, Jenouvrier et al. 2018, Hartemink and Caswell 2018.
Age, stage, and phenotype: incorporating heritability The models analyzed here include stages of both parents and offspring, and so it is natural to wonder if somehow the relation of parent to offspring, and the associated concept of heritability, could be incorporating. There are three ways in which the methods could be extended to include phenotypes and heritability. Two of these correspond to the models of Coulson et al. (2010); the third is new.
1. Let stages represent phenotype categories. Then F i would become a map from parent phenotype to offspring phenotype, for parents of age class i, encapsulating the notion of heritability. The matrix U i would become a phenotype transition and survival matrix for age class i; its structure would depend on whether phenotype traits do or do not change over an individual lifetime. The result would be an age 9 phenotype model. 2. Replace age classes with phenotype classes. This would lead to a stage 9 phenotype model. The matrices U i would describe the survival and stage transitions of individuals in phenotype class i. The F i would describe the production of offspring of all stages by individuals of all stages in phenotype class i. In the age 9 stage model, the matrix H i is a map from parent age to offspring age. It has the trivial form of Eq. 7 because all newborn individuals start life in age class 1, regardless of the age of their parent. In the stage 9 phenotype model, H i would become a parent-offspring phenotype map, encapsulating the heritability of the phenotypic trait. Perfect heritability would imply that H i ¼ I. In the complete absence of heritability, all columns of H i would contain an identical probability vector. In this model, the matrices D i would no longer describe age transitions, but would be replaced by phenotype transition matrices. Careful thought would be required to distinguish stages, or more generally i-states, from phenotypic traits (e.g., body size might be a stage, while body size at birth, or the coefficients in a growth equation, would be phenotypic traits). See Chevin (2015) for related discussion. 3. Age 9 stage 9 phenotype. The third possibility would be to incorporate phenotype as a third istate dimension, creating a fully age 9 stage 9 phenotype-classified model. The projection matrix for such a model, of dimension wsp (where p is the number of phenotype classes) is obtained from a three-dimensional array of matrices by a multigenerational generalization of the vec operator. Such models are called hyperstate matrix models; their construction and analysis is presented in Roth and Caswell (2016).
The development of age 9 phenotype, phenotype 9 stage, and especially age 9 stage 9 phenotype models is an open research problem.

Extensions of age 9 stage-classified demography
The approach outlined here suggests extensions of the models; here we mention a few of these.
The construction ofñ from the array N yields a population vector in which stages are grouped within age classes. 580 HAL Vol. 88,No. 4 This gives a sort of priority to stage-specific processes, treating age as a secondary characteristic. The alternative, treating age as primary and stage as a secondary kind of heterogeneity, would be obtained by transposing N and creating a population vector, call itñ 0 , given bỹ The dynamics ofñ 0 are given by a projection matrix Then all the results in this paper can be extended to this alternative arrangement by transforming the matrices as in Eq. 114. In doing so, the block-Leslie structure ofÃ in Eq. 14, in which stage matrices are arranged in the pattern of an age-classified matrix is replaced by a structure with age-classified matrices arranged in the pattern of the stage transitions.
Yet another intersting extension is to models structured by stage and age within a stage. Stage-classified models lead to geometrically distributed occupancy times. But if some time must elapse before an individual can advance to the next stage, some internal structure must be imposed to keep track of how long the individual has been in the stage; see Birt et al. (2009) for a recent presentation. These models are related to the age 9 stage-classified models considered here, and it will be valuable to connect the two approaches.
We have presented methods here for linear, time-invariant models, but one value of our approach is that it permits direct extension to more sophisticated models. The population dynamics in Eq. 60 are an ordinary matrix population model, with the projection matrixÃ written as the sum ofŨ andF. Incorporating density dependence would result in a nonlinear model in whichÃ would be a function ofñðtÞ. The possible combinations of age-and stage-specific density effects make it a daunting prospect to constuct such models.
However, the methodology here builds the projection matrices from the block-diagonal matrices like Eq. 10, each of which incorporates one of the sets of processes determining the dynamics (stage development, age transitions, fertility, and age assignment for newborn individuals). Even more, the block-diagonal matrices in turn are composed of well-defined matrices U i and F i describing the processes operating on each age-stage combination. It is at the level of these component matrices that the density effects or time variation actually operate, and the formulation presented here makes those matrices directly accessible for analysis. Something like this has been approached for a stage-classified epidemic model by Klepac and Caswell (2011).
Stability analysis, reactivity analysis, and sensitivity analysis of nonlinear age 9 stage-classified models will then be approachable via extensions, similar to those in Eq. 89, of the matrix calculus approaches developed for nonlinear age-and stage-classified models (Caswell 2008, Verdy andCaswell 2008).
Our methodology invites similar extensions to time-varying and stochastic environments, by making it easily possible to explore the effects of stochasticity in age-and stage-specific survival, transitions, and fertility. The same applies to spatial structure. However, adding location to the model will require increasing the dimensionality of the i-state space (just as did the addition of phenotype discussed above). The population vector for an age 9 stage-classified model is obtained by applying the vec operator to the two-dimensional matrix N in Eq. 1. The population vector for the age 9 stage 9 location model is obtained by applying a multidimensional version of the vec operator to the threedimensional analogue of Eq. 1. Such arrays are called hypermatices, and the higher-dimensional models referred to as hyperstate matrix models; their construction and analysis are detailed in Roth and Caswell (2016). It should come as absolutely no surprise that adding additional dimensions to the i-state space requires increasing amounts of additional data, but the framework presented here makes it possible, in principle, to do so. This list can be extended. We invite the reader to do so.