Consumption Smoothing among Working-Class American Families
before Social Insurance
John A. James
Michael G. Palumbo
Abstract: This paper examines whether the saving decisions of a large sample of working-
class American families around the turn of the twentieth century are consistent with
consumption smoothing tendencies in the spirit of the permanent income hypothesis. We
develop two econometric models to decompose reported annual incomes from micro-data
into expected and unexpected components, then we estimate marginal propensities to save
out of each component of income. The two methodologies deliver similar regression
estimates and reveal empirical patterns consistent with those reported in other recent research
based on quite different contemporary household data. Marginal propensities to save out of
unexpected income shocks are large relative to propensities based on expected income
movements, though the former lie much below one and the latter much above zero. While
these data reject strict parameterizations of the permanent income hypothesis, we nonetheless
conclude that families’ saving decisions in the historical period look quite “modern.”
JEL Classification Codes: D91, N31, E21.
Keywords: Unemployment risk, permanent income hypothesis, precautionary saving.
Affiliations and Acknowledgments: James is in the Department of Economics at the
University of Virginia, Charlottesville, VA 22903 (firstname.lastname@example.org); Palumbo is at the
Federal Reserve Board, Mail Stop 80, Washington, D.C. 20551 (email@example.com); and
Thomas is in the Department of History at the University of Virginia
(firstname.lastname@example.org). We appreciate discussing this research with Martin Browning,
Thomas Crossley, Chinhui Juhn, David Papell, Jonathan Parker and Luigi Pistaferri. Also,
seminar audiences at the American Economic Association, the Federal Reserve Board,
McMaster University and York University provided valuable comments. The views presented
are solely those of the authors and are not necessarily shared by the Federal Reserve Board
or its staff.
Around the turn of the twentieth century, many Americans lived and worked in an environment
of considerable economic uncertainty. Industrial accidents presented significant risks to many (see
Kantor and Fishback, 1996), as did illnesses, but even more pervasive was the risk of unemployment.
Unemployment was much more widespread during the period before World War I than it has been
since World War II. Not only was the natural rate of unemployment higher, but so too was the
cyclical sensitivity of unemployment (James and Thomas, 1996). Moreover, the incidence of
unemployment was more widespread, implying that a greater proportion of workers had need for
precautionary action than today. Unemployment in this historical period had its predictable elements --
the availability of work followed strong seasonal influences, for one thing -- but loss of work also
resulted from much less predictable factors. Business cycle downturns during the late nineteenth and
early twentieth centuries were on average more serious than they had been before or have been since.
The severity of business cycles had increased dramatically from the period before the Civil War to the
one after (James, 1993) -- indicating an increasing need for precautionary behavior -- and this in a
period before the rise of governmental institutions designed to take the sting out of unemployment
spells. Workers in the late nineteenth century were essentially dependent on their own devices to
combat income uncertainty. It is perhaps ironic that the expansion of social insurance after World
War II coincided with a moderation of unemployment volatility.1
Alexander Keyssar, in his well-known study of unemployment in Massachusetts, observes
that employment for workers in this period was “chronically unsteady” (1986, p. 59). Even within the
business cycle he stresses the great diversity of individual experience: “The incidence of joblessness
during depressions was always checkered, erratic, variegated.” Moreover, “a majority of the working
class found that the threat of unemployment remained palpable even when business was good.” (1986,
pp. 55, 58). Substantial negative shocks to household income therefore would have been common and,
especially for lower income families, presented potential economic disasters. Workers faced such risks
without assistance from any sort of public safety net -- no unemployment compensation, no worker’s
compensation for accidents, no sickness coverage. Similarly, few workers could access formal credit
markets during this time to smooth adverse income shocks through temporary loans.
1 Long-run changes in the degree of volatility of national output and unemployment in the U.S., of course,
have been a controversial issue (see Romer, 1986; Weir, 1992). However, James and Thomas (1996)
document a decrease in the cyclical response of unemployment from the pre-World War I period to the
post-World War II period.
Commercial banks, influenced at least in theory by the real bills doctrine, limited themselves to
commercial rather than personal loans, while mutual and stock savings banks were less than
A similar environment of unpredictable incomes and weak formal credit markets exists for
low-income households in developing countries today. Nonetheless, a number of recent studies have
found that, even without access to formal insurance and credit markets, such households have generally
been able to smooth their expenditures in the face of large and frequent income shocks (e.g., Deaton
1990; Paxson 1992; Townsend 1995). However, most previous research on how households in
developing countries cope with economic risks studies families who rely heavily on agricultural
production for their earnings and thus face different circumstances than did American industrial
workers a hundred years ago. Crop diversification and the maintenance of buffer stocks of
commodities were not options open to the non-farm working class in America, on whom we focus in
Does it follow, therefore, that American working-class families responded to uncertain
earnings prospects through different private saving patterns than have been observed among peasant
farm households today? Or does it mean that workers found means other than private saving to
smooth consumption in the face of volatile and unpredictable incomes? We address these themes using
a two-pronged approach. Most of the paper reports on an econometric analysis of the role played by
private saving as a potential means for buffering volatile income experiences by working-class families
a hundred years ago. A second shorter historical section then analyzes the institutional environment
surrounding families of the time to complete the picture. Our historical survey reports on the
availability of consumer credit, insurance coverage and pawnbroking services, all of which might be
expected to have enhanced the ability of working-class Americans during this period to protect their
expenditures from the short-term income disturbances to which they were vulnerable.
The paper proceeds as follows. The first section describes the most comprehensive
micro-level, primary-source database assembled to date on saving, income and employment covering
American workers surveyed during twenty-plus years around the turn of the century. By merging
information from thirty cross-section surveys based on nearly identically-worded questionnaires, our
database includes information from more than 32,000 working-class families interviewed between
1884 and 1909. In the second section, we describe an econometric methodology, similar to that
proposed by Paxson (1992), for decomposing annual income realizations into predictable and
unexpected components for each worker in the sample.
The third and fourth sections contain the primary empirical results of the paper based on two
sets of estimated marginal propensities to save out of predictable and unexpected income components.
We first pursue an econometric methodology which exploits variation in unemployment during the
previous year at the worker level to identify the marginal propensity to save from unexpected income
realizations. This method carries some intuitive appeal, but is subject to fairly standard endogeneity
criticisms. Our second econometric approach involves aggregating the micro data up to “groups” of
workers, based on shared characteristics, then identifying the marginal propensity to save out of
unexpected income realizations from “business cycle” variation in group-average outcomes across time
periods. The two disparate empirical methodologies yield generally consistent results and, in
particular, econometric estimates from the aggregate-level analysis show no evidence that potential
endogeneity is responsible for the micro-level regression results.2
Both analyses present strong empirical evidence that working-class American
households used their own saving to smooth consumption in the face of volatile employment
circumstances during the late nineteenth and early twentieth centuries, much as contemporary
American and European families, as well as farmers in developing countries, appear to do. As almost
all recent studies also find, the regression coefficients for household saving reject strict
specifications of the permanent income hypothesis, but seem in line with what one might expect under
precautionary or buffer-stock saving behavior. We conclude, therefore, that the broad saving patterns
in this unique historical micro-level database fit “modern” intertemporal behavior, in this sense.
As mentioned, the fifth section of the paper presents some historical evidence on the
availability of financial institutions that would have accommodated consumption smoothing by American
workers during this period through means other than their own private saving. We argue that these
alternative mechanisms, which might have been used to complement private saving as a means to buffer
household expenditures, were in fact quite limited. Finally, the paper concludes by briefly summarizing
our empirical findings and comparing them with results and conclusions from previous related
I. Saving and Income Data from a Series of Worker Surveys
2 We discuss these issues in detail below. Our aggregate-level, or time-series, approach, but not our
micro-level regression, is appropriate if workers differ in terms of their unobserved tendencies to be
employed and to save resources for future uncertain contingencies. Despite the intuitive appeal of such a
hypothesis, we uncover no support for the endogeneity it implies in these historical data.
3 Note, we leave for future research the more ambitious, and necessarily complex, question of whether
these micro-level saving data are completely consistent with intertemporal optimization under rational
expectations, given the magnitude of income risks apparently facing this sample of American workers.
State-level bureaus of labor statistics published more than one hundred surveys of wage-
earning workers during the late nineteenth and early twentieth centuries. The surveys focused on
economic conditions facing American workers, employed mostly in the non-farm sector, and the living
conditions of their families. Some surveys concentrated on covering workers employed in specific
industries (vehicle manufacture; iron production; furniture), actually holding interviews at the
workplace (the Michigan model); others polled representative samples of workers across industries by
mail (the Kansas model). Although the information collected varied from survey to survey, most
followed a common model using similarly worded or identical questions, following the example
developed by Carroll D. Wright in Massachusetts, who began surveying workers during the early
1860’s. Individual survey responses were invariably published in full, without editorial embellishment
or alteration, once accuracy and consistency were checked. Each survey covered a different cross-
section of workers; none followed the experiences of individual workers or families across more than
one year. It is not possible therefore to construct a true panel from these sources; our database
consists of merged independent cross-section surveys.4 As Table A-1 indicates, the database includes
32,150 workers from 8 states interviewed during 22 different years from 1884 through 1909. Note,
however, that nearly all (97 percent) of the families in our sample reside in Kansas or Michigan; most
surveys used here (71 percent) were conducted during the 1890’s.5
We limit our analysis in this paper to those thirty surveys which report information on income,
saving and/or expenditures and days of unemployment during the previous year. Additionally, we
know the skill level for each worker, as well as his age, industry, state of residence and the year of the
survey.6 Income in this paper refers to annual family income, combining income from all sources.
A few surveys report labor earnings disaggregated by earner (or, at least, separately for primary wage-
earners and all others in the family), but most do not.7 The database employs two different measures
4 We discuss the dataset in more detail elsewhere (James, Palumbo, and Thomas 1996), so here we shall
5 In fact, we have repeated all of the paper’s empirical analysis using only data from Michigan and
Kansas. None of our results are affected significantly by focusing exclusively on this subsample.
6 Our working database excludes the few women surveyed, as well as the few men younger than 15 years
of age or older than 75.
7 None of the surveys separate interest income from labor income, but the former category is unlikely to
be a major contributor among workers included in this sample.
of annual saving.8 Some households reported last year’s saving directly (answering a question like
“How much of your income did you save last year?”); others recorded total family expenditures, from
which saving may be calculated as the residual from annual income.9 We designate the first variable
as “reported saving,” and the second as “calculated saving.” We apply all the empirical analyses in
the paper to both measures of household saving.
Figures 1 and 2 show empirical density and distribution functions for reported and calculated
saving based on the pooled cross-section survey data. The figures show clear differences between the
two distributions. Surveys that measured saving directly only recorded additional money set aside
(positive values for reported saving); otherwise, entries were left blank. After analysis of the data, we
concluded that 'no response' generally indicated zero (or negative) saving.10 Thus, Figure 1 shows
some possible effects of left-censoring at zero -- a large spike in the distribution function at the
censoring point -- as well as other smaller spikes at “round” numbers.11 Calculated saving, on the
other hand, takes both positive and negative values. Thus, Figure 2 shows a less skewed distribution
of saving, and a smaller spike occurring exactly at zero dollars per year. Appendix A reports a
detailed investigation into the differences between the two saving series. Among a subsample of
families for whom we observe both reported and calculated saving, the two variables are strongly
linearly related. Thus, the divergence between the distributions shown in Figures 1 and 2 turns out to
8 Note that the surveys clearly asked respondents about the annual flow of saving out of income during
the previous year, rather than about their accumulated stocks of assets or net worth. Some of the
respondents, naturally, owned their homes. During this period, home mortgages typically involved
short-term contracts in which only interest payments occurred during the loan’s duration with a balloon
payment of principal due at maturity. Thus, paying down the loan principal at maturity must have been
accomplished through prior saving and, we have argued (James, Palumbo and Thomas 1996), our saving
measures seem likely to include changes in home equity.
9 Direct savings were recorded by 26,112 families, while savings were calculated as a residual for a
further 11,946 families (see Table A-1).
10 One survey, covering vehicle workers in Michigan during 1896, recorded three responses to the
question “amount saved last year?” -- a positive value, “none,” and a blank response. Of the 2,787 survey
respondents who do not report positive saving (out of 3,776 total observations), 2,576 explicitly answer
“none”; only 211 have a blank response.
11 For consistency, we artificially censored the few observations for which we observe negative values for
reported saving. These come from a single Kansas survey in which respondents answered “Did you save
or run a deficit last year?”
have virtually no consequence for our empirical analyses. As we report below, Tobit equations
estimated using reported saving yield nearly identical slope coefficients to OLS regressions based on
calculated saving, though the estimated intercepts differ between the specifications.
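The two saving measures, and the censoring in the reported series, can be written compactly. The notation below is introduced here only for illustration and does not appear in the surveys: $Y_i$ is annual family income, $C_i$ is total family expenditure, and $S_i^{*}$ is a latent (possibly negative) saving level that the direct question records only when it is positive.

```latex
\begin{align*}
S_i^{\text{calc}} &= Y_i - C_i
  && \text{(calculated saving: residual from annual income)} \\
S_i^{\text{rep}}  &= \max\{0,\; S_i^{*}\}
  && \text{(reported saving: left-censored at zero)}
\end{align*}
```

Under this reading, a Tobit specification treats the spike at zero as censoring of $S_i^{*}$, which is one way to see why its slope coefficients are comparable to OLS slopes estimated on calculated saving.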
Table 1 presents mean income, mean consumption and the variance of expenditure relative to
the variance of income for workers in our dataset who report information on annual saving directly
(grouped by age, by skill and by industry). Similar information is shown in Table 2 for those workers
who directly report income and expenditures (i.e., for those whose savings have been calculated by us).
Both tables clearly indicate the tendency for variability in family income to exceed variability in
expenditure, regardless of classification. In the subsample for which calculated saving is available
(Table 2) the cell variance of consumption is generally at least 30 percent less than that of income;
where saving is reported directly (Table 1), the difference is somewhat smaller -- expenditures vary
about 20 percent less than incomes, on average.
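As a rough illustration of the statistic reported in Tables 1 and 2, the sketch below computes the variance of expenditure relative to the variance of income for one cell. The four income and expenditure pairs are hypothetical values chosen for the example, not survey data, and the variable names are ours.

```python
from statistics import pvariance

# Hypothetical (annual income, annual expenditure) pairs, in dollars,
# for families in a single age/skill/industry cell. Not survey data.
cell = [(400.0, 390.0), (500.0, 470.0), (600.0, 540.0), (700.0, 600.0)]

incomes = [income for income, _ in cell]
spending = [expenditure for _, expenditure in cell]

# Ratio below one means expenditure varies less than income within the cell.
variance_ratio = pvariance(spending) / pvariance(incomes)
print(round(variance_ratio, 3))
```

Because the same number of families enters both variances, the ratio does not depend on whether population or sample variances are used.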
An interesting pattern follows from comparing mean income and consumption between savers
and nonsavers within each cell. In virtually every cell of Tables 1 and 2, savers earn higher incomes
than nonsavers, but average consumption levels between the two groups are extremely similar. The
few cells in which consumption differences are relatively large all suffer from relatively small cell
sizes. Among families in the calculated saving subsample, mean consumption levels differ only by 8
percent between savers and nonsavers, but income levels are about 34 percent greater among savers
on average. This striking result suggests that saving might have responded largely to income
"surprises" among these families, thereby motivating the empirical strategies to be described next.
II. Estimating Predictable and Unexpected Components of Income from Annual
A key implication from theories of household saving based on intertemporal optimization is
that marginal propensities to save (mps; or to consume, mpc) differ by composition of income. The
empirical strategy for examining saving behavior employed in this paper involves making explicit
comparisons of the marginal propensities to save out of predicted and unexpected (transitory) income
to discern motives for saving. “Keynesian-saving” families, whose spending simply is a function of
current income, ought to have marginal propensities to save that apply equally to all components of
annual income. On the other hand, if families are guided by a “certainty equivalence” decision rule
(or the “permanent income hypothesis,” in Deaton’s (1992) terminology), then
predictable differences in income levels will not affect observed saving levels, but unexpected income
shocks will affect saving decisions dollar-for-dollar.12 Finally, recent theoretical models based on
intertemporal optimization with unpredictable family incomes and “prudence” (a characteristic of
household utility; see Carroll, 1997; or Deaton, 1992) or liquidity constraints (Deaton, 1991) imply
small, but nonzero, marginal propensities to save out of predictable incomes, and marginal propensities
close to one out of unexpected income.13
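The three cases above can be summarized as restrictions on a single saving equation. The symbols are assumptions introduced for this sketch: $S_i$ is annual saving, $Y_i^{P}$ and $Y_i^{U}$ are the predictable and unexpected components of income, and $\beta_P$ and $\beta_U$ are the corresponding marginal propensities to save.

```latex
\begin{align*}
S_i &= \alpha + \beta_P\, Y_i^{P} + \beta_U\, Y_i^{U} + \varepsilon_i \\[4pt]
\text{Keynesian saving:} \quad & \beta_P = \beta_U \\
\text{Certainty equivalence:} \quad & \beta_P = 0, \quad \beta_U = 1 \\
\text{Prudence or liquidity constraints:} \quad & 0 < \beta_P \ll 1, \quad \beta_U \approx 1
\end{align*}
```

Comparing estimates of $\beta_P$ and $\beta_U$ against these benchmarks is what allows the saving regressions to discriminate among the competing motives.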
This approach to econometric analysis of household saving behavior thus requires realized
family income each year to be decomposed into its predictable and unexpected parts. Following
Paxson (1992), rather than estimating transitory income simply as a residual from a regression
equation for annual income on predictable family and worker characteristics, we use survey
information to measure it more directly. In her study of saving behavior among Thai rice farmers,
Paxson uses deviations in rainfall from historical averages to measure unexpected income shocks for
small geographic regions in Thailand. Our application focuses on shocks to time spent out of work
among primary earners in working-class families. Unexpected days lost from work would not have
directly affected family expenditures, but would have translated into important income shocks to
families in our database, which should flow into changes in savings levels, according to modern theory.
A deviation in reported workdays lost from its predicted value, therefore, provides a measure of
unexpected income shock realized by each family in our database, in a fashion analogous to that
produced by variation in annual rainfall among Paxson’s sample of Thai rice farmers.
Our measure of time out of work, which we call “workdays lost”, actually measures
nonemployment (deviations from full-time employment) over the course of the entire survey year, as
reported by each respondent and possibly occurring for a variety of reasons (low-frequency job loss,
high-frequency inabilities to find work, accidents, sicknesses, etc.). Column 1 of Table 3 summarizes
the distribution of annual workdays lost among different categories of respondents. The average
worker missed 37 days of work during the previous year; the median length of time lost being 18 days.
Twenty-five percent of respondents report not missing any work during the previous year; another
twenty-five percent report missing more than fifty workdays last year. Clearly, substantial variation
12 A concise, formal derivation for optimal consumption and saving rules under certainty equivalence, or
the permanent income hypothesis, can be found in Pistaferri (1998); Paxson (1992) includes an informal
discussion of similar results.
13 This statement is made somewhat speculatively. We are not aware of regressions based on simulated
buffer stock or precautionary saving models (other than some of our own) that specifically highlight
theory-based coefficients directly comparable to those we estimate in this paper. In other related work, we
have begun to examine these using contemporary model specifications calibrated to describe our historical
environment (James, Palumbo and Thomas, 1999).
in workdays lost exists in the microdata. In this paper, we essentially ask how variation in lost
workdays contributes to variation in family or group saving decisions.
Decomposing annual income into its predictable and unexpected parts requires us first to
decompose annual workdays lost for each survey respondent.14 We first estimate a regression for
annual workdays lost as a function of each respondent’s age, state of residence interacted with survey
year, and skill category interacted with industry of employment.15 We use the regression-fitted value
to estimate the predictable number of workdays lost for each observation, and the regression residual
to estimate the unexpected shock to employment experienced by each worker during the previous year.
Our specification is quite flexible and, because all the explanatory variables are categorical indicators,
the regression effectively defines the predictable number of workdays lost to be the average among all
workers of a particular type. We assign workers to types, or cells, defined by four age categories,
twenty-one survey groups (state-by-survey year combinations) and twenty-two occupational skill-by-
industry cells.16 The adjusted R² of the regression equation for reported workdays lost during
the survey year on the categorical indicator variables for worker type, which is estimated using all
32,150 observations in the sample, is 0.12. Rather than report all the estimated regression coefficients,
which are cumbersome to interpret, column (1) of Table 3 shows average workdays lost during the
previous year by some of the categories of worker types. The regression results show some clear
differences in average workdays lost among different groups of workers.
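The decomposition described above can be sketched in a few lines. This is a simplified illustration, not the estimation code behind the paper: it treats every combination of age group, survey and skill-by-industry category as one fully interacted cell, so the predictable part is exactly the cell average, whereas the regression in the text enters the three sets of indicators additively. The records, labels and function name are hypothetical.

```python
from collections import defaultdict

# Hypothetical records: (age_group, survey, skill_by_industry, workdays_lost).
workers = [
    ("25-34", "KS-1895", "skilled-iron", 10),
    ("25-34", "KS-1895", "skilled-iron", 30),
    ("25-34", "KS-1895", "skilled-iron", 50),
    ("35-44", "MI-1896", "unskilled-vehicles", 0),
    ("35-44", "MI-1896", "unskilled-vehicles", 60),
]

def decompose_workdays(records):
    """Return (predictable, unexpected) workdays lost for each record.

    Predictable days are the average within the worker's cell;
    unexpected days are the worker's deviation from that average.
    """
    totals = defaultdict(lambda: [0.0, 0])  # cell -> [sum of days, count]
    for age, survey, skill_ind, days in records:
        cell = (age, survey, skill_ind)
        totals[cell][0] += days
        totals[cell][1] += 1

    result = []
    for age, survey, skill_ind, days in records:
        total, count = totals[(age, survey, skill_ind)]
        predictable = total / count
        result.append((predictable, days - predictable))
    return result

decomposed = decompose_workdays(workers)
for predictable, unexpected in decomposed:
    print(predictable, unexpected)
```

By construction the unexpected component averages zero within each cell, which is the property the text relies on when it interprets the residual as a shock.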
The unexpected component of annual workdays lost is defined as each individual survey
respondent's deviation in reported workdays lost during the previous year from the average workdays
lost among members of his type. The second and third columns of Table 3 show estimated dispersion
in unexpected workdays lost during the previous year obtained from this procedure, as measured by
14 The appendix to the paper includes an algebraic representation of the model described verbally here.
15 We experimented with numerous alternative regression specifications, some of which include a
national, time-varying business cycle index (and its interactions with other variables) instead of year
dummies. This paper’s results are based on a parsimonious specification described in the text. The
alternative models produce very similar results.
16 For comparison, Paxson’s “cells” are defined for individual farmers by their geographic proximity to
the nearest of many weather stations located throughout Thailand. Then, it is as if Paxson has a long
time-series of rainfall data on which she estimates a regression of annual rainfall on weather station
dummy variables. The regression residuals then provide her measure of unexpected annual rainfall,
which leads to an estimate of unexpected farm income for each sample observation, following the
procedure we describe next.
the interquartile range by (some of the) worker types. Note that, by construction, unexpected
workdays lost must average zero for each worker type.17 Figure 3 shows the density and distribution
functions for unexpected workdays lost realized by workers in our sample estimated according to these
Having in hand an estimate of unexpected workdays lost, we now must translate that variable
into a measure of unexpected income during the previous year for each family in the sample. This
involves estimating a regression for annual family income on unexpected workdays lost by its primary
wage-earner during the previous year and the same set of explanatory variables used in the workdays
lost equation to measure predictable income movements. The income regression, again estimated using
the entire sample of 32,150 families, yields an adjusted R² equal to 0.43.18 Then, unexpected annual
family income is calculated as the product of estimated unexpected workdays lost during the previous
year and its estimated coefficient from the family income regression (-1.7214 with a t statistic of
-78.40). It seems noteworthy that our procedure produces an estimated “price” for the primary
wage-earner missing a day of work extremely close to the average daily wage directly reported in the
surveys, $1.92. Missing workdays experienced by primary wage-earners likely represented an
important exposure to economic risk among these families -- wages lost by the primary wage-earners
do not appear to have been readily made up through additional wages earned by other family members
or through other sources of family income.
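A minimal sketch of this second step follows, again under the simplifying assumption that worker types are fully interacted cells, so that regressing on cell indicators is equivalent to demeaning within cells. The data, cell labels and function names are hypothetical, and the resulting slope of -2 dollars per day is an artifact of the made-up numbers, not an estimate.

```python
from collections import defaultdict

def demean_by_cell(cells, values):
    """Subtract each cell's mean from the values belonging to that cell."""
    sums, counts = defaultdict(float), defaultdict(int)
    for cell, value in zip(cells, values):
        sums[cell] += value
        counts[cell] += 1
    return [v - sums[c] / counts[c] for c, v in zip(cells, values)]

def within_slope(cells, x, y):
    """OLS slope on x when y is regressed on x plus a full set of cell dummies."""
    xd, yd = demean_by_cell(cells, x), demean_by_cell(cells, y)
    return sum(a * b for a, b in zip(xd, yd)) / sum(a * a for a in xd)

# Hypothetical data: worker cell, workdays lost, annual family income ($).
cells = ["A", "A", "A", "B", "B"]
workdays = [10, 30, 50, 0, 60]
income = [540.0, 500.0, 460.0, 400.0, 280.0]

# Step 1: unexpected workdays lost are deviations from the cell average.
unexpected_days = demean_by_cell(cells, workdays)

# Step 2: the income regression gives the dollar "price" of a lost workday.
slope = within_slope(cells, unexpected_days, income)

# Same cell dummies in both equations: using actual workdays gives the same slope.
slope_actual = within_slope(cells, workdays, income)

# Step 3: split each family's income into unexpected and predictable parts.
unexpected_income = [slope * u for u in unexpected_days]
predictable_income = [y - d for y, d in zip(income, demean_by_cell(cells, income))]

print(slope, slope_actual, unexpected_income[0], predictable_income[0])
```

The equality of `slope` and `slope_actual` illustrates the point made in the paper's note on the income regression: when the same indicators appear in both equations, the coefficient on the residual and the coefficient on actual workdays lost coincide.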
Predictable family income for each sample household is estimated using the fitted value from
the age-, survey- and skill-by-industry indicator variables (and their estimated coefficients) in the
family income regression. Column (1) of Table 4 reports average family income, which equals average
predicted income by definition, across some of the worker types defined in the regression. Unexpected
income for each family is estimated as the product of the regression coefficient -$1.72 and each
worker’s unexpected workdays lost during the previous year. The distribution of unexpected income
17 For parsimony, we construct Tables 3 and 4 (discussed next) by collapsing the 22 skill-by-industry
cells down into four skill categories and five industries. Also, we omit summary statistics by state and
year categories simply to conserve space.
18 Including unexpected workdays lost along with the 44 cell-indicator variables in the annual family
income regression produces an R² equal to 0.43. Omitting unexpected workdays lost reduces the
explained variation in family income to 0.32. Finally, note that we include the same cell indicator
variables in the workdays lost regression and in the annual income regression. This means that the
estimated coefficient on unexpected workdays lost in the income equation is identical to the coefficient
we would estimate on actual workdays lost (not the residual from a first-stage regression) if we included
that variable in the income equation instead.