Measuring Food Security in the United States

Household Food Security in the United States in 1995
Technical Report of the Food Security Measurement Project

United States Department of Agriculture
Food and Consumer Service
Office of Analysis and Evaluation

September 1997

Prepared for:
Gary W. Bickel, Project Officer
U.S. Department of Agriculture
Food and Consumer Service
3101 Park Center Drive
Alexandria, VA 22302
under contract no. 53-3198-5-028

Prepared by:
William L. Hamilton, Project Director (a)
John T. Cook, Principal Investigator (b)
William W. Thompson (a)
Lawrence F. Buron (a)
Edward A. Frongillo, Jr. (c)
Christine M. Olson (c)
Cheryl A. Wehler (d)

(a) Abt Associates, Inc.
(b) Tufts University Center on Hunger, Poverty, and Nutrition Policy
(c) Cornell University Division of Nutritional Sciences
(d) C.A.W. and Associates

TABLE OF CONTENTS

Chapter One    INTRODUCTION

Chapter Two    METHODS AND RESULTS OF FITTING LINEAR AND NON-LINEAR FACTOR ANALYSIS MODELS TO CPS DATA
    2.1 Preliminary Linear Factor Analysis
    2.2 Exploratory Two-Parameter Non-linear Factor Analysis Model
    2.3 Unidimensional One-Parameter Non-linear Factor Analysis Models
    2.4 Summary

Chapter Three  RELIABILITY ESTIMATES FOR THE FOOD SECURITY SCALES
    3.1 Spearman-Brown Split-half Reliability Estimates
    3.2 Rulon's Split-Half Reliability Estimates
    3.3 Cronbach's Alpha Reliability Estimates
    3.4 Rasch Model Reliability Estimates
    3.5 Reliability in Identifying Cases with No Food Insecurity Problems
    3.6 Summary

Chapter Four   DEFINING RANGES OF THE FOOD SECURITY SCALE
    4.1 Conceptual Basis for a Categorical Food Security Status Variable
    4.2 Defining Ranges and Selecting Scale Cutpoints
    4.3 Evidence of Food Insecurity
    4.4 Subjective Reporting of Hunger
    4.5 Evidence of Child Hunger and Severe Adult Hunger
    4.6 Summary

Chapter Five   THE RESOURCE AUGMENTATION QUESTIONS
    5.1 Two Dimensions of Food Insecurity
    5.2 The Composite Resource Augmentation Index
    5.3 Effects of Using the Composite Resource Augmentation Index
    5.4 Summary

Chapter Six    EXTERNAL CONSTRUCT VALIDATION OF THE FOOD SECURITY MEASURES
    6.1 Relationship of Construct Validation Items to Food Security
    6.2 Weekly Food Expenditures per Household Member
    6.3 Household Income
    6.4 Food Sufficiency
    6.5 Summary

Chapter Seven  PROCEDURES FOR CALCULATING STANDARD ERRORS FOR FOOD SECURITY PREVALENCE ESTIMATES
    7.1 CPS Sample Design
    7.2 Adjustment Factor for Between-PSU Variance
    7.3 Estimation of Within-PSU Variance
    7.4 Calculation of the Standard Errors

Chapter Eight  POTENTIAL SOURCES OF BIAS IN PREVALENCE ESTIMATES
    8.1 Screening Bias
    8.2 Response Bias
    8.3 Random Error in Survey Responses
    8.4 Summary

REFERENCES

Appendix A     REVIEW OF LITERATURE FROM PHYSIOLOGY AND CLINICAL NUTRITION RESEARCH ADDRESSING THE NATURE OF HUNGER
Appendix B     PREVALENCE OF HOUSEHOLD FOOD SECURITY STATUS (30-DAY SCALE)
Appendix C     PARTICIPANTS IN FEDERAL INTERAGENCY WORKING GROUP FOR FOOD SECURITY MEASUREMENT

CHAPTER ONE
INTRODUCTION

In April 1995, the U.S. Bureau of the Census conducted the first collection of comprehensive food security data as a supplement to its regular Current Population Survey (CPS). With about 45,000 household interviews, this survey is the first to collect the special data needed to measure food insecurity and hunger in a nationally representative sample of U.S. households.

The Food and Consumer Service (FCS) of the U.S. Department of Agriculture led the effort to develop the Food Security Supplement to the CPS, building on research conducted at universities and elsewhere over the past decade. After the survey was conducted, the next step was to analyze the data to create measurement scales that gauge the severity of households' food insecurity and hunger. FCS contracted with Abt Associates Inc. and three subcontractors (the Tufts University Center on Hunger, Poverty, and Nutrition Policy; the Cornell University Division of Nutritional Sciences; and CAW and Associates) to carry out the scale construction analysis.
The results of that analysis are presented in Household Food Security in the United States in 1995: Summary Report of the Food Security Measurement Project, to which this report is a companion volume. The purpose of this report is to describe the analyses through which the food security scales and food security status variable were developed, as well as related tests of the reliability and validity of these measures.

Two scales were developed to measure the degree of food insecurity and hunger in American households. One measures food insecurity and hunger over the 12 months prior to the survey interview, and the second measures these conditions in the 30 days immediately preceding the interview. After a number of exploratory analyses, a type of non-linear factor analysis known as a Rasch model was used to form the scales. This methodology and the procedures through which it was applied are described in Chapter Two.

The two scales were subjected to a variety of tests of reliability, including tests specific to the Rasch model and more traditional tests commonly used with scales developed through linear factor analysis. The results, presented in Chapter Three, generally indicate good reliability for the 12-month scale. The 30-day scale, because it is based on a smaller number of questions and provides detailed measurement for a narrower portion of the food insecurity spectrum, has somewhat lower reliability.

The two scales serve as the basis for defining two corresponding food security status variables. The 12-month variable has four categories: (1) Food Secure; (2) Food Insecure with No Hunger Evident; (3) Food Insecure with Moderate Hunger Evident; and (4) Food Insecure with Severe Hunger Evident. The 30-day variable has three categories: (1) No Hunger Evident; (2) Food Insecure with Moderate Hunger Evident; and (3) Food Insecure with Severe Hunger Evident.
To classify households into the various categories, it was necessary to define ranges on the 12-month and 30-day scales that correspond to each category. The rationale for the range definitions is described in Chapter Four.

The food security scale and the food security status indicator represent a central dimension of food insecurity: availability of enough food for the household to meet basic needs. The concept of food insecurity has other dimensions, however, including the specification that households should be able to acquire food in socially acceptable ways. Because the CPS Supplement includes several indicators of "coping" or "resource augmentation" behaviors related to this dimension of food insecurity, the study explored the possibility of supplementing the primary food security scale with an index of resource augmentation actions. The analysis, described in Chapter Five, suggests that such an index should not be used in classifying households' food security status at this time.

A key question for any new scale is how accurately it represents the condition it attempts to measure. Ideally, one would compare the food security scales and status variables to some more definitive measure or measures of food insecurity and hunger. Because no such definitive measure exists, the best way to judge the measure is to assess its relationship to other measures thought to be related to food insecurity and hunger, such as the household's level of food expenditures or its total income. Chapter Six presents the results of such analyses, which show relationships of the sort that would be expected with a valid measure of food insecurity and hunger.

The central purpose of the food security scales and the status variables is to assess the food security of the U.S. population and of subgroups within the population. Estimates of the prevalence of food insecurity and hunger are presented in the study's main report, based on the April 1995 data. Because these data come from a sample of households, prevalence estimates are subject to sampling error, and the report therefore presents estimated standard errors corresponding to the estimated prevalences. The estimation of standard errors is complicated by the multi-stage sampling design used by the CPS. Chapter Seven describes the methodology used in the estimation of standard errors.

Finally, Chapter Eight discusses the potential sources of bias in prevalence estimates that might result from the sample design of the CPS, from household response behaviors to the Food Security Supplement, and from the fact that only a small proportion of the population experiences food insecurity. The analysis indicates that the various potential sources of bias probably lead to quite small levels of estimation error in counterbalancing directions.

CHAPTER TWO
METHODS AND RESULTS OF SCALING ANALYSIS OF CPS DATA

This section describes the rationale for and the results of conducting preliminary linear factor analyses and subsequently fitting a series of non-linear factor analysis models to the CPS food security data. The non-linear approach characterizes the covariation among items in the CPS data set more accurately than more traditional linear factor analysis models. Most items available for analysis in the CPS data set were severely skewed and dichotomous or categorical in nature. Applying linear factor analysis methods to the CPS items therefore violated a number of statistical assumptions, such as the assumption of normally distributed error variance. Such situations can be dealt with more appropriately using non-linear scaling techniques.
Item Response Theory (IRT) describes a general model that was developed by the educational testing industry to assist in creating valid and reliable aptitude tests, such as the Scholastic Aptitude Test (SAT) and the American College Testing Program (ACT) test. When applying a particular IRT model to data, the test designer usually assumes that the responses to a set of items can be accounted for by latent traits or factors that are fewer in number than the test items. The primary goal is to determine how an individual with a certain ability level will respond to an item associated with a particular difficulty level.

The IRT model can take a number of alternative forms, depending on the assumptions regarding how the underlying data were generated. The three IRT models most frequently discussed in the literature are (1) the three-parameter logistic model, (2) the two-parameter logistic model, and (3) the one-parameter logistic model.

The three-parameter logistic IRT model is the most complex, and can include varying discrimination parameters, varying difficulty levels, and varying guessing parameters. Using the notation of Hambleton (1983),1 the three-parameter logistic model can be written as follows:

\[ P_i(\theta_n) = c_i + (1 - c_i)\,\frac{e^{D a_i (\theta_n - b_i)}}{1 + e^{D a_i (\theta_n - b_i)}} \tag{1} \]

where
    $\theta_n$ = latent trait score of person n,
    $a_i$ = item discrimination parameter for item i,
    $b_i$ = item difficulty for item i,
    $c_i$ = guessing parameter for item i,
    $D$ = scaling constant,
    n = person, and
    i = item.

1 Hambleton, R.K. (ed.). Application of Item Response Theory. Vancouver: Educational Research Institute of British Columbia, 1983.

The two-parameter logistic model assumes that guessing does not occur, and therefore the guessing term is dropped from the model.
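As an illustration only, the three-parameter logistic response function described above can be sketched in Python. The function name `three_pl` and the default value of the scaling constant `D` are assumptions made for this sketch and are not taken from the report. Setting the guessing parameter to zero gives the two-parameter case.

```python
import math


def three_pl(theta, a, b, c, D=1.0):
    """Probability of a positive response under a three-parameter logistic model.

    theta: latent trait score of the person
    a:     item discrimination parameter
    b:     item difficulty
    c:     guessing parameter (c = 0 gives the two-parameter model)
    D:     scaling constant (the default of 1.0 is an assumption of this sketch)
    """
    z = D * a * (theta - b)
    # e^z / (1 + e^z) written in the algebraically equivalent form 1 / (1 + e^-z)
    return c + (1.0 - c) / (1.0 + math.exp(-z))


# When the trait score equals the item difficulty, the logistic term is 0.5,
# so the response probability is c + (1 - c) / 2.
print(three_pl(theta=1.0, a=2.0, b=1.0, c=0.2))
print(three_pl(theta=1.0, a=2.0, b=1.0, c=0.0))
```

With a nonzero guessing parameter the curve never falls below `c`, which is why the parameter is dropped when guessing is not plausible for the items in question.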
The two-parameter logistic model can be expressed as follows:

\[ P_i(\theta_n) = \frac{e^{D a_i (\theta_n - b_i)}}{1 + e^{D a_i (\theta_n - b_i)}} \tag{2} \]

where
    $\theta_n$ = latent trait score of person n,
    $a_i$ = item discrimination parameter for item i,
    $b_i$ = item difficulty for item i,
    n = person, and
    i = item.

Finally, the one-parameter logistic model is more straightforward than the two previous models, because the model (1) has no guessing parameters, and (2) specifies that all items have the same discrimination parameter (a). That is, the slopes of the item-characteristic curves are constrained to be equal for all items. The model can be written as follows:

\[ P_i(\theta_n) = \frac{e^{D a (\theta_n - b_i)}}{1 + e^{D a (\theta_n - b_i)}} \tag{3} \]

where
    $\theta_n$ = latent trait score of person n,
    $a$ = average item discrimination parameter,
    $b_i$ = item difficulty for item i,
    n = person, and
    i = item.

Because D and a are constants in the model, the one-parameter logistic model can be written in a more simplified form:

\[ P_i(\theta_n) = \frac{e^{(\theta_n - b_i)}}{1 + e^{(\theta_n - b_i)}} \tag{3} \]

We can also express this model using the notation of Wright and Masters (1982), and include a threshold parameter that is associated with the rating scale model developed by Andrich (1978, 1979):

\[ P_{nik} = \frac{e^{\beta_n - (\delta_i + \tau_k)}}{1 + e^{\beta_n - (\delta_i + \tau_k)}} \tag{4} \]

where
    $\beta_n$ = latent trait score of person n,
    $\delta_i$ = item difficulty for item i,
    $\tau_k$ = threshold parameter for step k of item i,
    n = person,
    i = item, and
    k = step.

2.1 PRELIMINARY LINEAR FACTOR ANALYSIS

The CPS Food Security Supplement builds on a substantial amount of recent research on the measurement of food insecurity, some of which included scaling analysis.2 The first analytic step was to replicate some of the prior analyses to determine whether the general patterns and relationships in the data were similar to those seen in prior work. A series of linear factor analyses were fit to the CPS data.
One illustrative model, summarized in Exhibit 2-1, was fit for households with children (because this group was asked all questions in the Supplement). The factor model incorporated a Procrustes rotation, which allows one to rotate to a pre-specified factor solution, where the solution was specified to represent the dominant themes of the prior research. Fitting the factor analysis model resulted in three factors with eigenvalues greater than 1.0 prior to rotation (15.0, 1.6, and 1.4), with factor loadings as shown in the exhibit. The first factor includes primarily items related to child food intake reductions and hunger, the second consists mainly of household-level food insecurity items, and the third comprises mainly items related to adult food intake reduction and hunger.

In sum, the results generally confirmed that the response patterns in the CPS data were similar to those seen in prior research and that similar relationships might be expected to exist. In addition, the large positive factor intercorrelations suggested the possibility that non-linear factor analysis methods might result in the items loading onto a single factor (i.e., that the separation of factors could occur in part because of the limitations of linear factor analysis in handling low-frequency dichotomous items). Finally, exploratory analyses of groups of households without children suggested that, for those items applicable to all groups, the factors might be relatively invariant across groups.

2 Two key prior studies are Olson, Frongillo, and Kendall (1995), and Scott, Wehler, and Anderson (1995). The first study estimated a factor analysis model including four items from the Community Childhood Hunger Identification Project (CCHIP) and ten items from two previous Cornell surveys. The analysis identified two key factors, one associated with household-level food insecurity and one associated with hunger. The second study, analyzing data from multiple CCHIP studies, found a first factor comprising mainly household-level food insecurity items and adult hunger items, whereas the second factor included mainly child hunger items.

Exhibit 2-1
SUMMARY OF FACTOR LOADINGS FOR LINEAR FACTOR ANALYSIS MODEL (n=2,991)

Standardized Regression Coefficients (F1, F2, F3)

Item   Coefficient
Q11    38
Q15    59
Q16    63
Q20    52
Q24    45
Q28    52
Q32    47
Q35    48
Q38    43
Q40    50
Q43    42
Q47    60
Q50    40
Q53    78
Q54
Q55    78
Q56    73
Q57    49
Q58    75

2.2 EXPLORATORY TWO-PARAMETER NON-LINEAR FACTOR ANALYSIS MODEL

Initially, we fit a series of exploratory non-linear factor analysis models to determine the dimensionality of the Food Security Survey items.3 From these alternative models, we selected one representative non-linear model, labeled M121, which best describes the consistent findings across the various alternative models. M121 was fit as a two-parameter logistic model that included estimates for both factor loadings (discrimination parameters) and uniquenesses (error term).4 Descriptive statistics for the subsample of 994 subjects and 21 items are presented in Exhibit 2-2. The items ranged in proportion of positive responses from .850 (item 15) to .004 (item 50), where the higher the proportion, the lower the severity of food insecurity indicated by the particular item.

The results of the non-linear factor analysis model are presented in Exhibit 2-3. The primary fit statistic, the root mean square residual (RMSR), suggested that the one-factor model adequately fit the data (RMSR = .0074). That is, the RMSR was well within the acceptable range with a single factor, and was not materially improved by adding further factors, making the single-factor model the most parsimonious solution.
As with the linear factor analysis model, items 15 and 23 were poor-fitting, with low factor loadings (.31 and .22, respectively). Item 22 had a moderately positive factor loading (L = .43), whereas the rest of the items all had large positive loadings above .50. The findings support the linear factor analysis results with respect to item fits, but suggest that items 15 and 23 should be removed from subsequent models.

3 Exploratory non-linear factor analysis models were fit using two software packages: LISCOMP and NOHARM. LISCOMP is a structural equation modeling program that is designed to work with dichotomous and/or ordinal data. NOHARM is a non-linear factor analysis program that analyzes moment matrices. Both programs allow one to fit a two-parameter item response theory model (non-linear factor analysis model) to the data. Exploratory analysis focused on households with children in random 25 percent subsamples of the Food Security Supplement sample. Households that did not pass the series of screening questions (i.e., higher-income households with no indication of food insecurity), and consequently were not asked the full series of food insecurity and hunger questions, were excluded from the analysis.

4 The two-parameter model can be fit with either item difficulty or uniqueness as the second parameter. The specification shown here chose the uniqueness parameter.
Exhibit 2-2
DESCRIPTIVE STATISTICS FOR MODEL M121

Variable    Mean    Std    Sum
Q11         .231   .421    231
Q15         .850   .356    850
Q16         .450   .497    450
Q18         .325   .468    325
Q19         .095   .293     95
Q20         .274   .446    274
Q21         .585   .492    585
Q22         .122   .327    122
Q23         .016   .125     16
Q24         .244   .429    244
Q28         .054   .226     54
Q32         .233   .423    233
Q35         .123   .328    123
Q38         .047   .211     47
Q40         .048   .213     48
Q43         .023   .150     23
Q47         .049   .216     49
Q50         .004   .063      4
Q53         .600   .490    600
Q54         .434   .495    434
Q55         .398   .489    398
Q56         .267   .442    267
Q57         .137   .344    137
Q58         .377   .484    377

Exhibit 2-3
SUMMARY OF FACTOR LOADINGS FOR MODEL M121

Standardized Regression Coefficients (F1)

Item   F1   Item Label
Q11    70   General food sufficiency question
Q15    31   Try to make food or money go further
Q16    70   Run out of foods needed to make meal
Q18    56   Borrow food or money to make meal
Q19    68   Take child to other home for meal
Q20    73   Serve few low-cost foods several days in a row
Q21    51   Put off paying bills to buy food
Q22    43   Get emergency food from church or food bank
Q23    22   Eat meal at soup kitchen
Q24    89   Adults cut or skip meals because not enough money for food
Q28    79   Adults don't eat for whole day
Q32    88   Eat less than should because not enough money to buy food
Q35    85   Hungry but don't eat because can't afford to
Q38    75   Lost weight because not enough food
Q40    76   Child's meal size cut because not enough money for food
Q43    60   Child skip meal because not enough money for food
Q47    80   Child hungry but can't afford more food
Q50    71   Child did not eat for a whole day
Q53    79   Worry food will run out before getting money for more
Q54    89   Food doesn't last and don't have money to get more
Q55    88   Can't afford to eat balanced meals
Q56    85   Can't feed children a balanced meal
Q57    83   Child not eating enough because can't afford more food
Q58    82   Child fed only few low-cost foods, running out of money
2.3 UNIDIMENSIONAL ONE-PARAMETER NON-LINEAR FACTOR ANALYSIS MODELS

The exploratory non-linear factor analysis models indicated that the Food Security Survey items could be described efficiently as a unidimensional construct. Therefore, we pursued a specific non-linear factor model called the Rasch model. The Rasch model is a concise one-factor model that constrains the discrimination parameters (factor loadings) to be equal across all items. The statistical constraints of the Rasch model result in several desirable properties for the measurement scale, especially its robustness across multiple samples and multiple variations of the test (Wright and Masters, 1982). Furthermore, the preliminary exploratory models indicated that most of the items had very similar discrimination parameters when the discrimination parameters were allowed to vary.5

The computer program BIGSTEPS was designed specifically to fit the unidimensional Rasch model. All subsequent models described in this section were fit using BIGSTEPS.

Five alternative measurement models based on existing theoretical frameworks were generated for the Food Security Survey items. The five alternative models are summarized in Exhibit 2-4. For most of the models, the items were divided into two subsets based on the specific time frame that the items referenced. For models R101, R102, and R103, the first subset of items references behaviors and events that occurred in the last 12 months, whereas the second subset references behaviors and events that occurred in the last 30 days. Models were fit separately for the 12-month and 30-day time periods. A general summary of item fits for the alternative models is presented in Exhibit 2-5.

5 Note in Exhibit 2-3 that nearly all factor loadings fall in the fairly narrow range from 70 to 88. The questions with loadings substantially outside this range (Q15, Q18, Q21, Q22, Q23) are all ultimately excluded from the scale.

Exhibit 2-4
ALTERNATIVE NON-LINEAR FACTOR ANALYSIS MODELS

Model R101
    12-Month Scale: Scale includes items that referenced events that occurred in the last 12 months. Items 15, 16, 18, 19, 20, 21, 22, 23, 24, 25, 28, 29, 32, 35, 38, 40, 43, 44, 47, 50, 53b, 54b, 55b, 56b, 57b, 58b.
    30-Day Scale: Scale includes items that referenced events that occurred in the last 30 days. Items 17, 26, 27, 30, 31, 33, 34, 36, 37, 39, 41, 42, 45, 46, 48, 49, 51, 52.

Model R102
    12-Month Scale: Scale includes items that referenced events that occurred in the last 12 months, and excludes resource augmenting behaviors (18, 19, 21, 22, and 23). Items 15, 16, 20, 24, 25, 28, 29, 32, 35, 38, 40, 43, 44, 47, 50, 53b, 54b, 55b, 56b, 57b, 58b.
    30-Day Scale: Scale includes items that referenced events that occurred in the last 30 days, and excludes resource augmenting behaviors. Items 17, 26, 27, 30, 31, 33, 34, 36, 37, 39, 41, 42, 45, 46, 48, 49, 51, 52.

Model R103
    12-Month Scale: Scale includes food insecurity items based on the CCHIP model. Items 15, 18, 19, 20, 21, 22, 23, 53a, 55a, 56a, 58a.
    30-Day Scale: Scale includes food insufficiency and hunger items based on the CCHIP model. Items 16, 17, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 54a, 57a.

Model R104
    12-Month Scale: NA
    30-Day Scale: Scale includes items that reference events that occurred in the last 30 days. When no 30-day reference was available, items that referenced the last 12-month period are included. Items 15, 17, 18, 19, 20, 21, 22, 23, 26, 27, 30, 31, 33, 34, 36, 37, 39, 41, 42, 45, 46, 48, 49, 51, 52, 53a, 54a, 55a, 56a, 57a, 58a.

Model R105
    12-Month Scale: NA
    30-Day Scale: Scale includes items that referenced the 30-day period and number of days in the last month. Also includes items that reference "often true" in the last 12 months. Items 17, 26, 27, 30, 31, 33, 34, 36, 37, 39, 41, 42, 45, 46, 48, 49, 51, 52, 53a, 54a, 55a, 56a, 57a, 58a.

NOTES:
(1) For items that referenced number of days, one dummy code was created based on whether the behavior or experience occurred five or more times in the last month.
(2) For items that referenced number of months, one dummy code was created by combining the two more extreme categories of the variable, indicating the experience occurred in three or more of the past 12 months.
(3) For items Q53 through Q58, 'a' denotes a dummy code that represents 'often true,' whereas 'b' denotes a dummy code that combines 'sometimes true' and 'often true.'

Exhibit 2-5
SUMMARY OF RESULTS FROM ALTERNATIVE NON-LINEAR FACTOR ANALYSIS MODELS

         12-Month Scale                                 30-Day Scale
Model    Poorly Fitting Items     Redundant Items       Poorly Fitting Items   Redundant Items
R101     Q21, Q18, Q15, Q22       Q54b                  Q17                    No redundant items
R102     Q15, Q16, Q20            No redundant items    Q17                    No redundant items
R103     No poor fitting items    No redundant items    Q16, Q17, Q43          Q26
R104     NA                       NA                    Q22, Q23               Q33
R105     NA                       NA                    Q58a, Q17              No redundant items

The identification of poorly fitting items and/or redundant items is based on item in-fit and out-fit statistics. The out-fit statistic, $u_i$, is an unweighted fit statistic. It is based on a standardized residual, written as:

\[ z_{ni} = \frac{y_{ni}}{\sqrt{W_{ni}}} \]

where $y_{ni}$ is the score residual for household n on item i, and $W_{ni}$ is the variance of the score residual. The standardized residual is then squared and averaged over the N households to obtain a mean estimate of item fit:

\[ u_i = \frac{1}{N} \sum_{n=1}^{N} z_{ni}^2 \]

The in-fit statistic, $v_i$, is a weighted fit statistic that includes the same squared standardized residual as $u_i$, and is written as:

\[ v_i = \frac{\sum_{n=1}^{N} W_{ni}\, z_{ni}^2}{\sum_{n=1}^{N} W_{ni}} \]

Both the in-fit and out-fit statistics have an expected value of 1.0. As they deviate from 1.0, the associated items become candidates for removal from the scale.
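The unweighted (out-fit) and weighted (in-fit) mean squares described above can be sketched as follows for a single dichotomous item. This is a minimal illustration rather than the BIGSTEPS implementation: the function names are hypothetical, and the sketch assumes the dichotomous Rasch model, under which the expected response is the model probability p and the variance of the score residual is p(1 - p).

```python
import math


def rasch_probability(household_value, item_calibration):
    """Rasch probability of a positive response (dichotomous item)."""
    return 1.0 / (1.0 + math.exp(-(household_value - item_calibration)))


def item_fit(responses, household_values, item_calibration):
    """Return (in-fit, out-fit) mean squares for one item.

    responses:        0/1 answers to the item, one per household
    household_values: estimated scale value of each household
    """
    squared_std_residuals = []
    variances = []
    for x, value in zip(responses, household_values):
        p = rasch_probability(value, item_calibration)
        variance = p * (1.0 - p)      # variance of the score residual (assumed binary item)
        residual = x - p              # score residual
        squared_std_residuals.append(residual ** 2 / variance)
        variances.append(variance)

    # Out-fit: plain average of the squared standardized residuals.
    out_fit = sum(squared_std_residuals) / len(squared_std_residuals)
    # In-fit: the same squared residuals, weighted by their variances.
    in_fit = sum(w * z2 for w, z2 in zip(variances, squared_std_residuals)) / sum(variances)
    return in_fit, out_fit


# Hypothetical data: one household far above the item's calibration answers "no".
in_fit, out_fit = item_fit([0, 1], [3.0, 0.0], 0.0)
print(in_fit, out_fit)
```

In this hypothetical example the unexpected "no" from a household far from the item's calibration inflates the out-fit statistic more than the in-fit statistic, because the variance weights give less influence to households whose scale values lie far from the item.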
Generally speaking, a mean square fit statistic greater than 1.20 indicates a poorly fitting item, whereas a mean square fit statistic less than .80 indicates an item that is redundant with other similar types of items in the scale. Items that have both an in-fit and an out-fit statistic above 1.20 are targeted for removal from the scale. Items with both in-fit and out-fit statistics below .80 are redundant with respect to the information they share with other items in the scale. Redundant items were also considered for removal or for combination with other items. Below we focus on describing the results of the 12-month and 30-day scales for R102, because these two specific models were subsequently considered the most parsimonious by the study team.

12-Month Food Security Scale

As with the linear factor analysis models, all Rasch models were initially tested using only households with children, because they comprised the subsample of households that were administered the entire set of food security items. The results for Model R102 are presented in Exhibit 2-6. The summary table contains a large amount of information, briefly described below.

The order of items in the table is determined by their item calibration, shown in the fourth column of Exhibit 2-6. A question's item calibration represents the point on the scale at which there is a 50 percent probability that any given household will respond "yes" to the item. That is, households with higher values on the scale than a particular item's calibration score have a greater than 50 percent probability of answering that item positively; households with lower values have a less than 50 percent probability of a positive response to the item in question. The items are listed from high calibration at the top of the table to low calibration at the bottom.
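The 50 percent interpretation of an item calibration can be illustrated with a short sketch based on the simplified one-parameter logistic form given earlier in this chapter. The function name and the numeric values are hypothetical and were chosen only for illustration; they are not estimates from the report.

```python
import math


def prob_affirmative(household_value, item_calibration):
    """Rasch probability that a household answers "yes" to an item."""
    return 1.0 / (1.0 + math.exp(-(household_value - item_calibration)))


# Hypothetical item calibration and household scale values.
item_calibration = 2.0
for household_value in (1.0, 2.0, 3.0):
    p = prob_affirmative(household_value, item_calibration)
    print(f"household value {household_value:+.1f}: P(yes) = {p:.3f}")
```

A household located exactly at the item's calibration has a probability of one half, and households equally far above and below the calibration have probabilities that sum to one.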
The item calibration is a function of (1) the total number of individuals that have responded to any item in the scale (1,687); (2) the number of individuals that responded to the particular item in the scale (n); and (3) the number of positive responses to the particular item (raw score). For example, item 50 refers to the item "child did not eat for a whole day." The item has an item calibration of 4.56, which is the highest in the table. This event occurs rarely in any household. For this specific subsample, this event occurred for only 12 of the 1,684 households that responded to the item. At the other end of the scale, item 15 ("run short of money and try to make food or food money go further") is the least severe item included in the analysis. The item has the low calibration of -5.74, based on 1,469 positive responses out of the 1,686 households that answered the question.

The column headed "Real SE" shows the standard error of the items, which can be used to create a confidence interval for the item calibration. Items located at the severe end of the scale tend to have the largest standard errors, because they tend to have larger variances compared to items throughout the center and less-severe end of the distribution.

Exhibit 2-6
SUMMARY OF MODEL R102A

                  Raw         Item    Real     In-fit           Out-fit         Point Biserial
Item        n   Score  Calibration      SE    Mean Sq      Z    Mean Sq      Z           Corr.
Q50     1,684      12         4.81     .30        .99    0.0        .28   -0.4             .19
Q44     1,684      23         4.01     .22       1.00    0.0        .41   -0.5             .24
Q43     1,684      38         3.36     .18       1.04    0.3       1.73    0.5             .28
Q29     1,683      62         2.68     .14        .89    -1.        .28   -1.3             .39
Q40     1,683      86         2.21     .13       1.01    0.1       1.99    1.2             .40
Q47     1,684      89         2.15     .12        .88   -1.5        .56   -0.8             .44
Q38     1,683      91         2.12     .13       1.07    0.8        .46   -1.1             .40
Q28     1,684      95         2.06     .12        .95   -0.6        .41   -1.3             .44
Q35     1,685     212          .65     .09        .91   -1.6        .83   -0.6             .57
Q57     1,680     246          .36     .09       1.00    0.1        .60   -1.8             .57
Q25     1,683     293         -.01     .08        .94   -1.3        .56   -2.4             .61
Q32     1,683     442         -.98     .07        .94   -1.5        .67   -2.7             .64
Q24     1,685     449        -1.01     .07        .86   -3.5        .67   -2.8             .67
Q56     1,679     466        -1.12     .07       1.04    0.9        .75   -2.1             .61
Q20     1,686     480        -1.19     .08       1.24    5.5       1.50    3.5             .52
Q58     1,680     671        -2.18     .07        .99   -0.4        .96   -0.4             .60
Q55     1,678     706        -2.36     .07        .87   -3.6        .68   -3.5             .64
Q54     1,679     785        -2.73     .06        .82   -5.2        .74   -2.5             .64
Q16     1,687     795        -2.77     .07       1.23    5.9       1.22    1.9             .50
Q53     1,680   1,066        -4.01     .06        .95   -1.6        .85   -0.8             .49
Q15     1,686   1,469        -6.06     .09       1.31    6.7       7.70    5.5
Mean    1,683     408          .00     .11       1.00   -0.1       1.14   -0.6
SD          2     382         2.82     .06        .13    2.9       1.53    2.1

NOTE: Sample includes households with children only. Items are ordered in terms of severity.

For the 12-month scale presented in Exhibit 2-6, there are three items with both in-fit and out-fit statistics that exceed 1.20 (Q15, Q16, and Q20). Therefore, these three items were removed from the scale, and the model was re-estimated. The results of the revised model are presented in Exhibit 2-7. The effective sample size for the revised model is reduced (n = 1,276) because two of the least severe items were removed from the analysis. This results in fewer subjects who have responded yes to any particular item. For the revised model, there are no items with both in-fit and out-fit statistics that exceed 1.20. Similarly, there are no items with both in-fit and out-fit statistics below .80.
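The screening rule applied to Exhibit 2-6 can be expressed as a short sketch. The in-fit and out-fit mean squares below are transcribed from Exhibit 2-6; the function and variable names are hypothetical, and the thresholds of 1.20 and .80 are the ones stated in the text.

```python
# (in-fit mean square, out-fit mean square) for each item, from Exhibit 2-6.
FIT_STATISTICS = {
    "Q50": (0.99, 0.28), "Q44": (1.00, 0.41), "Q43": (1.04, 1.73),
    "Q29": (0.89, 0.28), "Q40": (1.01, 1.99), "Q47": (0.88, 0.56),
    "Q38": (1.07, 0.46), "Q28": (0.95, 0.41), "Q35": (0.91, 0.83),
    "Q57": (1.00, 0.60), "Q25": (0.94, 0.56), "Q32": (0.94, 0.67),
    "Q24": (0.86, 0.67), "Q56": (1.04, 0.75), "Q20": (1.24, 1.50),
    "Q58": (0.99, 0.96), "Q55": (0.87, 0.68), "Q54": (0.82, 0.74),
    "Q16": (1.23, 1.22), "Q53": (0.95, 0.85), "Q15": (1.31, 7.70),
}


def screen_items(fit_statistics, upper=1.20, lower=0.80):
    """Split items into poorly fitting and redundant lists.

    An item is flagged only when BOTH its in-fit and out-fit mean squares
    fall on the same side of the corresponding threshold.
    """
    poorly_fitting = [item for item, (in_fit, out_fit) in fit_statistics.items()
                      if in_fit > upper and out_fit > upper]
    redundant = [item for item, (in_fit, out_fit) in fit_statistics.items()
                 if in_fit < lower and out_fit < lower]
    return poorly_fitting, redundant


poorly_fitting, redundant = screen_items(FIT_STATISTICS)
print("Poorly fitting:", sorted(poorly_fitting))
print("Redundant:", sorted(redundant))
```

Requiring both statistics to cross a threshold explains why items such as Q43 and Q40, which have large out-fit values but in-fit values near 1.0, are retained.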
Some of the out-fit statistics were small, due primarily to dependencies in some item pairs. For example, item 29 has a low out-fit statistic (mean square = .36), but the item is associated with item 28. We examined several alternative models with these items modeled as trichotomies rather than as multiple dichotomies, but the basic results of the models did not change.

Exhibit 2-7
SUMMARY OF REVISED MODEL R102A

Item  n      Raw    Item         Real  In-fit   In-fit  Out-fit  Out-fit  Point Biserial
             Score  Calibration  SE    Mean Sq  Z       Mean Sq  Z        Corr.
Q50   1,275  12     4.38         .30   .96      -0.2    .32      -0.5     .21
Q44   1,275  23     3.59         .22   .99      -0.1    .50      -0.5     .25
Q43   1,275  38     2.93         .18   1.01     0.1     1.50     0.5      .29
Q29   1,274  62     2.26         .14   .90      -1.0    .36      -1.4     .40
Q40   1,274  86     1.77         .13   1.02     0.2     2.34     2.0      .39
Q47   1,275  89     1.72         .12   .88      -1.4    .70      -0.7     .45
Q38   1,274  91     1.69         .13   1.09     1.1     .65      -0.8     .39
Q28   1,275  95     1.63         .12   .96      -0.5    .52      -1.3     .44
Q35   1,276  212    .21          .09   .95      -0.9    1.09     0.4      .55
Q57   1,274  246    -.11         .09   .99      -0.2    .65      -2.1     .56
Q25   1,274  293    -.49         .08   .98      -0.4    .76      -1.6     .57
Q32   1,274  442    -1.53        .08   1.01     0.2     .99      -0.1     .57
Q24   1,276  449    -1.56        .08   .96      -1.0    1.01     0.1      .59
Q56   1,273  466    -1.68        .08   1.08     1.9     .97      -0.3     .54
Q58   1,274  671    -2.89        .08   1.11     2.6     1.28     2.1      .47
Q55   1,272  706    -3.09        .07   .94      -1.7    .84      -1.2     .53
Q54   1,273  785    -3.54        .07   .92      -2.2    .94      -0.4     .49
Q53   1,274  1,066  -5.28        .09   1.16     3.7     1.28     0.7      .23
Mean  1,274  324    .00          .12   .99      0.0     .93      -0.3
SD    1      303    2.70         .06   .07      1.5     .46      1.1

NOTE: Sample includes households with children only. Items are ordered in terms of severity.

Final 12-Month Food Security Scale

The analyses for the 12-month scale were replicated on subsequent subsamples of the data set.6 The model replications provided clear support for the invariance of the primary measurement model across subsamples, as well as across different types of households. In each replication, the item calibrations gave identical or near-identical rankings of item severity and consistent clustering of closely-ranked items. Applying models fit on separate subsamples yielded household values that correlated at the .99 level.7 The final model estimates are based upon all households in the analysis sample; these are presented in Exhibit 2-8. Of the 18,370 households that passed the screener and responded to at least half of the questions applicable to them, there were 7,897 households in which the respondent answered "yes" to at least one of the 12-month scale items.

6 The overall sample was initially divided into four random subsamples. Initial model estimation was carried out for households with children within one subsample. Tests for invariance were performed for households with children in the other three random subsamples. Invariance tests were also performed for households without children, subdividing them into households with any elderly members (age 60 or over) and households with no elderly members.

7 In this procedure, we separately fit the model to each subpopulation, such as households with children, households with no children but with elderly members, and households with neither children nor elderly. Each of the separate models was then used to compute scale values for all households in the full sample. The values computed with the different models were then compared through plotting and correlation analysis.

The ordering of the items in the final model changes slightly relative to the ordering of the items described in Exhibit 2-7; however, these minor fluctuations in item severities are expected with different random subsamples of households.8 Exhibit 2-9 shows the frequency distribution for the number of responses to items in the survey.
The two most frequent response patterns are 10 items and 18 items.9 The response pattern of 10 items applies largely to the households without children, because these had the opportunity to respond to a maximum of 10 items. The response pattern of 18 items applies to households with children, who had an opportunity to respond to 18 items. These two response patterns account for 98.8 percent of the households, indicating a very low incidence of item nonresponse (1.2 percent of all respondents). Households, whether with or without children, that responded to fewer than half the items administered had their household score set to "missing."

The central function of the Rasch model is to assign to each responding household a value on the food security scale. The household scale value is fundamentally based on a count of the number of affirmative responses to questions included in the scale. At its simplest, if all households respond to the same set of questions, the household scale value is a fixed transformation of the count of positive responses. For example, among households with children responding to all 18 questions in the scale, all households with three positive responses have a scale value of -4.13. Households with more affirmative responses have higher scale values; for example, households with children giving ten affirmative responses have a scale value of 0.62. The scale value does not depend on which questions the household answers affirmatively: all households with children who give three affirmative answers have the same scale value, even if they give affirmative answers to quite different questions.

8 The Rasch model software initially assigns scale values in a range that yields a mean of zero. Because the presence of positive and negative values in the scale can be confusing or misleading, it is conventional to transform the values into a range such as 0-1, 0-10, or 0-100. Values of the 12-month scale presented in other reports from this project transform the original scale values to range from 0.0 to 10.0. The original value is multiplied by .8333 and added to 5.071 to obtain the transformed value. All respondents giving zero affirmative responses are assigned a value of zero, and respondents answering all questions affirmatively get a value of 10.0.

9 Over half of all households in the sample were higher-income households that did not pass the screening questions, and therefore were not asked any of the questions included in the scales.

Exhibit 2-8
SUMMARY OF FINAL 12-MONTH SCALE

                                       In-fit   In-fit  Out-fit  Out-fit  Point     Transformed
Item  n      Raw    Item         Real  Mean Sq  Z       Mean Sq  Z        Biserial  Item
             Score  Calibration  SE                                       Corr.     Calibration*
Q50   4,333  29     4.92         .20   1.09     0.5     6.02     1.8      .18       9.2
Q44   4,331  87     3.48         .12   .84      -1.8    .28      -1.6     .34       8.0
Q43   4,332  135    2.86         .10   .88      -1.7    .78      -0.5     .37       7.5
Q29   7,889  332    2.55         .06   .89      -2.5    .55      -1.8     .35       7.2
Q47   4,333  257    1.88         .07   .93      -1.3    .97      -0.1     .44       6.6
Q28   7,892  537    1.82         .05   .97      -1.0    1.16     0.8      .39       6.6
Q40   4,333  290    1.69         .07   1.01     0.3     1.28     1.0      .44       6.5
Q38   7,861  625    1.54         .05   1.10     3.1     1.31     1.6      .39       6.4
Q35   7,883  1,249  .27          .04   .91      -4.0    .77      -2.6     .54       5.3
Q57   4,324  779    -.15         .05   1.07     2.3     .86      -1.4     .53       5.0
Q25   7,879  1,919  -.70         .03   .93      -3.4    .76      -4.6     .58       4.5
Q32   7,885  2,661  -1.56        .03   .94      -3.5    .94      -1.5     .57       3.8
Q56   4,325  1,453  -1.64        .04   1.08     3.4     .94      -1.0     .54       3.7
Q24   7,893  2,824  -1.72        .03   .88      -7.3    .87      -3.2     .59       3.6
Q58   4,324  2,295  -3.10        .04   1.14     6.5     1.29     3.3      .43       2.5
Q55   7,862  4,627  -3.42        .03   1.03     2.1     1.61     7.9      .41       2.2
Q54   7,863  4,973  -3.73        .03   .92      -5.9    1.06     0.8      .42       2.0
Q53   7,870  6,312  -4.99        .03   1.16     9.9     3.04     9.4      .18       0.9
Mean  6,301  1,744  .00          .06   .99      -0.2    1.36     0.5
SD    1,763  1,833  2.71         .04   .10      4.2     1.26     3.5

* The transformed item calibration is a linear transform of the item calibration that places all values in the range from 0.0 to 10.0.
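The linear transformation described in the footnote (multiply by .8333, then add 5.071) is simple enough to check directly. The following is a minimal sketch; the function name `to_ten_point_scale` is an assumption introduced here, while the two constants and the example values come from the text and from Exhibit 2-8.

```python
def to_ten_point_scale(value, slope=0.8333, intercept=5.071):
    """Map an original Rasch scale value onto the 0.0-10.0 reporting range.

    The slope and intercept are the constants quoted in the text; the
    function name is hypothetical.
    """
    return value * slope + intercept


# Endpoints of the household scale and two item calibrations from Exhibit 2-8.
for original in (-6.08, 5.91, 4.92, -4.99):
    print(original, round(to_ten_point_scale(original), 1))
```

Applied to the minimum and maximum household values (-6.08 and 5.91), the transform lands within rounding of 0.0 and 10.0, and applied to the calibrations of Q50 and Q53 it agrees with the transformed column of Exhibit 2-8 after rounding to one decimal.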
If all respondents are given exactly the same set of questions, the scale value depends solely on the number of affirmative responses. If different respondents answer different sets of questions, however, scale values depend on the severity (as indicated by the item calibration) of the questions that the respondent answers. In the current situation, households with children are asked 18 questions, whereas those without children are asked only ten. Moreover, the questions asked only of households with children are disproportionately the more severe questions. The Rasch model takes these differences into account, assigning values to both types of household that are comparable even though they responded to different types of questions. Similarly, the model adjusts the scale values assigned to households with or without children that failed to respond to one or more of the items applicable to them.

Exhibit 2-9
NUMBER OF QUESTIONS ANSWERED: QUESTIONS IN THE 12-MONTH SCALE

Number of Questions  Frequency  Percent  Cumulative  Cumulative
Answered                                 Frequency   Percent
2                    7          0.0      7           0.0
3                    4          0.0      11          0.1
4                    6          0.0      17          0.1
5                    11         0.1      28          0.2
6                    14         0.1      42          0.2
7                    53         0.3      95          0.5
8                    11         0.1      106         0.6
9                    51         0.3      157         0.9
10                   10293      55.9     10450       56.8
12                   21         0.1      10471       56.9
13                   2          0.0      10473       56.9
14                   2          0.0      10475       56.9
15                   3          0.0      10478       56.9
16                   11         0.1      10489       57.0
17                   29         0.2      10518       57.1
18                   7888       42.9     18406*      100.0

* Households that answered fewer than half of the applicable questions are excluded from the main analysis, reducing the sample to 18,370.

The frequency distribution of household values on the 12-month scale is presented in Exhibit 2-10. Household values for the 12-month scale range from -6.08 to 5.91 in the original model estimation (values transformed to a 0-10 range are also shown).
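To illustrate why the same count of affirmative responses maps to different scale values for households asked different sets of questions, the sketch below solves the standard dichotomous Rasch maximum-likelihood condition (expected score equals observed raw score) by bisection. This is an assumption-laden illustration, not the project's software: the function names are hypothetical, the split of items into those asked of all households versus only households with children is inferred from the n column of Exhibit 2-8, and the sketch ignores dependent follow-up items and the software's handling of extreme scores, so it is not expected to reproduce the published household values.

```python
import math

# Item calibrations transcribed from Exhibit 2-8 (final 12-month scale).
# Assumption: the ten items with n near 7,900 were asked of all households;
# the other eight were asked only of households with children.
ALL_HOUSEHOLD_ITEMS = {
    "Q29": 2.55, "Q28": 1.82, "Q38": 1.54, "Q35": 0.27, "Q25": -0.70,
    "Q32": -1.56, "Q24": -1.72, "Q55": -3.42, "Q54": -3.73, "Q53": -4.99,
}
CHILD_ITEMS = {
    "Q50": 4.92, "Q44": 3.48, "Q43": 2.86, "Q47": 1.88,
    "Q40": 1.69, "Q57": -0.15, "Q56": -1.64, "Q58": -3.10,
}


def expected_score(theta, calibrations):
    """Expected number of affirmative responses for a household at `theta`."""
    return sum(1.0 / (1.0 + math.exp(d - theta)) for d in calibrations)


def household_value(raw_score, calibrations, lo=-15.0, hi=15.0, tol=1e-9):
    """Find theta such that expected_score(theta) equals raw_score (bisection)."""
    calibrations = list(calibrations)
    if not 0 < raw_score < len(calibrations):
        raise ValueError("extreme raw scores have no finite estimate")
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if expected_score(mid, calibrations) < raw_score:
            lo = mid  # theta too low: expected score falls short of raw score
        else:
            hi = mid
    return (lo + hi) / 2.0


ten_items = list(ALL_HOUSEHOLD_ITEMS.values())
eighteen_items = ten_items + list(CHILD_ITEMS.values())
print(household_value(3, ten_items), household_value(3, eighteen_items))
```

Under these assumptions, three affirmative answers out of ten questions yield a higher value than three out of eighteen, because the household asked eighteen questions had more opportunities to answer affirmatively.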
Most households in the analysis sample responded "no" to all items in the scale, and received a scale value of -6.08 (10,276 households).10

Exhibit 2-10
FREQUENCY DISTRIBUTION FOR HOUSEHOLD VALUES ON THE 12-MONTH SCALE

Value on  Frequency  Percent  Cumulative  Cumulative  Transformed
Scale                         Frequency   Percent     Scale Value*
-6.08     10276      56.5     10276       56.5        0.0
-5.2      970        5.3      11246       61.9        0.7
-4.96     902        5.0      12148       66.8        0.9
-4.13     661        3.6      12809       70.5        1.6
-3.73     614        3.4      13423       73.8        2.0
-3.36     550        3.0      13973       76.9        2.3
-2.73     657        3.6      14630       80.5        2.8
-2.69     386        2.1      15016       82.6        2.8
-2.09     343        1.9      15359       84.5        3.3
-1.82     306        1.7      15665       86.2        3.6
-1.52     358        2.0      16023       88.1        3.8
-0.97     255        1.4      16278       89.5        4.3
-0.96     285        1.6      16563       91.1        4.3
-0.43     188        1.0      16751       92.1        4.7
-0.09     295        1.6      17046       93.8        5.0
0.1       176        1.0      17222       94.7        5.2
0.62      132        0.7      17354       95.5        5.6
0.81      231        1.3      17585       96.7        5.8
1.13      86         0.5      17671       97.2        6.0
1.62      59         0.3      17730       97.5        6.4
1.75      128        0.7      17858       98.2        6.5
2.12      59         0.3      17917       98.6        6.8
2.65      28         0.2      17945       98.7        7.3
2.88      85         0.5      18030       99.2        7.5
3.24      15         0.1      18045       99.3        7.8
3.77      103        0.6      18148       99.8        8.2
3.96      12         0.1      18160       99.9        8.4
5.02      13         0.1      18173       100.0       9.3
5.91      6          0.0      18179**     100.0       10.0

* The transformed scale value is a linear transform that places all values in the range from 0.0 to 10.0.
** Includes only households that responded to all applicable items.

All other households responded "yes" to at least one item. Their assigned scale value is a non-linear transformation of the total number of items to which they responded affirmatively. If all households had responded to all 18 items, there would be 19 possible scale score values that could be assigned to households.
Because households without children could respond to only 10 items, however, there are a number of additional scale scores that can be assigned to households based on a missing data adjustment that is part of the Rasch measurement model. The small proportion of households in either group that failed to respond to one or more questions also received distinct measure scores, depending on the number of items missed.

Final 30-Day Food Security Scale

The 30-day scale was developed in the same manner as the 12-month scale, though there were fewer 30-day items available for analysis. The 30-day scale also has a larger number of item dependencies than the 12-month scale. The results of the final Rasch model for the 30-day scale are presented in Exhibit 2-11. The 30-day scale includes 17 items, and the estimated item calibrations range from -4.37 to 4.00. For the most severe item (item 52), only five households responded affirmatively.

Exhibit 2-12 shows the number of responses households made to the 30-day items administered in the survey. As with the 12-month scale, there were two major response categories: 9 (households without children) and 17 (households with children). These two response patterns account for 99.3 percent of households. Here also, households that did not respond to at least half the items administered had their scale value set to "missing."

Exhibit 2-13 provides the frequency distribution of the 30-day household scale scores. The scale scores range from -5.62 to 5.32. Almost 90 percent of the households that passed the series of screening questions responded "no" to all items in the 30-day scale.

The 30-day scale in its present form is not considered as useful as the 12-month scale, for both conceptual and statistical reasons. Conceptually, the 30-day scale provides detail on a narrower portion of the spectrum of food insecurity than the 12-month scale.
Most of the less-severe conditions and behaviors incorporated in the 12-month scale were not measured in the 30-day time frame in the CPS Supplement. The 30-day measures thus focus on reductions of food intake and related indicators of hunger, providing little information on food insecurity with no hunger evident. The broader range of the 12-month scale makes it likely to be more useful both in describing the conditions of the population at a point in time and in monitoring changes.

10 For analyses involving the full sample, households that did not pass the screen are assigned the minimum possible score (-6.08). This procedure is also used in classifying households on the food security status variables.

Exhibit 2-11
SUMMARY OF FINAL 30-DAY SCALE

Item  n      Raw    Item         Real  In-fit   In-fit  Out-fit  Out-fit  Point Biserial
             Score  Calibration  SE    Mean Sq  Z       Mean Sq  Z        Corr.
Q52   990    5      4.00         .45   .83      -0.4    .22      -0.7     .23
Q51   990    13     2.91         .30   1.07     0.3     1.04     0.0      .20
Q46   988    21     2.33         .23   .92      -0.4    .68      -0.5     .34
Q31   1992   83     1.61         .12   .83      -1.9    .27      -3.2     .34
Q49   990    45     1.37         .16   .80      -1.7    .44      -1.7     .47
Q42   990    64     .91          .14   .88      -1.1    .59      -1.5     .46
Q45   988    69     .80          .14   1.10     1.0     1.67     1.8      .32
Q37   1985   249    .10          .08   .84      -3.3    .51      -4.1     .46
Q48   990    129    -.09         .11   1.03     0.4     1.07     0.4      .40
Q30   1992   294    -.17         .07   1.08     1.8     1.22     1.6      .34
Q41   990    154    -.37         .11   1.14     2.1     1.42     2.3      .34
Q39   1958   344    -.48         .07   1.18     4.0     1.42     3.4      .29
Q34   1983   611    -1.52        .06   .92      -2.6    .73      -4.7     .46
Q36   1985   637    -1.61        .06   .94      -1.9    .91      -1.4     .44
Q27   1993   715    -1.86        .06   1.04     1.2     .96      -0.8     .37
Q33   1983   1285   -3.56        .05   .97      -1.3    .87      -1.5     .29
Q26   1993   1549   -4.37        .06   1.13     4.3     1.54     3.3      .14
Mean  1516   369    .00          .13   .98      0.0     .92      -0.4
SD    497    444    2.12         .10   .12      2.1     .43      2.3

Exhibit 2-12
NUMBER OF QUESTIONS ANSWERED: QUESTIONS IN THE 30-DAY SCALE

Number of  Frequency  Percent  Cumulative  Cumulative
Responses                      Frequency   Percent
2          7          0.0      7           0.0
3          1          0.0      8           0.0
4          6          0.0      14          0.1
5          2          0.0      16          0.1
6          10         0.1      26          0.1
7          17         0.1      43          0.2
8          35         0.2      78          0.4
9          10369      56.3     10447       56.8
10         1          0.0      10448       56.8
11         2          0.0      10450       56.8
13         1          0.0      10451       56.8
15         16         0.1      10467       56.9
16         15         0.1      10482       57.0
17         7922       43.0     18404       100.0

Statistically, Chapter Three will show that the 30-day scale is considerably less reliable than the 12-month scale in its ability to discriminate between households at varying levels of food insecurity. This more limited reliability stems mainly from the smaller number of independent questions asked in the 30-day time frame. The 30-day scale has just nine independent items, and a total of 17 when follow-up items are included.11 The 12-month scale has 15 independent questions, plus three follow-up items. In addition, the absence of questions measuring the less severe food insecurity conditions creates a situation in which an extremely small proportion of the population gives affirmative responses to any of the items, which makes it more difficult for the scale to discriminate reliably among different levels of food insecurity.

For these reasons, the main report of this study focuses almost exclusively on the 12-month scale, and this report provides less detail on the 30-day than the 12-month scale. Estimates of the prevalence of hunger based on the 30-day scale are presented in Appendix B.

11 The primary question typically asks if a particular behavior or condition occurred in the past 30 days. If the response is affirmative, the follow-up question then asks on how many of the 30 days the behavior or condition occurred.
Exhibit 2-13
FREQUENCY DISTRIBUTION OF HOUSEHOLD VALUES ON THE 30-DAY SCALE

Scale   Frequency  Percent  Cumulative  Cumulative
Values                      Frequency   Percent
-5.62   16309      89.2     16309       89.2
-4.69   261        1.4      16570       90.6
-4.63   288        1.6      16858       92.2
-3.5    239        1.3      17097       93.5
-3.39   246        1.3      17343       94.8
-2.66   123        0.7      17466       95.5
-2.45   96         0.5      17562       96.0
-2.01   113        0.6      17675       96.7
-1.68   144        0.8      17819       97.5
-1.47   67         0.4      17886       97.8
-1      57         0.3      17943       98.1
-0.97   69         0.4      18012       98.5
-0.56   34         0.2      18046       98.7
-0.25   59         0.3      18105       99.0
-0.14   25         0.1      18130       99.2
0.27    23         0.1      18153       99.3
0.57    47         0.3      18200       99.5
0.68    9          0.0      18209       99.6
1.11    5          0.0      18214       99.6
1.57    5          0.0      18219       99.6
1.7     24         0.1      18243       99.8
2.08    4          0.0      18247       99.8
2.62    31         0.2      18278       100.0
2.66    2          0.0      18280       100.0
3.39    3          0.0      18283       100.0
4.44    1          0.0      18284       100.0
5.32    1          0.0      18285       100.0

2.4 SUMMARY

The scale development process involved five main steps:

• Exploratory linear factor analysis replicating key elements of prior research, which indicated that the response patterns and relationships in the CPS Food Security Supplement were largely similar to those seen previously.

• Estimation of two-parameter non-linear models, which indicated that a one-factor solution would be appropriate.

• Preliminary estimation of one-factor Rasch models on a one-fourth random subsample of the full CPS sample, resulting in the specification of an 18-item set for inclusion in the 12-month scale and a 17-item set for the 30-day scale.

• Tests of invariance of the model across other random subsamples of the full population and across three demographic subgroups (households with children, households without children but with elderly members, and households with neither children nor elderly members), which indicated that the models were quite invariant across groups.

• Estimation of the final scales on the full CPS sample.
Subsequent chapters of this report detail the steps taken to test the scales for reliability, construct validity, and estimation bias. Primary attention is given to the 12-month scale, which appears more useful than the 30-day scale on both conceptual and statistical grounds.

CHAPTER THREE
RELIABILITY ESTIMATES FOR THE FOOD SECURITY SCALES

Whenever an instrument is used to measure some quality of a person (whether it be a heart rate, a psychological profile, or a level of food insecurity), researchers want to be assured that the instrument is reliable. A reliable instrument is one that, if it were administered to the same individual on two occasions under similar conditions, would provide similar results in both tests. Reliability indices therefore attempt to measure the degree to which an individual's score is expected to remain stable (relative to other individuals' scores) over repeated occasions using the same instrument.

Often it is not feasible to administer an instrument repeatedly to the same individuals under similar circumstances. Reliability indices have therefore been developed that attempt to approximate this result through a single administration of the instrument. Most reliability indices for multi-item scales attempt to provide an estimate of the ratio of the true score variance to the total variance for a particular instrument. The underlying concept is that an individual's score on a scale (x) is composed of the individual's "true" score (t) and an error component. A general equation for a measure indicating the reliability of a scale ($\rho$) can be written as:

$$\rho = \frac{\sigma_t^2}{\sigma_x^2}$$

where $\sigma_t^2$ is the variance of the households' true scores and $\sigma_x^2$ is the variance of the observed measure (i.e., the household scores on the scale).

There are a number of reliability indices available for characterizing the reliability of a measure.
Because the food security scales are estimated using a Rasch modeling approach, the most appropriate index is the Rasch reliability index. Because the Rasch reliability index has not been used as often in the scale development literature as some other reliability estimators, however, we provide estimates using some of the more common reliability indices as well as the Rasch reliability index to characterize the reliability of the food security scale.

One major difference between the more traditional reliability indices and the Rasch reliability index is the treatment of cases with extreme scores. Cases with extreme scores are those with either the maximum or minimum score possible on the measure (i.e., those that have responded affirmatively to all questions in the scale, or negatively to all questions). When scale scores are normally distributed over a population, very few cases have extreme scores and consequently they have very little impact on the reliability estimate. When the distribution is severely skewed, however, the treatment of cases with extreme scores can have a major impact on reliability estimates. This is very relevant to the food security scales, because over 80 percent of the population has the lowest possible score on the 12-month scale and over 90 percent on the 30-day scale.

Because of differences in estimation algorithms, the Rasch reliability estimate always decreases when extreme scores are included, whereas the more traditional reliability estimates always increase. The Rasch model typically provides two reliability estimates, one including and one excluding the cases with extreme scores. The conventional practice with the more traditional reliability indices is to include the extreme scores. The discussion below provides separate reliability estimates that include and exclude extreme scores.
In general, the estimate excluding households with extreme scores can be taken as indicating the reliability of the scale in measuring the severity of food insecurity and hunger among households that have experienced at least one of the food insecurity or hunger conditions represented in the scale. The interpretation of the estimate when extreme scores are included is less clear.

Among the more traditional indices, Nunnally (1978) recommended that at least two types of reliability coefficients be reported: correlations between alternate test forms, and coefficient alpha. The discussion below presents the results using three traditional reliability indices: two that are based on the correlation between alternate test forms (the Spearman-Brown split-half reliability estimate and Rulon's split-half reliability estimate), and Cronbach's alpha. All three reliability indices are based on the use of linear composites, and therefore do not correspond exactly to the Rasch model (a non-linear model). Nonetheless, the indices provide a general indication of the reliability of the scale and familiar measures that may be compared to other work.

3.1 SPEARMAN-BROWN SPLIT-HALF RELIABILITY ESTIMATES

The general form of the Spearman-Brown prophecy formula can be written as:

$$\rho_{sp} = \frac{k\,\rho_{jj'}}{1 + (k-1)\,\rho_{jj'}}$$

where $\rho_{sp}$ represents the reliability of the composite measure with k parallel tests, and $\rho_{jj'}$ represents the reliability of any one particular test. For two parallel tests (k = 2), a simplified form of the equation can be written as:

$$\rho_{sp} = \frac{2\,\rho_{ab}}{1 + \rho_{ab}}$$

where $\rho_{ab}$ represents the correlation coefficient between two parallel tests.

In order to create two somewhat parallel tests, the item pool (i.e., all the items used in the scale) is typically split in half randomly. Each subset of the items is considered a separate scale, and the results of the two scales are compared.
When the number of available items is small, as in the present situation, a commonly used method is to order the items in terms of severity and assign odd-numbered items to one test and even-numbered items to another test. The two new scales should have the same number of items, so if the item pool contains an odd number of items, one is dropped before the pool is split.

To estimate $\rho_{sp}$ for the 12-month scale, it was necessary to drop dependent items in order to generate unbiased reliability estimates.1 It was also considered informative to generate reliability estimates separately for items that were administered only to households with children and for items that were administered to all households. For households with children, there were 15 independent items available to create two parallel measures. Because there were an odd number of items, the most severe item was dropped from the list. For the first parallel scale, households' responses to items 43, 28, 38, 57, 56, 58, and 54 were summed to create the household score. For the second parallel scale, items 47, 40, 35, 32, 24, 55, and 53 were summed. Based on the correlation between household scores on these two scales, the Spearman-Brown reliability estimate for the total scale was .852 with extreme scores excluded (see Exhibit 3-1). Including extreme scores raises the reliability index to .903.

1 Dependent items are those that are follow-ups to previous items. A number of items in the food insecurity scales have an initial question (e.g., did this situation occur within the past 12 months?) and a follow-up (e.g., in how many of the past 12 months did the situation occur?).
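The split-half procedure described above (order items by severity, alternate them between two forms, correlate the form scores, then apply the Spearman-Brown formula) can be sketched as follows. This is a minimal illustration under stated assumptions: all function names are hypothetical, the item order is obtained by interleaving the two item lists given in the text, and the household responses are synthetic toy data constructed only to make the example run. It does not use the CPS data.

```python
import math


def pearson(x, y):
    """Pearson correlation between two equal-length score lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def split_by_severity(items):
    """Split severity-ordered items (most severe first) into two forms.

    If the pool has an odd number of items, the most severe one is dropped.
    """
    items = list(items)
    if len(items) % 2:
        items = items[1:]
    return items[0::2], items[1::2]


def spearman_brown(r_ab, k=2):
    """Spearman-Brown reliability of a composite of k parallel tests."""
    return k * r_ab / (1 + (k - 1) * r_ab)


def split_half_reliability(responses, items):
    """Spearman-Brown split-half estimate from 0/1 responses keyed by item."""
    form_a, form_b = split_by_severity(items)
    score_a = [sum(r[i] for i in form_a) for r in responses]
    score_b = [sum(r[i] for i in form_b) for r in responses]
    return spearman_brown(pearson(score_a, score_b))


# The 14 items for households with children, most severe first.
ITEMS = [43, 47, 28, 40, 38, 35, 57, 32, 56, 24, 58, 55, 54, 53]

# Synthetic households: household s affirms only the s least severe items.
toy_responses = [
    {item: int(pos >= len(ITEMS) - s) for pos, item in enumerate(ITEMS)}
    for s in range(1, len(ITEMS))
]
print(split_by_severity(ITEMS))
print(split_half_reliability(toy_responses, ITEMS))
```

With this ordering, the alternating split reproduces the two item lists named in the text.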
Exhibit 3-1
SUMMARY OF RELIABILITY ESTIMATES USING TRADITIONAL INDICES

Household Type            Reliability  Extreme Scores  Extreme Scores
                          Estimate     Included        Excluded
12-Month Scale
All households            Spearman     .899            .794
                          Rulon        .932            .878
                          Alpha        .856            .743
Households with children  Spearman     .903            .852
                          Rulon        .899            .813
                          Alpha        .882            .814
30-Day Scale
All households            Spearman     .840            .357
                          Rulon        .888            .650
                          Alpha        .789            .356
Households with children  Spearman     .852            .530
                          Rulon        .844            .530
                          Alpha        .799            .555

For all household types (i.e., households with any combination of children, adults, and elderly members), there were eight independent items available to create two parallel measures. For the first parallel scale, items 28, 35, 24, and 54 were summed. For the second parallel scale, items 38, 32, 55, and 53 were summed. The reliability estimate for the total scale is .794 with extreme scores excluded, and .899 with extreme scores included.

For the 30-day scale, the reliability estimate for households with children is .530 and the reliability estimate for all households is .357 with extreme scores excluded. Including extreme scores generates a striking increase in the reliability estimates, to .852 for households with children and .840 for all households.

Note that, although including cases with extreme scores increases the reliability estimate for both scales, the effect is particularly striking for the 30-day scale. This occurs for three reasons. First, the number of items in the paired subscales is smaller for the 30-day scale. The 30-day scale contains just five independent items that apply to all households, and ten that apply to households with children. This means that the split-half scales each contain just two items in the analysis for all households, and five in the analysis of households with children.
In contrast, the split-half 12-month scales contain four items for the analysis of all households and seven items for the analysis of households with children. Smaller numbers of items in general lead to lower reliability estimates.

The second factor is that the 30-day scale measures a narrower band of the spectrum of food insecurity than the 12-month scale. The least severe items in the 12-month scale were not asked in the 30-day time frame. This means that the 30-day scale not only contains fewer items, but also attempts to make distinctions within a narrower range than the 12-month scale. In effect, the 30-day scale faces a more difficult challenge in distinguishing the varying levels of food insecurity and hunger among those households that have experienced one or more of the conditions measured.

The final distinction between the scales is that a far greater proportion of households answered negatively to all items on the 30-day scale than on the 12-month scale (89 percent vs. 57 percent of households that passed the screening questions). Thus, including or excluding the households with extreme scores will have a greater effect on the 30-day than the 12-month scale.

3.2 RULON'S SPLIT-HALF RELIABILITY ESTIMATES

Rulon proposed an alternative method for estimating the reliability of a scale using the split-half tests.2 The method involves computing the difference between household scores on two parallel tests and estimating the ratio of the variance of the difference score to the variance of the total score. The equation for Rulon's method is written as:

$$\rho = 1 - \frac{\sigma_D^2}{\sigma_x^2}$$

where $\sigma_D^2$ is the variance of the difference score and $\sigma_x^2$ is the variance of the total score.

2 Rulon, P.J., "A Simplified Procedure for Determining the Reliability of a Test by Split Halves," Harvard Educational Review vol. 9, pp. 99-103, 1939.
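Rulon's formula can be computed directly from the two half-scale scores. The sketch below is illustrative only: the function name `rulon_reliability` and the five pairs of half-scale scores are assumptions invented for the example, not values from the CPS data.

```python
from statistics import pvariance


def rulon_reliability(half_a, half_b):
    """Rulon split-half reliability: 1 - var(difference) / var(total)."""
    differences = [a - b for a, b in zip(half_a, half_b)]
    totals = [a + b for a, b in zip(half_a, half_b)]
    return 1.0 - pvariance(differences) / pvariance(totals)


# Hypothetical half-scale scores for five households.
half_a = [0, 1, 2, 3, 4]
half_b = [1, 1, 2, 2, 4]
print(rulon_reliability(half_a, half_b))
```

Because the estimate is a ratio of two variances computed on the same households, using population or sample variances gives the same result, provided the same convention is used for both.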
To estimate the index, we used the same subsets of items described above for the Spearman-Brown test, again performing the computation both for households with children and for all households (see Exhibit 3-1). For the 12-month scale, the reliability estimate for households with children is .813 and the estimate for all households is .878 with extreme scores excluded. When extreme scores are included, the estimates increase to .899 for households with children and .932 for all households.

For the 30-day scale, the reliability estimate for households with children is .530 and the reliability estimate for all household types is .650 when extreme scores are excluded. Including the extreme scores raises the estimates to .844 and .888, respectively.

3.3 CRONBACH'S ALPHA RELIABILITY ESTIMATES

Cronbach's alpha and Kuder-Richardson 20 (McDonald, 1985) produce identical results when using independent items that are dichotomous in form. Therefore, for the 12-month scale, these two equations are interchangeable. For simplicity, we will refer to Cronbach's alpha when describing these reliability estimates.

Cronbach's alpha was developed to circumvent problems associated with the non-random selection of subsets of items when using methods such as the Spearman-Brown or Rulon methods. Cronbach's alpha, $\alpha$, can be written as:

$$\alpha = \left[\frac{k}{k-1}\right]\left[1 - \frac{\sum_{i=1}^{k}\sigma_i^2}{\sigma_x^2}\right]$$

where k represents the number of items in the test, $\sigma_i^2$ represents the variance of item i, and $\sigma_x^2$ represents the variance of the total test score. Alpha is considered to be the lower bound of the true theoretical reliability estimate, the coefficient of precision.

The overall reliability estimates, summarized in Exhibit 3-1, are similar to those seen with the prior tests. With extreme scores excluded, the values of α for the 12-month scale are .814 for households with children and .743 for all households.
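The alpha formula above translates directly into code. The following sketch assumes responses are stored as one list of 0/1 item responses per household; the function name `cronbach_alpha` and the four toy households are assumptions made for illustration and are not drawn from the CPS data.

```python
from statistics import pvariance


def cronbach_alpha(rows):
    """Cronbach's alpha for a list of per-household item-response lists."""
    k = len(rows[0])
    # Variance of each item across households.
    item_variances = [pvariance([row[i] for row in rows]) for i in range(k)]
    # Variance of the total score across households.
    total_variance = pvariance([sum(row) for row in rows])
    return (k / (k - 1)) * (1.0 - sum(item_variances) / total_variance)


# Four hypothetical households answering three dichotomous items.
toy_rows = [
    [0, 0, 0],
    [1, 0, 0],
    [1, 1, 0],
    [1, 1, 1],
]
print(cronbach_alpha(toy_rows))
```

For dichotomous items the variance of item i equals p(1 - p), where p is the proportion of affirmative responses, which is why alpha and Kuder-Richardson 20 coincide in this setting.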
Including the households with extreme scores raises the estimates to .882 for households with children and .856 for all households. For the 30-day scale, the $\alpha$ values are .555 for households with children and .356 for all households when cases with extreme values are excluded. When households with extreme values are included, the values are .799 for households with children and .789 for all households.

In addition to assessing the reliability of the total scale, Cronbach's alpha is often used to examine the appropriateness of including individual items in the scale. The usual rule is that if $\alpha$ increases substantially when an item is removed from the scale, the item should be considered for removal. It is also possible to evaluate how the reliability of the scale changes when any one item is removed from the scale. Exhibits 3-2 and 3-3 show that in nearly all instances, removing an item would reduce the estimated reliability of the scale. The only potential exception would be item 53;3 removing this item would generate a small increase in the reliability estimate with extreme scores excluded, but the loss of information at the end of the scale would be more detrimental to scale validity than is justified by this small increase in reliability.

3 Removing item 28 with extreme scores included also generates an increase in $\alpha$, but the difference is tiny (measured in the third decimal).

3.4 RASCH MODEL RELIABILITY ESTIMATES

The Rasch reliability indices behave in a slightly different manner and yield somewhat lower estimates of reliability than the more traditional indices presented above. The reliability index for the Rasch scale is defined as:

$$ \rho_r = \frac{\sigma_x^2 - \mathrm{MSE}}{\sigma_x^2} $$
where $\rho_r$ is the reliability index, $\sigma_x^2$ is the variance of the scale, and MSE is the mean square error of the scale. Like the previously described reliability indices, $\rho_r$ is intended to represent the proportion of total variance in household scores that is caused by variance in households' "true" scores.

In Exhibit 3-4, the reliability estimates for the 12-month and 30-day scales are presented. Separate estimates are presented for two treatments of the variables that involve follow-up questions. For example, the 12-month scale includes an item that indicates that adults have cut or skipped meals in the past 12 months, and a second (answered only by people who responded positively to the first item) that indicates that meals were cut or skipped in three or more months. In one treatment, these are considered as independent dichotomous items. In the second treatment, they are combined into a single trichotomous item (no meals cut/skipped in past 12 months; meals cut/skipped in one or two months; meals cut/skipped in three or more months). Treating such question sets as trichotomous items reduces the number of items in the scale, and hence reduces the estimated reliability. With extreme scores excluded, the reliability estimates for the 12-month scale are .74 (dichotomous) and .70 (trichotomous). The reliability estimates for the 30-day scale are .57 and .44.

Unlike the previous reliability indicators, the Rasch reliability estimate decreases when extreme scores are included. Thus, the reliability estimates for the 12-month scale are .63 and .58 with the extreme scores included. For the 30-day scale, because 88 percent of the households that passed the screener responded negatively to all questions, the reliability estimate falls to zero when cases with extreme scores are included.

Exhibit 3-2
CRONBACH'S ALPHA FOR THE 12-MONTH SCALE FOR HOUSEHOLDS WITH CHILDREN

        Extreme Scores Included                      Extreme Scores Excluded
        (α = .882; n = 7,888)                        (α = .814; n = 4,278)
Item    Item    Correlation with  α with Item        Item    Correlation with  α with Item
        Mean    Total Score       Deleted            Mean    Total Score       Deleted
43      .017    .338              .882               .028    .309              .812
47      .033    .433              .879               .057    .415              .806
28      .036    .397              .880               .063    .354              .809
40      .037    .433              .879               .064    .408              .806
38      .040    .429              .879               .071    .394              .806
35      .081    .565              .873               .146    .529              .796
57      .098    .587              .872               .177    .540              .795
32      .179    .669              .867               .327    .567              .791
56      .183    .664              .867               .333    .556              .793
24      .182    .642              .868               .332    .522              .796
58      .288    .656              .868               .528    .441              .804
55      .290    .709              .865               .532    .528              .796
54      .338    .692              .866               .621    .462              .801
53      .450    .607              .873               .827    .221              .818

Exhibit 3-3
CRONBACH'S ALPHA FOR THE 12-MONTH SCALE FOR ALL HOUSEHOLDS

        Extreme Scores Included                      Extreme Scores Excluded
        (α = .856; n = 18,179)                       (α = .743; n = 7,902)
Item    Item    Correlation with  α with Item        Item    Correlation with  α with Item
        Mean    Total Score       Deleted            Mean    Total Score       Deleted
28      .034    .434              .858               .080    .429              .727
38      .040    .459              .855               .092    .451              .723
35      .072    .594              .842               .167    .582              .695
32      .149    .701              .827               .343    .595              .686
24      .157    .682              .829               .362    .545              .697
55      .257    .678              .830               .591    .373              .736
54      .276    .725              .823               .635    .439              .721
53      .349    .646              .837               .803    .206              .760

Exhibit 3-4
RASCH RELIABILITY ESTIMATES FOR THE 12-MONTH AND 30-DAY SCALES

Scale            Model Type      Including Households       Excluding Households
                                 with Extreme Scores        with Extreme Scores
12-month scale   Dichotomous     .63                        .74
                 Trichotomous    .58                        .70
30-day scale     Dichotomous     .00                        .57
                 Trichotomous    .00                        .44
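The indices discussed in Sections 3.3 and 3.4 can be sketched in a few lines. This is a minimal illustration, not the computation used for the exhibits: the function names, the use of population variances, the treatment of MSE as the mean of squared standard errors of the household measures, and all data values are assumptions.

```python
from statistics import mean, pvariance


def cronbach_alpha(rows):
    """rows: one list of 0/1 item responses per household."""
    k = len(rows[0])
    item_vars = [pvariance([r[i] for r in rows]) for i in range(k)]
    total_var = pvariance([sum(r) for r in rows])
    return (k / (k - 1)) * (1.0 - sum(item_vars) / total_var)


def alpha_if_deleted(rows, i):
    """Alpha recomputed after dropping the item at index i."""
    return cronbach_alpha([r[:i] + r[i + 1:] for r in rows])


def rasch_reliability(measures, std_errors):
    """(variance of measures - MSE) / variance of measures."""
    var_x = pvariance(measures)
    mse = mean([se ** 2 for se in std_errors])  # assumed definition of MSE
    return (var_x - mse) / var_x


# Hypothetical responses of six households to three items (not CPS data).
rows = [[1, 0, 0], [1, 1, 0], [1, 1, 1], [0, 0, 0], [1, 1, 0], [1, 0, 0]]
print(round(cronbach_alpha(rows), 3))
print(round(alpha_if_deleted(rows, 2), 3))

# Hypothetical Rasch measures and their standard errors.
print(round(rasch_reliability([-1.0, 0.0, 1.0, 2.0], [0.5, 0.5, 0.5, 0.5]), 3))
```

In this toy data set, dropping the third item lowers alpha, which is the pattern the text reports for nearly all items in Exhibits 3-2 and 3-3.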
3.5 RELIABILITY IN IDENTIFYING CASES WITH NO FOOD INSECURITY CONDITIONS

As noted earlier, none of the reliability statistics deal adequately with situations in which a large percentage of cases have extreme scores. For present purposes, then, the statistics are primarily useful in indicating the scales' reliability in distinguishing the level of food insecurity among households that experience at least one of the conditions measured by items included in the scales. The statistics provide little information about the scales' reliability in distinguishing between households that experience none of the food insecurity conditions measured and households that experience one or more of the conditions.

To provide additional insight on this point, a further analysis was conducted. The analysis follows the split-half procedure: for each scale, we separate the items into two groups to constitute two new scales; we then examine the relationship between the two new scales. The scales are split as described earlier, but each of the new scales is then collapsed into a dichotomous variable. The two categories on the dichotomous variable are (1) "answered all questions negatively," and (2) "answered one or more questions positively." The agreement between the new dichotomous items is then assessed.

A simple test of correspondence is the percentage of cases classified similarly by the two variables. When the population is unevenly divided between the two categories of the dichotomous variables, however, a high rate of agreement can occur by chance. The more appropriate test is therefore the Kappa statistic. The Kappa statistic is a measure of the extent to which there is agreement above and beyond what would be expected by chance. Kappa ($\kappa$) is computed as:

$$ \kappa = \frac{(\text{percent observed agreement}) - (\text{percent agreement expected by chance alone})}{100\% - (\text{percent agreement expected by chance alone})} $$
To test the hypothesis $H_0: \kappa = 0$ vs. $H_1: \kappa > 0$, we can use the lambda statistic $\lambda = \kappa / se(\kappa)$. A formula for the estimation of the standard error of $\kappa$ can be found in Rosner (1986). Landis and Koch (1977) suggested that a $\kappa$ below 0.4 represents poor agreement, between 0.4 and 0.75 represents good agreement, and greater than 0.75 represents excellent agreement.

The percent agreement between paired subscales and the Kappa statistics are shown in Exhibit 3-5. As expected, the two scales in each pair are in agreement in a high percentage of cases: around 85 percent for the 12-month scale, and around 95 percent for the 30-day scale. More importantly, the $\kappa$ values are all close to .70, which is toward the high end of the range representing "good" agreement.4

Exhibit 3-5
LEVEL OF AGREEMENT BETWEEN DICHOTOMIZED SPLIT-HALF SCALES

                  Households with Children      Households without Children
                  Percent Agreement    κ        Percent Agreement    κ
12-month scale    84.8%                .70      85.8%                .69
30-day scale      94.5%                .68      95.1%                .67

This suggests that the scales provide a reasonable level of reliability in distinguishing between households that have experienced any of the measured facets of food insecurity and households that have not experienced any of these conditions. It is particularly worth noting that the $\kappa$ statistics for the 30-day scale are quite similar to those for the 12-month scale, even though the 30-day subscales have very few items and a very high percentage of respondents answering all questions negatively. These factors appear to reduce the 30-day scale's reliability in discriminating among households that have experienced one or more of the measured conditions, but the scale remains reasonably strong at distinguishing those households that have experienced the conditions from those that have not.

4 In all of the comparisons, the $\lambda$ statistic indicates that the level of agreement is significantly greater than would be expected by chance (p < .001).
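The reason percent agreement alone can mislead when most households answer every question negatively can be seen in a small sketch of the Kappa computation for two dichotomized half-scales. The function name `kappa_2x2` and the counts are hypothetical; the counts are chosen only to mimic a skewed distribution and are not CPS data.

```python
def kappa_2x2(both_neg, only_a_pos, only_b_pos, both_pos):
    """Observed agreement, chance agreement, and kappa for a 2x2 table."""
    n = both_neg + only_a_pos + only_b_pos + both_pos
    observed = (both_neg + both_pos) / n
    a_pos = (only_a_pos + both_pos) / n  # share positive on half-scale A
    b_pos = (only_b_pos + both_pos) / n  # share positive on half-scale B
    # Agreement expected if the two half-scales were independent.
    expected = a_pos * b_pos + (1 - a_pos) * (1 - b_pos)
    kappa = (observed - expected) / (1 - expected)
    return observed, expected, kappa


# Hypothetical counts for 100 households (not CPS data).
observed, expected, kappa = kappa_2x2(80, 5, 5, 10)
print(f"observed={observed:.3f} expected={expected:.3f} kappa={kappa:.3f}")
```

Here the two half-scales agree for 90 percent of households, but about 75 percent agreement would be expected by chance alone, so the chance-corrected agreement is considerably lower than the raw figure suggests.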
3.6 SUMMARY

Although there is no absolute rule regarding minimum acceptable levels of reliability, the literature provides at least some rough guidelines. Nunnally (1978), writing in the context of the more traditional measures of reliability, suggests that reliabilities of about .70 can be sufficient to suggest general reliability, particularly in the early stages of measurement development. Nunnally suggests that for basic research, requiring a very high reliability (e.g., above .80) can be counterproductive, as resources are devoted to improving the scale instead of learning about the underlying phenomenon. He also argues, however, that scales used to support decisions regarding the treatment of specific individuals should have reliabilities exceeding .90.

Using the three traditional measures and following the conventional practice of including households with extreme scores, both the 12-month scale and the 30-day scale would be judged quite reliable. Estimated reliability values range from .86 to .93 for the 12-month scale, and from .79 to .89 for the 30-day scale. As noted previously, however, this conventional approach yields statistics that can be influenced by the type of highly-skewed distributions that characterize the food insecurity scales.

A more conservative approach is to separate two types of reliability. The first considers the scale's reliability in describing the level of food insecurity among households that experience one or more of the food insecurity or hunger conditions measured by items in the scale. The second asks about the scale's reliability in distinguishing between households that have vs. have not experienced any of the measured food insecurity or hunger conditions.

The 12-month scale fares quite well on both dimensions of reliability.
When households that answered all questions negatively are excluded from the analysis, the Rasch reliability estimate ranges from .70 to .74, and the more traditional indices range from .74 to .88. Using the dichotomous split-half test, the $\kappa$ statistics are .69 to .70. Although this approach is novel, and no established benchmarks provide standards for "good" reliability, all of these scores are in the acceptable range for other uses of the statistics.

The 30-day scale is equally reliable at distinguishing households that have vs. have not experienced any of the measured food insecurity and hunger conditions. The $\kappa$ statistics of .67 to .68 are nearly the same as those for the 12-month scale. The 30-day scale, however, seems less reliable at distinguishing among levels of food insecurity for households that experience one or more of the measured conditions. When we consider only the households that answered at least one question affirmatively, reliability estimates range from .36 to .65.

Two factors reduce the 30-day scale's estimated reliability in distinguishing levels of food insecurity and hunger among households that experience one or more of the measured conditions. First, the number of independent items on the 30-day scale is small. Second, the 30-day scale measures a narrower range of food insecurity, because some of the less severe questions were not asked in the 30-day time frame. To increase the reliability of the 30-day scale to be more comparable to the 12-month scale, it would probably be necessary to add more 30-day items to the Food Security Survey, and in particular to add items measuring less severe conditions of food insecurity than those currently included in the scale.
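The link between the number of items and reliability noted above can be illustrated with the general Spearman-Brown prophecy formula. This is only a hedged sketch: the formula assumes that added items are parallel to the existing ones, which is not what the text recommends (it calls for less severe items), and the starting value of .555 is simply the Cronbach's alpha reported above for the 30-day scale among households with children with extreme scores excluded. The function name is hypothetical.

```python
def spearman_brown(reliability, length_factor):
    """Projected reliability when a scale is lengthened by length_factor,
    assuming the added items are parallel to the existing ones."""
    return length_factor * reliability / (1 + (length_factor - 1) * reliability)


# Illustrative projection: doubling the number of 30-day items.
print(round(spearman_brown(0.555, 2), 3))
```

Under these assumptions, doubling the length of a scale with reliability .555 would raise the projected reliability to roughly .71; the actual gain would depend on which items were added.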
CHAPTER FOUR
DEFINING RANGES OF THE FOOD SECURITY SCALE

The analyses discussed in earlier chapters provide the basis for concluding that food security can be reliably measured as a unidimensional phenomenon. Households can be ranked on the basis of scale values across a continuous range indicating the severity of food insecurity experienced within the household. The full range of severity measured extends from no measurable food insecurity at all, through increasing levels of severity characterized by reduced food intake and hunger for household members, to some maximum measured level.

Although the phenomenon of food insecurity can be viewed as unidimensional and continuous, several distinct ranges of severity are of interest. Identifying these ranges of severity enables one to supplement the continuous food security scale, subdividing it to create a categorical variable providing a comparatively simple measure of food security status in terms of several broad ranges of severity.

In this chapter we describe the conceptual and empirical bases for a priori expectations regarding the structure of a categorical food security status variable, and the process leading to definition of categorical ranges within the continuous food security scale. Several specific issues related to selection of threshold levels or scale dividing lines are summarized, and the final categorical food security status variable is described.

4.1 CONCEPTUAL BASIS FOR A CATEGORICAL FOOD SECURITY STATUS VARIABLE

The first threshold level of severity, or dividing line, to be identified on the unidimensional food security scale is the point of transition from food secure status to food insecure status.
In addition to this threshold, two other cutpoints, deriving from the LSRO/AIN conceptual definitions of food security, food insecurity, and hunger, are of interest.1 As noted in the main report of this study,2 the LSRO/AIN conceptual clarification provides a working definition of hunger as "the uneasy or painful sensation caused by a lack of food" and identifies hunger as "a potential but not necessary consequence of food insecurity" (Anderson/LSRO, 1990). Previous studies examined by the AIN expert group had led to a consensus view of hunger as "nested" within the broader phenomenon of food insecurity, and occurring at the more severe levels of food insecurity as experienced in U.S. households.

1 The conceptual rationale underlying the measurement of food insecurity and hunger developed in the present study is described in Bickel, Andrews and Klein (1996). The research background leading to this measurement approach is documented in the U.S. Department of Agriculture report, Food Security Measurement and Research Conference: Papers and Proceedings, Alexandria, VA: USDA Food and Consumer Service, Office of Analysis and Evaluation, June 1995.

Moreover, empirical evidence supports the conceptual view of household-level food insecurity as a managed process involving identifiable patterns or stages of behavioral responses to food insufficiency as the degree of such insufficiency increases (Radimer, Olson and Campbell, 1990; Basiotis, 1992; Cristofar and Basiotis, 1992; Radimer et al., 1992; Wehler, Scott and Anderson, 1992; Burt, 1993; Cohen, Burt and Schulte, 1993). Within this framework, food insecurity in the household begins with an initial stage characterized by adult household members' experiences of food insufficiency, anxiety about their food situation, and adjustments in their budget and food management patterns.
These latter behavioral "coping strategies" may involve efforts to augment the household's food supply from emergency or other non-normal sources, and may involve modifications to the variety and quality of food available to household members, but normally do not include reduction in overall quantity of food intake. In this initial stage there is little or no evidence that household members experience actual hunger ("the uneasy or painful sensation caused by a lack of food") as a result of their household's level of food insecurity.

The second stage involves intensification of food economizing behaviors, some of which lead to patterns of reduced food intake among one or more of the adults in the household. When children are present in a household, efforts are made to spare them from food intake reduction through various rationing strategies. If the household's food insecurity persists or worsens, however, a third stage appears in which adult hunger is manifested in more severe forms (e.g., going whole days with no food) and, in households with children, the children experience actual hunger, revealed in patterns of reduced food intake.

2 Hamilton et al. (1997), Household Food Security in the United States in 1995: Summary Report of the Food Security Measurement Project, Alexandria, VA: U.S. Department of Agriculture, Food and Consumer Service, June 1997, Chapters One and Two.

This conceptual framework suggests four potentially identifiable stages or levels of severity within the continuous food security variable. Those severity-level categories are: (1) Food Secure; (2) Food Insecure with No Hunger Evident; (3) Food Insecure with Moderate (adult) Hunger Evident; and (4) Food Insecure with Severe Hunger (child hunger, and severe adult hunger) Evident.
Given these conceptual categories, the question is how best to subdivide the 12-month and 30-day scales into ranges of severity that correspond operationally to the designated conceptual categories.

4.2 DEFINING RANGES AND SELECTING SCALE CUTPOINTS

As described in earlier chapters, the Rasch model assigns a scale value to each household based on the number of scale items answered affirmatively relative to the total number of items answered.3 As an interdependent part of its estimation from the data, the model also ranks scale items according to their level of severity on the basis of the actual response patterns of all households in the data. The 18 items in the final 12-month scale are shown in Exhibit 4-1, with items listed by increasing order of severity from top to bottom in the table. If all responses were perfectly ordered, an affirmative response to any scale item would occur only in conjunction with affirmative responses to all prior, or less severe, scale items. Therefore, as perfect scale ordering is approached among the actual sample households, any number "n" of affirmative responses approaches exact correspondence to the first n items in the scale. Although the data are not perfectly ordered for all households, in fact the most common pattern of household responses (the mode) does follow the sequential order of severity.4 That is, the modal household that answers n items affirmatively gives "yes" responses to the n least severe items in the scale sequence.

3 For ease of explication this discussion is presented without addressing separately the cases of households with and without children. Readers should note that these two types of households were presented different numbers of items, because questions addressing conditions of children in the household were not presented to households without children. The form of the Rasch measurement model and the BIGSTEPS software that implements the model take these differences into account in calculating household scale scores.

4 For example, among households with no children, 82 percent followed the modal pattern on the 12-month items. Households answering "no" to all questions, however, amount to 65 percent of the total. Among households answering "yes" to at least one question, 49 percent followed the modal pattern. For the non-modal households, responses deviate from the pattern that would be observed under perfect ordering. Some households answer "yes" to items without answering "yes" to all prior items. A non-modal household with n affirmatives has answered negatively one or more of the n least severe questions, instead affirming one or more of the more severe questions. The Rasch model implicitly considers a modal and a non-modal household with the same number of affirmatives equivalent, in effect treating all households as modal and assigning both households the same scale value.

Defining ranges on the continuous scale is the operational means of assigning values to the categorical variable measuring households' food security status. This categorical measure identifies the particular range of severity of food insecurity that a given sample household has experienced in the prior 12-month or 30-day period. Defining the appropriate scale ranges for classifying households according to food security status involves identifying subsets of the sequential indicator items that best correspond to the conceptual categories described above. After a subset is identified in general terms, it is necessary to identify the appropriate classification boundaries, or points of transition from one severity range to the next. Each such boundary is marked by a particular "threshold item."
The threshold items and their classification boundaries developed in the present study for the purpose of giving operational definition to the categorical food security status variable are depicted by the shaded rows in Exhibit 4-2.5

Thus, the scale itself, with items ranked from least to most severe, provides a meaningful framework within which to identify operationally the designated ranges of behaviors and conditions corresponding to the conceptual construct summarized above. The scale, whose values range from 0 to 10, must be subdivided in terms of numeric values so that a household with a particular scale value can be assigned to a particular food security status category. This subdivision, however, can be accomplished by considering the behaviors and conditions represented by values at each point on the scale.

The procedure for subdividing the scale rests on two features of the scaling methodology described above. First, household values on the food security scale are based fundamentally on a simple count of the number of questions to which they respond affirmatively. Second, most households' responses follow the sequential logic of item severity: a household that says "yes" to a particular question typically says "yes" to all less severe questions as well. In general, then, one can characterize households that have a particular scale value as having responded affirmatively to a particular group of questions.

5 Exhibits 4-1 and 4-2 in the main report of this study (Hamilton et al., 1997) also illustrate this division of the scaled indicator items into the respective severity-level classes of the categorical food security measure.

Exhibit 4-1
ITEMS IN THE FINAL 12-MONTH SCALE LISTED BY INCREASING SEVERITY LEVEL

Item Label   Item Content (All questions refer to the last 12 months)
Q53          Household members worried whether food would run out before they got money to buy more (sometimes or often).
Q54          Respondent reports that the food they bought just didn't last, and they didn't have money to get more (sometimes or often).
Q55*         Household members couldn't afford to eat balanced meals (sometimes or often).
Q58          Household relied on a few kinds of low-cost foods to feed children because they were running out of money to buy food (sometimes or often).
Q24          Adults in the household cut the size of meals or skipped meals because there wasn't enough money for food.
Q56          Household couldn't afford to feed children a balanced meal, because they couldn't afford that (sometimes or often).
Q32          Respondent ate less than he/she felt they should because there wasn't enough money to buy food.
Q25*         Adults in the household cut the size of meals or skipped meals because there wasn't enough money for food in at least 3 of the last 12 months.
Q57          Children were not eating enough because household couldn't afford enough food (sometimes or often).
Q35          Respondent was hungry but didn't eat because couldn't afford enough food.
Q38          Respondent lost weight because there wasn't enough food.
Q40          Adults cut the size of children's meals because there wasn't enough money for food.
Q28*         Adults in household did not eat for a whole day.
Q47          Children were hungry but household couldn't afford more food.
Q29          Adults in household did not eat for a whole day in at least 3 of the last 12 mos.
Q43          Children skipped meals because there wasn't enough money for food.
Q44          Children skipped meals because there wasn't enough money for food in at least 3 of the last 12 mos.
Q50          Children did not eat for a whole day because there wasn't enough money for food.

* Indicates threshold items in the scale. For each designated range of severity comprising the categorical food-security variable, the subset of indicators beginning with the threshold item and continuing through the successively more severe indicators, up to the next identified threshold, serve operationally to define and characterize that designated range.

Exhibit 4-2
THRESHOLD ITEMS DEFINING RANGES OF THE FOOD SECURITY SCALE

                                                   Households with Children       Households without Children
Question (in order of increasing severity)         Number of      Modal           Number of      Modal
                                                   Affirmatives   Household       Affirmatives   Household
                                                                  Value                          Value
(no affirmative responses)                         0              0.0             0              0.0
Q53   Worried food would run out                   1              0.9             1
Q54   Food bought did not last                     2                              2
Q55*  Couldn't afford to eat balanced meals        3              2.3             3
Q58   Adult fed child few low-cost foods           4
Q24   Adult cut size or skipped meals              5              3.3             4              3.6
Q56   Couldn't feed child balanced meals           6              3.8
Q32   Adult eat less than felt they should         7              4.3             5
Q25*  Adult cut size or skipped meals, 3+ mos.     8              4.7             6
Q57   Child not eating enough                      9              5.2
Q35   Adult hungry but didn't eat                  10             5.6             7              5.8
Q38   Adult lost weight                            11             6.0             8              6.5
Q40   Cut size of child's meals                    12             6.4
Q28*  Adult not eat whole day                      13             6.8             9              7.5
Q47   Child hungry                                 14             7.3
Q29   Adult not eat whole day, 3+ mos.             15             7.8             10             10.0
Q43   Child skipped meal                           16             8.4
Q44   Child skipped meal, 3+ mos.                  17             9.3
Q50   Child not eat for whole day                  18             10.0

* Threshold item (shaded row in the original exhibit).

Exhibit 4-2, which is organized in terms of increasing severity of the questions, illustrates the point. A household that gives one affirmative answer most often answers Q53 affirmatively, a household with two affirmatives most often affirms Q53 and Q54, and so on. For each question, the exhibit shows the number of affirmative responses and the associated scale value for households whose responses follow the sequential logic of item severity. For example, if the most severe question affirmed by a household with children is Q24, that household has also responded affirmatively to the four less severe questions (Q53, Q54, Q55, and Q58) and has a total of five affirmative responses. Its corresponding scale score is 3.3. The exhibit also shows parallel, but slightly different, values for a similar household without children.
Q58 is not applicable to that household. Thus, if the most severe question it affirms is Q24, it will have a total of just four affirmative responses. Because the Rasch model, however, computes a scale value that takes into account the number and severity of the questions the household was asked, the scale value for the household without children (3.6) is quite close to the value for the household with children (3.3).

It is possible to describe any point on the scale in terms of the questions that the "modal" or typical household with that scale value has answered affirmatively. Similarly, one can say that all modal households with values at or above a specified point on the scale have responded affirmatively to at least the group of questions corresponding to the specified point. For example, all modal households with values at or above 2.3 have responded affirmatively to at least the three least severe questions in the scale (Q53, Q54, Q55). All modal households with values of 4.7 or higher have responded affirmatively at least to Q24 and to all applicable less severe questions.6

Thus, although the scale itself is a continuous measure of a single dimension (i.e., the severity level of food insecurity), it can be subdivided by considering the collection of conditions and behaviors associated with particular ranges of scale values. In this manner, the scale and the severity rankings provided by the Rasch model yield a statistical framework for defining conceptually meaningful categories for the food security status variable. Within this statistical framework, however, the exact location of the category boundaries or scale thresholds depends upon informed judgment about how best to interpret the conceptual constructs based upon the LSRO/AIN definitions and the previous empirical research findings on food security and hunger. The next section reviews those judgments and the reasoning behind them.
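The notion of a modal response pattern can be made concrete with a short sketch. The item order follows Exhibit 4-1; the label Q28 for the first "did not eat for a whole day" item is a reconstruction, and the split into child-referenced and other items is inferred from the item wording. The names `SEVERITY_ORDER`, `CHILD_ITEMS`, `applicable_items`, and `is_modal` are hypothetical and are not part of the BIGSTEPS software.

```python
# Items in increasing order of severity, as listed in Exhibit 4-1.
SEVERITY_ORDER = [
    "Q53", "Q54", "Q55", "Q58", "Q24", "Q56", "Q32", "Q25", "Q57",
    "Q35", "Q38", "Q40", "Q28", "Q47", "Q29", "Q43", "Q44", "Q50",
]
# Items that refer to children (assumed not asked of childless households).
CHILD_ITEMS = {"Q58", "Q56", "Q57", "Q40", "Q47", "Q43", "Q44", "Q50"}


def applicable_items(has_children):
    """Items presented to the household, least severe first."""
    return [q for q in SEVERITY_ORDER if has_children or q not in CHILD_ITEMS]


def is_modal(affirmed, has_children):
    """True if the n affirmed items are exactly the n least severe ones."""
    affirmed = set(affirmed)
    items = applicable_items(has_children)
    return affirmed == set(items[:len(affirmed)])


# A household with children affirming the three least severe items is modal.
print(is_modal({"Q53", "Q54", "Q55"}, has_children=True))
# Affirming Q58 instead of Q55 gives the same count but a non-modal pattern.
print(is_modal({"Q53", "Q54", "Q58"}, has_children=True))
# Q24 is the fifth item for households with children, the fourth without.
print(applicable_items(True).index("Q24") + 1,
      applicable_items(False).index("Q24") + 1)
```

Both example households have three affirmatives and so would receive the same scale value, which is the sense in which the Rasch model treats all households as modal.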
6 Non-modal households with a given scale value have, by definition, not responded affirmatively to all of the applicable less severe questions, but instead have responded affirmatively to more severe questions. For example, a non-modal household (with children) with a scale value of 2.3 must have answered three questions affirmatively. Instead of Q53, Q54, and Q55 (the three least severe questions), however, the household might have said "yes" to Q53, Q54, and Q58, although saying "no" to Q55.

4.3 EVIDENCE OF FOOD INSECURITY

The LSRO/AIN definitions of food security and food insecurity are:

• Food security: "Access by all people at all times to enough food for an active healthy life. Food security includes at a minimum: (1) the ready availability of nutritionally adequate and safe foods, and (2) an assured ability to acquire acceptable foods in socially acceptable ways (e.g., without resorting to emergency food supplies, scavenging, stealing, or other coping strategies)" (Anderson/LSRO, 1990, p. 1598).

• Food insecurity: "Limited or uncertain availability of nutritionally adequate and safe foods or limited or uncertain ability to acquire acceptable foods in socially acceptable ways" (ibid.).

Several dimensions or aspects of food security are apparent in these definitions, of which the most central and fundamental is described as "enough food for an active, healthy life," i.e., a sufficient quantity of acceptable foods to meet the household's basic needs. A number of additional dimensions are also apparent, including the nutritional quality and safety of available foods, the social acceptability of the means of obtaining food, and the household's assurance or certainty of its ability to obtain needed food.
These additional dimensions of the broad conceptual definition of food security, however, are not directly captured in the questions incorporated in the food security scale. Rather, the measure focuses on the simple quantitative dimension of "enough" food. The food quality dimension is represented only to the extent that some particular quality of food (in both nutritional and conventional senses) is perceived and understood by household members to be necessary. The scale consists entirely of items indicating either this quantitative or qualitative aspect of food sufficiency, as experienced and understood by the household respondent, in relation to his or her self-perception of basic needs.

Several of the questions included in the CPS Food Security Supplement were intended to capture those aspects of households' food coping behaviors that seek to augment insufficient household food supply through emergency or other non-normal means. These extraordinary coping methods, such as obtaining food from food banks or pantries, borrowing money for food, taking children to others' homes for meals, or getting meals at soup kitchens, have been regarded as good behavioral indicators of a condition of food insecurity or insufficiency within the household, and they may be presumed to reflect the concept of acceptability of sources or means of food-acquisition within U.S. social norms. These food-augmenting coping behavior items in the CPS data, however, do not factor together with the indicators that are included in the measurement scale.
Thus, they represent a dimension of the conceptual definition of food security (the assurance of access to food through socially acceptable means) that is not represented within the unidimensional measure of severity of food insecurity.7

For the items in the 12-month scale, shown in severity-ranked order in Exhibits 4-1 and 4-2, the basic question is how many items must be answered affirmatively in order to provide clear evidence of food insecurity as defined above. Item Q53 could be interpreted as indicating uncertainty about the household's access to adequate acceptable food, or the ability to acquire it in socially acceptable ways. By itself, however, this subjective item may be considered to lack face validity as a sufficient indicator of food insecurity. An affirmative response to only this one item was therefore judged by the technical analysis team as insufficient to indicate the threshold level of food insecurity.

Giving affirmative responses to two items (in the modal case, items Q53 and Q54) indicates worry or anxiety about the household's food position, and also initial perceptions of insufficiency of the household's food supply (food bought just didn't last). Although these two items together provide stronger evidence of household food insecurity, they were still judged insufficient to establish unequivocally that severity has reached the threshold level required for the categorical measure of food insecurity. Including item Q55, however, captures not only reports that the household food supply is substandard, but also efforts to cope with this insufficient food supply in ways that, although they may maintain the quantity of food intake, reduce the perceived quality of diets below the level the respondent understands to be needed to maintain "balanced meals."

It is useful to consider the relative severity of items as well as the simple rankings shown in prior exhibits.
Exhibit 4-3 therefore maps the relative severities, using the item calibrations presented in Chapter Two. The three least severe items in the scale (Q53, Q54, and Q55) appear just prior to a substantial gap in the spacing of item calibrations, indicating a large difference in severity between these items and the group comprising items Q24, Q56, and Q32.

7 See Chapter Five for further discussion of these indicators of coping behaviors.

Exhibit 4-3
SEVERITY RANKING OF QUESTIONS IN FOOD SECURITY SCALE
[Figure: scale items plotted by item calibration on a vertical severity axis. Item labels as read from the top (most severe) to the bottom (least severe) of the figure: Q50 Child not eat for whole day; Q44 Child skipped meal, 3+ months; Q43 Child skipped meal; Q29 Adult not eat for whole day, 3+ months; Q47 Child hungry; Q28 Adult not eat for whole day; Q40 Child meal size cut; Q38 Respondent lost weight; Q35 Respondent hungry but did not eat; Q57 Child not eating enough; Q25 Adult cut/skip meals, 3+ months; Q32 Respondent eat less than should; Q56 Child not fed balanced meals; Q24 Adult cut/skip meals; Q58 Child fed few, low-cost foods; Q55 Respondent not eat balanced meals; Q54 Food bought did not last; Q53 Worried food would run out.]

Although item Q58 (child fed few low-cost foods) is very close in severity to item Q55 and consistent in conceptual content, selection of the threshold or cutpoint item aims at identifying the point of transition from food security into food insecurity. Thus, the first item completing a group that is conceptually and statistically consistent with food insecurity was judged most appropriate for identifying the threshold.
Item Q55 meets this criterion, and the set of three household- or adult-level items answered affirmatively by modal households responding "yes" to item Q55, taken together, was judged to provide sufficient evidence that the household has experienced food insecurity, although at a level not yet showing evidence of actual hunger among household members.

4.4 SUBJECTIVE REPORTING OF HUNGER

As summarized above, this research has aimed to develop both a continuous measure of severity and a broad categorical measure of resource-constrained food insecurity that can differentiate three broad ranges of severity, the two most severe of which involve actual hunger for household members. This measurement task is guided by the LSRO/AIN conceptual definitions of food insecurity and hunger, where hunger is nested as "a potential but not necessary consequence" of food insecurity, and is defined as "the uneasy or painful sensation caused by a lack of food."

Therefore, an essential measurement task is to identify households whose members have experienced actual hunger (the "uneasy or painful sensation caused by a lack of food") as a result of constrained or insufficient household financial resources. Food insecurity or hunger resulting from eating disorders, dieting, or causes other than household resource constraints is not being measured.

Three related factors enter into the conceptual consideration of what constitutes the specific phenomenon being measured. These are access to adequate food, the physiological sensation of hunger, and potential malnutrition. The relationships between the first two of these, the basic dimension of food insecurity and hunger as experienced within households, constitute the focus of the present research. The relationship of this basic experiential dimension to malnutrition (which is also defined as nested within food insecurity, as a "potential but not necessary consequence") is not addressed in this research.
All items in the CPS Food Security Supplement addressing aspects of food insecurity or hunger contain explicit language making it clear to respondents that the condition being asked about is specifically caused by constrained household financial resources. For example, item Q53 states "I/We worried whether (my/our) food would run out before (I/we) got money to buy more." Item Q54 states "The food (I/we) bought just didn't last, and (I/we) didn't have money to get more," whereas item Q55 states "(I/We) couldn't afford to eat balanced meals." Such qualifying language is included consistently in all food insecurity and hunger items in the CPS instrument, including all those appearing in the food security scales.

As a result, within the limits of unidentifiable measurement error, affirmative responses to scale items can be expected to reflect clear understanding by respondents that such answers are identifying resource-constrained conditions. Although the possibility of respondents' intentional misreporting exists, as in every survey, the history and nature of the CPS, the high degree of preparedness of CPS interviewers, and the careful design and testing of the Food Security Supplement items all tend to reduce this and other types of measurement error.

This point is important because identifying the second classification boundary, the transition from food insecurity with no hunger evident into food insecurity with moderate hunger (adult hunger) evident, relies primarily on evidence that reduced food intake consistent with hunger has occurred within the referenced time period among adults in the household, and that this hunger has resulted specifically from the resource-constrained food insecurity of the household.
Determining the most appropriate severity level for the initial boundary of the range of food insecurity with hunger present required two kinds of judgment from the analysis team. First, it was necessary to decide which specific items available in the scale should be taken to indicate actual hunger for one or more adults in the household attributable to resource constraint. These potentially include measures of reduced quantities of food intake for adult household members (e.g., Q24, Q25), respondents' subjective assessment of intake adequacy (Q32), or direct perception and report of personal hunger (Q35). Second, given the scale items available, a judgment was required as to how many such items are needed to provide sufficient evidence that household members have experienced actual hunger due to resource constraint. As explained below, the threshold ultimately chosen relies on evidence of a repeated pattern of reductions in food intake by adults over the referenced time period.

The physiological sensation of hunger is universal among humans, and a large research literature exists examining the nature of the experience in the context of basic human physiology and clinical nutrition.8 Several articles from this research literature are summarized in Appendix A of the present volume. The studies described in this literature provide strong support for the validity of subjective reporting of the sensation of hunger (see, for example, Mattes and Friedman, 1993), although they find considerable variation in how the sensation is experienced and described. These studies seem to provide clear evidence that when usual patterns of eating are interrupted by reducing food intake through actions such as cutting the size of meals or skipping meals, the "uneasy or painful sensation caused by a lack of food" is the natural result.
The intensity of the sensations experienced is positively associated with the length of the period of abstinence, although they diminish and may disappear altogether after an extended period of fasting (usually several days). The results reported in this literature are thus consistent with the use of items indicating reduced food intake below usual or normal meal patterns, due to resource stringency, as evidence that hunger has been experienced.

In Exhibit 4-3 above, the next most severe item after Q55 to indicate reduction of food intake among adults is item Q24 (Adults cut/skip meals). Note that this item appears in Exhibit 4-3 at virtually the same level as child item Q56 (Child not fed balanced meals), which indicates reduction in the quality of diets provided to children in the household at this level of severity of food insecurity. The next item (Q32, Respondent eat less than should) indicates that food intake has fallen below the respondent's own normative standard for the amount of food he or she should be eating. An affirmative response to item Q25 indicates that, in addition to all of the foregoing conditions, adults in the household cut the size of or skipped meals in three or more of the previous twelve months due to constrained resources, indicating a pattern of repetition of reduced food intakes among adult household members. This item was judged to provide sufficient additional evidence for the presence of adult hunger in the household, and was chosen, therefore, as the item indicating the point of transition from the category of food insecurity with hunger not evident to the category of food insecurity with adult hunger evident. Households in which the respondent answered affirmatively to item Q25 will, in the modal case, also have answered affirmatively to all previous items, indicating the household has experienced a comparatively severe level of food insecurity. The affirmative answer to item Q25 indicates that adults in the household have experienced, in addition, a pattern of repeated reductions in food intakes of a type that the physiological research literature indicates is normally accompanied by the "uneasy or painful sensation caused by a lack of food," or hunger.

8 See Mattes and Friedman (1993) and Read, French and Cunningham (1994) for two general reviews covering much of this research (see References, Appendix A).

When considering the selection or identification of cutpoint items, and when deciding whether affirmative responses to items or sets of items yielded sufficiently clear evidence of a particular condition (e.g., resource-constrained adult hunger), the study team employed a general principle of requiring a pattern of repetition of either behaviors or items, or both. Thus, in considering items indicating reduced food intake among adults, Q25 was viewed as providing sufficient evidence because it involved occurrence of the behavior "cutting or skipping meals" in a recurring pattern over the previous twelve months. Similarly, when considering items indicating the existence of food insecurity with no hunger evident, a pattern of affirmative responses to a sequential series of items was considered stronger evidence than affirmation of only one or two pertinent items. This principle was employed to provide additional assurance against response error.9

4.5 EVIDENCE OF CHILD HUNGER AND SEVERE ADULT HUNGER

Exhibit 4-3 shows items Q38, Q40, Q28, and Q47 all grouped at nearly the same level of severity and located at a considerably increased level of severity beyond items Q25, Q57, and Q35.
The logic described above for selection of item Q25 as the threshold item for food insecurity with adult hunger evident might suggest item Q40 (size of children's meals cut) as a likely candidate for the best item indicating the transition into food insecurity with severe hunger, because children's hunger is conceptually the most salient aspect of severe hunger in the household. For reasons similar to those outlined above, however, a more severe item was chosen. The wording of item Q40 allows the respondent to answer affirmatively if children in the household had their meal size cut due to resource constraint only once or a small number of times within the previous twelve months. Here again, sufficient evidence of hunger among children was thought to require either a repetitive pattern of reduced food intake or a multiple series of responses indicating such a condition.

9 Issues of response error are discussed further in Chapter Eight.

Note that the child items indicating meals being cut and skipping meals occur as two separate items, unlike the adult version, in which these two conditions are combined as one item. The item addressing children skipping meals appears in Exhibit 4-3 at a much higher level of severity than the item regarding size of children's meals being cut. Skipping meals, as would be expected, reflects a more severe condition than cutting the size of meals. In addition, adult items Q38, Q28, and Q29, all of which indicate comparatively severe levels of adult hunger, appear prior to child item Q44, which indicates a pattern of repeatedly skipped meals among children. These circumstances led team members initially to choose item Q47 (child hungry but couldn't afford more food) as the cutpoint indicating the beginning of food insecurity with child or severe adult hunger evident.
Assignment of household food security status using item Q47 as this cutpoint, however, led to anomalous results due to the different numbers of items presented to households with and without children. This anomaly was avoided by choosing item Q28, which appears at virtually the same severity level as item Q47 in Exhibit 4-3, as the cutpoint item indicating the transition from food insecurity with adult hunger evident into food insecurity with child and severe adult hunger evident.

In modal households with children responding affirmatively to item Q28, two items related to reduction of food intake among children receive "yes" answers: item Q57 (children were not eating enough) and item Q40 (children had meal size cut). Moreover, respondents in all household types respond affirmatively to Q35, Q38, and Q28, indicating that adults in the households "were hungry but did not eat because they couldn't afford food," "lost weight because there wasn't enough food," and did "not eat for a whole day because there wasn't enough money for food." Affirmative responses to these items, taken together with affirmative responses to all less severe items, appear to provide clear and strong evidence of child hunger and severe adult hunger.

4.6 SUMMARY

The primary task of the food security measurement study was to identify, test, and develop a unidimensional measure of food insecurity and hunger based on the CPS food security data, if a statistically strong and sound measure of this kind could be found. The Rasch measurement method was successful in producing a unidimensional, continuous-variable measure of severity of food insecurity and hunger from the CPS data that met these requirements.
The second task of the project, which was dependent upon the success of the underlying continuous measure, was to develop a categorical-variable measure of several designated ranges of severity of food insecurity, and to classify households into these designated severity ranges or categories, as follows:

• food secure
• food insecure with hunger not evident
• food insecure with moderate hunger
• food insecure with severe hunger

The conceptual construct for these designated ranges of severity was drawn from the AIN/LSRO conceptual definitions of food insecurity and hunger, from other prior research on food security measurement, and from limiting the measurement effort to one of the central elements of the broad food security concept that is amenable to direct measurement, the direct household experience of insufficient food to meet basic needs. Other elements of the broad conceptual definition, such as safety of food, actual nutritional adequacy of diets, and social acceptability of food acquisition, are not encompassed in the present measure of severity of food insecurity.

The categorical measure of food security status depends on classifying households into identifiable ranges of severity on the underlying continuous severity measure. The aim in identifying or selecting the appropriate ranges of severity on the continuous measure was to achieve acceptably close correspondence to the conceptual bases of the designated broad food security status categories described above. The operational means of establishing the several severity ranges was to select the most appropriate indicator items from among those available in the continuous measurement scale to identify, or define operationally, the classification boundaries, or thresholds, separating each designated severity range category from the next.
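The threshold logic summarized above can be illustrated with a minimal sketch. It assumes a modal household (one that affirms every item less severe than its most severe affirmed item), so that status follows from the most severe threshold item affirmed: Q55, Q25, or Q28, as selected in Sections 4.3 to 4.5. The function names are hypothetical, the severity order passed to the modal check is limited to the four least severe items named in the text, and the report's actual classification works from ranges of scale values rather than from individual item responses.

```python
# Threshold items chosen in Sections 4.3 to 4.5, most severe first.
# Category labels follow the list in Section 4.6.
THRESHOLDS = [
    ("Q28", "food insecure with severe hunger"),
    ("Q25", "food insecure with moderate hunger"),
    ("Q55", "food insecure with hunger not evident"),
]


def is_modal(affirmed, severity_order):
    """Return True if the affirmed items are exactly the least severe items.

    severity_order lists item labels from least to most severe; it is an
    input supplied by the caller (an assumption of this sketch).
    """
    affirmed = set(affirmed)
    return affirmed == set(severity_order[:len(affirmed)])


def status_for_modal_household(affirmed):
    """Illustrative status for a modal household (hypothetical helper)."""
    for item, label in THRESHOLDS:
        if item in affirmed:
            return label
    return "food secure"


# Footnote 6 example: affirming Q53, Q54, and Q58 while denying Q55
# is a non-modal pattern.
order = ["Q53", "Q54", "Q55", "Q58"]
print(is_modal({"Q53", "Q54", "Q55"}, order))
print(is_modal({"Q53", "Q54", "Q58"}, order))
print(status_for_modal_household({"Q53", "Q54"}))
print(status_for_modal_household({"Q53", "Q54", "Q55"}))
```

All three threshold items are household- or adult-level questions, so the sketch applies equally to households with and without children; Section 4.5 reports that Q28 was preferred to the child item Q47 to avoid anomalies between these two household types.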
Selecting the classification boundaries involved judgment as to which items best reflect the transition from one broad range or category of severity to the next. Identification of the threshold items and their associated scale cutpoint scores for each level of the categorical food security status variable involved use of statistical results from the Rasch model, guided by the LSRO/AIN conceptual definitions of hunger and the results of previous research in the areas of physiology, clinical nutrition, and food security measurement. Team members combined these factors to select thresholds or cutpoint items that are most consistent with the statistical results, empirical evidence, and the conceptual framework representing the predominant understanding of food insecurity and hunger within the nutrition science community.

CHAPTER FIVE
THE RESOURCE AUGMENTATION QUESTIONS

In fitting the model for the 12-month food security scale, one group of questions was conspicuously not included because they did not meet the statistical criteria for inclusion in the scale. These questions involve actions that households might take to deal with a problem of constrained food resources, and specifically actions other than reducing food intake or otherwise modifying the internal household management of food resources. The questions refer to actions such as putting off other bills in order to buy food, or obtaining meals from soup kitchens. The class of actions has variously been termed "coping" or "resource augmentation" behaviors.

Because resource augmentation behaviors are pertinent to one dimension of the LSRO/AIN definition of food insecurity (the ability to acquire food in "socially acceptable ways"), the research team considered it important to explore the possibility of supplementing the primary food security scale with some composite based on the resource augmentation questions.
For example, the food security status variable, rather than simply being based on a subdivision of the primary scale, might also take into account the household's value on the resource augmentation composite. Ultimately it was concluded that, although such a composite might be useful for some researchers in particular situations, it does not add significant value to the food security status variable. This chapter reviews the conceptual underpinnings of the effort to construct a composite, the procedures that were implemented, and the likely effect of using a composite such as that described.

5.1 TWO DIMENSIONS OF FOOD INSECURITY

The LSRO/AIN conceptual definition of food insecurity includes several diverse aspects or dimensions of households' food situations, of which only one central element (the direct experience of insufficient food to meet basic needs) is captured in the measure developed from the CPS food security data.

Households can, however, be food insecure either because they are unable to obtain enough food (for discussion, call this food insecurity "type A"), or because they have to resort to socially unacceptable ways of obtaining food (call this "type B"). They may also be food insecure for both these reasons. That is, they may resort to socially unacceptable ways of obtaining food and still not obtain access to sufficient food (call this "type A&B").

Because resource-constrained hunger is understood to be nested within food insecurity, it will not occur in a household unless that household is food insecure. If a household is food insecure type A (unable to obtain enough food) at a sufficient level of severity, then hunger may result. Likewise, if a household is food insecure type A&B, hunger may still emerge, despite the household's efforts to augment its available food through various coping measures.
If a household's food insecurity is limited to type B only, however, the presence of basic food insufficiency and hunger within the household cannot be inferred from this information. This relationship is illustrated in Exhibit 5-1.

Exhibit 5-1
ILLUSTRATION OF ROLE OF RESOURCE AUGMENTATION BEHAVIORS

Food Availability | | Mode of Acquisition | Food Security Status
Sufficient food available | AND | Socially acceptable acquisition | Food secure
Limited or uncertain availability (anxiety, adjustments to budget management, adjustments to food quality) | OR | Resource augmentation via socially unacceptable means | Food insecure with hunger not evident
Severely limited availability (reduced food intake and other indicators) | | | Food insecure with evidence of hunger

The availability of sufficient foods to meet basic needs (food insecurity type A). This dimension is well represented in the final unidimensional 12-month scale. As described in the previous chapter, scale development activities demonstrated that it is possible to define a range of values on this scale that can be used to classify households as "food insecure" on the basis of limited availability of foods relative to household need, operationally indicated by a pattern of anxiety about the adequacy of the household's food supply, and deterioration in the quality and quantity of food available in the household.

The ability to acquire foods in socially acceptable ways, or via normal channels (food insecurity type B). The scale development models employed do not capture this dimension. Using the final 12-month scale to classify households as food insecure leaves open the possibility that some households relying on extraordinary coping methods to acquire food in socially unacceptable ways will be classified as food secure.
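The two-dimension logic of Exhibit 5-1 can be sketched as a small decision function. This is only an illustration of the conceptual table, assuming hypothetical function names and a three-level coding of food availability; it is not a procedure from the report. Comparing the two functions shows the gap described above: a household that is food insecure only in the type B sense is classified as food secure when the acquisition dimension is left out.

```python
AVAILABILITY_LEVELS = ("sufficient", "limited", "severely_limited")


def conceptual_status(availability, acceptable_acquisition):
    """Status under Exhibit 5-1, using both dimensions.

    availability: one of AVAILABILITY_LEVELS (coding assumed for this sketch)
    acceptable_acquisition: True if food is obtained only in socially
        acceptable ways, False if resource augmentation via socially
        unacceptable means occurs
    """
    if availability not in AVAILABILITY_LEVELS:
        raise ValueError(f"unknown availability level: {availability!r}")
    if availability == "severely_limited":
        return "food insecure with evidence of hunger"
    if availability == "limited" or not acceptable_acquisition:
        return "food insecure with hunger not evident"
    return "food secure"


def scale_only_status(availability):
    """Status when only the availability dimension (type A) is measured."""
    return conceptual_status(availability, acceptable_acquisition=True)


# A "type B only" household: enough food, obtained by unacceptable means.
print(conceptual_status("sufficient", acceptable_acquisition=False))
print(scale_only_status("sufficient"))
```

The second function ignores the mode of acquisition entirely, which mirrors a scale built only from the food sufficiency items.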
This situation emerges because the items in the CPS Food Security data that address this latter dimension of food insecurity do not fit the measurement models leading to the final 12-month scale. Two sets of items ask questions that provide indications of whether households obtained food in ways that might be considered socially unacceptable. One set of items asks whether households undertook actions to augment their food supply or other household resources within the previous 12 months. These items are summarized in Exhibit 5-2.

Exhibit 5-2
RESOURCE AUGMENTATION ITEMS IN THE FOOD SECURITY SURVEY INSTRUMENT

Item Label | Item Summary/Description
Q18 | "get food or borrow money for food from family or friends?"
Q19 | "send or take children to the homes of friends or relatives for a meal?"
Q21 | "put off paying a bill so you would have money to buy food?"
Q22 | "get emergency food from a church, food pantry, or food bank?"
Q23 | "eat meals at a soup kitchen?"

A second set of items asks whether members of the household obtained food through federal food assistance programs. These programs include food stamps, elderly feeding programs, the child and adult care feeding program, school feeding programs, and WIC. There are two strong arguments, however, for not using these items to classify households as food insecure.

First, participation in such programs may not be considered "socially unacceptable" by many of the participants. There is some evidence to that effect, although this point has not been adequately researched (Trippe and Beebout, 1988; Fraker, 1990; Radimer, Olson and Campbell, 1990; Trippe, Doyle and Asher, 1992; Olson, Frongillo and Kendall, 1995). Second, there is a problem of logical circularity that could diminish the usefulness of the food insecurity measures for policy considerations.
The food insecurity measures are potentially useful in helping policy makers assess the need for government food assistance programs. Including program participation in the food insecurity measures, however, permits the following potentially perverse result: If the government makes programs more available (for example, by increasing the income eligibility threshold for free school lunches, or food stamps), more people will participate and the experienced level of food insecurity would be expected to decline. The measured level, however, may either decline or increase, depending on how the participation indicator interacts with other indicators of the condition. Conversely, if the government cuts back on programs, participation will decline and the effect of the participation indicator may cause the measured level of food insecurity to go down (i.e., the food insecurity problem can be "solved" by taking away the programs). Because of this situation, participation in government food assistance programs was not included in the candidate pool of items for a resource-augmentation index. For the classification of households as food insecure to be more fully consistent with the LSRO/AIN definitions, there would need to be a way to include information on food acquisition through ways that are not socially acceptable (non-normal channels). An important part of the indicator items used in earlier efforts to develop measures of food insecurity and hunger reflect actions or behaviors undertaken by household food managers to avoid or ameliorate hunger when food or financial resou
Title | Household food security in the United States in 1995 technical report of the food security measurement project |
Date | 1997 |
Contributors (individual) |
Bickel, Gary W. Hamilton, William L. |
Contributors (group) | United States Dept. of Agriculture Food and Consumer Service Office of Analysis and Evaluation. |
Subject headings |
Food consumption--United States Food supply--United States |
Type | Text |
Format | Pamphlets |
Physical description | 95, [20] p. |
Publisher | [Washington, D.C.] : U.S. Dept. of Agriculture, Food and Consumer Service, Office of Analysis and Evaluation, |
Language | en |
Contributing institution | Martha Blakeney Hodges Special Collections and University Archives, UNCG University Libraries |
Source collection | Government Documents Collection (UNCG University Libraries) |
Rights statement | http://rightsstatements.org/vocab/NoC-US/1.0/ |
Additional rights information | NO COPYRIGHT - UNITED STATES. This item has been determined to be free of copyright restrictions in the United States. The user is responsible for determining actual copyright status for any reuse of the material. |
SUDOC number | A 98.2:H 81/2/TECH.RPT. |
Digital publisher | The University of North Carolina at Greensboro, University Libraries, PO Box 26170, Greensboro NC 27402-6170, 336.334.5304 |
Full-text | USDA
Measuring Food Security in the United States
United States Department of Agriculture, Food and Consumer Service, Office of Analysis and Evaluation

Household Food Security in the United States in 1995
Technical Report of the Food Security Measurement Project
September 1997

Prepared for: Gary W. Bickel, Project Officer, U.S. Department of Agriculture, Food and Consumer Service, 3101 Park Center Drive, Alexandria, VA 22302, under contract no. 53-3198-5-028

Prepared by: William L. Hamilton, Project Director (a); John T. Cook, Principal Investigator (b); William W. Thompson; Lawrence F. Buron (a); Edward A. Frongillo, Jr. (c); Christine M. Olson (c); Cheryl A. Wehler (d)
(a) Abt Associates, Inc.
(b) Tufts University Center on Hunger, Poverty, and Nutrition Policy
(c) Cornell University Division of Nutritional Sciences
(d) C.A.W. and Associates

TABLE OF CONTENTS

Chapter One INTRODUCTION
Chapter Two METHODS AND RESULTS OF FITTING LINEAR AND NON-LINEAR FACTOR ANALYSIS MODELS TO CPS DATA
2.1 Preliminary Linear Factor Analysis
2.2 Exploratory Two-Parameter Non-linear Factor Analysis Model
2.3 Unidimensional One-Parameter Non-linear Factor Analysis Models
2.4 Summary
Chapter Three RELIABILITY ESTIMATES FOR THE FOOD SECURITY SCALES
3.1 Spearman-Brown Split-half Reliability Estimates
3.2 Rulon's Split-Half Reliability Estimates
3.3 Cronbach's Alpha Reliability Estimates
3.4 Rasch Model Reliability Estimates
3.5 Reliability in Identifying Cases with No Food Insecurity Problems
3.6 Summary
Chapter Four DEFINING RANGES OF THE FOOD SECURITY SCALE
4.1 Conceptual Basis for a Categorical Food Security Status Variable
4.2 Defining Ranges and Selecting Scale Cutpoints
4.3 Evidence of Food Insecurity
4.4 Subjective Reporting of Hunger
4.5 Evidence of Child Hunger and Severe Adult Hunger
4.6 Summary
Chapter Five THE RESOURCE AUGMENTATION QUESTIONS
5.1 Two Dimensions of Food Insecurity
5.2 The Composite Resource Augmentation Index
5.3 Effects of Using the Composite Resource Augmentation Index
5.4 Summary
Chapter Six EXTERNAL CONSTRUCT VALIDATION OF THE FOOD SECURITY MEASURES
6.1 Relationship of Construct Validation Items to Food Security
6.2 Weekly Food Expenditures per Household Member
6.3 Household Income
6.4 Food Sufficiency
6.5 Summary
Chapter Seven PROCEDURES FOR CALCULATING STANDARD ERRORS FOR FOOD SECURITY PREVALENCE ESTIMATES
7.1 CPS Sample Design
7.2 Adjustment Factor for Between-PSU Variance
7.3 Estimation of Within-PSU Variance
7.4 Calculation of the Standard Errors
Chapter Eight POTENTIAL SOURCES OF BIAS IN PREVALENCE ESTIMATES
8.1 Screening Bias
8.2 Response Bias
8.3 Random Error in Survey Responses
8.4 Summary
REFERENCES
Appendix A REVIEW OF LITERATURE FROM PHYSIOLOGY AND CLINICAL NUTRITION RESEARCH ADDRESSING THE NATURE OF HUNGER
Appendix B PREVALENCE OF HOUSEHOLD FOOD SECURITY STATUS (30-DAY SCALE)
Appendix C PARTICIPANTS IN FEDERAL INTERAGENCY WORKING GROUP FOR FOOD SECURITY MEASUREMENT

CHAPTER ONE
INTRODUCTION

In April 1995, the U.S. Bureau of the Census conducted the first collection of comprehensive food security data as a supplement to its regular Current Population Survey (CPS). With about 45,000 household interviews, this survey is the first to collect the special data needed to measure food insecurity and hunger in a nationally-representative sample of U.S. households. The Food and Consumer Service (FCS) of the U.S. Department of Agriculture led the effort to develop the Food Security Supplement to the CPS, building on research conducted at universities and elsewhere over the past decade.

After the survey was conducted, the next step was to analyze the data to create measurement scales that gauge households' levels of severity of food insecurity and hunger. FCS contracted with Abt Associates Inc. and three subcontractors (the Tufts University Center on Hunger, Poverty, and Nutrition Policy; the Cornell University Division of Nutritional Sciences; and CAW and Associates) to carry out the scale construction analysis.
The results of that analysis are presented in Household Food Security in the United States in 1995: Summary Report of the Food Security Measurement Project, to which this report is a companion volume. The purpose of this report is to describe the analyses through which the food security scales and food security status variable were developed, as well as related tests of the reliability and validity of these measures.

Two scales were developed to measure the degree of food insecurity and hunger in American households. One measures food insecurity and hunger over the 12 months prior to the survey interview, and the second measures these conditions in the 30 days immediately preceding the interview. After a number of exploratory analyses, a type of non-linear factor analysis known as a Rasch model was used to form the scales. This methodology and the procedures through which it was applied are described in Chapter Two.

The two scales were subjected to a variety of tests of reliability, including tests specific to the Rasch model and more traditional tests commonly used with scales developed through linear factor analysis. The results, presented in Chapter Three, generally indicate good reliability for the 12-month scale. The 30-day scale, because it is based on a smaller number of questions and provides detailed measurement for a narrower portion of the food insecurity spectrum, has somewhat lower reliability.

The two scales serve as the basis for defining two corresponding food security status variables. The 12-month variable has four categories: (1) Food Secure; (2) Food Insecure with No Hunger Evident; (3) Food Insecure with Moderate Hunger Evident; and (4) Food Insecure with Severe Hunger Evident. The 30-day variable has three categories: (1) No Hunger Evident; (2) Food Insecure with Moderate Hunger Evident; and (3) Food Insecure with Severe Hunger Evident.
To classify households into the various categories, it was necessary to define ranges on the 12-month and 30-day scales that correspond to each category. The rationale for the range definitions is described in Chapter Four.

The food security scale and the food security status indicator represent a central dimension of food insecurity: availability of enough food for the household to meet basic needs. The concept of food insecurity has other dimensions, however, including the specification that households should be able to acquire food in socially acceptable ways. Because the CPS Supplement includes several indicators of "coping" or "resource augmentation" behaviors related to this dimension of food insecurity, the possibility was explored of supplementing the primary food security scale with an index of resource augmentation actions. The analysis, described in Chapter Five, suggests that such an index should not be used in classifying households' food security status at this time.

A key question for any new scale is how accurately it represents the condition it attempts to measure. Ideally, one would compare the food security scales and status variables to some more definitive measure or measures of food insecurity and hunger. Because no such definitive measure exists, the best way to judge the measure is to assess its relationship to other measures thought to be related to food insecurity and hunger, such as the household's level of food expenditures or its total income. Chapter Six presents the results of such analyses, which show relationships of the sort that would be expected with a valid measure of food insecurity and hunger.

The central purpose of the food security scales and the status variables is to assess the food security of the U.S. population and of subgroups within the population. Estimates of the prevalence of food insecurity and hunger are presented in the study's main report, based on the April 1995 data. Because these data come from a sample of households, prevalence estimates are subject to sampling error, and the report therefore presents estimated standard errors corresponding to the estimated prevalences. The estimation of standard errors is complicated by the multi-stage sampling design used by the CPS. Chapter Seven describes the methodology used in the estimation of standard errors.

Finally, Chapter Eight discusses the potential sources of bias in prevalence estimates that might result from the sample design of the CPS, from household response behaviors to the Food Security Supplement, and from the fact that only a small proportion of the population experiences food insecurity. The analysis indicates that the various potential sources of bias probably lead to quite small levels of estimation error in counterbalancing directions.

CHAPTER TWO METHODS AND RESULTS OF SCALING ANALYSIS OF CPS DATA

This section describes the rationale and the results of conducting preliminary linear factor analyses and subsequently fitting a series of non-linear factor analysis models to the CPS food security data. This latter analysis approach more accurately characterizes the covariation among items in the CPS data set than more traditional linear factor analysis models. Most items available for analysis in the CPS data set were severely skewed and dichotomous or categorical in nature. Therefore, a number of statistical assumptions were violated using the linear factor analysis methods with the CPS items, such as the assumption of normally distributed error variance. Such situations can be dealt with more appropriately using non-linear scaling techniques.
Item Response Theory (IRT) describes a general model that was developed by the educational testing industry to assist in creating valid and reliable aptitude tests, such as the Scholastic Aptitude Test (SAT) and the American College Testing Program (ACT) test. When applying a particular IRT model to data, the test designer usually assumes that the responses to a set of items can be accounted for by latent traits or factors that are fewer in number than the test items. The primary goal is to determine how an individual with a certain ability level will respond to an item associated with a particular difficulty level.

There are a number of alternative forms the IRT model can take, depending on the assumptions regarding how the underlying data were generated. The three most frequently discussed IRT models in the literature are (1) the three-parameter logistic model, (2) the two-parameter logistic model, and (3) the one-parameter logistic model. The three-parameter logistic IRT model is the most complex, and can include varying discrimination parameters, varying difficulty levels, and varying guessing parameters. Using the notation of Hambleton (1983),1 the three-parameter logistic model can be written as follows:

$$P_i(\theta_n) = c_i + (1 - c_i)\,\frac{e^{D a_i(\theta_n - b_i)}}{1 + e^{D a_i(\theta_n - b_i)}} \qquad (1)$$

where
$\theta_n$ = latent trait score of person n,
$a_i$ = item discrimination parameter for item i,
$b_i$ = item difficulty for item i,
$c_i$ = guessing parameter for item i,
n = person, and
i = item.

1 Hambleton, R.K. (ed.). Application of Item Response Theory. Vancouver: Educational Research Institute of British Columbia, 1983.

The two-parameter logistic model assumes that guessing does not occur, and therefore the guessing term is dropped from the model.
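The three-parameter form above can be sketched in a few lines of code. This is only an illustrative sketch, not the software used in the study: the function names are hypothetical, and the scaling constant D is set to 1.7, a conventional choice that this report does not state. Setting the guessing parameter to zero gives the two-parameter case just described.

```python
import math

D = 1.7  # assumed scaling constant; the report does not give its value


def p_3pl(theta, a, b, c=0.0):
    """Probability of a positive response under the three-parameter logistic model.

    theta: latent trait score of the person
    a:     item discrimination, b: item difficulty, c: guessing parameter
    """
    logistic = 1.0 / (1.0 + math.exp(-D * a * (theta - b)))
    return c + (1.0 - c) * logistic


def p_2pl(theta, a, b):
    """Two-parameter case: the guessing term is dropped (c = 0)."""
    return p_3pl(theta, a, b, c=0.0)


if __name__ == "__main__":
    # Hypothetical item: discrimination 1.0, difficulty 0.5, guessing 0.2.
    # When theta equals b the logistic term is one half, so the 3PL value
    # lies halfway between c and 1.
    print(round(p_3pl(0.5, a=1.0, b=0.5, c=0.2), 3))
    print(round(p_2pl(0.5, a=1.0, b=0.5), 3))
```

Writing the two-parameter function as a special case of the three-parameter one mirrors the nesting of the models in the text.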
The two-parameter logistic model can be expressed as follows:

$$P_i(\theta_n) = \frac{e^{D a_i(\theta_n - b_i)}}{1 + e^{D a_i(\theta_n - b_i)}} \qquad (2)$$

where
$\theta_n$ = latent trait score of person n,
$a_i$ = item discrimination parameter for item i,
$b_i$ = item difficulty for item i,
n = person, and
i = item.

Finally, the one-parameter logistic model is more straightforward than the two previous models, because it (1) has no guessing parameters, and (2) specifies that all items have the same discrimination parameter (a). That is, the slopes of the item-characteristic curves are constrained to be equal for all items. The model can be written as follows:

$$P_i(\theta_n) = \frac{e^{D a(\theta_n - b_i)}}{1 + e^{D a(\theta_n - b_i)}} \qquad (3)$$

where
$\theta_n$ = latent trait score of person n,
$a$ = average item discrimination parameter (common to all items),
$b_i$ = item difficulty for item i,
n = person, and
i = item.

Because D and a are constants in the model, the one-parameter logistic model can be written in a more simplified form:

$$P_i(\theta_n) = \frac{e^{(\theta_n - b_i)}}{1 + e^{(\theta_n - b_i)}} \qquad (3)$$

We can also express this model using the notation of Wright and Masters (1982), and include a threshold parameter that is associated with the rating scale model developed by Andrich (1978, 1979):

$$P_{nik} = \frac{e^{\beta_n - (\delta_i + \tau_k)}}{1 + e^{\beta_n - (\delta_i + \tau_k)}} \qquad (4)$$

where
$\beta_n$ = latent trait score of person n,
$\delta_i$ = item difficulty for item i,
$\tau_k$ = threshold parameter for step k of item i,
n = person,
i = item, and
k = step.

2.1 PRELIMINARY LINEAR FACTOR ANALYSIS

The CPS Food Security Supplement builds on a substantial amount of recent research on the measurement of food insecurity, some of which included scaling analysis.2 The first analytic step was to replicate some of the prior analyses to determine whether the general patterns and relationships in the data were similar to those seen in prior work. A series of linear factor analyses were fit to the CPS data.
One illustrative model, summarized in Exhibit 2-1, was fit for households with children (because this group was asked all questions in the Supplement). The factor model incorporated a Procrustes rotation, which allows one to rotate to a pre-specified factor solution, where the solution was specified to represent the dominant themes of the prior research. Fitting the factor analysis model resulted in three factors with eigenvalues greater than 1.0 prior to rotation (15.0, 1.6, and 1.4), with factor loadings as shown in the exhibit. The first factor includes primarily items related to child food intake reductions and hunger, the second consists mainly of household-level food insecurity items, and the third comprises mainly items related to adult food intake reduction and hunger.

In sum, the results generally confirmed that the response patterns in the CPS data were similar to those seen in prior research and that similar relationships might be expected to exist. In addition, the large positive factor intercorrelations suggested the possibility that non-linear factor analysis methods might result in the items loading onto a single factor (i.e., that the separation of factors could occur in part because of the limitations of linear factor analysis in handling low-frequency dichotomous items). Finally, exploratory analyses of groups of households without children suggested that, for those items applicable to all groups, the factors might be relatively invariant across groups.

2 Two key prior studies are Olson, Frongillo, and Kendall (1995), and Scott, Wehler, and Anderson (1995). The first study estimated a factor analysis model including four items from the Community Childhood Hunger Identification Project (CCHIP) and ten items from two previous Cornell surveys. The analysis identified two key factors, one associated with household-level food insecurity and one associated with hunger. The second study, analyzing data from multiple CCHIP studies, found a first factor comprising mainly household-level food insecurity items and adult hunger items, whereas the second factor included mainly child hunger items.

Exhibit 2-1
SUMMARY OF FACTOR LOADINGS FOR LINEAR FACTOR ANALYSIS MODEL (n=2,991)

Item    Standardized Regression Coefficient (F1, F2, or F3)
Q11     38
Q15     59
Q16     63
Q20     52
Q24     45
Q28     52
Q32     47
Q35     48
Q38     43
Q40     50
Q43     42
Q47     60
Q50     40
Q53     78
Q54
Q55     78
Q56     73
Q57     49
Q58     75

2.2 EXPLORATORY TWO-PARAMETER NON-LINEAR FACTOR ANALYSIS MODEL

Initially, we fit a series of exploratory non-linear factor analysis models to determine the dimensionality of the Food Security Survey items.3 From these alternative models, we selected one representative non-linear model, labeled M121, which best describes the consistent findings across the various alternative models. M121 was fit as a two-parameter logistic model that included estimates for both factor loadings (discrimination parameters) and uniquenesses (error term).4

Descriptive statistics for the subsample of 994 subjects and 21 items are presented in Exhibit 2-2. The items ranged in proportion of positive responses from .850 (item 15) to .004 (item 50), where the higher the proportion, the lower the severity of food insecurity indicated by the particular item.

The results of the non-linear factor analysis model are presented in Exhibit 2-3. The primary fit statistic, the root mean square residual (RMSR), suggested that the one-factor model adequately fit the data (RMSR = .0074). That is, the RMSR was well within the acceptable range with a single factor, and was not materially improved by adding further factors, making the single-factor model the most parsimonious solution.
As with the linear factor analysis model, items 15 and 23 were poor-fitting, with low factor loadings (.31 and .22, respectively). Item 22 had a moderately positive factor loading (L = .43), whereas the rest of the items all had large positive loadings above .50. The findings support the linear factor analysis results with respect to item fits, but suggest that items 15 and 23 should be removed from subsequent models.

3 Exploratory non-linear factor analysis models were fit using two software packages: LISCOMP and NOHARM. LISCOMP is a structural equation modeling program that is designed to work with dichotomous and/or ordinal data. NOHARM is a non-linear factor analysis program that analyzes moment matrices. Both programs allow one to fit a two-parameter item response theory model (non-linear factor analysis model) to the data. Exploratory analysis focused on households with children in random 25 percent subsamples of the Food Security Supplement sample. Households that did not pass the series of screening questions (i.e., higher-income households with no indication of food insecurity), and consequently were not asked the full series of food insecurity and hunger questions, were excluded from the analysis.

4 The two-parameter model can be fit with either item difficulty or uniqueness as the second parameter. The specification shown here chose the uniqueness parameter.
Exhibit 2-2
DESCRIPTIVE STATISTICS FOR MODEL M121

Variable   Mean    Std    Sum
Q11        .231   .421    231
Q15        .850   .356    850
Q16        .450   .497    450
Q18        .325   .468    325
Q19        .095   .293     95
Q20        .274   .446    274
Q21        .585   .492    585
Q22        .122   .327    122
Q23        .016   .125     16
Q24        .244   .429    244
Q28        .054   .226     54
Q32        .233   .423    233
Q35        .123   .328    123
Q38        .047   .211     47
Q40        .048   .213     48
Q43        .023   .150     23
Q47        .049   .216     49
Q50        .004   .063      4
Q53        .600   .490    600
Q54        .434   .495    434
Q55        .398   .489    398
Q56        .267   .442    267
Q57        .137   .344    137
Q58        .377   .484    377

Exhibit 2-3
SUMMARY OF FACTOR LOADINGS FOR MODEL M121

Standardized Regression Coefficients

Item   F1   Item Label
Q11    70   General food sufficiency question
Q15    31   Try to make food or money go further
Q16    70   Run out of foods needed to make meal
Q18    56   Borrow food or money to make meal
Q19    68   Take child to other home for meal
Q20    73   Serve few low-cost foods several days in a row
Q21    51   Put off paying bills to buy food
Q22    43   Get emergency food from church or food bank
Q23    22   Eat meal at soup kitchen
Q24    89   Adults cut or skip meals because not enough money for food
Q28    79   Adults don't eat for whole day
Q32    88   Eat less than should because not enough money to buy food
Q35    85   Hungry but don't eat because can't afford to
Q38    75   Lost weight because not enough food
Q40    76   Child's meal size cut because not enough money for food
Q43    60   Child skip meal because not enough money for food
Q47    80   Child hungry but can't afford more food
Q50    71   Child did not eat for a whole day
Q53    79   Worry food will run out before getting money for more
Q54    89   Food doesn't last and don't have money to get more
Q55    88   Can't afford to eat balanced meals
Q56    85   Can't feed children a balanced meal
Q57    83   Child not eating enough because can't afford more food
Q58    82   Child fed only few low-cost foods, running out of money
2.3 UNIDIMENSIONAL ONE-PARAMETER NON-LINEAR FACTOR ANALYSIS MODELS

The exploratory non-linear factor analysis models indicated that the Food Security Survey items could be described efficiently as a unidimensional construct. Therefore, we pursued a specific non-linear factor model called the Rasch model. The Rasch model is a concise one-factor model that constrains the discrimination parameters (factor loadings) to be equal across all items. The statistical constraints of the Rasch model result in several desirable properties for the measurement scale, especially its robustness across multiple samples and multiple variations of the test (Wright and Masters, 1982). Furthermore, the preliminary exploratory models indicated that most of the items had very similar discrimination parameters when the discrimination parameters were allowed to vary.5

The computer program BIGSTEPS was designed specifically to fit the unidimensional Rasch model. All subsequent models described in this section were fit using BIGSTEPS.

Five alternative measurement models based on existing theoretical frameworks were generated for the Food Security Survey items. The five alternative models are summarized in Exhibit 2-4. For most of the models, the items were divided into two subsets based on the specific time frame that the items referenced. For models R101, R102, and R103, the first subset of items references behaviors and events that occurred in the last 12 months, whereas the second subset references behaviors and events that occurred in the last 30 days. Models were fit separately for the 12-month and 30-day time periods.

5 Note in Exhibit 2-3 that nearly all factor loadings fall in the fairly narrow range from 70 to 88. The questions with loadings substantially outside this range (Q15, Q18, Q21, Q22, Q23) are all ultimately excluded from the scale.

Exhibit 2-4
ALTERNATIVE NON-LINEAR FACTOR ANALYSIS MODELS

Model R101
  12-Month Scale: Scale includes items that referenced events that occurred in the last 12 months. Items 15, 16, 18, 19, 20, 21, 22, 23, 24, 25, 28, 29, 32, 35, 38, 40, 43, 44, 47, 50, 53b, 54b, 55b, 56b, 57b, 58b.
  30-Day Scale: Scale includes items that referenced events that occurred in the last 30 days. Items 17, 26, 27, 30, 31, 33, 34, 36, 37, 39, 41, 42, 45, 46, 48, 49, 51, 52.

Model R102
  12-Month Scale: Scale includes items that referenced events that occurred in the last 12 months, and excludes resource augmenting behaviors (18, 19, 21, 22, and 23). Items 15, 16, 20, 24, 25, 28, 29, 32, 35, 38, 40, 43, 44, 47, 50, 53b, 54b, 55b, 56b, 57b, 58b.
  30-Day Scale: Scale includes items that referenced events that occurred in the last 30 days, and excludes resource augmenting behaviors. Items 17, 26, 27, 30, 31, 33, 34, 36, 37, 39, 41, 42, 45, 46, 48, 49, 51, 52.

Model R103
  12-Month Scale: Scale includes food insecurity items based on the CCHIP model. Items 15, 18, 19, 20, 21, 22, 23, 53a, 55a, 56a, 58a.
  30-Day Scale: Scale includes food insufficiency and hunger items based on the CCHIP model. Items 16, 17, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 54a, 57a.

Model R104
  12-Month Scale: NA
  30-Day Scale: Scale includes items that reference events that occurred in the last 30 days. When no 30-day reference was available, items that referenced the last 12-month period are included. Items 15, 17, 18, 19, 20, 21, 22, 23, 26, 27, 30, 31, 33, 34, 36, 37, 39, 41, 42, 45, 46, 48, 49, 51, 52, 53a, 54a, 55a, 56a, 57a, 58a.

Model R105
  12-Month Scale: NA
  30-Day Scale: Scale includes items that referenced 30-day period and number of days in the last month. Also includes items that reference "often true" in the last 12 months. Items 17, 26, 27, 30, 31, 33, 34, 36, 37, 39, 41, 42, 45, 46, 48, 49, 51, 52, 53a, 54a, 55a, 56a, 57a, 58a.

NOTES: (1) For items that referenced number of days, one dummy code was created based on whether the behavior or experience occurred five or more times in the last month. (2) For items that referenced number of months, one dummy code was created by combining the two more extreme categories of the variable, indicating the experience occurred in three or more of the past 12 months. (3) For items Q53 through Q58, 'a' denotes a dummy code that represents 'often true,' whereas 'b' denotes a dummy code that combines 'sometimes true' and 'often true.'

A general summary of item fits for the alternative models is presented in Exhibit 2-5.

Exhibit 2-5
SUMMARY OF RESULTS FROM ALTERNATIVE NON-LINEAR FACTOR ANALYSIS MODELS

         12-Month Scale                                30-Day Scale
Model    Poorly Fitting Items      Redundant Items     Poorly Fitting Items   Redundant Items
R101     Q21, Q18, Q15, Q22        Q54b                Q17                    No redundant items
R102     Q15, Q16, Q20             No redundant items  Q17                    No redundant items
R103     No poorly fitting items   No redundant items  Q16, Q17, Q43          Q26
R104     NA                        NA                  Q22, Q23               Q33
R105     NA                        NA                  Q58a, Q17              No redundant items

The identification of poorly fitting items and/or redundant items is based on item in-fit and out-fit statistics. The out-fit statistic, $u_i$, is an unweighted fit statistic. It is based on a standardized residual, written as:

$$z_{ni} = \frac{y_{ni}}{\sqrt{W_{ni}}}$$

where $y_{ni}$ is the score residual for household n on item i, and $W_{ni}$ is the variance of the score residual. The standardized residual is then squared and averaged to obtain a mean estimate of item fit:

$$u_i = \frac{\sum_{n=1}^{N} z_{ni}^2}{N}$$

The in-fit statistic, $v_i$, is a weighted fit statistic that includes the same squared standardized residual as $u_i$, and is written as:

$$v_i = \frac{\sum_{n=1}^{N} W_{ni}\, z_{ni}^2}{\sum_{n=1}^{N} W_{ni}}$$

Both the in-fit and out-fit statistics have an expected value of 1.0. As they deviate from 1.0, the associated items become candidates for removal from the scale.
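The in-fit and out-fit mean squares for a single dichotomous item can be sketched as follows. This is a minimal illustration under stated assumptions, not the BIGSTEPS implementation: household scale values and the item calibration are taken as already estimated, the function names are hypothetical, and the flagging rule uses the 1.20 and .80 cutoffs applied in this chapter.

```python
import math


def rasch_p(theta, delta):
    """Rasch probability of an affirmative response."""
    return 1.0 / (1.0 + math.exp(-(theta - delta)))


def item_fit(responses, thetas, delta):
    """Return (in-fit, out-fit) mean squares for one dichotomous item.

    responses: 0/1 answers, one per household
    thetas:    household scale values (assumed already estimated)
    delta:     item calibration (assumed already estimated)
    """
    sum_sq_resid = 0.0   # sum of squared score residuals, y^2
    sum_variance = 0.0   # sum of residual variances, W
    sum_z_sq = 0.0       # sum of squared standardized residuals, z^2
    for x, theta in zip(responses, thetas):
        p = rasch_p(theta, delta)
        w = p * (1.0 - p)        # variance of the score residual
        y = x - p                # score residual
        sum_sq_resid += y * y
        sum_variance += w
        sum_z_sq += y * y / w
    outfit = sum_z_sq / len(responses)
    infit = sum_sq_resid / sum_variance  # same as sum(W * z^2) / sum(W)
    return infit, outfit


def flag(infit, outfit, high=1.20, low=0.80):
    """Classify an item using both mean squares together."""
    if infit > high and outfit > high:
        return "poorly fitting"
    if infit < low and outfit < low:
        return "redundant"
    return "acceptable"


if __name__ == "__main__":
    # Hypothetical data: four households located exactly at the item calibration.
    infit, outfit = item_fit([1, 0, 1, 0], [0.0, 0.0, 0.0, 0.0], 0.0)
    print(infit, outfit, flag(infit, outfit))
```

Because the out-fit statistic divides each squared residual by its own variance, a single unexpected answer from a household far from the item's calibration can inflate it sharply, while the in-fit statistic weights such households less.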
Generally speaking, a mean square fit statistic greater than 1.20 indicates a poorly fitting item, whereas a mean square fit statistic less than .80 indicates that an item is redundant with other similar types of items in the scale. Items that have both an in-fit and out-fit statistic above 1.20 are targeted for removal from the scale. Items with both in-fit and out-fit statistics below .80 are redundant with respect to the information they share with other items in the scale. Items that were shown to be redundant were also considered for removal and/or combination with other items. Below we focus on describing the results of the 12-month and 30-day scale for R102, because these two specific models were subsequently considered the most parsimonious by the study team.

12-Month Food Security Scale

As with the linear factor analysis models, all Rasch models were initially tested using only households with children, because they comprised the subsample of households that were administered the entire set of food security items. The results for Model R102 are presented in Exhibit 2-6. The summary table contains a large amount of information, briefly described below.

The order of items in the table is determined by their item calibration, shown in the fourth column of Exhibit 2-6. A question's item calibration represents the point on the scale at which there is a 50 percent probability that any given household will respond "yes" to the item. That is, households with higher values on the scale than a particular item's calibration score have a greater than 50 percent probability of answering that item positively; households with lower values have a less than 50 percent probability of a positive response to the item in question. The items are listed from high calibration at the top of the table to low calibration at the bottom.
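The 50 percent interpretation of an item calibration follows directly from the one-parameter logistic form. The short sketch below illustrates it with a hypothetical calibration of 2.0; the function name and the values are assumptions chosen for illustration only.

```python
import math


def p_affirm(household_value, item_calibration):
    """Rasch probability that a household answers an item affirmatively."""
    return 1.0 / (1.0 + math.exp(-(household_value - item_calibration)))


if __name__ == "__main__":
    calibration = 2.0  # hypothetical item calibration
    # One household below, one at, and one above the item's calibration.
    for value in (0.0, 2.0, 4.0):
        print(value, round(p_affirm(value, calibration), 3))
```

A household located exactly at the calibration has a probability of one half; the probability falls below one half for households lower on the scale and rises above it for households higher on the scale.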
The item calibration is a function of (1) the total number of individuals that have responded to any item in the scale (1,687); (2) the number of individuals that responded to the particular item in the scale (n); and (3) the number of positive responses to the particular item (raw score). For example, item 50 refers to the item "child did not eat for a whole day." The item has an item calibration of 4.56, which is the highest in the table. This event occurs rarely in any household. For this specific subsample, this event occurred for only 12 of the 1,684 households that responded to the item. At the other end of the scale, item 15 ("run short of money and try to make food or food money go further") is the least severe item included in the analysis. The item has the low calibration of -5.74, based on 1,469 positive responses out of the 1,686 households that answered the question.

The column headed "Real SE" shows the standard error of the items, which can be used to create a confidence interval for the item calibration. Items located at the severe end of the scale tend to have the largest standard errors, because they tend to have larger variances compared to items throughout the center and less-severe end of the distribution.

Exhibit 2-6
SUMMARY OF MODEL R102A

Item      n    Raw         Item  Real   In-fit  In-fit  Out-fit  Out-fit  Point Biserial
             Score  Calibration    SE  Mean Sq       Z  Mean Sq        Z  Corr.
Q50   1,684     12         4.81   .30      .99     0.0      .28     -0.4    .19
Q44   1,684     23         4.01   .22     1.00     0.0      .41     -0.5    .24
Q43   1,684     38         3.36   .18     1.04     0.3     1.73      0.5    .28
Q29   1,683     62         2.68   .14      .89     -1.      .28     -1.3    .39
Q40   1,683     86         2.21   .13     1.01     0.1     1.99      1.2    .40
Q47   1,684     89         2.15   .12      .88    -1.5      .56     -0.8    .44
Q38   1,683     91         2.12   .13     1.07     0.8      .46     -1.1    .40
Q28   1,684     95         2.06   .12      .95    -0.6      .41     -1.3    .44
Q35   1,685    212          .65   .09      .91    -1.6      .83     -0.6    .57
Q57   1,680    246          .36   .09     1.00     0.1      .60     -1.8    .57
Q25   1,683    293         -.01   .08      .94    -1.3      .56     -2.4    .61
Q32   1,683    442         -.98   .07      .94    -1.5      .67     -2.7    .64
Q24   1,685    449        -1.01   .07      .86    -3.5      .67     -2.8    .67
Q56   1,679    466        -1.12   .07     1.04     0.9      .75     -2.1    .61
Q20   1,686    480        -1.19   .08     1.24     5.5     1.50      3.5    .52
Q58   1,680    671        -2.18   .07      .99    -0.4      .96     -0.4    .60
Q55   1,678    706        -2.36   .07      .87    -3.6      .68     -3.5    .64
Q54   1,679    785        -2.73   .06      .82    -5.2      .74     -2.5    .64
Q16   1,687    795        -2.77   .07     1.23     5.9     1.22      1.9    .50
Q53   1,680  1,066        -4.01   .06      .95    -1.6      .85     -0.8    .49
Q15   1,686  1,469        -6.06   .09     1.31     6.7     7.70      5.5
Mean  1,683    408          .00   .11     1.00    -0.1     1.14     -0.6
SD        2    382         2.82   .06      .13     2.9     1.53      2.1

NOTE: Sample includes households with children only. Items are ordered in terms of severity.

For the 12-month scale presented in Exhibit 2-6, there are three items with both in-fit and out-fit statistics that exceed 1.20 (Q15, Q16, and Q20). Therefore, these three items were removed from the scale, and the model re-estimated. The results of the revised model are presented in Exhibit 2-7. The effective sample size for the revised model is reduced (n = 1,276) because two of the least severe items were removed from the analysis. This results in fewer subjects who have responded yes to any particular item.

For the revised model, there are no items with both in-fit and out-fit statistics that exceed 1.20. Similarly, there are no items with both in-fit and out-fit statistics below .80.
Some of the out-fit statistics were small, due primarily to dependencies in some item pairs. For example, item 29 has a low out-fit statistic (mean square = .36), but the item is associated with item 28. We examined several alternative models with these items modeled as trichotomies rather than the multiple dichotomies, but the basic results of the models did not change.

Exhibit 2-7
SUMMARY OF REVISED MODEL R102A

Item      n    Raw         Item  Real   In-fit  In-fit  Out-fit  Out-fit  Point Biserial
             Score  Calibration    SE  Mean Sq       Z  Mean Sq        Z  Corr.
Q50   1,275     12         4.38   .30      .96    -0.2      .32     -0.5    .21
Q44   1,275     23         3.59   .22      .99    -0.1      .50     -0.5    .25
Q43   1,275     38         2.93   .18     1.01     0.1     1.50      0.5    .29
Q29   1,274     62         2.26   .14      .90    -1.0      .36     -1.4    .40
Q40   1,274     86         1.77   .13     1.02     0.2     2.34      2.0    .39
Q47   1,275     89         1.72   .12      .88    -1.4      .70     -0.7    .45
Q38   1,274     91         1.69   .13     1.09     1.1      .65     -0.8    .39
Q28   1,275     95         1.63   .12      .96    -0.5      .52     -1.3    .44
Q35   1,276    212          .21   .09      .95    -0.9     1.09      0.4    .55
Q57   1,274    246         -.11   .09      .99    -0.2      .65     -2.1    .56
Q25   1,274    293         -.49   .08      .98    -0.4      .76     -1.6    .57
Q32   1,274    442        -1.53   .08     1.01     0.2      .99     -0.1    .57
Q24   1,276    449        -1.56   .08      .96    -1.0     1.01      0.1    .59
Q56   1,273    466        -1.68   .08     1.08     1.9      .97     -0.3    .54
Q58   1,274    671        -2.89   .08     1.11     2.6     1.28      2.1    .47
Q55   1,272    706        -3.09   .07      .94    -1.7      .84     -1.2    .53
Q54   1,273    785        -3.54   .07      .92    -2.2      .94     -0.4    .49
Q53   1,274  1,066        -5.28   .09     1.16     3.7     1.28      0.7    .23
Mean  1,274    324          .00   .12      .99     0.0      .93     -0.3
SD        1    303         2.70   .06      .07     1.5      .46      1.1

NOTE: Sample includes households with children only. Items are ordered in terms of severity.

Final 12-Month Food Security Scale

The analyses for the 12-month scale were replicated on subsequent subsamples of the data set.6 The model replications provided clear support for the invariance of the primary measurement model across subsamples, as well as across different types of households. In each replication, the item calibrations gave identical or near-identical rankings of item severity and consistent clustering of closely-ranked items. Applying models fit on separate subsamples yielded household values that correlated at the .99 level.7

6 The overall sample was initially divided into four random subsamples. Initial model estimation was carried out for households with children within one subsample. Tests for invariance were performed for households with children in the other three random subsamples. Invariance tests were also performed for households without children, subdividing them into households with any elderly members (age 60 or over) and households with no elderly members.

7 In this procedure, we separately fit the model to each subpopulation, such as households with children, households with no children but with elderly members, and households with neither children nor elderly. Each of the separate models was then used to compute scale values for all households in the full sample. The values computed with the different models were then compared through plotting and correlation analysis.

The final model estimates are based upon all households in the analysis sample; these are presented in Exhibit 2-8. Of the 18,370 households that passed the screener and responded to at least half of the questions applicable to them, there were 7,897 households in which the respondent answered "yes" to at least one of the 12-month scale items. The ordering of the items in the final model changes slightly relative to the ordering of the items described in Exhibit 2-7; however, these minor fluctuations in item severities are expected with different random subsamples of households.8 Exhibit 2-9 shows the frequency distribution for the number of responses to items in the survey.
The two most frequent response patterns are 10 items and 18 items.9 The response pattern of 10 items applies largely to the households without children, because these had the opportunity to respond to a maximum of 10 items. The response pattern of 18 items applies to households with children, who had an opportunity to respond to 18 items. These two response patterns account for 98.8 percent of the households, indicating a very low incidence of item nonresponse (1.2 percent of all respondents). Households, whether with or without children, that responded to less than half the items administered had their household score set to "missing." The central function of the Rasch model is to assign to each responding household a value on the food security scale. The household scale value is fundamentally based on a count of the number of affirmative responses to questions included in the scale. At its simplest, if all households respond to the same set of questions, the household scale value is a constant arithmetic transformation of the count of positive responses. For example, among households with children responding to all 18 questions in the scale, all households with three positive responses have a scale value of -4.13. Households with more affirmative responses have higher scale values; for example, households with children giving ten affirmative responses have a scale value of 0.62. The scale value does not depend on which questions the household answers affirmatively: all households with children who give three affirmative answers have the same scale value, even if they give affirmative answers to quite different questions. * The Rasch model software initially assigns scale values in a range that yields i mean of zero. Because the presence of positive and negative values in the scale can be confusing or misleading, it is conventional to transform the values into a range such as 0-1,0-10, or 0-100. 
Values of the 12-month scale presented in other reports from this project transform the original scale values to range from 0.0 to 10.0. The original value is multiplied by .8333 and added to 5.071 to obtain the transformed value. All respondents giving zero affirmative responses are assigned a value of zero, and respondents answering all questions affirmatively get a value of 10.0.

9 Over half of all households in the sample were higher-income households that did not pass the screening questions, and therefore were not asked any of the questions included in the scales.

Exhibit 2-8
SUMMARY OF FINAL 12-MONTH SCALE

                     Raw    Item    Real  In-fit  In-fit Out-fit Out-fit  Pt-bis  Transf.
Item        N      Score  Calib.      SE Mean Sq       Z Mean Sq       Z   Corr.  Calib.*
Q50     4,333      29    4.92     .20    1.09     0.5    6.02     1.8     .18     9.2
Q44     4,331      87    3.48     .12     .84    -1.8     .28    -1.6     .34     8.0
Q43     4,332     135    2.86     .10     .88    -1.7     .78    -0.5     .37     7.5
Q29     7,889     332    2.55     .06     .89    -2.5     .55    -1.8     .35     7.2
Q47     4,333     257    1.88     .07     .93    -1.3     .97    -0.1     .44     6.6
Q28     7,892     537    1.82     .05     .97    -1.0    1.16     0.8     .39     6.6
Q40     4,333     290    1.69     .07    1.01     0.3    1.28     1.0     .44     6.5
Q38     7,861     625    1.54     .05    1.10     3.1    1.31     1.6     .39     6.4
Q35     7,883   1,249     .27     .04     .91    -4.0     .77    -2.6     .54     5.3
Q57     4,324     779    -.15     .05    1.07     2.3     .86    -1.4     .53     5.0
Q25     7,879   1,919    -.70     .03     .93    -3.4     .76    -4.6     .58     4.5
Q32     7,885   2,661   -1.56     .03     .94    -3.5     .94    -1.5     .57     3.8
Q56     4,325   1,453   -1.64     .04    1.08     3.4     .94    -1.0     .54     3.7
Q24     7,893   2,824   -1.72     .03     .88    -7.3     .87    -3.2     .59     3.6
Q58     4,324   2,295   -3.10     .04    1.14     6.5    1.29     3.3     .43     2.5
Q55     7,862   4,627   -3.42     .03    1.03     2.1    1.61     7.9     .41     2.2
Q54     7,863   4,973   -3.73     .03     .92    -5.9    1.06     0.8     .42     2.0
Q53     7,870   6,312   -4.99     .03    1.16     9.9    3.04     9.4     .18     0.9
Mean    6,301   1,744     .00     .06     .99    -0.2    1.36     0.5
SD      1,763   1,833    2.71     .04     .10     4.2    1.26     3.5

(Calib. = item calibration; Pt-bis Corr. = point biserial correlation; Transf. Calib. = transformed item calibration.)
* The transformed item calibration is a linear transform of the item calibration that places all values in the range from 0.0 to 10.0.
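As a quick check on the linear transformation described above for the 12-month scale, the following minimal Python sketch applies the stated multiplier (.8333) and additive constant to a few original scale values. It takes the additive constant to be 5.071, which reproduces the transformed values listed in the exhibits; the function name `to_zero_ten` is illustrative and does not come from the project's software.

```python
# Constants as quoted for the 12-month scale (the intercept is read as 5.071).
SLOPE = 0.8333
INTERCEPT = 5.071


def to_zero_ten(original_value: float) -> float:
    """Map an original Rasch scale value onto the 0-10 reporting range."""
    return original_value * SLOPE + INTERCEPT


# Original values taken from the text and exhibits: the minimum, two
# intermediate household values, and the maximum of the 12-month scale.
for value in (-6.08, -4.13, 0.62, 5.91):
    print(f"{value:6.2f} -> {to_zero_ten(value):4.1f}")
```

Rounded to one decimal, the original endpoints of -6.08 and 5.91 map to 0.0 and 10.0, which matches the stated range of the transformed scale.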
Exhibit 2-9
NUMBER OF QUESTIONS ANSWERED: QUESTIONS IN THE 12-MONTH SCALE

 Questions                          Cumulative  Cumulative
  Answered   Frequency     Percent   Frequency     Percent
         2           7         0.0           7         0.0
         3           4         0.0          11         0.1
         4           6         0.0          17         0.1
         5          11         0.1          28         0.2
         6          14         0.1          42         0.2
         7          53         0.3          95         0.5
         8          11         0.1         106         0.6
         9          51         0.3         157         0.9
        10       10293        55.9       10450        56.8
        12          21         0.1       10471        56.9
        13           2         0.0       10473        56.9
        14           2         0.0       10475        56.9
        15           3         0.0       10478        56.9
        16          11         0.1       10489        57.0
        17          29         0.2       10518        57.1
        18        7888        42.9       18406*      100.0

* Households that answered fewer than half of the applicable questions are excluded from the main analysis, reducing the sample to 18,370.

If all respondents are given exactly the same set of questions, the scale value depends solely on the number of affirmative responses. If different respondents answer different sets of questions, however, scale values depend on the severity (as indicated by the item calibration) of the questions that the respondent answers. In the current situation, households with children are asked 18 questions, whereas those without children are asked only ten. Moreover, the questions asked only of households with children are disproportionately the more severe questions. The Rasch model takes these differences into account, assigning values to both types of household that are comparable even though they responded to different types of questions. Similarly, the model adjusts the scale values assigned to households with or without children that failed to respond to one or more of the items applicable to them.

The frequency distribution of household values on the 12-month scale is presented in Exhibit 2-10. Household values for the 12-month scale range from -6.08 to 5.91 in the original model estimation (values transformed to a 0-10 range are also shown).
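The way a Rasch scale value depends on both the number of affirmative responses and the set of items answered, as described above, can be illustrated with a short sketch. Under the Rasch model, the probability of an affirmative response to an item with calibration b from a household with measure theta is exp(theta - b) / (1 + exp(theta - b)), and the maximum likelihood measure for a given raw score is the theta at which the expected number of affirmative responses equals that raw score. The code below solves for that theta by bisection. The item calibrations are made-up values, the function names are hypothetical, and the estimation software used for the report may treat extreme scores and missing items differently, so the sketch is not expected to reproduce the published scale values.

```python
import math


def expected_score(theta, calibrations):
    """Expected number of affirmative answers at measure theta (Rasch model)."""
    return sum(1.0 / (1.0 + math.exp(b - theta)) for b in calibrations)


def measure_for_raw_score(raw_score, calibrations, lo=-15.0, hi=15.0, tol=1e-9):
    """Find the theta whose expected score equals raw_score, by bisection.

    Only defined for non-extreme scores (0 < raw_score < number of items),
    because all-"no" and all-"yes" patterns have no finite estimate.
    """
    if not 0 < raw_score < len(calibrations):
        raise ValueError("extreme scores have no finite maximum likelihood estimate")
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if expected_score(mid, calibrations) < raw_score:
            lo = mid   # expected score too low: theta must be higher
        else:
            hi = mid
    return (lo + hi) / 2.0


# Hypothetical calibrations, ordered from least to most severe.
ALL_ITEMS = [-3.0, -1.5, 0.0, 1.5, 3.0]   # a household asked every item
SHORT_FORM = ALL_ITEMS[:3]                # a household asked only the less severe items

theta_full = measure_for_raw_score(2, ALL_ITEMS)
theta_short = measure_for_raw_score(2, SHORT_FORM)
print(f"2 of 5 items: {theta_full:.2f}; 2 of 3 items: {theta_short:.2f}")
```

With these illustrative values, two affirmative responses out of the three less severe items yield a higher measure than two affirmative responses out of all five items, because the second household also had the opportunity to affirm the more severe items and did not.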
Exhibit 2-10
FREQUENCY DISTRIBUTION FOR HOUSEHOLD VALUES ON THE 12-MONTH SCALE

  Value on                          Cumulative  Cumulative   Transformed
     Scale   Frequency     Percent   Frequency     Percent  Scale Value*
     -6.08       10276        56.5       10276        56.5           0.0
      -5.2         970         5.3       11246        61.9           0.7
     -4.96         902         5.0       12148        66.8           0.9
     -4.13         661         3.6       12809        70.5           1.6
     -3.73         614         3.4       13423        73.8           2.0
     -3.36         550         3.0       13973        76.9           2.3
     -2.73         657         3.6       14630        80.5           2.8
     -2.69         386         2.1       15016        82.6           2.8
     -2.09         343         1.9       15359        84.5           3.3
     -1.82         306         1.7       15665        86.2           3.6
     -1.52         358         2.0       16023        88.1           3.8
     -0.97         255         1.4       16278        89.5           4.3
     -0.96         285         1.6       16563        91.1           4.3
     -0.43         188         1.0       16751        92.1           4.7
     -0.09         295         1.6       17046        93.8           5.0
       0.1         176         1.0       17222        94.7           5.2
      0.62         132         0.7       17354        95.5           5.6
      0.81         231         1.3       17585        96.7           5.8
      1.13          86         0.5       17671        97.2           6.0
      1.62          59         0.3       17730        97.5           6.4
      1.75         128         0.7       17858        98.2           6.5
      2.12          59         0.3       17917        98.6           6.8
      2.65          28         0.2       17945        98.7           7.3
      2.88          85         0.5       18030        99.2           7.5
      3.24          15         0.1       18045        99.3           7.8
      3.77         103         0.6       18148        99.8           8.2
      3.96          12         0.1       18160        99.9           8.4
      5.02          13         0.1       18173       100.0           9.3
      5.91           6         0.0       18179**     100.0          10.0

* The transformed scale value is a linear transform that places all values in the range from 0.0 to 10.0.
** Includes only households that responded to all applicable items.

Most households in the analysis sample responded "no" to all items in the scale, and received a scale value of -6.08 (10,276 households).10 All other households responded "yes" to at least one item. Their assigned scale value is a non-linear transformation of the total number of items to which they responded affirmatively. If all households had responded to all 18 items, there would be 19 possible scale score values that could be assigned to households.
Because households without children could respond to only 10 items, however, there are a number of additional scale scores that can be assigned to households based on a missing data adjustment that is part of the Rasch measurement model. The small proportion of households in either group that failed to respond to one or more questions also received distinct measure scores, depending on the number of items missed.

10 For analyses involving the full sample, households that did not pass the screen are assigned the minimum possible score (-6.08). This procedure is also used in classifying households on the food security status variables.

Final 30-Day Food Security Scale

The 30-day scale was developed in the same manner as the 12-month scale, though there were fewer 30-day items available for analysis. The 30-day scale also has a larger number of item dependencies than the 12-month scale. The results of the final Rasch model for the 30-day scale are presented in Exhibit 2-11. The 30-day scale includes 17 items, and the estimated item calibrations range from -4.37 to 4.00. For the most severe item (item 52), only five households responded affirmatively.

Exhibit 2-11
SUMMARY OF FINAL 30-DAY SCALE

                     Raw    Item    Real  In-fit  In-fit Out-fit Out-fit  Pt-bis
Item        N      Score  Calib.      SE Mean Sq       Z Mean Sq       Z   Corr.
Q52       990       5    4.00     .45     .83    -0.4     .22    -0.7     .23
Q51       990      13    2.91     .30    1.07     0.3    1.04     0.0     .20
Q46       988      21    2.33     .23     .92    -0.4     .68    -0.5     .34
Q31      1992      83    1.61     .12     .83    -1.9     .27    -3.2     .34
Q49       990      45    1.37     .16     .80    -1.7     .44    -1.7     .47
Q42       990      64     .91     .14     .88    -1.1     .59    -1.5     .46
Q45       988      69     .80     .14    1.10     1.0    1.67     1.8     .32
Q37      1985     249     .10     .08     .84    -3.3     .51    -4.1     .46
Q48       990     129    -.09     .11    1.03     0.4    1.07     0.4     .40
Q30      1992     294    -.17     .07    1.08     1.8    1.22     1.6     .34
Q41       990     154    -.37     .11    1.14     2.1    1.42     2.3     .34
Q39      1958     344    -.48     .07    1.18     4.0    1.42     3.4     .29
Q34      1983     611   -1.52     .06     .92    -2.6     .73    -4.7     .46
Q36      1985     637   -1.61     .06     .94    -1.9     .91    -1.4     .44
Q27      1993     715   -1.86     .06    1.04     1.2     .96    -0.8     .37
Q33      1983    1285   -3.56     .05     .97    -1.3     .87    -1.5     .29
Q26      1993    1549   -4.37     .06    1.13     4.3    1.54     3.3     .14
Mean     1516     369     .00     .13     .98     0.0     .92    -0.4
SD        497     444    2.12     .10     .12     2.1     .43     2.3

(Calib. = item calibration; Pt-bis Corr. = point biserial correlation.)

Exhibit 2-12 shows the number of responses households made to the 30-day items administered in the survey. Similar to the 12-month scale, there were two major response categories: 9 (households without children) and 17 (households with children). These two response patterns account for 99.3 percent of households. Here also, households that did not respond to at least half the items administered had their scale value set to "missing."

Exhibit 2-12
NUMBER OF QUESTIONS ANSWERED: QUESTIONS IN THE 30-DAY SCALE

 Number of                          Cumulative  Cumulative
 Responses   Frequency     Percent   Frequency     Percent
         2           7         0.0           7         0.0
         3           1         0.0           8         0.0
         4           6         0.0          14         0.1
         5           2         0.0          16         0.1
         6          10         0.1          26         0.1
         7          17         0.1          43         0.2
         8          35         0.2          78         0.4
         9       10369        56.3       10447        56.8
        10           1         0.0       10448        56.8
        11           2         0.0       10450        56.8
        13           1         0.0       10451        56.8
        15          16         0.1       10467        56.9
        16          15         0.1       10482        57.0
        17        7922        43.0       18404       100.0

Exhibit 2-13 provides the frequency distribution of the 30-day household scale scores. The scale scores range from -5.62 to 5.32. Almost 90 percent of the households that passed the series of screening questions responded "no" to all items in the 30-day scale.

The 30-day scale in its present form is not considered as useful as the 12-month scale, for both conceptual and statistical reasons. Conceptually, the 30-day scale provides detail on a narrower portion of the spectrum of food insecurity than the 12-month scale. Most of the less-severe conditions and behaviors incorporated in the 12-month scale were not measured in the 30-day time frame in the CPS Supplement. The 30-day measures thus focus on reductions of food intake and related indicators of hunger, providing little information on food insecurity with no hunger evident. The broader range of the 12-month scale makes it likely to be more useful both in describing the conditions of the population at a point in time and in monitoring changes.

Statistically, Chapter Three will show that the 30-day scale is considerably less reliable than the 12-month scale in its ability to discriminate between households at varying levels of food insecurity. This more limited reliability stems mainly from the smaller number of independent questions asked in the 30-day time frame. The 30-day scale has just nine independent items, and a total of 17 when follow-up items are included.11 The 12-month scale has 15 independent questions, plus three follow-up items. In addition, the absence of questions measuring the less severe food insecurity conditions creates a situation in which an extremely small proportion of the population gives affirmative responses to any of the items, which makes it more difficult for the scale to discriminate reliably among different levels of food insecurity.

For these reasons, the main report of this study focuses almost exclusively on the 12-month scale, and this report provides less detail on the 30-day than the 12-month scale. Estimates of the prevalence of hunger based on the 30-day scale are presented in Appendix B.

11 The primary question typically asks if a particular behavior or condition occurred in the past 30 days. If the response is affirmative, the follow-up question then asks on how many of the 30 days the behavior or condition occurred.
Exhibit 2-13
FREQUENCY DISTRIBUTION OF HOUSEHOLD VALUES ON THE 30-DAY SCALE

     Scale                          Cumulative  Cumulative
    Values   Frequency     Percent   Frequency     Percent
     -5.62       16309        89.2       16309        89.2
     -4.69         261         1.4       16570        90.6
     -4.63         288         1.6       16858        92.2
      -3.5         239         1.3       17097        93.5
     -3.39         246         1.3       17343        94.8
     -2.66         123         0.7       17466        95.5
     -2.45          96         0.5       17562        96.0
     -2.01         113         0.6       17675        96.7
     -1.68         144         0.8       17819        97.5
     -1.47          67         0.4       17886        97.8
        -1          57         0.3       17943        98.1
     -0.97          69         0.4       18012        98.5
     -0.56          34         0.2       18046        98.7
     -0.25          59         0.3       18105        99.0
     -0.14          25         0.1       18130        99.2
      0.27          23         0.1       18153        99.3
      0.57          47         0.3       18200        99.5
      0.68           9         0.0       18209        99.6
      1.11           5         0.0       18214        99.6
      1.57           5         0.0       18219        99.6
       1.7          24         0.1       18243        99.8
      2.08           4         0.0       18247        99.8
      2.62          31         0.2       18278       100.0
      2.66           2         0.0       18280       100.0
      3.39           3         0.0       18283       100.0
      4.44           1         0.0       18284       100.0
      5.32           1         0.0       18285       100.0

2.4 SUMMARY

The scale development process involved five main steps:

• Exploratory linear factor analysis replicating key elements of prior research, which indicated that the response patterns and relationships in the CPS Food Security Supplement were largely similar to those seen previously.

• Estimation of two-parameter non-linear models, which indicated that a one-factor solution would be appropriate.

• Preliminary estimation of one-factor Rasch models on a one-fourth random subsample of the full CPS sample, resulting in the specification of an 18-item set for inclusion in the 12-month scale and a 17-item set for the 30-day scale.

• Tests of invariance of the model across other random subsamples of the full population and across three demographic subgroups (households with children, households without children but with elderly members, and households with neither children nor elderly members), which indicated that the models were quite invariant across groups.

• Estimation of the final scales on the full CPS sample.
Subsequent chapters of this report detail the steps taken to test the scales for reliability, construct validity, and estimation bias. Primary attention is given to the 12-month scale, which appears more useful than the 30-day scale on both conceptual and statistical grounds.

CHAPTER THREE
RELIABILITY ESTIMATES FOR THE FOOD SECURITY SCALES

Whenever an instrument is used to measure some quality of a person (whether it be a heart rate, a psychological profile, or a level of food insecurity), researchers want to be assured that the instrument is reliable. A reliable instrument is one that, if it were administered to the same individual on two occasions under similar conditions, would provide similar results in both tests. Reliability indices therefore attempt to measure the degree to which an individual's score is expected to remain stable (relative to other individuals' scores) over repeated occasions using the same instrument.

Often it is not feasible to administer an instrument repeatedly to the same individuals under similar circumstances. Reliability indices have therefore been developed that attempt to approximate this result through a single administration of the instrument. Most reliability indices for multi-item scales attempt to provide an estimate of the ratio of the true score variance to the total variance for a particular instrument. The underlying concept is that an individual's score on a scale (x) is composed of the individual's "true" score (t) and an error component. A general equation for a measure indicating the reliability of a scale ($\rho$) can be written as:

$$\rho = \frac{\sigma_t^2}{\sigma_x^2}$$

where $\sigma_t^2$ is the variance of the households' true scores and $\sigma_x^2$ is the variance of the observed measure (i.e., the household scores on the scale).

There are a number of reliability indices available for characterizing the reliability of a measure.
Because the food security scales are estimated using a Rasch modeling approach, the most appropriate index is the Rasch reliability index. Because the Rasch reliability index has not been used as often in the scale development literature as some other reliability estimators, however, we provide estimates using some of the more common reliability indices as well as the Rasch reliability index to characterize the reliability of the food security scale.

One major difference between the more traditional reliability indices and the Rasch reliability index is the treatment of cases with extreme scores. Cases with extreme scores are those with either the maximum or minimum score possible on the measure (i.e., those that have responded affirmatively to all questions in the scale, or negatively to all questions). When scale scores are normally distributed over a population, very few cases have extreme scores and consequently they have very little impact on the reliability estimate. When the distribution is severely skewed, however, the treatment of cases with extreme scores can have a major impact on reliability estimates. This is very relevant to the food security scales, because over 80 percent of the population has the lowest possible score on the 12-month scale and over 90 percent on the 30-day scale.

Because of differences in estimation algorithms, the Rasch reliability estimate always decreases when extreme scores are included, whereas the more traditional reliability estimates always increase. The Rasch model typically provides two reliability estimates, one including and one excluding the cases with extreme scores. The conventional practice with the more traditional reliability indices is to include the extreme scores. The discussion below provides separate reliability estimates that include and exclude extreme scores.
In general, the estimate excluding households with extreme scores can be taken as indicating the reliability of the scale in measuring the severity of food insecurity and hunger among households that have experienced at least one of the food insecurity or hunger conditions represented in the scale. The interpretation of the estimate when extreme scores are included is less clear.

Among the more traditional indices, Nunnally (1978) recommended that at least two types of reliability coefficients be reported: correlations between alternate test forms, and coefficient alpha. The discussion below presents the results using three traditional reliability indices: two that are based on the correlation between alternate test forms (the Spearman-Brown split-half reliability estimate and Rulon's split-half reliability estimate), and Cronbach's alpha. All three reliability indices are based on the use of linear composites, and therefore do not correspond exactly to the Rasch model (a non-linear model). Nonetheless, the indices provide a general indication of the reliability of the scale and familiar measures that may be compared to other work.

3.1 SPEARMAN-BROWN SPLIT-HALF RELIABILITY ESTIMATES

The general form of the Spearman-Brown prophecy formula can be written as:

$$\rho_{sp} = \frac{k\,\rho_{11}}{1 + (k - 1)\,\rho_{11}}$$

where $\rho_{sp}$ represents the reliability of the composite measure with k parallel tests, and $\rho_{11}$ represents the reliability of any one particular test. A simplified form of the equation can be written as:

$$\rho_{sp} = \frac{2\,\rho_{ab}}{1 + \rho_{ab}}$$

where $\rho_{ab}$ represents the correlation coefficient between two parallel tests.

In order to create two somewhat parallel tests, the item pool (i.e., all the items used in the scale) is typically split in half randomly. Each subset of the items is considered a separate scale, and the results of the two scales are compared.
When the number of available items is small, as in the present situation, a commonly used method is to order the items in terms of severity and assign odd-numbered items to one test and even-numbered items to another test. The two new scales should have the same number of items, so if the item pool contains an odd number of items, one is dropped before the pool is split.

To estimate $\rho_{sp}$ for the 12-month scale, it was necessary to drop dependent items in order to generate unbiased reliability estimates.1 It was also considered informative to generate reliability estimates separately for items that were administered only to households with children and for items that were administered to all households. For households with children, there were 15 independent items available to create two parallel measures. Because there were an odd number of items, the most severe item was dropped from the list. For the first parallel scale, households' responses to items 43, 28, 38, 57, 56, 58, and 54 were summed to create the household score. For the second parallel scale, items 47, 40, 35, 32, 24, 55, and 53 were summed. Based on the correlation between household scores on these two scales, the Spearman-Brown reliability estimate for the total scale was .852 with extreme scores excluded (see Exhibit 3-1). Including extreme scores raises the reliability index to .903.

1 Dependent items are those that are follow-ups to previous items. A number of items in the food insecurity scales have an initial question (e.g., did this situation occur within the past 12 months?) and a follow-up (e.g., in how many of the past 12 months did the situation occur?).
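The split-half procedure just described can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the item list is taken to be ordered by severity with the most severe item first, responses are coded 0/1 in dictionaries keyed by item number, and the function names (`odd_even_split`, `split_half_reliability`, and so on) are hypothetical rather than anything used by the project.

```python
import math


def odd_even_split(items_by_severity):
    """Alternate severity-ordered items into two half-scales.

    If the pool has an odd number of items, the first (most severe) item
    is dropped before splitting, as described for households with children.
    """
    items = list(items_by_severity)
    if len(items) % 2 == 1:
        items = items[1:]
    return items[0::2], items[1::2]


def pearson(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def spearman_brown(r_ab, k=2):
    """Spearman-Brown prophecy formula; k=2 gives the split-half form."""
    return k * r_ab / (1 + (k - 1) * r_ab)


def split_half_reliability(responses, items_by_severity):
    """responses: list of dicts mapping item number -> 0/1 for each household."""
    half_a, half_b = odd_even_split(items_by_severity)
    score_a = [sum(r[i] for i in half_a) for r in responses]
    score_b = [sum(r[i] for i in half_b) for r in responses]
    return spearman_brown(pearson(score_a, score_b))


# The 14 items used for households with children, in the order they are
# alternated in the text (most severe first).
CHILD_ITEMS = [43, 47, 28, 40, 38, 35, 57, 32, 56, 24, 58, 55, 54, 53]
first_half, second_half = odd_even_split(CHILD_ITEMS)
print(first_half)
print(second_half)
```

Applied to the 14 item numbers listed in the text, the alternation reproduces the two parallel scales reported for households with children.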
Exhibit 3-1
SUMMARY OF RELIABILITY ESTIMATES USING TRADITIONAL INDICES

                           Reliability  Extreme Scores  Extreme Scores
Household Type             Estimate           Included        Excluded

12-Month Scale
  All households           Spearman               .899            .794
                           Rulon                  .932            .878
                           Alpha                  .856            .743
  Households with children Spearman               .903            .852
                           Rulon                  .899            .813
                           Alpha                  .882            .814

30-Day Scale
  All households           Spearman               .840            .357
                           Rulon                  .888            .650
                           Alpha                  .789            .356
  Households with children Spearman               .852            .530
                           Rulon                  .844            .530
                           Alpha                  .799            .555

For all household types (i.e., households with any combination of children, adults, and elderly), there were eight independent items available to create two parallel measures. For the first parallel scale, items 28, 35, 24, and 54 were summed. For the second parallel scale, items 38, 32, 55, and 53 were summed. The reliability estimate for the total scale is .794 with extreme scores excluded, and .899 with extreme scores included.

For the 30-day scale, the reliability estimate for households with children is .530 and the reliability estimate for all households is .357 with extreme scores excluded. Including extreme scores generates a striking increase in the reliability estimates, to .852 for households with children and .840 for all households.

Note that, although including cases with extreme scores increases the reliability estimate for both scales, the effect is particularly striking for the 30-day scale. This occurs for three reasons. First, the number of items in the paired subscales is smaller for the 30-day scale. The 30-day scale contains just five independent items that apply to all households, and ten that apply to households with children. This means that the split-half scales each contain just two items in the analysis for all households, and five in the analysis of households with children.
In contrast, the split-half 12-month scales contain four items for the analysis of all households and seven items for the analysis of households with children. Smaller numbers of items in general lead to lower reliability estimates.

The second factor is that the 30-day scale measures a narrower band of the spectrum of food insecurity than the 12-month scale. The least severe items in the 12-month scale were not asked in the 30-day time frame. This means that the 30-day scale not only contains fewer items, but that the scale is attempting to make distinctions within a narrower range than the 12-month scale. In effect, this means that the 30-day scale faces a more difficult challenge in distinguishing the varying levels of food insecurity and hunger among those households that have experienced one or more of the conditions measured.

The final distinction between the scales is that a far greater proportion of households answered negatively to all items on the 30-day scale than the 12-month scale (89 percent vs. 57 percent of households that passed the screening questions). Thus, including or excluding the households with extreme scores will have a greater effect on the 30-day than the 12-month scales.

3.2 RULON'S SPLIT-HALF RELIABILITY ESTIMATES

Rulon proposed an alternative method for estimating the reliability of a scale using the split-half tests.2 The method involves estimating the difference between household scores on two parallel tests and estimating the ratio of the variance of the difference score to the variance of the total score. The equation for Rulon's method is written as:

$$\rho = 1 - \frac{\sigma_D^2}{\sigma_x^2}$$

where $\sigma_D^2$ is the variance of the difference score and $\sigma_x^2$ is the variance of the total score.

2 Rulon, P.J., "A Simplified Procedure for Determining the Reliability of a Test by Split Halves," Harvard Educational Review vol. 9, pp. 99-103, 1939.
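Rulon's index as defined above (one minus the ratio of the variance of the difference score to the variance of the total score) is straightforward to compute once the two half-scale scores are available. The sketch below is a minimal illustration; the function names are hypothetical, the half-scale scores are made-up numbers, and population variances are used (the ratio is the same if sample variances are used for both terms).

```python
def variance(values):
    """Population variance of a list of numbers."""
    n = len(values)
    mean = sum(values) / n
    return sum((v - mean) ** 2 for v in values) / n


def rulon_reliability(score_a, score_b):
    """Rulon's split-half reliability from two half-scale score lists."""
    difference = [a - b for a, b in zip(score_a, score_b)]
    total = [a + b for a, b in zip(score_a, score_b)]
    return 1.0 - variance(difference) / variance(total)


# Made-up half-scale scores for four households.
half_a = [2, 1, 0, 0]
half_b = [2, 1, 1, 0]
print(round(rulon_reliability(half_a, half_b), 3))
```

When the two halves agree exactly for every household, the difference score has zero variance and the index equals 1.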
To estimate the index, we used the same subsets of items described above for the Spearman test, again performing the computation both for households with children and for all households (see Exhibit 3-1). For the 12-month scale, the reliability estimate for households with children is .813 and the estimate for all households is .878 with extreme scores excluded. When extreme scores are included, the estimates increase to .899 for households with children and .932 for all households.

For the 30-day scale, the reliability estimate for households with children is .530 and the reliability estimate for all household types is .650 when extreme scores are excluded. Including the extreme scores raises the estimates to .844 and .888, respectively.

3.3 CRONBACH'S ALPHA RELIABILITY ESTIMATES

Cronbach's alpha and Kuder-Richardson 20 (McDonald, 1985) produce identical results when using independent items that are dichotomous in form. Therefore, for the 12-month scale, these two equations are interchangeable. For simplicity, we will refer to Cronbach's alpha when describing these reliability estimates.

Cronbach's alpha was developed to circumvent problems associated with the non-random selection of subsets of items when using methods such as the Spearman-Brown or Rulon methods. Cronbach's alpha, $\alpha$, can be written as:

$$\alpha = \left[\frac{k}{k - 1}\right]\left[1 - \frac{\sum_{i=1}^{k} \sigma_i^2}{\sigma_x^2}\right]$$

where k represents the number of items in the test, $\sigma_i^2$ represents the variance of item i, and $\sigma_x^2$ represents the variance of the total test score. Alpha is considered to be the lower bound of the true theoretical reliability estimate, the coefficient of precision.

The overall reliability estimates, summarized in Exhibit 3-1, are similar to those seen with the prior tests. With extreme scores excluded, the values of $\alpha$ for the 12-month scale are .814 for households with children and .743 for all households.
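Cronbach's alpha as defined above (k/(k - 1) times one minus the ratio of the summed item variances to the variance of the total score) can be computed directly from a matrix of item responses. The following sketch is illustrative only: the function names are hypothetical, the response data are made up, and population variances are used throughout. A second helper recomputes alpha with one item removed, which is the kind of item-deleted statistic commonly examined alongside the overall coefficient.

```python
def variance(values):
    """Population variance of a list of numbers."""
    n = len(values)
    mean = sum(values) / n
    return sum((v - mean) ** 2 for v in values) / n


def cronbach_alpha(rows):
    """rows: one list of item scores per household, all of equal length."""
    k = len(rows[0])
    item_variances = [variance([row[j] for row in rows]) for j in range(k)]
    total_variance = variance([sum(row) for row in rows])
    return (k / (k - 1)) * (1.0 - sum(item_variances) / total_variance)


def alpha_if_item_deleted(rows, j):
    """Alpha recomputed after dropping the item in column j."""
    reduced = [row[:j] + row[j + 1:] for row in rows]
    return cronbach_alpha(reduced)


# Made-up dichotomous responses: five households, three items.
rows = [
    [1, 1, 1],
    [1, 1, 0],
    [1, 0, 0],
    [0, 0, 0],
    [1, 1, 0],
]
print(round(cronbach_alpha(rows), 3))
print([round(alpha_if_item_deleted(rows, j), 3) for j in range(3)])
```

Because the items here are dichotomous, the value returned is also the Kuder-Richardson 20 coefficient for the same data.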
Including the households with extreme scores raises the estimates to .882 for households with children and .856 for all households. For the 30-day scale, the $\alpha$ values are .555 for households with children and .356 for all households when cases with extreme values are excluded. When households with extreme values are included, the values are .799 for households with children and .789 for all households.

In addition to assessing the reliability of the total scale, Cronbach's alpha is often used to examine the appropriateness of including individual items in the scale. The usual rule is that if $\alpha$ increases substantially when an item is removed from the scale, the item should be considered for removal. It is also possible to evaluate how the reliability of the scale changes when any one item is removed from the scale. Exhibits 3-2 and 3-3 show that in nearly all instances, removing an item would reduce the estimated reliability of the scale. The only potential exception would be item 53;3 removing this item would generate a small increase in the reliability estimate with extreme scores excluded, but the loss of information at the end of the scale would be more detrimental to scale validity than is justified by this small increase in reliability.

3 Removing item 28 with extreme scores included also generates an increase in $\alpha$, but the difference is tiny (measured in the third decimal).

3.4 RASCH MODEL RELIABILITY ESTIMATES

The Rasch reliability indices behave in a slightly different manner and yield somewhat lower estimates of reliability than the more traditional indices presented above. The reliability index for the Rasch scale is defined as:

$$\rho_r = \frac{\sigma_x^2 - MSE}{\sigma_x^2}$$
where $\rho_r$ is the reliability index, $\sigma_x^2$ is the variance of the scale, and MSE is the mean square error of the scale. Like the previously described reliability indices, $\rho_r$ is intended to represent the proportion of total variance in household scores that is caused by variance in households' "true" scores.

Exhibit 3-2
CRONBACH'S ALPHA FOR THE 12-MONTH SCALE FOR HOUSEHOLDS WITH CHILDREN

        Extreme Scores Included                 Extreme Scores Excluded
        (alpha = .882; n = 7,888)               (alpha = .814; n = 4,278)
        Item   Correlation with  alpha with     Item   Correlation with  alpha with
Item    Mean   Total Score       Item Deleted   Mean   Total Score       Item Deleted
43      .017   .338              .882           .028   .309              .812
47      .033   .433              .879           .057   .415              .806
28      .036   .397              .880           .063   .354              .809
40      .037   .433              .879           .064   .408              .806
38      .040   .429              .879           .071   .394              .806
35      .081   .565              .873           .146   .529              .796
57      .098   .587              .872           .177   .540              .795
32      .179   .669              .867           .327   .567              .791
56      .183   .664              .867           .333   .556              .793
24      .182   .642              .868           .332   .522              .796
58      .288   .656              .868           .528   .441              .804
55      .290   .709              .865           .532   .528              .796
54      .338   .692              .866           .621   .462              .801
53      .450   .607              .873           .827   .221              .818

Exhibit 3-3
CRONBACH'S ALPHA FOR THE 12-MONTH SCALE FOR ALL HOUSEHOLDS

        Extreme Scores Included                 Extreme Scores Excluded
        (alpha = .856; n = 18,179)              (alpha = .743; n = 7,902)
        Item   Correlation with  alpha with     Item   Correlation with  alpha with
Item    Mean   Total Score       Item Deleted   Mean   Total Score       Item Deleted
28      .034   .434              .858           .080   .429              .727
38      .040   .459              .855           .092   .451              .723
35      .072   .594              .842           .167   .582              .695
32      .149   .701              .827           .343   .595              .686
24      .157   .682              .829           .362   .545              .697
55      .257   .678              .830           .591   .373              .736
54      .276   .725              .823           .635   .439              .721
53      .349   .646              .837           .803   .206              .760

In Exhibit 3-4, the reliability estimates for the 12-month and 30-day scales are presented. Separate estimates are presented for two treatments of the variables that involve follow-up questions. For example, the 12-month scale includes an item that indicates that adults have cut or skipped meals in the past 12 months, and a second (answered only by people who responded positively to the first item) that indicates that meals were cut or skipped in three or more months. In one treatment, these are considered as independent dichotomous items. In the second treatment, they are combined into a single trichotomous item (no meals cut/skipped in past 12 months; meals cut/skipped in one or two months; meals cut/skipped in three or more months). Treating such question sets as trichotomous items reduces the number of items in the scale, and hence reduces the estimated reliability.

Exhibit 3-4
RASCH RELIABILITY ESTIMATES FOR THE 12-MONTH AND 30-DAY SCALES

                                Including Households   Excluding Households
Scale            Model Type     with Extreme Scores    with Extreme Scores
12-month scale   Dichotomous    .63                    .74
                 Trichotomous   .58                    .70
30-day scale     Dichotomous    .00                    .57
                 Trichotomous   .00                    .44

With extreme scores excluded, the reliability estimates for the 12-month scale are .74 (dichotomous) and .70 (trichotomous). The reliability estimates for the 30-day scale are .57 and .44.

Unlike the previous reliability indicators, the Rasch reliability estimate decreases when extreme scores are included. Thus, the reliability estimates for the 12-month scale are .63 and .58 with the extreme scores included. For the 30-day scale, because 88 percent of the households that passed the screener responded negatively to all questions, the reliability estimate falls to zero when cases with extreme scores are included.
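The Rasch reliability index discussed in this section (the variance of the scale values minus the mean square error, divided by the variance of the scale values) can be sketched as follows. The sketch assumes that the mean square error is the average of the squared standard errors of the individual household measures, which is a common convention in Rasch software but is not spelled out in the text; the function name and the input numbers are made up for illustration.

```python
def rasch_reliability(measures, standard_errors):
    """(observed variance - mean square error) / observed variance.

    measures: estimated scale values, one per household.
    standard_errors: the standard error attached to each of those values.
    Assumption: MSE is the mean of the squared standard errors.
    """
    n = len(measures)
    mean = sum(measures) / n
    observed_variance = sum((m - mean) ** 2 for m in measures) / n
    mse = sum(se ** 2 for se in standard_errors) / n
    return (observed_variance - mse) / observed_variance


# Made-up measures and standard errors for three households.
print(rasch_reliability([-1.0, 0.0, 1.0], [0.5, 0.5, 0.5]))
```

The index shrinks as the standard errors grow relative to the spread of the measures, which is one way to see why adding many imprecisely measured cases can lower it.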
3.5 RELIABILITY IN IDENTIFYING CASES WITH NO FOOD INSECURITY CONDITIONS

As noted earlier, none of the reliability statistics deal adequately with situations in which a large percentage of cases have extreme scores. For present purposes, then, the statistics are primarily useful in indicating the scales' reliability in distinguishing the level of food insecurity among households that experience at least one of the conditions measured by items included in the scales. The statistics provide little information about the scales' reliability in distinguishing between households that experience none of the food insecurity conditions measured and households that experience one or more of the conditions.

To provide additional insight on this point, a further analysis was conducted. The analysis follows the split-half procedure: for each scale, we separate the items into two groups to constitute two new scales; we then examine the relationship between the two new scales. The scales are split as described earlier, but each of the new scales is then collapsed into a dichotomous variable. The two categories on the dichotomous variable are (1) "answered all questions negatively," and (2) "answered one or more questions positively." The agreement between the new dichotomous items is then assessed.

A simple test of correspondence is the percentage of cases classified similarly by the two variables. When the population is unevenly divided between the two categories of the dichotomous variables, however, a high rate of agreement can occur by chance. The more appropriate test is therefore the Kappa statistic. The Kappa statistic is a measure of the extent to which there is agreement above and beyond what would be expected by chance. Kappa ($\kappa$) is computed as:

$$\kappa = \frac{(\text{percent observed agreement}) - (\text{percent agreement expected by chance alone})}{100\% - (\text{percent agreement expected by chance alone})}$$
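The Kappa computation above can be sketched for two dichotomized half-scales as follows. This is a minimal illustration: the function names are hypothetical, the flags are made-up data, and agreement is expressed as a proportion (0 to 1) rather than a percentage, which leaves the value of Kappa unchanged. Chance agreement is computed in the usual way from the marginal proportions of the two variables.

```python
def any_affirmative(responses):
    """Collapse a half-scale to 1 if any item was affirmed, else 0."""
    return 1 if any(responses) else 0


def kappa(flags_a, flags_b):
    """Kappa for two lists of 0/1 flags, one pair per household."""
    n = len(flags_a)
    observed = sum(1 for a, b in zip(flags_a, flags_b) if a == b) / n
    p_a = sum(flags_a) / n   # share flagged 1 on the first half-scale
    p_b = sum(flags_b) / n   # share flagged 1 on the second half-scale
    expected = p_a * p_b + (1 - p_a) * (1 - p_b)
    return (observed - expected) / (1 - expected)


# Made-up dichotomized half-scales for four households.
flags_a = [1, 1, 0, 0]
flags_b = [1, 0, 0, 0]
print(kappa(flags_a, flags_b))
```

In this toy example the two flags agree for three of four households (75 percent), but half of that agreement would be expected by chance, so Kappa is 0.5.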
To test the hypothesis H0: κ = 0 vs. H1: κ > 0, we can use the lambda statistic

$$\lambda = \frac{\kappa}{\mathrm{se}(\kappa)}$$

A formula for the estimation of the standard error of κ can be found in Rosner (1986). Landis and Koch (1977) suggested that a κ below 0.4 represents poor agreement, between 0.4 and 0.75 represents good agreement, and greater than 0.75 represents excellent agreement.

The percent agreement between paired subscales and the Kappa statistics are shown in Exhibit 3-5. As expected, the two scales in each pair are in agreement in a high percentage of cases: around 85 percent for the 12-month scale, and around 95 percent for the 30-day scale. More importantly, the κ values are all close to .70, which is toward the high end of the range representing "good" agreement.4

Exhibit 3-5
LEVEL OF AGREEMENT BETWEEN DICHOTOMIZED SPLIT-HALF SCALES

                  Households with Children     Households without Children
Scale             Percent Agreement    κ       Percent Agreement    κ
12-month scale    84.8%                .70     85.8%                .69
30-day scale      94.5%                .68     95.1%                .67

This suggests that the scales provide a reasonable level of reliability in distinguishing between households that have experienced any of the measured facets of food insecurity and households that have not experienced any of these conditions. It is particularly worth noting that the κ statistics for the 30-day scale are quite similar to those for the 12-month scale, even though the 30-day subscales have very few items and a very high percentage of respondents answering all questions negatively. These factors appear to reduce the 30-day scale's reliability in discriminating among households that have experienced one or more of the measured conditions, but the scale remains reasonably strong at distinguishing households that have experienced the conditions from those that have not.

4 In all of the comparisons, the λ statistic indicates that the level of agreement is significantly greater than would be expected by chance (p < .001).
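The figures in Exhibit 3-5 can be read back through the Kappa formula. The sketch below rearranges κ = (p_obs - p_chance) / (1 - p_chance) to recover the chance agreement implied by each reported pair, and applies the agreement bands as stated in the text. The function names are hypothetical, and because the published κ values are rounded to two decimals the implied chance rates are only approximate.

```python
def agreement_band(kappa):
    """Bands as stated in the text: <0.4 poor, 0.4 to 0.75 good, >0.75 excellent."""
    if kappa < 0.4:
        return "poor"
    if kappa <= 0.75:
        return "good"
    return "excellent"


def implied_chance_agreement(p_observed, kappa):
    """Solve kappa = (p_obs - p_e) / (1 - p_e) for the chance agreement p_e."""
    return (p_observed - kappa) / (1 - kappa)


# (percent agreement as a proportion, kappa) as reported in Exhibit 3-5.
EXHIBIT_3_5 = {
    ("12-month", "with children"): (0.848, 0.70),
    ("12-month", "without children"): (0.858, 0.69),
    ("30-day", "with children"): (0.945, 0.68),
    ("30-day", "without children"): (0.951, 0.67),
}

for (scale, group), (p_obs, kappa) in EXHIBIT_3_5.items():
    p_e = implied_chance_agreement(p_obs, kappa)
    print(scale, group, agreement_band(kappa), round(p_e, 2))
```

Under this reading, roughly half of the 12-month agreement and more than four fifths of the 30-day agreement would be expected by chance alone, which is why the raw percent agreement overstates the difference between the two scales while κ does not.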
3.6 SUMMARY

Although there is no absolute rule regarding minimum acceptable levels of reliability, the literature provides at least some rough guidelines. Nunnally (1978), writing in the context of the more traditional measures of reliability, suggests that reliabilities of about .70 can be sufficient to suggest general reliability, particularly in the early stages of measurement development. Nunnally suggests that for basic research, requiring a very high reliability (e.g., above .80) can be counterproductive, as resources are devoted to improving the scale instead of learning about the underlying phenomenon. He also argues, however, that scales used to support decisions regarding the treatment of specific individuals should have reliabilities exceeding .90.

Using the three traditional measures and following the conventional practice of including households with extreme scores, both the 12-month scale and the 30-day scale would be judged quite reliable. Estimated reliability values range from .86 to .93 for the 12-month scale, and from .79 to .89 for the 30-day scale. As noted previously, however, this conventional approach yields statistics that can be influenced by the type of highly-skewed distributions that characterize the food insecurity scales.

A more conservative approach is to separate two types of reliability. The first considers the scale's reliability in describing the level of food insecurity among households that experience one or more of the food insecurity or hunger conditions measured by items in the scale. The second asks about the scale's reliability in distinguishing between households that have vs. have not experienced any of the measured food insecurity or hunger conditions.

The 12-month scale fares quite well on both dimensions of reliability.
When households that answered all questions negatively are excluded from the analysis, the Rasch reliability estimate ranges from .70 to .74, and the more traditional indices range from .74 to .88. Using the dichotomous split-half test, the κ statistics are .69 to .70. Although this approach is novel, and no established benchmarks provide standards for "good" reliability, all of these scores are in the acceptable range for other uses of the statistics.

The 30-day scale is equally reliable at distinguishing households that have vs. have not experienced any of the measured food insecurity and hunger conditions. The κ statistics of .67 to .68 are nearly the same as those for the 12-month scale. The 30-day scale, however, seems less reliable at distinguishing among levels of food insecurity for households that experience one or more of the measured conditions. When we consider only the households that answered at least one question affirmatively, reliability estimates range from .36 to .65.

Two factors reduce the 30-day scale's estimated reliability in distinguishing levels of food insecurity and hunger among households that experience one or more of the measured conditions. First, the number of independent items on the 30-day scale is small. Second, the 30-day scale measures a narrower range of food insecurity, because some of the less severe questions were not asked in the 30-day time frame. To increase the reliability of the 30-day scale to be more comparable to the 12-month scale, it would probably be necessary to add more 30-day items to the Food Security Survey, and in particular to add items measuring less severe conditions of food insecurity than those currently included in the scale.
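The link between the number of items and reliability can be illustrated with the classical Spearman-Brown prophecy formula, r_k = k r / (1 + (k - 1) r), where k is the factor by which the scale is lengthened. This is only a rough sketch under strong assumptions: the formula belongs to classical test theory, assumes the added items are parallel to the existing ones, and is applied here to the Rasch estimates purely for illustration. It says nothing about the second factor noted above, the narrower severity range of the 30-day items. Function names are hypothetical.

```python
def spearman_brown(reliability, length_factor):
    """Predicted reliability when a scale is lengthened by length_factor."""
    return length_factor * reliability / (1 + (length_factor - 1) * reliability)


def length_factor_needed(reliability, target):
    """Lengthening factor needed to move from reliability to target."""
    return target * (1 - reliability) / (reliability * (1 - target))


# 30-day and 12-month dichotomous Rasch estimates, extreme scores excluded.
factor = length_factor_needed(0.57, 0.74)
print(round(factor, 2))
print(round(spearman_brown(0.57, factor), 2))
```

Under these assumptions, the 30-day scale would need roughly twice as many comparable items to reach the 12-month scale's estimate of .74.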
CHAPTER FOUR
DEFINING RANGES OF THE FOOD SECURITY SCALE

The analyses discussed in earlier chapters provide the basis for concluding that food security can be reliably measured as a unidimensional phenomenon. Households can be ranked on the basis of scale values across a continuous range indicating the severity of food insecurity experienced within the household. The full range of severity measured extends from no measurable food insecurity at all, through increasing levels of severity characterized by reduced food intake and hunger for household members, to some maximum measured level.

Although the phenomenon of food insecurity can be viewed as unidimensional and continuous, several distinct ranges of severity are of interest. Identifying these ranges of severity enables one to supplement the continuous food security scale, subdividing it to create a categorical variable providing a comparatively simple measure of food security status in terms of several broad ranges of severity.

In this chapter we describe the conceptual and empirical bases for a priori expectations regarding the structure of a categorical food security status variable, and the process leading to definition of categorical ranges within the continuous food security scale. Several specific issues related to selection of threshold levels or scale dividing lines are summarized, and the final categorical food security status variable is described.

4.1 CONCEPTUAL BASIS FOR A CATEGORICAL FOOD SECURITY STATUS VARIABLE

The first threshold level of severity, or dividing line, to be identified on the unidimensional food security scale is the point of transition from food secure status to food insecure status.
In addition to this threshold, two other cutpoints, deriving from the LSRO/AIN conceptual definitions of food security, food insecurity, and hunger, are of interest.1 As noted in the main report of this study,2 the LSRO/AIN conceptual clarification provides a working definition of hunger as "the uneasy or painful sensation caused by a lack of food" and identifies hunger as "a potential but not necessary consequence of food insecurity" (Anderson/LSRO, 1990). Previous studies examined by the AIN expert group had led to a consensus view of hunger as "nested" within the broader phenomenon of food insecurity, and occurring at the more severe levels of food insecurity as experienced in U.S. households.

1 The conceptual rationale underlying the measurement of food insecurity and hunger developed in the present study is described in Bickel, Andrews and Klein (1996). The research background leading to this measurement approach is documented in the U.S. Department of Agriculture report, Food Security Measurement and Research Conference: Papers and Proceedings, Alexandria, VA: USDA Food and Consumer Service, Office of Analysis and Evaluation, June 1995.

Moreover, empirical evidence supports the conceptual view of household-level food insecurity as a managed process involving identifiable patterns or stages of behavioral responses to food insufficiency as the degree of such insufficiency increases (Radimer, Olson and Campbell, 1990; Basiotis, 1992; Cristofar and Basiotis, 1992; Radimer et al., 1992; Wehler, Scott and Anderson, 1992; Burt, 1993; Cohen, Burt and Schulte, 1993).

Within this framework, food insecurity in the household begins with an initial stage characterized by adult household members' experiences of food insufficiency, anxiety about their food situation, and adjustments in their budget and food management patterns.
These latter behavioral "coping strategies" may involve efforts to augment the household's food supply from emergency or other non-normal sources, and may involve modifications to the variety and quality of food available to household members, but normally do not include reduction in overall quantity of food intake. In this initial stage there is little or no evidence that household members experience actual hunger ("the uneasy or painful sensation caused by a lack of food") as a result of their household's level of food insecurity.

The second stage involves intensification of food economizing behaviors, some of which lead to patterns of reduced food intake among one or more of the adults in the household. When children are present in a household, efforts are made to spare them from food intake reduction through various rationing strategies. If the household's food insecurity persists or worsens, however, a third stage appears in which adult hunger is manifested in more severe forms (e.g., going whole days with no food) and, in households with children, the children experience actual hunger, revealed in patterns of reduced food intake.

2 Hamilton et al. (1997), Household Food Security in the United States in 1995: Summary Report of the Food Security Measurement Project, Alexandria, VA: U.S. Department of Agriculture, Food and Consumer Service, June 1997, Chapters One and Two.

This conceptual framework suggests four potentially identifiable stages or levels of severity within the continuous food security variable. Those severity-level categories are: (1) Food Secure; (2) Food Insecure with No Hunger Evident; (3) Food Insecure with Moderate (adult) Hunger Evident; and (4) Food Insecure with Severe Hunger (child hunger, and severe adult hunger) Evident.
Given these conceptual categories, the question is how best to subdivide the 12-month and 30-day scales into ranges of severity that correspond operationally to the designated conceptual categories.

4.2 DEFINING RANGES AND SELECTING SCALE CUTPOINTS

As described in earlier chapters, the Rasch model assigns a scale value to each household based on the number of scale items answered affirmatively relative to the total number of items answered.3 As an interdependent part of its estimation from the data, the model also ranks scale items according to their level of severity on the basis of the actual response patterns of all households in the data. The 18 items in the final 12-month scale are shown in Exhibit 4-1, with items listed by increasing order of severity from top to bottom in the table. If all responses were perfectly ordered, an affirmative response to any scale item would occur only in conjunction with affirmative responses to all prior, or less severe, scale items. Therefore, as perfect scale ordering is approached among the actual sample households, any number "n" of affirmative responses approaches exact correspondence to the first n items in the scale.

3 For ease of explication this discussion is presented without addressing separately the cases of households with and without children. Readers should note that these two types of households were presented different numbers of items, because questions addressing conditions of children in the household were not presented to households without children. The form of the Rasch measurement model and the BIGSTEPS software that implements the model take these differences into account in calculating household scale scores.

Although the data are not perfectly ordered for all households, in fact the most common pattern of household responses (the mode) does follow the sequential order of severity.4 That is, the
modal household that answers n items affirmatively gives "yes" responses to the n least severe items in the scale sequence.

4 For example, among households with no children, 82 percent followed the modal pattern on the 12-month items. Households answering "no" to all questions, however, amount to 65 percent of the total. Among households answering "yes" to at least one question, 49 percent followed the modal pattern. For the non-modal households, responses deviate from the pattern that would be observed under perfect ordering. Some households answer "yes" to items without answering "yes" to all prior items. A non-modal household with n affirmatives has answered negatively one or more of the n least severe questions, instead affirming one or more of the more severe questions. The Rasch model implicitly treats such a household as equivalent to the modal household with the same number of affirmatives, in effect treating all households as modal and assigning both households the same scale value.

Defining ranges on the continuous scale is the operational means of assigning values to the categorical variable measuring households' food security status. This categorical measure identifies the particular range of severity of food insecurity that a given sample household has experienced in the prior 12-month or 30-day period.

Defining the appropriate scale ranges for classifying households according to food security status involves identifying subsets of the sequential indicator items that best correspond to the conceptual categories described above. After a subset is identified in general terms, it is necessary to identify the appropriate classification boundaries, or points of transition from one severity range to the next. Each such boundary is marked by a particular "threshold item."
The threshold items and their classification boundaries developed in the present study for the purpose of giving operational definition to the categorical food security status variable are depicted by the shaded rows in Exhibit 4-2.5 Thus, the scale itself, with items ranked from least to most severe, provides a meaningful framework within which to identify operationally the designated ranges of behaviors and conditions corresponding to the conceptual construct summarized above. The scale, whose values range from 0 to 10, must be subdivided in terms of numeric values so that a household with a particular scale value can be assigned to a particular food security status category. This subdivision, however, can be accomplished by considering the behaviors and conditions represented by values at each point on the scale.

5 Exhibits 4-1 and 4-2 in the main report of this study (Hamilton et al., 1997) also illustrate this division of the scaled indicator items into the respective severity-level classes of the categorical food security measure.

The procedure for subdividing the scale rests on two features of the scaling methodology described above. First, household values on the food security scale are based fundamentally on a simple count of the number of questions to which they respond affirmatively. Second, most households' responses follow the sequential logic of item severity: a household that says "yes" to a particular question typically says "yes" to all less severe questions as well.

In general, then, one can characterize households that have a particular scale value as having responded affirmatively to a particular group of questions. Exhibit 4-2, which is organized in terms of increasing severity of the questions, illustrates the point. A household that
Exhibit 4-1
ITEMS IN THE FINAL 12-MONTH SCALE LISTED BY INCREASING SEVERITY LEVEL

Item Label   Item Content (All questions refer to the last 12 months)
Q53          Household members worried whether food would run out before they got money to buy more (sometimes or often).
Q54          Respondent reports that the food they bought just didn't last, and they didn't have money to get more (sometimes or often).
Q55*         Household members couldn't afford to eat balanced meals (sometimes or often).
Q58          Household relied on a few kinds of low-cost foods to feed children because they were running out of money to buy food (sometimes or often).
Q24          Adults in the household cut the size of meals or skipped meals because there wasn't enough money for food.
Q56          Household couldn't afford to feed children a balanced meal, because they couldn't afford that (sometimes or often).
Q32          Respondent ate less than he/she felt they should because there wasn't enough money to buy food.
Q25*         Adults in the household cut the size of meals or skipped meals because there wasn't enough money for food in at least 3 of the last 12 months.
Q57          Children were not eating enough because household couldn't afford enough food (sometimes or often).
Q35          Respondent was hungry but didn't eat because couldn't afford enough food.
Q38          Respondent lost weight because there wasn't enough food.
Q40          Adults cut the size of children's meals because there wasn't enough money for food.
Q28*         Adults in household did not eat for a whole day.
Q47          Children were hungry but household couldn't afford more food.
Q29          Adults in household did not eat for a whole day in at least 3 of the last 12 mos.
Q43          Children skipped meals because there wasn't enough money for food.
Q44          Children skipped meals because there wasn't enough money for food in at least 3 of the last 12 mos.
Q50          Children did not eat for a whole day because there wasn't enough money for food.

* Indicates threshold items in the scale.
For each designated range of severity comprising the categorical food-security variable, the subset of indicators beginning with the threshold item and continuing through the successively more severe indicators, up to the next identified threshold, serves operationally to define and characterize that designated range.

EXHIBIT 4-2
THRESHOLD ITEMS DEFINING RANGES OF THE FOOD SECURITY SCALE

                                                   Households with Children      Households without Children
Question (in order of increasing severity)         Number of      Modal          Number of      Modal
                                                   Affirmatives   Household      Affirmatives   Household
                                                                  Value                         Value
(no affirmative responses)                         0              0.0            0              0.0
Q53   Worried food would run out                   1              0.1 (?)        1              0.9
Q54   Food bought did not last                     2              [illegible]    2              [illegible]
Q55*  Couldn't afford balanced meals               3              2.3            3              [illegible]
Q58   Adult fed child few low-cost foods           4              [illegible]    --             --
Q24   Adult cut size or skipped meals              5              3.3            4              3.6
Q56   Couldn't feed child balanced meals           6              3.8            --             --
Q32   Adult eat less than felt should              7              4.3            5              [illegible]
Q25*  Adult cut size or skipped meals, 3+ mos.     8              4.7            6              [illegible]
Q57   Child not eating enough                      9              5.2            --             --
Q35   Adult hungry but didn't eat                  10             5.6            7              5.8
Q38   Adult lost weight                            11             6.0            8              6.5
Q40   Cut size of child's meals                    12             6.4            --             --
Q28*  Adult not eat whole day                      13             6.8            9              7.5
Q47   Child hungry                                 14             7.3            --             --
Q29   Adult not eat whole day, 3+ mos.             15             7.8            10             10.0
Q43   Child skipped meal                           16             8.4            --             --
Q44   Child skipped meal, 3+ mos.                  17             9.3            --             --
Q50   Child not eat whole day                      18             10.0           --             --

* Threshold item (shaded row in the original exhibit). -- Question not asked of households without children. (?) Reading uncertain in the source copy. [illegible] Value not recoverable from the source copy.

gives one affirmative answer most often answers Q53 affirmatively, a household with two affirmatives most often affirms Q53 and Q54, and so on. For each question, the exhibit shows the number of affirmative responses and the associated scale value for households whose responses follow the sequential logic of item severity. For example, if the most severe question affirmed by a household with children is Q24, that household has also responded affirmatively to the four less severe questions (Q53, Q54, Q55, and Q58) and has a total of five affirmative responses. Its corresponding scale score is 3.3. The exhibit also shows parallel, but slightly different, values for a similar household without children.
Q58 is not applicable to that household. Thus, if the most severe question it affirms is Q24, it will have a total of just four affirmative responses. Because the Rasch model, however, computes a scale value that takes into account the number and severity of the questions the household was asked, the scale value for the household without children (3.6) is quite close to the value for the household with children (3.3).

It is possible to describe any point on the scale in terms of the questions that the "modal" or typical household with that scale value has answered affirmatively. Similarly, one can say that all modal households with values at or above a specified point on the scale have responded affirmatively to at least the group of questions corresponding to the specified point. For example, all modal households with values at or above 2.3 have responded affirmatively to at least the three least severe questions in the scale (Q53, Q54, Q55). All modal households with values of 4.7 or higher have responded affirmatively at least to Q24 and to all applicable less severe questions.6

Thus, although the scale itself is a continuous measure of a single dimension (i.e., the severity level of food insecurity), it can be subdivided by considering the collection of conditions and behaviors associated with particular ranges of scale values. In this manner, the scale and the severity rankings provided by the Rasch model yield a statistical framework for defining conceptually meaningful categories for the food security status variable. Within this statistical framework, however, the exact location of the category boundaries or scale thresholds depends upon informed judgment about how best to interpret the conceptual constructs based upon the LSRO/AIN definitions and the previous empirical research findings on food security and hunger. The next section reviews those judgments and the reasoning behind them.
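The notion of a "modal" household can be sketched in a few lines. The item order below is read from Exhibit 4-1, with the garbled label between Q40 and Q47 taken to be Q28, and the set of child-referenced items is inferred from the item wording; both are assumptions of this sketch, as are the function names. The code only counts affirmatives and checks the response pattern. It does not compute Rasch scale values.

```python
# Items for households with children, least to most severe (read from Exhibit 4-1).
ITEMS_WITH_CHILDREN = [
    "Q53", "Q54", "Q55", "Q58", "Q24", "Q56", "Q32", "Q25", "Q57",
    "Q35", "Q38", "Q40", "Q28", "Q47", "Q29", "Q43", "Q44", "Q50",
]
# Items whose wording refers to children (inferred from the item content).
CHILD_ITEMS = {"Q58", "Q56", "Q57", "Q40", "Q47", "Q43", "Q44", "Q50"}
ITEMS_WITHOUT_CHILDREN = [q for q in ITEMS_WITH_CHILDREN if q not in CHILD_ITEMS]


def raw_score(affirmed, items):
    """Number of applicable items the household answered affirmatively."""
    return sum(1 for q in items if q in affirmed)


def modal_items(n, items):
    """Items a modal household with n affirmatives has affirmed."""
    return items[:n]


def is_modal(affirmed, items):
    """True if the affirmed items are exactly the n least severe items."""
    applicable = {q for q in items if q in affirmed}
    return applicable == set(modal_items(len(applicable), items))


# A household with children whose most severe affirmed item is Q24.
household = {"Q53", "Q54", "Q55", "Q58", "Q24"}
print(raw_score(household, ITEMS_WITH_CHILDREN), is_modal(household, ITEMS_WITH_CHILDREN))

# A non-modal pattern: "yes" to Q53, Q54, Q58 but "no" to Q55.
non_modal = {"Q53", "Q54", "Q58"}
print(raw_score(non_modal, ITEMS_WITH_CHILDREN), is_modal(non_modal, ITEMS_WITH_CHILDREN))
```

The first household has five affirmatives and follows the modal pattern; the second has three affirmatives but skips Q55, so it receives the same raw score as a modal household affirming Q53, Q54, and Q55 without following the severity order.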
6 Non-modal households with a given scale value have, by definition, not responded affirmatively to all of the applicable less severe questions, but instead have responded affirmatively to more severe questions. For example, a non-modal household (with children) with a scale value of 2.3 must have answered three questions affirmatively. Instead of Q53, Q54, and Q55 (the three least severe questions), however, the household might have said "yes" to Q53, Q54, and Q58, although saying "no" to Q55.

4.3 EVIDENCE OF FOOD INSECURITY

The LSRO/AIN definitions of food security and food insecurity are:

• Food security: "Access by all people at all times to enough food for an active healthy life. Food security includes at a minimum: (1) the ready availability of nutritionally adequate and safe foods, and (2) an assured ability to acquire acceptable foods in socially acceptable ways (e.g., without resorting to emergency food supplies, scavenging, stealing, or other coping strategies)" (Anderson/LSRO, 1990, p. 1598).

• Food insecurity: "Limited or uncertain availability of nutritionally adequate and safe foods or limited or uncertain ability to acquire acceptable foods in socially acceptable ways" (ibid.).

Several dimensions or aspects of food security are apparent in these definitions, of which the most central and fundamental is described as "enough food for an active, healthy life," i.e., a sufficient quantity of acceptable foods to meet the household's basic needs. A number of additional dimensions are also apparent, including the nutritional quality and safety of available foods, the social acceptability of the means of obtaining food, and the household's assurance or certainty of its ability to obtain needed food.
These additional dimensions of the broad conceptual definition of food security, however, are not directly captured in the questions incorporated in the food security scale. Rather, the measure focuses on the simple quantitative dimension of "enough" food. The food quality dimension is represented only to the extent that some particular quality of food (in both nutritional and conventional senses) is perceived and understood by household members to be necessary. The scale consists entirely of items indicating either this quantitative or qualitative aspect of food sufficiency, as experienced and understood by the household respondent, in relation to his or her self-perception of basic needs.

Several of the questions included in the CPS Food Security Supplement were intended to capture those aspects of households' food coping behaviors that seek to augment insufficient household food supply through emergency or other non-normal means. These extraordinary coping methods, such as obtaining food from food banks or pantries, borrowing money for food, taking children to others' homes for meals, or getting meals at soup kitchens, have been regarded as good behavioral indicators of a condition of food insecurity or insufficiency within the household, and they may be presumed to reflect the concept of acceptability of sources or means of food-acquisition within U.S. social norms. These food-augmenting coping behavior items in the CPS data, however, do not factor together with the indicators that are included in the measurement scale.
Thus, they represent a dimension of the conceptual definition of food security (the assurance of access to food through socially-acceptable means) that is not represented within the unidimensional measure of severity of food insecurity.7

Examining the items in the 12-month scale, shown in severity-ranked order in Exhibits 4-1 and 4-2, the basic question is how many items must be answered affirmatively in order to provide clear evidence of food insecurity as defined above. Item Q53 could be interpreted as indicating uncertainty about the household's access to adequate acceptable food, or the ability to acquire it in socially acceptable ways. By itself, however, this subjective item may be considered to lack face validity as a sufficient indicator of food insecurity. An affirmative response to only this one item was therefore judged by the technical analysis team as insufficient to indicate the threshold level of food insecurity.

Giving affirmative responses to two items (in the modal case, items Q53 and Q54) indicates worry or anxiety about the household's food position, and also initial perceptions of insufficiency of the household's food supply (food bought just didn't last). Although these two items together provide stronger evidence of household food insecurity, they were still judged insufficient to establish unequivocally that severity has reached the threshold level required for the categorical measure of food insecurity. Including item Q55, however, captures not only reports that the household food supply is substandard, but also efforts to cope with this insufficient food supply in ways that, although they may maintain the quantity of food intake, reduce the perceived quality of diets below the level the respondent understands to be needed to maintain "balanced meals."

It is useful to consider the relative severity of items as well as the simple rankings shown in prior exhibits.
Exhibit 4-3 therefore maps the relative severities, using the item calibrations presented in Chapter Two. The three least-severe items in the scale (Q53, Q54, and Q55) appear just prior to a substantial gap in the spacing of item calibrations, indicating a large difference in severity between these items and the group comprising items Q24, Q56, and Q32.

Exhibit 4-3
SEVERITY RANKING OF QUESTIONS IN FOOD SECURITY SCALE
[Figure: the 18 scale items plotted by item calibration along the 0-to-10 severity scale. From most to least severe: Q50 Child not eat for whole day; Q44 Child skipped meal, 3+ months; Q43 Child skipped meal; Q29 Adult not eat for whole day, 3+ months; Q47 Child hungry; Q28 Adult not eat for whole day; Q40 Child meal size cut; Q38 Respondent lost weight; Q35 Respondent hungry but did not eat; Q57 Child not eating enough; Q25 Adult cut/skip meals, 3+ months; Q32 Respondent eat less than should; Q56 Child not fed balanced meals; Q24 Adult cut/skip meals; Q58 Child fed few low-cost foods; Q55 Could not afford balanced meals; Q54 Food bought did not last; Q53 Worried food would run out.]

7 See Chapter Five for further discussion of these indicators of coping behaviors.

Although item Q58 (child fed few low-cost foods) is very close in severity to item Q55 and consistent in conceptual content, selection of the threshold or cutpoint item aims at identifying the point of transition from food security into food insecurity. Thus, the first item completing a group that is conceptually and statistically consistent with food insecurity was judged most appropriate for identifying the threshold.
Item Q55 meets this criterion, and the set of three household- or adult-level items answered affirmatively by modal households responding "yes" to item Q55, taken together, was judged to provide sufficient evidence that the household has experienced food insecurity, although at a level not yet showing evidence of actual hunger among household members.

4.4 SUBJECTIVE REPORTING OF HUNGER

As summarized above, this research has aimed to develop both a continuous measure of severity and a broad categorical measure of resource-constrained food insecurity that can differentiate three broad ranges of severity, the two most severe of which involve actual hunger for household members. This measurement task is guided by the LSRO/AIN conceptual definitions of food insecurity and hunger, where hunger is nested as "a potential but not necessary consequence" of food insecurity, and is defined as "the uneasy or painful sensation caused by a lack of food." Therefore, an essential measurement task is to identify households whose members have experienced actual hunger (the "uneasy or painful sensation caused by a lack of food") as a result of constrained or insufficient household financial resources. Food insecurity or hunger resulting from eating disorders, dieting, or causes other than household resource constraints is not being measured.

Three related factors enter into the conceptual consideration of what constitutes the specific phenomenon being measured. These are access to adequate food, the physiological sensation of hunger, and potential malnutrition. The relationships between the first two of these, the basic dimension of food insecurity and hunger as experienced within households, constitute the focus of the present research. The relationship of this basic experiential dimension to malnutrition (which is also defined as nested, a "potential but not necessary consequence," within food insecurity) is not addressed in this research.
All items in the CPS Food Security Supplement addressing aspects of food insecurity or hunger contain explicit language making it clear to respondents that the condition being asked about is specifically caused by constrained household financial resources. For example, item Q53 states "I/We worried whether (my/our) food would run out before (I/we) got money to buy more." Item Q54 states "The food (I/we) bought just didn't last, and (I/we) didn't have money to get more," whereas item Q55 states "(I/We) couldn't afford to eat balanced meals." Such qualifying language is included consistently in all food insecurity and hunger items in the CPS instrument, including all those appearing in the food security scales. As a result, within the limits of unidentifiable measurement error, affirmative responses to scale items can be expected to reflect clear understanding by respondents that such answers are identifying resource-constrained conditions. Although the possibility of respondents' intentional misreporting exists, as in every survey, the history and nature of the CPS, the high degree of preparedness of CPS interviewers, and the careful design and testing of the Food Security Supplement items all tend to reduce this and other types of measurement error.

This point is important because identifying the second classification boundary (the transition from food insecurity with no hunger evident into food insecurity with moderate (adult) hunger evident) relies primarily on evidence that reduced food intake consistent with hunger has occurred within the referenced time period among adults in the household, and that this hunger has resulted specifically from the resource-constrained food insecurity of the household.
The task faced by the analysis team of determining the most appropriate severity level of the initial boundary for the severity range of food insecurity with hunger present involved two kinds of judgment. First, it was necessary to decide which specific items available in the scale should be taken to indicate actual hunger for one or more adults in the household attributable to resource constraint. These potentially include measures of reduced quantities of food intake for adult household members (e.g., Q24, Q25), respondents' subjective assessment of intake adequacy (Q32), or direct perception and report of personal hunger (Q35). Second, given the scale items available, a judgment is required as to how many such items are needed to provide sufficient evidence that household members have experienced actual hunger due to resource constraint. As explained below, the threshold ultimately chosen relies on evidence of a repeated pattern of reductions in food intake by adults over the referenced time period.

The physiological sensation of hunger is experienced universally by all humans, and a large research literature exists examining the nature of the experience in the context of basic human physiology and clinical nutrition.8 Several articles from this research literature are summarized in Appendix A of the present volume. The studies described in this literature provide strong support for the validity of subjective reporting of the sensation of hunger (see, for example, Mattes and Friedman, 1993), although they find considerable variation in how the sensation is experienced and described. These studies seem to provide clear evidence that when usual patterns of eating are interrupted by reducing food intake through actions such as cutting the size of meals or skipping meals, the "uneasy or painful sensation caused by a lack of food" is the natural result.
The intensity of the sensations experienced is positively associated with the length of the period of abstinence, although they diminish and may disappear altogether after an extended period of fasting (usually several days). The results reported in this literature are thus consistent with the use of items indicating that reduced food intakes below usual or normal meal patterns, due to resource stringency, are evidence that hunger has been experienced.

Referring to Exhibit 4-3 above, after Q55 the next most severe item to indicate reduction of food intake among adults is item Q24 (Adults cut/skip meals). Note that this item appears in Exhibit 4-3 at virtually the same level as child item Q56 (Child not fed balanced meals), which indicates reduction in the quality of diets provided to children in the household at this level of severity of food insecurity. The next item (Q32, Respondent eat less than should) indicates that food intake has fallen below the respondent's own normative standard for the amount of food he or she should be eating. An affirmative response to item Q25 indicates that, in addition to all of the foregoing conditions, adults in the household cut the size of or skipped meals in three or more of the previous twelve months due to constrained resources, indicating a pattern of repetition of reduced food intakes among adult household members. This item was judged to provide sufficient additional evidence for the presence of adult hunger in the household, and was chosen, therefore, as the item indicating the point of transition from the category of food insecurity with hunger not evident to the category of food insecurity with adult hunger evident. Households in which the respondent answered affirmatively to item Q25 will, in the modal case, also have answered affirmatively to all previous items, indicating the household has experienced a comparatively severe level of food insecurity. The affirmative answer to item Q25 indicates that adults in the household have experienced, in addition, a pattern of repeated reductions in food intakes of a type that the physiological research literature indicates is normally accompanied by the "uneasy or painful sensation caused by a lack of food," or hunger.

8 See Mattes and Friedman (1993) and Read, French and Cunningham (1994) for two general reviews covering much of this research (see References, Appendix A).

When considering the selection or identification of cutpoint items, and when deciding whether affirmative responses to items or sets of items yielded sufficiently clear evidence of a particular condition (e.g., resource-constrained adult hunger), the study team employed a general principle of requiring a pattern of repetition of either behaviors or items, or both. Thus, in considering items indicating reduced food intake among adults, Q25 was viewed as providing sufficient evidence because it involved occurrence of the behavior "cutting or skipping meals" in a recurring pattern over the previous twelve months. Similarly, when considering items indicating the existence of food insecurity with no hunger evident, a pattern of affirmative responses to a sequential series of items was considered stronger evidence than affirmation of only one or two pertinent items. This principle was employed to provide additional assurance against response error.9

4.5 EVIDENCE OF CHILD HUNGER AND SEVERE ADULT HUNGER

Exhibit 4-3 shows items Q38, Q40, Q28, and Q47 all grouped at nearly the same level of severity and located at a considerably increased level of severity beyond items Q25, Q57, and Q35.
The logic described above for selection of item Q25 as the threshold item for food insecurity with adult hunger evident might suggest item Q40 (size of children's meals cut) as a likely candidate for the best item indicating the transition into food insecurity with severe hunger, because children's hunger is conceptually the most salient aspect of severe hunger in the household. For reasons similar to those outlined above, however, a more severe item was chosen. The wording of item Q40 allows the respondent to answer affirmatively if children in the household had their meal size cut due to resource constraint only once or a small number of times within the previous twelve months. Here again, sufficient evidence of hunger among children was thought to require either a repetitive pattern of reduced food intake or a multiple series of responses indicating such a condition.

9 Issues of response error are discussed further in Chapter Eight.

Note that the child items indicating meals being cut and skipping meals occur as two separate items, unlike the adult version, in which these two conditions are combined as one item. The item addressing children skipping meals appears in Exhibit 4-3 at a much higher level of severity than the item regarding size of children's meals being cut. Skipping meals, as would be expected, reflects a more severe condition than cutting the size of meals. In addition, adult items Q38, Q28, and Q29, all of which indicate comparatively severe levels of adult hunger, appear prior to child item Q44, which indicates a pattern of repeatedly skipped meals among children. These circumstances led team members initially to choose item Q47 (child hungry but couldn't afford more food) as the cutpoint indicating the beginning of food insecurity with child or severe adult hunger evident.
Assignment of household food security status using item Q47 as this cutpoint, however, led to anomalous results due to the different numbers of items presented to households with and without children. This anomaly was avoided by choosing item Q28, which appears at virtually the same severity level as item Q47 in Exhibit 4-3, as the cutpoint item indicating the transition from food insecurity with adult hunger evident into food insecurity with child and severe adult hunger evident. In modal households with children responding affirmatively to item Q28, two items related to reduction of food intake among children receive "yes" answers: item Q57 (children were not eating enough) and item Q40 (children had meal size cut). Moreover, respondents in all household types respond affirmatively to Q35, Q38, and Q28, indicating that adults in the households "were hungry but did not eat because they couldn't afford food," "lost weight because there wasn't enough food," and did "not eat for a whole day because there wasn't enough money for food." Affirmative responses to these items, taken together with affirmative responses to all less severe items, appear to provide clear and strong evidence of child hunger and severe adult hunger.

4.6 SUMMARY

The primary task of the food security measurement study was to identify, test, and develop a unidimensional measure of food insecurity and hunger based on the CPS food security data, if a statistically strong and sound measure of this kind could be found. The Rasch measurement method was successful in producing a unidimensional, continuous-variable measure of severity of food insecurity and hunger from the CPS data that met these requirements.
The second task of the project, which was dependent upon the success of the underlying continuous measure, was to develop a categorical-variable measure of several designated ranges of severity of food insecurity, and the classification of households into these designated severity ranges, or categories, as follows:

• food secure
• food insecure with hunger not evident
• food insecure with moderate hunger
• food insecure with severe hunger

The conceptual construct for these designated ranges of severity was drawn from the AIN/LSRO conceptual definitions of food insecurity and hunger, from other prior research on food security measurement, and from limiting the measurement effort to one of the central elements of the broad food security concept that is amenable to direct measurement, the direct household experience of insufficient food to meet basic needs. Other elements of the broad conceptual definition, such as safety of food, actual nutritional adequacy of diets, and social acceptability of food acquisition, are not encompassed in the present measure of severity of food insecurity.

The categorical measure of food security status depends on classifying households into identifiable ranges of severity on the underlying continuous severity measure. The aim in identifying or selecting the appropriate ranges of severity on the continuous measure was to achieve acceptably close correspondence to the conceptual bases of the designated broad food security status categories described above. The operational means of establishing the several severity ranges was to select the most appropriate indicator items from among those available in the continuous measurement scale to identify, or define operationally, the classification boundaries, or thresholds, separating each designated severity range category from the next.
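The role of the threshold items can be illustrated with a minimal sketch. It assumes, as a simplification not made in the report itself, that status is read directly from responses to the three threshold items named in Sections 4.3 through 4.5 (Q55, Q25, and Q28); the report assigns status from scale scores at the cutpoints associated with those items, so the sketch matches the report's classification only for households with the modal response pattern. The function name `classify_household` and the dictionary layout are hypothetical.

```python
# Threshold items named in Sections 4.3-4.5, ordered from most to least severe.
# Reading status straight from item responses is an assumption of this sketch;
# the report assigns status from scale scores at the associated cutpoints.
THRESHOLDS = [
    ("Q28", "food insecure with severe hunger"),
    ("Q25", "food insecure with moderate hunger"),
    ("Q55", "food insecure with hunger not evident"),
]


def classify_household(responses):
    """Return the food security status category for one household.

    `responses` maps item labels (e.g. "Q55") to True for an affirmative
    answer. Items missing from the mapping are treated as not affirmed.
    """
    # Check the most severe threshold first so the highest range wins.
    for item, category in THRESHOLDS:
        if responses.get(item, False):
            return category
    return "food secure"


# Hypothetical household affirming Q55 and Q25 but not Q28.
example = {"Q55": True, "Q25": True, "Q28": False}
print(classify_household(example))
```

Checking the most severe item first reflects the nested ordering described above: a modal household that affirms a more severe threshold item has also affirmed the less severe ones.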
Selecting these threshold items involved judgment as to which items best reflect the transition from one broad range or category of severity to the next. Identification of the threshold items and their associated scale cutpoint scores for each level of the categorical food security status variable involved use of statistical results from the Rasch model, guided by the LSRO/AIN conceptual definitions of hunger and the results of previous research in the areas of physiology, clinical nutrition, and food security measurement. Team members combined these factors to select thresholds or cutpoint items that are most consistent with the statistical results, empirical evidence, and the conceptual framework representing the predominant understanding of food insecurity and hunger within the nutrition science community.

CHAPTER FIVE
THE RESOURCE AUGMENTATION QUESTIONS

In fitting the model for the 12-month food security scale, one group of questions was conspicuously not included because they did not meet the statistical criteria for inclusion in the scale. These questions involve actions that households might take to deal with a problem of constrained food resources, and specifically actions other than reducing food intake or otherwise modifying the internal household management of food resources. The questions refer to actions such as putting off other bills in order to buy food, or obtaining meals from soup kitchens. The class of actions has variously been termed "coping" or "resource augmentation" behaviors.

Because resource augmentation behaviors are pertinent to one dimension of the LSRO/AIN definition of food insecurity (the ability to acquire food in "socially acceptable ways"), the research team considered it important to explore the possibility of supplementing the primary food security scale with some composite based on the resource augmentation questions.
For example, the food security status variable, rather than simply being based on a subdivision of the primary scale, might also take into account the household's value on the resource augmentation composite. Ultimately it was concluded that, although such a composite might be useful for some researchers in particular situations, it does not add significant value to the food security status variable. This chapter reviews the conceptual underpinnings of the effort to construct a composite, the procedures that were implemented, and the likely effect of using a composite such as that described.

5.1 TWO DIMENSIONS OF FOOD INSECURITY

The LSRO/AIN conceptual definition of food insecurity includes several diverse aspects or dimensions of households' food situations, of which only one central element (the direct experience of insufficient food to meet basic needs) is captured in the measure developed from the CPS food security data. Households can, however, be food insecure either because they are unable to obtain enough food (for discussion, call this food insecurity "type A"), or because they have to resort to socially unacceptable ways of obtaining food (call this "type B"). They may also be food insecure for both these reasons. That is, they may resort to socially unacceptable ways of obtaining food and still not obtain access to sufficient food (call this "type A&B").

Because resource-constrained hunger is understood to be nested within food insecurity, it will not occur in a household unless that household is food insecure. If a household is food insecure type A (unable to obtain enough food) at a sufficient level of severity, then hunger may result. Likewise, if a household is food insecure type A&B, hunger may still emerge, despite the household's efforts to augment its available food through various coping measures.
If a household's food insecurity is limited to type B only, however, the presence of basic food insufficiency and hunger within the household cannot be inferred from this information. This relationship is illustrated in Exhibit 5-1.

Exhibit 5-1
ILLUSTRATION OF ROLE OF RESOURCE AUGMENTATION BEHAVIORS

Food Availability | Mode of Acquisition | Food Security Status
--- | --- | ---
Sufficient food available | AND socially acceptable acquisition | Food secure
Limited or uncertain availability (anxiety, adjustments to budget management, adjustments to food quality) | OR resource augmentation via socially unacceptable means | Food insecure with hunger not evident
Severely limited availability (reduced food intake and other indicators) | | Food insecure with evidence of hunger

The availability of sufficient foods to meet basic needs (food insecurity type A). This dimension is well represented in the final unidimensional 12-month scale. As described in the previous chapter, scale development activities demonstrated that it is possible to define a range of values on this scale that can be used to classify households as "food insecure" on the basis of limited availability of foods relative to household need, operationally indicated by a pattern of anxiety about the adequacy of the household's food supply, and deterioration in the quality and quantity of food available in the household.

The ability to acquire foods in socially acceptable ways, or via normal channels (food insecurity type B). The scale development models employed do not capture this dimension. Using the final 12-month scale to classify households as food insecure leaves open the possibility that some households relying on extraordinary coping methods to acquire food in socially unacceptable ways will be classified as food secure.
This situation emerges because the items in the CPS Food Security data that address this latter dimension of food insecurity do not fit the measurement models leading to the final 12-month scale. Two sets of items ask questions that provide indications of whether households obtained food in ways that might be considered socially unacceptable. One set of items asks whether households undertook actions to augment their food supply or other household resources within the previous 12 months. These items are summarized in Exhibit 5-2.

Exhibit 5-2
RESOURCE AUGMENTATION ITEMS IN THE FOOD SECURITY SURVEY INSTRUMENT

Item Label | Item Summary/Description
--- | ---
Q18 | "get food or borrow money for food from family or friends?"
Q19 | "send or take children to the homes of friends or relatives for a meal?"
Q21 | "put off paying a bill so you would have money to buy food?"
Q22 | "get emergency food from a church, food pantry, or food bank?"
Q23 | "eat meals at a soup kitchen?"

A second set of items asks whether members of the household obtained food through federal food assistance programs. These programs include food stamps, elderly feeding programs, the child and adult care feeding program, school feeding programs, and WIC. There are two strong arguments, however, for not using these items to classify households as food insecure.

First, participation in such programs may not be considered "socially unacceptable" by many of the participants. There is some evidence to that effect, although this point has not been adequately researched (Trippe and Beebout, 1988; Fraker, 1990; Radimer, Olson and Campbell, 1990; Trippe, Doyle and Asher, 1992; Olson, Frongillo and Kendall, 1995). Second, there is a problem of logical circularity that could diminish the usefulness of the food insecurity measures for policy considerations.
The food insecurity measures are potentially useful in helping policy makers assess the need for government food assistance programs. Including program participation in the food insecurity measures, however, permits the following potentially perverse result: If the government makes programs more available (for example, by increasing the income eligibility threshold for free school lunches, or food stamps), more people will participate and the experienced level of food insecurity would be expected to decline. The measured level, however, may either decline or increase, depending on how the participation indicator interacts with other indicators of the condition. Conversely, if the government cuts back on programs, participation will decline and the effect of the participation indicator may cause the measured level of food insecurity to go down (i.e., the food insecurity problem can be "solved" by taking away the programs). Because of this situation, participation in government food assistance programs was not included in the candidate pool of items for a resource-augmentation index. For the classification of households as food insecure to be more fully consistent with the LSRO/AIN definitions, there would need to be a way to include information on food acquisition through ways that are not socially acceptable (non-normal channels). An important part of the indicator items used in earlier efforts to develop measures of food insecurity and hunger reflect actions or behaviors undertaken by household food managers to avoid or ameliorate hunger when food or financial resou |
OCLC number | 888048062 |