This file was prepared for electronic distribution by the inforM staff. Questions or comments should be directed to inform-editor@umail.umd.edu.

The Economic Status of Black Women: An Exploratory Investigation
Staff Report
United States Commission on Civil Rights
October 1990

APPENDICES

APPENDIX A

A Comparison of the Wage and Salary Income Data from the 1980 Census of Population and the March 1980 Current Population Survey

Two of the most important contemporary sources of data on the wage and salary income of Americans are the 1980 Census of Population and the annual Current Population Surveys. A comparison of the 1980 census and the March 1980 Current Population Survey (CPS) reveals that these two data sets yield very different estimates of the average wage and salary income for different socioeconomic groups. This appendix examines these discrepancies and considers the implications for attempts to determine the relative wages of different socioeconomic groups.

Both the 1980 census and the March 1980 CPS ask for the respondent's total wage and salary income during the previous calendar year. Since both data sets record the number of weeks worked and the "usual" number of hours worked by the respondent during 1979, it is also possible to calculate weekly and hourly wage rates for respondents from both data sets. This appendix compares the estimates of annual, weekly, and hourly wages derived from the 1980 Census of Population with those derived from the March 1980 CPS.

The census estimates in this report are based on a 1/1000 sample of whites and a 1/100 sample of blacks. The CPS data are based on the entire March 1980 Annual Demographic File. The final samples from both data sets include only non-Hispanic whites and blacks between the ages of 18 and 64 who reported positive wage and salary income in the previous year. Students, unpaid workers, self-employed workers, and members of the armed forces were excluded from both samples.
The final census sample contains 48,191 white women, 68,248 black women, 41,000 white men, and 52,848 black men. The final CPS sample contains 41,360 white women, 4,914 black women, 39,614 white men, and 3,682 black men.

Table A.1 reports mean annual, weekly, and hourly wages for white women, black women, white men, and black men for the census and CPS samples. Average weeks worked last year and average usual hours worked per week are also shown.

Although the 1980 census and the March 1980 CPS cover the same time period and were collected a month apart, the average annual wage and salary incomes calculated from the 1980 census data range from 5 to 10 percent higher than those derived from the March 1980 CPS. The difference is larger for women than for men and larger for blacks than for whites. Comparisons between the two data sets reveal even larger differences when weekly and hourly wages are considered.

There are small discrepancies between the average numbers of weeks worked and the average numbers of hours worked per week calculated from the census data and those calculated from the CPS data. For all groups, the average weeks and hours worked are higher in the census than in the CPS.

Further examination of the data reveals that the discrepancies between the two data sets disappear almost entirely if the sample is restricted to workers who work full time year round. Table A.2 reports average annual, weekly, and hourly wage rates and weeks and hours worked by group for full-time, year-round workers only. For full-time, year-round workers the census and CPS wage rates, hours, and weeks are remarkably close for all four groups. Although the discrepancies are slightly larger for blacks than for whites, this can probably be explained by the small sample size of the CPS for blacks. The discrepancies between the two data sets are very large, however, for workers who did not work full time year round (see table A.3).
The average wage and salary incomes of part-time or part-year workers calculated from the census data are over 15 percent higher for whites (both men and women) and over 25 percent higher for blacks (both men and women) than the comparable figures calculated from the CPS data. In addition, the average hours and weeks for these workers calculated from the census data are quite a bit higher than those calculated from the CPS data. The census data show weekly and hourly wage rates that are much larger than those derived from the CPS data. The average hourly wage of black women who did not work full time year round calculated from census data is more than twice the hourly wage calculated from CPS data. (1)

[Tables A.1 and A.2 are not available in electronic format]

Controlling for whether work is full-time, year-round significantly reduces the difference between men and women in the census-CPS discrepancies. The difference between men and women disappears entirely for full-time, year-round workers (table A.2) and is much less for part-time, part-year workers (table A.3). On the other hand, the racial differences in the census-CPS discrepancies observed in table A.1 persist in tables A.2 and A.3. Even when full-time, year-round status is controlled for, the gap between census and CPS wages is larger for blacks than for whites.

Although the census produces much higher wage estimates for all groups than the CPS, the two data sets might, nonetheless, yield similar estimates of relative wages across groups. Table A.4 shows that the two data sets yield very similar relative wages for full-time, year-round workers, but that there are important discrepancies in the relative wages implied by the two data sets when all workers are considered. These discrepancies are larger for hourly and weekly wage ratios than for annual wage ratios. Most affected are estimates of the ratio of black women's to white women's wages.
The census data show a relative hourly wage for black women of 111.2 percent (2), whereas the CPS data show the much smaller figure of 89.2 percent. Large differences between the census and CPS ratios of black women's to white men's and to black men's wages and of black men's to white men's wages are also observed.

What factors might explain the observed discrepancies between the 1980 census and the March 1980 CPS? Since the discrepancies appear to be limited to part-time or part-year workers, it is natural to suppose that, unlike full-time, year-round workers, workers who do not work full time year round find it difficult to recall accurately their annual wage and salary income and the number of weeks and hours they worked during the year. It is not apparent, however, how this can explain why the wage and salary, hours, and weeks figures calculated from the census are systematically higher than those calculated from the CPS, nor does it explain why the size of the discrepancies varies by race and by gender.

There are several differences between the census and the CPS that might cause the systematic discrepancies between the wage and salary incomes, weeks, and hours reported in the two data sets. For one, the census is a much larger data set than is the CPS. In particular, the CPS sample used in the report contained only about 10 percent as many blacks as the census sample. Thus, sample variance could be a problem in the case of the CPS. It is not clear, however, how sample variance could lead to the systematic differences observed between the two data sets.

A more promising possible explanation for the systematic differences between the two data sets is that the CPS questions were asked in March, before most people filled out their income tax forms, whereas the census data were collected in April, at a time when people were filling out or had just recently completed their forms.
This suggests that wage and salary income and perhaps weeks and hours reported in the census are more accurate than those reported in the CPS. On the other hand, the census data were largely self-reported, whereas the CPS data were collected by trained enumerators, which should lead to more accurate reporting. For both data sets, nonresponses to wage and salary income questions were common, and in both cases values were imputed for missing data. This factor, too, could conceivably lead to the observed discrepancies. (3) In the absence of research comparing the two data sets and evaluating their accuracy, it is impossible to know which, if any, of the above factors explain the discrepancies between the two data sets. (4)

For researchers concerned with the relative economic positions of various demographic groups, the discrepancies in wage and salary data between the 1980 census and the March 1980 CPS are serious. Not only are the levels of wage and salary income implied by the two data sets inconsistent, but the two data sets give a very different picture of the relative economic positions of the different social groups. The researcher who relies on the 1980 census finds that black women have hourly wages that are substantially higher than those of white women. Yet the March 1980 CPS data imply that black women still earn only 90 percent as much as white women per hour.

The 1980 census and the March 1980 CPS provide very different answers to questions of great importance to labor economists and to the society at large. It is important to look carefully at both data sets to determine which, if either, is the more accurate, and to understand the sources and extent of biases in the wage and salary data as well as the data on hours and weeks worked for both data sets. A thorough study comparing the two data sets is necessary if an answer is to be found to even the most basic questions concerning the relative economic position of different social groups in the United States today.
[Tables A.3 and A.4 are not available in electronic format]

ENDNOTES

(1) The remarkable similarities between the census and the CPS for full-time, year-round workers and the remarkable differences between the two data sets for part-time or part-year workers held up when age and education were controlled for in addition to full-time, year-round status. The discrepancies between the census and the CPS were equally large and exhibited the same patterns for part-time workers and for part-year workers when these two groups were considered separately.

(2) The discrepancy between the 1980 census black-white wage ratio reported here and the one reported in the body of the report arises because different measures of hours per week worked last year are used. In the body of the report, predicted usual hours last year were used (see the discussion in app. E), but in this appendix, reported usual hours last year are used.

(3) Eliminating persons with imputed values from our sample does not get rid of the census-CPS discrepancies, however.

(4) One study comparing the census and the CPS personal income data found that, compared to an independent benchmark, the census data overreported wage and salary income by 3 percent and the CPS underreported wage and salary income by 1 percent. No effort was made to look at misreporting by race or sex group or by full-time, year-round status. See George Patterson, "Quality and Comparability of Personal Income Data from Surveys and the Decennial Census." Paper presented at the Plenary Session of the Joint Advisory Committee Meeting, Apr. 25-26, 1985, in Rosslyn, Virginia.

APPENDIX B

Occupations -- Appendix to Chapter 5

This appendix provides supplementary material for chapter 5 on women's occupations from 1940 to 1980. The first section defines and discusses the index of occupational dissimilarity used throughout chapter 5 as a measure of occupational segregation (5) by race.
The second section supplies a more detailed description of the method of generating the hypothetical occupational distributions discussed in chapter 5. The final section presents supplementary tables.

Index of Occupational Dissimilarity

To measure the degree of occupational segregation by race, this report calculates an index of occupational dissimilarity, S, defined by:

$$S = \frac{1}{2} \sum_i |b_i - w_i|$$

where i refers to occupation, and $b_i$ and $w_i$ are the percentages of black and white women in occupation i, respectively. Possible values of S range from 0, which indicates no racial differences in occupations, to 100, which indicates complete occupational segregation.

For example, if there are two occupations and 30 percent of both blacks and whites are employed in occupation 1, and 70 percent of both groups work in occupation 2, the value of the index of occupational dissimilarity would be:

$$S = \frac{1}{2}\left(|30-30| + |70-70|\right) = 0$$

showing no dissimilarities. If, however, all blacks work in occupation 1 and all whites work in occupation 2, the value of the index would be:

$$S = \frac{1}{2}\left(|100-0| + |0-100|\right) = 100$$

showing complete occupational segregation by race.

It is important to note that, for a given distribution of persons across jobs, the value of the index of occupational dissimilarity depends on how jobs are grouped into occupational categories. In general, broader categories yield lower values of S, and narrower categories yield higher values of S. For instance, if all jobs are grouped into three occupational categories, professional, other white collar, and blue collar, then occupations such as doctor and nurse, or operative and servant, would not be differentiated. The index of dissimilarity would not capture racial differences in distribution across the more narrowly defined occupations and, thus, would find a lower degree of occupational segregation than if more narrowly defined categories were used.
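The index and its sensitivity to how occupations are grouped can be illustrated with a short Python sketch. The function names and all percentages below are hypothetical, chosen only for illustration; they are not taken from the report's tables.

```python
def dissimilarity_index(black_pct, white_pct):
    """Return S = 1/2 * sum over occupations of |b_i - w_i|.

    Both arguments map occupation name -> percentage of the group
    employed in that occupation (each should sum to 100).
    """
    if black_pct.keys() != white_pct.keys():
        raise ValueError("both groups must use the same occupational categories")
    return 0.5 * sum(abs(black_pct[occ] - white_pct[occ]) for occ in black_pct)


def merge_categories(pct, mapping):
    """Collapse narrow occupations into broader ones by summing percentages."""
    merged = {}
    for occ, share in pct.items():
        broad = mapping[occ]
        merged[broad] = merged.get(broad, 0.0) + share
    return merged


# Hypothetical percentages across four narrowly defined occupations.
black = {"doctor": 5, "nurse": 25, "operative": 30, "servant": 40}
white = {"doctor": 20, "nurse": 10, "operative": 50, "servant": 20}

# Hypothetical grouping of the narrow occupations into two broad categories.
broad = {
    "doctor": "professional",
    "nurse": "professional",
    "operative": "blue collar",
    "servant": "blue collar",
}

narrow_s = dissimilarity_index(black, white)
broad_s = dissimilarity_index(
    merge_categories(black, broad), merge_categories(white, broad)
)
```

With these made-up figures the narrow categories give S = 35, while the two broad categories give 30 percent of each group in one and 70 percent in the other, so S = 0, matching the first worked example in the text.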
Because the occupational categories used usually differ from one study to another, generally it is not possible to compare values of the index of occupational dissimilarity across studies.

In choosing the occupational categories used in this report, care was taken to find categories that could be defined consistently across census years, so that trends in the degree of occupational segregation over time could be observed. The resulting occupational categories are relatively broad. One advantage of broad categories is that they are more likely to reflect a person's job accurately. (6) On the other hand, they may lead to underestimation of the true extent of racial segregation in jobs.

Generation of the Hypothetical Occupational Distributions Used in Chapter 5

Chapter 5 refers to hypothetical occupational distributions for black and white women assuming each group had the other's characteristics. The purpose of generating these hypothetical occupational distributions is to determine the extent to which occupational segregation by race can be accounted for by racial differences in characteristics.

A first step in generating these hypothetical occupational distributions was to classify women of each race into various cells according to their educational level (0-7, 8-11, 12, 13-15, 16+), region of residence (South, non-South), urban or rural residence (urban, rural), and age (20-24, 25-34, 35-44, 45-54, 55-64). The proportion of women in each cell and the occupational distribution of women within each cell (i.e., the proportion of women in the cell who were in each occupation) were then determined separately for black and white women. The hypothetical occupational distributions reported in tables 5.11-5.13 giving black women white women's characteristics were generated by determining the overall occupational distribution that would exist in each year if women had the black occupational distribution within cells and the white distribution across cells.
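The reweighting just described can be sketched in Python as follows. This is a minimal illustration, not the report's actual procedure: the function name, the two-cell setup, and every number are hypothetical, and the sketch assumes every cell with a positive share also has a within-cell distribution (the text does not say how empty cells were handled).

```python
def overall_distribution(cell_shares, within_cell_dist):
    """Combine within-cell occupational distributions using cell shares as weights.

    cell_shares:      {cell: proportion of women who fall in that cell}
    within_cell_dist: {cell: {occupation: proportion of the cell in that occupation}}
    """
    overall = {}
    for cell, share in cell_shares.items():
        # Assumption: within_cell_dist has an entry for every cell in cell_shares.
        for occupation, proportion in within_cell_dist[cell].items():
            overall[occupation] = overall.get(occupation, 0.0) + share * proportion
    return overall


# Illustrative numbers only, with two cells standing in for the full
# education x region x urban/rural x age classification.
black_within = {
    "low_education": {"service": 0.8, "clerical": 0.2},
    "high_education": {"service": 0.4, "clerical": 0.6},
}
black_shares = {"low_education": 0.5, "high_education": 0.5}
white_shares = {"low_education": 0.25, "high_education": 0.75}

# Actual black distribution: black within-cell behavior, black cell shares.
actual = overall_distribution(black_shares, black_within)

# Hypothetical distribution "giving black women white women's characteristics":
# black within-cell behavior, white cell shares.
hypothetical = overall_distribution(white_shares, black_within)
```

In this made-up example the share of black women in service work falls from about 60 percent to about 50 percent once white women's distribution across cells is substituted, which is the kind of comparison the hypothetical distributions are meant to support.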
Similarly, the hypothetical occupational distributions giving white women black women's characteristics were generated by determining the overall occupational distribution that would exist if women had the white occupational distribution within cells and the black distribution across cells.

The hypothetical occupational distribution reported in table 5.15 giving black women in 1940 their 1980 characteristics was generated by determining the occupational distribution that would exist if women had black women's 1940 occupational distribution within cells but their 1980 distribution across cells. The hypothetical occupational distribution giving black women in 1980 their 1940 characteristics was generated in a similar manner, as were the corresponding hypothetical occupational distributions for white women reported in this appendix, table B.4.

Finally, the hypothetical occupational distributions by region reported in tables B.6-B.7 are generated by separating women according to their region of residence (South/non-South) and then generating hypothetical occupational distributions for each region for the remaining characteristics in a similar manner to those described above.

[Tables B.1 through B.12 are not available in electronic format]

ENDNOTES

(5) Throughout this appendix, following standard social science terminology, the term "occupational segregation" refers to differences in two groups' occupational distributions. The term is not intended to imply anything about the cause of racial differences in occupations.

(6) When occupations are defined very narrowly, distinctions between them become unclear, and persons working in the same job may be coded in different occupations.

APPENDIX C

An Attempt to Measure Differences in the Quality of Education by Race, Region, and Educational Level

This report has found gaps between the wages and occupations of black and white women that cannot be explained by racial differences in measured characteristics.
The report has also found that the unexplained gaps between the wages and occupations of black and white women are generally larger in the South and at low levels of education. Throughout, the report speculates that differences in the quality of education received by black and white women might account for some of the black-white differences in wages and occupations that cannot be explained by racial differences in measured characteristics. If black women receive generally lower quality education, then their educational achievement will generally be lower than that of white women with the same number of years of education. If employers consider educational achievement rather than years of schooling completed when making hiring decisions and setting pay, the lower educational achievement of black women might be partially responsible for their lower relative wages and occupational status. This appendix constitutes an attempt to determine whether measured educational achievement varies by race consistently with the patterns in women's wages and occupations noted above. Is the educational achievement of black women lower than that of white women with the same number of years of school? Is the educational achievement of black women relative to white women lower in the South than in the rest of the country? Do black women have lower educational achievement relative to white women at low levels of education? If the answers to these three questions are yes, the argument that part of the unexplained racial gaps in women's wages and occupations is due to differences in the quality of education rather than to direct labor market discrimination becomes more credible. Since measures of educational achievement are not available in the data sources used for the bulk of this report, this appendix uses data from the National Longitudinal Survey of Youth (NLSY) to study patterns of educational achievement by race. 
As part of a larger survey, the NLSY administered the Armed Forces Qualification Test (AFQT) to a large national sample of young persons (male and female) between the ages of 14 and 21 in 1979. The AFQT is a test routinely administered to inductees to the armed forces for placement purposes. The test evaluates skills in the areas of science, mathematical reasoning, word knowledge, paragraph comprehension, numerical operations, coding speed, automobile and shop, mathematical knowledge, mechanical comprehension, and electronics. The scores on these subtests are then collapsed into a unidimensional overall AFQT score.

This appendix analyzes the relationship between final AFQT score, race, region, and education for the 3,085 white women and 1,477 black women in the NLSY sample for whom test scores are reported. A woman's final AFQT score is taken as a measure of her educational achievement. Since persons in the NLSY sample varied considerably in age when they took the test (from 14 to 21), their AFQT scores were age adjusted. (7) Many of the persons had not completed their education when they took the test. Rather than their 1979 educational level, the education level women had achieved by 1986 was used as a measure of their education. (8)

Table C.1 shows the average AFQT percentiles (age adjusted) for the black and white women in the NLSY sample overall and by region and educational level. On average, the black women in the sample scored in the 36th percentile and the white women scored in the 68th percentile on the AFQT. Black women scored lower than white women at every educational level. These results confirm that black women today may have generally lower educational achievement than white women with the same number of years of school and suggest that black women's lower educational achievement may "explain" part of the "unexplained" racial gap in wages and occupations.
On the other hand, the pattern of racial differences in AFQT percentiles by region and educational level is not consistent with the patterns in the unexplained gaps noted above. Although both blacks and whites have lower AFQT scores at lower educational levels, blacks appear to perform better relative to whites at lower than at higher educational levels. Thus, educational achievement patterns cannot explain why wage and occupational gaps between black and white women are greater at lower educational levels.

Similarly, although both black and white women have lower AFQT scores in the South than in the rest of the country, black women do not perform relatively worse in the South. In fact, the overall regional gap in scores is larger for white women than for black women. There is some apparent tendency for the regional gap within educational groups to be greater at higher educational levels for black women, whereas the reverse is true for white women. Thus, educational achievement patterns cannot explain why gaps in the wages and occupations of black and white women are larger in the South than in the rest of the country.

These results are confirmed in the regression analysis reported in table C.2. In column (1) of table C.2, persons' age-adjusted AFQT percentiles are regressed on their education and on dummy variables indicating whether they are black, whether they lived in the South when they were 14 years old, and whether they lived in a rural area when they were 14 years old. An interaction between the dummy variables for living in the South and being black was also included to determine whether black women's relative test scores differed by region. The regression results indicate that women's test scores rise with education and are lower overall for black women and for women living in the South. Living in a rural area has no effect on women's test scores.
Furthermore, since the coefficient on the interaction between being black and living in the South is not statistically significantly different from zero, black women living in the South do not score relatively worse than their northern counterparts.

A second specification, reported in column (2) of table C.2, adds an interaction term between being black, living in the South, and education. The results from estimating this regression confirm that black women may do relatively worse in the South at higher educational levels, even though they do not do relatively worse in the South in general. (9)

A third specification, reported in column (3) of table C.2, adds an interaction between being black and education. The results from estimating this regression confirm that black women do relatively worse at higher, not lower, educational levels. (10)

The result that southern black women score relatively lower at higher educational levels might conceivably explain the finding in chapter 4 that southern black women are underrepresented in the clerical sector even after controlling for education, among other factors. Since clerical jobs require relatively high educational levels, it is possible that southern black women's relatively low educational achievement at high levels of education prevents them from entering clerical jobs in the same numbers as their white counterparts.

The results from this investigation lend some support to the hypothesis that lower educational quality explains some of the overall black-white wage and occupational gap. They further support the hypothesis that southern black women may be prevented from entering clerical jobs by their relatively low educational achievement at higher educational levels. However, lower educational quality appears to explain neither the wider black-white wage and occupational gap in the South as a whole nor the wider black-white wage and occupational gap at lower education levels.
Since the NLSY administered the AFQT to young persons in 1979, the conclusions drawn in this appendix are only valid for the current day and may not characterize accurately historical patterns in educational achievement by race.

ENDNOTES

(7) A regression of test scores on age and age-squared was estimated, and the results were used to adjust women's scores upward or downward, depending on their age.

(8) As such, the research presented in this appendix does not constitute a perfect test of the hypothesis that the quality of education varies by race according to the patterns noted above. A better test of the hypothesis would rely on test scores for persons who had actually completed their education.

(9) The coefficient on the additional interaction term is negative and statistically significantly different from zero.

(10) The coefficient on the additional interaction term is negative and statistically significantly different from zero.

APPENDIX D

Census Wage Regressions: Supplementary Tables for Chapter 6

[Tables D.1 through D.7 are not available in electronic format]

APPENDIX E

Notes on the Construction of Wage Variables for the Census of Population, Current Population Survey, and Survey of Income and Program Participation Data

This appendix provides a set of detailed notes showing how the wage variables used in this report were constructed.

Wage Construction for the 1940-80 Censuses of Population

This report uses data from the 1940, 1950, 1960, 1970, and 1980 Censuses of Population. The wage variables for the Censuses of Population are derived from information provided by individuals regarding their total annual wage and salary income and weeks worked for the previous calendar year and information regarding the hours they worked per week. Persons with any self-employment or farm income, unpaid workers, students, and members of the armed forces were excluded from the wage calculations.
Annual Wages

An annual wage is a person's total wage and salary income for the previous calendar year except for persons whose wage and salary income was top-coded. (11) For persons whose wage and salary income was top-coded, the annual wage was estimated using the Pareto method (see Technical Documentation, 1980 Census of Population, p. 164). Assuming that the distribution of wage and salary income within demographic groups given by age, race, sex, and educational levels can be described by a Pareto distribution, the annual wage for top-coded individuals was calculated as the conditional mean wage and salary income estimated for all top-coded individuals in the same demographic group.

Weekly Wages

A person's weekly wage was calculated as the annual wage divided by the number of weeks worked in the previous calendar year. For 1980 and 1950, the number of weeks worked was derived directly from persons' responses to the question: "How many weeks did you work last year?" In 1960 and 1970, however, persons' responses regarding the number of weeks they had worked were coded in intervals rather than continuously. For these years, the number of weeks a person worked was estimated using the interval midpoint. In 1940 persons were asked how many "full-time equivalent" weeks they had worked in the previous year. Since no better estimate of the number of weeks worked was available, persons' answers to this question were used to calculate their weekly wages.

Hourly Wages

A person's hourly wage was calculated as the weekly wage divided by the number of hours usually worked per week in the previous year. In 1980 the number of hours usually worked per week was asked directly. In the earlier years, this question was not asked. Instead, only hours worked during the survey week were reported. Survey week hours are not likely to be the same as hours usually worked per week in the preceding year.
First, since the hours worked can vary widely from week to week, survey week hours may not be typical. Second, survey week hours refer to hours worked in the current year, not hours worked in the previous year, when the wage and salary income was earned. To alleviate this problem, since both usual hours and survey week hours were reported in the 1980 census, a regression was estimated of usual hours worked last year on survey week hours and various demographic characteristics using 1980 census data; the coefficients obtained in this regression were used to derive predicted values of usual hours worked for the earlier years. (12) To ensure comparability across census years, predicted usual hours were also used rather than actual reported usual hours in calculating persons' 1980 hourly wages. (13)

Wage Construction for the March Current Population Surveys

This report uses data from the 1970, 1971, 1980, 1983, 1985, 1986, and 1987 March Current Population Surveys. Like the Censuses of Population, the March Current Population Surveys report persons' wage and salary income earned and weeks worked during the previous calendar year. Starting in 1980, the Current Population Surveys reported both survey week hours and hours usually worked in the previous year. In 1970 and 1971, only survey week hours were reported. As for the censuses, persons with any self-employment or farm income, unpaid workers, students, and members of the armed forces were excluded from the wage calculations.

Annual Wages

As for the censuses, annual wages are reported annual wage and salary income except for individuals whose wage and salary income was top-coded. (14) For top-coded individuals, the Pareto method described above was used to estimate their annual wages.

Weekly Wages

An individual's weekly wage is calculated as the annual wage divided by the number of weeks worked in the previous year.
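The chain from annual to weekly to hourly wages, which applies to both the census and CPS constructions, can be sketched in Python as below. This is a minimal illustration under stated assumptions: the function names and the example figures are hypothetical, the hours argument stands for whichever measure a given year uses (reported or predicted usual hours), and top-coding adjustments are left out.

```python
def interval_midpoint(low, high):
    """Estimate weeks worked from an interval-coded response (as for 1960 and 1970)."""
    return (low + high) / 2


def weekly_wage(annual_wage, weeks_worked):
    """Annual wage and salary income divided by weeks worked last year."""
    if weeks_worked <= 0:
        raise ValueError("weeks_worked must be positive")
    return annual_wage / weeks_worked


def hourly_wage(annual_wage, weeks_worked, usual_hours_per_week):
    """Weekly wage divided by hours usually worked per week last year."""
    if usual_hours_per_week <= 0:
        raise ValueError("usual_hours_per_week must be positive")
    return weekly_wage(annual_wage, weeks_worked) / usual_hours_per_week


# Hypothetical part-year, part-time worker: $10,400 earned over 26 weeks
# at 20 usual hours per week.
example_weekly = weekly_wage(10_400, 26)
example_hourly = hourly_wage(10_400, 26, 20)

# Hypothetical interval-coded response of 27 to 39 weeks.
example_weeks = interval_midpoint(27, 39)
```

Because weeks and hours both enter as divisors, any misreporting of either one passes directly into the weekly and hourly figures, which is consistent with the larger census-CPS gaps for weekly and hourly wages reported in appendix A.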
Hourly Wages

For all years except for 1970 and 1971, an individual's hourly wage is calculated as the weekly wage divided by the number of hours usually worked per week in the previous year. In 1970 and 1971, the weekly wage was divided by the predicted number of hours usually worked, derived in the same way as for the 1940-70 censuses, except that the prediction was based on a regression estimated using data from the 1980 Current Population Survey.

Wage Construction for the Survey of Income and Program Participation

Hourly wages for the Survey of Income and Program Participation (SIPP) were constructed using data from the third wave of SIPP's 1984 panel. SIPP's third wave contains 4 months of data, including a detailed work history module. The principal advantages of SIPP's work history module are its measures of lifetime work experience, but it also provides more detailed information on the beginning and ending dates of jobs. The SIPP reports information on earnings, hours, and beginning and ending dates for up to two jobs held during the 4-month period.

For persons with only one job during the 4-month period, the hourly wage was calculated as their earnings over the 4-month period divided by the product of the total number of weeks they worked and their usual weekly hours. For persons with two jobs during the period, the hourly wage was calculated as follows. If the jobs overlapped by fewer than 7 days, the hourly wage was calculated using total earnings and total hours at both jobs. If the two jobs overlapped by 7 or more days, the hourly wage was set to their hourly wage at their "primary job." Their "primary job" was determined by comparing, in order, hours worked and earnings in the two jobs. If usual hours worked on one job were 5 or more hours greater than on the other, it was considered the primary job. Failing this, the job with the greater earnings was considered the primary job.
Finally, if both jobs were identical in hours and earnings, the job termed "Job 1" was considered the primary job.

ENDNOTES

(11) Top-coded wage and salary levels were: $5,000 in 1940, $10,000 in 1950, $25,000 in 1960, $50,000 in 1970, and $75,000 in 1980.

(12) Separate regressions were estimated for black and white women. Besides survey week hours, variables included in the regression were educational level, age, marital status, and presence of children. Interactions between survey week hours and these other variables were also included.

(13) Using predicted usual hours rather than actual reported hours in 1980 has large consequences for estimates of black women's relative hourly wages in that year. When predicted usual hours are used, black women appear to earn an average of 99 percent as much per hour as white women. When actual reported usual hours are used, this figure is 111 percent. Using survey week hours yields a figure of 102 percent.

(14) Top-coded income levels were $50,000 in 1970, 1971, and 1980, $75,000 in 1983, and $100,000 in 1985, 1986, and 1987.

APPENDIX F

SIPP and CPS Wage Regressions: Supplementary Tables for Chapter 7

[Tables F.1 through F.11 are not available in electronic format]