to the difficulty of choosing the wage equation (simply put, omitted variable bias). Ponthieux and Meurs (2015) argue that, apart from the observed characteristics, some unobserved characteristics determine the employment status of an individual, and these might be correlated with productivity and the wage. Neuman and Oaxaca (1998) propose treating the selection bias, which arises because workers are not a random sample of the working-age population, by including the inverse Mills ratio (Heckman, 1976, 1979) in the wage equation. The inverse Mills ratio, sometimes called a correction term, is derived from a probit model of the probability of being employed. In theory this proposal works rather well; in practice, however, it has limitations (Vella, 1998).

The analysis is conducted in Stata 14.2. The rifreg command³ was used for the RIF regression, and the oaxaca8 command⁴ for the Oaxaca-Blinder decomposition.

4 Data

The methodology described in Section 3 is applied to the European Union Statistics on Income and Living Conditions (EU-SILC, hereafter) dataset provided by Eurostat. The dataset collects information at both the personal and household levels for the year 2016. The EU-SILC data have two important features that distinguish them from other datasets. First, they cover income at both the personal and household levels, which makes them better suited to income analysis than strictly personal-level or household-level datasets. Second, they contain information on 25 countries (23 EU and 2 non-EU countries) for individuals aged 16 and above. Such a rich dataset allows researchers to study income distribution patterns across age groups and to make cross-country comparisons, which themselves reveal institutional effects to some extent.

The dataset includes 420,520 observations over 25 countries. Within each country, the sample of females is systematically larger than the sample of males.
The only exceptions are Finland, where the male sample exceeds the female sample, and Sweden, where the two samples are almost equal in size. In addition, the male and female sample sizes vary across age groups within each country: a larger female sample in a country does not imply that females systematically outnumber males in every age group (detailed information on the sample size of each country is given in the appendix, Table A.1). However, the difference is rather small for each country. In addition, I applied survey weights so that the results are representative of the whole distribution.

³ Nicole M. Fortin, based on Firpo et al. (2009) - rifreg
⁴ Jann (2005, revised in 2008) - oaxaca8

TABLE 1: Countries in the study
EU: Austria, Belgium, Bulgaria, Czechia, Germany, Denmark, Estonia, Greece, Spain, Finland, France, Croatia, Hungary, Latvia, Lithuania, The Netherlands, Poland, Portugal, Romania, Sweden, Slovenia, Slovakia, UK¹.
Non-EU: Norway, Serbia.
¹ At the time the paper was written, the United Kingdom was in the EU.

The income of an individual is computed as the sum of household-level income per household member and personal-level income. To compute household-level income per member, aggregate household income is divided by the number of household members. In other words, it is assumed a priori that the household pools its income and distributes it equally among its members. This approach can be criticized because it rules out intrahousehold inequality. However, to the best of the author's knowledge, there is no other consistent way of allocating household income among household members. As for personal-level income, the dataset allows the inclusion of income sources other than employment income. Moreover, the EU-SILC data record income received during the previous 12 months. In the data, the income components were initially given in local currencies.
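The income construction described above (an equal per-member share of household income plus personal income) can be sketched as follows. This is only an illustration under stated assumptions: the names `Household`, `Person`, and `individual_income` and the toy figures are hypothetical, and they do not correspond to EU-SILC variable names or to the Stata code used in the study.

```python
from dataclasses import dataclass


@dataclass
class Household:
    # Hypothetical container: aggregate household-level income and household size.
    income: float
    n_members: int


@dataclass
class Person:
    # Hypothetical container: personal_income is the sum of personal-level components.
    person_id: str
    household_id: str
    personal_income: float


def individual_income(persons, households):
    """Return {person_id: total income}, assuming equal sharing within the household."""
    totals = {}
    for p in persons:
        hh = households[p.household_id]
        per_member = hh.income / hh.n_members  # equal split of household income
        totals[p.person_id] = per_member + p.personal_income
    return totals


# Toy data (invented for illustration only).
households = {"h1": Household(income=3000.0, n_members=2),
              "h2": Household(income=1200.0, n_members=1)}
persons = [Person("a", "h1", 20000.0),
           Person("b", "h1", 10000.0),
           Person("c", "h2", -500.0)]  # e.g. a loss from self-employment

incomes = individual_income(persons, households)
```

Because personal-level components can be negative, total income can be negative as well; equal sharing only affects the household-level part of the sum.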
To make cross-country results comparable, all components have been converted into euros⁵. Table 2 shows the income sources included at both the household and personal levels.

Like wages, income has a positively skewed distribution. Unlike wages, however, income cannot be rescaled with the logarithmic transformation, because its individual sources can be negative as well as positive. For instance, gross cash losses from self-employment (Table 2) may outweigh income from other sources and result in negative total income. To deal with such problems, Johnson (1949) proposed the inverse hyperbolic sine (IHS, hereafter) transformation. The importance of the IHS transformation has been highlighted by Pence (2006) (see also, e.g., Poterba et al. (1995)). The IHS of income is written as

$$\theta^{-1}\sinh^{-1}(\theta Y) = \theta^{-1}\ln\left(\theta Y + \left(\theta^{2}Y^{2} + 1\right)^{1/2}\right) \tag{4.1}$$

where θ is a scaling parameter⁶ and Y is total income. The main advantage of the IHS transformation is that it is approximately linear around the origin, which matters especially for very low incomes: the logarithmic transformation would treat a 100% change in the lower and upper tails of the distribution in the same way (Pence, 2006). Another advantage is that the IHS transformation approximates the logarithm (up to an additive constant) in the right tail of the distribution.

Table A.2 presents the average share of each income component in total income. Apart from employment income, which constitutes a substantial portion of total income (on average 49.25% for men and 42.22% for women), other sources also contribute to total income. The high share of unemployment income is not surprising: an individual may have been unemployed for part of the reference period, receiving unemployment income, and then have become employed and started receiving employment income.
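As a numerical illustration of the IHS transformation in equation (4.1), the following minimal Python sketch evaluates both sides of the equation and checks the properties discussed above. The function names `ihs` and `ihs_log_form` and the sample values are assumptions made for illustration; the study itself was carried out in Stata.

```python
import math


def ihs(y, theta=1.0):
    """Left-hand side of eq. (4.1): inverse hyperbolic sine with scaling theta."""
    return math.asinh(theta * y) / theta


def ihs_log_form(y, theta=1.0):
    """Right-hand side of eq. (4.1): the equivalent logarithmic expression."""
    return math.log(theta * y + math.sqrt(theta ** 2 * y ** 2 + 1.0)) / theta


# Invented sample incomes: a loss, zero, a very small and a large positive value.
sample = [-500.0, 0.0, 0.01, 30000.0]
transformed = [ihs(y) for y in sample]

# Both forms agree, including for negative income where log(y) is undefined.
forms_agree = all(
    math.isclose(ihs(y), ihs_log_form(y), rel_tol=1e-6, abs_tol=1e-9)
    for y in sample
)

# Near the origin the transformation is close to the identity ...
near_origin_gap = abs(ihs(0.01) - 0.01)
# ... and in the right tail it is close to log(2 * theta * y).
right_tail_gap = abs(ihs(30000.0) - math.log(2 * 30000.0))
```

With θ = 1, the right-tail comparison shows that the transformed value differs from ln(Y) by roughly the constant ln(2), which is why coefficients can be read much like those of a log specification for large incomes.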
Table A.2 also shows that employment income accounts for a larger share of men's income than of women's. In contrast, the share of household income is always higher for women. Under the assumption of no intrahousehold inequality, this implies that the variation in individual household-level income is primarily driven by single-headed households. Moreover, in all countries except Sweden, the share of profits and losses from self-employment is higher for men. Since many income components are reported on an annual basis (for example, profits and losses from self-employment, and interest, dividends, and profits from capital investments), in this paper I study the gap in annual rather than monthly income.

⁵ The average annual exchange rate of 2015 is used for conversion.
⁶ In this study the scaling parameter is set to θ = 1, as this made the distribution closer to normal; the IHS transformation can therefore be written as $\sinh^{-1}(Y) = \ln\left(Y + \left(Y^{2} + 1\right)^{1/2}\right)$.

TABLE 2: Income components
Level: Components
Personal:
  (1) Gross employee cash or near cash income;
  (2) Company car;
  (3) Gross cash benefits or losses from self-employment¹;
  (4) Pensions received from individual private plans²;
  (5) Unemployment benefits;
  (6) Old-age benefits;
  (7) Survivor's benefits;
  (8) Sickness benefits;
  (9) Disability benefits;
  (10) Education-related allowances.
Household:
  (11) Income from rental of a property or land;
  (12) Family/children related allowances;
  (13) Social exclusion not elsewhere classified;
  (14) Housing allowances;
  (15) Regular inter-household cash transfers received;
  (16) Interests, dividends, profit from capital investments³;
  (17) Income received by people aged under 16.
¹ Includes royalties.
² Includes only those sources that are not classified in the European System of integrated Social Protection Statistics (ESSPROS).
³ Investments in unincorporated business.
Note: Income components are assigned to either the personal or the household level by the survey.

The set of explanatory variables includes age, education, employment status, occupation, marital status, citizenship status, and the presence of children under 3 or under 15. A more detailed classification of the explanatory variables is displayed in Table 3. Individuals are grouped into 4 age categories: 1) <25; 2) 26-45; 3) 46-65; and 4) >65.
From the study, I excluded individuals who are below 24, live with their parents, and reported their occupation as student; that is, I dropped economically dependent household members from the data. As for education, the variable provided in the EU-SILC data initially had several categories, which were later grouped into the following 3 broader categories. Primary education includes individuals with less than primary, primary, or lower secondary education. Secondary education consists of individuals with either upper secondary or post-secondary (non-tertiary) education. Individuals with short-cycle tertiary, bachelor's, master's, or doctoral degrees are grouped into tertiary education⁷.

TABLE 3: Classification of the explanatory variables
Variable: Components
Age: Individuals aged 16-81.
Education: Primary, secondary, tertiary.
Employment status: Full- and part-time worker, unemployed, inactive.
Occupation: Managers; professionals; technicians and associate professionals; clerical support workers; services and sales workers; skilled agricultural, forestry and fishery workers; craft and related trades workers; elementary occupations; plant and machine operators and assemblers.
Marital status: Single, married, cohabitants.
Citizenship status: Citizen, non-citizen.
Children younger than 3: Whether there are children below 3 in the household.
Children younger than 15: Whether there are children below 15 in the household.

Another variable whose categories have been grouped into broader ones is self-defined economic status. Individuals who reported working full-time (either employed or self-employed) have been assigned to full-time workers, while those working part-time (either employed or self-employed) have been assigned to part-time workers. The unemployed group includes those who reported their current economic status as unemployed. Pupils, students, trainees, interns, the permanently disabled or unfit to work, those in compulsory military or community service, and those fulfilling domestic tasks and care responsibilities were assigned to the inactive group.

The data on occupation are collected in accordance with the ISCO-08⁸ classification. Individuals who participated in the EU-SILC survey were asked to report the occupation of their most recent main job. If an individual was unemployed, the occupation of the last main job was reported. Most individuals reported detailed codes for their occupation (either sub-major or sub-minor); however, part of the sample reported more general occupational fields.
Generalizing more specific categories seems more reasonable than splitting general categories into narrower ones without any knowledge of the individual's actual occupation. Therefore, to achieve a single format across the country samples, detailed occupations have been grouped into broader groups. The armed forces occupations were grouped together with technicians.

Marital status includes three categories: single, married, and cohabitants. Single individuals include those who have never been married, as well as separated, divorced, and widowed individuals. Those who reported their marital status as married have been assigned to the married group, and the cohabiting group includes those living in a consensual union without a legal basis.

⁷ This approach follows the ISCED 2011 methodology, implemented by Eurostat.
⁸ The ISCO-08 structure and index correspondence with ISCO-88 are available at https://www.ilo.org/public/english/bureau/stat/isco/isco08/index.htm
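The grouping of detailed survey categories into broader ones, as described in this section for education and marital status, amounts to a simple lookup. The sketch below is illustrative only: the string labels, the dictionaries `EDUCATION_GROUPS` and `MARITAL_GROUPS`, and the helper `recode` are hypothetical and do not reproduce the actual EU-SILC codes or the Stata recoding used in the study.

```python
# Hypothetical labels for the detailed categories; real EU-SILC data use numeric codes.
EDUCATION_GROUPS = {
    "less than primary": "primary",
    "primary": "primary",
    "lower secondary": "primary",
    "upper secondary": "secondary",
    "post-secondary non-tertiary": "secondary",
    "short-cycle tertiary": "tertiary",
    "bachelor": "tertiary",
    "master": "tertiary",
    "doctorate": "tertiary",
}

MARITAL_GROUPS = {
    "never married": "single",
    "separated": "single",
    "divorced": "single",
    "widowed": "single",
    "married": "married",
    "consensual union without legal basis": "cohabitant",
}


def recode(value, mapping):
    """Map a detailed category label to its broad group; None if the label is unknown."""
    return mapping.get(value.strip().lower())


# Invented example respondents.
respondents = [
    {"education": "Lower secondary", "marital": "Widowed"},
    {"education": "Master", "marital": "Married"},
    {"education": "Upper secondary", "marital": "Consensual union without legal basis"},
]

recoded = [
    (recode(r["education"], EDUCATION_GROUPS), recode(r["marital"], MARITAL_GROUPS))
    for r in respondents
]
```

Returning `None` for unmapped labels, rather than guessing a group, mirrors the reasoning above: a general category is not split into narrower ones without knowledge of the underlying value.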
The low response rate to the questions on industry, firm size, and health conditions does not allow their inclusion in the regression model. Including these variables would mean losing a substantial portion of the observations (approx. 54% in the case of industry, approx. 55% in the case of firm size, and more than 15% in the case of health conditions).

For robustness, total income is divided into three groups: employment income, private transfers and capital income, and public transfers. Employment income incorporates all income sources of employees and the self-employed (components (1), (2), and (3) from Table 2). Private transfers and capital income include private pensions received from individual plans, rental income, inter-household transfers, interest and profits from capital investments, and the income of individuals below 16 (components (4), (11), (15), (16), and (17) from Table 2). Lastly, public transfers include the remaining personal and household income components ((5), (6), (7), (8), (9), (10), (12), (13), and (14) from Table 2).

Figure 1 shows the average share of each of these income groups in total income for all countries across age groups. The share of employment income peaks in the 25-44 age group in almost all countries. In Sweden, it increases gradually across age groups and then drops for people over 65, as in every country.

FIGURE 1: Average share of different income sources in total income
[Figure: one panel per country (Austria, Belgium, Bulgaria, Croatia, Czechia, Denmark, Estonia, Finland, France, Germany, Greece, Hungary, Latvia, Lithuania, Netherlands, Norway, Poland, Portugal, Romania, Serbia, Slovakia, Slovenia, Spain, Sweden, UK). Horizontal axis: age group (<25, 25-44, 45-65, >65); vertical axis: share of total income (0 to 1). Series: employment income; private transfers and capital income; public transfers.]
Source: author's calculation from EU-SILC 2016.

In all countries public transfers constitute the largest share of total income for individuals above 65. Unlike employment income and public transfers, there is heterogeneity in income from private transfers and capital. Interesting trends