MULTICOLLINEARITY: WHAT HAPPENS IF THE REGRESSORS ARE CORRELATED?
There are several sources of multicollinearity. As Montgomery and Peck
note, multicollinearity may be due to the following factors:
1. The data collection method employed, for example, sampling over a
limited range of the values taken by the regressors in the population.
2. Constraints on the model or in the population being sampled. For
example, in the regression of electricity consumption on income (X2) and
house size (X3) there is a physical constraint in the population in that families
with higher incomes generally have larger homes than families with
lower incomes.
3. Model specification, for example, adding polynomial terms to a regression
model, especially when the range of the X variable is small.
4. An overdetermined model. This happens when the model has more
explanatory variables than the number of observations. This could happen
in medical research where there may be a small number of patients about
whom information is collected on a large number of variables.
An additional reason for multicollinearity, especially in time series data,
may be that the regressors included in the model share a common trend,
that is, they all increase or decrease over time. Thus, in the regression of
consumption expenditure on income, wealth, and population, the regressors
income, wealth, and population may all be growing over time at more
or less the same rate, leading to collinearity among these variables.
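For instance, the common-trend case is easy to reproduce with simulated data. The short numpy sketch below, in which all variable names and numbers are purely hypothetical, generates two series whose own innovations are independent but which share an upward trend; their simple correlation is nevertheless close to one.

```python
import numpy as np

rng = np.random.default_rng(0)
t = np.arange(100)  # time index, e.g. 100 quarterly observations

# Two regressors whose innovations are independent but which share an
# upward trend, in the spirit of "income" and "wealth" both growing over time.
income = 50 + 2.0 * t + rng.normal(scale=5.0, size=t.size)
wealth = 200 + 8.0 * t + rng.normal(scale=20.0, size=t.size)

# The common trend dominates, so the simple correlation is close to one.
print(np.corrcoef(income, wealth)[0, 1])  # typically about 0.99
```

Detrending or differencing such series typically lowers this correlation sharply, which is why shared trends are such a common source of collinearity in time series work.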
PRACTICAL CONSEQUENCES OF MULTICOLLINEARITY
In cases of near or high multicollinearity, one is likely to encounter the
following consequences (a small simulation after the list illustrates them):
1. Although BLUE, the OLS estimators have large variances and covariances,
making precise estimation difficult.
2. Because of consequence 1, the confidence intervals tend to be much
wider, leading to the acceptance of the “zero null hypothesis” (i.e., the true
population coefficient is zero) more readily.
3. Also because of consequence 1, the t ratio of one or more coefficients
tends to be statistically insignificant.
4. Although the t ratio of one or more coefficients is statistically insignificant,
R2, the overall measure of goodness of fit, can be very high.
5. The OLS estimators and their standard errors can be sensitive to
small changes in the data.
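These consequences are easy to see in a small simulation. The numpy sketch below, with a made-up data-generating process, fits y on two nearly collinear regressors by OLS computed from the textbook formulas; the slope coefficients have large standard errors and small t ratios even though R2 is very high.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 50

# Two regressors that are almost perfectly collinear.
x2 = rng.normal(size=n)
x3 = x2 + rng.normal(scale=0.01, size=n)
y = 1.0 + 2.0 * x2 + 3.0 * x3 + rng.normal(size=n)

X = np.column_stack([np.ones(n), x2, x3])   # design matrix with intercept

# OLS by the textbook formula: b = (X'X)^(-1) X'y.
XtX_inv = np.linalg.inv(X.T @ X)
b = XtX_inv @ X.T @ y

resid = y - X @ b
k = X.shape[1]
sigma2 = resid @ resid / (n - k)            # unbiased estimate of error variance
se = np.sqrt(np.diag(sigma2 * XtX_inv))     # standard errors, inflated by collinearity
t_ratios = b / se

r_squared = 1.0 - resid @ resid / ((y - y.mean()) ** 2).sum()

print("coefficients:", b.round(2))
print("std. errors :", se.round(2))         # large for the two slopes
print("t ratios    :", t_ratios.round(2))   # slopes often individually insignificant
print("R-squared   :", round(r_squared, 3)) # yet very high
```

Re-running the sketch with different random seeds typically moves the two slope estimates around substantially while their sum stays stable, which also illustrates consequence 5.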
SUMMARY AND CONCLUSIONS
1. One of the assumptions of the classical linear regression model is
that there is no multicollinearity among the explanatory variables, the X’s.
Broadly interpreted, multicollinearity refers to the situation where there
is either an exact or approximately exact linear relationship among the X
variables.
2. The consequences of multicollinearity are as follows: If there is perfect
collinearity among the X’s, their regression coefficients are indeterminate
and their standard errors are not defined. If collinearity is high but not
perfect, estimation of regression coefficients is possible but their standard
errors tend to be large. As a result, the population values of the coefficients
cannot be estimated precisely. However, if the objective is to estimate linear
combinations of these coefficients, the estimable functions, this can be done
even in the presence of perfect multicollinearity. (A numerical illustration
of this point appears at the end of this summary.)
3. Although there are no sure methods of detecting collinearity, there
are several indicators of it, which are as follows:
(a) The clearest sign of multicollinearity is when R2 is very high but
none of the regression coefficients is statistically significant on the
basis of the conventional t test. This case is, of course, extreme.
(b) In models involving just two explanatory variables, a fairly good
idea of collinearity can be obtained by examining the zero-order,
or simple, correlation coefficient between the two variables. If
this correlation is high, multicollinearity is generally the culprit.
(c) However, the zero-order correlation coefficients can be misleading
in models involving more than two X variables since it is possible
to have low zero-order correlations and yet find high multicollinearity.
In situations like these, one may need to examine the
partial correlation coefficients.
(d) If R2 is high but the partial correlations are low, multicollinearity
is a possibility. Here one or more variables may be superfluous.
But if R2 is high and the partial correlations are also high, multicollinearity
may not be readily detectable. Also, as pointed out by
C. Robert Wichers, Krishna Kumar, John O’Hagan, and Brendan McCabe,
there are some statistical problems with the partial correlation
test suggested by Farrar and Glauber.
(e) Therefore, one may regress each of the Xi variables on the remaining
X variables in the model and find out the corresponding coefficients
of determination, R2i. A high R2i would suggest that Xi is highly
correlated with the rest of the X’s. Thus, one may drop that Xi from
the model, provided it does not lead to serious specification bias.
(A numerical sketch of this auxiliary-regression check appears at
the end of this summary.)
4. Detection of multicollinearity is half the battle. The other half is concerned
with how to get rid of the problem.
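To make summary point 2 concrete, the sketch below (with hypothetical data) builds a design matrix in which X3 is exactly twice X2. The cross-product matrix X'X is then singular, so the individual coefficients are indeterminate; numpy's least-squares routine simply returns one of the infinitely many solutions, and the estimable linear combination b2 + 2*b3 is pinned down even though its components are not.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 30
x2 = rng.normal(size=n)
x3 = 2.0 * x2                              # perfect collinearity: X3 = 2 * X2
y = 1.0 + 4.0 * x2 + rng.normal(size=n)    # data generated from X2 alone

X = np.column_stack([np.ones(n), x2, x3])

# X'X is singular, so the individual coefficients are indeterminate.
print(np.linalg.matrix_rank(X.T @ X))      # 2 rather than 3

# lstsq still returns one particular (minimum-norm) solution; the estimable
# combination b2 + 2*b3 is the same for every solution and is recovered,
# even though b2 and b3 separately are arbitrary.
b, *_ = np.linalg.lstsq(X, y, rcond=None)
print(b[1] + 2.0 * b[2])                   # close to 4, the estimable function
```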
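Summary points 3(b) through 3(e) can likewise be turned into a small diagnostic routine. The numpy sketch below, again with made-up regressors, computes the zero-order correlations, the partial correlations (obtained from the inverse of the correlation matrix), and the auxiliary R2i from regressing each regressor on the others; the variance inflation factor, VIFi = 1/(1 - R2i), is shown as the usual companion statistic, although it is not discussed in the text above.

```python
import numpy as np

rng = np.random.default_rng(3)
n = 100

# Hypothetical regressors: x3 is nearly collinear with x2, x4 is unrelated.
x2 = rng.normal(size=n)
x3 = x2 + rng.normal(scale=0.05, size=n)
x4 = rng.normal(size=n)
X = np.column_stack([x2, x3, x4])

# (b) Zero-order (simple) correlation coefficients.
R = np.corrcoef(X, rowvar=False)

# (c)/(d) Partial correlations from the inverse of the correlation matrix:
# r_ij.rest = -P_ij / sqrt(P_ii * P_jj).
P = np.linalg.inv(R)
d = np.sqrt(np.diag(P))
partial = -P / np.outer(d, d)
np.fill_diagonal(partial, 1.0)

# (e) Auxiliary regressions: regress each X_i on the remaining X's and
# record R2_i; VIF_i = 1 / (1 - R2_i) is the usual companion statistic.
r2_aux = np.empty(X.shape[1])
for i in range(X.shape[1]):
    y_i = X[:, i]
    others = np.column_stack([np.ones(n), np.delete(X, i, axis=1)])
    coef, *_ = np.linalg.lstsq(others, y_i, rcond=None)
    resid = y_i - others @ coef
    r2_aux[i] = 1.0 - resid @ resid / ((y_i - y_i.mean()) ** 2).sum()

print("zero-order correlations:\n", R.round(3))
print("partial correlations:\n", partial.round(3))
print("auxiliary R2_i:", r2_aux.round(3))
print("VIF_i        :", (1.0 / (1.0 - r2_aux)).round(1))
```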