INCREASING THE STABILITY OF LINEAR REGRESSION
Abstract
The paper proposes a method for increasing the stability of a linear regression equation whose coefficients are obtained by solving a system of normal equations. The instability of the equation stems from linear relationships among the experimental data; quantitative characteristics of these dependences are obtained from the correlation matrix, which can also serve as the matrix of the system of equations. To reduce the effect of high correlation coefficients, ridge regression is traditionally used: all diagonal elements of the matrix are increased by the same positive number. As a result, the condition number of the matrix decreases and the regression equation becomes more stable, in the sense that a small change in the input data produces only a small change in the solution. The number added to the diagonal elements is called the penalty, and in ridge regression it is imposed on all regression coefficients. In the proposed method, penalties, and generally different ones, are imposed only on those coefficients that correspond to highly correlated data. This increases the stability of the equation by decreasing the values of the coefficients associated with correlated data. The elements to be increased are selected by analyzing the correlation matrix of the original data set, decomposing it by the square-root (Cholesky) method. Besides improving stability, the proposed method can also reduce the dimension of the regression model, i.e., the number of terms in the corresponding equation, a task usually addressed with the LASSO and LARS algorithms. The effectiveness of the method is tested on a well-known data set, with comparisons made not only against ridge regression but also against the results of known dimensionality reduction algorithms.
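The core idea can be illustrated with a minimal numerical sketch. The function below solves the normal equations with a diagonal penalty matrix D added to X^T X; classical ridge regression uses the same penalty for every coefficient, while the selective variant penalizes only chosen coefficients. The function name `selective_ridge` and the penalty values are illustrative assumptions; the paper's actual procedure for choosing which diagonal elements to increase (via the square-root decomposition of the correlation matrix) is not reproduced here.

```python
import numpy as np

def selective_ridge(X, y, penalties):
    """Solve (X^T X + D) b = X^T y, where D = diag(penalties).

    Classical ridge regression uses one common penalty for all
    coefficients; here each coefficient may receive its own penalty
    (zero for regressors that are not strongly correlated with others).
    """
    XtX = X.T @ X
    return np.linalg.solve(XtX + np.diag(penalties), X.T @ y)

# Toy data: columns 1 and 2 are almost collinear, so X^T X is
# ill-conditioned and the plain least-squares solution is unstable.
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 3))
X[:, 2] = X[:, 1] + 0.01 * rng.normal(size=50)
y = X @ np.array([1.0, 2.0, 3.0]) + 0.1 * rng.normal(size=50)

b_ols = selective_ridge(X, y, [0.0, 0.0, 0.0])  # plain least squares
b_sel = selective_ridge(X, y, [0.0, 1.0, 1.0])  # penalize only the correlated columns
```

Adding the penalties only to the diagonal entries of the correlated columns already lowers the condition number of the system matrix, which is the stabilizing effect the abstract describes.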
References
regularization, Automatica, 2013, Vol. 149 (4), pp. 1045-1050.
2. Hahn P.R., Carvalho C.M., Puelz D., He J. Regularization and confounding in linear regression
for treatment effect estimation, Bayesian Analysis, 2018, Vol. 13 (1), pp. 163-182.
3. Cui Q., Xu Y., Zhang Z., Chan V. Max-linear regression models with regularization, Journal
of Econometrics, 2021, Vol. 222 (1), pp. 579-600.
4. Yan X. Linear Regression Analysis: Theory and Computing. World Scientific Publishing
Company, 2009.
5. Al'bert A. Regressiya, psevdoinversiya i rekurrentnoe otsenivanie [Regression, pseudo–
inversion and recurrent estimation]: trans. from engl. Moscow: Nauka. Glavnaya redaktsiya
fiziko-matematicheskoy literatury, 1977, 224 p.
6. Seber Dzh. Lineynyy regressionnyy analiz [Linear regression analysis]: trans. from engl., ed.
by M.B. Malyutova. Moscow: Mir.
7. Nosko V.P. Ekonometrika: uchebnik [Econometrics: textbook]. Moscow: ID Delo
RANKHiGS, 2011, 672 p.
8. Uspenskiy A.B., Fedorov V.V. Vychislitel'nye aspekty metoda naimen'shikh kvadratov pri
analize i planirovanii regressionnykh eksperimentov [Computational aspects of the least
squares method in the analysis and planning of regression experiments]. Moscow: Izd-vo
MGU, 1975.
9. Gorid'ko N.P., Nizhegorodtsev R.M. Sovremennyy ekonomicheskiy rost: teoriya i
regressionnyy analiz: monografiya [Modern economic growth: theory and regression analysis:
monograph]. Moscow: Infra-M, 2017, 444 p.
10. Esaulov I.G. Regressionnyy analiz dannykh v pakete Mathcad: ucheb. posobie [Regression
analysis of data in the Mathcad package: tutorial]. Saint Petersburg: Lan' P, 2016, 224 p.
11. Karlberg K. Regressionnyy analiz v Microsoft Excel [Regression analysis in Microsoft Excel].
Moscow: Dialektika, 2019, 400 p.
12. Sokolov G.A. Vvedenie v regressionnyy analiz i planirovanie regressionnykh eksperimentov v
ekonomike: ucheb. posobie [Introduction to regression analysis and planning of regression experiments
in economics: textbook]. Moscow: Infra-M, 2016, 352 p.
13. Faktornyy analiz prestupnosti: korrelyatsionnyy i regressionnyy metody: monografiya [Factor
analysis of crime: correlation and regression methods: monograph], ed. by S.M. Inshakova.
Moscow: Yuniti, 2014, 127 p.
14. Hastie T., Tibshirani R., Wainwright M. Statistical Learning with Sparsity: The Lasso and
Generalizations. Chapman & Hall, 2015.
15. Melkumova L.E., Shatskikh S.Ya. Sravnenie metodov Ridzh-regressii i LASSO v zadachakh
obrabotki dannykh [Comparison of Ridge regression and LASSO methods in data processing
problems], Sb. trudov III mezhdunarodnoy konferentsii i molodezhnoy shkoly [Proceedings of
the III International Conference and Youth School]. Samara National Research University
named after Academician S.P. Korolev, 2017, pp. 1755-1762.
16. Efron B., Hastie T., Johnstone I., Tibshirani R. Least Angle Regression, The Annals of Statistics,
2004, Vol. 32, pp. 407-499.
17. Dreyper N., Smit G. Prikladnoy regressionnyy analiz [Applied Regression analysis]. 2nd ed.:
trans. from engl. Moscow: Finansy i statistika, 1986.
18. Louson Ch., Khenson R. Chislennoe reshenie zadach metoda naimen'shikh kvadratov [Numerical
solution of problems of the least squares method]: trans. from engl. Moscow: Nauka. Gl.
red. fiz.-mat. lit., 1986.
19. Voevodin V.V., Kuznetsov Yu.A. Matritsy i vychisleniya [Matrices and calculations]. Moscow:
Nauka. Glavnaya redaktsiya fiziko-matematicheskoy literatury, 1984.
20. Lutay V.N. Povyshenie ustoychivosti treugol'nogo razlozheniya plokho obuslovlennykh matrits
[Increasing the stability of the triangular decomposition of poorly conditioned matrices],
SibZhVM [Siberian Journal of Computational Mathematics], 2019, No. 4, Vol. 22, pp. 465-473.