The performance of diagnostic-robust generalized potentials for the identification of multiple high leverage points in linear regression

Leverage values are used in regression diagnostics as measures of influential observations in the X-space. Detecting high leverage values is crucial because they are responsible for misleading conclusions about the fit of a regression model, multicollinearity problems, masking...


Bibliographic Details
Published in:	Journal of Applied Statistics
Main Authors:	Habshah M.; Norazan M.R.; Imon A.H.M.R.
Format:	Article
Language:	English
Published:	2009
Online Access:	https://www.scopus.com/inward/record.uri?eid=2-s2.0-70449440325&doi=10.1080%2f02664760802553463&partnerID=40&md5=bc708d903f4383b69bea3cdb61838654

id	2-s2.0-70449440325
author	Habshah M.; Norazan M.R.; Imon A.H.M.R.
title	The performance of diagnostic-robust generalized potentials for the identification of multiple high leverage points in linear regression
publishDate	2009
container_title	Journal of Applied Statistics
container_volume	36
container_issue	5
doi_str_mv	10.1080/02664760802553463
url	https://www.scopus.com/inward/record.uri?eid=2-s2.0-70449440325&doi=10.1080%2f02664760802553463&partnerID=40&md5=bc708d903f4383b69bea3cdb61838654
description	Leverage values are used in regression diagnostics as measures of influential observations in the X-space. Detecting high leverage values is crucial because they are responsible for misleading conclusions about the fit of a regression model, multicollinearity problems, masking and/or swamping of outliers, etc. Much work has been done on the identification of single high leverage points, and it is generally believed that the problem of detecting a single high leverage point has been largely resolved. However, there is no general agreement among statisticians about the detection of multiple high leverage points. When a group of high leverage points is present in a data set, the commonly used diagnostic methods fail to identify them correctly, mainly because of masking and/or swamping effects. On the other hand, the robust alternative methods can identify the high leverage points correctly, but they tend to flag too many low leverage points as high leverage points, which is also undesirable. An attempt has been made to find a compromise between these two approaches. We propose an adaptive method in which the suspected high leverage points are identified by robust methods and the low leverage points (if any) are then put back into the estimation data set after diagnostic checking. The usefulness of the newly proposed method for the detection of multiple high leverage points is studied using some well-known data sets and Monte Carlo simulations. © 2009 Taylor & Francis.
publisher
issn	13600532
language	English
format	Article
accesstype	All Open Access; Green Open Access
record_format	scopus
collection	Scopus
_version_	1825722586551549952
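
The adaptive procedure summarized in the description above (flag suspects with a robust method, then return low leverage suspects to the estimation set after a diagnostic check) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the record does not give the formulas, so the coordinate-wise median/MAD distance (a stand-in for a proper robust distance), the median + 3 MAD cutoff, the single-pass put-back step, and every function name here are hypothetical choices made for the example.

```python
import numpy as np

def mad(v):
    """Normalized median absolute deviation of a 1-D array."""
    return 1.4826 * np.median(np.abs(v - np.median(v)))

def cutoff(v, c=3.0):
    """Assumed cutoff rule: median + c * MAD."""
    return np.median(v) + c * mad(v)

def robust_distances(X):
    """Stand-in robust distance: each column is centred at its median
    and scaled by its MAD (assumption, not the paper's estimator)."""
    center = np.median(X, axis=0)
    scale = np.array([mad(X[:, j]) for j in range(X.shape[1])])
    Z = (X - center) / scale
    return np.sqrt((Z ** 2).sum(axis=1))

def flag_high_leverage(X, c=3.0):
    """Return (flagged, suspect) boolean masks for the rows of X."""
    n = X.shape[0]
    Xd = np.column_stack([np.ones(n), X])  # design matrix with intercept

    # Step 1: robust screening gives the suspected high leverage set.
    d = robust_distances(X)
    suspect = d > cutoff(d, c)

    # Step 2: leverage of every point relative to the retained set only.
    XR = Xd[~suspect]
    G = np.linalg.inv(XR.T @ XR)
    w = np.einsum("ij,jk,ik->i", Xd, G, Xd)

    # Potential-type measure: w for suspects, w / (1 - w) for retained points.
    p = np.where(suspect, w, w / (1.0 - w))

    # Step 3: diagnostic check. Suspects below the cutoff are put back
    # (single pass here; an iterative put-back is also conceivable).
    flagged = suspect & (p > cutoff(p, c))
    return flagged, suspect

# Toy data: 20 evenly spaced x-values plus 3 planted far-away points.
X_demo = np.concatenate([np.linspace(-1, 1, 20), [10.0, 10.5, 11.0]]).reshape(-1, 1)
flagged, suspect = flag_high_leverage(X_demo)
print(np.flatnonzero(flagged))
```

Computing the leverages from the retained set alone is the design choice that counters masking: a cluster of extreme points cannot pull the reference fit toward itself. The final check limits swamping by releasing suspects whose potentials turn out to be unremarkable.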


Similar Items