5th International Summer School
Achievements and Applications of Contemporary Informatics, Mathematics and Physics
National University of Technology of the Ukraine
Kiev, Ukraine, August 3-15, 2010

PARAMETER ESTIMATION FOR SEMIPARAMETRIC MODELS WITH CMARS AND ITS APPLICATIONS

Fatma YERLIKAYA-ÖZKURT
Institute of Applied Mathematics, METU, Ankara, Turkey

Gerhard-Wilhelm WEBER
Institute of Applied Mathematics, METU, Ankara, Turkey
Faculty of Economics, Business and Law, University of Siegen, Germany
Center for Research on Optimization and Control, University of Aveiro, Portugal
Universiti Teknologi Malaysia, Skudai, Malaysia

Pakize TAYLAN
Department of Mathematics, Dicle University, Diyarbakır, Turkey

Outline
• Introduction
• Estimation for Generalized Linear Model (GLM)
• Generalized Partial Linear Model (GPLM)
• Estimation for GPLM
  – Least-Squares Estimation with Tikhonov Regularization
  – CMARS Method
• Penalized Residual Sum of Squares (PRSS) for GLM with MARS
• Tikhonov Regularization for GLM with MARS
• An Alternative Solution for Tikhonov Regularization Problem with CQP
• Solution Methods
• Application
• Conclusion

Introduction
The class of Generalized Linear Models (GLMs) has gained popularity as a statistical modeling tool. This popularity is due to:
• the flexibility of GLMs in addressing a variety of statistical problems,
• the availability of software (Stata, SAS, S-PLUS, R) to fit the models.
The class of GLMs is an extension of traditional linear models that allows:
• the mean of a dependent variable to depend on a linear predictor through a nonlinear link function,
• the probability distribution of the response to be any member of an exponential family of distributions.
Many widely used statistical models belong to the GLM class:
  o linear models with normal errors,
  o logistic and probit models for binary data,
  o log-linear models for multinomial data.
Many other useful statistical models, such as those with
• Poisson, binomial,
• Gamma or normal distributions,
can be formulated as GLMs by the selection of an appropriate link function and response probability distribution.
A GLM looks as follows:
$$ \eta_i = H(\mu_i) = x_i^T \beta; $$
• $\mu_i = E(Y_i)$: expected value of the response variable $Y_i$,
• $H$: smooth monotonic link function,
• $x_i$: observed value of the explanatory variable for the i-th case,
• $\beta$: vector of unknown parameters.
• Assumptions: the $Y_i$ are independent and can have any distribution from the exponential family density
$$ Y_i \sim f_{Y_i}(y_i, \theta_i, \phi) = \exp\left\{ \frac{y_i \theta_i - b_i(\theta_i)}{a_i(\phi)} + c_i(y_i, \phi) \right\} \quad (i = 1, 2, \ldots, n), $$
• $a_i$, $b_i$, $c_i$ are arbitrary “scale” parameters, and $\theta_i$ is called a natural parameter.
• General expressions for the mean and variance of the dependent variable $Y_i$:
$$ \mu_i = E(Y_i) = b_i'(\theta_i), $$
$$ \mathrm{Var}(Y_i) = V(\mu_i)\,\phi, \qquad V(\mu_i) := b_i''(\theta_i)/\omega_i, \qquad a_i(\phi) := \phi/\omega_i. $$

Estimation for GLM
• Estimation and inference for GLM is based on the theory of
  • Maximum Likelihood Estimation,
  • Least-Squares approach:
$$ l(\beta) := \sum_{i=1}^{n} \left( \theta_i y_i - b_i(\theta_i) + c_i(y_i, \phi) \right). $$
• The dependence of the right-hand side on $\beta$ is solely through the dependence of the $\theta_i$ on $\beta$.

Generalized Partial Linear Models (GPLMs)
• Particular semiparametric models are the Generalized Partial Linear Models (GPLMs). They extend the GLMs in that the usual parametric terms are augmented by a single nonparametric component:
$$ E(Y \mid X, T) = G\left( X^T \beta + \gamma(T) \right); $$
• $\beta = (\beta_1, \ldots, \beta_m)^T$ is a vector of parameters, and $\gamma(\cdot)$ is a smooth function, which we try to estimate by CMARS.
• Assumption: an m-dimensional random vector $X$, which represents (typically discrete) covariates, and a q-dimensional random vector $T$ of continuous covariates, both of which come from a decomposition of the explanatory variables.
• Other interpretations of $\gamma(\cdot)$: role of the environment, expert opinions, Wiener processes, etc.

Estimation for GPLM
• There are different kinds of estimation methods for GPLM.
• Generally, the estimation methods for the model
$$ E(Y \mid X, T) = G\left( X^T \beta + \gamma(T) \right) $$
are based on kernel methods and on test procedures for the correct specification of this model.
• Here, we concentrate on special types of GPLM estimation based on
  – the newly developed data mining method CMARS, and
  – least-squares estimation with Tikhonov regularization.

Least-Squares Estimation with Tikhonov Regularization
• The general model
$$ E(Y \mid X, T) = G\left( X^T \beta + \gamma(T) \right) $$
can be considered as a semiparametric generalized linear model and can be written as follows:
$$ H(\mu) = \eta(X, T) = X^T \beta + \gamma(T) = \sum_{j=1}^{m} X_j \beta_j + \gamma(T). $$
• For the observation values $y_i, x_i, t_i$ $(i = 1, 2, \ldots, n)$:
$$ \mu_i = G(\eta_i) \quad \text{and} \quad \eta_i = H(\mu_i) = x_i^T \beta + \gamma(t_i), $$
where $\gamma(\cdot)$ is a smooth function.
• For the estimation of the parametric part, we apply linear least squares with Tikhonov regularization.
The process is as follows. Firstly, we apply linear least squares on the given data to find a vector $\beta_{preproc}$:
$$ Y = X^T \beta_{preproc} + \varepsilon. \qquad (*) $$
Equivalently, the model form is
$$ y = \beta_0 + \sum_{j=1}^{m} X_j \beta_j + \varepsilon. $$
The method of least squares is used to estimate the regression coefficients $\beta_{preproc} = (\beta_0, \beta_1, \beta_2, \ldots, \beta_m)^T$ in this model so as to minimize the residual sum of squares (RSS).
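
The least-squares preprocessing step, and a Tikhonov-regularized variant of it, can be sketched in Python with NumPy. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names, the synthetic data, the penalty parameter `lam`, and the penalty matrix `L` (identity by default, so the intercept is penalized too) are hypothetical choices that do not come from the slides.

```python
import numpy as np

def design_matrix(X):
    """Prepend a column of ones so that beta_0 acts as the intercept."""
    return np.column_stack([np.ones(X.shape[0]), X])

def least_squares_preproc(X, y):
    """Estimate (beta_0, ..., beta_m) by minimizing RSS = ||y - A beta||^2."""
    A = design_matrix(X)
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    return beta

def tikhonov_least_squares(X, y, lam, L=None):
    """Minimize ||y - A beta||^2 + lam * ||L beta||^2.

    Assumption: L defaults to the identity matrix; the slides do not
    specify the penalty matrix at this point.
    """
    A = design_matrix(X)
    if L is None:
        L = np.eye(A.shape[1])
    # Regularized normal equations: (A^T A + lam L^T L) beta = A^T y
    return np.linalg.solve(A.T @ A + lam * (L.T @ L), A.T @ y)

def rss(X, y, beta):
    """Residual sum of squares for a given coefficient vector."""
    r = y - design_matrix(X) @ beta
    return float(r @ r)

# Synthetic data (hypothetical): n = 50 observations, m = 3 covariates.
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 3))
beta_true = np.array([1.0, 2.0, -1.0, 0.5])
y = beta_true[0] + X @ beta_true[1:] + 0.1 * rng.normal(size=50)

beta_preproc = least_squares_preproc(X, y)
beta_tik = tikhonov_least_squares(X, y, lam=1.0)
```

With `lam=0.0` the regularized estimate coincides with the plain least-squares one. For `lam > 0` and the identity penalty, the coefficient norm shrinks while the RSS cannot decrease.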