Comments on Probability Puzzles: Linear Regression, Transforms and Regularization

Bill Venables (https://www.blogger.com/profile/17388811817434387510), 2014-03-23T21:12:53.343-07:00

This example has no context, so modelling decisions are being made in a vacuum. That never happens in real life.

Transforming your way around outliers (if that's what you have; only a context would shed light on this) is never a really good idea. Why not recognize that you have an outlier and use a robust method? E.g.

    dfr <- data.frame(x = x, y = y) ## put things together that belong together...
    library(robustbase)
    mrob <- lmrob(y ~ poly(x, 3), dfr)

    summary(mrob) ## shows one outlier, as expected

    plot(y ~ x, dfr, type = "b")
    lines(predict(mrob, dfr) ~ x, dfr, col = "blue")

This model has 4 mean parameters + 1 variance, and so is comparable with yours, which has 2 regression coefficients + 1 transform + 1 tuning constant + 1 variance.

I don't think either model would be much good for extrapolation, though.

Caerolus (https://www.blogger.com/profile/15850152660497355643), 2014-03-23T09:49:45.142-07:00

Good! I was just surprised by how bad the fit was; it's not "clearly worse off" anymore :-)

RumpelStiltSkin (https://www.blogger.com/profile/09194504426856651927), 2014-03-23T09:43:31.168-07:00

Right. Corrected. Thanks for pointing it out.

Caerolus (https://www.blogger.com/profile/15850152660497355643), 2014-03-23T09:20:34.437-07:00

I think there's something wrong with the red fit (transformed response). I just did it myself and got y.hat = c(1.505845, 1.724990, 1.987590, 2.304775, 2.691234, 3.166601, 3.757496, 4.500575, 5.447183, 6.670643, 8.277932, 10.428961, 13.369481, 17.489335)
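
A check like the one Caerolus describes (refit the transformed-response model and compare the fitted values on the original scale) could be sketched roughly as follows. This is only an illustration under stated assumptions: the thread does not list `x` and `y` or say which transform the post used, so the data here are made up, the Box-Cox form and the value of `lambda` are assumptions, and the helpers `bc` and `bc_inv` are hypothetical names. The regularization step mentioned in the post title is left out.

```r
# Illustration only: the thread never lists x and y, so these values are made up.
set.seed(1)
x <- 1:14
y <- exp(0.2 * x) + rnorm(length(x), sd = 0.1)

# Assumed Box-Cox transform with a fixed, hypothetical lambda.
lambda <- 0.5
bc     <- function(y, lambda) if (lambda == 0) log(y) else (y^lambda - 1) / lambda
bc_inv <- function(z, lambda) if (lambda == 0) exp(z) else (lambda * z + 1)^(1 / lambda)

# Fit on the transformed scale, then map fitted values back to the scale of y.
fit   <- lm(bc(y, lambda) ~ x)
y.hat <- bc_inv(fitted(fit), lambda)

# Overlay the back-transformed fit on the raw data.
plot(x, y, type = "b")
lines(x, y.hat, col = "red")
```

The point of the sketch is the back-transformation: fitted values from `lm` live on the transformed scale, so plotting them against the raw `y` without applying the inverse transform would make the fit look far worse than it is.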