Hello, Professor!
I ran into a few problems while working on a project in R. Could you please help me solve them?
1. The dataset has 20 variables, and four of them are to be analyzed (chol, copper, trig, and platelet). What transformations could you use to make
these more bell-shaped (that is, closer to a normal distribution)? The data are attached.
2. Filling in missing values. Until now I have always filled missing values with the mean or the median, but this task uses a different method: We will investigate missing values through a practice called “missing in the
margins”. Replace the missings with a value that is outside the range of the variable, but close enough so that when plotted, it will not look too far off (e.g. the variable log(chol) falls roughly between 4.7 and 7.5 - so you can replace the missings with a value of 3). This plot will have a lot of overplotting in the missings. Now jitter the missing values for each of the four variables by adding noise to them (in R: you can use the jitter() function, or add random normal noise using rnorm()). Make sure the variance that you add keeps the missings separate from the rest of the data. See plot below for an example of how this might look.
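I find this rather confusing, and I do not know how to implement it in R.
Could you please take the time to answer as soon as you can? Thank you!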
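
A rough sketch of how both steps could look in R. Everything here is an assumption rather than the assignment's intended solution: the data frame `dat` is simulated because the attached file is not shown, `margin_fill()` is a hypothetical helper, and using log for chol, copper and trig and square root for platelet is only a common starting point for positive, right-skewed variables that should be checked against histograms or Q-Q plots of the real data. The sketch uses `jitter()` with a fixed `amount`, so the added noise is bounded and the filled values cannot drift into the observed range.

```r
set.seed(1)
n <- 200
vars <- c("chol", "copper", "trig", "platelet")

# Simulated stand-in for the attached data (assumption: positive, right-skewed values).
dat <- data.frame(
  chol     = rlnorm(n, meanlog = 5.8, sdlog = 0.4),
  copper   = rlnorm(n, meanlog = 4.3, sdlog = 0.7),
  trig     = rlnorm(n, meanlog = 4.7, sdlog = 0.4),
  platelet = rgamma(n, shape = 8, rate = 8 / 260)
)
# Blank out 25 random values per variable so there is something to fill.
for (v in vars) dat[[v]][sample(n, 25)] <- NA

# Step 1: candidate transformations toward a more bell-shaped distribution.
tr <- data.frame(
  chol     = log(dat$chol),
  copper   = log(dat$copper),
  trig     = log(dat$trig),
  platelet = sqrt(dat$platelet)
)

# Record which cells were missing before anything is filled in.
miss <- is.na(tr)

# Step 2: "missing in the margins" (hypothetical helper).
# Missing values are placed in a band centred `gap` * range below the observed
# minimum, then jittered by at most `width` * range in either direction.
# With width < gap the band always stays below the observed data.
margin_fill <- function(x, gap = 0.2, width = 0.05) {
  na     <- is.na(x)
  rng    <- range(x, na.rm = TRUE)
  span   <- diff(rng)
  centre <- rng[1] - gap * span
  x[na]  <- jitter(rep(centre, sum(na)), amount = width * span)
  x
}

filled <- as.data.frame(lapply(tr, margin_fill))

# Scatterplot matrix; a point is red if any of its four values was missing.
pairs(filled, col = ifelse(rowSums(miss) > 0, "red", "black"),
      pch = 16, cex = 0.6)
```

With the real data, `dat` would instead come from reading the attached file, and `hist(tr$chol)` or `qqnorm(tr$chol)` can be used to judge whether a given transformation helped. If `rnorm()` is preferred for the noise, its standard deviation needs to be small relative to `gap * span`, since normal noise is unbounded.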