
I admit it: tidyverse has gone beyond the scope of the R language

育种数据分析 (Breeding Data Analysis) 2022-05-11

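There has been a heated discussion on Zhihu recently about whether R or Python is more elegant, or which of the two is better suited to data analysis. The arguments on both sides are well worth reading: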

https://www.zhihu.com/question/527922200

Don't bother clicking: WeChat doesn't support hyperlinks!!!

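Personally, I find Python better suited to writing pipelines. For day-to-day modeling I prepare the data in R, hand it over to third-party software, and then chain everything together with Python. That said, R's tidyverse really is good, and very efficient. In a sense, users who have only learned base R and never touched tidyverse will look at tidyverse-style R code and feel that it has left the R language behind!!!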

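I have been learning tidyverse lately. For batch ANOVA I used to write a for loop, build each model with formula, and save the results in a list. With tidyverse I can instead use pivot_longer to reshape all traits into long format, use group_by and nest to turn the data into a list-column, run the models in batch with map, and clean up the results with tidy. The whole thing flows much more smoothly. Let's walk through it in code.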

Here is my final code:

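    library(tidyverse)  # pivot_longer, group_by, nest, mutate, map, unnest
    library(broom)      # tidy

    # reshape every column after the first five into trait / y pairs
    fm1 = fm %>% pivot_longer(-c(1:5), names_to = "trait", values_to = "y")
    head(fm1)

    # one aov model per trait, then one tidy result table per model
    fm1 %>% group_by(trait) %>% nest() %>%
      mutate(model = map(data, ~aov(y ~ Spacing + Rep, data = .))) %>%
      mutate(result = map(model, ~tidy(.))) %>%
      unnest(result)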

Without some tidyverse background, the code above is unreadable. After all, map, group_by, mutate, nest, unnest, and tidy are all things a base R user has never seen before.

Result file: [screenshot not preserved]


Now let's look at the task:

    > library(learnasreml)
    > data(fm)
    > str(fm)
    'data.frame':  827 obs. of  13 variables:
     $ TreeID : Factor w/ 827 levels "80001","80002",..: 1 2 3 4 5 6 7 8 9 10 ...
     $ Spacing: Factor w/ 2 levels "2","3": 2 2 2 2 2 2 2 2 2 2 ...
     $ Rep    : Factor w/ 5 levels "1","2","3","4",..: 1 1 1 1 1 1 1 1 1 1 ...
     $ Fam    : Factor w/ 55 levels "70001","70002",..: 44 44 44 15 15 2 2 10 10 10 ...
     $ Plot   : Factor w/ 4 levels "1","2","3","4": 1 2 4 1 4 2 4 1 2 3 ...
     $ dj     : num  0.334 0.348 0.354 0.335 0.322 0.359 0.368 0.358 0.323 0.298 ...
     $ dm     : num  0.405 0.393 0.429 0.408 0.372 0.45 0.509 0.381 0.393 0.361 ...
     $ wd     : num  0.358 0.365 0.379 0.363 0.332 0.392 0.388 0.369 0.347 0.324 ...
     $ h1     : int  29 24 19 46 33 30 37 32 34 28 ...
     $ h2     : int  130 107 82 168 135 132 124 126 153 127 ...
     $ h3     : int  239 242 180 301 271 258 238 290 251 243 ...
     $ h4     : int  420 410 300 510 470 390 380 460 430 410 ...
     $ h5     : int  630 600 500 700 670 570 530 660 600 630 ...

The data set has 827 rows, and we want to run an analysis of variance (ANOVA) on Fam.

For example, running an ANOVA on `dj` shows that the differences among Fam levels are highly significant.
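As a rough sketch of what such a single-trait analysis looks like, the block below fits one `aov` model on a small simulated data frame. The data frame `toy`, its family effects, and the choice of `Fam + Rep` as model terms are assumptions for illustration (the terms mirror the for loop in this post); this is not the real `fm` data.

```r
# Simulated stand-in for the fm data (hypothetical values, not the real trial)
set.seed(1)
toy <- data.frame(
  Fam = factor(rep(1:5, each = 12)),   # 5 families, 12 trees each
  Rep = factor(rep(1:4, times = 15))   # 4 replicates, balanced within family
)
fam_effect <- c(-0.02, -0.01, 0, 0.01, 0.02)
toy$dj <- 0.35 + fam_effect[as.integer(toy$Fam)] + rnorm(nrow(toy), sd = 0.01)

# One model for one trait: family and replicate as fixed terms
mod <- aov(dj ~ Fam + Rep, data = toy)

# ANOVA table: Df, Sum Sq, Mean Sq, F value, Pr(>F) per term
tab <- summary(mod)[[1]]
print(tab)
```

With `learnasreml` installed, the analogous call on the real data would presumably be `aov(dj ~ Fam + Rep, data = fm)`.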

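Here is the question: if we want to run an ANOVA on every trait from `dj`, `dm`, ... through `h5`, how should we go about it? We could of course fit one model per trait by hand, but we would much rather process them in batch.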

The first choice is, of course, a for loop:

    # for
    nn = names(fm)[-c(1:5)]
    re = NULL
    for(i in seq_along(nn)){
      # i = 1
      mod = aov(formula(paste0(nn[i],"~Fam + Rep")),data=fm)
      re[[i]] = summary(mod)
    }
    names(re) = nn
    re

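Result (the tables are in the tidy tibble layout that `broom::tidy(mod)` returns, rather than the plain `summary(mod)` printout):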

    > re
    $dj
    # A tibble: 3 x 6
      term         df  sumsq   meansq statistic   p.value
      <chr>     <dbl>  <dbl>    <dbl>     <dbl>     <dbl>
    1 Fam          54 0.0912 0.00169       3.52  9.85e-15
    2 Rep           4 0.0319 0.00797      16.6   4.60e-13
    3 Residuals   767 0.368  0.000480     NA    NA

    $dm
    # A tibble: 3 x 6
      term         df  sumsq  meansq statistic    p.value
      <chr>     <dbl>  <dbl>   <dbl>     <dbl>      <dbl>
    1 Fam          54 0.214  0.00396      2.12 0.00000996
    2 Rep           4 0.0279 0.00696      3.73 0.00515
    3 Residuals   766 1.43   0.00187     NA   NA

    $wd
    # A tibble: 3 x 6
      term         df  sumsq   meansq statistic   p.value
      <chr>     <dbl>  <dbl>    <dbl>     <dbl>     <dbl>
    1 Fam          54 0.123  0.00227       3.86  3.83e-17
    2 Rep           4 0.0469 0.0117       19.9   1.29e-15
    3 Residuals   768 0.452  0.000588     NA    NA

    $h1
    # A tibble: 3 x 6
      term         df  sumsq meansq statistic   p.value
      <chr>     <dbl>  <dbl>  <dbl>     <dbl>     <dbl>
    1 Fam          54 13444.  249.       4.71  4.35e-23
    2 Rep           4  4623. 1156.      21.9   4.06e-17
    3 Residuals   768 40572.   52.8     NA    NA

    $h2
    # A tibble: 3 x 6
      term         df   sumsq meansq statistic   p.value
      <chr>     <dbl>   <dbl>  <dbl>     <dbl>     <dbl>
    1 Fam          54  82699.  1531.      2.05  2.31e- 5
    2 Rep           4  65677. 16419.     22.0   3.11e-17
    3 Residuals   768 572403.   745.     NA    NA

    $h3
    # A tibble: 3 x 6
      term         df    sumsq meansq statistic   p.value
      <chr>     <dbl>    <dbl>  <dbl>     <dbl>     <dbl>
    1 Fam          54  183935.  3406.      1.88  2.12e- 4
    2 Rep           4  108005. 27001.     14.9   1.01e-11
    3 Residuals   768 1393118.  1814.     NA    NA

    $h4
    # A tibble: 3 x 6
      term         df    sumsq  meansq statistic   p.value
      <chr>     <dbl>    <dbl>   <dbl>     <dbl>     <dbl>
    1 Fam          54  382898.   7091.      1.17  1.97e- 1
    2 Rep           4  454090. 113523.     18.7   1.12e-14
    3 Residuals   765 4644446.   6071.     NA    NA

    $h5
    # A tibble: 3 x 6
      term         df    sumsq  meansq statistic   p.value
      <chr>     <dbl>    <dbl>   <dbl>     <dbl>     <dbl>
    1 Fam          54  676396.  12526.      1.58  5.79e- 3
    2 Rep           4  682404. 170601.     21.6   7.01e-17
    3 Residuals   765 6049952.   7908.     NA    NA
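The one step of the for loop that is easy to misread is how a trait name becomes a model formula. The sketch below isolates that step: it pastes a name into a string and parses it with `formula`, as the loop does, and shows `reformulate` as a base R alternative that skips the string pasting. The vector `nn` is typed in by hand from the `str(fm)` output so that the block runs without the data; `f_paste`, `f_ref`, and `all_formulas` are illustrative names.

```r
# Trait names as listed in str(fm); typed by hand so no data set is needed
nn <- c("dj", "dm", "wd", "h1", "h2", "h3", "h4", "h5")

# The loop's approach: build the text "dj~Fam + Rep", then parse it
f_paste <- formula(paste0(nn[1], "~Fam + Rep"))

# Base R alternative: assemble the formula from term and response names
f_ref <- reformulate(c("Fam", "Rep"), response = nn[1])

# Both routes describe the same model
print(f_paste)
print(f_ref)

# One formula per trait, ready to be passed to aov() one by one
all_formulas <- lapply(nn, function(tr) reformulate(c("Fam", "Rep"), response = tr))
names(all_formulas) <- nn
```

Either form can be handed to `aov(..., data = fm)`; the choice between them is a matter of taste.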

Now let's look at the tidyverse solution:

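    library(tidyverse)  # pivot_longer, group_by, nest, mutate, map, unnest
    library(broom)      # tidy

    head(fm)
    fm1 = fm %>% pivot_longer(-c(1:5), names_to = "trait", values_to = "y")
    head(fm1)

    # note: this model uses Spacing + Rep, whereas the for loop above uses Fam + Rep
    fm1 %>% group_by(trait) %>% nest() %>%
      mutate(model = map(data, ~aov(y ~ Spacing + Rep, data = .))) %>%
      mutate(result = map(model, ~tidy(.))) %>%
      unnest(result)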

Step 1: reshape the data into long format

Step 2: group_by the data, then nest it into a list-column

Step 3: use map to run the ANOVAs in batch

Step 4: use map to tidy up the results
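To see what each of the four steps produces, the pipeline can be split into named intermediate objects. The sketch below does this on a small simulated data frame; `toy` and its columns are assumptions standing in for `fm`, and `long`, `nested`, `fitted_models`, and `result` are illustrative names, so the numbers it prints are not the author's results.

```r
library(tidyr)   # pivot_longer, nest, unnest
library(dplyr)   # group_by, mutate, %>%
library(purrr)   # map
library(broom)   # tidy

# Simulated stand-in for fm (hypothetical values): two factors, two traits
set.seed(42)
toy <- data.frame(
  TreeID  = factor(1:40),
  Spacing = factor(rep(c(2, 3), each = 20)),
  Rep     = factor(rep(1:4, times = 10)),
  h1      = rnorm(40, mean = 30, sd = 5),
  h2      = rnorm(40, mean = 130, sd = 15)
)

# Step 1: wide to long, one row per tree and trait
long <- toy %>%
  pivot_longer(c(h1, h2), names_to = "trait", values_to = "y")

# Step 2: one row per trait, with that trait's rows stored in the list-column `data`
nested <- long %>% group_by(trait) %>% nest()

# Step 3: fit one aov model per trait and keep it in the list-column `model`
fitted_models <- nested %>%
  mutate(model = map(data, ~ aov(y ~ Spacing + Rep, data = .x)))

# Step 4: turn each model into a tidy table, then unnest into ordinary rows
result <- fitted_models %>%
  mutate(result = map(model, tidy)) %>%
  unnest(result)

print(result[, c("trait", "term", "df", "statistic", "p.value")])
```

Printing `nested` or `fitted_models` along the way shows the list-columns directly, which is usually the quickest way to get comfortable with nest and map.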

Result on the `fm` data: [screenshot not preserved]


In one word: superb

In two words: truly superb

……


