Find a list of strings across a data table in R


I have a vector of strings (candidates), each of which I want to find within a data table (fbgn_dmels), and return the first column entry if a match is found within the row (e.g. cg2175 should return "1-dec").

    > head(candidates)
    [1] "cg2175"  "cg31196" "cg3169"  "cg15168" "cg2252"  "cg2019"
    > fbgn_dmels
                   v1_01       v1_02       v1_03       v1_04       v1_05       v1_06       v1_07       v1_08       v1_09   v1_10 v1_11 v1_12 v1_13 v1_14
        1:         1-dec fbgn0000427 fbgn0000645      cg2175          NA          NA          NA          NA          NA      NA    NA    NA    NA    NA
        2:         1-sep fbgn0011710 fbgn0005665 fbgn0013404 fbgn0014082 fbgn0024226      cg1403          NA          NA      NA    NA    NA    NA    NA
        3:         128up fbgn0010339 fbgn0010196      cg8340          NA          NA          NA          NA          NA      NA    NA    NA    NA    NA
        4: 14-3-3epsilon fbgn0020238 fbgn0011329 fbgn0016739 fbgn0016743 fbgn0046456 fbgn0051196 fbgn0064146 fbgn0066007 cg31196    NA    NA    NA    NA
        5:    14-3-3zeta fbgn0004907 fbgn0010635 fbgn0019723 fbgn0023038 fbgn0046306 fbgn0064146     cg17870          NA      NA    NA    NA    NA    NA
       ---
    17743:          zw10 fbgn0004643 fbgn0000016 fbgn0002765 fbgn0029627      cg9900          NA          NA          NA      NA    NA    NA    NA    NA
    17744:        zwilch fbgn0061476 fbgn0036933 fbgn0042214     cg18729     cg18639          NA          NA          NA      NA    NA    NA    NA    NA
    17745:           zyd fbgn0265767 fbgn0243503 fbgn0025689 fbgn0058147 fbgn0040030      cg2893     cg40147          NA      NA    NA    NA    NA    NA
    17746:           zye fbgn0036985      cg5847          NA          NA          NA          NA          NA          NA      NA    NA    NA    NA    NA
    17747:           zyx fbgn0011642 fbgn0047225 fbgn0052018     cg32018          NA          NA          NA          NA      NA    NA    NA    NA    NA

I can solve this issue using loops on a data frame, but that seems quite slow and inefficient. I was wondering if there is a straightforward way of doing this with data tables.

Many thanks in advance for any suggestions on how to tackle this.

-geo

This is pretty hacky, but it seems to work. I'm assuming your data is called fbgn_dmels:

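    candidates <- c("cg2175", "cg31196", "cg3169", "cg15168", "cg2252", "cg2019")

    getthem <- function(string) {
      string <- paste0("^", string, "$")  # anchor for exact, whole-cell matches
      # logical matrix of hits; [1] of which() is the row of the first match
      hits <- apply(fbgn_dmels, 2, function(x) grepl(string, x, perl = TRUE))
      first_row <- which(hits, arr.ind = TRUE)[1]
      as.character(fbgn_dmels[first_row, "v1_01"][1])
    }

    sapply(candidates, getthem)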

First I defined a function (getthem) that gets the first occurrence for a single string, then I used sapply to run it over all the candidates.
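Since the question asks for a data.table way, here is a rough sketch of one alternative that avoids calling grepl once per candidate: reshape the table to long format with melt, then look up all candidates at once with match. The three-row fbgn_dmels below is a made-up stand-in built from rows shown in the question, and the names long, id and result are just illustrative. It assumes exact, whole-cell matches are what you want and that missing cells are real NA values.

```r
library(data.table)

# Toy stand-in for fbgn_dmels (hypothetical subset; the real table has 14 columns)
fbgn_dmels <- data.table(
  v1_01 = c("1-dec", "1-sep", "14-3-3epsilon"),
  v1_02 = c("fbgn0000427", "fbgn0011710", "fbgn0020238"),
  v1_03 = c("cg2175", "cg1403", "cg31196"),
  v1_04 = c(NA, NA, NA_character_)
)
candidates <- c("cg2175", "cg31196", "cg3169")

# Long format: one row per (first-column entry, identifier) pair, NAs dropped
long <- melt(fbgn_dmels, id.vars = "v1_01", value.name = "id", na.rm = TRUE)

# Exact lookup for every candidate in one call; unmatched candidates give NA
result <- setNames(long$v1_01[match(candidates, long$id)], candidates)
print(result)
```

Like the sapply version, this returns only the first hit for each candidate, and a candidate that is not found comes back as NA. If an identifier can appear in several rows and you need all of them, something like long[id %in% candidates] would keep every match.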

