R: Count the daily number of distinct values of a variable per ID


I've asked a similar question before (link here). This time, I want to calculate the number of distinct v per day, per id. Here "distinct" does not just mean different v within one day; it means v that have not already appeared on that day or on any preceding day for the same id.

For example, if v1 appears on the second day but was already there on a day before, don't count v1 for the second day.

id1:

day1: v1/v2 -----> 2 for day1

day2: v1/v3 -----> 1 for day2

day3: v3 -----> 0 for day3

id2:

day1: v4 -----> 1 for day1

day2: v5/v4/v1 -----> 2 for day2

day3: v3/v4 -----> 1 for day3

Here is the data:

id         day             v
id1         1              v1
id1         1              v1
id1         1              v2
id1         2              v1
id1         2              v3
id1         3              v3
id1         3              v3
id1         3              v3
id2         1              v4
id2         2              v5
id2         2              v5
id2         2              v4
id2         2              v1
id2         3              v3
id2         3              v4

With the data above, I want a result like:

id         day             v         daily_v_distinguish_id
id1         1              v1            2
id1         1              v1            NA
id1         1              v2            NA
id1         2              v1            1
id1         2              v3            NA
id1         3              v3            0
id1         3              v3            NA
id1         3              v3            NA
id2         1              v4            1
id2         2              v5            2
id2         2              v5            NA
id2         2              v4            NA
id2         2              v1            NA
id2         3              v3            1
id2         3              v4            NA

If I use setDT(df1)[, daily_v_id := c(uniqueN(v), rep(NA, .N-1)), by = .(id, day)], the v values of a day are not compared with those of the preceding days.

We can use data.table to create 'daily_v_distinguish_id'. Convert the 'data.frame' to a 'data.table' (setDT(df1)) and, grouped by 'id', create a logical index ('indx') that is TRUE for the elements of 'v' that are not duplicated. In the next step, group by the 'id' and 'day' columns, take the sum of 'indx', concatenate it with 'NA' to fill the rest of the elements in each group, and assign (:=) the result to 'daily_v_distinguish_id'.

    library(data.table)
    # indx: TRUE the first time a value of v appears within an id
    setDT(df1)[, indx := !duplicated(v), .(id)
        # per id and day: count of first appearances, then NA padding
        ][, daily_v_distinguish_id := c(sum(indx), rep(NA, .N-1)), .(id, day)
        # drop the helper column
        ][, indx := NULL]
    df1
    #     id day  v daily_v_distinguish_id
    # 1: id1   1 v1                      2
    # 2: id1   1 v1                     NA
    # 3: id1   1 v2                     NA
    # 4: id1   2 v1                      1
    # 5: id1   2 v3                     NA
    # 6: id1   3 v3                      0
    # 7: id1   3 v3                     NA
    # 8: id1   3 v3                     NA
    # 9: id2   1 v4                      1
    #10: id2   2 v5                      2
    #11: id2   2 v5                     NA
    #12: id2   2 v4                     NA
    #13: id2   2 v1                     NA
    #14: id2   3 v3                      1
    #15: id2   3 v4                     NA
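To see why the first step groups by 'id' alone, it can help to look at the intermediate flag on its own. The following is only a minimal base R sketch, not part of the original answer: the names `df`, `first_seen` and `counts` are illustrative, and it assumes the rows are already ordered by day within each id, as in the example data.

```r
# Same rows as the example data, rebuilt here so the sketch is self-contained
df <- data.frame(
  id  = rep(c("id1", "id2"), times = c(8, 7)),
  day = c(1L, 1L, 1L, 2L, 2L, 3L, 3L, 3L, 1L, 2L, 2L, 2L, 2L, 3L, 3L),
  v   = c("v1", "v1", "v2", "v1", "v3", "v3", "v3", "v3",
          "v4", "v5", "v5", "v4", "v1", "v3", "v4"),
  stringsAsFactors = FALSE
)

# TRUE the first time an (id, v) pair appears, FALSE for every later repeat;
# 'day' is deliberately left out so that earlier days are taken into account
first_seen <- !duplicated(df[c("id", "v")])

# One row per id and day with the number of first appearances (column 'x')
counts <- aggregate(first_seen, by = list(id = df$id, day = df$day), FUN = sum)
counts
```

Because `duplicated` scans rows from top to bottom, data that is not sorted by day within id would need to be ordered first (for example with `order(df$id, df$day)`) for the counts to match the rule in the question.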

A similar option using dplyr is:

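    library(dplyr)
    df1 %>%
        group_by(id) %>%
        # flag the first appearance of each v within an id
        mutate(ind = !duplicated(v)) %>%
        # dplyr >= 1.0.0 uses .add; older versions used add = TRUE
        group_by(day, .add = TRUE) %>%
        mutate(daily_v_distinguish_id = c(sum(ind), rep(NA, n()-1))) %>%
        select(-ind)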

Or using ave from base R:

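    # df1[-2] drops the 'day' column; this assumes df1 is a plain data.frame
    # (on a data.table, df1[-2] would drop the second row instead)
    with(df1, ave(!duplicated(df1[-2]), id, day, FUN = function(x)
                       c(sum(x), rep(NA, length(x)-1))))
    #[1]  2 NA NA  1 NA  0 NA NA  1  2 NA NA NA  1 NA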

Data

    df1 <- structure(list(
        id = c("id1", "id1", "id1", "id1", "id1", "id1", "id1", "id1",
               "id2", "id2", "id2", "id2", "id2", "id2", "id2"),
        day = c(1L, 1L, 1L, 2L, 2L, 3L, 3L, 3L, 1L, 2L, 2L, 2L, 2L, 3L, 3L),
        v = c("v1", "v1", "v2", "v1", "v3", "v3", "v3", "v3",
              "v4", "v5", "v5", "v4", "v1", "v3", "v4")),
        .Names = c("id", "day", "v"), class = "data.frame",
        row.names = c(NA, -15L))
