regex - R: Split string data delimited by spaces into columns
I have a large data frame with one column, containing different numeric values separated by spaces, which I need to extract and organize into columns. Each row looks like this:
<call begin=6.0982886400000051 end=6.1078732800000051 maxfreq=40893.5546875 minfreq=35400.390625 peakfreq=39672.8515625 peakfreqs=39672.8515625 39672.8515625 39672.8515625 39672.8515625 39672.8515625 39672.8515625 39672.8515625 39672.8515625 39672.8515625 39672.8515625 39672.8515625 39672.8515625 39062.5 39062.5 39062.5 39062.5 39062.5 39062.5 39062.5 39062.5 39062.5 39062.5 39062.5 39062.5 39062.5 39062.5 39062.5 39062.5 39062.5 39062.5 38452.1484375 38452.1484375 38452.1484375 38452.1484375 38452.1484375 38452.1484375 38452.1484375 38452.1484375 38452.1484375 38452.1484375 38452.1484375 38452.1484375 38452.1484375 38452.1484375 38452.1484375 38452.1484375 38452.1484375 38452.1484375 38452.1484375 38452.1484375 38452.1484375 38452.1484375 38452.1484375 38452.1484375 38452.1484375 38452.1484375 37841.796875 37841.796875 37841.796875 37841.796875 37841.796875 37841.796875 37841.796875 37841.796875 37841.796875 37841.796875 37841.796875 37841.796875 37841.796875 37841.796875 37841.796875 37841.796875 37841.796875 37841.796875 37841.796875 37841.796875 37841.796875 37841.796875 37841.796875 37841.796875 37841.796875 37841.796875 37841.796875 37841.796875 37841.796875 37841.796875 37841.796875 37841.796875 37841.796875 37841.796875 37841.796875 37841.796875 37841.796875 37231.4453125 37231.4453125 37231.4453125 37231.4453125 37231.4453125 37231.4453125 37231.4453125 37231.4453125 37231.4453125 37231.4453125 37231.4453125 37231.4453125 37231.4453125 37231.4453125 37231.4453125 37231.4453125 37231.4453125 37231.4453125 37231.4453125 37231.4453125 36621.09375 36621.09375 36621.09375 36621.09375 intensity=-14.902734633213136 periodicity=0.853448275862069 shape=- calltype=cf-n species=pipistrellus kuhlii (77%), pipistrellus nathusii (77%) custom=false />
Here is some more information about the data:
'data.frame':	39 obs. of  1 variable:
 $ x1: Factor w/ 120 levels " <double>25.318181818181806</double>",..: 66 67 68 69 70 71 72 73 74 75 ...
I need something like this:
  call_begin          call_end            maxfrec         minfrec
1 0.59170816000000048 0.60006400000000049 531.005.859.375 433.349.609.375
2 0.7636582400000006  0.77135872000000061 531.005.859.375 42.724.609.375
  peakfrec
1 482.177.734.375
2 469.970.703.125
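I have some ideas for how to achieve this: first separate the string into columns using strsplit, then use the substr function to extract the numbers, and finally rbind the pieces to make the table. I found some threads on related topics, but I have not managed to replicate them with my data.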
I'd appreciate any help. Please let me know if anything is not clear.
A similar solution has already been described. This solution is a bit more generic and doesn't depend on the number of columns:
# Two sample records in a single string, separated by a newline
text <- '<call begin=0.59170816000000048 end=0.60006400000000049 maxfreq=53100.5859375 minfreq=43334.9609375 peakfreq=48217.7734375
<call begin=0.7636582400000006 end=0.77135872000000061 maxfreq=53100.5859375 minfreq=42724.609375 peakfreq=46997.0703125'

process_line <- function(line) {
  # Split on spaces and drop the leading "<call" token
  sp <- strsplit(line, ' ')[[1]][-1]
  # Left-hand side of each "key=value" pair becomes the column name
  cn <- sapply(sp, function(x) strsplit(x, "=")[[1]][1])
  # Right-hand side becomes the numeric value
  data <- sapply(sp, function(x) as.numeric(strsplit(x, "=")[[1]][2]))
  names(data) <- cn
  data
}

# One row per line of the input
t(sapply(strsplit(text, "\n")[[1]], process_line, USE.NAMES = FALSE))

         begin       end  maxfreq  minfreq peakfreq
[1,] 0.5917082 0.6000640 53100.59 43334.96 48217.77
[2,] 0.7636582 0.7713587 53100.59 42724.61 46997.07
It is based on the assumption that the text is not already separated into lines; otherwise strsplit(text, "\n")[[1]] can be replaced with text. There is no need to use regex, since the data can be obtained by splitting into smaller chunks on spaces and on =.
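For the case where the records are already separate elements, such as the factor column shown in the question's str() output, the same idea might look like the sketch below. The data frame df is hypothetical and reuses the two sample records from the answer. The column name x1 comes from the question. parse_record is an illustrative restatement of process_line so that the block runs on its own.

```r
# Hypothetical input: one record per row, stored as a factor column
# (x1 is the column name from the question; the values are the sample records).
df <- data.frame(
  x1 = c(
    "<call begin=0.59170816000000048 end=0.60006400000000049 maxfreq=53100.5859375 minfreq=43334.9609375 peakfreq=48217.7734375",
    "<call begin=0.7636582400000006 end=0.77135872000000061 maxfreq=53100.5859375 minfreq=42724.609375 peakfreq=46997.0703125"
  ),
  stringsAsFactors = TRUE
)

# Illustrative helper: drop the leading "<call" token, then split each
# "key=value" pair on "=" to get the names and the numeric values.
parse_record <- function(line) {
  tokens <- strsplit(line, " ", fixed = TRUE)[[1]][-1]
  pairs <- strsplit(tokens, "=", fixed = TRUE)
  values <- as.numeric(vapply(pairs, function(p) p[2], character(1)))
  names(values) <- vapply(pairs, function(p) p[1], character(1))
  values
}

# The records are already separate elements, so no split on "\n" is needed.
# as.character() converts the factor column to plain strings first.
records <- as.character(df$x1)
result <- as.data.frame(t(sapply(records, parse_record, USE.NAMES = FALSE)))
print(result)
```

This sketch assumes every field is a single numeric key=value pair, as in the answer's sample. The full records in the question also contain a peakfreqs field with many space-separated values and text fields such as species, which would need extra handling before a plain split on spaces gives the right result.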