Importing an adjacency matrix (csv) containing strings and floats in R

- July 15, 2013

This question already has an answer here:

R - Column names in read.table and write.table starting with a number and containing a space (1 answer)

I have a co-occurrence adjacency matrix like this: https://dl.dropboxusercontent.com/u/73950/matrix_added_cats.csv

where the rows and columns may contain strings with special characters ("(", "-", " ", etc.).

When I import the data into R to visualize it with ggplot2, I do this:

mydata <- read.csv("/matrix_added_cats.csv")

which returns:

                name  ngo gov..institutions industry..farming. industry..mining. academia.research aboriginal.groups
1                ngo 0.00              0.00                  0              0.00              0.01              0.00
2  gov. institutions 0.00              0.01                  0              0.04              0.03              0.01
3 industry (farming) 0.00              0.00                  0              0.00              0.00              0.00
4  industry (mining) 0.00              0.04                  0              0.10              0.25              0.07
5  academia/research 0.01              0.03                  0              0.25              0.36              0.10
6  aboriginal groups 0.00              0.01                  0              0.07              0.10              0.02
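The altered column headers in this output are consistent with what `make.names()` produces. According to the R documentation, `read.csv` adjusts headers through that function when `check.names = TRUE`, which is the default. The snippet below is only a small sketch of that behaviour: the `labels` vector is typed in by hand from the `name` column shown above, not read from the actual file.

```r
# Row labels copied by hand from the "name" column of the printed data frame
labels <- c("ngo", "gov. institutions", "industry (farming)",
            "industry (mining)", "academia/research", "aboriginal groups")

# make.names() turns arbitrary strings into syntactically valid R names;
# every character that is not allowed in a name is replaced by "."
mangled <- make.names(labels)
print(mangled)

# Which labels were changed by the conversion?
print(labels[labels != mangled])
```

Every label except `ngo` is changed, so after a default import the column names no longer match the strings in the `name` column.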

We see that the names of the columns containing the float values are no longer the same, proper strings as before. This leads, I think, to several issues in the ggplot2 visualization:

library(reshape)
library(ggplot2)
library(RColorBrewer)

# Long format: one row per (name, variable) cell of the matrix
dat <- melt(mydata)
mypalette <- colorRampPalette(rev(brewer.pal(9, "Spectral")), space = "Lab")

zp1 <- ggplot(dat, aes(x = variable, y = name, fill = value))
zp1 <- zp1 + geom_tile()
zp1 <- zp1 + scale_fill_gradientn(colours = mypalette(100), trans = "reverse")
zp1 <- zp1 + scale_x_discrete(expand = c(0, 0))
zp1 <- zp1 + scale_y_discrete(expand = c(0, 0))
zp1 <- zp1 + coord_equal()
zp1 <- zp1 + theme_bw() + theme(axis.text.x = element_text(angle = 45, hjust = 1))
print(zp1)

1) For a co-occurrence matrix to make sense, rows and columns should be in the same order (so that the same row/column elements meet on the diagonal), but for some reason ggplot2 orders them differently. Is this because the strings differ between rows and columns since the import?

2) Special characters are replaced with dots (e.g. ".."), which looks bad.

Is there a way to fix these problems?

You can use the argument check.names = FALSE in read.csv to suppress the replacement of special characters in column names.

mydata <- read.csv("/matrix_added_cats.csv", check.names = FALSE)

names(mydata)
# [1] "name"               "ngo"                "gov. institutions"  "industry (farming)"
# [5] "industry (mining)"  "academia/research"  "aboriginal groups"
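The ordering problem from point 1 is not addressed by `check.names` alone. ggplot2 orders a discrete axis by factor levels (or alphabetically for plain character vectors), so one possible approach is to give both axes the same explicit levels. The following is a minimal sketch, not the asker's exact data: `csv_text` is a hypothetical three-category excerpt that assumes the file is a plain comma-separated table with a `name` column, and the reshaping is done with base R instead of `melt()` so the example runs without extra packages.

```r
# Hypothetical inline excerpt standing in for matrix_added_cats.csv
# (the real file's exact layout is an assumption here)
csv_text <- 'name,ngo,gov. institutions,industry (mining)
ngo,0.00,0.00,0.00
gov. institutions,0.00,0.01,0.04
industry (mining),0.00,0.04,0.10'

# check.names = FALSE keeps the headers exactly as written in the file
mydata <- read.csv(text = csv_text, check.names = FALSE)

# Label order as it appears in the first column of the file
lvls <- as.character(mydata$name)

# Long format built with base R: one row per matrix cell.
# unlist() walks the value columns one after another, so the row
# labels repeat in full for every column.
dat <- data.frame(
  name     = rep(lvls, times = length(lvls)),
  variable = rep(names(mydata)[-1], each = nrow(mydata)),
  value    = unlist(mydata[-1], use.names = FALSE)
)

# Same explicit level order on both axes, so matching
# row/column labels can meet on the diagonal
dat$name     <- factor(dat$name, levels = lvls)
dat$variable <- factor(dat$variable, levels = lvls)
```

Passing this `dat` to the `ggplot()` call from the question should then put matching labels on the diagonal, because both `name` and `variable` share the same level order. This only works once the column names and row labels are identical strings, which is what `check.names = FALSE` provides.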

Panthy J
