
Showing posts with the label R

Random Forest Regression, Negative Variance Explained mechanism

Jeffrey Evans, Senior Landscape Ecologist, The Nature Conservancy, Global Lands Science Team, and Affiliate Assistant Professor, Zoology & Physiology, University of Wyoming, explains a negative percent variance explained in a random forest regression in a hilarious way: I have recently been asked the question: “why do I receive a negative percent variance explained in a random forest regression”. Besides the obvious answer “because your model is crap” I thought that I would explain the mechanism at work here so the assumption is not that randomForest is producing erroneous results. For poorly supported models it is, in fact, possible to receive a negative percent variance explained. Generally, explained variance (R²) is defined as: R² = 1 - sum((y - ŷ)²) / sum((y - mean(y))²). However, as indicated by Breiman (2001) and the R randomForest documentation, the (regression only) “pseudo R-squared” is derived as: R² = 1 - (mean squared error) / var(y). Which, mathe...
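As a rough illustration of the pseudo R-squared formula quoted above, here is a minimal base R sketch. The observed values and the deliberately poor predictions are made up for this example (they are not output from randomForest); the point is only that the ratio of mean squared error to var(y) can exceed 1, which pushes the result below zero.

```r
# Hypothetical observed values and deliberately poor predictions
# (invented for illustration, not produced by a fitted model)
y     <- c(1, 2, 3, 4, 5)
y_hat <- c(5, 4, 3, 2, 1)

# Mean squared error of the predictions: mean(16, 4, 0, 4, 16) = 8
mse <- mean((y - y_hat)^2)

# Pseudo R-squared as 1 - MSE / var(y); var(y) = 2.5, so 1 - 8 / 2.5 = -2.2
pseudo_r2 <- 1 - mse / var(y)
print(pseudo_r2)
```

Whenever the predictions are worse, on average, than simply guessing the mean of y, the mean squared error is larger than the variance and the value goes negative.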

Passing R variables dynamically to JavaScript for data visualization

For a random project, I was interested to see if there was a way to pass data between R and JavaScript. My purpose was to populate a Highcharts graph dynamically using preprocessed data from R. Of course, my first look was at rCharts, an R package to create and publish interactive JavaScript visualizations, which also supports Highcharts for building interactive graphs and plots. rCharts is a great package and widely popular among R users for creating interactive charts without knowing the underlying JavaScript. My requirement was not only an interactive chart but also a dynamic one that can update in near real time, as shown in the Highcharts demo. It seems to me that rCharts provides neither a straightforward way to add a point to a series nor a way to customize the JavaScript to create such dynamic charts. As far as I know, it has limitations (or it is outside the scope of the rCharts philosophy) for sophisticated jobs that require JavaScript customization. At this moment it only supports ...

How to update R to a new version?

An R update method proposed by Dr. Jeffrey S. Evans, University of Wyoming. Here are the method's background and details: Updating R to a new version can be difficult, so I thought that R users out there would find these R version management and configuration approaches useful. There is an R package, “installr”, that allows one, at the command line, to check the current R version and optionally: 1) download the new version, 2) launch the install, 3) move installed libraries to the new R install, 4) update libraries and 5) uninstall the previous R version. I also attached an “Rprofile.site” R GUI configuration file that will set the path for the R library and change some other settings. Following is the code that will install and require the “installr” package and run the “updateR” function with the appropriate flags. To run in Windows, right click the R icon and select "Run as administrator". This script will automatically: 1) check and download the current version of R, ...

How to get raster pixel values along the overlaying line?

One afternoon at Java City, my friend Eric and I were discussing ways to get raster pixel values along an overlaying line. The conversation encouraged me to write a quick and dirty solution. The following R code snippet illustrates one way to extract the raster values that a straight overlaying line intersects or touches, using the R raster package.

# Print the raster pixel values along the overlaying line in R.
# The line's start and end row/col (coordinates) must be provided.
library(raster)
# Create an arbitrary raster; here, color names serve as the pixel values.
r <- as.raster(matrix(colors()[1:100], ncol = 10))
# Start coordinate of a sample line
x0 = 1   # row = 1
y0 = 3   # column = 3
r[x0, y0]
# End coordinate of a sample line
x1 = 10  # row = 10
y1 = 7   # column = 7
# Easy sample line generation algorithm: a naïve line-drawing algorithm...
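The excerpt above is cut off before the line-generation step, so the following is only a sketch of how a naïve line-drawing step might look, not the post's own code. It assumes a plain character matrix `m` in place of the raster object, a hypothetical helper `line_cells()`, and a line whose start and end rows differ (a horizontal line along a single row would need separate handling).

```r
# Plain matrix of color names standing in for the raster (assumption)
m <- matrix(colors()[1:100], ncol = 10)

# Hypothetical helper: naive line drawing that steps one row at a time
# and rounds the interpolated column index to the nearest cell.
# Assumes x0 != x1.
line_cells <- function(x0, y0, x1, y1) {
  rows <- x0:x1
  cols <- round(y0 + (y1 - y0) * (rows - x0) / (x1 - x0))
  cbind(row = rows, col = cols)
}

# Cells touched by the sample line from (row 1, col 3) to (row 10, col 7)
cells <- line_cells(1, 3, 10, 7)

# Indexing a matrix with a two-column (row, col) matrix returns one value per cell
values <- m[cells]
print(cells)
print(values)
```

Stepping along rows works here because the sample line spans more rows than columns; for a line that spans more columns than rows, stepping along columns instead avoids gaps.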

Generate Euclidean distance matrix from a point to its neighboring points in R

# Generate Euclidean distance matrix from a point to its neighboring points in R
# Load sp library
library(sp)
# Create a 2D matrix of X & Y coordinates of the neighboring points
neighbours_point <- matrix(c(5, 6, 3, 5, 4, 8, 7, 10, 60, 60, 11, 12), ncol = 2)
neighbours_point
     [,1] [,2]
[1,]    5    7
[2,]    6   10
[3,]    3   60
[4,]    5   60
[5,]    4   11
[6,]    8   12
# Create a point vector with x and y coordinates from which distance should be calculated
refrence_point <- c(2, 3)
refrence_point
[1] 2 3
# Compute the distance matrix
distmat <- spDistsN1(neighbours_point, refrence_point, longlat = FALSE)
distmat
[1]  5.000000  8.062258 57.008771 57.078893  8.246211 10.816654

Enjoy!!
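For readers who want to check the numbers without the sp package, the same planar distances can be reproduced in base R. This is a cross-check sketch, not a replacement for spDistsN1; the names `pts`, `ref`, and `dists` are chosen here for illustration.

```r
# Same coordinates as in the post: column 1 = x, column 2 = y
pts <- matrix(c(5, 6, 3, 5, 4, 8, 7, 10, 60, 60, 11, 12), ncol = 2)
ref <- c(2, 3)

# Subtract the reference point from every row, then apply
# sqrt(dx^2 + dy^2) row by row
diffs <- sweep(pts, 2, ref)
dists <- sqrt(rowSums(diffs^2))
print(dists)
```

The values agree with the spDistsN1 output shown above, for example sqrt((5 - 2)^2 + (7 - 3)^2) = 5 for the first point.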

Extract Raster Values from Points

The R blog article encouraged me to write this solution for extracting raster values at points in R. In geospatial analysis, extracting the raster value at a point is a common task. If you have only a few raster files or a few points, you can extract the raster value by overlaying the points on top of the raster in ArcGIS. What will you do if you have hundreds of raster files and thousands of points? The easy solution is to use a loop in Python and ArcGIS. Is a loop efficient to use? No. Can the loop be avoided? Yes. Then how? Follow the steps:
Step 1: Create a raster stack or raster brick of your raster files using the “raster” package in R. For example: rasStack = stack(raster1, raster2,raster3 …rasterN)
Step 2: Read the point data and convert them into a spatial points data frame. Sample: pointfile.csv
Point_ID LONGITUDE LATITUDE
1 48.765863             -...

Must-know R tips for geospatial analysts

1) Merge an ESRI shapefile with an external CSV or data frame to plot the map with CSV/data frame variables

library(maptools)
library(sp)
library(shapefiles)
library(RColorBrewer)    # creates nice color schemes
library(classInt)        # finds class intervals for continuous variables
# Read files
csvvalues = read.csv("c:/csv_path")
shapefile = readShapePoly("c:/shape.shp")
# Merge data by unique ID; match() keeps the rows in the same order as the polygons
shapefile@data <- data.frame(shapefile@data, csvvalues[match(shapefile@data$ID, csvvalues$ID), ])
attach(shapefile@data)
# Define the number of classes to be mapped
nclass <- 5
# Set the color ramp that will be used to plot these classes
cols <- brewer.pal(nclass, "YlGnBu")
# Set the class breakpoints using quantiles
# Can also use equal intervals or natural breaks - see help(classIntervals)
breaks <- classIntervals(Column_name_to_be_mapped, nclass, style = "quantile")
# Based on the breakpoints and color ramp, sp...
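When attaching CSV values to a shapefile's attribute table, the joined rows need to stay in the same order as the polygons. The base R sketch below illustrates one common way to do that with match(); the two small data frames (`attrs` standing in for the attribute table and `csv` for the external file) and their values are hypothetical.

```r
# Hypothetical attribute table, in polygon order
attrs <- data.frame(ID = c(3, 1, 2), NAME = c("C", "A", "B"))

# Hypothetical external CSV values, in a different order
csv <- data.frame(ID = c(1, 2, 3), POP = c(100, 200, 300))

# For each attribute row, find the position of its ID in the CSV
idx <- match(attrs$ID, csv$ID)

# Append the CSV columns (without repeating ID) in attribute-table order
joined <- data.frame(attrs, csv[idx, "POP", drop = FALSE], row.names = NULL)
print(joined)
```

A plain merge() would also join the two tables, but it sorts the result by the key by default, which can silently pair attributes with the wrong polygons.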

Interesting cheat sheets for R beginners

To clear the console: CTRL + L
To seek help: ?command_name
To view the type: class(object_name)
To maximize the console print view: options(max.print = 999999)
To list active objects in R: ls()
To remove a single object: rm(object_name)
To remove all objects in R: rm(list = ls())
To remove all the objects except 'a': rm(list = setdiff(ls(), "a"))
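As a small sketch of the "remove everything except one object" tip, the example below works inside a scratch environment so that it does not touch the real workspace. The environment `env` and the object names are made up for illustration; the approach relies on ls() returning object names as character strings, so the exclusion is done by name with setdiff().

```r
# Scratch environment holding three hypothetical objects
env <- new.env()
assign("a", 1, envir = env)
assign("b", 2, envir = env)
assign("c", 3, envir = env)

# Remove every object whose name is not "a"
rm(list = setdiff(ls(env), "a"), envir = env)

# Only "a" is left
remaining <- ls(env)
print(remaining)
```

In an interactive session the same idea applies to the global workspace by dropping the `envir` arguments.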