Rasterizing Shapefiles in R
15 May 2014
View the source of this post
I recently needed to convert a shapefile to a raster for use in another package and wanted to share my steps here.
For this demonstration, I started with the Natural Earth Data 1:10m Physical Vectors Land shapefile and will convert it to a raster using the
A bit of searching around on the web led me to Amy Whitehead’s page on Converting shapefiles to rasters in R. The code listed there wasn’t quite what I needed but gave me the head start on figuring out what I needed to do.
Load required packages
maptools package is used to import the shapefile to a
SpatialPolygonsDataFrame and the
raster package is for rasterizing the shapefile.
Load the shapefile we’ll be rasterizing
readShapePoly function from package
maptools to read our shapefile in as a
Create a blank raster
Before we can rasterize the shapefile, we need to create a blank raster. We specifiy the number of rows and columns to control the spatial resolution of our raster. We also have to specify the spatial extent, which will determine how much of the area of our shapefile we’re rasterizing. Here, I use the
extent function from the
raster package to grab the extent of the shapefile and pass that to the
raster function. You can also specify extent manually.
If we were to plot this now, we would get an error because, by default, new rasters are created without any values. Rasters are fundamentally just like a 2-dimensional matrix (though they are stored as 1-d vectors). To get at this 2-d matrix, rasters have a slot called
data, which you can access by calling
blank_raster@data. You can get (and set) the values of the raster using the
Above, we’ve set all those values to 1 and the result is just a raster of one value (color). The values inside the raster are stored in a vector of length
nrow * ncol, so let’s set the values equal to the sequence of numbers from
Pretty cool! So we can see that raster’s values are stored going from the top-left to the bottom-right.
Rasterize the shapefile
The first line here is pretty straightforward but the second line requires some explanation. The
raster function tries to assign values to the resulting raster cells. Cells inside any of the polygons of the shapefile will have some value, while cells not included in any of the polygons of the shapefile will have the value
NA. For some shapefiles, values are picked up from the attributes embedded in the shapefile and will result in variation in values (colors). In my case, I want to force all cells corresponding to land to be equal to 1 and leave the rest
Plot the result
Rasters can be plotted directly with the
plot function because the
raster package implements its own
plot method. Here, we also add the shapefile back on top of the raster to see how we did.
Looks pretty good! The filled-in yellow area is the new raster. Let’s zoom in to see what’s going on.
Not great. The raster cells are roughly outlining the land but end up being pretty poor approximations in places. Clearly, we might like to improve the spatial resolution of our raster but this is a pretty good start.
Let’s the spatial resolution up to see if we can do better.
Bonus: Effect of different raster resolutions
In the above example, I used a 100x100 grid for the raster. Below, I test 20x20, 100x100, 500x500, and 1000x1000 grids together to explore the effect of that part of the process. What resolution do we need to adequately describe the shape of the land?
You can see that, by 500x500, we’re getting a raster that looks pretty close to the underlying shapefile.