SparkR on EC2 - Up and Running in 30 Minutes

Motivation: The purpose of this post is to walk through spinning up a Spark cluster on Amazon Web Services EC2 servers and using R to interface with that cluster. The Apache Spark distribution comes with an EC2 script to do this, which was extremely helpful, but I had a hard time getting the newly released SparkR to […]

dplyr with PostgreSQL

A common complaint about R is that the size of your data is limited by the amount of memory available on your machine. One solution is to spin up a cloud server with 224 GB of RAM and install R, if that is large enough for your data. Another solution is to load your data […]

Installing a Specific Version of R on Ubuntu

I blindly accepted all of the updates on my Ubuntu machine, and one of them upgraded R to version 3.1.3, which doesn't yet have all of the packages I want to use ported to it. So I had to roll back my installation of R. Here is how you do that... Uninstall […]

OpenCPU Server on AWS: Accessing R via an API

Motivation and Use Case: Developer - Suppose you're an app developer and want to take advantage of code someone wrote in R. You have no interest in learning R, or in whatever that fancy statistical algorithm does. You just want to call an API and get the answer. Black boxes are fine with you, provided that black box […]

Lazy Man's Install of RStudio Server on EC2

Let’s say you have a large job to process with R, but the hardware on your laptop just isn’t cutting it. One solution is to spin up an AWS EC2 server, install R, and run your process on a temporary server that you only need for a short time. Currently, AWS has a […]