Securing R on Red Hat Enterprise Linux

Update:  Check out the next post in this series for help with the GTK 2.8.0 dependency requirement for the common R package RGtk2.

So, as I wrote earlier, I have a group of users who need to use the statistical computing language, R.  R presents an interesting security challenge because it permits the installation and execution of R packages by the users.  Users of R, in turn, expect the atypical privilege of installing and manipulating software on their systems without administrative oversight.

So, if you too work in an environment where you need to actually care about security and you get a new team of R users who need a Red Hat Enterprise Linux system configured for their use, you too might encounter a lot of dissatisfaction from that team when it comes to being able to install and execute software packages.  What follows is my advice for adding some security but still having the flexibility to allow the users to install software.

  1. Follow the Center for Internet Security’s benchmark for Red Hat Enterprise Linux (or some other trusted standard).  Lock your system down and make it secure by default.  Only then should you begin adding features (such as R).
    1. One of the modifications you will likely make to secure your system is to create a separate file system for /tmp and to mount it with the noexec option (this is an extremely valuable threat mitigation technique, and I recommend it highly).  R, however, builds and configures R packages under /tmp by default, so once noexec is in place, package installation will fail.
      1. To solve this, simply have your users configure their .bashrc files (or, if they prefer other shells, the analogous file) to export the TMPDIR variable (used by R) with the path of a directory inside their home directory:
        1. mkdir ~/Rmake
        2. echo 'export TMPDIR="$HOME/Rmake"' >> ~/.bashrc
  2. Don’t allow the users to execute R as root (duh).  This will, without some sort of really horrible exploit allowing the improper elevation of privilege, constrain the damage the R packages can do (should any ever be malicious, that is).
    1. To allow users to install R packages to the central, default location (instead of to their own personal collections in /home), create a group for the R users (“rusers” is used below as an example) and execute the following commands:
      • chown root.rusers /usr/lib64/R/library
      • chmod 2775 /usr/lib64/R/library
        • The SGID bit is used to ensure that files and directories written into the library location by users will be owned by the rusers group.
      • chown root.rusers -R /usr/share/doc/R-3.1.1/
      • chmod 664 -R /usr/share/doc/R-3.1.1/
      • chmod 2775 /usr/share/doc/R-3.1.1 /usr/share/doc/R-3.1.1/html /usr/share/doc/R-3.1.1/manua
    2. Allow the users access to yum through sudo (optional – if your users need to install RHN/EPEL packages as well as R packages, this may be necessary, though atypical for secure environments – I know)
  3. Configure exhaustive and rigorous outbound firewall rules (RHEL has none by default) to constrain the system’s access to other systems on your network.  Disallow all traffic save for that which is absolutely required for the system’s functionality (monitoring systems, LDAP, Kerberos, DNS, proxy, etc.).
  4. Use a proxy server to allow access to only those repositories and Internet resources approved by the administration/security team:
    1. The problem here, of course, is that the users want to be able to download software on the fly, and the system needs access to RHN through Red Hat Subscription Manager (if you’re using RHN Classic, time to migrate to RHSM!).
    2. Select a proxy for use (squid is a fine, simple solution).
    3. Now, allow access to the appropriate locations:
      1. RHN
        1. As noted over at Red Hat’s access site, RHN uses the Akamai network to distribute packages for their repositories.  This means for us that there is no single IP address (or even consistent range of IP addresses) to which we can allow outbound traffic from the system in order to allow yum to work properly.
          1. Because iptables (and basically any packet-based firewall utility) is not capable of limiting traffic based on domain names, this means we need a proxy server to allow access from the system hosting R to only those domains on the Internet which we designate as acceptable.
        2. Use a proxy (squid is recommended) to allow communication from the system hosting R to subscription.rhn.redhat.com:443, cdn.redhat.com:443, and akamaiedge.net:443.  Presto, RHN updates may be applied through yum!
          1. Don’t forget to add the proxy line (using your own hostname and port number, mind you) to /etc/yum.conf:  proxy=http://proxyhost:proxyport
      2. EPEL
        1. Since the R packages (the RPMs, not the internal R packages) are distributed through EPEL in the RHEL environment, you’ll probably want to configure your proxy to allow access to EPEL through yum.
        2. As above, configure your proxy to allow communication from the system hosting R to download.fedoraproject.org and an EPEL mirror of your choice.  Configure the selected mirror in the /etc/yum.repos.d/epel.repo file by uncommenting the baseurl option under the [epel] heading and changing its content to the URL of the mirror you have selected.  Then, comment out the mirrorlist option.
          1. If your users are granted access to yum via sudo, they may now install packages from the single EPEL mirror you specified.
      3. R Repositories
        1. And finally, you need to allow your users access to the standard R repository locations, including a specified CRAN mirror.
        2. Configure your proxy to allow communication from the system hosting R to http://www.bioconductor.org, www.stats.ox.ac.uk/pub/RWin, http://www.omegahat.org/R, R-Forge.R-project.org, and http://www.rforge.net.  Select a CRAN repository to which you will allow access (if you execute install.packages("ggplot2") from within the R shell, you will be provided a list of mirrors from which to select) and add this to your proxy system’s configuration as well.
        3. Configure the standard Rprofile (/usr/lib64/R/library/base/R/Rprofile) to specify the chosen CRAN mirror by default.  To do this, add the following lines to the Rprofile file (I’ve chosen the nih.gov mirror as an example and I’ve put in yourproxy:yourproxyport rather than your actual proxy server address and port):
  5. Perform regular audits of the user activity on the system.
    1. I’d perhaps write a small script to obtain the latest changes to the R packages directory (/usr/lib64/R/library) and check on the yum history.
    2. Execute your regular system audits as well, of course.
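
For the audit in step 5, something along these lines might serve as a starting point. It is only a sketch: the function name recent_r_packages and the seven-day default are my own placeholders, and it relies on directory modification times, which should change when a package is installed or replaced but will not catch every kind of tampering.

```shell
#!/bin/sh
# Hypothetical helper: list R package directories modified in the last N days.
#   $1 = R library directory (for example /usr/lib64/R/library)
#   $2 = number of days to look back (assumed default: 7)
recent_r_packages() {
    lib_dir=$1
    days=${2:-7}
    # -mindepth/-maxdepth 1 limits the search to the package directories
    # themselves; -mtime "-$days" keeps those modified fewer than N days ago.
    find "$lib_dir" -mindepth 1 -maxdepth 1 -type d -mtime "-$days" \
        -exec basename {} \; | sort
}
```

Run against /usr/lib64/R/library and read alongside the output of sudo yum history list, this gives a rough picture of what has changed on the system recently.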

So there you have it.  Now your users can install R packages from CRAN and other standard R repositories.  They can also install RPM packages obtained from RHN and EPEL on the fly, without your intervention.  Despite these broad allowances, the users lack root access to the system, having been granted only the administrative ability to use yum through sudo and the atypical, yet necessary, ability to install R packages from designated repositories.
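
If your users hit build failures after all this, the noexec /tmp workaround from step 1 is the first thing I would check. Here is a slightly more defensive sketch of those two commands. The function name setup_r_tmpdir is hypothetical, and it assumes a Bourne-style shell startup file; it creates the directory if needed and appends the export line only once, so running it repeatedly is harmless.

```shell
#!/bin/sh
# Hypothetical helper: point TMPDIR (used by R for package builds) at a
# directory outside the noexec /tmp.
#   $1 = directory to use for builds (for example "$HOME/Rmake")
#   $2 = shell startup file to update (for example "$HOME/.bashrc")
setup_r_tmpdir() {
    build_dir=$1
    rc_file=$2
    mkdir -p "$build_dir" || return 1
    line="export TMPDIR=\"$build_dir\""
    # Append only when the exact line is not already present.
    grep -qxF "$line" "$rc_file" 2>/dev/null || printf '%s\n' "$line" >> "$rc_file"
}
```

A user would call it as setup_r_tmpdir "$HOME/Rmake" "$HOME/.bashrc" and then start a new shell before launching R.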

Happy R-ing!


2 Responses to Securing R on Red Hat Enterprise Linux

  1. Lance Ellinghaus says:

    Thank you for the information. What do you recommend for sites that do not allow any outbound traffic, thus RHN and CRAN are not accessible from the servers. No proxy access is allowed.

    • Well… what is your organization’s proposed solution for acquiring R packages? Are you able to host a local mirror of the CRAN repository? I’m not sure what security problem is being solved by preventing any access at all to the repositories on the Internet unless your security team intends to perform a full evaluation of the repository code in a local form (which is probably not the intent, since that would be a pretty herculean undertaking). Are they expecting you to simply acquire the CRAN data from outside of your local network and then bring it in with physical media or something like that?

      My professional opinion would be that limited proxy-based access to a trustworthy CRAN mirror would be the way to go. If your organization doesn’t support that, you’re going to have to ask them how they would prefer that you acquire the software hosted in the CRAN repository. I’m presuming they’re not telling you that you simply can’t use the software at all, and if they aren’t, again, I don’t know what problem they’re trying to solve by preventing you from having any outbound access to a trusted repository. Without it, you’re just going to be inconveniencing yourself, as far as I can tell. You can still do something like acquiring the software from outside your organization’s network and then physically bringing media into your organization, but that doesn’t seem to mitigate any serious threats, to me.
