Developing Software

The RCE provides many ways to develop and test your own code using common languages, editors, and source code utilities.

Anaconda Python

Overview

We offer Python 2.7 and 3.6 through the Anaconda environment manager. The benefit of using the Anaconda environment is that it is built with data science in mind: popular Python modules are already included in the environment, module versions are maintained by Anaconda for compatibility, and researchers can install additional modules to their home directory at any time. You can also create your own Python environments using Conda.

You can read about Anaconda and Conda at https://docs.continuum.io/anaconda/#anaconda-navigator-or-conda.
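
Because many modules already ship with the Anaconda environment, it can be worth checking what is present before installing anything into your home directory. The following is a minimal sketch, assuming Python 3 (Anaconda3). The function name missing_modules and the module names in the example are illustrative and are not taken from the RCE itself.

```python
import importlib.util


def missing_modules(names):
    """Return the top-level module names that cannot be found."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    # Example module names only; substitute the ones your project needs.
    wanted = ["numpy", "pandas", "scipy"]
    missing = missing_modules(wanted)
    if missing:
        print("Not found:", ", ".join(missing))
    else:
        print("All requested modules are available.")
```

Anything reported as not found can then be installed with pip from the Anaconda shell described below.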

Anaconda Terminology

  • The "Shell" opens a new command-line interface (CLI) in which the selected version of Anaconda (2 or 3) is the default Python environment. For example, if Anaconda3 is selected, running "python" invokes Python 3.6, and "pip" installs modules for Python 3.6.
  • The "Navigator" is a desktop graphical user interface (GUI) that allows you to launch applications and easily manage conda packages, environments and channels without using command-line commands.

Running Anaconda

Note: The Anaconda GUI and CLI are available only on RCE exec nodes, and cannot be run on the login node.

There are several ways of invoking Anaconda:

  1. Launch an RCE Powered Shell
  2. Run "anaconda3-shell"
  3. This gives you a bash shell in which running "python" executes Anaconda Python 3.6 with the Anaconda libraries.

-or-

  1. Submit a batch job with Executable and Arguments as follows:
    - Executable = /usr/local/bin/python3
    - Arguments = script.py
  2. Your script is then executed by Anaconda Python 3.6 with the Anaconda libraries.

-or-

You can also combine the approaches above, for example by opening an RCE Powered Shell and running a Python script directly:

python3 ~/script.py

The Anaconda 2 equivalents of these commands are:

  • anaconda2-shell
  • python2
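
To confirm which interpreter actually runs a script, whether from a shell or a batch job, the script itself can report it. Below is a hedged sketch of what a file such as ~/script.py might contain. The function name describe_interpreter is hypothetical, and the Anaconda check is only a heuristic based on the version string and the interpreter path.

```python
from __future__ import print_function  # keeps the script usable under python2

import sys


def describe_interpreter():
    """Return basic facts about the interpreter executing this script."""
    return {
        "version": "{}.{}.{}".format(*sys.version_info[:3]),
        "executable": sys.executable,
        # Heuristic only: Anaconda builds often mention the distribution
        # in sys.version or live under a path containing "conda".
        "looks_like_anaconda": (
            "anaconda" in sys.version.lower()
            or "conda" in (sys.executable or "").lower()
        ),
    }


if __name__ == "__main__":
    for key, value in sorted(describe_interpreter().items()):
        print("{}: {}".format(key, value))
```

Running it with python3 and then python2 should show a different version each time if both Anaconda environments are set up as described above.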

Tip: Using Anaconda via SSH

  1. Log in to an RCE desktop session
  2. Select Applications > RCE Powered Applications > Anaconda Shell
    1. Enter your desired CPU and RAM
  3. Get the Condor job ID
    1. The job ID is listed in the top left-hand side of the window toolbar
      -or-
    2. Select Applications > System Tools > Terminal
      1. Run condor_q <username>
  4. Close your RCE desktop session (e.g. close your browser)
  5. SSH into the RCE from your local computer
    1. Execute condor_ssh_to_job <job_id>
    2. You should be within your chosen Anaconda environment

Creating R Modules

Building R modules in the RCE

The IQSS Data Science team is putting the finishing touches on a new R package build system that uses the Jenkins CI platform and GitHub. Check back for more information on the new Rbuild platform.

Programming Languages

Common programming languages available in the RCE include:

  • C, C++
  • Java
  • Perl (5.10, 5.16)
  • Python (2.6, 2.7, 3.6)
  • R
  • Ruby
  • Shell

Programming Tools and Utilities

Code/text editors available:

  • Emacs
  • Eclipse
  • Gedit
  • Bluefish
  • Kwrite
  • Vim

Tools to interact with a number of well-known source code repositories:

  • git (to interface with GitHub, Gitorious or private git repos)
  • Subversion (svn)
  • CVS

Using Current Development Tools

The RCE is built with stability in mind. If you need a newer version of GCC or similar development tools, we offer Devtoolset via Software Collections. You can enable the tools from a Terminal:

scl enable devtoolset-4 bash

If you need these tools available on the cluster (e.g. to compile an R package), start an RCE Shell from the Applications > RCE Powered Applications menu. From there, enable the devtoolset as above, then launch the appropriate statistical application (e.g. R or xstata-mp). When you've finished, type exit to leave the devtoolset shell.

For a full list of updated packages provided by devtoolset-*, please see http://mirror.centos.org/centos/6/sclo/x86_64/rh/devtoolset-4/.
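
After enabling the devtoolset, one way to confirm that the newer compiler is the one being picked up is to ask it for its version. The sketch below does this from Python 3. The function name compiler_version is hypothetical, and it assumes the compiler prints its version on the first line of its --version output, as GCC does.

```python
import subprocess
from shutil import which  # available in Python 3.3 and later


def compiler_version(compiler="gcc"):
    """Return the first line of `<compiler> --version`, or None if not on PATH."""
    if which(compiler) is None:
        return None
    output = subprocess.check_output([compiler, "--version"],
                                     universal_newlines=True)
    lines = output.splitlines()
    return lines[0] if lines else None


if __name__ == "__main__":
    print("gcc:", compiler_version("gcc"))
    print("g++:", compiler_version("g++"))
```

Running this once in a plain Terminal and once inside the devtoolset shell should make any difference in compiler version visible.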

Below are two examples of using the Software Collections Developer Toolset:

Install R package xgboost

Installing "xgboost" requires compilation using a newer version of GCC than is supported by default on the RCE. However, you can enable the software collections developer tools to use a newer version of GCC.

- In ~/.R/, create a file named Makevars with these contents:
CXX14 = g++ -std=c++1y
CXX14FLAGS += -fPIC

- Start an RCE Powered Shell and enter the following:
scl enable devtoolset-4 bash
R  # or rstudio; both work
chooseCRANmirror(81)
# Pick a mirror if prompted (72 was used here, but any mirror should work)
install.packages("xgboost")
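
The Makevars file from the first step can also be created from a script rather than by hand. The following is a minimal sketch under stated assumptions: the function name write_makevars is hypothetical, and the file contents are exactly the two lines listed above. It refuses to overwrite an existing Makevars so that any settings you already have are not lost.

```python
import os

# The two settings listed in the xgboost instructions above.
MAKEVARS_LINES = [
    "CXX14 = g++ -std=c++1y",
    "CXX14FLAGS += -fPIC",
]


def write_makevars(r_dir=None):
    """Create <r_dir>/Makevars with the settings above; never overwrite."""
    if r_dir is None:
        r_dir = os.path.join(os.path.expanduser("~"), ".R")
    if not os.path.isdir(r_dir):
        os.makedirs(r_dir)
    path = os.path.join(r_dir, "Makevars")
    if os.path.exists(path):
        raise IOError("{} already exists; edit it by hand".format(path))
    with open(path, "w") as handle:
        handle.write("\n".join(MAKEVARS_LINES) + "\n")
    return path


# Usage (creates ~/.R/Makevars when called without arguments):
# write_makevars()
```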

-----
Install R package lme4

Installing "lme4" requires compilation using a newer version of GCC than is supported by default on the RCE. However, you can enable the software collections developer tools to use a newer version of GCC.

- Start an RCE Powered Shell and enter the following:
scl enable devtoolset-4 bash
R

chooseCRANmirror(81)
# Pick a mirror if prompted (72 was used here, but any mirror should work)

install.packages("minqa")

# Install the archived nloptr 1.2.1 release from source
packageurl <- "https://cran.r-project.org/src/contrib/Archive/nloptr/nloptr_1.2.1.tar.gz"
install.packages(packageurl, repos = NULL, type = "source")

install.packages("lme4")
library("lme4")