Installing Modules

Many programming languages and applications have extensions (also known as add-ons, modules, or packages) that can be installed in a user's home directory. See the list to the left for applications that allow users to install add-ons.

Python Modules & Environments

Using Conda

Conda is a package manager application that quickly installs, runs, and updates packages and their dependencies. It can query and search the package index and current installation, and install and update packages into existing conda environments. Conda is only aware of a subset of Python packages, and is not meant as a replacement for pip.

A virtual environment is a named, isolated, working copy of Python that maintains its own files, directories, and paths so that you can work with specific versions of libraries or Python itself without affecting other Python projects. Conda is also an environment manager application. For example, you may have one environment with NumPy 1.7 and its dependencies, and another environment with NumPy 1.6 for legacy testing. If you change one environment, your other environments are not affected. You can easily activate or deactivate (switch between) these environments.

For example, to create a Python 3.6 environment with Pandas 0.19 and its dependencies, open the Anaconda shell from the RCE applications menu, then issue these commands:

conda create --name mypandas019 python=3.6 pandas=0.19
source activate mypandas019

You can find full documentation for using Conda at https://conda.io/docs/using/envs.html.

We also have additional documentation for using Anaconda Python in the RCE.

If you install a module using conda, you can make it available to a limited set of conda environments if you so choose. If you install a module with pip using the --user option described below, it is placed in your home directory and automatically becomes available in all conda environments for that version of Python.
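The reason a pip --user install is shared across environments is that it goes into a per-user directory keyed by the Python version rather than into any one environment. The following is a minimal sketch, using only the standard library, that reports where that directory is for the interpreter you are running. The function name user_site_summary is made up for illustration; on Linux the directory is typically under ~/.local/lib/pythonX.Y/site-packages, but the exact path on the RCE is an assumption you should confirm by running it.

```python
import site
import sys


def user_site_summary():
    """Return (Python "major.minor" version, per-user package directory)."""
    version = "{}.{}".format(sys.version_info.major, sys.version_info.minor)
    return version, site.getusersitepackages()


if __name__ == "__main__":
    version, user_site = user_site_summary()
    print("Python version:", version)
    # sys.prefix is the root of the interpreter in use, i.e. the active environment.
    print("Interpreter prefix:", sys.prefix)
    # Packages installed with "pip install --user" normally land here.
    print("Per-user package directory:", user_site)
```

Running this in two environments that use the same Python version should print different interpreter prefixes but the same per-user package directory.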

Using pip

As an RCE user, you have the ability to install Python modules locally to your home directory and use them in your projects.

  1. Determine the required Python version: You need to determine which version of Python you'd like to develop with. Currently, Python 2.7 is available via Anaconda 2, and Python 3.6 is available via Anaconda 3.
  2. Load the appropriate Anaconda environment: Open an Anaconda Shell via the RCE Powered Applications menu. Also see Working With Anaconda Python.
  3. Search for the Python module: Each version of Python installed on the RCE maintains its own module path. Determine whether a Python module is installed for a specific version of Python by using pip to list the packages installed for that version.
    pip list | grep $MODULE
    Example: pip list | grep simplejson
  4. Install your module: If your module is already installed for the Python version you need, you're done. If the module is not installed, install it locally to your home directory.
    pip install $MODULE --user
    Example: pip install simplejson --user
  5. Can't find your module? If you're unable to locate your module using pip, you may be searching for the wrong module name. For example, if you decided to install a module because import simplejson did not work from a Python interactive console, the name you tried to import may not be the name of the module. Python class names often differ from Python module names. Try using the pip search feature.
  6. Still not found? Try searching the PyPI repository, the official Python module repository, or use Google. Very rarely, some modules require that you manually compile Python packages using setup.py.
  7. Need help? Open a ticket by sending an email to support@help.hmdc.harvard.edu.
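As a complement to pip list, you can ask the Python interpreter itself whether it can find a module, which tests the import name rather than the package name. The sketch below is illustrative only: the helper name is_importable is hypothetical, and it relies on importlib.util.find_spec, which requires Python 3.4 or later (so it applies to the Anaconda 3 environment, not Python 2.7).

```python
import importlib.util


def is_importable(module_name):
    """Return True if a top-level module can be found by this interpreter.

    Only top-level names are handled; a dotted name such as "a.b" raises
    ModuleNotFoundError when the parent package "a" is missing.
    """
    return importlib.util.find_spec(module_name) is not None


if __name__ == "__main__":
    # "json" ships with Python; "simplejson" is the example used in the steps above.
    for name in ("json", "simplejson"):
        status = "available" if is_importable(name) else "not found"
        print(name, status)
```

If a module shows as "not found" here but appears in pip list, the import name and the package name probably differ.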

Installing modules from source code

Here's an example of installing the Python module 'rtree' by compiling it from source.

The library libspatialindex 1.7 is required for package 'rtree'. Unfortunately, only version 1.6 is available for CentOS 6, but you can try building a newer version of the library in your home directory.

  1. wget http://download.osgeo.org/libspatialindex/spatialindex-src-1.8.5.tar.gz
  2. tar -xzf spatialindex-src-1.8.5.tar.gz
  3. cd spatialindex-src-1.8.5
  4. ./configure --prefix=$HOME
  5. make
  6. make install
  7. Set the environment variables:
    1. export SPATIALINDEX_LIBRARY=~/lib/libspatialindex.so
    2. export SPATIALINDEX_C_LIBRARY=~/lib/libspatialindex_c.so
  8. pip install git+https://github.com/Toblerity/rtree.git --user

Python has trouble finding the libspatialindex library. This is a known bug for which a specific fix was implemented. You can find more in the discussion here: https://github.com/Toblerity/rtree/issues/56.

Unfortunately, the details of how to use the fix are poorly documented. The name of the environment variable appears to have changed since the discussion, where it is referred to as SPATIALINDEX_LIBRARY_PATH; however, the sequence of commands above seems to work.
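Because the install depends on two environment variables pointing at the freshly built libraries, it can help to confirm they are set correctly before running pip. The following is a rough sketch, not part of rtree itself: the function name missing_libraries is hypothetical, and it only checks that each variable is set and names an existing file, not that the library actually loads.

```python
import os


def missing_libraries(env=os.environ,
                      names=("SPATIALINDEX_LIBRARY", "SPATIALINDEX_C_LIBRARY")):
    """Return the variable names that are unset or point to a missing file."""
    missing = []
    for name in names:
        path = env.get(name)
        # expanduser covers the case where a literal "~" was stored in the variable.
        if not path or not os.path.isfile(os.path.expanduser(path)):
            missing.append(name)
    return missing


if __name__ == "__main__":
    problems = missing_libraries()
    if problems:
        print("Not set or file not found:", ", ".join(problems))
    else:
        print("Both spatialindex libraries were found.")
```

Run it in the same shell session in which you exported the variables, since exports do not carry over to new sessions unless you add them to your shell startup file.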

R

Installing an R Package

The RCE provides almost all stable libraries maintained in the Comprehensive R Archive Network (CRAN), and others. For a full list refer to "Which R packages are available?"

If you would like to install a library separately for your own personal use, follow these instructions:

  1. In R, type library(<package_name>).

    For example, to install R Commander, type the following:

    > library(Rcmdr)

    R prompts you with a warning if the package that you chose to install uses other packages that are not installed already.

  2. To install missing packages on which your target package depends:

    1. Click Yes to continue. The Install Missing Packages window is displayed.

    2. Click OK to continue. R prompts you to select a mirror site from which to download the packages' sources.

  3. Select a site from which to download the sources, and then click OK.

    The dependent packages and your target package are now installed. If it is an executable, the function is executed.

Alternatively, you can install packages from within R like this:

install.packages("package_name")

If the R package fails to compile, you may need a newer version of GCC. See this page for using updated versions of developer tools.

Stata

Start Stata on the RCE by going to Applications->RCE Powered Applications->RCE Powered Stata.

Once Stata is running, you can do one of the following:

  • Install a command via ssc by submitting the following command:

ssc install <package-name>

For example: ssc install outreg

For more information on ssc, see this Stata help page.

  • You can also install from a 3rd party:

net install <package-name>, from(SOME.SITE.EDU/package-name) replace

For example, to install a package named rdrobust, you can submit:

net install rdrobust, from(http://www-personal.umich.edu/~cattaneo/rdrobust/stata) replace

Storing Anaconda files in a project space

By default, Anaconda environments and their packages reside in ~/.conda in your home folder, and thus count against the limited 2.5GB quota on users' home folders.
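To see how much of the quota your environments already use, you can total the file sizes under ~/.conda. The sketch below is one way to do that under stated assumptions: the function name directory_size_bytes is hypothetical, the quota is taken as 2.5 * 1024**3 bytes (the exact accounting used on the RCE may differ), and hard-linked files are counted once because conda commonly hard-links package files between its cache and environments.

```python
import os
import stat


def directory_size_bytes(root):
    """Sum the sizes of regular files under root, counting each inode once."""
    total = 0
    seen = set()
    for dirpath, _dirnames, filenames in os.walk(root):
        for filename in filenames:
            info = os.lstat(os.path.join(dirpath, filename))
            key = (info.st_dev, info.st_ino)
            if key in seen:
                continue  # hard link to a file that was already counted
            seen.add(key)
            if stat.S_ISREG(info.st_mode):  # skip symlinks and special files
                total += info.st_size
    return total


if __name__ == "__main__":
    quota_bytes = 2.5 * 1024 ** 3  # assumed interpretation of the 2.5GB quota
    conda_dir = os.path.expanduser("~/.conda")
    if os.path.isdir(conda_dir):
        used = directory_size_bytes(conda_dir)
        print("{:.2f} GB used of {:.2f} GB".format(used / 1024 ** 3,
                                                   quota_bytes / 1024 ** 3))
    else:
        print(conda_dir, "does not exist yet")
```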

Before you set up a virtual environment and populate it with packages, you may want to create a .conda folder in a project space and redirect ~/.conda to it:

Make a new .conda folder inside a project space:
cd ~/shared_space/nameoftheprojectspace
mkdir .conda

Then, create a symlink from your home folder to the new location:
cd ~
ln -s ~/shared_space/nameoftheprojectspace/.conda ~/.conda

This should then let you create an environment as described in https://rce-docs.hmdc.harvard.edu/book/anaconda-python and populate it with the packages you need. If you run out of room to hold your packages, send us a request to increase the size of the project space.
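The mkdir and ln -s steps above can also be sketched in Python, with one extra safeguard: if ~/.conda already exists, ln -s would create the link inside it rather than replacing it, so the sketch refuses to continue in that case. The function name redirect_conda_dir is hypothetical, and the paths in the usage note are the same placeholders used in the commands above.

```python
from pathlib import Path


def redirect_conda_dir(home, project_space):
    """Create <project_space>/.conda and symlink <home>/.conda to it."""
    home, project_space = Path(home), Path(project_space)
    target = project_space / ".conda"
    link = home / ".conda"
    # Refuse to touch an existing folder or link; move it aside by hand first.
    if link.exists() or link.is_symlink():
        raise FileExistsError("{} already exists; move it aside first".format(link))
    target.mkdir(parents=True, exist_ok=True)
    link.symlink_to(target, target_is_directory=True)
    return link
```

A call would look like redirect_conda_dir(Path.home(), Path.home() / "shared_space" / "nameoftheprojectspace"), with nameoftheprojectspace replaced by your actual project space.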