
Working with R

From time to time you may need to work with R scripts, as many bioinformatics and computational biology packages are written in R (citation needed). On your local computer, working with R is fairly easy: you can install RStudio Desktop and run your R script there. But what if a script needs more memory than your computer has? You can rely on Maxwell. Working with R on the HPC is straightforward: write a ready-to-run R script and run it with the Rscript command.

Rscript myscript.R
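
As a rough illustration of what a ready-to-run script can look like, the sketch below writes a tiny R script and runs it with Rscript, passing two command-line arguments. The temporary directory, the script contents, and the arguments `input.csv` and `42` are made up for the example; `commandArgs(trailingOnly = TRUE)` is base R's way of reading the arguments that follow the script name.

```shell
# Write a tiny R script into a temporary directory (illustrative contents).
workdir="$(mktemp -d)"
cat > "$workdir/myscript.R" <<'EOF'
# Read any arguments given after the script name on the command line.
args <- commandArgs(trailingOnly = TRUE)
cat("Received", length(args), "argument(s):", args, "\n")
EOF

# Run it non-interactively; skip gracefully when R is not on the PATH.
if command -v Rscript >/dev/null 2>&1; then
    Rscript "$workdir/myscript.R" input.csv 42
else
    echo "Rscript not found; activate an environment that provides R first"
fi
```

Reading inputs from the command line like this lets the same script be reused on different files without editing it.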

With VS Code you can do more than run the script: you can also work interactively, which eases development and analysis (much like using RStudio, but on Maxwell and with the tooling that VS Code extensions offer). Here's how:

  1. Open your terminal (ctrl + `), then create and activate a new conda/mamba environment

    [r04mr23@maxlogin1(maxwell) ~]$ mamba create -n myrenv
    [r04mr23@maxlogin1(maxwell) ~]$ mamba activate myrenv
    (myrenv) [r04mr23@maxlogin1(maxwell) ~]$
  2. Install the R and R Debugger extensions for VS Code

  3. For the VS Code R extension to work, install the following packages.

    Mandatory:

    • R: The R programming language itself
    • r-httpgd: Required for the VS Code interactive plot viewer to work
    • radian: A better R terminal
    • r-languageserver: Required for VS Code to provide code completion, diagnostics, formatting and many more features
    • r-jsonlite: A relatively fast JSON parser for statistical data and the web
    • r-irkernel: R kernel for Jupyter

    Not mandatory, but used in this guide:

    • r-ggplot2: Data visualisation
    • r-dplyr: Data manipulation

    (myrenv) [r04mr23@maxlogin1(maxwell) ~]$ mamba install R radian r-httpgd r-lang r-jsonlite r-languageserver r-irkernel r-ggplot2 r-dplyr
  4. Open the Settings editor by pressing ctrl/command + , (or Code -> Settings... -> Settings), then click the Open Settings (JSON) icon in the top-right corner to open settings.json

  5. Update settings.json file by adding the following parameters

    "r.rterm.linux": "/uoa/home/r04mr23/sharedscratch/.conda/envs/myrenv/bin/radian",
    "r.alwaysUseActiveTerminal": true,
    "r.sessionWatcher": true,
    "r.rpath.linux": "/uoa/scratch/users/r04mr23/.conda/envs/myrenv/bin/R"

    Replace the values of r.rterm.linux and r.rpath.linux with the paths where radian and R are installed in your environment. If you're not sure, check from inside the activated environment with the following commands

    (myrenv) [r04mr23@maxlogin1(maxwell) ~]$ which radian | xargs readlink -f
    /uoa/scratch/users/r04mr23/.conda/envs/myrenv/bin/radian
    (myrenv) [r04mr23@maxlogin1(maxwell) ~]$ which R | xargs readlink -f
    /uoa/scratch/users/r04mr23/.conda/envs/myrenv/bin/R

    Copy the path, update your settings.json, save it (ctrl/command + s) and close it

  6. Turn on the session watcher, which allows communication between the VS Code R extension and the live R session, by doing the following

    • Edit the .Rprofile in your home directory
      (myrenv) [r04mr23@maxlogin1(maxwell) ~]$ vim ~/.Rprofile
    • Append the following code to it
      if (interactive() && Sys.getenv("RSTUDIO") == "") {
        source(file.path(Sys.getenv(if (.Platform$OS.type == "windows") "USERPROFILE" else "HOME"), ".vscode-R", "init.R"))
      }
    • Reload your terminal
  7. In a directory of your choice, create a new R script called test.R and copy the code below into it

    test.R
    print("Test Maxwell R script")

    x <- "Tester"
  8. Launch the terminal if you haven't (ctrl + `) and run radian

    (myrenv) [r04mr23@maxlogin1(maxwell) ~]$ radian
    R version 4.3.3 (2024-02-29) -- "Angel Food Cake"
    Platform: x86_64-conda-linux-gnu (64-bit)

    r$>
  9. Attach vscode-R to the current session (this lets you keep track of the variables you create during the R session, among other things)

    r$> .vsc.attach()

    If successful, the status indicator changes from R: (not attached) to R: followed by the version you're using

  10. Now you can see the R: workspace (namespaces, variables etc.) in the EXPLORER tab. You can customise the layout by dragging the workspace around

  11. VS Code R lets you preview a dataset that you have loaded. For example, load a data frame called midwest from this ggplot tutorial

    midwest <- read.csv("http://goo.gl/G1K41K")

    After you run it, the midwest variable will appear in the Workspace. Clicking the magnifier icon next to it opens the data frame in a viewer

  12. With VS Code R we can also preview plots. For example, the following generates a scatter plot using ggplot2

    library(ggplot2)
    ggplot(midwest, aes(x=area, y=poptotal)) + geom_point()

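
The hand edits in steps 5 and 6 can also be scripted. The sketch below is one possible way to do that, not part of the vscode-R tooling: `resolve_bin`, `print_r_settings` and `add_vscode_r_hook` are hypothetical helper names. The first two reuse the `which ... | readlink -f` idea from step 5 to print the path settings for settings.json, and the third appends the step 6 hook to .Rprofile only if it is not already there.

```shell
# Resolve an executable on the PATH to its real location, following symlinks.
resolve_bin() {
    local found
    found="$(command -v "$1")" || return 1
    readlink -f "$found"
}

# Step 5: print the two path settings to paste into settings.json.
print_r_settings() {
    local radian_path r_path
    radian_path="$(resolve_bin radian)" || { echo "radian not found" >&2; return 1; }
    r_path="$(resolve_bin R)" || { echo "R not found" >&2; return 1; }
    printf '"r.rterm.linux": "%s",\n' "$radian_path"
    printf '"r.rpath.linux": "%s"\n' "$r_path"
}

# Step 6: append the session watcher hook unless the file already has it.
# The target defaults to ~/.Rprofile; pass another path to try it on a copy.
add_vscode_r_hook() {
    local target="${1:-$HOME/.Rprofile}"
    if grep -qsF '.vscode-R' "$target"; then
        return 0
    fi
    cat >> "$target" <<'EOF'
if (interactive() && Sys.getenv("RSTUDIO") == "") {
  source(file.path(Sys.getenv(if (.Platform$OS.type == "windows") "USERPROFILE" else "HOME"), ".vscode-R", "init.R"))
}
EOF
}

# With the environment active:
print_r_settings || echo "Activate the environment first (mamba activate myrenv)"
# add_vscode_r_hook    # uncomment to update ~/.Rprofile
```

Because `add_vscode_r_hook` checks for the `.vscode-R` marker before appending, running it more than once should leave a single copy of the hook in the file.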

That's how we can use Maxwell and VS Code to work with R interactively. However, we still have a problem: this R session is running on a login node!

r$> Sys.info()["nodename"]
nodename
"maxlogin1.int.maxwell.abdn.ac.uk"

This isn't recommended, as the login node has limited memory and a heavy R session may affect its performance. To run R on a compute node instead, create an interactive session and then start radian (or the R terminal) inside it.
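
As a sketch only, and assuming Maxwell schedules jobs with Slurm (this page does not say so), an interactive session could be requested with `srun`. The helper name `start_interactive_shell` is hypothetical, and the memory, CPU and time defaults are placeholders to adapt to your own allocation rather than recommended values.

```shell
# ASSUMPTION: the cluster uses Slurm, so srun can start an interactive shell.
# The defaults below are placeholders, not values recommended by this guide.
start_interactive_shell() {
    local mem="${1:-8G}" cpus="${2:-2}" limit="${3:-02:00:00}"
    srun --mem="$mem" --cpus-per-task="$cpus" --time="$limit" --pty bash -l
}

# Usage, from the login node:
#   start_interactive_shell 16G 4 04:00:00
# Then, in the shell that opens on the compute node:
#   mamba activate myrenv
#   radian
```

Once radian is running there and attached with `.vsc.attach()`, `Sys.info()["nodename"]` should report a compute node rather than the login node.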

Happy hacking!