Background

If you aren’t yet familiar with the resource manager Slurm and how to submit jobs to it, check out that tutorial first.

The login node (i.e., the interactive environment you land in when you first log in) is designed for submitting job requests only; you will quickly run out of resources or get a nasty email from the admins if you run significant tasks there. But when you’re developing pipelines using the standard run-edit-run-edit-run procedure, you don’t want to write, submit, and wait for the output of a job request every time you run.

You can solve this by requesting an interactive environment to work in. It will give you some resources to use for a limited amount of time.

Setup

Make sure you have the most recent version of the Zsh profile file (.zshrc). Note: This will replace your existing Zsh profile file. The commands below keep a copy of the old one as .zshrc_old, so if you’ve made changes, be sure to integrate them into the new file afterwards.

    # Back up any existing profile, then install and load the shared one.
    cd "/home/$USER"
    [[ -e .zshrc ]] && mv .zshrc .zshrc_old
    cp /data/gpfs/projects/punim1869/.admin/assets/.zshrc .
    source ./.zshrc

Requesting an interactive session

  1. I’ve built a shortcut, si ([s]tart [i]nteractive), to make these requests. The syntax is:

     si <session_name> <CPUs> <RAM_in_Gb> <time_formatted_for_slurm>

    The session name can be anything; it is what will show up in the Slurm job queue. Two common time formats are <days>-<hours> and <hours>:<minutes>:<seconds>. For example, to request 10 CPU cores and 50 GB of RAM for two hours:

     si my_session 10 50 2:00:00

    For convenience, if you enter si by itself, it will request 2 cores and 2 GB of RAM for 45 minutes.

  2. Once you’re in the session, a new instance of Zsh will be launched. For curious reasons, it will not automatically load your Zsh profile (which gives you access to all the shortcut commands like si above). To load it manually, type:

     source ~/.zshrc

    Do this every time you start an interactive session.
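
For readers curious about what a shortcut like si might do under the hood, here is a minimal sketch. It is not the actual definition from the shared .zshrc, which may differ. The function name si_sketch_cmd and the default session name "interactive" are hypothetical. The sketch only prints the srun command it would use rather than running it; the flags shown (--job-name, --cpus-per-task, --mem, --time, --pty) are standard srun options.

```shell
# Hypothetical sketch of an si-like helper; the real si may be defined differently.
# Prints the srun command instead of executing it, so it is safe to try anywhere.
si_sketch_cmd() {
    local name="${1:-interactive}"    # session name (default name is an assumption)
    local cpus="${2:-2}"              # CPU cores, default 2
    local ram_gb="${3:-2}"            # RAM in GB, default 2
    local time_limit="${4:-0:45:00}"  # <hours>:<minutes>:<seconds>, default 45 minutes

    echo "srun --job-name=${name} --cpus-per-task=${cpus} --mem=${ram_gb}G --time=${time_limit} --pty zsh"
}

# The example request from above: 10 cores, 50 GB RAM, two hours.
si_sketch_cmd my_session 10 50 2:00:00

# No arguments: falls back to the defaults.
si_sketch_cmd
```

Running the printed command by hand on the login node should give a comparable session, assuming the cluster allows interactive srun requests.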

Am I in an interactive session?

The command hn (an alias for hostname) will show you which cluster node you are on. If the result is something like spartan-login2.hpc.unimelb.edu.au, then you are on the login node and NOT in an interactive session. If it shows a node ID, which will look something like spartan-bm046.hpc.unimelb.edu.au, then you are in an interactive session.

NOTE: You can run squ (“Slurm Queue, User”; a function defined by the custom .zshrc) to show your list of running jobs, which will list your interactive session by name and the node ID it is running on.
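
The hostname check above can also be scripted, for example as a guard at the top of a pipeline script. The sketch below is illustrative only: the function name is_login_node is hypothetical, and it assumes login nodes are always named spartan-login<N>, as in the example above.

```shell
# Returns 0 (true) if the hostname looks like a login node, 1 otherwise.
# Assumption: login node names start with "spartan-login".
# With no argument, the current machine's hostname is checked.
is_login_node() {
    case "${1:-$(hostname)}" in
        spartan-login*) return 0 ;;
        *)              return 1 ;;
    esac
}

if is_login_node; then
    echo "On the login node: request an interactive session before running heavy tasks."
else
    echo "Not on a login node: $(hostname)"
fi
```

Slurm also sets the environment variable SLURM_JOB_ID inside a job, so checking whether it is non-empty is another way to confirm you are inside a session.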

Are interactive sessions all I need to get jobs done?

NO!

They are still resource-limited, and jobs in them will occasionally be killed when resources are scarce. You will also still need to run some jobs that go for longer than you want to risk keeping an interactive session connected for. Run any significant tasks using Slurm batch files. Tutorial here.
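
As a rough idea of what a batch file looks like, here is a minimal sketch; the batch tutorial covers the details. The job name, resource values, and the placeholder pipeline step are all assumptions for illustration. The #SBATCH lines are read by Slurm and are ordinary comments to the shell.

```shell
#!/bin/bash
# Minimal Slurm batch file sketch. All values below are placeholders.
#SBATCH --job-name=my_batch_job
#SBATCH --cpus-per-task=10
#SBATCH --mem=50G
# Time limit in <days>-<hours> format: one day.
#SBATCH --time=1-00

# SLURM_JOB_ID is set by Slurm inside a job; the fallback lets the
# script also run outside Slurm for a quick check.
job_label="${SLURM_JOB_ID:-not-in-slurm}"
echo "Job ${job_label} starting on $(hostname)"

# Replace this line with the real pipeline command.
echo "pipeline step would run here"
```

If this were saved as, say, my_job.slurm (a hypothetical filename), it would be submitted from the login node with sbatch my_job.slurm.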