If you aren’t yet familiar with the resource manager Slurm and how to submit jobs to it, then check out that tutorial first.
The login node of Slurm (i.e., the interactive environment you land in when you first log in) is designed for requesting jobs only, and you will shortly run out of resources or get a nasty email from admin if you run significant tasks there. But when you’re developing pipelines using the standard run-edit-run-edit-run procedure, you don’t want to write, submit, and wait for the output of a job request every time you run.
You can solve this by requesting an interactive environment to work in. It will give you some resources to use for a limited amount of time.
Make sure you have the most recent version of the shared .zshrc profile file. Note: The commands below replace your existing .zshrc (the old one is kept as .zshrc_old), so if you’ve made changes, be sure to merge them back in afterwards.
cd /home/$USER
[[ -e .zshrc ]] && mv .zshrc .zshrc_old
cp /data/gpfs/projects/punim1869/.admin/assets/.zshrc .
source ./.zshrc
I’ve built a shortcut, si ([s]tart [i]nteractive), to make these requests. The syntax is:
si <session_name> <CPUs> <RAM_in_Gb> <time_formatted_for_slurm>
The session name can be anything; it is what will show up in the Slurm job queue. Two common time formats are <days>-<hours> and <hours>:<minutes>:<seconds>. For example, to request 10 CPU cores and 50 GB of RAM for two hours:
si my_session 10 50 2:00:00
For convenience, if you just enter si by itself, it will automatically request 2 cores and 2 GB of RAM for 45 minutes.
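For a rough idea of what a wrapper like si might do, here is a minimal sketch. It is not the actual definition from the shared .zshrc: it assumes the shortcut boils down to a single srun call with a pseudo-terminal, and the function name si_sketch and the default session name are hypothetical. It only prints the command instead of running it, so it is safe to try anywhere.

```shell
# Hypothetical sketch of an "si"-style wrapper (NOT the real si definition).
# Arguments: <session_name> <CPUs> <RAM_in_GB> <time_formatted_for_slurm>
si_sketch() {
    # Defaults mirror the ones described above: 2 cores, 2 GB RAM, 45 minutes.
    # The default session name "interactive" is an assumption.
    local name="${1:-interactive}"
    local cpus="${2:-2}"
    local mem_gb="${3:-2}"
    local time_limit="${4:-0:45:00}"

    # Print the srun command that would request an interactive Zsh session.
    printf 'srun --job-name=%s --cpus-per-task=%s --mem=%sG --time=%s --pty zsh\n' \
        "$name" "$cpus" "$mem_gb" "$time_limit"
}

# Show the command for the example request above.
si_sketch my_session 10 50 2:00:00
```

Printing the command first makes it easy to check the resource request before anything is submitted to the queue.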
Once you’re in the session, a new instance of Zsh will be launched. For curious reasons, it will not automatically load your Zsh profile (which gives you access to all the shortcut commands like si above). To load it manually, type:
source ~/.zshrc
Do this every time you start an interactive session.
The command hn (an alias for hostname) will show you which cluster node you are on. If the result is something like spartan-login2.hpc.unimelb.edu.au, then you are on the login node and NOT in an interactive session. If it shows a node ID, which will look something like spartan-bm046.hpc.unimelb.edu.au, then you are in an interactive session.
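If you want this check to be automatic, for example before launching something heavy, a small helper can classify the hostname for you. This is a minimal sketch: the function name node_kind is hypothetical, and it assumes that login nodes are exactly the hosts whose names start with spartan-login, as in the examples above.

```shell
# Hypothetical helper: report whether a host looks like a login node.
# With no argument, it checks the machine you are currently on.
node_kind() {
    local host="${1:-$(hostname)}"
    case "$host" in
        spartan-login*) echo "login" ;;    # assumed login-node naming pattern
        *)              echo "compute" ;;  # anything else is treated as a compute node
    esac
}

# Example: warn before doing anything heavy on the login node.
if [ "$(node_kind)" = "login" ]; then
    echo "You are on the login node; request an interactive session first."
fi
```

Slurm also sets the SLURM_JOB_ID environment variable inside jobs, so testing whether it is non-empty is another way to tell that you are inside a session.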
NOTE: You can run squ (“Slurm Queue, User”; a function defined by the custom .zshrc) to show your list of running jobs, which will list your interactive session by name and the node ID it is running on.
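If the custom .zshrc is not loaded, plain squeue gives you the same information. The sketch below is only a guess at what a function like squ could look like, not its actual definition; the name squ_sketch and the column widths are assumptions.

```shell
# Hypothetical stand-in for "squ" (NOT the real definition from the .zshrc).
# Lists your own jobs: job ID, name, state, elapsed time, node count,
# and the node list (or the reason the job is still pending).
squ_sketch() {
    squeue --user="$USER" --format="%.10i %.20j %.8T %.10M %.6D %R"
}
```

The last column is the useful one here: for a running interactive session it shows the node ID that hn should report from inside the session.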
NO!
Interactive sessions are not a substitute for batch jobs. They are still resource-limited, and jobs in them will occasionally be killed when resources are scarce. You will also still need to run some jobs that go for longer than you want to risk keeping an interactive session connected. Run any significant tasks using Slurm batch files. Tutorial here.