Run
  ssh STATuser@slurm-submit-00.cs.wisc.edu
(where STATuser is your Statistics user ID). We can submit computing jobs from this machine. (Or, from public0[234], just run ssh slurm-submit-00.cs.wisc.edu.)

Make a working directory and move into it:
  mkdir /workspace/STATuser
  cd /workspace/STATuser
  pwd
Append a line to your "~/.bashrc" initialization file:
  echo "export PATH=$PATH:/workspace/software/bin:." >> ~/.bashrc
This says "Add /workspace/software/bin, the directory containing python software, and ., the current directory, to my PATH." Log out and log back in to make this change effective. (Or you can run "source ~/.bashrc".)
Use slurm-submit-00, a shared submit server, only to edit files and to run the Slurm commands that manage jobs. These commands are essential:

sbatch --partition short <script.sh> starts a batch job for execution on a computing node. It is called from ./submit.sh in each example below (except the 5mtcarsPractice example, an exercise). Change "STATuser" to your Statistics ID in the following line:
  --mail-user=STATuser@wisc.edu
The scripts also set
  --mail-type=END,FAIL (send email when the job ends or fails)
  --mem-per-cpu=500M (request 500 MB of memory per CPU)
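For orientation, here is a minimal sketch of what such a batch script might look like; it is not the course's submit.sh, and the job name, output file, and echo line are placeholders (the partition is given on the sbatch command line, as above):

  #!/bin/bash
  #SBATCH --job-name=demo                  # placeholder job name
  #SBATCH --mail-user=STATuser@wisc.edu    # change STATuser to your Statistics ID
  #SBATCH --mail-type=END,FAIL             # email when the job ends or fails
  #SBATCH --mem-per-cpu=500M               # request 500 MB of memory per CPU
  #SBATCH --output=slurm-%j.out            # %j is replaced by the job ID

  echo "running on $(hostname)"            # placeholder work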
squeue -u <STATuser> lists my jobs.

scancel -u <STATuser> cancels all my jobs; scancel <jobid> cancels one job.

srun --partition short --pty /bin/bash runs an interactive job, giving you a command prompt on a computing node.

seff <jobid> shows a "Slurm job efficiency report" of memory usage, etc.
On 3/17/25, the command was at /workspace/software/slurm/contribs/seff/seff. We can use that long path, or run
  echo "export PATH=$PATH:/workspace/software/slurm/contribs/seff" >> ~/.bashrc
to add its directory to our PATH. (Then log out and log back in, or run "source ~/.bashrc".)
Run srun --partition short --pty /bin/bash (from slurm-submit-00) to get a command line on a computing node in the cluster if necessary for running jobs manually while debugging. (You may see "bash: ... .bashrc: Permission denied". bash cannot read the configuration file ~/.bashrc on starting, because the computing nodes cannot read home directories. We are working in /workspace/STATuser for the same reason.)
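For example, a short interactive debugging session might look like this:
  srun --partition short --pty /bin/bash   # run from slurm-submit-00
  hostname                                 # confirms you are on a computing node
  cd /workspace/STATuser                   # work here; home directories are not readable
  exit                                     # return to slurm-submit-00 when done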
emacs -nw tries to load an emacs initialization file from our home directory, which is not readable on the HPC computing nodes; this takes time as emacs waits for the load to fail. Instead, run
  emacs -nw --no-init-file
echo "file=$file, n=$n, sum=$sum
"echo "step 3 starting ..."
echo "... step 3 done"
Slurm sets SLURM_ARRAY_TASK_ID to 1, 2, 3, and so on for each of several parallel jobs in a job array. We can simulate this serially by running, e.g.,
  export SLURM_ARRAY_TASK_ID=1; jobArray.sh
  export SLURM_ARRAY_TASK_ID=2; jobArray.sh
  export SLURM_ARRAY_TASK_ID=3; jobArray.sh
For an example of a job array with dependencies, see examples/4jobArrayWithDependencies/submit.sh.
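As a rough sketch (not the course's jobArray.sh or its submit.sh; the input-file names are hypothetical), an array script can use the variable like this:
  #!/bin/bash
  #SBATCH --mem-per-cpu=500M
  # submitted with: sbatch --partition short --array=1-3 jobArray.sh
  file="data${SLURM_ARRAY_TASK_ID}.csv"     # hypothetical per-task input file
  echo "task $SLURM_ARRAY_TASK_ID processing $file"
To chain a job that runs only after the whole array succeeds, sbatch's --parsable and --dependency options can be combined (summarize.sh is hypothetical):
  jobid=$(sbatch --parsable --partition short --array=1-3 jobArray.sh)
  sbatch --partition short --dependency=afterok:$jobid summarize.sh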
Download and unpack the examples:
  wget www.stat.wisc.edu/~jgillett/DSCP/HPC/examples.tar
  tar -xvf examples.tar
The 5mtcarsPractice directory is an exercise; it will be part of a homework assignment soon.
Back up your work by making a ".tar" file of your code and other human-written files (omitting most data files and most output files) and then copying that single file to your own computer.
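One way to do that (a sketch; mywork.tar and the file patterns are placeholders to adjust):
  # on slurm-submit-00, in /workspace/STATuser:
  tar -cvf mywork.tar *.sh *.R
  # then, from a terminal on your own computer:
  scp STATuser@slurm-submit-00.cs.wisc.edu:/workspace/STATuser/mywork.tar .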