Using NeSI

getting started

contact tricia to get an account
log in online at my.nesi.org.nz using your VUW credentials
configure your 2-factor authentication as described here
set up your ssh config file (~/.ssh/config) as described here
type ssh mahuika, provide the required passwords and you are in!

[tricia@]/Users/tricia $ ssh mahuika
(huntpa@lander.nesi.org.nz) Login Password (First Factor):
(huntpa@lander.nesi.org.nz) Authenticator Code (Second Factor):
(huntpa@login.mahuika.nesi.org.nz) Login Password:

you should only need to authenticate once each time you log in; thereafter ssh mahuika from new terminals will go straight to your home directory
you will want to edit either the .bashrc or .bash_profile to your liking
take a look at the slides here
general top level support page here
information on the partitions here
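
a minimal sketch of what the ssh config step could produce, assuming the two-hop layout visible in the login prompts above (lander.nesi.org.nz, then login.mahuika.nesi.org.nz). The function name, the ControlPath location and the ControlPersist time are illustrative assumptions, not NeSI's official settings; the linked instructions take precedence. Connection sharing of this kind (ControlMaster) is one way the authenticate-once behaviour can arise.

```shell
# print a sketch of ~/.ssh/config entries for NeSI (review before saving!)
# assumptions: the function name, the sockets path and the 1h persist time
nesi_ssh_config() {
    local user=$1
    cat <<EOF
Host lander
    HostName lander.nesi.org.nz
    User ${user}

Host mahuika
    HostName login.mahuika.nesi.org.nz
    User ${user}
    ProxyJump lander

Host lander mahuika
    ControlMaster auto
    ControlPath ~/.ssh/sockets/%r@%h-%p
    ControlPersist 1h
EOF
}

# replace your_username with your NeSI username
nesi_ssh_config your_username
```

the output is only printed, so you can compare it with the official instructions before copying anything into ~/.ssh/config. If you use a ControlPath like this one, the directory must exist first (mkdir -p ~/.ssh/sockets).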

directories and moving files to and from mahuika

directories are

/home/username 20GB

/nesi/project/vuw04056 100GB

/nesi/nobackup/vuw04056 10TB

on your local computer, copy a file to mahuika with

scp <path/filename> mahuika:<path/filename>

for example

[tricia@]/Volumes/Tricia_Home/work/jobs/sangoro $ scp water.inp mahuika:.
water.inp        100%  218    13.8KB/s   00:00    
[tricia@]/Volumes/Tricia_Home/work/jobs/sangoro
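
the example above copies a file to mahuika. A minimal sketch of the opposite direction (fetching a whole results directory back to the local computer) with scp -r is given here. The function name, the DRY_RUN switch and the water directory are hypothetical and only for illustration.

```shell
# build the scp command that copies a directory from mahuika to this computer
# DRY_RUN=1 (the default here) only prints the command; DRY_RUN=0 runs it
fetch_from_mahuika() {
    local remote_dir=$1 local_dir=$2
    local cmd=(scp -r "mahuika:${remote_dir}" "${local_dir}")
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "${cmd[*]}"
    else
        "${cmd[@]}"
    fi
}

# hypothetical job directory called water inside your project folder
fetch_from_mahuika /nesi/project/vuw04056/your_name/water .
```

run it on the local computer, not on mahuika; set DRY_RUN=0 once the printed command looks right.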

further information (including for Windows) is here

TEST creating a batch script and checking you can run jobs

NeSI uses Slurm, like Raapoi, so you will need a batch script
test that you can submit jobs with the script below
copy it into run.sh and type sbatch run.sh

#!/bin/bash -e
#SBATCH --job-name=SerialJob # job name (shows up in the queue)
#SBATCH --time=00:01:00      # Walltime (HH:MM:SS)
#SBATCH --mem=512MB          # Memory in MB
#SBATCH --qos=debug          # debug QOS for high priority job tests

pwd # Prints working directory

you should see a slurm-*.out file which contains the output of pwd (your working directory)

huntpa@mahuika01 /home/huntpa $ cat slurm-52598016.out
/home/huntpa
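
if there are several slurm-*.out files, the job ID tells you which one belongs to the test. A small sketch, assuming sbatch prints its usual "Submitted batch job <id>" message and the default output name slurm-<id>.out is in use; the function name is hypothetical.

```shell
# turn sbatch's "Submitted batch job <id>" message into the output file name
slurm_out_name() {
    local msg=$1
    local jobid=${msg##* }      # keep everything after the last space
    echo "slurm-${jobid}.out"
}

slurm_out_name "Submitted batch job 52598016"
```

on mahuika you could capture the message with msg=$(sbatch run.sh) and, once the job has finished, cat "$(slurm_out_name "$msg")".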

check you can run a parallel job

#!/bin/bash -e
#SBATCH --job-name=MPIJob    # job name (shows up in the queue)
#SBATCH --time=00:01:00      # Walltime (HH:MM:SS)
#SBATCH --mem-per-cpu=512MB     # Memory per CPU
#SBATCH --cpus-per-task=4       # 4 logical CPUs = 2 physical cores per task
#SBATCH --ntasks=2              # number of tasks (e.g. MPI)

srun pwd # Prints working directory

REAL batch script

modify this script to run jobs
I call mine runorcaP.sh
you should be running jobs in our nobackup directory /nesi/nobackup/vuw04056
create your own directory in this folder and run your jobs there

eg mine is /nesi/nobackup/vuw04056/tricia

copy the completed job files back into our shared project directory /nesi/project/vuw04056/your_name
make sure you set %maxcore (in MB) to about (2/3)*(mem/ntasks)

in the example below there is 1G per task (8G over 8 tasks), so 2/3 would be about 680; I set %maxcore 800 because it's a small job
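
the 2/3 rule as a small sketch, assuming the total memory is given in MB (1G taken as 1024 MB); the function name is hypothetical.

```shell
# suggested %maxcore in MB: about 2/3 of the memory available to each task
maxcore_mb() {
    local mem_mb=$1 ntasks=$2
    echo $(( 2 * mem_mb / (3 * ntasks) ))   # integer arithmetic, rounds down
}

maxcore_mb 8192 8     # --mem=8G with --ntasks=8 gives 682
```

treat the result as a starting point and adjust it after checking the memory actually used by a finished job.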

a node has 128 cores and each core has 2 cpus (hyperthreads); in the script we request cores

we do NOT want to work across nodes, hence nodes=1

we want ORCA to use 8 or 16 cores, hence ntasks = number of cores

in the sacct analysis you might see 16 or 32 cpus (because there are 2 cpus per core)
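
the core/cpu bookkeeping as a short sketch. It assumes every task is given one whole core (2 cpus), which matches the water job on this page: --ntasks=8 with --mem-per-cpu=2G is reported by nn_seff as 32.00 GB in total. The function names are hypothetical.

```shell
# each requested core carries 2 logical cpus
logical_cpus() {
    echo $(( $1 * 2 ))
}

# total memory in MB when memory is requested per cpu (--mem-per-cpu)
total_mem_mb() {
    local ntasks=$1 mem_per_cpu_mb=$2
    echo $(( $(logical_cpus "$ntasks") * mem_per_cpu_mb ))
}

logical_cpus 8          # 16
logical_cpus 16         # 32
total_mem_mb 8 2048     # 32768 MB = 32 GB
```

so with --mem-per-cpu the job is charged for twice the memory you might expect from ntasks alone, whereas --mem sets the total directly.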

the partition selection was set on advice from NeSI support
if you want to submit more than 50 linked jobs (i.e. same script, different geometry) use an array job

#!/bin/bash -e
#SBATCH --job-name=XX      
#SBATCH --time=05:00:00
#SBATCH --nodes=1
#SBATCH --ntasks=8
#SBATCH --mem=8G
#SBATCH --error=./workdir_%j/slurm_%j.err
#SBATCH --output=./workdir_%j/XX.out
#SBATCH --partition=milan,large,long,bigmem

echo "slurm job ID: ${SLURM_JOB_ID}" > ./workdir_${SLURM_JOB_ID}/XX.minfo
echo "start time " >> ./workdir_${SLURM_JOB_ID}/XX.minfo
date >> ./workdir_${SLURM_JOB_ID}/XX.minfo

cd ./workdir_${SLURM_JOB_ID}
cp ${SLURM_SUBMIT_DIR}/XX.inp .

module --quiet purge
module load ORCA/5.0.4-OpenMPI-4.1.5

# ORCA under MPI requires that it be called via its full absolute path
orca_exe=$(which orca)

# Don't use "srun" as ORCA does that itself when launching its MPI process.
${orca_exe} XX.inp

echo "finish time " >> ./workdir_${SLURM_JOB_ID}/XX.minfo
date >> ./workdir_${SLURM_JOB_ID}/XX.minfo
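
for the array-job case mentioned above, a minimal sketch of how the same script could be run over many geometries. It assumes a hypothetical file inputs.txt in the submit directory listing one ORCA input name per line, and 60 geometries; the job name, the helper nth_input and the working-directory naming are also illustrative, not the group's established setup.

```shell
#!/bin/bash -e
#SBATCH --job-name=XXarray
#SBATCH --time=05:00:00
#SBATCH --nodes=1
#SBATCH --ntasks=8
#SBATCH --mem=8G
#SBATCH --array=1-60          # hypothetical: one array task per geometry

# print line N of a list file (one input file name per line)
nth_input() {
    sed -n "${2}p" "$1"
}

# only do the real work when Slurm has set an array index
if [ -n "${SLURM_ARRAY_TASK_ID:-}" ]; then
    inp=$(nth_input "${SLURM_SUBMIT_DIR}/inputs.txt" "${SLURM_ARRAY_TASK_ID}")
    workdir="workdir_${SLURM_ARRAY_JOB_ID}_${SLURM_ARRAY_TASK_ID}"
    mkdir -p "${workdir}"
    cp "${SLURM_SUBMIT_DIR}/${inp}" "${workdir}/"
    cd "${workdir}"

    module --quiet purge
    module load ORCA/5.0.4-OpenMPI-4.1.5

    # as in the script above: full path to orca, and no srun
    orca_exe=$(which orca)
    "${orca_exe}" "${inp}" > "${inp%.inp}.out"
fi
```

a single sbatch call then queues all the tasks, and each task picks its own input from its SLURM_ARRAY_TASK_ID.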

checking jobs

key commands

squeue            all jobs

squeue --me       my jobs

scancel jobID     kill the named job

sacct -X          my jobs run today (one line per job)

we "pay" for NeSI usage so it is important to make sure you are using the system effectively

nn_seff jobID     summary of cpu and memory efficiency

eg water input file

huntpa@mahuika01 /home/huntpa $ cat water.inp
!PBE opt numfreq def2-SVP def2/J smallprint TightSCF NoPop xyzfile
%maxcore 2000
%pal nprocs 8 end
%elprop Polar 1 end
* xyz 0 1
O   0.0000   0.0000   0.0626
H  -0.7920   0.0000  -0.4973
H   0.7920   0.0000  -0.4973
*

the water job (submit script below) was run with a 10 min time limit; it did not complete, but only just

huntpa@mahuika01 /home/huntpa $ nn_seff 52598705
Cluster: mahuika
Job ID: 52598705
State: TIMEOUT
Cores: 8
Tasks: 8
Nodes: 2
Job Wall-time:  103.2%  00:10:19 of 00:10:00 time limit
CPU Efficiency:  34.4%  00:28:25 of 01:22:32 core-walltime
Mem Efficiency:   1.8%  590.79 MB (0.00 MB to 100.31 MB / task) of 32.00 GB (4.00 GB/task)
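
how the CPU efficiency figure above comes about, as a small sketch: the core-walltime is the number of cores times the elapsed time (8 x 00:10:19), and the efficiency is the CPU time used divided by that. The function names are hypothetical, and the percentage is truncated rather than rounded.

```shell
# convert HH:MM:SS to seconds (10# stops 08 and 09 being read as octal)
hms_to_seconds() {
    local h m s
    IFS=: read -r h m s <<< "$1"
    echo $(( 10#$h * 3600 + 10#$m * 60 + 10#$s ))
}

# CPU efficiency in percent, to one decimal place (truncated)
cpu_efficiency() {
    local used total tenths
    used=$(hms_to_seconds "$1")
    total=$(hms_to_seconds "$2")
    tenths=$(( used * 1000 / total ))
    printf '%d.%d\n' $(( tenths / 10 )) $(( tenths % 10 ))
}

hms_to_seconds 00:10:19              # 619 s, and 8 cores x 619 s = 4952 s
cpu_efficiency 00:28:25 01:22:32     # 34.4
```

an efficiency this low for a small molecule suggests that 8 cores is more than the job can use.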

check the memory usage after an example job

huntpa@mahuika01 /home/huntpa/workdir_52598705 $ grep -i 'Memory' water.out
   Shared memory     :  Shared parallel matrices
Maximum memory used throughout the entire GTOINT-calculation: 8.7 MB
Maximum memory used throughout the entire SCF-calculation: 6.2 MB
Maximum memory used throughout the entire SCFGRAD-calculation: 4.9 MB
Maximum memory used throughout the entire GTOINT-calculation: 8.7 MB
Maximum memory used throughout the entire SCF-calculation: 6.2 MB
Maximum memory used throughout the entire SCFGRAD-calculation: 5.0 MB
Maximum memory used throughout the entire GTOINT-calculation: 8.7 MB
Maximum memory used throughout the entire SCF-calculation: 6.2 MB
Maximum memory used throughout the entire SCFGRAD-calculation: 4.9 MB
Maximum memory used throughout the entire GTOINT-calculation: 8.7 MB
Maximum memory used throughout the entire SCF-calculation: 6.2 MB
Maximum memory used throughout the entire SCFGRAD-calculation: 5.1 MB
Maximum memory used throughout the entire GTOINT-calculation: 8.7 MB
Maximum memory used throughout the entire SCF-calculation: 6.2 MB
Memory available               ... 1996.8 MB
Memory needed per perturbation ...   0.0 MB

for a per-step breakdown use

sacct --format="JobID,JobName,Elapsed,AveCPU,MinCPU,TotalCPU,Alloc,NTask,MaxRSS,State" -j <jobid>

huntpa@mahuika01 /home/huntpa $ sacct --format="JobID,JobName,Elapsed,AveCPU,MinCPU,TotalCPU,Alloc,NTask,MaxRSS,State" -j 52598705
JobID           JobName    Elapsed     AveCPU     MinCPU   TotalCPU  AllocCPUS   NTasks     MaxRSS      State 
------------ ---------- ---------- ---------- ---------- ---------- ---------- -------- ---------- ---------- 
52598705          water   00:10:19                        28:24.740         16                        TIMEOUT 
52598705.ba+      batch   00:10:20   00:00:03   00:00:03  00:02.601         12        1     87444K  CANCELLED 
52598705.ex+     extern   00:10:19   00:00:00   00:00:00  00:00.001         16        2          0  COMPLETED 
52598705.0   orca_gtoi+   00:00:18   00:00:06   00:00:04  00:56.724         16        8    101592K  COMPLETED 
52598705.1   orca_scf_+   00:02:27   00:00:43   00:00:26  05:51.947         16        8    100872K  COMPLETED 
52598705.2   orca_scfg+   00:00:06   00:00:04   00:00:03  00:35.225         16        8     98876K  COMPLETED 

... and more

check our usage

check core usage by the group

nn_corehour_usage vuw04056

huntpa@mahuika01 /home/huntpa $ nn_corehour_usage vuw04056

Note: Fair Share rankings will only be shown for the current cluster, mahuika.

Project vuw04056
================

Project vuw04056 on the mahuika cluster
---------------------------------------
Fair share score on mahuika: 0.998855 out of 1.0
Ranked 158th of 622 active projects (behind 25.24% of active projects)

Usage period                               CPU core hours P100 GPU device hours A100 GPU device hours GB-hours of RAM Compute units
------------                               -------------- --------------------- --------------------- --------------- -------------
2025-01-14T15:00:00 to 2025-01-15T15:00:00              0                     0                     0               0             0

running an orca job

note you cannot use Gaussian or GaussView on NeSI
we will be using ORCA
check that a code is installed and available using

module spider "code"

for example, module spider orca gives

----------------------------------------------------------------------------------
  ORCA:
----------------------------------------------------------------------------------
    Description:
      ORCA is a flexible, efficient and easy-to-use general purpose tool for quantum chemistry with specific
      emphasis on spectroscopic properties of open-shell molecules. It features a wide variety of standard quantum
      chemical methods ranging from semiempirical methods to DFT to single- and multireference correlated ab initio
      methods. It can also treat environmental and relativistic effects. 

     Versions:
        ORCA/4.0.1-OpenMPI-2.0.2
        ORCA/4.2.1-OpenMPI-3.1.4
        ORCA/5.0.1-OpenMPI-4.1.1
        ORCA/5.0.3-OpenMPI-4.1.1
        ORCA/5.0.4-OpenMPI-4.1.5

----------------------------------------------------------------------------------
  For detailed information about a specific "ORCA" module (including how to load the modules) use the module's full name.
  For example:

     $ module spider ORCA/5.0.4-OpenMPI-4.1.5
----------------------------------------------------------------------------------

here is a very basic submit script

#!/bin/bash -e
#SBATCH --job-name=water      
#SBATCH --time=00:10:00
#SBATCH --ntasks=8
#SBATCH --mem-per-cpu=2G
#SBATCH --error=./workdir_%j/slurm_%j.err
#SBATCH --output=./workdir_%j/water.out

echo "slurm job ID: ${SLURM_JOB_ID}" > ./workdir_${SLURM_JOB_ID}/water.minfo
echo "start time " >> ./workdir_${SLURM_JOB_ID}/water.minfo
date >> ./workdir_${SLURM_JOB_ID}/water.minfo

cd ./workdir_${SLURM_JOB_ID}
cp ${SLURM_SUBMIT_DIR}/water.inp .

module --quiet purge
module load ORCA/5.0.4-OpenMPI-4.1.5

# ORCA under MPI requires that it be called via its full absolute path
orca_exe=$(which orca)

# Don't use "srun" as ORCA does that itself when launching its MPI process.
${orca_exe} water.inp

echo "finish time " >> ./workdir_${SLURM_JOB_ID}/water.minfo
date >> ./workdir_${SLURM_JOB_ID}/water.minfo

Mod:Hunt Research Group/Running jobs on NeSI
