I am following this tutorial: https://docs.nanoacademic.com/rescupy/tutorials/defect/tutorial_defect/
code.py:
from rescupy import Atoms, Cell, System, TotalEnergy
from rescupy.checkconv import CheckPrecision

a = 3.56  # lattice constant
# cubic cell with a real-space grid resolution of 0.20
cell = Cell(avec=[[a,0.,0.],[0.,a,0.],[0.,0.,a]], resolution=0.20)
atoms = Atoms(fractional_positions="diamond.fxyz")  # atomic positions from file
sys = System(cell=cell, atoms=atoms)
ecalc = TotalEnergy(sys)
ecalc.solver.set_mpi_command("mpiexec -n 16")  # run the solver on 16 MPI processes
ecalc.solver.set_stdout("resculog.out")  # solver log file
# convergence study: refine the "resolution" parameter until the total energy is converged to etol
calc = CheckPrecision(ecalc, parameter="resolution", etol=1.e-3)
calc.solve()
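As far as I understand, the set_mpi_command line just makes rescupy prefix the solver invocation, so under the hood it presumably runs something like this (my assumption; the binary path is the one from the error message below):

mpiexec -n 16 .../RESCU/rescuplus-1.0.0/bin/rescuplus_scf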
Because the script launches the solver with mpiexec -n 16, I want to submit the calculation as a batch job.
My submit script:
#!/bin/sh
#SBATCH --account=rrg-bevankir-ad
#SBATCH --cpus-per-task=16
#SBATCH --time=00-00:10 # time (DD-HH:MM)
#SBATCH --job-name=test
export OMP_NUM_THREADS=$SLURM_CPUS_PER_TASK
export RESCUPLUS_LICENSE_PATH=.../RESCU/rescuplus-1.0.0/license/license.lic
export PATH=.../RESCU/rescuplus-1.0.0/bin:$PATH
module purge
module load CCconfig StdEnv/2020
module load gcc/9.3.0 openmpi/4.0.3 python/3.8.10
module rm imkl
source .../RESCU/spack/share/spack/setup-env.sh
spack env activate .../RESCU/rescuplus-1.0.0
source .../RESCU/rescuplus-1.0.0/venv/bin/activate
export RESCUPLUS_PSEUDO=$PWD
python code.py
sacct -j $SLURM_JOB_ID  # accounting summary for this job
seff $SLURM_JOB_ID      # CPU/memory efficiency report
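I submit it with (assuming the script is saved as submit.sh; the file name is just for illustration):

sbatch submit.sh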
But the job didn't run. resculog.out shows:
There are not enough slots available in the system to satisfy the 16
slots that were requested by the application:
.../RESCU/rescuplus-1.0.0/bin/rescuplus_scf
Either request fewer slots for your application, or make more slots
available for use.
A "slot" is the Open MPI term for an allocatable unit where we can
launch a process. The number of slots available are defined by the
environment in which Open MPI processes are run:
1. Hostfile, via "slots=N" clauses (N defaults to number of
processor cores if not provided)
2. The --host command line parameter, via a ":N" suffix on the
hostname (N defaults to 1 if not provided)
3. Resource manager (e.g., SLURM, PBS/Torque, LSF, etc.)
4. If none of a hostfile, the --host command line parameter, or an
RM is present, Open MPI defaults to the number of processor cores
In all the above cases, if you want Open MPI to default to the number
of hardware threads instead of the number of processor cores, use the
--use-hwthread-cpus option.
Alternatively, you can use the --oversubscribe option to ignore the
number of available slots when deciding the number of processes to
launch.
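If I read point 3 correctly, Open MPI takes the number of slots from the resource manager, and my script only sets --cpus-per-task=16, which I believe gives SLURM a single task with 16 cores, so Open MPI sees only one slot. Would a header along these lines be the right fix (just my guess)?

#!/bin/sh
#SBATCH --account=rrg-bevankir-ad
#SBATCH --ntasks=16       # 16 MPI tasks, so Open MPI should see 16 slots
#SBATCH --cpus-per-task=1 # one core per MPI rank
#SBATCH --time=00-00:10   # time (DD-HH:MM)
#SBATCH --job-name=test

With that header, the export line would set OMP_NUM_THREADS to 1, which seems consistent with a pure-MPI run.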
The job did run when I submitted it interactively through salloc.
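To see what the batch allocation actually provides, I am thinking of printing the SLURM variables just before the python line (my own debugging idea):

echo "SLURM_NTASKS=${SLURM_NTASKS:-unset} SLURM_CPUS_PER_TASK=${SLURM_CPUS_PER_TASK:-unset}"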
What is the best practice for submitting a RESCU+ batch job? Could you give me some suggestions? Thank you!