RESCU+ About slab calculations

shwhyshwhy

I am testing some GaN slab calculations. I built the slab from the GaN unit cell fully relaxed with RESCU+, exposing the (100) surface with 20 Å of vacuum, and ran the following script (pseudopotentials: Ga_PBE_TZP.mat and N_PBE_TZP.mat):

from ase.build import bulk
from rescupy import Atoms, Cell, System, TotalEnergy, Relax
import numpy as np

cell = Cell(avec=[[3.2159608353184348,0.0000000000000000, 0.0000000000000000],[0.0000025714542510, 5.2566545374655238, 0.0000000000000000],[0.0000000000000000, 0.0000000000000000, 60.4240711222781002]], resolution=0.1)
atoms = Atoms(positions="slab100-gan.xyz")
sys = System(cell=cell, atoms=atoms)
sys.kpoint.set_grid([13,8,1])
sys.xc.set_functional_names(["XC_GGA_X_PBE","XC_GGA_C_PBE"])
calc = TotalEnergy(sys)
calc.solver.set_mpi_command("mpiexec -n 40")
calc.solver.mix.alpha = 0.5

rlx = Relax.from_totalenergy(calc)
rlx.solve()

However, the electronic loop does not converge within 100 steps. The end of the log looks like this:

...
      97   -0.139018E+04    0.886884E+00    0.931312E-02          27.950
      98   -0.139123E+04    0.104909E+01    0.962021E-02          26.376
      99   -0.138823E+04    0.299725E+01    0.148844E-01          26.556
     100   -0.138914E+04    0.909054E+00    0.120535E-01          28.739

I am not sure whether there are some tags for slab calculations such as dipole corrections that I need to apply? Or could you also give me some examples of how to apply some vdW correction schemes? Should I give it more iterations or are there any tips for better convergence? Thank you!

vincentm

shwhyshwhy I am not sure whether there are some tags for slab calculations such as dipole corrections that I need to apply?

If the slab is not passivated, with hydrogen atoms say, then the system is a metal-vacuum system, which can be hard to converge. Its small size should pose no problem, however. We'll investigate, but in the meantime, try lowering alpha and/or increasing the smearing parameter sigma.
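
To illustrate why a smaller alpha can help, here is a toy sketch of linear mixing on a one-dimensional fixed-point problem. It is not RESCU+ code: the function name linear_mixing, the map F(x) = response * x, and the response value of -3.5 are all made up for the illustration. The point is only that a strongly responding system makes a large mixing fraction overshoot, while a smaller one damps the oscillation.

```python
def linear_mixing(alpha, response=-3.5, x0=1.0, tol=1e-8, maxit=100):
    """Iterate x <- x + alpha * (F(x) - x) for the toy map F(x) = response * x.

    The fixed point is x = 0. Returns (converged, iterations_used).
    All names and values here are hypothetical, chosen for illustration.
    """
    x = x0
    for it in range(1, maxit + 1):
        residual = response * x - x  # F(x) - x, the "output minus input"
        if abs(residual) < tol:
            return True, it
        x = x + alpha * residual  # mix a fraction alpha of the residual
    return False, maxit


# Large mixing fraction: each step overshoots and the error grows.
print(linear_mixing(alpha=0.5))
# Smaller mixing fraction: the error shrinks at every step.
print(linear_mixing(alpha=0.2))
```

In this toy model each update multiplies the error by 1 + alpha * (response - 1), so alpha = 0.5 gives a factor of -1.25 (a growing oscillation) and alpha = 0.2 gives 0.1 (fast decay). A real SCF loop is far more complicated, but the trade-off is similar.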

shwhyshwhy Or could you also give me some examples of how to apply some vdW correction schemes?

We'll soon release a feature and tutorial on including Grimme's D3 corrections, but it's not possible at the moment.

shwhyshwhy

Dear vincentm,

Thank you so much for your reply! I will try surface passivation and Grimme's D3 corrections later!

I increased the maximum number of electronic iterations to 400, and the first ionic step converged at the 209th iteration. In the following ionic steps, the electronic loop converges within 50 iterations.

I have some follow-up questions on this:

I find that using 4 nodes vs. 1 node (40 tasks per node) gives me the same calculation speed. Is it that the atomic relaxation can only run on one node?
It shows a warning message, "[1663305466.251883] [bc12020:85675:0] mpool.c:42 UCX WARN object 0x2440b80 was not returned to mpool ucp_requests", in resculog.out. I don't understand it at all; could you give me some idea of what it means and whether I should be concerned?
If the calculation is canceled due to the time limit, should I keep "nano_rlx_out.h5" in the directory, together with the new configuration, so that the restart converges faster? I am guessing the first ionic step takes so long because it has no wavefunction or electron density to start from. I know that in VASP we can sometimes load the wavefunction and electron density, but I am not sure whether restarts work similarly in RESCU+. Could you give me some guidance on this as well?

Thank you so much for your kind help!

vincentm

@shwhyshwhy

shwhyshwhy I find that using 4 nodes vs. 1 node (40 tasks per node) gives me the same calculation speed. Is it that the atomic relaxation can only run on one node?

Since the system is quite small, you'll have to use some degree of k-point parallelism to use 160 cores. So try setting mpidist.kptprc to 8 or even 40.
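
As a rough sketch of the arithmetic behind that suggestion, the helper below splits a process count into k-point groups. It assumes, without confirmation from the RESCU+ documentation, that mpidist.kptprc is the number of groups the k-points are distributed over, and that the full 13x8x1 grid (104 k-points, no symmetry reduction) is used. kpoint_split is a hypothetical name, not part of rescupy.

```python
import math


def kpoint_split(total_procs, kpt_groups, kgrid):
    """Return (processes per k-point group, max k-points handled per group).

    Hypothetical helper: assumes the processes are divided evenly into
    kpt_groups groups and the k-points are spread over those groups.
    """
    if total_procs % kpt_groups != 0:
        raise ValueError("total_procs must be divisible by kpt_groups")
    n_kpoints = math.prod(kgrid)  # full grid, no symmetry reduction assumed
    procs_per_group = total_procs // kpt_groups
    kpts_per_group = math.ceil(n_kpoints / kpt_groups)
    return procs_per_group, kpts_per_group


# 160 processes on the 13x8x1 grid from the script above
for groups in (8, 40):
    print(groups, kpoint_split(160, groups, [13, 8, 1]))
```

Under these assumptions, more k-point groups means fewer processes working on each k-point, which tends to suit a small cell like this one.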

shwhyshwhy It shows some warning message [1663305466.251883] [bc12020:85675:0] mpool.c:42 UCX WARN object

I'm not sure about that. Seems like a warning. You can probably ignore it.

shwhyshwhy If the calculation is canceled due to the time limit, should I keep "nano_rlx_out.h5" in the directory, together with the new configuration, so that the restart converges faster?

The code should save the density at every ionic step indeed. You could then restart using the keywords listed here.

shwhyshwhy

Thank you so much vincentm! They are very helpful!

vincentm

@shwhyshwhy

Could you attach the file slab100-gan.xyz please?

shwhyshwhy

@vincentm

Here is slab100-gan.xyz:

32

N       1.607986137470616      0.652838431135431     20.928366869194459
N       0.000066410544851      3.281159431858092     31.140402430333509
N       1.608050942932065      0.652831593306230     32.068769299527972
N       1.608030626838707      3.281161141315392     28.355301822750135
N       0.000054323907484      0.652833302763530     29.283668691944591
N       1.608083345662788      0.652828174391629     37.638970514694719
N       0.000098813275578      3.281156012943491     36.710603645500257
N       1.608063029569432      3.281157722400792     33.925503037916883
N       0.000034007814128      3.281162850772692     25.570201215166755
N       1.607998224107982      3.281164560229993     22.785100607583377
N       0.000021921176761      0.652836721678131     23.713467476777836
N       0.000119129368933      0.652826464934329     40.424071122278100
N       1.608095432300156      3.281154303486191     39.495704253083645
N       0.000001605083403      3.281166269687293     20.000000000000000
N       1.608018540201341      0.652835012220831     26.498568084361217
N       0.000086726638211      0.652829883848930     34.853869907111346
Ga      0.000099779646427      5.256644280721722     36.710603645500271
Ga      1.608084312033638      2.628316442169860     37.638970514694719
Ga      0.000087693009060      2.628318151627160     34.853869907111346
Ga      1.608063995940280      5.256645990179022     33.925503037916883
Ga      0.000055290278334      2.628321570541761     29.283668691944595
Ga      0.000067376915700      5.256647699636323     31.140402430333509
Ga      1.608096398671004      5.256642571264422     39.495704253083645
Ga      1.608031593209556      5.256649409093622     28.355301822750135
Ga      1.608019506572189      2.628323279999061     26.498568084361214
Ga      0.000034974184976      5.256651118550923     25.570201215166755
Ga      0.000022887547610      2.628324989456361     23.713467476777836
Ga      1.607999190478831      5.256652828008224     22.785100607583377
Ga      1.607987103841465      2.628326698913662     20.928366869194459
Ga      0.000000000000000      0.000000000000000     20.000000000000000
Ga      1.608051909302913      2.628319861084460     32.068769299527972
Ga      0.000120095739784      2.628314732712560     40.424071122278100

I am now also doing a slab "supercell" calculation, and I will also test whether more nodes will help! Thank you!

shwhyshwhy

Dear @vincentm,

I tested an 896-atom supercell on multiple nodes and found that it still only runs on one node. The submit script and output file look like this:

submit.sh:

#!/bin/sh
#SBATCH --account=XXX
#SBATCH --nodes=4
#SBATCH --ntasks-per-node=40
#SBATCH --mem=0
#SBATCH --time=00-12:00 # time (DD-HH:MM)
#SBATCH --job-name=test_r

# module load and environment ...

python relax_surface.py

where relax_surface.py is:

from ase.build import bulk
from rescupy import Atoms, Cell, System, TotalEnergy, Relax
import numpy as np

cell = Cell(avec=[[22.5117258472290445,    0.0000000000000000,    0.0000000000000000],
                  [0.0000102858170040,   21.0266181498620952,    0.0000000000000000],
                  [0.0000000000000000,  0.0000000000000000, 60.4240711222781002]], resolution=0.1)

atoms = Atoms(positions="slab100-gan-741.xyz")
sys = System(cell=cell, atoms=atoms)
sys.kpoint.set_grid([1,1,1])
sys.xc.set_functional_names(["XC_GGA_X_PBE","XC_GGA_C_PBE"])
calc = TotalEnergy(sys)
calc.solver.set_mpi_command("mpiexec -n 40")
calc.solver.mix.alpha = 0.2
calc.solver.mix.maxit = 400
rlx = Relax.from_totalenergy(calc)
rlx.solve()

As shown in rescuplus.out, it seems that I am still using only 40 cores instead of 160:

------------------------------------
        mpi distribution info
------------------------------------
      proc num: 40
....................................

Can you give me some advice on how to use more nodes? Thank you very much!

vincentm

The problem is this line:

calc.solver.set_mpi_command("mpiexec -n 40")

If you want to use 160 cores, set

calc.solver.set_mpi_command("mpiexec -n 160")

Since you are using Slurm, you could also just set

calc.solver.set_mpi_command("srun")

which you won't have to change if you change Slurm parameters.
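
If you prefer to keep mpiexec, one option is to build the command from the environment instead of hard-coding the count. The sketch below makes several assumptions: mpi_command is a hypothetical helper, and it relies on Slurm exporting SLURM_NTASKS inside the job, which is worth verifying on your cluster.

```python
import os


def mpi_command(env=None, launcher="mpiexec", fallback_ntasks=1):
    """Build an MPI launch command from the Slurm task count.

    Hypothetical helper: reads SLURM_NTASKS from env (defaults to
    os.environ) and falls back to fallback_ntasks outside a Slurm job.
    """
    env = os.environ if env is None else env
    ntasks = int(env.get("SLURM_NTASKS", fallback_ntasks))
    return f"{launcher} -n {ntasks}"


# Simulated environment for a 4-node, 40-tasks-per-node job
print(mpi_command({"SLURM_NTASKS": "160"}))
```

You could then pass the result to the solver with calc.solver.set_mpi_command(mpi_command()), so the script follows whatever the submit script requests.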

shwhyshwhy

Dear vincentm

I tried both of the ways you suggested, and both worked great! Thank you so much!