AMGX runtime error with preconditioning

Hello,

I am trying to use the AmgX solver on a symmetric positive-definite matrix generated by some Fortran code. I am able to solve the system using plain conjugate gradient without preconditioning (as in the config file PCG_NOPREC.json); however, when I try to use a preconditioner (as in config files like PCG_CLASSICAL_V_JACOBI.json and many others) I get a runtime error as follows:

Caught amgx exception: Cuda failure: 'an illegal memory access was encountered'

 at: /home/omer/AMGX/src/solvers/dense_lu_solver.cu:562

This config runs fine with the example matrix that comes in the repository, but of course, that is a dummy matrix. Any insight about why it might be crashing on my matrix and what config parameters I should use so that I can add preconditioning to conjugate gradient without crashing would be greatly appreciated.

I will paste the relevant section of my .bashrc file and attach the main Fortran file used for matrix generation. Please do not hesitate to ask for additional information about my system if it would be helpful.

createf90.txt (28.2 KB)

My .bashrc file:

#-------------------NVIDIA--HPC--SDK--TOOLKIT-------------------------

export NVARCH='Linux'_'x86_64'
export NVCOMPILERS=/home/omer/nvidia/hpc_sdk
export HPC_VERSION=25.5
export MANPATH=$NVCOMPILERS/$NVARCH/$HPC_VERSION/compilers/man:$MANPATH
export PATH=$NVCOMPILERS/$NVARCH/$HPC_VERSION/compilers/bin:$PATH

export PATH=$NVCOMPILERS/$NVARCH/$HPC_VERSION/comm_libs/mpi/bin:$PATH
export MANPATH=$MANPATH:$NVCOMPILERS/$NVARCH/$HPC_VERSION/comm_libs/mpi/man

# For running AmgX
export LD_LIBRARY_PATH=$NVCOMPILERS/$NVARCH/$HPC_VERSION/cuda/lib64:$LD_LIBRARY_PATH
#export LD_LIBRARY_PATH=/home/omer/nvidia/hpc_sdk/Linux_x86_64/25.5/math_libs/lib64:$LD_LIBRARY_PATH
export LD_LIBRARY_PATH=$NVCOMPILERS/$NVARCH/$HPC_VERSION/compilers/lib:$LD_LIBRARY_PATH
export LD_LIBRARY_PATH=$NVCOMPILERS/$NVARCH/$HPC_VERSION/comm_libs/mpi/lib:$LD_LIBRARY_PATH


Update: I suspected that this was a memory issue so I tried using this config on smaller matrices and it worked. However, it reached the desired tolerance in a single iteration, suggesting that this configuration used a much more intense preconditioner than desired. Adding the line

"coarse_solver": "NOSOLVER"

to the preconditioner fixed the memory issue and the solver started to accept larger matrices; however, the residuals wouldn’t converge. Adding the line

"relaxation_factor": 0.1

to the smoother drastically improved residual convergence, and the preconditioning actually became viable. Along with these two settings, I changed the max_iters parameter of the preconditioner to 5, and the final config became:

{
    "config_version": 2, 
    "solver": {
        "preconditioner": {
            "print_grid_stats": 1,
            "print_vis_data": 0,
            "solver": "AMG", 
            "smoother": {
                "scope": "jacobi", 
                "solver": "BLOCK_JACOBI", 
                "monitor_residual": 0, 
                "print_solve_stats": 0,
                "relaxation_factor": 0.1
            },
            "print_solve_stats": 1, 
            "aggressive_levels": 2,
            "presweeps": 1, 
            "interpolator": "D2",
            "max_iters": 5, 
            "monitor_residual": 1, 
            "store_res_history": 0, 
            "scope": "amg", 
            "max_levels": 50, 
            "cycle": "V", 
            "postsweeps": 1,
            "coarse_solver": "NOSOLVER"
        }, 
        "solver": "PCG", 
        "print_solve_stats": 1, 
        "print_grid_stats": 1,
        "obtain_timings": 1, 
        "max_iters": 30000, 
        "monitor_residual": 1, 
        "convergence": "ABSOLUTE", 
        "scope": "main", 
        "tolerance" : 1e-09, 
        "norm": "L2"
    }
}

I would greatly appreciate any insight on recommended parameters especially for relaxation_factor and max_iters.

I’m glad that you were able to find parameters that work for you. Unfortunately, there is no single solver/preconditioner or single set of parameters that works in general, so the recommended approach is exactly what you did: pick a solver that you believe suits your problem, and tune its parameters for memory and performance.

I will comment on some of your findings.

when I try to use a preconditioner (as in config files like PCG_CLASSICAL_V_JACOBI.json and many others) I get a runtime error as follows:

Sorry for the lack of description for such errors. A couple of things can go wrong in the dense solver:

  • not enough memory: try to build a “deeper” hierarchy so that the coarsest level is smaller. Inspecting the multigrid hierarchy and estimating your available memory can help with that
  • singular matrix: not likely, but it could happen. Manually checking the return code from cuSolver might help here, but unfortunately it is not exposed in the AMGX interface
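To make the memory point concrete, here is a rough back-of-the-envelope sketch in Python. It assumes the dense coarse solver holds the coarsest-level matrix as a full n x n array of doubles and ignores any extra cuSolver workspace, so treat the numbers as a lower bound. The function names are made up for illustration and are not part of AMGX.

```python
import math


def dense_matrix_bytes(n_rows: int, bytes_per_value: int = 8) -> int:
    """Storage for one dense n_rows x n_rows matrix (8 bytes = double precision)."""
    return n_rows * n_rows * bytes_per_value


def max_dense_rows(budget_bytes: int, bytes_per_value: int = 8) -> int:
    """Largest n such that an n x n dense matrix fits in budget_bytes."""
    return math.isqrt(budget_bytes // bytes_per_value)


if __name__ == "__main__":
    # Hypothetical coarsest-level sizes; read the real one from the grid stats.
    for n in (1_000, 20_000, 100_000):
        gib = dense_matrix_bytes(n) / 2**30
        print(f"coarsest level with {n:>7} rows -> {gib:8.2f} GiB as a dense matrix")
    print("rows that fit in 8 GiB:", max_dense_rows(8 * 2**30))
```

Comparing the row count of the last level reported by the grid statistics against such a bound gives a quick indication of whether the hierarchy is deep enough for a dense coarse solve.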

"coarse_solver": "NOSOLVER"

You can also use any of the smoothers as the coarse solver.

"relaxation_factor": 0.1

While any relaxation factor in (0.0, 2.0) is valid (or (0.0, 1.0), depending on the interpretation), such a small value typically means that the multigrid is not as efficient as it could be. I would try changing aggressive_levels to 1 or 0, since aggressive coarsening might hurt efficiency.
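For intuition about what a small relaxation factor does to the smoother itself, here is a toy sketch in pure Python. It assumes the factor acts as the usual damping weight omega in x <- x + omega * D^{-1} (b - A x), and it uses a 1D Poisson matrix as a stand-in problem, not your matrix and not AMGX code.

```python
import math


def laplacian_matvec(x):
    """y = A x for the 1D Poisson matrix tridiag(-1, 2, -1)."""
    n = len(x)
    y = []
    for i in range(n):
        left = x[i - 1] if i > 0 else 0.0
        right = x[i + 1] if i < n - 1 else 0.0
        y.append(2.0 * x[i] - left - right)
    return y


def residual_norm_after(omega, sweeps, n=32):
    """L2 norm of b - A x after `sweeps` damped Jacobi sweeps starting from x = 0."""
    b = [1.0] * n
    x = [0.0] * n
    for _ in range(sweeps):
        r = [bi - yi for bi, yi in zip(b, laplacian_matvec(x))]
        # Damped Jacobi update: x <- x + omega * D^{-1} r, where D = 2 I here.
        x = [xi + omega * ri / 2.0 for xi, ri in zip(x, r)]
    r = [bi - yi for bi, yi in zip(b, laplacian_matvec(x))]
    return math.sqrt(sum(ri * ri for ri in r))


if __name__ == "__main__":
    for omega in (0.1, 2.0 / 3.0, 1.0):
        print(f"omega = {omega:.3f}: residual after 20 sweeps = "
              f"{residual_norm_after(omega, 20):.4e}")
```

On this toy problem a weight of 0.1 removes far less of the residual per sweep than a moderate weight does. If the solver only converges with such a small value, that points at the coarse-grid correction rather than at the smoother.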

AMG: "monitor_residual": 1,

This is not needed; the outer solver takes care of the convergence criteria.

AMG: "max_iters": 5,

Typically we run a single multigrid cycle and let the outer solver correct the preconditioner, rather than taking “5 steps in the same direction”. I would probably leave it at 1.

"max_iters": 30000,

Does your solver really need so many iterations? That sounds fishy. Have you tried different solvers? A stronger smoother? Increasing the number of pre- and post-sweeps?

Those are some “rules of thumb”, but as I mentioned previously, you may well have already found the best parameters for your case.
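As a starting point for experimenting, here is a small Python sketch that applies the concrete suggestions above to a config dictionary and prints the resulting JSON. The `original` dictionary is an abbreviated stand-in for the config posted earlier, the helper name is made up, and picking 1 for aggressive_levels is just one of the two values suggested. None of this has been run against AMGX.

```python
import copy
import json


def apply_suggestions(config):
    """Return a copy of an AMGX config dict with the suggestions from this thread applied."""
    tuned = copy.deepcopy(config)
    amg = tuned["solver"]["preconditioner"]
    amg["max_iters"] = 1          # one multigrid cycle per outer PCG iteration
    amg["monitor_residual"] = 0   # the outer solver checks convergence
    amg["aggressive_levels"] = 1  # try 1 or 0 instead of 2
    return tuned


if __name__ == "__main__":
    # Abbreviated stand-in for the posted config; load the real file with json.load.
    original = {
        "config_version": 2,
        "solver": {
            "solver": "PCG",
            "max_iters": 30000,
            "preconditioner": {
                "solver": "AMG",
                "max_iters": 5,
                "monitor_residual": 1,
                "aggressive_levels": 2,
                "coarse_solver": "NOSOLVER",
            },
        },
    }
    print(json.dumps(apply_suggestions(original), indent=4))
```

After changing aggressive_levels it is worth revisiting relaxation_factor, presweeps, and postsweeps one at a time while watching the outer iteration count.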