Parameterised tessellation is very GPU memory intensive, any suggestions?

I am trying to use a set of roughly a dozen NACA airfoil meshes with slightly different parameters in the parameterized tessellation example, with the goal of optimising those parameters. (Example here: https://gitlab.com/nvidia/modulus/examples/-/blob/release_22.09/geometry/parameterized_tesselated_example.py).

I am using an A100 80GB GPU, but as soon as I add the constraints and run the simulation, the GPU memory fills up (>100%) and the kernel crashes.

Is there a way to somehow limit the memory requirements and enable training of even larger sets on an 80GB machine?

Thanks!

Hi @benedikt_dietz

Does Modulus crash when it’s sampling the points from the STL files or while training? I would start by lowering the number of points you’re using via the batch_per_epoch and batch_size parameters in your constraints.
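
For reference, a minimal sketch of where those knobs live in a constraint definition (the import path, class, nodes, geometry and output variables below are just placeholders for whatever your config actually uses):

# import path may differ between Modulus releases
from modulus.domain.constraint import PointwiseBoundaryConstraint

# fewer points per batch -> less resident GPU memory during training
no_slip = PointwiseBoundaryConstraint(
    nodes=nodes,                      # your computational graph nodes
    geometry=foil_geo,                # the (parameterized) tessellated geometry
    outvar={"u": 0, "v": 0, "w": 0},  # example no-slip condition
    batch_size=512,                   # try lowering this first
    batch_per_epoch=500,              # and this, for fixed-dataset constraints
)
domain.add_constraint(no_slip, "no_slip")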

For some really complicated geometry we have also resorted to pre-sampling the STL files in a separate script with the geometry module and saving the sampled points to disk, e.g. in a numpy array. Then, in the actual training loop, we load them from the numpy file. This is useful for speeding up testing as well.
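
Something along these lines (just a sketch; file names, nr_points and the import path are placeholders):

import numpy as np

# import path may differ between Modulus releases
from modulus.geometry.tessellation import Tessellation

# one-off pre-sampling script: sample the surface once and cache the points on disk
geo = Tessellation.from_stl("naca_foil.stl", airtight=True)
points = geo.sample_boundary(nr_points=100000)  # dict of numpy arrays (x, y, z, plus normals/area depending on version)
np.savez("naca_foil_boundary.npz", **points)

# later, in the training script: load the cached points instead of re-sampling the STL
cached = dict(np.load("naca_foil_boundary.npz"))

Depending on your Modulus version you can then feed such a dictionary straight into a constraint, e.g. via something like PointwiseConstraint.from_numpy.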

Hi, thanks for getting back!

I tried to implement this as follows:

import glob
import os

import h5py
import numpy as np
from modulus.geometry.tessellation import Tessellation  # import path may differ between releases

# collect the STL files and parse the NACA digits / angle encoded in each filename,
# e.g. naca2412_10.stl -> digits 2, 4, 12 and angle 10
bracket_files = glob.glob("./naca_foils/foils/naca*.stl")
bracket_files.sort()
stl_dict = []
for f in bracket_files:
    _temp = f.split('/naca')[-1].split('.')[0]
    stl_dict.append({
        'path': f,
        'angle': int(_temp.split('_')[-1]),
        'digit_1': int(_temp.split('_')[0][0]),
        'digit_2': int(_temp.split('_')[0][1]),
        'digit_3': int(_temp.split('_')[0][2:4]),
    })

and then tried to sample from the tessellated geometries like this, so that the sampling happens outside of the actual training loop:

out_dir = 'tessellation_memory/'
os.makedirs(out_dir, exist_ok=True)

# only the first two foils for now, to keep memory usage small
for i, f in enumerate(stl_dict[:2]):
    # build the tessellated geometry and sample points on its surface
    t = Tessellation.from_stl(f['path'], airtight=True)
    points = t.sample_boundary(
        nr_points=10,
        quasirandom=False,
    )
    del t

    # attach the foil parameters as constant columns next to the sampled points
    ones = np.ones((len(points['x']), 1))
    points['angle'] = f['angle'] * ones
    points['digit_1'] = f['digit_1'] * ones
    points['digit_2'] = f['digit_2'] * ones
    points['digit_3'] = f['digit_3'] * ones

    # write one HDF5 file per geometry, one dataset per variable
    with h5py.File(out_dir + str(i) + '.hdf5', 'w') as h5file:
        for name in points:
            print(name.ljust(40, '.'), points[name].shape)
            h5file.create_dataset(name, data=points[name])

    del points
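
The plan was then to load these files back in the training script, roughly like this (untested sketch, the keys just mirror what is written above):

import glob

import h5py
import numpy as np

# read the pre-sampled points back into plain numpy dictionaries, one per geometry
samples = []
for path in sorted(glob.glob('tessellation_memory/*.hdf5')):
    with h5py.File(path, 'r') as h5file:
        samples.append({name: np.array(h5file[name]) for name in h5file.keys()})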

Unfortunately, the sampling script crashes the kernel (almost) every time I run it, even though I’m working on a fairly large GPU and only sampling a very small number of points per geometry.

Do you have a suggestion on how to manage this? I believe it’s due to excessive memory utilization; does that sound like the right diagnosis?

Thanks a lot in advance!

Hi @benedikt_dietz

Please try the updated Modulus container if possible; migration should be very straightforward (guide here). I believe we had a few fixes to pySDF in the recent release.