Scaling and nondimensionalizing parameterized geometry

Yes, that should be fine. The key here is to scale the inputs to the machine learning model to improve convergence, much as you would scale data in a traditional DL setting. Which scaling works best is a judgement call and largely empirical, again analogous to data-driven problems, where you could use min/max normalization, Gaussian normalization, etc.
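
To make the two options mentioned above concrete, here is a minimal NumPy sketch of min/max and Gaussian (zero-mean, unit-variance) scaling applied per input column. The function names, the `params` array, and its values and units are all made up for illustration; they are not from any particular library or from your setup.

```python
import numpy as np

def minmax_scale(x):
    """Map each column of x to the range [0, 1] using its own min and max."""
    x = np.asarray(x, dtype=float)
    lo = x.min(axis=0)
    hi = x.max(axis=0)
    return (x - lo) / (hi - lo)

def gaussian_scale(x):
    """Shift each column of x to zero mean and scale it to unit standard deviation."""
    x = np.asarray(x, dtype=float)
    return (x - x.mean(axis=0)) / x.std(axis=0)

# Hypothetical samples of two geometry/flow parameters with very different
# magnitudes (e.g. a length in metres and a velocity in m/s).
params = np.array([
    [0.001, 10.0],
    [0.002, 20.0],
    [0.005, 50.0],
])

params_minmax = minmax_scale(params)      # every column now spans [0, 1]
params_gaussian = gaussian_scale(params)  # every column has mean 0, std 1
```

Either way the inputs end up at order one, which is the point. Both functions assume each column actually varies (a constant column would divide by zero), and whichever statistics you use for training should be reused unchanged at inference time.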