Difficulty assessment of creating an urban city scene // Remote rendering of Isaac Sim?

Hello everyone!
I am currently writing my master’s thesis and am looking for a suitable simulator with which I can generate synthetic data. The data will consist of video recordings of pedestrians performing various actions, which will later be classified by my skeleton-based action recognition model. It is important that the data has high fidelity and is recorded in urban settings (sidewalks, inner city, etc.). I believe that Isaac Sim would be well suited to this task. However, being inexperienced, I have the following questions about using it:

1) How difficult will it be to recreate an urban environment, or are there assets I can use?

2) Is it possible to run the rendering on my university's GPU cluster and view the output on my personal laptop? Or do I need to run the simulation headless, transfer some sample files to my machine, and validate them that way?

Thanks for any help, and sorry for my lack of expertise.