Multiple Processes performing inference using the same TF Graph

ben.kawecki · March 13, 2019, 5:13pm

Hello,

I’m currently working on a Python-based application performing basic object detection using TensorFlow. My application is modeled on https://github.com/datitran/object_detector_app: the main process takes frames from the camera (which runs in a separate process) and passes them into a queue, which is then read by n worker processes that perform the inference. This application works fine if I only have 1 worker performing inference. When I have more than 1 worker, each worker will load the graph from memory, process exactly 1 frame of data, then fail without raising an exception. The only way I can tell it has failed is that the application begins spawning another worker process.

I’ve run the exact same code on a PC running the CPU version of TF with no problem. If anyone has any ideas for why this is happening I would love to hear them. Otherwise, does anyone have an idea to increase the framerate without spawning additional worker processes?

Thanks,
Ben

AastaLLL · March 14, 2019, 5:20am

Hi,

May I know the number of TensorFlow sessions you created?
Do you use the same TF session or create one for each process?

By default, TensorFlow allocates all of the available memory, which may leave the second session without enough memory.
Could that be the cause in your application?

Maybe you can try to limit the amount of memory first:

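import tensorflow as tf

# Cap this process at 40% of GPU memory instead of letting TensorFlow claim it all
config = tf.ConfigProto()
config.gpu_options.per_process_gpu_memory_fraction = 0.4
session = tf.Session(config=config)  # pass your other Session arguments as well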

Thanks.

ben.kawecki · March 14, 2019, 2:09pm

My default was to spawn two workers, each with its own separate TensorFlow session. I used a different session for each graph because I thought that sharing one session across two workers would not increase total FPS (it would be bottlenecked at the TF graph). I don’t know for sure, but when the TF session is instantiated, the stdout log lists total GPU memory and available GPU memory. As I remember it, after the first two graphs are created it lists available GPU memory as ~8GB. The problem comes after the first two workers fail (for whatever reason; no exception is thrown): they still appear to hold a lock on the memory, because after the second pair of workers and graphs is created it lists the available GPU memory as ~150MB. I typically exit the program at this point, because if the second pair of workers fails and a third pair is created, the Jetson typically freezes.

I’m working remotely on a different project today so I don’t have access to the Jetson, but I will provide a log of the stdout tomorrow. I will also try limiting the amount of memory.

AastaLLL · March 19, 2019, 3:32am

Hi,

Here are two more suggestions for your reference:

1. Try to limit the memory of each worker to a fraction of no more than 0.4.

2. Try adding session.close() to force the app to return the memory.
This may not work for an unexpected failure, though.

Thanks.

ben.kawecki · March 25, 2019, 1:59pm

Hey, sorry for the late reply on this. Work got crazy with a two-week sprint on a separate project, and this was sidelined for the last two weeks.

I will try limiting the amount of memory that each worker receives. Does this number (0.4) change if I also use TensorRT? If I decide to use more than two workers what is the total maximum memory usage you recommend?

I have a session.close() located in my worker code; however, it doesn’t seem to work. Moving the session object into a context manager wouldn’t help in this case because Python is not raising an exception.
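
When a worker dies and Python raises no exception, one way to learn more is to check the worker's exit code from the parent process. The sketch below is not taken from the linked application: crashing_worker and describe_exit are hypothetical names, and the SIGKILL is only a stand-in for whatever is actually terminating the real workers. It relies on multiprocessing.Process.exitcode, which is negative when the child was terminated by a signal, and it assumes a POSIX system such as the Jetson's Linux.

```python
import multiprocessing as mp
import os
import signal


def crashing_worker():
    # Hypothetical stand-in for a worker that dies without a Python
    # exception, for example because the OS killed the process.
    os.kill(os.getpid(), signal.SIGKILL)


def describe_exit(proc):
    """Turn a finished Process's exitcode into a readable message."""
    code = proc.exitcode
    if code is None:
        return "still running"
    if code == 0:
        return "exited cleanly"
    if code < 0:
        # A negative exitcode -N means the child was terminated by signal N.
        try:
            name = signal.Signals(-code).name
        except ValueError:
            name = "signal {}".format(-code)
        return "killed by signal {}".format(name)
    return "exited with status {}".format(code)


if __name__ == "__main__":
    worker = mp.Process(target=crashing_worker)
    worker.start()
    worker.join()
    # Log this before respawning so silent failures leave a trace.
    print(describe_exit(worker))
```

Logging this message in the place where the application respawns workers would at least show whether they exit with an error status or are killed from outside, which a try/except or context manager inside the worker cannot reveal.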

AastaLLL · March 29, 2019, 6:30am

Hi,

Thanks for your feedback.

The value limits the amount of memory that TensorFlow allocates.
For example, 0.4 means that tf.Session() will request [Total Memory] * 0.4 of memory when it is created.
So you can set the value based on the number of workers you want to use.

Thanks.
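
The sizing rule above can be sketched as a small helper. This is only an illustration under stated assumptions: per_worker_fraction and make_session are hypothetical names, and the reserved share (0.2 here) is a placeholder for memory left to everything outside TensorFlow, not a value recommended anywhere in this thread. With two workers and that reserve, the helper gives the 0.4 used earlier.

```python
def per_worker_fraction(num_workers, reserved=0.2):
    """Split the memory not held in reserve evenly across workers.

    `reserved` is an assumed share kept free for everything that is
    not a TensorFlow worker; tune it for the actual device.
    """
    if num_workers < 1:
        raise ValueError("num_workers must be at least 1")
    if not 0.0 <= reserved < 1.0:
        raise ValueError("reserved must be in [0, 1)")
    return (1.0 - reserved) / num_workers


def make_session(fraction):
    """Create a TF 1.x session capped at `fraction` of GPU memory."""
    # Imported inside the function so that each worker process loads
    # TensorFlow itself after it has been started.
    import tensorflow as tf

    config = tf.ConfigProto()
    config.gpu_options.per_process_gpu_memory_fraction = fraction
    return tf.Session(config=config)


if __name__ == "__main__":
    # Show how the per-worker cap shrinks as workers are added.
    for n in (1, 2, 3, 4):
        print(n, "workers ->", round(per_worker_fraction(n), 3))
```

Each worker would call make_session(per_worker_fraction(n)) once at startup, so that the sum of all caps stays below the total memory regardless of how many workers are spawned.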
