Performance advantage of layered textures?

Hi all,
Is there any performance advantage to using layered textures when fetching
floating-point data in bilinear interpolation mode?

I am mostly concerned with Fermi and Tesla environments.

By “advantage” I mean, for example: 4 calls to tex2DLayered on one layered
texture with 4 layers (a different layer index each time)
VERSUS 4 calls to tex2D on 4 separate single-layer textures.

The fetches would all be issued one right after the other, because I need
all the results at once.
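
To make the comparison concrete, here is a rough sketch of the two variants. The kernel names, the texture reference names and the float4 output buffer are made up for illustration. It assumes the legacy texture reference API (removed in CUDA 12), float texels and unnormalized coordinates.

```cuda
#include <cuda_runtime.h>

// Legacy texture references at file scope. This API was removed in CUDA 12,
// so the sketch assumes an older toolkit. All names here are hypothetical.
texture<float, cudaTextureType2DLayered, cudaReadModeElementType> texLayered;
texture<float, cudaTextureType2D, cudaReadModeElementType> texA, texB, texC, texD;

// Variant 1: four fetches from ONE layered texture, one per layer index.
__global__ void fetchLayered(float4 *out, int width, int height)
{
    int ix = blockIdx.x * blockDim.x + threadIdx.x;
    int iy = blockIdx.y * blockDim.y + threadIdx.y;
    if (ix >= width || iy >= height) return;

    // Assumes unnormalized coordinates; +0.5f addresses the texel center.
    float x = ix + 0.5f;
    float y = iy + 0.5f;

    out[iy * width + ix] = make_float4(
        tex2DLayered(texLayered, x, y, 0),
        tex2DLayered(texLayered, x, y, 1),
        tex2DLayered(texLayered, x, y, 2),
        tex2DLayered(texLayered, x, y, 3));
}

// Variant 2: one fetch from each of FOUR separate ordinary 2D textures.
__global__ void fetchSeparate(float4 *out, int width, int height)
{
    int ix = blockIdx.x * blockDim.x + threadIdx.x;
    int iy = blockIdx.y * blockDim.y + threadIdx.y;
    if (ix >= width || iy >= height) return;

    float x = ix + 0.5f;
    float y = iy + 0.5f;

    out[iy * width + ix] = make_float4(
        tex2D(texA, x, y),
        tex2D(texB, x, y),
        tex2D(texC, x, y),
        tex2D(texD, x, y));
}
```

Both kernels issue the same number of fetches at the same (x, y). The only difference is whether the four images live in one layered array or in four separate arrays.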

I see that this call takes the integer layer index as the last argument in
its list:

tex2DLayered(
    texture<DataType, cudaTextureType2DLayered, readMode> texRef,
    float x, float y, int layer);
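
The host-side setup I have in mind for the layered case looks roughly like the sketch below. The helper name bindLayered and its parameters are hypothetical, and it again assumes the legacy texture reference API from pre-CUDA 12 toolkits.

```cuda
#include <cuda_runtime.h>

// Same kind of layered texture reference as in the signature above
// (hypothetical name; legacy API, removed in CUDA 12).
texture<float, cudaTextureType2DLayered, cudaReadModeElementType> texLayered;

// Hypothetical helper: allocates a layered array of `layers` float images of
// size width x height and binds it to texLayered with bilinear filtering.
// Filling the array with data (e.g. via cudaMemcpy3D) is left out.
cudaError_t bindLayered(cudaArray **arr, int width, int height, int layers)
{
    cudaChannelFormatDesc desc = cudaCreateChannelDesc<float>();

    // For a layered array, the extent's depth field is the number of layers.
    cudaError_t err = cudaMalloc3DArray(
        arr, &desc, make_cudaExtent(width, height, layers), cudaArrayLayered);
    if (err != cudaSuccess) return err;

    texLayered.filterMode     = cudaFilterModeLinear;  // bilinear within a layer
    texLayered.addressMode[0] = cudaAddressModeClamp;
    texLayered.addressMode[1] = cudaAddressModeClamp;
    texLayered.normalized     = false;                 // unnormalized coordinates

    return cudaBindTextureToArray(texLayered, *arr, desc);
}
```

With four separate textures, the equivalent setup would be four cudaMallocArray allocations and four bind calls instead of one of each.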

But there don’t seem to be any notes in the programming guide about the
performance of layered textures versus ordinary 2D textures.

Or is the layered texture API mainly there as a convenience for structuring code?