Your question actually has several layers. I’ll try to address them one by one.
1. Nested class
The name of a nested class is in the scope of the enclosing class. That is the only effect relevant here: declaring a nested class does not change the memory layout or alignment requirement of the enclosing class. For example:
struct X1 {
int p;
struct A {
double q;
};
};
Here sizeof(X1) is 4 and alignof(X1) is 4 (assuming a 4-byte int).
However, this is different:
struct X2 {
int p;
struct A {
double q;
} a; // --> Note this part
};
Now that A a is a data member of X2, sizeof(X2) is 16 and alignof(X2) is 8.
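The two cases above can be checked at compile time. The following is a minimal sketch that reuses X1 and X2 from above and phrases the checks relative to sizeof(int) and X2::A instead of hard-coding 4, 16 and 8, since those exact numbers depend on the platform:

```cpp
struct X1 {
    int p;
    struct A { double q; };   // nested type only, no member of type A
};

struct X2 {
    int p;
    struct A { double q; } a; // nested type plus a data member of that type
};

// Declaring X1::A adds nothing to the layout of X1.
static_assert(sizeof(X1) == sizeof(int), "X1 holds only the int");
static_assert(alignof(X1) == alignof(int), "X1 is aligned like int");

// The member a makes X2 at least as large and as strictly aligned as X2::A.
static_assert(sizeof(X2) >= sizeof(int) + sizeof(X2::A), "X2 holds p and a");
static_assert(alignof(X2) >= alignof(X2::A), "X2 picks up the alignment of A");
```

If any of these assumptions did not hold on a given compiler, the build would fail at the corresponding static_assert.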
2. Empty class size
The size of any object or member subobject is required to be at least 1 even if the type is an empty class type. This means:
struct X3 {
};
Both sizeof(X3) and alignof(X3) are 1. This is actually what happened with your second printf: S2 is an empty class (with S being its nested class), so its alignment requirement is 1.
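To illustrate the combination of the two points, here is a small sketch with hypothetical stand-ins for your types (Outer plays the role of S2 and Inner the role of S; the alignas(16) and the float array are assumptions made only for this example). The value 1 for an empty class is what mainstream compilers produce; the standard itself only requires the size to be nonzero:

```cpp
struct Outer {                  // hypothetical stand-in for S2
    struct alignas(16) Inner {  // hypothetical stand-in for S
        float v[4];
    };
    // No data members: Outer is an empty class.
};

// The over-aligned nested class does not leak into the enclosing class.
static_assert(sizeof(Outer) == 1, "empty class has size 1");
static_assert(alignof(Outer) == 1, "empty class has alignment 1");
static_assert(alignof(Outer::Inner) == 16, "the nested class keeps its own alignment");
```

Querying Outer::Inner, not Outer, is what reports the alignment of the nested type.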
3. std::aligned_storage
I’m not an expert in libcu++, but from the standpoint of C++, the way you use std::aligned_storage for an aligned buffer is incorrect. It should be:
std::aligned_storage<Len,Align>::type buf; // C++11
std::aligned_storage_t<Len,Align> buf; // since C++14
A sample implementation of std::aligned_storage is given here. The std::aligned_storage struct itself contains only a nested class and no data members, so its size and alignment requirement are both 1. This is what happened with your first printf.
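The difference between the trait and its nested type can be made concrete with a sketch. The template my_aligned_storage below is a hypothetical re-implementation modelled on the usual sample implementation, not the actual libcu++ or standard library code, and the values 64 and 16 are chosen only for illustration:

```cpp
#include <cstddef>

// Hypothetical re-implementation, for illustration only.
template <std::size_t Len, std::size_t Align>
struct my_aligned_storage {
    struct type {
        alignas(Align) unsigned char data[Len];
    };
};

using Wrapper = my_aligned_storage<64, 16>;        // the trait itself: an empty class
using Buffer  = my_aligned_storage<64, 16>::type;  // the actual buffer type

// The trait has no data members, so it behaves like any empty class.
static_assert(sizeof(Wrapper) == 1, "trait is an empty class");
static_assert(alignof(Wrapper) == 1, "trait has alignment 1");

// Only the nested type carries the requested size and alignment.
static_assert(sizeof(Buffer) == 64, "buffer has the requested length");
static_assert(alignof(Buffer) == 16, "buffer has the requested alignment");
```

Declaring a variable of type Wrapper therefore gives a 1-byte object, while a variable of type Buffer gives the 64-byte, 16-byte-aligned storage that was intended.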
Nit: std::aligned_storage is deprecated as of C++23. I guess libcu++ will take this into account and eventually deprecate it too. Consider simply implementing your own aligned buffer.
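A hand-written aligned buffer can be as small as an alignas byte array. The sketch below is one possible shape, not a recommendation from libcu++; the names AlignedBuffer, Vec4 and roundtrip are hypothetical and exist only for this example:

```cpp
#include <new>

// Hypothetical helper: raw storage with the size and alignment of T.
template <typename T>
struct AlignedBuffer {
    alignas(T) unsigned char bytes[sizeof(T)];
};

// Hypothetical over-aligned payload type.
struct alignas(16) Vec4 {
    float x, y, z, w;
};

static_assert(sizeof(AlignedBuffer<Vec4>) == sizeof(Vec4), "same size as T");
static_assert(alignof(AlignedBuffer<Vec4>) == alignof(Vec4), "same alignment as T");

// Constructs a Vec4 inside the buffer, reads a field back, then destroys it.
inline float roundtrip(float x) {
    AlignedBuffer<Vec4> buf;
    Vec4* v = ::new (static_cast<void*>(buf.bytes)) Vec4{x, 0.0f, 0.0f, 0.0f};
    float result = v->x;
    v->~Vec4();  // objects created with placement new are destroyed explicitly
    return result;
}
```

Unlike the trait-based approach, there is no nested type to forget: the buffer object itself is the storage.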
4. alignof operator
To conform to the C++ standard, you should pass a type rather than an object to alignof(T); applying it to an expression is a compiler extension. Consider using one of the following instead:
alignof(decltype(ss)); // C++
__alignof(ss); // CUDA built-in
__alignof(S2); // CUDA built-in
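As a quick check of the standard-conforming form, the sketch below uses a hypothetical type Sample and a hypothetical object ss (neither is your actual S2 or ss) to show that alignof(decltype(ss)) reports the alignment of the object's type:

```cpp
struct alignas(8) Sample {  // hypothetical type, for illustration only
    char c;
};

Sample ss;          // hypothetical object
Sample& ref = ss;   // reference to the same object

// decltype turns the object into its type, which alignof accepts.
static_assert(alignof(decltype(ss)) == alignof(Sample), "same as naming the type");
static_assert(alignof(decltype(ss)) == 8, "alignment requested by alignas");

// For a reference type, alignof yields the alignment of the referenced type.
static_assert(alignof(decltype(ref)) == alignof(Sample), "reference is looked through");
```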