I'm a new CUDA user and I need help with a problem involving string operations in CUDA. I need to sort strings, but I can't tell whether there is a way to do it with CUDA. Should I use a vector of char, or is there a prebuilt string type?
Well, I would say that some things are better done on the CPU. One thing to keep in mind: having each thread read a single byte (8 bits) on devices with compute capability 1.0/1.1 produces non-coalesced accesses to global memory.
That can be mitigated by using (string_length+3)/4 threads and copying to shared memory first, with each thread copying 4 bytes. Similarly, the threads' access patterns to shared memory can be rearranged to avoid bank conflicts. I'd say that if you know your hardware, this can be done.
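To make the (string_length+3)/4 idea concrete, here is a rough sketch of the staging step. It is only an illustration under some assumptions of mine: the buffers are 4-byte aligned and padded to a multiple of 4 bytes, one block handles one string, and the names stageString, roundTrip and MAX_LEN are made up for this example. The kernel just writes the staged words back out, where real code would do its string work.

```cuda
#include <cuda_runtime.h>
#include <cstring>
#include <vector>

// Hypothetical upper bound on the padded string length, in bytes.
constexpr int MAX_LEN = 256;

// One block handles one string. Thread t copies 4-byte word t, so
// (len + 3) / 4 threads move the whole string with word-sized loads.
// Assumes `in` and `out` are 4-byte aligned and padded to a multiple of 4.
__global__ void stageString(const unsigned int* in, unsigned int* out, int len)
{
    __shared__ unsigned int tile[MAX_LEN / 4];
    int words = (len + 3) / 4;
    int t = threadIdx.x;

    // Adjacent threads read adjacent 32-bit words from global memory.
    if (t < words) tile[t] = in[t];
    __syncthreads();

    // Placeholder for real work on the staged string: copy it back out.
    if (t < words) out[t] = tile[t];
}

// Host helper (hypothetical): sends a string through the kernel and back.
// Returns a zero-filled buffer if len is 0 or exceeds MAX_LEN.
std::vector<char> roundTrip(const char* s, int len)
{
    std::vector<char> result(len);
    int words = (len + 3) / 4;
    if (words == 0 || len > MAX_LEN) return result;

    size_t bytes = (size_t)words * 4;
    std::vector<char> padded(bytes, 0);   // zero padding up to a word boundary
    std::memcpy(padded.data(), s, len);

    unsigned int* dIn = nullptr;
    unsigned int* dOut = nullptr;
    cudaMalloc(&dIn, bytes);
    cudaMalloc(&dOut, bytes);
    cudaMemcpy(dIn, padded.data(), bytes, cudaMemcpyHostToDevice);

    stageString<<<1, words>>>(dIn, dOut, len);

    cudaMemcpy(padded.data(), dOut, bytes, cudaMemcpyDeviceToHost);
    cudaFree(dIn);
    cudaFree(dOut);

    std::memcpy(result.data(), padded.data(), len);
    return result;
}
```

Because thread t touches word t in shared memory, consecutive threads fall on consecutive banks; a byte-wise access pattern in the later processing step would need more care.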
One question is whether each thread should compare one byte, with the result of the strcmp() determined through a reduction step. That depends a lot on the expected string length: only long strings make this approach feasible.
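A minimal sketch of that per-byte comparison with a reduction could look like the following. The assumptions are mine: both strings are padded with '\0' to a common length n, a single block of 256 threads is used, and compareStrings, gpuCompare and THREADS are hypothetical names. Each thread looks for its first mismatching position, then a shared-memory min-reduction finds the earliest one overall.

```cuda
#include <cuda_runtime.h>
#include <algorithm>
#include <string>

constexpr int THREADS = 256;  // block size; must be a power of two here

// Compares a[0..n) with b[0..n). Writes -1, 0 or 1 to *result, like the
// sign of strcmp(). Must be launched with one block of THREADS threads.
__global__ void compareStrings(const unsigned char* a, const unsigned char* b,
                               int n, int* result)
{
    __shared__ int firstDiff[THREADS];
    int t = threadIdx.x;

    // Each thread scans positions t, t + THREADS, ... and remembers the
    // first mismatch it sees; n means "no mismatch found".
    int mine = n;
    for (int i = t; i < n; i += blockDim.x) {
        if (a[i] != b[i]) { mine = i; break; }
    }
    firstDiff[t] = mine;
    __syncthreads();

    // Min-reduction: the smallest mismatch index decides the comparison.
    for (int s = blockDim.x / 2; s > 0; s >>= 1) {
        if (t < s) firstDiff[t] = min(firstDiff[t], firstDiff[t + s]);
        __syncthreads();
    }

    if (t == 0) {
        int i = firstDiff[0];
        *result = (i == n) ? 0 : (a[i] < b[i] ? -1 : 1);
    }
}

// Host helper (hypothetical): pads both strings with '\0' to equal length.
int gpuCompare(const std::string& a, const std::string& b)
{
    int n = (int)std::max(a.size(), b.size());
    if (n == 0) return 0;
    std::string pa = a, pb = b;
    pa.resize(n, '\0');
    pb.resize(n, '\0');

    unsigned char* dA = nullptr;
    unsigned char* dB = nullptr;
    int* dRes = nullptr;
    cudaMalloc(&dA, n);
    cudaMalloc(&dB, n);
    cudaMalloc(&dRes, sizeof(int));
    cudaMemcpy(dA, pa.data(), n, cudaMemcpyHostToDevice);
    cudaMemcpy(dB, pb.data(), n, cudaMemcpyHostToDevice);

    compareStrings<<<1, THREADS>>>(dA, dB, n, dRes);

    int res = 0;
    cudaMemcpy(&res, dRes, sizeof(int), cudaMemcpyDeviceToHost);
    cudaFree(dA);
    cudaFree(dB);
    cudaFree(dRes);
    return res;
}
```

For short strings most of the 256 threads have nothing to do and the launch and copy overhead dominates, which is the reason only long strings make this worthwhile.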
Alternatively, you could have each thread compare and conditionally swap two strings. This, however, may quickly become branch divergent if the strings have different lengths. Ideally all of your strings are the same length (or padded with spaces), like, for example, a huge block of MD5 hashes in ASCII format.
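As a sketch of the compare-and-swap variant, the code below uses an odd-even transposition sort, which is my choice for illustration and not something suggested above. It assumes every string is padded with spaces to a fixed LEN of 32 bytes (the length of an MD5 hash in hex); oddEvenPass, sortFixed and LEN are hypothetical names. The host launches one pass per element, alternating between even and odd pairs.

```cuda
#include <cuda_runtime.h>
#include <string>
#include <vector>

constexpr int LEN = 32;  // fixed record length, e.g. an MD5 hash in hex

// One thread per pair: compares records i and i + 1 and swaps them if
// they are out of order. `phase` (0 or 1) selects even or odd pairs.
__global__ void oddEvenPass(char* data, int count, int phase)
{
    int pair = blockIdx.x * blockDim.x + threadIdx.x;
    int i = 2 * pair + phase;
    if (i + 1 >= count) return;

    char* x = data + (size_t)i * LEN;
    char* y = x + LEN;

    // Bytewise comparison over the fixed length, stopping at the first
    // difference.
    int cmp = 0;
    for (int k = 0; k < LEN && cmp == 0; ++k)
        cmp = (int)(unsigned char)x[k] - (int)(unsigned char)y[k];

    if (cmp > 0) {
        for (int k = 0; k < LEN; ++k) {
            char tmp = x[k]; x[k] = y[k]; y[k] = tmp;
        }
    }
}

// Host helper (hypothetical): pads each string with spaces to LEN bytes,
// sorts on the device and returns the padded, sorted strings.
// Assumes no input string is longer than LEN.
std::vector<std::string> sortFixed(std::vector<std::string> v)
{
    int count = (int)v.size();
    std::string flat;
    for (std::string& s : v) { s.resize(LEN, ' '); flat += s; }
    if (count < 2) return v;

    char* d = nullptr;
    cudaMalloc(&d, flat.size());
    cudaMemcpy(d, flat.data(), flat.size(), cudaMemcpyHostToDevice);

    int pairs = count / 2;
    int threads = 256;
    int blocks = (pairs + threads - 1) / threads;
    // Odd-even transposition sort needs `count` passes in the worst case.
    for (int pass = 0; pass < count; ++pass)
        oddEvenPass<<<blocks, threads>>>(d, count, pass & 1);

    cudaMemcpy(&flat[0], d, flat.size(), cudaMemcpyDeviceToHost);
    cudaFree(d);

    for (int i = 0; i < count; ++i) v[i] = flat.substr((size_t)i * LEN, LEN);
    return v;
}
```

With a fixed length every thread runs the same bounded loop, so the only divergence left comes from the early exit in the comparison and from whether a swap happens. The number of passes grows with the number of strings, so this is meant to show the idea, not to be fast.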