I was wondering if it is possible to write to a cudaArray without first writing to linear memory and then doing a device-to-device memory copy?
No, this is not currently possible. You have to do a memcpy.
Note that there is a performance bug with device-to-device memory copies in the current beta; performance will improve greatly in the next release.
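For reference, the two-step approach described above (write to linear memory from a kernel, then do a device-to-device copy into the cudaArray) might look roughly like the sketch below. It assumes a reasonably recent CUDA runtime and uses cudaMemcpy2DToArray for the copy; the kernel fillLinear and the helper writeArrayViaLinearMemory are hypothetical names made up for this illustration, and the read-back to the host is only there so the result can be checked.

```cuda
#include <cuda_runtime.h>
#include <vector>

// Hypothetical kernel: each thread writes one element of a plain linear buffer.
__global__ void fillLinear(float* out, int width, int height)
{
    int x = blockIdx.x * blockDim.x + threadIdx.x;
    int y = blockIdx.y * blockDim.y + threadIdx.y;
    if (x < width && y < height)
        out[y * width + x] = static_cast<float>(y * width + x);
}

// Hypothetical helper: fill linear device memory, copy it device-to-device
// into a cudaArray, then read the array back to the host for verification.
cudaError_t writeArrayViaLinearMemory(std::vector<float>& hostOut, int width, int height)
{
    float* dLinear = nullptr;
    cudaArray_t dArray = nullptr;
    const size_t rowBytes = static_cast<size_t>(width) * sizeof(float);
    cudaChannelFormatDesc desc = cudaCreateChannelDesc<float>();

    cudaError_t err = cudaMalloc(&dLinear, rowBytes * height);
    if (err != cudaSuccess) return err;

    err = cudaMallocArray(&dArray, &desc, width, height);
    if (err != cudaSuccess) { cudaFree(dLinear); return err; }

    // Step 1: the kernel writes its results to linear memory.
    dim3 block(16, 16);
    dim3 grid((width + block.x - 1) / block.x, (height + block.y - 1) / block.y);
    fillLinear<<<grid, block>>>(dLinear, width, height);
    err = cudaGetLastError();

    // Step 2: device-to-device copy from linear memory into the cudaArray.
    // Source pitch and copy width are both given in bytes.
    if (err == cudaSuccess)
        err = cudaMemcpy2DToArray(dArray, 0, 0, dLinear, rowBytes,
                                  rowBytes, height, cudaMemcpyDeviceToDevice);

    // Read the array back to the host (only needed here for checking).
    if (err == cudaSuccess) {
        hostOut.resize(static_cast<size_t>(width) * height);
        err = cudaMemcpy2DFromArray(hostOut.data(), rowBytes, dArray, 0, 0,
                                    rowBytes, height, cudaMemcpyDeviceToHost);
    }

    cudaFreeArray(dArray);
    cudaFree(dLinear);
    return err;
}
```

Calling writeArrayViaLinearMemory(out, 64, 32) with an empty std::vector<float> should return cudaSuccess and leave the array contents in out. In a real application the cudaArray would normally stay on the device and be bound to a texture rather than copied back.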
When can we expect to see a new release? I have a project that does several memory copies and could use that improvement.