LU Factorization in CUDA (Stand-alone)

I’m looking for a stand-alone implementation of a parallel LU factorization algorithm in CUDA C, that is, one that does not depend on cuBLAS, MAGMA, or similar libraries. I can’t install libraries of that kind on my system, and I can’t update it either; it currently has CUDA 4 installed.