Directionally Weighted Demosaicing on CUDA

Dear community members,

I recently implemented a simple 3x3 demosaicing (linear interpolation). It is quite fast, but the results are so-so, so I am thinking about implementing another method such as Directionally Weighted Demosaicing (with or without median filtering). Before I start from scratch, I wanted to ask whether any of you have experience with these algorithms and could give me some advice or even point me to some snippets.

Many thanks in advance.
Best, greg