Analyzing Genome Sequence Data on AWS with WekaFS and NVIDIA Clara Parabricks Pipelines

Originally published at: https://developer.nvidia.com/blog/analyzing-genome-sequence-data-on-aws-with-wekafs-and-clara-parabricks-pipelines/

Whole genome sequencing has become an important and foundational part of genomic research, enabling researchers to identify genetic signatures associated with diseases, differentiate sequencing errors from biological signals, and better characterize the genomes of various organisms. With the ongoing COVID-19 pandemic threatening the globe, characterizing and understanding genomes is now more crucial than ever. Commercially…

Thanks @jwitsoe! Great work by Robert and Bob! It is particularly interesting to see how data management is made easier, and especially how much better the performance is compared with CPU and with the upgrade to the newer version of Parabricks!

I greatly enjoyed working on this solution to accelerate genome analysis, especially given the ongoing pandemic, as I feel it has a very tangible and relevant impact on the world. If you have any questions or comments, let us know. Thanks, Jen, for your wonderful work as well!