Human_PAR pipeline error

I am trying the free trial version of Clara Parabricks v3.6.1. When I run the Human_PAR pipeline, it fails at the haplotypecaller step with "UnicodeDecodeError: 'utf-8' …". The error messages are below:
Traceback (most recent call last):
  File "/parabricks/run_pipeline.py", line 7, in <module>
    sys.exit(PB.pb_main())
  File "PB.pyx", line 1710, in PB.pb_main
  File "/parabricks/pbargs_check.py", line 550, in pbargs_check
    check_haplotypecaller(runArgs.runArgs)
  File "/parabricks/pbargs_check.py", line 179, in check_haplotypecaller
    check_human_par(runArgs)
  File "/parabricks/pbargs_check.py", line 76, in check_human_par
    chrom_reads = f.readlines()
  File "/usr/lib/python3.7/codecs.py", line 322, in decode
    (result, consumed) = self._buffer_decode(data, self.errors, final)
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xc5 in position 26: invalid continuation byte

Could not run haplotypecaller as part of human par germline pipeline
Exiting pbrun …

The Germline pipeline works fine. Has anyone else had the same problem?

Thanks

Hey @chumpol.nga,

Can you send the command that you ran to generate this error? At first glance, it looks like one of your input files could be corrupted. Have you checked for that?
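One rough way to check this is to scan each plain-text input for bytes that are not valid UTF-8, since the traceback shows `f.readlines()` failing while decoding a file as UTF-8. The sketch below is only an illustration: the helper name `first_invalid_utf8` is made up here, and the thread does not say which file `check_human_par` actually opens, so you would need to try it on each candidate. Compressed or binary files (for example `.gz`, `.bam`, `.cram`) are expected to be flagged, so the result only means something for files that should be plain text.

```python
import sys


def first_invalid_utf8(path):
    """Return (line_number, byte_offset, byte_value) for the first byte in
    `path` that is not valid UTF-8, or None if the whole file decodes.

    Hypothetical helper for illustration; it is not part of Parabricks.
    """
    offset = 0  # byte offset of the start of the current line
    with open(path, "rb") as handle:
        # Iterating a binary file splits on b"\n", which never occurs inside
        # a multi-byte UTF-8 sequence, so each line can be decoded on its own.
        for line_number, raw_line in enumerate(handle, start=1):
            try:
                raw_line.decode("utf-8")
            except UnicodeDecodeError as err:
                return line_number, offset + err.start, raw_line[err.start]
            offset += len(raw_line)
    return None


if __name__ == "__main__":
    # Pass the files to check as command-line arguments.
    for candidate in sys.argv[1:]:
        print(candidate, first_invalid_utf8(candidate))
```

A result of `None` means the file decodes cleanly as UTF-8. Otherwise the tuple gives the line, the absolute byte offset, and the offending byte value, which can be compared with the byte and position reported in the error message.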

Thank you

Hi @gburnett,
I have tested this with my own data and with the example files from Parabricks. The command I used is as follows:

pbrun human_par --ref Homo_sapiens_assembly38.fasta --tmp-dir /raid/scratch --in-fq sample_1.fq.gz sample_2.fq.gz "@RG\tID:sample1\tLB:lib1\tPL:PL1\tSM:sample1\tPU:unit1" --knownSites Homo_sapiens_assembly38.known_indels.vcf.gz --range-male 1-10 --range-female 150-250 --out-bam sample1.cram --gvcf --out-variants sample1.par.g.vcf.gz --out-recal-file sample1.recal.txt 2>&1 | tee sample1_PAR_output.log

I have also attached the output log to this message.

Regards,
Chumpol

sample1_PAR_output.log (9.65 KB)

Hey @chumpol.nga,

I am still debugging this. When I ran the code, I got a different error. I will talk to the engineering team about it.