Hello everyone,
I am currently working with GenMol, which requires the SAFE format for molecular generation. I have been using the code below to encode a molecule, but I haven't been able to reproduce the molecule sequence shown on the GenMol website. Could anyone help me troubleshoot or clarify the correct approach?
https://build.nvidia.com/nvidia/genmol-generate
import datamol as dm
import safe as sf
# Baricitinib
smiles = "CCS(=O)(=O)N1CC(C1)(CC#N)N2C=C(C=N2)C3=C4C=CNC4=NC=N3"
mol = dm.to_mol(smiles)
safe_str = sf.encode(mol)
print(safe_str)
print(f"Representation using {len(safe_str.split('.'))} fragments")
This is the output of the code:
c18ncnc2[nH]ccc12.N15CC67C1.n17cc8cn1.CCS5(=O)=O.C6C#N
But the molecule sequence shown on the GenMol website is:
C124CN3C1.S3(=O)(=O)CC.C4C#N.[*{20-20}]
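To compare the two strings without any chemistry toolkit, here is a rough sketch that counts fragments and looks for ring-closure labels that appear an odd number of times. In SAFE, fragments are joined through shared ring-closure numbers, so an unpaired label suggests an open attachment point. Note that `unpaired_ring_labels` is a hypothetical helper written for this post, not part of `safe` or `datamol`, and reading the unpaired label as the place where the `[*{20-20}]` token attaches is my assumption rather than something confirmed by the GenMol documentation.

```python
import re
from collections import Counter


def unpaired_ring_labels(safe_str):
    """Return ring-closure labels that occur an odd number of times.

    Hypothetical helper for illustration only; it is plain text
    processing and does not validate the chemistry.
    """
    # Drop bracket atoms such as [nH] or [*{20-20}] so that digits
    # inside them are not mistaken for ring-closure labels.
    stripped = re.sub(r"\[[^\]]*\]", "", safe_str)
    # Ring-closure labels are single digits or %NN two-digit labels.
    labels = re.findall(r"%\d{2}|\d", stripped)
    counts = Counter(labels)
    return sorted(label for label, n in counts.items() if n % 2 == 1)


# The two strings from this post.
local = "c18ncnc2[nH]ccc12.N15CC67C1.n17cc8cn1.CCS5(=O)=O.C6C#N"
genmol = "C124CN3C1.S3(=O)(=O)CC.C4C#N.[*{20-20}]"

for name, s in [("sf.encode", local), ("GenMol site", genmol)]:
    print(name, len(s.split(".")), "fragments, unpaired:", unpaired_ring_labels(s))
```

With these inputs, the `sf.encode` output has 5 fragments and no unpaired labels, while the website string has 4 dot-separated parts and one unpaired label (`2`). So the website string looks like only part of the molecule (the azetidine, ethylsulfonyl and cyanomethyl pieces) plus a placeholder token, rather than a full encoding of Baricitinib.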
I would truly appreciate any guidance in this regard!
Best regards,
Negar