Minimize energy

Energy minimization of a ligand in a protein pocket as used in the PoseBusters/DynamicBind paper.

This code is based on the OpenMM user guide: http://docs.openmm.org/latest/userguide

class posebench.models.minimize_energy.NoHydrogen[source]

Select class that filters out hydrogen atoms.

accept_atom(atom: Any) bool[source]

Return True if the atom element is not H or D.

posebench.models.minimize_energy.add_ligand_to_complex(modeller: Modeller, ligand: Molecule) Modeller[source]

Add a ligand to a protein-ligand complex.

Parameters:
  • modeller – The Modeller object for the protein-ligand complex.

  • ligand – The prepared ligand.

Returns:

The updated Modeller object for the protein-ligand complex.

posebench.models.minimize_energy.compute_ligand_local_geometry_violations(ref_input_sdf_filepath: str, input_sdf_filepath: str | None) int[source]

Computes the number of ligand local geometry violations in the SDF file.

Parameters:
  • ref_input_sdf_filepath – Path to the reference (e.g., original) ligand input SDF file.

  • input_sdf_filepath – Path to the optional latest ligand input SDF file.

Returns:

The cumulative number of ligand local geometry violations.

posebench.models.minimize_energy.compute_protein_local_geometry_violations(input_pdb_filepath: str) tuple[int, list[int]][source]

Computes the number of protein local geometry violations in the PDB file.

Parameters:

input_pdb_filepath – Path to the latest protein input PDB file.

Returns:

A tuple containing the number of protein local geometry violations and the list of residue indices with associated violations.

posebench.models.minimize_energy.deserialize(path: str) System[source]

Deserialize an OpenMM system from a file.

Parameters:

path – Path to the file to deserialize the system from.

Returns:

The deserialized system.

posebench.models.minimize_energy.generate_conformer(mol: Mol)[source]

Generate an RDKit conformer for a molecule.

Parameters:

mol – The molecule for which to generate an RDKit conformer.

posebench.models.minimize_energy.generate_system(modeller: Modeller, ligands: Molecule, num_particles_protein: int, num_particles_total: int, name: str, force_fields: list[str], relax_protein: bool, cache_dir: Path | None = None) System[source]

Generate an OpenMM system for the protein-ligand complex.

Parameters:
  • modeller – The Modeller object for the protein-ligand complex.

  • ligands – The prepared ligands.

  • num_particles_protein – The number of particles in the protein.

  • num_particles_total – The total number of particles in the complex.

  • name – The name of the system.

  • force_fields – The force fields to use.

  • relax_protein – Whether or not to relax the protein.

  • cache_dir – The directory to cache the system in.

Returns:

The generated OpenMM system.

posebench.models.minimize_energy.get_all_ligand_non_hydrogen_atoms(input_sdf_filepath: str | None, ref_input_sdf_filepath: str | None, input_mol: Mol | None = None, ref_mol: Mol | None = None) tuple[ndarray, list[list[bool]]][source]

Get all non-hydrogen atoms in the ligand SDF file.

Parameters:
  • input_sdf_filepath – Path to the latest ligand input SDF file.

  • ref_input_sdf_filepath – Path to the reference (e.g., original) ligand input SDF file.

  • input_mol – The optional RDKit molecule to directly use for the input ligand.

  • ref_mol – The optional RDKit molecule to directly use for the reference ligand.

Returns:

A tuple containing the NumPy ligand atom coordinates and the list-of-lists ligand adjacency matrix.

posebench.models.minimize_energy.get_all_protein_atoms(input_pdb_filepath: str) list[Any][source]

Returns a list of all protein atoms in the PDB file.

Parameters:

input_pdb_filepath – Path to the input PDB file.

Returns:

A list of all protein atoms in the PDB file.

posebench.models.minimize_energy.get_fastest_platform() Platform[source]

Get the fastest available OpenMM platform.

Returns:

The fastest available OpenMM platform.

posebench.models.minimize_energy.load_molecule(mol_path: Path, **kwargs) Molecule[source]

Load a molecule from a file.

Parameters:
  • mol_path – Path to the molecule file.

  • kwargs – Additional keyword arguments to pass to the Molecule.from_file method.

Returns:

The loaded molecule.

posebench.models.minimize_energy.main(cfg: DictConfig)[source]

Minimize the energy of a ligand in a protein pocket.

posebench.models.minimize_energy.minimize_energy(cfg: DictConfig)[source]

Minimize the energy of a ligand in a protein pocket using OpenMM.

posebench.models.minimize_energy.optimize_ligand_in_pocket(protein_file: Path, ligand_file: Path, output_file: Path | None = None, protein_output_file: Path | None = None, complex_output_file: Path | None = None, protein_gap_mask: str | None = None, ligand_stiffness: float = 3000.0, protein_stiffness: float = 1000.0, protein_residue_set: str = 'non_hydrogen', tolerance: float = 0.01, allow_undefined_stereo: bool = True, prep_only: bool = False, temp_dir: Path = PosixPath('.'), name: str | None = None, add_solvent: bool = False, relax_protein: bool = False, remove_initial_protein_hydrogens: bool = False, assign_each_ligand_unique_force: bool = False, model_ions: bool = False, cache_files: bool = True, assign_partial_charges_manually: bool = False, report_initial_energy_only: bool = False, platform_name: str = 'fastest', cuda_device_index: int = 0, max_iterations: int = 0, energy: float = Unit({BaseUnit(base_dim=BaseDimension('amount'), name='mole', symbol='mol'): -1.0, ScaledUnit(factor=4.184, master=kilojoule, name='kilocalorie', symbol='kcal'): 1.0}), length: float = Unit({BaseUnit(base_dim=BaseDimension('length'), name='angstrom', symbol='A'): 1.0})) dict[str, Any][source]

Optimize a ligand in a protein pocket using OpenMM.

Parameters:
  • protein_file – Path to the protein file.

  • ligand_file – Path to the ligand file.

  • output_file – Path to save the optimized ligand to.

  • protein_output_file – Path to save the optimized protein to.

  • complex_output_file – Path to save the optimized protein-ligand complex to.

  • protein_gap_mask – Mask of the protein regions in the PDB file for which to skip relaxation, represented as a sequence-length string.

  • ligand_stiffness – Stiffness of the ligand chains.

  • protein_stiffness – Stiffness of the protein chains.

  • protein_residue_set – Residue set to use for relaxation. Can be either non_hydrogen or c_alpha.

  • tolerance – The tolerance to use for energy minimization.

  • allow_undefined_stereo – Whether or not to allow undefined stereochemistry.

  • prep_only – Whether or not to only prepare the ligand and protein.

  • temp_dir – The temporary directory to use.

  • name – The name of the system.

  • add_solvent – Whether or not to add solvent to the protein.

  • relax_protein – Whether or not to relax the protein.

  • remove_initial_protein_hydrogens – Whether or not to remove the initial protein hydrogens.

  • assign_each_ligand_unique_force – Whether to assign each ligand a unique force constant.

  • model_ions – Whether or not to model ions.

  • cache_files – Whether or not to cache the prepared files.

  • assign_partial_charges_manually – Whether or not to assign partial charges manually.

  • report_initial_energy_only – Whether or not to report the initial energy only.

  • platform_name – The name of the OpenMM platform to use.

  • cuda_device_index – The index of the CUDA device to use.

  • max_iterations – The maximum number of iterations to use for energy minimization.

  • energy – When relaxing the protein, the unit of energy to use.

  • length – When relaxing the protein, the unit of length to use.

Returns:

A dictionary containing the energy before and after minimization, and the optimized ligand.

posebench.models.minimize_energy.prep_ligand(ligand_file: Path, temp_file: Path, relax_protein: bool, allow_undefined_stereo: bool = True, mol: Mol | None = None) tuple[Molecule, str | None][source]

Prepare a ligand for use in OpenMM.

Parameters:
  • ligand_file – Path to the ligand file.

  • temp_file – Path to the temporary file to save the prepared ligand to.

  • relax_protein – Whether or not to relax the protein.

  • allow_undefined_stereo – Whether or not to allow undefined stereochemistry.

  • mol – The optional RDKit molecule to directly use for the ligand.

Returns:

The prepared ligand.

posebench.models.minimize_energy.prep_protein(protein_file: Path, temp_file: Path, relax_protein: bool, remove_initial_protein_hydrogens: bool, add_solvent: bool = False) PDBFile[source]

Prepare a protein for use in OpenMM.

Parameters:
  • protein_file – Path to the protein file.

  • temp_file – Path to the temporary file to save the prepared protein to.

  • relax_protein – Whether or not to relax the protein.

  • remove_initial_protein_hydrogens – Whether or not to remove the initial protein hydrogens.

  • add_solvent – Whether or not to add solvent to the protein.

Returns:

The prepared protein.

posebench.models.minimize_energy.randomly_rotate_protein_side_chains(structure: Any, violation_residue_idx: list[int], min_angle: float = 0.6, max_angle: float = 3.141592653589793, num_chi_groups: int = 5, eps: float = 1e-06) Any[source]

Randomly rotate side chains of the protein residues with local geometry violations.

Parameters:
  • structure – Structure to rotate.

  • violation_residue_idx – List of residue indices with local geometry violations.

  • min_angle – Minimum angle to rotate.

  • max_angle – Maximum angle to rotate.

  • num_chi_groups – Number of side chain groups.

  • eps – Epsilon value.

posebench.models.minimize_energy.remove_hydrogens_from_pdb(input_pdb_filepath: str, output_pdb_filepath: str)[source]

Remove hydrogen atoms from a PDB file.

Parameters:
  • input_pdb_filepath – Path to the input PDB file.

  • output_pdb_filepath – Path to the output PDB file.

posebench.models.minimize_energy.remove_mol_hydrogens_and_reorder(mol: Mol) Mol[source]

Remove hydrogens from a molecule and re-order the atoms.

Parameters:

mol – The molecule to process.

Returns:

The processed molecule.

posebench.models.minimize_energy.save_biopython_structure_to_pdb(structure: Any, output_pdb_filepath: str, ca_only: bool = False)[source]

Saves a BioPython protein Structure objects to a PDB file.

Parameters:
  • structure – Structure to save.

  • output_pdb_filepath – Path to the input PDB file.

  • ca_only – Whether to save only the CA atoms or not.

posebench.models.minimize_energy.save_with_rdkit(molecule: Molecule, file_path: Path, conformer_index: int = 0, name: str | None = None, mol: Mol | None = None)[source]

Save a molecule to a file using RDKit.

Parameters:
  • molecule – The molecule to save.

  • file_path – The path to save the molecule to.

  • conformer_index – The index of the conformer to save.

  • name – The name to give the molecule.

  • mol – The optional RDKit molecule to directly use for the ligand.

posebench.models.minimize_energy.serialize(system: System, path: str) str[source]

Serialize an OpenMM system to a file.

Parameters:
  • system – The system to serialize.

  • path – Path to the file to serialize the system to.

Returns:

The path to the serialized system.

posebench.models.minimize_energy.setup_simulation(modeller: ~openmm.app.modeller.Modeller, system: ~openmm.openmm.System, platform: ~openmm.openmm.Platform, platform_properties: dict[str, ~typing.Any], temperature: float = Quantity(value=300.0, unit=kelvin), friction_coeff: float = Quantity(value=1.0, unit=/picosecond), step_size: float = Quantity(value=0.002, unit=picosecond)) Simulation[source]

Set up an OpenMM simulation for the protein-ligand complex.

Parameters:
  • modeller – The Modeller object for the protein-ligand complex.

  • system – The OpenMM system for the protein-ligand complex.

  • platform – The OpenMM platform to use.

  • platform_properties – The properties to use with the OpenMM platform.

  • temperature – The temperature to use with a LangevinIntegrator.

  • friction_coeff – The friction coefficient to use with a LangevinIntegrator.

  • step_size – The step size to use with a LangevinIntegrator.

Returns:

The setup OpenMM simulation.

posebench.models.minimize_energy.will_restrain(atom: Atom, residue_set: str) bool[source]

Returns True if the atom will be restrained by the given restraint set.

Parameters:
  • atom – Atom to check.

  • residue_set – Residue set to use for checking.