Non-protein electron transfer active moieties
Introduction
Frequently, protein crystal structures contain residues which are not amino acids, and do not belong to the polypeptide chain(s). Many of these residues can play significant roles in electron/hole transfer. PyeMap automatically identifies non-protein electron/hole transfer (ET) active moieties, and gives users the option to include them in the analysis. In the current implementation, non-protein ET active moieties identified by PyeMap are non-amino acid aromatic sites, extended conjugated systems, and a pre-defined list of metal clusters and redox active metal ions. For a given non-standard co-factor (e.g., flavin adenine dinucleotide), there can be multiple non-protein ET active moieties identified by PyeMap, and they will appear as separate nodes on the graph if selected for analysis.
Identification
Aromatic moieties and extended conjugated chains
After initial parsing, non-protein residues are analyzed for detection of ET active moieties. For each non-standard residue, a chemical graph is constructed using the NetworkX library, consisting of the O, C, N, P and S atoms in the residue. To isolate the conjugated systems, an edge is only drawn between two atoms j and k if:
\[ r_{jk} \leq \bar{x} - 2\sigma_{\bar{x}} \]
where \(\bar{x}\) is the mean single-bond distance between those two elements, and \(\sigma_{\bar{x}}\) is its standard deviation. If there are any conjugated systems, the resulting chemical graph will be a set of connected-component subgraphs. Each subgraph that contains a cycle, or consists of 10 or more atoms, is considered a non-protein ET active moiety and can be selected for the analysis.
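The selection procedure described above can be sketched in a few lines. This is a minimal illustration, not PyeMap's implementation: plain dictionaries stand in for the NetworkX graph so the sketch has no dependencies, the bond-length statistics are placeholder values, and the names `build_conjugated_graph`, `et_active_moieties`, `n_sigma` and `min_atoms` are hypothetical.

```python
import math
from itertools import combinations

# Illustrative (mean, standard deviation) single-bond lengths in angstroms.
# Placeholder values for this sketch, not PyeMap's reference data.
SINGLE_BOND_STATS = {
    frozenset({"C"}): (1.54, 0.02),       # C-C
    frozenset({"C", "N"}): (1.47, 0.02),  # C-N
    frozenset({"C", "O"}): (1.43, 0.02),  # C-O
}


def build_conjugated_graph(atoms, n_sigma=2.0):
    """Return an adjacency dict with edges only for shorter-than-single bonds.

    atoms is a list of (name, element, (x, y, z)) tuples.
    n_sigma is an assumed cutoff multiplier on the standard deviation.
    """
    adjacency = {name: set() for name, _, _ in atoms}
    for (n1, e1, p1), (n2, e2, p2) in combinations(atoms, 2):
        stats = SINGLE_BOND_STATS.get(frozenset({e1, e2}))
        if stats is None:
            continue  # element pair not covered by the placeholder table
        mean, sd = stats
        if math.dist(p1, p2) <= mean - n_sigma * sd:
            adjacency[n1].add(n2)
            adjacency[n2].add(n1)
    return adjacency


def connected_components(adjacency):
    """Yield each connected component as a set of atom names."""
    seen = set()
    for start in adjacency:
        if start in seen:
            continue
        component, stack = set(), [start]
        while stack:
            node = stack.pop()
            if node not in component:
                component.add(node)
                stack.extend(adjacency[node] - component)
        seen |= component
        yield component


def et_active_moieties(adjacency, min_atoms=10):
    """Keep components that contain a cycle or have at least min_atoms atoms."""
    selected = []
    for component in connected_components(adjacency):
        n_edges = sum(len(adjacency[a]) for a in component) // 2
        # A connected graph without cycles has exactly (nodes - 1) edges.
        has_cycle = n_edges >= len(component)
        if has_cycle or len(component) >= min_atoms:
            selected.append(component)
    return selected


# Toluene-like toy input: an aromatic ring (1.39 A bonds) plus a methyl carbon
# attached through an ordinary 1.54 A single bond.
ring = [
    (f"C{k + 1}", "C", (1.39 * math.cos(math.radians(60 * k)),
                        1.39 * math.sin(math.radians(60 * k)), 0.0))
    for k in range(6)
]
atoms = ring + [("C7", "C", (1.39 + 1.54, 0.0, 0.0))]
moieties = et_active_moieties(build_conjugated_graph(atoms))
print(moieties)
```

In this toy input only the six ring carbons are kept: the ring bonds fall below the cutoff and close a cycle, while the methyl carbon sits at a normal single-bond distance and ends up as an isolated node.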
The structures generated by this procedure are not always correct, often because of poor resolution in the PDB structure. Two functions attempt to correct the chemical graph so that proper structures are generated. The first is cleanup_bonding(), which connects atoms that are within experimental single-bond lengths of each other and have only 1 or 2 neighbors; this often fixes broken aromaticity. The second is remove_side_chains(), which recursively removes non-aromatic side chains from aromatic moieties.
The final chemical graph is used to construct a SMILES string with the pysmiles package, which can be visualized using standard cheminformatics tools such as RDKit.
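To make the two correction steps concrete, here is a rough sketch of the ideas behind them. It is not the code of cleanup_bonding() or remove_side_chains(): the function names `connect_close_atoms` and `prune_side_chains`, the single `max_single_bond` cutoff, and the iterative (rather than recursive) pruning are all assumptions made for illustration.

```python
import math
from itertools import combinations


def connect_close_atoms(adjacency, coords, max_single_bond=1.6):
    """Add edges between unbonded atoms that lie within max_single_bond
    angstroms of each other and each currently have only 1 or 2 neighbors."""
    for a, b in combinations(sorted(adjacency), 2):
        if b in adjacency[a]:
            continue
        if len(adjacency[a]) in (1, 2) and len(adjacency[b]) in (1, 2):
            if math.dist(coords[a], coords[b]) <= max_single_bond:
                adjacency[a].add(b)
                adjacency[b].add(a)


def prune_side_chains(adjacency):
    """Repeatedly strip atoms with at most one neighbor, so that only atoms
    on rings (or bridging between rings) remain."""
    leaves = [a for a, nbrs in adjacency.items() if len(nbrs) <= 1]
    while leaves:
        atom = leaves.pop()
        if atom not in adjacency:
            continue  # already removed
        for nbr in adjacency.pop(atom):
            adjacency[nbr].discard(atom)
            if len(adjacency[nbr]) <= 1:
                leaves.append(nbr)


# Toy input: a six-membered ring whose C1-C6 bond was missed, plus a
# two-atom side chain (C7, C8) hanging off C1.
coords = {
    f"C{k + 1}": (1.39 * math.cos(math.radians(60 * k)),
                  1.39 * math.sin(math.radians(60 * k)), 0.0)
    for k in range(6)
}
coords["C7"] = (2.79, 0.0, 0.0)
coords["C8"] = (4.19, 0.0, 0.0)

bonds = [("C1", "C2"), ("C2", "C3"), ("C3", "C4"), ("C4", "C5"),
         ("C5", "C6"), ("C1", "C7"), ("C7", "C8")]
adjacency = {name: set() for name in coords}
for a, b in bonds:
    adjacency[a].add(b)
    adjacency[b].add(a)

connect_close_atoms(adjacency, coords)  # closes the ring via C1-C6
ring_closed = "C6" in adjacency["C1"]
prune_side_chains(adjacency)            # drops C8, then C7
print(sorted(adjacency))
```

Running the bonding step before the pruning step matters in this sketch: with the ring still open, leaf pruning would strip the whole chain atom by atom.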
Clusters
The PyeMap repository contains a list of 66 inorganic clusters, which are automatically identified by their three-character residue names. All atoms in the residue are collected as part of the customized residue object, and a pre-rendered image is used to visualize the chemical structure. Otherwise, clusters can be used and interacted with just like any other residue. The list of clusters and the pre-rendered images were obtained from the Protein Data Bank in Europe (PDBe).
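Identification by residue name amounts to a set lookup. The sketch below assumes a hypothetical helper `is_cluster` and uses three PDB ligand IDs for iron-sulfur clusters as a stand-in; it does not reproduce PyeMap's 66-entry list, and whether these particular IDs appear in that list is an assumption.

```python
# Illustrative subset of PDB ligand IDs for iron-sulfur clusters.
# PyeMap's actual 66-entry list is not reproduced here.
CLUSTER_RESNAMES = {"SF4", "FES", "F3S"}


def is_cluster(resname):
    """Return True if the three-character residue name is a known cluster."""
    return resname.strip().upper() in CLUSTER_RESNAMES


hetero_residues = ["SF4", "FAD", "HOH", "FES"]
clusters = [name for name in hetero_residues if is_cluster(name)]
print(clusters)
```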
Redox-active Metal ions
PyeMap automatically identifies a set of redox-active metal ions to include as residues in the graph.
| Element | Charges |
|---|---|
| Cu | +1, +2, +3 |
| Fe | +2, +3 |
| Mn | +2, +3 |
| Co | +2, +3 |
| Mo | 0, +4, +6 |
| Ni | +2 |
| Cr | +3 |
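The table above translates directly into a lookup. This is a sketch only: the helper name `is_redox_active_ion` is hypothetical, and how PyeMap itself reads elements and charges from a structure file is not covered here.

```python
# Element -> set of charges, transcribed from the table above.
REDOX_ACTIVE_CHARGES = {
    "CU": {1, 2, 3},
    "FE": {2, 3},
    "MN": {2, 3},
    "CO": {2, 3},
    "MO": {0, 4, 6},
    "NI": {2},
    "CR": {3},
}


def is_redox_active_ion(element, charge):
    """Return True if the (element, charge) pair appears in the table."""
    return charge in REDOX_ACTIVE_CHARGES.get(element.strip().upper(), set())


print(is_redox_active_ion("Fe", 3))
print(is_redox_active_ion("Zn", 2))
```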
Visualization
Chemical structures of residues (not including user-specified residues) can be visualized using the residue_to_Image() and init_graph_to_Image() functions.
SMILES strings and NGL Viewer selection strings are also accessible through the emap object. Note that SMILES strings are
not available for clusters and user-specified residues.
Source
Finds conjugated systems within a BioPython residue object and returns them as individual customized BioPython Residue objects.
Identifies and returns customized Bio.PDB.Residue objects corresponding to electron transfer active moieties.
Connects nodes that should be connected to fix broken aromaticity.
Removes non-aromatic side chains on aromatic ET active moieties.