Non-protein electron transfer active moieties

Introduction

Protein crystal structures frequently contain residues that are not amino acids and do not belong to the polypeptide chain(s). Many of these residues can play significant roles in electron/hole transfer. PyeMap automatically identifies non-protein electron/hole transfer (ET) active moieties and gives users the option to include them in the analysis. In the current implementation, the non-protein ET active moieties identified by PyeMap are non-amino acid aromatic sites, extended conjugated systems, and a pre-defined list of metal clusters and redox-active metal ions. For a given non-standard co-factor (e.g., flavin adenine dinucleotide), PyeMap may identify multiple non-protein ET active moieties, and each will appear as a separate node on the graph if selected for analysis.

Identification

Aromatic moieties and extended conjugated chains

After initial parsing, non-protein residues are analyzed for detection of ET active moieties. For each non-standard residue, a chemical graph is constructed using the NetworkX library, consisting of the O, C, N, P and S atoms in the residue. To isolate the conjugated systems, an edge is only drawn between two atoms j and k if:

\[r_{jk} \leq \bar{x} - 2\sigma_{\bar{x}}\]

where \(r_{jk}\) is the distance between atoms j and k, \(\bar{x}\) is the mean single-bond distance between those two elements, and \(\sigma_{\bar{x}}\) is its standard deviation. Because only contacts noticeably shorter than a typical single bond are kept, ordinary single bonds are cut, and any conjugated systems remain as separate connected component subgraphs of the chemical graph. Each subgraph that contains a cycle or consists of 10 or more atoms is considered a non-protein ET active moiety and can be selected for the analysis.
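The edge criterion and the selection rule can be sketched in a few lines. The example below is a dependency-free illustration in plain Python rather than NetworkX, and it is not PyeMap's code: the function names, the input layout, and in particular the bond statistics are assumptions made up for this sketch.

```python
from collections import defaultdict

# Illustrative (mean, standard deviation) single-bond statistics in angstroms.
# Placeholder values for this sketch only; they are NOT PyeMap's parameters.
SINGLE_BOND_STATS = {
    frozenset({"C"}): (1.53, 0.02),
    frozenset({"C", "N"}): (1.47, 0.02),
}


def keeps_edge(elem_j, elem_k, r_jk):
    """Apply r_jk <= mean - 2 * sigma for the given element pair."""
    mean, sigma = SINGLE_BOND_STATS[frozenset({elem_j, elem_k})]
    return r_jk <= mean - 2 * sigma


def find_moieties(elements, contacts, min_atoms=10):
    """Return atom sets of components that contain a cycle or are large enough.

    elements: dict mapping atom name -> element symbol
    contacts: iterable of (atom_j, atom_k, distance) for candidate bonded pairs
    """
    adjacency = defaultdict(set)
    for j, k, r_jk in contacts:
        if keeps_edge(elements[j], elements[k], r_jk):
            adjacency[j].add(k)
            adjacency[k].add(j)

    seen, moieties = set(), []
    for start in elements:
        if start in seen:
            continue
        # Depth-first search collects one connected component.
        component, stack = set(), [start]
        while stack:
            atom = stack.pop()
            if atom in component:
                continue
            component.add(atom)
            stack.extend(adjacency[atom] - component)
        seen |= component
        # Each undirected edge is counted twice in the adjacency sets.
        n_edges = sum(len(adjacency[a]) for a in component) // 2
        # A connected graph has a cycle exactly when it has at least as many
        # edges as nodes.
        has_cycle = n_edges >= len(component)
        if has_cycle or len(component) >= min_atoms:
            moieties.append(component)
    return moieties


# Toy input: a six-membered carbon ring (1.39 A bonds) with a methyl group
# attached through a longer, single-bond-like contact (1.51 A).
ring = [f"C{i}" for i in range(1, 7)]
elements = {name: "C" for name in ring + ["C7"]}
contacts = [(ring[i], ring[(i + 1) % 6], 1.39) for i in range(6)]
contacts.append(("C1", "C7", 1.51))

moieties = find_moieties(elements, contacts)
```

With these placeholder statistics the cutoff for a C-C pair is about 1.49 angstroms, so the ring bonds are kept, the contact to the methyl carbon is dropped, and only the ring is reported as a moiety.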

As one would expect, the structures generated by this procedure are not always correct, often because of the limited resolution of the PDB structure. PyeMap provides two functions that try to correct the chemical graph in order to generate proper structures. The first is cleanup_bonding(), which connects atoms that lie within experimental single-bond lengths of each other and have only 1 or 2 neighbors. This function often fixes broken aromaticity. The second is remove_side_chains(), which recursively removes non-aromatic side chains from aromatic moieties.
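To make the idea of recursive side-chain removal concrete, the following sketch repeatedly strips atoms that have at most one neighbor until only ring atoms (and atoms bridging rings) remain. This is one plausible way to prune dangling chains, written in plain Python for illustration; the function name prune_side_chains and the adjacency-dict representation are assumptions, and the code is not the implementation of remove_side_chains().

```python
def prune_side_chains(adjacency):
    """Return a copy of the graph with all dangling (non-ring) atoms removed.

    adjacency: dict mapping atom name -> set of neighboring atom names
    """
    graph = {atom: set(nbrs) for atom, nbrs in adjacency.items()}
    # Start from every atom that currently has zero or one neighbor.
    leaves = [atom for atom, nbrs in graph.items() if len(nbrs) <= 1]
    while leaves:
        atom = leaves.pop()
        if atom not in graph:
            continue  # already removed via another path
        for nbr in graph.pop(atom):
            graph[nbr].discard(atom)
            # Removing a leaf may expose its neighbor as a new leaf.
            if len(graph[nbr]) <= 1:
                leaves.append(nbr)
    return graph


# Toy graph: a six-membered ring C1..C6 with a two-atom chain C1-C7-C8.
ring = [f"C{i}" for i in range(1, 7)]
toy = {name: set() for name in ring + ["C7", "C8"]}
bonds = [(ring[i], ring[(i + 1) % 6]) for i in range(6)]
bonds += [("C1", "C7"), ("C7", "C8")]
for a, b in bonds:
    toy[a].add(b)
    toy[b].add(a)

core = prune_side_chains(toy)
```

In the toy graph, C8 is removed first, which turns C7 into a leaf that is removed in turn; the ring atoms all keep two neighbors and survive.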

The final chemical graph is used to construct a SMILES string using the pysmiles package, which can be visualized using standard cheminformatics tools such as RDKit.

Clusters

The PyeMap repository contains a list of 66 inorganic clusters, which are automatically identified by their three-character residue names. All atoms in the residue are collected as part of the customized residue object, and a pre-rendered image is used to visualize the chemical structure. Otherwise, clusters can be used and interacted with just like any other residue. The list of clusters and the pre-rendered images were obtained from the Protein Data Bank in Europe (PDBe).
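Identification by residue name amounts to a set lookup. The sketch below uses a three-entry example subset: SF4, FES and F3S are PDB residue codes for iron-sulfur clusters and are only assumed here to be on PyeMap's list. The names EXAMPLE_CLUSTER_NAMES and is_cluster are hypothetical.

```python
# Example subset only: SF4, FES and F3S are PDB codes for iron-sulfur
# clusters, assumed here to be on PyeMap's list of 66 clusters.
EXAMPLE_CLUSTER_NAMES = {"SF4", "FES", "F3S"}


def is_cluster(resname):
    """Check a residue name against the example cluster list."""
    # PDB residue names may carry padding whitespace, so normalize first.
    return resname.strip().upper() in EXAMPLE_CLUSTER_NAMES
```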

Redox-active metal ions

PyeMap automatically identifies a set of redox-active metal ions to include as residues in the graph.

Table 1 Available metal ions

Element    Charges
-------    ----------
Cu         +1, +2, +3
Fe         +2, +3
Mn         +2, +3
Co         +2, +3
Mo         0, +4, +6
Ni         +2
Cr         +3
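The contents of Table 1 can be expressed as a small lookup structure. This is a sketch for illustration: the dictionary values are taken from the table, while the names REDOX_ACTIVE_IONS and is_redox_active_ion are hypothetical and do not correspond to a documented PyeMap API.

```python
# Element symbol -> set of charges listed in Table 1.
REDOX_ACTIVE_IONS = {
    "Cu": {1, 2, 3},
    "Fe": {2, 3},
    "Mn": {2, 3},
    "Co": {2, 3},
    "Mo": {0, 4, 6},
    "Ni": {2},
    "Cr": {3},
}


def is_redox_active_ion(element, charge):
    """Return True if the element/charge pair appears in Table 1."""
    # capitalize() maps PDB-style upper-case symbols such as "CU" to "Cu".
    return charge in REDOX_ACTIVE_IONS.get(element.capitalize(), set())
```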

Visualization

Chemical structures of residues (not including user-specified residues) can be visualized using the residue_to_Image() and init_graph_to_Image() functions. SMILES strings and NGL Viewer selection strings are also accessible through the emap object. Note that SMILES strings are not available for clusters and user-specified residues.

Source

pyemap.custom_residues.find_conjugated_systems(...)

Finds conjugated systems within a BioPython residue object, and returns them as individual customized BioPython Residue objects.

pyemap.custom_residues.process_custom_residues(...)

Identifies and returns customized Bio.PDB.Residue objects corresponding to electron transfer active moieties.

pyemap.structures.cleanup_bonding(res_graph)

Connects nodes that should be connected to fix broken aromaticity.

pyemap.structures.remove_side_chains(res_graph)

Removes non-aromatic side chains on aromatic ET active moieties.