Saturday, February 10, 2018

take_elementary_step.py: exploring chemical space one elementary step at a time

I have written a program called take_elementary_step.py. It is basically an implementation of the idea described in this paper by Zimmerman, but using the atom connectivity matrix approach described in this paper by Woo Youn Kim.

Zimmerman defines an elementary reaction step as a chemical rearrangement "with no more than two connections breaking and two connections forming simultaneously, while maintaining the upper and lower limits for coordination number of each atom." Two structures that are related in this way are very likely to be connected by a single transition state.

The code generates all such states for a given molecule and then selects the "best ones" based on a crude energy function based on bond dissociation energies. The idea is then to optimize the molecules using a QM method to refine the selection and then locate the TS connecting the selected structures to the input structures.  The atom ordering should be preserved (haven't checked yet) so they are suitable for interpolation-based methods such as NEB or GSM.

The default is to lists all structures with energies no higher than 200 kJ/mol compared to the input structure and to use homolytic bond cleavage, i.e. a C-H bonds is broken into a C and H radical, rather than, say, C- and H+, but this can be controlled by the "heterolytic" keyword.

Using the heterolytic option will result in incorrectly charged fragments in some cases, but these should be weeded out by the energy criterion. For example, if you input acetylene the code will generate HC3- + HC3- instead of HC3- + HC3+, but I didn't feel like writing complicated code to fix this since these possibilities will no be presented to the user because they are too high in energy.

The figure at the bottom shows the results using 1,3-butadiene + ethene as input. For some reason, RDKit has trouble displaying H2 molecules so they have been removed and acetylene looks very weird (e.g. for comp64).  There are 88 structures in all, but that number could be reduced significantly be removing molecules with unsaturated atoms and/or three- and four-membered rings.

In his paper, Zimmerman shows 5 selected structures obtained using his implementation:

Taken from P.M. Zimmerman, J. Comp. Chem. 2013, 34, 1385. Copyright John Wiley & Sons. 

The first three also appear below as comp8, comp21, and comp54, while the last two are seemingly missing. However, they actually don't correspond to elementary steps since more than 2 bonds are broken or formed, but result from the subsequent QM optimization of comp7 and comp52 where additional bonds are formed between unsaturated C atoms.





This work is licensed under a Creative Commons Attribution 4.0

No comments: