Development of Accurate Coarse-grained Models for Molecular Systems


Atomically detailed, or all-atom (AA), molecular simulations have emerged as one of the most powerful theoretical tools for studying complex, condensed-phase processes of specific systems. However, these simulations are generally too computationally expensive to reach the length and time scales relevant to biological systems. Consequently, lower-resolution, coarse-grained (CG) models have gained much interest in recent years. In particular, the increased sophistication of AA models has motivated the development of “bottom-up” CG approaches, which parameterize the low-resolution model based on simulations of an AA model. Bottom-up CG models have the potential to predict properties of specific molecular systems, but only if they incorporate the correct physics of the underlying model.

Bottom-up Methods

Bottom-up models are often parameterized to reproduce structural properties of the underlying system. In this case, there exists a configuration-dependent free energy function, known as the many-body potential of mean force (PMF), which is the proper CG energy function for reproducing all structural distributions of the AA model at the CG level of resolution. In any practical case, the PMF is a high-dimensional function that is too complicated to simulate or even write down explicitly. Consequently, structure-based bottom-up CG methods attempt to approximate the PMF with a simple set of interaction functions, usually of a molecular-mechanics form.
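For reference, the many-body PMF is commonly written in the form below. The notation is an assumption introduced here for illustration: r denotes the AA configuration, R the CG configuration, M the mapping from AA to CG coordinates, u the AA potential energy function, and k_B T the thermal energy.

```latex
U^{0}(\mathbf{R}) = -k_B T \,\ln \int d\mathbf{r}\;
    \exp\!\left[-\frac{u(\mathbf{r})}{k_B T}\right]
    \delta\!\left(M(\mathbf{r}) - \mathbf{R}\right) + \text{const}
```

With this definition, a CG model governed by U^0 samples configurations with probability proportional to exp[-U^0(R)/k_B T], which is exactly the distribution obtained by mapping the AA ensemble onto the CG sites.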

Several methods have been proposed which implicitly approximate the PMF by iteratively refining the CG potential energy function to reproduce a set of low-order structural distribution functions. These methods have been used to model a wide range of condensed-phase systems. However, for molecules with complex intramolecular structure, it is not obvious that reproducing a set of low-order distribution functions will, in general, correspond to reproducing higher-order structure of the molecule, e.g., the propensity of its internal states.
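One widely used example of such an iterative scheme is iterative Boltzmann inversion, which corrects a tabulated pair potential using the mismatch between the CG and target radial distribution functions. The sketch below is only an illustration of a single update step, not a method taken from this page; the function names, the damping and floor parameters, and the toy RDF values are all assumptions.

```python
import math


def boltzmann_inversion(g_target, kT, floor=1e-8):
    """Initial guess U_0(r) = -kT ln g_target(r), one value per distance bin."""
    return [-kT * math.log(max(g, floor)) for g in g_target]


def ibi_update(potential, g_current, g_target, kT, damping=1.0, floor=1e-8):
    """One iterative Boltzmann inversion step on a tabulated pair potential.

    U_new(r) = U_old(r) + damping * kT * ln(g_current(r) / g_target(r))

    Bins where either RDF falls below `floor` are left unchanged, because
    the logarithm is ill-defined there.
    """
    updated = []
    for u, g_cur, g_tgt in zip(potential, g_current, g_target):
        if g_cur > floor and g_tgt > floor:
            u = u + damping * kT * math.log(g_cur / g_tgt)
        updated.append(u)
    return updated


# Toy data (hypothetical): four distance bins, kT in kJ/mol near 300 K.
kT = 2.494
g_target = [0.0, 0.5, 1.5, 1.0]   # RDF from the AA model, mapped to CG sites
g_cg = [0.0, 0.8, 1.2, 1.0]       # RDF from the current CG model

u0 = boltzmann_inversion(g_target, kT)
u1 = ibi_update(u0, g_cg, g_target, kT)
```

Where the CG model places too much density (second bin), the potential is raised; where it places too little (third bin), the potential is lowered. In practice the update is repeated, with a new CG simulation after each step, until the distributions agree.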

In contrast to the above methods, a different class of structure-based bottom-up approaches employs variational principles to systematically and explicitly approximate the PMF. The multiscale coarse-graining (MS-CG) method is a force-matching-based approach which minimizes, in a least-squares sense, the difference between the net force on each CG site given by the AA and CG models. The MS-CG method directly (i.e., not iteratively) calculates the “optimal” approximation to the PMF by projecting the many-body mean force field into the space of force fields spanned by the basis functions chosen for the calculation. This projection is rigorously equivalent to solving a generalized Yvon-Born-Green (g-YBG) equation which relates the force field parameters to structural correlation functions.
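The least-squares projection can be illustrated with a deliberately small toy problem. The sketch below is not the MS-CG implementation: it assumes a single CG coordinate, a hypothetical three-function basis, and noise-free reference forces, and it solves the resulting normal equations directly. All names are illustrative.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Work on an augmented copy so the inputs are not modified.
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("singular normal matrix: redundant basis functions")
        m[col], m[pivot] = m[pivot], m[col]
        for row in range(col + 1, n):
            factor = m[row][col] / m[col][col]
            for k in range(col, n + 1):
                m[row][k] -= factor * m[col][k]
    # Back substitution on the upper-triangular system.
    x = [0.0] * n
    for row in range(n - 1, -1, -1):
        tail = sum(m[row][k] * x[k] for k in range(row + 1, n))
        x[row] = (m[row][n] - tail) / m[row][row]
    return x


def force_match(basis_functions, positions, reference_forces):
    """Coefficients c minimizing sum_t (f_t - sum_k c_k phi_k(x_t))^2."""
    design = [[phi(x) for phi in basis_functions] for x in positions]
    n_basis = len(basis_functions)
    # Normal equations: (G^T G) c = G^T f
    gtg = [[sum(row[i] * row[j] for row in design) for j in range(n_basis)]
           for i in range(n_basis)]
    gtf = [sum(row[i] * f for row, f in zip(design, reference_forces))
           for i in range(n_basis)]
    return solve_linear_system(gtg, gtf)


# Toy data (hypothetical): forces generated from a known force field.
basis = [lambda x: 1.0, lambda x: x, lambda x: x ** 3]
true_coeffs = [0.5, -2.0, -0.3]
xs = [-1.5 + 0.1 * i for i in range(31)]
fs = [sum(c * phi(x) for c, phi in zip(true_coeffs, basis)) for x in xs]

coeffs = force_match(basis, xs, fs)
```

Because the model force is linear in the coefficients, the optimum follows from a single linear solve rather than from repeated CG simulations. With real AA data the sampled forces fluctuate about the mean force, and the least-squares fit averages over those fluctuations.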

In my graduate work, I investigated the fundamental limitations and approximations of the MS-CG and g-YBG methods. For more details about these methods, see my Graduate Publications page. For a review of CG modeling of biological systems, see Will Noid’s JCP Perspective:

Noid, W.G. “Perspective: Coarse-grained models for biomolecular systems” Journal of Chemical Physics (2013) 139, 090901.

CG Dynamics

Unfortunately, because CG models smooth out the features of the underlying potential energy function, their dynamical properties are inherently difficult to interpret. A uniform time rescaling factor is often applied in order to interpret the dynamical implications of CG simulation data. In general, however, the dynamical rescaling between the AA and CG models may be arbitrarily complex. This complication severely limits the potential utility of CG models for investigating specific processes beyond the current capabilities of AA models. Although several recent bottom-up methods have been proposed for parameterizing the equations of motion of the CG model in order to reproduce certain “local” dynamical properties of an underlying model, it is not yet clear whether this parameterization will ensure the reproduction of overarching kinetic features of the underlying system.

Consequently, we have approached the problem in a very different way. We have recently proposed a method for reweighting the dynamical paths sampled in a molecular simulation to be consistent with a set of external kinetic data. In this way, we can quantitatively investigate the relationship between CG and AA kinetic properties for particular model systems. In order to efficiently determine the reweightings of dynamical paths, we employ well-developed Markov state modeling techniques. The Markov state model resulting from the reweighting of paths implies an effective change in timescale for particular dynamical paths sampled in the CG trajectory, providing a non-uniform time rescaling between the AA and CG descriptions of dynamics. This rescaling information may be either (1) extrapolated to other, distinct systems or thermodynamic state points or (2) employed in a reparameterization of the CG model. (See the Postdoctoral Publications page for more details.)
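To make the idea of a non-uniform time rescaling concrete, the sketch below builds a two-state Markov state model from each of two discrete trajectories and compares the mean lifetime of each state. It does not implement the path-reweighting method described above; it only shows how state-dependent rescaling factors can appear when AA and CG kinetics are compared. The trajectories, the shared lag time, and all function names are hypothetical.

```python
def transition_matrix(traj, n_states, lag=1):
    """Row-normalized transition counts between frames separated by `lag`."""
    counts = [[0] * n_states for _ in range(n_states)]
    for a, b in zip(traj[:-lag], traj[lag:]):
        counts[a][b] += 1
    matrix = []
    for row in counts:
        total = sum(row)
        if total == 0:
            raise ValueError("a state has no outgoing transitions at this lag")
        matrix.append([c / total for c in row])
    return matrix


def mean_lifetimes(matrix, lag_time):
    """Mean dwell time of each state, lag_time / (1 - p_ii)."""
    lifetimes = []
    for i, row in enumerate(matrix):
        escape = 1.0 - row[i]
        lifetimes.append(lag_time / escape if escape > 0.0 else float("inf"))
    return lifetimes


# Toy state-assigned trajectories (hypothetical), one frame per lag time.
aa_traj = [0] * 8 + [1] * 4 + [0] * 8 + [1] * 4 + [0] * 8
cg_traj = ([0] * 2 + [1] * 2) * 4

aa_life = mean_lifetimes(transition_matrix(aa_traj, 2), lag_time=1.0)
cg_life = mean_lifetimes(transition_matrix(cg_traj, 2), lag_time=1.0)

# Per-state factor by which CG time would need to be stretched to match AA.
rescaling = [aa / cg for aa, cg in zip(aa_life, cg_life)]
```

In this toy example the two states require different factors, so no single uniform rescaling of the CG clock would recover the AA lifetimes of both states at once.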