5.6 Discussion
The Bayesian models have been shown to be less precise than the random forest trained on sequence features and pseudo-likelihood couplings. Even though the coupling prior, modelled as a Gaussian mixture, seems to reproduce the empirical distributions of high-evidence couplings very well, the current implementation has several possible drawbacks.
First of all, the Gaussian components are modelled with diagonal covariance matrices. Full covariance matrices could capture correlations between coupling dimensions that diagonal matrices cannot represent, and would probably require fewer components. However, using full covariance matrices would increase computational complexity because the inverse of each covariance matrix has to be computed.
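The trade-off between the two parameterisations can be made concrete with a minimal sketch. It is not the implementation used in this work: the function names, the toy dimension of 400 (assuming one 20 x 20 coupling matrix per residue pair) and the random test values are illustrative assumptions. The full-covariance version uses a Cholesky factorisation in place of an explicit inverse, which is the usual numerically stable choice but still scales cubically with the dimension, whereas the diagonal version scales linearly.

```python
import numpy as np


def log_density_diag(x, mean, variances):
    """Log density of a Gaussian with diagonal covariance (linear cost in d)."""
    d = x.shape[0]
    diff = x - mean
    return -0.5 * (d * np.log(2.0 * np.pi)
                   + np.sum(np.log(variances))
                   + np.sum(diff ** 2 / variances))


def log_density_full(x, mean, cov):
    """Log density of a Gaussian with full covariance (cubic cost in d)."""
    d = x.shape[0]
    diff = x - mean
    chol = np.linalg.cholesky(cov)        # cov = chol @ chol.T
    z = np.linalg.solve(chol, diff)       # z solves chol @ z = diff
    log_det = 2.0 * np.sum(np.log(np.diag(chol)))
    return -0.5 * (d * np.log(2.0 * np.pi) + log_det + z @ z)


def n_covariance_parameters(dim, full):
    """Number of free covariance parameters per mixture component."""
    return dim * (dim + 1) // 2 if full else dim


# Toy check: for a diagonal covariance both functions must agree.
rng = np.random.default_rng(0)
dim = 400  # assumption: 20 x 20 couplings for one residue pair
x = rng.normal(size=dim)
mean = np.zeros(dim)
variances = rng.uniform(0.5, 2.0, size=dim)

diag_value = log_density_diag(x, mean, variances)
full_value = log_density_full(x, mean, np.diag(variances))
```

Under the assumed dimension, a full covariance matrix has 80200 free parameters per component compared with 400 for a diagonal one, which illustrates why fewer components might suffice but also why each evaluation becomes more expensive.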
It is unlikely that using more data would improve the models. For one thing, the negative log-likelihood was monitored on a validation set during optimization and showed no sign of overfitting. Furthermore, the hyperparameter statistics are almost identical regardless of the training set size.
Only one assumption in the theoretical framework might be incorrect, and unfortunately it is not easy to verify: setting the off-diagonal block matrices of the Hessian to zero (see methods section 5.7.4). These off-diagonal blocks describe the interdependency between specific couplings in different pairs of columns. In our view, however, the entries of these off-diagonal blocks should be negligible.
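The effect of this assumption can be explored on a toy problem. The following sketch is not taken from the method described in this work: the matrix size, the block size and the strength of the off-diagonal entries are arbitrary assumptions, and the helper names are hypothetical. It builds a small symmetric positive definite matrix with weak off-diagonal blocks, inverts it once exactly and once block by block, and reports the relative error introduced by dropping the off-diagonal blocks.

```python
import numpy as np


def block_diagonal_part(hessian, block_size):
    """Copy of `hessian` with all off-diagonal blocks set to zero."""
    n = hessian.shape[0]
    approx = np.zeros(hessian.shape, dtype=float)
    for start in range(0, n, block_size):
        stop = start + block_size
        approx[start:stop, start:stop] = hessian[start:stop, start:stop]
    return approx


def invert_blockwise(hessian, block_size):
    """Inverse of the block-diagonal part, computed one block at a time."""
    n = hessian.shape[0]
    inverse = np.zeros(hessian.shape, dtype=float)
    for start in range(0, n, block_size):
        stop = start + block_size
        block = hessian[start:stop, start:stop]
        inverse[start:stop, start:stop] = np.linalg.inv(block)
    return inverse


# Toy Hessian (assumption): strong diagonal blocks, weak off-diagonal blocks.
rng = np.random.default_rng(0)
block_size, n_blocks = 4, 3
n = block_size * n_blocks
a = rng.normal(size=(n, n))
gram = a @ a.T                                   # symmetric, positive semi-definite
hessian = (block_diagonal_part(gram, block_size)
           + n * np.eye(n)
           + 0.05 * gram)                        # weak coupling between blocks

exact_inverse = np.linalg.inv(hessian)
approx_inverse = invert_blockwise(hessian, block_size)
relative_error = (np.linalg.norm(approx_inverse - exact_inverse)
                  / np.linalg.norm(exact_inverse))
```

Increasing the factor 0.05 in the toy matrix makes the off-diagonal blocks less negligible and lets one observe how quickly `relative_error` grows. Whether the real Hessian behaves like the weakly coupled case remains the open question raised above.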
Another important point is that the quality of the Gaussian approximation to the posterior distribution of couplings \(p(\w | \X , \v^*)\) depends on two factors:
- how well the posterior distribution of couplings is approximated by a Gaussian, and
- how close the mode of the posterior distribution of couplings lies to the mode of the integrand in equation (5.9).
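The first factor can be illustrated with a one-dimensional toy example. This is only a sketch under stated assumptions: the two log-integrands are hypothetical stand-ins and are unrelated to the actual posterior of the couplings. A Gaussian approximation around the mode (Laplace approximation) reproduces the integral of a Gaussian-shaped function exactly, but deviates for a skewed function.

```python
import numpy as np


def laplace_integral(log_f, mode, h=1e-3):
    """Laplace approximation to the integral of exp(log_f), expanded at `mode`."""
    # Second derivative of log_f at the mode by central finite differences.
    curvature = (log_f(mode + h) - 2.0 * log_f(mode) + log_f(mode - h)) / h ** 2
    return np.exp(log_f(mode)) * np.sqrt(2.0 * np.pi / -curvature)


def numeric_integral(log_f, lower, upper, n_points=200001):
    """Reference value from the trapezoidal rule on a dense grid."""
    x = np.linspace(lower, upper, n_points)
    y = np.exp(log_f(x))
    dx = x[1] - x[0]
    return np.sum(0.5 * (y[1:] + y[:-1])) * dx


def gaussian_log_f(x):
    """Gaussian-shaped toy integrand with mode 1 and variance 0.25."""
    return -0.5 * (x - 1.0) ** 2 / 0.25


def skewed_log_f(x):
    """Skewed toy integrand x^3 * exp(-x) on x > 0 with mode 3."""
    return 3.0 * np.log(x) - x


gauss_laplace = laplace_integral(gaussian_log_f, mode=1.0)
gauss_exact = numeric_integral(gaussian_log_f, -9.0, 11.0)

skewed_laplace = laplace_integral(skewed_log_f, mode=3.0)
skewed_exact = numeric_integral(skewed_log_f, 1e-9, 60.0)
```

For the Gaussian-shaped function the two values coincide up to numerical error, whereas for the skewed function the Laplace value underestimates the integral by a few percent. The second factor, the distance between the two modes, is not covered by this toy example.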
The second point can be addressed quite effectively by learning a simple isotropic Gaussian prior with the same framework that is used to infer the hyperparameters for the Gaussian mixture of the coupling prior. Since the new regularisation prior would be very close to the mode of the integrand in the marginal likelihood, the approximation in the second iteration should improve over that of the first iteration. The new regularisation prior would also be used to generate new MAP estimates for the couplings.
Furthermore, a proof of concept that the full information in the coupling matrices can be used to improve the precision of contact predictions was given by Golkov and colleagues [237]. They developed a convolutional neural network for the prediction of protein residue-residue contacts that uses only coupling matrices as input features. In their benchmark, the convolutional network predictor improved over Meta-PSICOV, a meta-predictor that combines several coevolution methods and sequence features.
References
237. Golkov, V., Skwark, M.J., Golkov, A., Dosovitskiy, A., Brox, T., Meiler, J., and Cremers, D. (2016). Protein contact prediction from amino acid co-evolution using convolutional networks for graph-valued images. In Advances in Neural Information Processing Systems 29, D.D. Lee, M. Sugiyama, U.V. Luxburg, I. Guyon, and R. Garnett, eds. (Curran Associates, Inc.), pp. 4222–4230.