The adoption of Machine Learning (ML) for building emulators for complex physical processes has seen an exponential rise in the recent years. While ML models are good function approximators, optimizing the hyper-parameters of the model to reach a global minimum is not trivial, and often needs human knowledge and expertise. In this light, automatic ML or autoML methods have gained large interest as they automate the process of network hyper-parameter tuning. In addition, Neural Architecture Search (NAS) has shown promising outcomes for improving model performance. While autoML methods have grown in popularity for image, text and other applications, their effectiveness for high-dimensional, complex scientific datasets remains to be investigated. In this work, a data driven emulator for turbulence closure terms in the context of Large Eddy Simulation (LES) models is trained using Artificial Neural Networks and an autoML framework based on Bayesian Optimization, incorporating priors to jointly optimize the hyper-parameters as well as conduct a full neural network architecture search to converge to a global minima, is proposed. Additionally the effect of using different network weight initialization and optimizers such as ADAM, SGDM and RMSProp, are explored. Weight and function space similarities during the optimization trajectory are investigated, and critical differences in the learning process evolution are noted and compared to theory. We observe ADAM optimizer and Glorot initialization consistently performs better, while RMSProp outperforms SGDM as the latter appears to have been stuck at a local optima. Therefore, this autoML BayesOpt framework provides a means to choose the best hyper-parameter settings for a given dataset.