Protein sequences follow a discrete alphabet rendering gradient-based techniques a poor choice for optimization-driven protein design. Contemporary approaches instead perform optimization in a continuous latent representation, but unfortunately the representation metric is generally a poor measure similarity between the represented proteins. This make (global) Bayesian optimization over such latent representations inefficient as commonly applied covariance functions are strongly dependent on the representation metric. Here we argue in favor of using the Jensen-Shannon divergence between the represented protein sequences to define a covariance function over the latent representation. Our exploratory experiments indicate that this kernel is worth further investigation.