Figure 1. (B) Conservation of 6 critical amino acid residues in the spike (S) protein. (C) Three candidate positively selected sites (marked with inverted triangles) in the receptor-binding domain (RBD) of spike protein S N, S V and SQ and the surrounding 10 amino acids.

Notably, we found that the nucleotide divergence at synonymous sites between SARS-CoV-2 and other viruses was much higher than previously anticipated.

Note that nonsynonymous sites are usually under stronger negative selection than synonymous sites, so calculating sequence differences without separating these two classes of sites may underestimate the extent of molecular divergence severalfold.
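To illustrate what separating the two classes of sites means in practice, the following is a minimal Python sketch that classifies codon differences between two aligned coding sequences as synonymous or nonsynonymous. It is an assumption-laden illustration, not the estimation procedure used in this study: the function name `count_codon_differences` is hypothetical, the standard genetic code is assumed, and only codons differing at a single position are classified. It returns raw counts, whereas dS and dN additionally normalize by the number of synonymous and nonsynonymous sites and correct for multiple substitutions.

```python
from itertools import product

# Standard genetic code, built in TCAG order (assumed here for illustration).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def count_codon_differences(seq_a, seq_b):
    """Return (synonymous, nonsynonymous, skipped) codon-difference counts.

    Both inputs must be aligned, in-frame coding sequences of equal length.
    Codons that differ at more than one position, or that contain gaps or
    ambiguous bases, are counted as skipped rather than classified.
    """
    if len(seq_a) != len(seq_b) or len(seq_a) % 3 != 0:
        raise ValueError("sequences must be aligned, in frame, and of equal length")

    syn = nonsyn = skipped = 0
    for i in range(0, len(seq_a), 3):
        codon_a = seq_a[i:i + 3].upper()
        codon_b = seq_b[i:i + 3].upper()
        if codon_a == codon_b:
            continue
        n_diff = sum(x != y for x, y in zip(codon_a, codon_b))
        if n_diff != 1 or codon_a not in CODON_TABLE or codon_b not in CODON_TABLE:
            skipped += 1
            continue
        if CODON_TABLE[codon_a] == CODON_TABLE[codon_b]:
            syn += 1      # same amino acid: synonymous difference
        else:
            nonsyn += 1   # different amino acid: nonsynonymous difference
    return syn, nonsyn, skipped
```

Pooling the two counts into a single per-nucleotide difference would average the slowly evolving nonsynonymous class with the faster synonymous class, which is why the combined figure can understate divergence.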

In particular, the spike gene (S) consistently exhibited larger dS values than other genes (Table 1). This pattern became clear when we calculated the dS value for each branch in Fig. In each branch, the dS of spike was 2.

This extremely elevated dS value of spike could be caused either by a high mutation rate or by natural selection that favors synonymous substitutions. Synonymous substitutions may serve as another layer of genetic regulation, guiding the efficiency of mRNA translation by changing codon usage [ 23 ].

If positive selection is the driving force for the higher synonymous substitution rate seen in spike, we expect the frequency of optimal codons (FOP) of spike to be different from that of other genes.

However, our codon usage bias analysis (Table S2) suggests the FOP of spike was only slightly higher than that of the genomic average 0.
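For readers unfamiliar with the statistic, FOP is commonly computed as the fraction of codons in a gene that belong to a predefined set of optimal codons, ignoring codons with no synonymous alternative. The Python sketch below follows that common definition; it is not necessarily the exact procedure behind Table S2. The function name `frequency_of_optimal_codons` is hypothetical, and the set of optimal codons must be supplied by the user, since it depends on the host or genome under study and is not given in this section.

```python
# Codons with no synonymous alternative in the standard code (Met, Trp)
# plus the three stop codons; these carry no codon-choice information.
NON_INFORMATIVE = frozenset({"ATG", "TGG", "TAA", "TAG", "TGA"})


def frequency_of_optimal_codons(cds, optimal_codons):
    """Return the fraction of informative codons in `cds` that are optimal.

    `cds` is an in-frame coding sequence; a trailing partial codon is ignored.
    `optimal_codons` is a user-supplied collection of codons treated as
    optimal (an assumption of this sketch, not a value from the study).
    """
    cds = cds.upper().replace("U", "T")
    optimal = {c.upper().replace("U", "T") for c in optimal_codons}
    usable_length = len(cds) - len(cds) % 3
    codons = [cds[i:i + 3] for i in range(0, usable_length, 3)]

    # Keep only unambiguous codons that have at least one synonymous alternative.
    informative = [
        c for c in codons
        if c not in NON_INFORMATIVE and set(c) <= set("ACGT")
    ]
    if not informative:
        raise ValueError("no informative codons found in sequence")
    return sum(c in optimal for c in informative) / len(informative)
```

Comparing the value obtained for one gene with the value obtained for the concatenated coding sequences of the whole genome gives the kind of gene-versus-genome contrast described above.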

Thus, we believe that the elevated synonymous substitution rate measured in spike is more likely caused by higher mutational rates; however, the underlying molecular mechanism remains unclear.

Although several ancient recombination events have been described in spike [ 29, 30 ], it also seems likely that the identical functional sites in SARS-CoV-2 and GD Pangolin-CoV may actually result from coincidental convergent evolution [ 18 ]. By assuming the synonymous substitution rate u of 1.

