Two aspects:
- Does it add jitter/latency?
- If so, is it large enough, compared to other factors such as actual external network transit jitter/latency, to materially change overall performance?
On the first: I don’t think anything noticeable is added on modern systems. Despite what the phrasing “it has to go through two network stacks” might suggest, there are not really two full network stacks involved. The details depend on the virtualization implementation, but in many cases the virtual network card in the guest does not fully emulate an actual hardware NIC; it is aware of its own virtualized nature (just like, e.g., the block storage or memory management). That is, packets do not travel up one entire network stack, back down that same stack, and then up a second one. Rather, at least the lower parts of the stack live entirely on the host, and only the upper parts run in the guest. Very roughly, this resembles the separation that also exists on a bare-metal machine, where a packet is first handled within the kernel before being passed to the user-space time daemon.
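You can usually see this in a Linux guest: `ethtool -i eth0` reports the NIC driver, and the name tends to reveal whether the device is paravirtualized (guest-aware) or a full emulation of real hardware. The driver names below are the common Linux ones; the classification helper itself is just my own illustrative sketch, not any standard tool:

```python
# Sketch: classify a Linux NIC driver name (as reported by `ethtool -i <iface>`)
# as paravirtualized or emulated. The driver names are the well-known Linux
# ones; the helper is illustrative only.

PARAVIRT_DRIVERS = {"virtio_net", "vmxnet3", "hv_netvsc", "xen-netfront"}
EMULATED_DRIVERS = {"e1000", "e1000e", "rtl8139", "pcnet32"}

def nic_kind(driver: str) -> str:
    if driver in PARAVIRT_DRIVERS:
        return "paravirtualized"  # guest-aware; lower stack layers live on the host
    if driver in EMULATED_DRIVERS:
        return "emulated"         # full software emulation of a real hardware NIC
    return "unknown"

print(nic_kind("virtio_net"))  # KVM/QEMU paravirtualized NIC
print(nic_kind("e1000"))       # emulated Intel gigabit NIC
```

If you see `virtio_net`, `vmxnet3`, or similar, packets are handed to the host through a shared-ring interface rather than through an emulated hardware register model.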
What is more of a challenge is the resource-sharing nature of virtualization, i.e., whether the guest running the time daemon gets the resources it needs in a timely fashion. On a well-managed system, that shouldn’t be a problem; on an over-committed system, it may become one.
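One way to check for that on a Linux guest is “steal” time: CPU time during which the hypervisor ran other guests while this guest wanted to run. It is exposed as the 8th value on the `cpu` line of `/proc/stat`. A minimal sketch of computing it between two samples (the sample lines here are made up for illustration):

```python
# Sketch: compute CPU "steal" percentage between two /proc/stat samples.
# High steal is a direct indicator of an over-committed hypervisor, which
# is exactly the case where a guest time daemon can be scheduled late.

def cpu_fields(stat_line: str) -> list:
    # Line format: "cpu  user nice system idle iowait irq softirq steal ..."
    return [int(x) for x in stat_line.split()[1:]]

def steal_percent(sample1: str, sample2: str) -> float:
    f1, f2 = cpu_fields(sample1), cpu_fields(sample2)
    total = sum(f2) - sum(f1)   # total jiffies elapsed (simplified)
    steal = f2[7] - f1[7]       # steal is the 8th field after "cpu"
    return 100.0 * steal / total

# Two made-up samples, e.g. taken a second apart:
s1 = "cpu  1000 0 200 8000 50 10 10 30 0 0"
s2 = "cpu  1100 0 220 8500 55 11 11 80 0 0"
print(f"{steal_percent(s1, s2):.1f}% steal")
```

A steal percentage that is consistently non-trivial is a good hint that the timing problem is in the hypervisor’s scheduling, not in the network path.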
On the second item: is any incurred jitter/delay relevant in the grand scheme of things? In many cases, probably not. E.g., “pool standards” to me implies potentially some network distance between client and server. In those cases, unless the virtualization is quite poor (e.g., heavy over-commitment), the jitter/delay incurred on the external network far outweighs anything going on between a virtualized server and its host.
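To put rough numbers on that comparison: NTP estimates clock offset and round-trip delay from the four timestamps of an exchange (the standard NTPv4 calculation from RFC 5905) and assumes the path is symmetric, so unmodeled asymmetric delay leaks into the offset estimate at up to half its size. A worked sketch with made-up timestamps:

```python
# Sketch: the standard NTPv4 offset/delay calculation (RFC 5905) applied to
# illustrative, made-up timestamps (in seconds).
# t1: client transmit, t2: server receive, t3: server transmit, t4: client receive

def ntp_offset_delay(t1, t2, t3, t4):
    offset = ((t2 - t1) + (t3 - t4)) / 2   # estimated clock offset
    delay = (t4 - t1) - (t3 - t2)          # estimated round-trip network delay
    return offset, delay

# Made-up scenario: server clock 2 ms ahead, 10 ms one-way delay each
# direction, 1 ms server processing time:
offset, delay = ntp_offset_delay(0.000, 0.012, 0.013, 0.021)
print(f"offset = {offset * 1e3:.1f} ms, delay = {delay * 1e3:.1f} ms")
# Against a 20 ms round trip, tens of microseconds of host-side jitter is
# noise; against a sub-millisecond LAN round trip, it no longer is.
```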
On the other hand, if one needs to synchronize to high accuracy on a local, high-speed, unloaded network, it may become more relevant. But then, if the requirements are really that high, maybe NTP isn’t even the right protocol anymore. Or at least a bare-metal but general-purpose server may no longer be good enough, and one needs a specialized hardware device with an external reference and a high-accuracy oscillator.
Now, I don’t have any specific numbers or statistics, but I think the references mentioned previously will contain, or will point to, such data. My description above is based on my takeaway from sources like those.