I am observing a low log_send_rate in my distributed AG setup.
I understand that an AG transmits the log stream, so I assumed throughput should depend only on OS resources (network, I/O) and not on the data itself — but I was wondering whether the data being transmitted also plays a part?
Basic metrics for consideration:
- SQL Server 2019 CU16
- Source: 1.5 TB RAM, 48 CPUs <> Destination: 128 GB RAM, 48 CPUs – does the difference in memory play any part here?
- Both servers are in the same DC; ping latency is <1 ms. The destination server is a VM.
- A ROBOCOPY test shows a file transfer rate of ~100 MB/s.
- When high transaction log generation activity (like index maintenance or creation) is sent over to the other replica, it transfers at a maximum of ~20 MB/s, which is not expected. This is when the log_send_queue piles up.
- REDO rate on other side is good, no REDO queue piling up there.
On the source AG I don’t see anything for the ‘Bytes Sent to Transport/Sec’ counter, so I cannot determine whether that is the bottleneck or not.
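In case it helps, the same send/redo figures can be sampled from the DMVs instead of perfmon. A minimal sketch (note the units: the queue columns are in KB, the rate columns in KB/s):

```sql
-- Sample log send / redo throughput per database replica.
-- log_send_queue_size and redo_queue_size are in KB;
-- log_send_rate and redo_rate are in KB/s.
SELECT ar.replica_server_name,
       DB_NAME(drs.database_id) AS database_name,
       drs.log_send_queue_size,
       drs.log_send_rate,
       drs.redo_queue_size,
       drs.redo_rate
FROM sys.dm_hadr_database_replica_states AS drs
JOIN sys.availability_replicas AS ar
  ON ar.replica_id = drs.replica_id;
```

Sampling this a few times during the index maintenance window gives you the same 20 MB/s picture without depending on the perfmon counter being populated.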
Please suggest if I missed anything that I should have included.
How to solve:
A great resource for digging into this is Sean Gallardy’s post: Network Throughput Hysteria
The first thing would be to make sure that you are comparing equivalent things. From the post:
I almost never see anyone test their network the way that SQL Server uses it for AG traffic, which is as I stated before, a thread per database replica.
Sean recommends using ntttcp to measure single-thread, single-core network throughput. Doing that will give you a better baseline to compare the 20 MB/s to.
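A single-threaded ntttcp run might look like the sketch below (the IP address and the 30-second duration are placeholders; check `ntttcp.exe /?` for the exact flags in your version):

```shell
:: On the secondary (receiver): one thread, pinned to core 0,
:: listening on the receiver's IP
ntttcp.exe -r -m 1,0,10.0.0.2 -t 30

:: On the primary (sender): same thread/core mapping,
:: pointed at the receiver's IP
ntttcp.exe -s -m 1,0,10.0.0.2 -t 30
```

The `-m threads,core,ip` mapping is the key part: one thread on one core mirrors how a single database replica's log transport actually uses the network.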
If there is still a really large gap that needs explaining, you may need to delve deeper into exactly where in the process the latency is occurring. Here’s an excellent article on doing just that from Microsoft support:
As you can see from the diagram there, there are a lot of steps in the process of transmitting and hardening log blocks to a secondary. The slowdown could be anywhere in there. There’s a link to a free tool at the end of that blog post that will analyze the Extended Event traces for you FYI.
As far as the data you provided:
The difference in memory on the destination is not ideal from a failover standpoint (can your workload run effectively with 1/12 of the RAM?), but since you are not seeing a high REDO queue, I’d be surprised if it’s contributing to the send queue pile ups you’re seeing.
It’s good that the replicas are in the same DC – that makes it less likely that overall network latency is to blame (you’re not trying to replicate to the cloud, or halfway across the world).
And again, the ROBOCOPY test might not be a good comparison.
Some additional factors to consider for a low SQL Server Availability Group log send rate:
This article gives good description and troubleshooting steps for a number of conditions that can result in Availability Group flow control and lower-than-expected log send rates.
If the Availability Group is synchronous, compression before send is disabled by default (save a little time and a little CPU, use a little more bandwidth). If the Availability Group is asynchronous, compression before send is enabled by default (take a little more time, use a little more CPU, in order to reduce bandwidth needs).
For a given workload, the tradeoff might not work out favorably. Trace flags are available to override the default state of compression before send for both synchronous and asynchronous Availability Groups.
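The commonly cited trace flags here are 1462 (disable log stream compression for asynchronous AGs) and 9592 (enable it for synchronous AGs) — verify both against the trace flag documentation for your build before enabling either. A sketch of flipping them instance-wide:

```sql
-- Disable log stream compression for asynchronous AGs
-- (trades bandwidth for lower send latency and CPU)
DBCC TRACEON (1462, -1);

-- Or the opposite: enable compression for synchronous AGs
-- DBCC TRACEON (9592, -1);
```

To make either survive a restart, add it as a `-T1462` / `-T9592` startup parameter rather than relying on `DBCC TRACEON` alone.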
For SQL Server VMs, I always recommend boosting the TCP receive buffers for the network adapters. If this is a VMware VM, I strongly recommend the vmxnet3 network adapter. For each vmxnet3 adapter in a VM, the net difference in vRAM use within the VM is less than 18 MB when the parameters "Small Rx Buffers" and "Rx Ring #1 Size" are increased from default values to maximum values. (These are the parameters to change if the "Jumbo Packet" parameter has the value "Standard 1500" or anything close to 1500, like 1512. If "Jumbo Packet" is 8000, you have to find the jumbo-packet TCP receive parameters to modify instead.) In the case of an Availability Group, ensuring adequate TCP receive resources on the secondary reduces transmission slowdowns due to packet loss along the way.
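In a Windows guest, those vmxnet3 settings can be inspected and raised from PowerShell. A sketch, assuming the adapter is named "Ethernet0" and using the maximum values commonly documented for vmxnet3 (confirm both against your driver version):

```shell
# Inspect the current vmxnet3 receive-buffer settings
Get-NetAdapterAdvancedProperty -Name "Ethernet0" |
  Where-Object DisplayName -In "Small Rx Buffers", "Rx Ring #1 Size"

# Raise them toward their maximums (adapter name and values
# are assumptions; verify for your environment)
Set-NetAdapterAdvancedProperty -Name "Ethernet0" `
  -DisplayName "Small Rx Buffers" -DisplayValue 8192
Set-NetAdapterAdvancedProperty -Name "Ethernet0" `
  -DisplayName "Rx Ring #1 Size" -DisplayValue 4096
```

Changing these briefly resets the adapter, so do it in a maintenance window on the secondary.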
Make sure that the power plan is High Performance on both the sending and receiving side of the Availability Group. For the VM secondary target, you may need to verify this at both the VM and host level. For older processors, if fewer than half the cores on a socket are busy, all cores on the socket could be slowed down (rather than power-tuning individual cores, as in later processors). And when cores are slowed down, other components such as network adapters can be slowed down as well (even memory access times can be slowed with some power plan implementations).
Finally, although the log send rate is lower than desired, for that workload it might be at the peak. Many sites avoid large index rebuilds as much as possible in Availability Groups due to the stress, instead focusing on index reorgs.
Thanks a lot @sqL_handLe and @Josh Darnell for your comments on this.
But the actual cause in this particular case was a sector-size mismatch between the source (512 bytes) and the destination (4096 bytes).
While checking which stage of the AG was slowing down the process using the aglatency-report-tool (because for a distributed AG I can’t generate the latency report from SSMS), I figured out it was at the destination and not the source!
I checked the error log on the destination and it was filled with misaligned I/O errors. (I know I should have looked at this earlier 😛)
"There have been 43983616 misaligned log IOs which required falling back to synchronous IO. The current IO is on file F:\Log\mydb_log.ldf."
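You can confirm the mismatch from an elevated command prompt on each side by comparing the logical and physical sector sizes of the log volume (drive letter is the one from the error message):

```shell
:: Compare "Bytes Per Sector" (logical) with
:: "Bytes Per Physical Sector" on the log volume
fsutil fsinfo ntfsinfo F:
```

If the primary reports 512 where the secondary reports 4096 (or vice versa), you are in exactly the scenario the error message describes.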
According to this article, this could mean:
What Does This Message Mean?
SQL Server storage engine logic detects the disk Sector Size and will
align the Transaction Log Files metadata and internal boundaries to
match the Sector Size (either 512 or 4096 bytes). The error message
9012 is generated when SQL Server has detected that Log Entries have
been written assuming a Sector Size different to the Sector Size found
on the current SQL Server instance. This can happen in scenarios such
as the below:
Log Shipping from a Production server with one disk sector size to a DR server with a larger sector size
AlwaysOn or Database Mirroring from a server with one disk sector size to a DR server with a larger sector size
When adding a 4K Native SSD card to hold the Transaction Log on a server where the Transaction Log was created on a conventional SAN disk with a 512-byte sector size
To ensure consistency SQL Server may switch from performing
asynchronous IO to synchronous IO. This can have an adverse
performance impact if technologies like synchronous AlwaysOn or
Mirroring are used.
In this scenario it is possible that log writes on an AlwaysOn replica
or DBM secondary are switched to Synchronous IO. This could lead to
higher than expected replication delays between the Primary and
So I took some help from articles (MS tech community & KB3009974) to reach the conclusion of adding TF 1800 as a startup parameter on the source. After restarting the SQL services, log_send_rate boosted up to 200 MB/s.
Of course, the points you mentioned were very helpful in reaching the root cause of this issue, and I thank you for that!
Thank you 🙂