Looking for visualization to understand how an allocation order scan results in missing/duplicate rows?


In READ UNCOMMITTED isolation, the SQL Server engine may decide to use either an index-order scan or an allocation-order scan.

An index-order scan can result in missing/duplicate rows for the reasons described in this answer: Are missed rows and duplicate rows the symptoms due to allocation order scan or due to the no lock and page splits? The linked answer contains a diagram that visualizes the problem.

An allocation-order scan can also result in missing/duplicate rows, and as mentioned in the answer above, this is because of page splits. I am looking for a visualization of this.


"In read uncommitted isolation mode, SQL server engine may decide to use the index-order scan or index order allocation scan."

There is no such thing as an "index order allocation scan". There are index-order scans and allocation-order scans.

"I am looking for visualization of this."

I’m not going to draw a picture, but imagine an index with pages like this:

page   keys      next page
 500   000-099   300
 300   100-199   700
 700   200-299   (none)

The page numbers are their physical position in the file. So page 500 is the 500th 8KB page in the physical file.

Notice physical page IDs do not match index key order.

For example, physical page 500 contains keys (000-099), which are lower than those on page 300 (100-199). This is perfectly normal and allowed.

In this (very) simplified example, a single IAM page covers this allocation range. The IAM page tells us that pages 300, 500, and 700 belong to this index (and that’s all it tells us).
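As a rough sketch (plain Python, not SQL Server internals; pages are modeled as a dictionary mapping physical page number to its next-page pointer), the difference between the two traversal orders over the pages above looks like this:

```python
# Toy model of the three pages from the table above.
# Index order follows next-page pointers; allocation order follows
# ascending physical page numbers (what the IAM effectively gives us).

next_page = {500: 300, 300: 700, 700: None}
first_in_key_order = 500          # the page holding the lowest keys (000-099)

# Index-order scan: start at the logically first page, follow pointers.
index_order = []
page = first_in_key_order
while page is not None:
    index_order.append(page)
    page = next_page[page]
print(index_order)                # [500, 300, 700] -- key order 000-299

# Allocation-order scan: the IAM lists pages {300, 500, 700};
# visit them in ascending physical page number.
alloc_order = sorted(next_page)
print(alloc_order)                # [300, 500, 700] -- physical order, not key order
```

Both scans read the same three pages here, so with no concurrent changes the results are identical; the trouble only starts when pages split mid-scan.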

Now an allocation-order scan of the index starts at page 300 (the lowest physical page number that belongs to this index according to the IAM).

Meanwhile, another process causes page 500 to split:

page   keys      next page
 500   000-049   100
 300   100-199   700
 700   200-299   (none)
 100   050-099   300

Half the rows have moved from page 500 to (new) page 100. Since we are using an allocation-order scan, we do not follow the next-page pointers. Instead, we continue with the next higher-numbered page after page 300, which is 500, then 700.

We never encounter the rows on page 100. These are ‘missing’ due to the page split and allocation-order scan.
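The missing-row scenario can be sketched the same way (again a toy model, not SQL Server code; page contents are just lists of integer keys):

```python
# Before the split: IAM says pages 300, 500, 700 belong to the index.
pages = {
    500: list(range(0, 100)),     # keys 000-099
    300: list(range(100, 200)),   # keys 100-199
    700: list(range(200, 300)),   # keys 200-299
}

seen = []

# Allocation-order scan starts at the lowest physical page number: 300.
seen += pages[300]

# Meanwhile another process splits page 500: half its keys move to a
# NEW page 100 -- a page number BEHIND the scan's current position.
pages[100] = pages[500][50:]      # keys 050-099 move to new page 100
pages[500] = pages[500][:50]      # keys 000-049 stay on page 500

# The scan continues with page numbers above 300: 500, then 700.
# Page 100 is never visited.
for page_id in sorted(p for p in pages if p > 300):
    seen += pages[page_id]

missing = sorted(set(range(300)) - set(seen))
print(len(missing))               # 50 -- keys 050-099 are missing
```

The scan position only moves forward through page numbers, so rows relocated to a lower-numbered page simply fall out of the result.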

If the split of page 500 had resulted in a new page numbered higher than 300, we would have encountered it during the allocation-order scan. No missing rows, and no duplicates.

As a final example, imagine we have read page 300 and are now processing page 500. At this point, another process causes page 300 to split, creating new page 600. We will encounter the rows that moved to page 600 twice: once when we read page 300 before the split, and again when the scan reaches page 600 after the split.
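The duplicate-row scenario fits the same toy model (hypothetical Python sketch; the split amounts are illustrative):

```python
# Original pages; scan has already read page 300 and is now on page 500.
pages = {
    500: list(range(0, 100)),     # keys 000-099
    300: list(range(100, 200)),   # keys 100-199
    700: list(range(200, 300)),   # keys 200-299
}

seen = []
seen += pages[300]                # page 300 read before the split

# While the scan processes page 500, page 300 splits: half its keys
# move to NEW page 600 -- a page number AHEAD of the scan position.
pages[600] = pages[300][50:]      # keys 150-199 move to new page 600
pages[300] = pages[300][:50]      # keys 100-149 stay on page 300

# Scan continues through page numbers above 300: 500, 600, 700.
for page_id in sorted(p for p in pages if p > 300):
    seen += pages[page_id]

duplicates = sorted(k for k in set(seen) if seen.count(k) > 1)
print(len(duplicates))            # 50 -- keys 150-199 appear twice
```

Rows moved to a higher-numbered page than the scan position get read a second time, mirroring how rows moved to a lower-numbered page get skipped.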


