David Fraticelli raised an interesting bug that became apparent in SQL 4 CDS. Many thanks to David for all his help in tracking down what was happening.
I’ve mentioned before the many ways that paging in FetchXML can give rise to unexpected results. One key thing to ensure you get the correct results is to ensure you have a consistent sort order applied.
For aggregate queries it’s normally enough to add a sort on each of the grouping columns. Each row will always have a unique value for these columns, so these sorts will make sure the server can know exactly where each page starts and ends.
With David’s original SQL query:
SELECT incidentid, knowledgearticleid, COUNT(*) FROM knowledgearticleincident GROUP BY incidentid, knowledgearticleid HAVING COUNT(*) > 1
that gets translated to the FetchXML:
<fetch xmlns:generator='MarkMpn.SQL4CDS' aggregate='true'> <entity name='knowledgearticleincident'> <attribute name='incidentid' alias='incidentid' groupby='true' /> <attribute name='knowledgearticleid' alias='knowledgearticleid' groupby='true' /> <attribute name='knowledgearticleincidentid' alias='count' aggregate='count' /> <order alias='incidentid' /> <order alias='knowledgearticleid' /> </entity> </fetch>
which looks fine. The data is being grouped by
knowledgearticleid and the results are sorted by both those fields too.
However, ordering on a lookup field (e.g.
incidentid) doesn’t sort by the guid valid itself, but by the name of the corresponding record. This becomes important when we try to move to the second page of results.
Only 5,000 results are returned at a time, along with a paging cookie. This encodes where the last page finished, so that the server can efficiently find where the second page should start. However, if we take a look at the paging cookie that we get:
<cookie page="1"> <incidentidname lastnull="1" firstnull="1" /> <knowledgearticleidname lastnull="1" firstnull="1" /> </cookie>
we can see that this will start the second page based on the names of the associated case and knowledge base article, not the unique identifiers that we’re grouping by.
If we have a lot of cases with the same name, there’s no definite marker to say where the next page should start from. We might get records in the second page that we’ve already seen in the first page, and some records might be skipped altogether.
In this example I’ve got a whole page full of 5,000 different
knowledgearticleid combinations, but the names of the associated cases & knowledge base articles are all null. When I request the next page of these results the server is going to skip 5,000 of these no-name records, but there’s no guarantee that it’s going to be the same 5,000 I’ve already received in the first page. If it happens to skip some different records I could end up never seeing those records in the results, while I could see some of the same records from the first page again in their place.
In a non-aggregate query this isn’t a problem; we can add another sort order using the primary key (
incidentknowledgearticleid in this case). This can act as a tie-breaker in these situations and provides a definite starting point for the next page.
In an aggregate query though, there’s no way of doing this without fundamentally changing the grouping of the query. In this case the only alternative seems to be to detect that we’ve hit this situation and fall back on extracting the entire data set and performing the aggregation ourselves. This is less efficient but at least gives accurate results.
In summary, if your FetchXML query:
groupby="true"on a lookup column, and
- Has results split over multiple pages
there doesn’t appear to be a way to guarantee correct results.
An update to SQL 4 CDS will be coming out shortly that identifies and handles these situations.
Update 2022-01-18: This is now resolved in SQL 4 CDS 5.4.1