When you use top to limit the number of records, you don’t have the option to move on to subsequent pages. You can’t use this to increase the number of records beyond 5,000 – you can only use a number between 1 and 5,000.
In contrast, count and page allow you to move through multiple pages of results and control the size of each page:
So when should you use top and when should you use count and page?
If you are only ever interested in the first n records (e.g. you just want the first record that matches your query), use top as you have no need to retrieve multiple result pages.
On the other hand, if you do need to retrieve all the results (or at least, more than “just” the first 5,000), use page to indicate which number page you want. In addition you can optionally use count to control the size of each page – you might set this after some trial-and-error testing of the performance of your query.
In SQL terms, top is equivalent to the SELECT TOP (n) clause. count and page are similar to the OFFSET ... FETCH ... clause.
If you’re using any hierarchies in your data, you’ve probably noticed the icon in your main grid views that you can click on to show the hierarchy view when records have a parent or children. It’s simple to work out if a record has a parent – just check if the parent lookup attribute contains data. But have you ever stopped to think about how it works out if a record has got any children?
It would be horribly inefficient to execute a separate query for each row to get the number of children. Instead, it will use a FetchXML feature called rowaggregate.
You don’t have any control within your query of what is meant by “children” in this context. This is defined by the hierarchy configuration of the entity being queried.
You can only apply the rowaggregate option to an attribute that is the primary key in a hierarchical relationship (the “Related (One)” side in the screenshot above). You’ll receive an error if you try to apply it to any other attribute, e.g. if you try to apply it on the telephone1 attribute:
Entity: Account 'rowaggregate' is not supported on attribute, telephone1. Hierarchy relationship accountid <- parentaccountid.
You can only use rowaggregate on the top-level entity in your query. You’ll receive an unhelpful error An unexpected error occurred if you use it within a link-entity.
I’ve just released SQL 4 CDS 1.0.9. Please update it when XrmToolBox prompts you to take advantage of these latest features:
I’m very pleased to be able to release SQL 4 CDS as open source, thanks to the support of Data8. You can now grab all the code on GitHub, submit your bug reports or suggestions as issues or even open a pull request with your own changes!
public string ConvertSqlToFetchXml(IOrganizationService org, string selectStatement)
var metadataCache = new AttributeMetadataCache(org);
var converter = new Sql2FetchXml(metadataCache, true);
var queries = converter.Convert(selectStatement);
// queries could contain many queries, including SELECT, INSERT, UPDATE or DELETE
// We'll assume it's just a single SELECT statement for now
var select = (SelectQuery) queries;
// The FetchXML is returned as an object model, convert it to a string using XmlSerializer
using (var writer = new StringWriter())
var serializer = new XmlSerializer(typeof(FetchType));
If you get an error when opening SQL 4 CDS:
An error occured when trying to display this tool: Method not found: 'McTools.Xrm.Connection.ConnectionDetail ConnectionUpdatedEventArgs.get_ConnectionDetail()'.
after installing the latest version of XrmToolBox, please update SQL 4 CDS to 1.0.9 as well to fix it.
Cancel running queries
If you need to stop a query while it’s running, just click the 🟥 button in the toolbar to stop it.
Improved results display
The grid view showing the results of a query is improved with:
total record count at the bottom
record numbers on the left
right-click options to copy data with or without headers
click on lookup guid values to open the related record or quickly create a SELECT query to get the details of the record
columns in a SELECT * query are sorted by name to make it easier to find what you’re looking for
option to show lookup values as the name of the related record instead of the GUID
option to show times as your local time zone instead of UTC
Improved query support
You will now get the expected results when running queries that combine * and a field name in the SELECT clause, e.g. SELECT name, * FROM account, and when ordering a SELECT * query, e.g. SELECT * FROM account ORDER BY name
By default, SQL treats double quotes as enclosing the name of a table or column, and single quotes for strings, so you can’t write:
WHERE telephone1 = "01513554555"
as the account entity doesn’t contain a field called 01513554555
You can now switch this behaviour so double quotes and single quotes are both used to enclose strings by unticking the new “Quoted Identifiers” option in the settings window.
FROM invoice AS i
invoicedetail AS id
ON i.invoiceid = id.invoiceid
product AS p
ON id.productid = p.productid
WHERE p.productnumber = 'A'
AND p.productnumber = 'B';
This is a common mistake I see, and one you can make easily if you try to build this query in Advanced Find. When the query is run, it tries to find invoices that have an invoice item that has a product that has the product number “A” and “B”. Since it’s impossible for a product to have two different product numbers in the same field, you’ll never get any results for this query.
The key change we need to make is to have two separate links to the invoicedetail entity, one filtered for product A and the other for product B:
FROM invoice AS i
invoicedetail AS id_a
ON i.invoiceid = id_a.invoiceid
product AS p_a
ON id_a.productid = p_a.productid and p_a.productnumber = 'A'
invoicedetail AS id_b
ON i.invoiceid = id_b.invoiceid
product AS p_b
ON id_b.productid = p_b.productid and p_b.productnumber = 'B';
With this change the query will now find invoices that have an invoice item that has a product that has the product number “A”, and also an invoice item that has a product that has the product number “B”. It’s more long-winded to write but gets us to the right answer.
Unfortunately it’s not possible to build this query in Advanced Find, so if you need to use a query like this in your app you’ll need to get the query ready in FetchXML Builder and save it from there as a view or marketing list (but see the workaround at the end).
Combining with “Not In” Queries
Taking another example:
Visits that have a page view of the checkout page BUT NOT the “thank you” page?
If you’re using ClickDimensions you get a lot of powerful web tracking information recorded in custom entities. Two of these are Page View, which records an individual web page that someone has viewed, and Visit, which represents a session that can cover multiple pages. Each Page View is linked to a Visit. We can find abandoned shopping carts by querying these to find sessions which include a visit to the shopping cart page but not to the page that it shown after a purchase is complete. This makes use of the “not in” pattern of using a left outer join and null query.
FROM cdi_visit AS v
cdi_pageview AS checkout
ON v.cdi_visitid = checkout.cdi_visitid
AND checkout.cdi_uri = 'https://example.com/checkout'
LEFT OUTER JOIN
cdi_pageview AS finished
ON v.cdi_visitid = finished.cdi_visitid
AND finished.cdi_uri = 'https://example.com/checkout/thankyou'
WHERE finished.cdi_pageviewid IS NULL;
“All” / “Only” Queries
There’s also a related type of query that checks if something is true for all related entities, e.g.:
Accounts that only have contacts with an info@ email address?
Users that have sales on all their accounts this month?
It’s actually possible to re-phrase these queries into the same pattern as above of asking two separate questions of the link entity. For example:
Accounts that only have contacts with an info@ email address?
You could re-write this as:
Accounts that have a contact with an info@ email address and don’t have any contacts that don’t have an info@ email address?
FROM account AS a
contact AS info
ON a.accountid = info.parentcustomerid
AND info.emailaddress1 LIKE 'info@%'
LEFT OUTER JOIN
contact AS noninfo
ON a.accountid = noninfo.parentcustomerid
AND noninfo.emailaddress1 NOT LIKE 'info@%'
WHERE noninfo.contactid IS NULL;
I mentioned earlier that you couldn’t build these queries in Advanced Find, and that’s mostly true. If you try to create one of these queries exactly as I’ve shown you’ll get an error when you try to add the second instance of the related entity:
The relationship you are adding already exists in the query.
But with a bit of creativity we can switch the query around to something that Advanced Find can handle, although as a database person it’s not as “nice”.
Instead of having our main entity listed only once, we can join back to it from the related entity. Instead of joining from invoice to invoicedetail twice, we join from invoice to invoicedetail, back to invoice and on to invoicedetail again:
This works fine for this case, but you may not get the results you want if you try this in combination with a “not in” type query. That would also apply to all the other link entities you chain off it, and could have the effect of changing an “in” query to a “not in”.
Because the dategrouping attribute in FetchXML supports a few more options than the standard T-SQL DATEPART function, I’ve extended it in SQL 4 CDS to allow some additional values as the first parameter. You can also use fiscalperiod and fiscalyear to group your data by your fiscal periods instead of calendar periods, e.g.:
SELECT datepart(fiscalyear, createdon) AS year,
datepart(fiscalmonth, createdon) AS month,
count(*) AS leads
GROUP BY datepart(fiscalyear, createdon), datepart(fiscalmonth, createdon)
ORDER BY 1, 2
FROM contact AS c
account AS a
ON c.parentcustomerid = a.accountid
ORDER BY a.name, c.firstname
If you use an order that isn’t supported by FetchXML you’ll get the error Order already applied to later link-entity
UPDATE & DELETE Batch Sizes
You know you can write more than just SELECT statements in SQL 4 CDS? You can now control the number of records that are processed for UPDATE and DELETE in each batch, which can help avoid timeouts if each one takes a long time because of plugins etc. To control the batch size, click on the Settings button in the toolbar:
As well as these new features there are various bug fixes included:
Fixed “Unhandled INSERT optimizer hints” error when running INSERT queries
Fixed UPDATE query to set values to NULL
Fixed IN & NOT IN criteria
Fixed converting FetchXML to SQL (e.g. in FetchXML Builder) when the query contains a <link-entity> with no content and aliases that aren’t valid SQL identifiers
Grouping works quite simply for data types such as text, picklist or lookup values, but a simple grouping of date/time values wouldn’t work very usefully. They are stored with second (or in some cases, millisecond) precision, so in most cases no two records will have exactly the same value and each record will end up in its own group.
More typically you’ll want to group records that have a value in the same day or month to generate summary charts or reports, and that’s exactly what FetchXML does for you using the dategrouping attribute:
This example will get the number of contacts created each month.
Date/time values can be grouped by:
If you are grouping by a date/time column, you must use one of these groupings. You can’t group by the raw date/time value – if you try you’ll get the error Invalid value '' for dategrouping.
You’ll notice that there’s no support for anything more precise than a day – you can’t group by hours for example. If you have a need for that you’ll need to retrieve each individual record and implement your own grouping logic.
When you group by day/week/month/quarter, you get the appropriate day/week/month/quarter number. For example, when grouping by month you get a month number 1 – 12. Note that this means records in the same month from different years will be included in the same month. If this isn’t the behaviour you want, e.g. you want to keep January 2019 separate from January 2020, you need to include a separate grouping by year as well:
For week grouping you get a number 1 – 53, quarter is 1 – 4 and day is 1 – 31.
These date/time grouping options are equivalent to using the DATEPART function in SQL:
SELECT DATEPART(month, createdon) AS month,
count(*) AS count
GROUP BY DATEPART(month, createdon)
There’s no direct SQL equivalent for the fiscal period / year grouping options as these are application specific. However, in SQL 4 CDS I’ve extended the DATEPART function to allow fiscalperiod or fiscalyear to be specified as the part type name parameter.
You’ll need SQL 4 CDS 1.0.5 (which I should be releasing soon) or later to use the DATEPART function.
One feature of FetchXML that sets it apart from other CDS query options is aggregates. This is the option to get summary data (number of records, sum of opportunity value etc.) rather than the individual values from each record. If your only concept of how many records you have is “5,000+” then this is for you.
Let’s take a look at an example to get the most common first names of your active contacts:
There are 4 main points to note that are different from a regular query:
The <fetch> element has the aggregate="true" attribute
All the <attribute> elements have an alias. This is required for all aggregate queries, and is the one exception to my “don’t use attribute aliases” rule.
All the <attribute> elements also have EITHER a groupby="true" or aggregate="xxx" attribute
The <order> element uses an alias="xxx" attribute to refer to the previously aliased <attribute> instead of the normal attribute="xxx"
Run this query and you’ll get a list of different firstnames along with the number of active contacts that have that name. The list will be sorted with the most popular name at the top.
There are only a few aggregate functions you can apply:
If you’re familiar with SQL these will mostly be familiar as the ways of retrieving the number of records, and the lowest, highest, total and average values in a column.
The one that’s not so obvious is the distinction between count and countcolumn. The only reference I could find for this was a blog dating back to CRM 4.0. In brief, count will give the total number of records, while countcolumn gives the number of records that have a value in the attribute you apply it to. Note that it doesn’t matter which column you apply count to, you’ll get the same total number of records either way.
The countcolumn aggregate also supports a distinct attribute. This lets you get the number of different values in that column, e.g.
This will get a list of countries and the number of different cities in each country.
What “different” means in this context depends on the collation settings for your organisation. For the English installations that I have access to this defaults to ignoring case and accents, so “Münich” and “MUNICH” would be counted as being the same.
You can use <filter> specifications in aggregate queries in the same way as normal. See part 3 of this series for more details about filters. The filter will be applied before the aggregate. There is no way to filter the results of the aggregation, e.g. show results where the count is greater than 1, as helpful as that would be.
You can run an aggregate query without including any groupby="true" attributes. In this case, all the records that match your filter will be aggregated into a single result record. This is useful for finding the total number of records or highest/lowest values across all your records, e.g.
This query gives the number of open opportunities, along with the minimum and maximum values of those opportunities.
However, it is often more useful to group the records to see the differences in the summary values in different groups. For example, you can see the pipelines of different sales people by grouping opportunities by their owners and getting the total value of the opportunities in each group:
This is the first example we’ve looked at that uses a <link-entity>, but they work in aggregate queries in exactly the same way as in regular queries.
It’s very useful to sort the results of aggregate queries. For producing reports such as the opportunity pipeline query earlier, the results could be sorted by user to make it easy to find a particular individual, or by the pipeline value to highlight the best or worst achievers.
With the first example to find the number of contacts with each first name, sorting this to find the most and least popular names can often highlight data quality issues. Try it on your data and see what you find!
There are a few limitations to aggregate queries you should be aware of
Maximum of 50,000 records. If you try to run an aggregate query that uses more than 50,000 records you’ll get the error AggregateQueryRecordLimit exceeded
No paging support. As for normal queries you can get back up to 5,000 records from your query, but if you have more groups than that there is no way to move on to the next page.
To get around these limitations you’ll need to break up your query with more restrictive filters and combine the results. In SQL 4 CDS I worked around these limits by retrieving all the individual records and applying the aggregation in code when necessary.
For one special case of getting the total number of records in an entity without any filters, i.e.
there is a new RetrieveTotalRecordCountRequest message available since version 9 to get this more efficiently and bypassing the 50,000 record limit. SQL 4 CDS will use this optimisation where possible.
These options are equivalent to the GROUP BY clause and aggregate functions in SQL.
The count aggregate is equivalent to count(*) in SQL, and countcolumn is equivalent to count(attributename).
The SQL HAVING clause doesn’t have a FetchXML equivalent – this is one improvement to FetchXML I’d love to see.
RetrieveTotalRecordCountRequest is equivalent to SELECT count(*) FROM entity, with no option to apply a filter. I have a feeling it may actually map to the sp_spaceused SQL Server stored procedure under the hood.
By adding more <order> elements, the results will be sorted by each attribute in sequence. The orders are applied, well, in order. The results will be sorted by the first <order> attribute, then the second one, and so on.
If you run this query however, you’ll see that the results are actually ordered according to the full name of the primary contact.
This is true of any lookup field – the results are sorted according to the primary name attribute of the related entity. This is even true for polymorphic lookup fields such as customer fields. This makes sense from the perspective of the user interface – click on the header of the lookup column and you’d expect it to sort by the name. But this does not guarantee that records with the same guid will be retrieved consecutively, which can cause problems if you are using these IDs in integrations.
If you have code that relies on getting records with the same guid from a lookup field next to each other, order by the primary key of the related record instead of the foreign key lookup field. For example, rewriting the previous query as follows will order the results by the contact ID instead of the name:
As we’ve just seen in the last example, an <order> can be placed within a <link-entity> as well as within the main <entity> to apply the ordering to attributes within the linked entity.
Bear in mind there is no way to specify the sequence that orderings are applied across multiple entities. Orders from the main <entity> are always applied first, then the first <link-entity>, then any sub-link-entities, and so on. In this example:
When you execute any query, by default you will only get up to 5,000 records back. To get any more you need to use paging to get the next 5,000, and the next, and so on.
For paging to work correctly, the data must be sorted so that CDS can work out where the next page should start from.
Because a sort on any field apart from the primary key can result in two records with the same value being next to each other, it would be impossible to use just those values to determine where the next page should start. If you sort by the name of contacts and the last record in the first page has a name of Mark Carrington, the next page could start with another Mark Carrington record as well. There is no way of telling exactly which of these records should be the start of the next page.
To resolve this problem, CDS automatically adds an extra sort on the primary key field. This makes sure there is a unique value for the system to use to tell exactly where the next page should start.
Remember though that a sort can’t be applied to the top-level entity after one that was applied to a link entity. This means that if you’ve sorted the query by a link entity, this automatic sort by the primary key of the main entity can’t be added without giving a different sort order to what the user asked for, so it’s not added and you won’t get a paging cookie back. You can still ask for the next page, but it will be retrieved less efficiently.
Sorting & Aliases
If you choose to give aliases to the attributes in your query (I looked at this in part 1 of this series, and recommended against it), be careful about sorting by it. If you execute this query:
you will get the expected results, but when you try to move on to the next page you’ll get the same records again. If you keep moving on to the next page you’ll get stuck in an infinite loop as you never actually move on to the next page.
If you have a look into the contents of the paging cookie you receive, you’ll see it shows that the last record in the page has a value of null for the fullname attribute, so the first record of the next page should be the next one that also has a null name or the first non-null value. This will obviously give the first page of data again.
I can see why this would happen – the record that was retrieved doesn’t have a fullname attribute, as it’s been aliased to name instead.
To avoid this problem, see my earlier advice and don’t use aliases unless you’re doing aggregate queries! If you must, add the attribute again without an alias, e.g.
As you edit your query, everything else now updates in real-time. No more clicking away to see your changes reflected in the query builder, FetchXML view or any other tool windows! Check it out in this video:
SQL 4 CDS Integration
If you’ve ever used the SQL Query window in FetchXML Builder, you might have noticed it missed various aspects of more complex queries. With the work I’ve done lately on the SQL 4 CDS tool it made sense to use that to enhance the SQL support in FXB too.
To get this to work you need to have SQL 4 CDS version 1.0.4 or later installed, then enable the integration in FXB under Options > Use SQL 4 CDS for SQL conversion. SQL 4 CDS will then be used to generate the SQL query instead of the built-in FXB logic:
As well as showing the query, you can also use the new Edit in SQL 4 CDS button. Edit your query as SQL, then bring it back as FetchXML to continue using all the other great FXB features like converting the query to Power Automate parameters.
So far we’ve looked at using the <attribute> and <link-entity> elements to control what columns to include in your query, but for a truly useful query you’ll likely want to select which rows to include instead of taking everything. Enter <filter>!
Filter or Condition?
A <filter> is a group of individual <condition>s. Each condition applies a filter on a single field. The job of the filter is to combine those conditions with an AND or OR.
The most common operator you’ll use will be eq, short for equals. However, there’s a huge range of operators you can use, especially if you’re filtering on date or user fields. I won’t list them all here, but you can find the full list in the Microsoft documentation. The main ones you’re likely to use are:
It’s worth highlighting that the like operator can be used on lookup fields as well as text fields – in this case it is applied to the primary name attribute of the related record, while eq will be applied to the unique identifier stored in the lookup field itself.
For date values there are many operators [olderthan/last/next]-x-[hours/days/weeks/months], e.g. last-x-days. The value you use in the condition becomes the “x”, e.g. to find records created in the last 3 days:
For user lookup fields, such as ownerid, there are various operators to check if the user matches the current user (eq-user), or any of the same teams the user is part of (eq-userteams, eq-useroruserteams). It can also look through the user hierarchy to match users below the current user in the user hierarchy (eq-useroruserhierarchy, eq-useroruserhierarchyandteams). You can also filter by the same business unit as the current user (eq-businessid)
While we’re talking about hierarchies, remember that any other entity type can also have a hierarchy defined, and you can use the above, under and similar operators to match any record in them rather than having to navigate them manually yourself.
There’s some more good information about querying date and hierarchy data in the Microsoft documentation.
All conditions compare a field to a specific value. Importantly, there is no condition to compare one field to another field.
Most conditions require a single value to be specified in the value attribute, such as the equals operator:
Conditions on lookup fields require the primary key guid value to filter on. If you build a query in Advanced Find and download the FetchXML you’ll also see the uiname and uitype attributes included in the condition to identify the name and type of the record respectively, but these are not used when the query is actually executed.
We saw some of the more advanced operators earlier, especially the date and user operators. These are particularly important when building views for your users to use on a regular basis.
For example, a view that shows active leads owned by the current user created in the last 7 days is probably more useful than one that shows leads created since a fixed date and owned by you, the author of the view.
Filtering Linked Entities
There are two ways filters can be applied to linked entities – within the <link-entity> element, or within the main <entity> element using the entityname attribute on the conditions. Both are valid, but do subtly different things.
When the filter is applied within the <link-entity>, the filter is applied to the entity before it is joined. By adding an entityname attribute to a condition within the main filter, the filter is applied after the join.
When using inner joins there is no practical difference in the results, but with outer joins these approaches can be combined to generate a query to find records that do not have any related records that match some criteria, e.g.:
This will retrieve all the accounts that do not have an active contact called Mark.
The list of contacts is filtered to only active Marks, and the list of accounts is linked to that filtered list using an outer join. As we looked at in the previous part of this series, the outer join type will include all the records from the first list (the accounts). If there are any matches from the second list (the contacts) then those are included, otherwise, all the contact fields in the result are left blank (null). The outer filter then restricts the final result to only those joined records that have a null contactid, and therefore didn’t find a matching contact.
Note that, if you give the link entity an alias, it is that alias you should use in the entityname attribute, not the actual logical name of the entity.
The main filter is equivalent to the SQL WHERE clause. For example, the first query in this post would be equivalent to:
WHERE statecode = 0
AND firstname = 'Mark'
The from and to attributes on a <link-entity> form an ON clause for a join, and a filter within the link entity extend that ON clause. The filter created by the from and to attributes will always be combined to the additional filter with an AND. For example, the query from earlier:
LEFT OUTER JOIN
ON account.accountid = contact.parentcustomerid
AND contact.statecode = 0
AND contact.firstname = 'Mark'
WHERE contact.contactid IS NULL;
The more advanced operators do not have any direct equivalent in SQL. Many of the date conditions compare values only on the date part and ignore the time part to make it easier for the user to build useful queries, but would require a range comparison in SQL. The hierarchical operators would likely require more complex CTEs, and are a great example of how FetchXML delivers a much more powerful application-level query language rather than just a thin veneer over SQL.
In SQL 4 CDS these operators can be used through the pattern of attribute = operator(value), e.g.
WHERE ownerid = equseroruserteams()
OR createdon = lastxweeks(4);