Inside FetchXML pt 9 – distinct

You can ensure your query only produces each row once using the distinct attribute. For example, if you have multiple entries for the same company name you can get the list of unique names using:

<fetch distinct="true">
  <entity name="account">
    <attribute name="name" />
  </entity>
</fetch>

This option gets applied automatically whenever you create a query using Advanced Find. This is particularly useful when you add a join onto a related child entity. For example, this query:

<fetch>
  <entity name="account">
    <attribute name="name" />
    <link-entity name="contact" from="parentcustomerid" to="accountid" link-type="inner">
      <filter>
        <condition attribute="firstname" operator="eq" value="Mark" />
      </filter>
    </link-entity>
  </entity>
</fetch>

The intent here is to get a list of accounts that have a contact called Mark. However, if an account has two Marks, that account will be in the resulting list twice. Adding the distinct attribute makes sure the user gets what they were probably expecting, which is a list of unique accounts.

Unique Identifiers

One other interesting side effect of using distinct is that you no longer get the primary key of the entity back automatically. If you run the query:

<fetch>
  <entity name="account">
    <attribute name="name" />
  </entity>
</fetch>

As well as getting the account name back, you’ll also automatically get the accountid as well.

When you apply distinct, this goes away. This makes sense if you think about it. Imagine you have two accounts with the same name. If the accountid was automatically included, that would make the records unique. You would still get two rows back from your query, regardless of whether or not you used distinct.

If the purpose of using distinct is to eliminate duplicate rows introduced by a join onto a child entity as shown above though, you probably do want to get a row back for each of your top-level entity type. In that case you will need to explicitly include the primary key attribute in your query, e.g.

<fetch distinct="true">
  <entity name="account">
    <attribute name="accountid" />
    <attribute name="name" />
    <link-entity name="contact" from="parentcustomerid" to="accountid" link-type="inner">
      <filter>
        <condition attribute="firstname" operator="eq" value="Mark" />
      </filter>
    </link-entity>
  </entity>
</fetch>

Truncation

One “gotcha” to be aware of. Any text values will be truncated to 2,000 characters before the distinct operation is applied. If you have longer values, e.g. descriptions or the contents of attachments in the annotation entity, any extra data after the first 2,000 characters will be lost.

Quite why this would be is a bit of a mystery. My best guess is that these fields used to be stored in a SQL Server ntext field which couldn’t be used in a SELECT DISTINCT query, so to work around this problem Dynamics CRM would cast these values to an nvarchar(2000) value which could be used instead. You can see this in the generated SQL:

<fetch distinct="true" top="10">
  <entity name="annotation">
    <attribute name="documentbody" />
  </entity>
</fetch>
select 
 DISTINCT  top 10  cast( "annotation0".DocumentBody as varchar(2000) ) as "documentbody" 
from
 AnnotationBase as "annotation0"

As these columns are now stored using an nvarchar(max) field which can be used directly in a SELECT DISTINCT query I guess this behaviour is still in place for backwards compatibility reasons, although it does give about a 35% performance improvement:

SQL Equivalent

Apart from this once caveat of long text fields, this is directly equivalent to the SQL DISTINCT keyword.

Inside FetchXML pt 8 – page, count & top

By default you’ll get up to 5,000 records that match your query. The top-level <fetch> element supports a few options to help you control this:

top will limit the number of records further, e.g.

<fetch top="10">
  <entity name="account">
    <all-attributes />
  </entity>
</fetch>

When you use top to limit the number of records, you don’t have the option to move on to subsequent pages. You can’t use this to increase the number of records beyond 5,000 – you can only use a number between 1 and 5,000.

In contrast, count and page allow you to move through multiple pages of results and control the size of each page:

<fetch count="100" page="3">
  <entity name="account">
    <all-attributes />
    <order attribute="createdon" />
  </entity>
</fetch>

This example will get records 201 – 300. If you are working you way through sequential pages you should also set the paging-cookie attribute with the value of the paging cookie from the previous page.

So when should you use top and when should you use count and page?

If you are only ever interested in the first n records (e.g. you just want the first record that matches your query), use top as you have no need to retrieve multiple result pages.

On the other hand, if you do need to retrieve all the results (or at least, more than “just” the first 5,000), use page to indicate which number page you want. In addition you can optionally use count to control the size of each page – you might set this after some trial-and-error testing of the performance of your query.

SQL Equivalents

In SQL terms, top is equivalent to the SELECT TOP (n) clause. count and page are similar to the OFFSET ... FETCH ... clause.

Inside FetchXML pt 7 – rowaggregate

If you’re using any hierarchies in your data, you’ve probably noticed the icon in your main grid views that you can click on to show the hierarchy view when records have a parent or children. It’s simple to work out if a record has a parent – just check if the parent lookup attribute contains data. But have you ever stopped to think about how it works out if a record has got any children?

Hierarchical records highlighted in grid view

It would be horribly inefficient to execute a separate query for each row to get the number of children. Instead, it will use a FetchXML feature called rowaggregate.

Similar to applying standard aggregations, this is applied to an <attribute> in your query, e.g.

<fetch>
  <entity name="account">
    <attribute name="name" />
    <attribute name="accountid" rowaggregate="countchildren" alias="NumberOfChildren" />
  </entity>
</fetch>

You don’t have any control within your query of what is meant by “children” in this context. This is defined by the hierarchy configuration of the entity being queried.

Hierarchical relationship setting

You can only apply the rowaggregate option to an attribute that is the primary key in a hierarchical relationship (the “Related (One)” side in the screenshot above). You’ll receive an error if you try to apply it to any other attribute, e.g. if you try to apply it on the telephone1 attribute:

Entity: Account 'rowaggregate' is not supported on attribute, telephone1. Hierarchy relationship accountid <- parentaccountid.

You can only use rowaggregate on the top-level entity in your query. You’ll receive an unhelpful error An unexpected error occurred if you use it within a link-entity.

SQL 4 CDS 1.0.9 Released!

I’ve just released SQL 4 CDS 1.0.9. Please update it when XrmToolBox prompts you to take advantage of these latest features:

Open Source!

I’m very pleased to be able to release SQL 4 CDS as open source, thanks to the support of Data8. You can now grab all the code on GitHub, submit your bug reports or suggestions as issues or even open a pull request with your own changes!

As well as the XrmToolBox tool, I’ve also separated out the core SQL ⇔ FetchXML conversions into the MarkMpn.Sql4Cds.Engine NuGet package you can include in your own projects. A simple code example to convert from SQL to FetchXML would look like:

public string ConvertSqlToFetchXml(IOrganizationService org, string selectStatement)
{
  var metadataCache = new AttributeMetadataCache(org);
  var converter = new Sql2FetchXml(metadataCache, true);
  var queries = converter.Convert(selectStatement);

  // queries could contain many queries, including SELECT, INSERT, UPDATE or DELETE
  // We'll assume it's just a single SELECT statement for now
  var select = (SelectQuery) queries[0];

  // The FetchXML is returned as an object model, convert it to a string using XmlSerializer
  using (var writer = new StringWriter())
  {
    var serializer = new XmlSerializer(typeof(FetchType));
    serializer.Serialize(select.FetchXml);

    return writer.ToString();
  }
}

XrmToolBox 1.2020.2.36

If you get an error when opening SQL 4 CDS:

An error occured when trying to display this tool: Method not found: 'McTools.Xrm.Connection.ConnectionDetail ConnectionUpdatedEventArgs.get_ConnectionDetail()'.

after installing the latest version of XrmToolBox, please update SQL 4 CDS to 1.0.9 as well to fix it.

Cancel running queries

If you need to stop a query while it’s running, just click the 🟥 button in the toolbar to stop it.

Improved results display

The grid view showing the results of a query is improved with:

  • total record count at the bottom
  • record numbers on the left
  • right-click options to copy data with or without headers
  • click on lookup guid values to open the related record or quickly create a SELECT query to get the details of the record
  • columns in a SELECT * query are sorted by name to make it easier to find what you’re looking for
  • option to show lookup values as the name of the related record instead of the GUID
  • option to show times as your local time zone instead of UTC

Improved query support

You will now get the expected results when running queries that combine * and a field name in the SELECT clause, e.g. SELECT name, * FROM account, and when ordering a SELECT * query, e.g. SELECT * FROM account ORDER BY name

Double quotes

By default, SQL treats double quotes as enclosing the name of a table or column, and single quotes for strings, so you can’t write:

SELECT name
FROM   account
WHERE  telephone1 = "01513554555"

as the account entity doesn’t contain a field called 01513554555

You can now switch this behaviour so double quotes and single quotes are both used to enclose strings by unticking the new “Quoted Identifiers” option in the settings window.

Feedback

Please open an issue in the GitHub repository to let me know of any problems you encounter or with any suggestions you might have for improvement.

Please also use the rating feature in XrmToolBox to let me know what you think!

FetchXML Multiple Links

One common pattern of queries I see about FetchXML is how to write queries to ask two different questions about the same related entity. For example:

  • Invoices that include product A AND product B?
  • Contacts that have pending emails AND no sent emails?
  • Visits that have a page view of the checkout page BUT NOT the “thank you” page?

Multiple Links

The key with these sorts of queries is that each question you want to ask of the related entity requires a separate link. Taking the first example:

Invoices that include product A AND product B?

If we write this as:

Advanced Find

Finding invoices with two products – attempt 1!

FetchXML

<fetch xmlns:generator="MarkMpn.SQL4CDS">
  <entity name="invoice">
    <link-entity name="invoicedetail" to="invoiceid" from="invoiceid" alias="id" link-type="inner">
      <link-entity name="product" to="productid" from="productid" alias="p" link-type="inner">
        <all-attributes />
      </link-entity>
      <all-attributes />
    </link-entity>
    <all-attributes />
    <filter>
      <condition attribute="productnumber" entityname="p" operator="eq" value="A" />
      <condition attribute="productnumber" entityname="p" operator="eq" value="B" />
    </filter>
  </entity>
</fetch>

SQL

SELECT *
FROM   invoice AS i
       INNER JOIN
       invoicedetail AS id
       ON i.invoiceid = id.invoiceid
       INNER JOIN
       product AS p
       ON id.productid = p.productid
WHERE  p.productnumber = 'A'
       AND p.productnumber = 'B';

This is a common mistake I see, and one you can make easily if you try to build this query in Advanced Find. When the query is run, it tries to find invoices that have an invoice item that has a product that has the product number “A” and “B”. Since it’s impossible for a product to have two different product numbers in the same field, you’ll never get any results for this query.

The key change we need to make is to have two separate links to the invoicedetail entity, one filtered for product A and the other for product B:

FetchXML

<fetch xmlns:generator="MarkMpn.SQL4CDS">
  <entity name="invoice">
    <link-entity name="invoicedetail" to="invoiceid" from="invoiceid" alias="id_a" link-type="inner">
      <link-entity name="product" to="productid" from="productid" alias="p_a" link-type="inner">
        <filter>
          <condition attribute="productnumber" operator="eq" value="A" />
        </filter>
        <all-attributes />
      </link-entity>
      <all-attributes />
    </link-entity>
    <link-entity name="invoicedetail" to="invoiceid" from="invoiceid" alias="id_b" link-type="inner">
      <link-entity name="product" to="productid" from="productid" alias="p_b" link-type="inner">
        <filter>
          <condition attribute="productnumber" operator="eq" value="B" />
        </filter>
        <all-attributes />
      </link-entity>
      <all-attributes />
    </link-entity>
    <all-attributes />
  </entity>
</fetch>

SQL

SELECT *
FROM   invoice AS i
       INNER JOIN
       invoicedetail AS id_a
       ON i.invoiceid = id_a.invoiceid
       INNER JOIN
       product AS p_a
       ON id_a.productid = p_a.productid and p_a.productnumber = 'A'
       INNER JOIN
       invoicedetail AS id_b
       ON i.invoiceid = id_b.invoiceid
       INNER JOIN
       product AS p_b
       ON id_b.productid = p_b.productid and p_b.productnumber = 'B';

With this change the query will now find invoices that have an invoice item that has a product that has the product number “A”, and also an invoice item that has a product that has the product number “B”. It’s more long-winded to write but gets us to the right answer.

Unfortunately it’s not possible to build this query in Advanced Find, so if you need to use a query like this in your app you’ll need to get the query ready in FetchXML Builder and save it from there as a view or marketing list (but see the workaround at the end).

Combining with “Not In” Queries

Taking another example:

Visits that have a page view of the checkout page BUT NOT the “thank you” page?

If you’re using ClickDimensions you get a lot of powerful web tracking information recorded in custom entities. Two of these are Page View, which records an individual web page that someone has viewed, and Visit, which represents a session that can cover multiple pages. Each Page View is linked to a Visit. We can find abandoned shopping carts by querying these to find sessions which include a visit to the shopping cart page but not to the page that it shown after a purchase is complete. This makes use of the “not in” pattern of using a left outer join and null query.

FetchXML

<fetch xmlns:generator="MarkMpn.SQL4CDS">
  <entity name="cdi_visit">
    <link-entity name="cdi_pageview" to="cdi_visitid" from="cdi_visitid" alias="checkout" link-type="inner">
      <filter>
        <condition attribute="cdi_uri" operator="eq" value="https://example.com/checkout" />
      </filter>
      <all-attributes />
    </link-entity>
    <link-entity name="cdi_pageview" to="cdi_visitid" from="cdi_visitid" alias="finished" link-type="outer">
      <filter>
        <condition attribute="cdi_uri" operator="eq" value="https://example.com/checkout/thankyou" />
      </filter>
      <all-attributes />
    </link-entity>
    <all-attributes />
    <filter>
      <condition attribute="cdi_pageviewid" entityname="finished" operator="null" />
    </filter>
  </entity>
</fetch>

SQL

SELECT *
FROM   cdi_visit AS v
       INNER JOIN
       cdi_pageview AS checkout
       ON v.cdi_visitid = checkout.cdi_visitid
          AND checkout.cdi_uri = 'https://example.com/checkout'
       LEFT OUTER JOIN
       cdi_pageview AS finished
       ON v.cdi_visitid = finished.cdi_visitid
          AND finished.cdi_uri = 'https://example.com/checkout/thankyou'
WHERE  finished.cdi_pageviewid IS NULL;

“All” / “Only” Queries

There’s also a related type of query that checks if something is true for all related entities, e.g.:

  • Accounts that only have contacts with an info@ email address?
  • Users that have sales on all their accounts this month?

It’s actually possible to re-phrase these queries into the same pattern as above of asking two separate questions of the link entity. For example:

Accounts that only have contacts with an info@ email address?

You could re-write this as:

Accounts that have a contact with an info@ email address and don’t have any contacts that don’t have an info@ email address?

Now we can apply the same pattern again:

FetchXML

<fetch xmlns:generator="MarkMpn.SQL4CDS">
  <entity name="account">
    <link-entity name="contact" to="accountid" from="parentcustomerid" alias="info" link-type="inner">
      <filter>
        <condition attribute="emailaddress1" operator="like" value="info@%" />
      </filter>
      <all-attributes />
    </link-entity>
    <link-entity name="contact" to="accountid" from="parentcustomerid" alias="noninfo" link-type="outer">
      <filter>
        <condition attribute="emailaddress1" operator="not-like" value="info@%" />
      </filter>
      <all-attributes />
    </link-entity>
    <all-attributes />
    <filter>
      <condition attribute="contactid" entityname="noninfo" operator="null" />
    </filter>
  </entity>
</fetch>

SQL

SELECT *
FROM   account AS a
       INNER JOIN
       contact AS info
       ON a.accountid = info.parentcustomerid
          AND info.emailaddress1 LIKE 'info@%'
       LEFT OUTER JOIN
       contact AS noninfo
       ON a.accountid = noninfo.parentcustomerid
          AND noninfo.emailaddress1 NOT LIKE 'info@%'
WHERE  noninfo.contactid IS NULL;

Advanced Find

I mentioned earlier that you couldn’t build these queries in Advanced Find, and that’s mostly true. If you try to create one of these queries exactly as I’ve shown you’ll get an error when you try to add the second instance of the related entity:

The relationship you are adding already exists in the query.

But with a bit of creativity we can switch the query around to something that Advanced Find can handle, although as a database person it’s not as “nice”.

Instead of having our main entity listed only once, we can join back to it from the related entity. Instead of joining from invoice to invoicedetail twice, we join from invoice to invoicedetail, back to invoice and on to invoicedetail again:


Finding invoices with two products – attempt 2. Longer but gets the right answer!

This works fine for this case, but you may not get the results you want if you try this in combination with a “not in” type query. That would also apply to all the other link entities you chain off it, and could have the effect of changing an “in” query to a “not in”.

SQL 4 CDS 1.0.6 Released!

I’m very pleased to announce the release of SQL 4 CDS version 1.0.6! Please update your existing versions in XrmToolBox to get some great new features…

COUNT(DISTINCT) Support

You can now use the count(DISTINCT col) aggregate function, e.g.:

SELECT count(DISTINCT firstname) AS distinct_names,
       count(firstname) AS names,
       count(*) AS contacts
FROM   contact
WHERE  statecode = 0

This will get the number of different first names, the number of active contacts with a first name and the total number of active contacts, and gets translated to the following FetchXML:

<fetch xmlns:generator="MarkMpn.SQL4CDS" aggregate="true">
  <entity name="contact">
    <attribute name="firstname" alias="distinct_names" aggregate="countcolumn" distinct="true" />
    <attribute name="firstname" alias="names" aggregate="countcolumn" />
    <attribute name="contactid" alias="contacts" aggregate="count" />
    <filter>
      <condition attribute="statecode" operator="eq" value="0" />
    </filter>
  </entity>
</fetch>

DATEPART support

You can also now group data using the DATEPART function, e.g. find out how many new leads you’ve received each month using:

SELECT   datepart(year, createdon) AS year,
         datepart(month, createdon) AS month,
         count(*) AS leads
FROM     lead
GROUP BY datepart(year, createdon), datepart(month, createdon)
ORDER BY 1, 2

This returns the year number, month number and the number of leads created in that month. The results are returned in date order. This translates to the following FetchXML:

<fetch xmlns:generator="MarkMpn.SQL4CDS" aggregate="true">
  <entity name="lead">
    <attribute name="createdon" alias="year" groupby="true" dategrouping="year" />
    <attribute name="createdon" alias="month" groupby="true" dategrouping="month" />
    <attribute name="leadid" alias="leads" aggregate="count" />
    <order alias="year" />
    <order alias="month" />
  </entity>
</fetch>

Because the dategrouping attribute in FetchXML supports a few more options than the standard T-SQL DATEPART function, I’ve extended it in SQL 4 CDS to allow some additional values as the first parameter. You can also use fiscalperiod and fiscalyear to group your data by your fiscal periods instead of calendar periods, e.g.:

SELECT   datepart(fiscalyear, createdon) AS year,
         datepart(fiscalmonth, createdon) AS month,
         count(*) AS leads
FROM     lead
GROUP BY datepart(fiscalyear, createdon), datepart(fiscalmonth, createdon)
ORDER BY 1, 2

Sorting on link entities

You can now sort your query based on a field from a joined table, e.g.:

SELECT   c.firstname,
         a.name
FROM     contact AS c
         INNER JOIN
         account AS a
         ON c.parentcustomerid = a.accountid
ORDER BY c.firstname, a.name

Remember though that you can’t order by the joined table first and then the first table, so you can’t do:

SELECT   c.firstname,
         a.name
FROM     contact AS c
         INNER JOIN
         account AS a
         ON c.parentcustomerid = a.accountid
ORDER BY a.name, c.firstname

If you use an order that isn’t supported by FetchXML you’ll get the error Order already applied to later link-entity

UPDATE & DELETE Batch Sizes

You know you can write more than just SELECT statements in SQL 4 CDS? You can now control the number of records that are processed for UPDATE and DELETE in each batch, which can help avoid timeouts if each one takes a long time because of plugins etc. To control the batch size, click on the Settings button in the toolbar:

Control UPDATE and DELETE batch size

Bug Fixes

As well as these new features there are various bug fixes included:

  • Fixed “Unhandled INSERT optimizer hints” error when running INSERT queries
  • Fixed UPDATE query to set values to NULL
  • Fixed IN & NOT IN criteria
  • Fixed converting FetchXML to SQL (e.g. in FetchXML Builder) when the query contains a <link-entity> with no content and aliases that aren’t valid SQL identifiers

Inside FetchXML pt 6 – datetime grouping

Last time I looked at aggregates, but unforgivably I missed out how to apply grouping to date/time attributes:

Grouping works quite simply for data types such as text, picklist or lookup values, but a simple grouping of date/time values wouldn’t work very usefully. They are stored with second (or in some cases, millisecond) precision, so in most cases no two records will have exactly the same value and each record will end up in its own group.

More typically you’ll want to group records that have a value in the same day or month to generate summary charts or reports, and that’s exactly what FetchXML does for you using the dategrouping attribute:

<fetch aggregate="true">
  <entity name="contact">
    <attribute name="createdon" alias="month" groupby="true" dategrouping="month" />
    <attribute name="contactid" alias="occurrences" aggregate="count" />
    <filter>
      <condition attribute="statecode" operator="eq" value="0" />
    </filter>
    <order alias="occurrences" descending="true" />
  </entity>
</fetch>

This example will get the number of contacts created each month.

Available Groupings

Date/time values can be grouped by:

  • day
  • week
  • month
  • quarter
  • year
  • fiscal period
  • fiscal year

If you are grouping by a date/time column, you must use one of these groupings. You can’t group by the raw date/time value – if you try you’ll get the error Invalid value '' for dategrouping.

You’ll notice that there’s no support for anything more precise than a day – you can’t group by hours for example. If you have a need for that you’ll need to retrieve each individual record and implement your own grouping logic.

When you group by day/week/month/quarter, you get the appropriate day/week/month/quarter number. For example, when grouping by month you get a month number 1 – 12. Note that this means records in the same month from different years will be included in the same month. If this isn’t the behaviour you want, e.g. you want to keep January 2019 separate from January 2020, you need to include a separate grouping by year as well:

<fetch aggregate="true">
  <entity name="contact">
    <attribute name="createdon" alias="year" groupby="true" dategrouping="year" />
    <attribute name="createdon" alias="month" groupby="true" dategrouping="month" />
    <attribute name="contactid" alias="occurrences" aggregate="count" />
    <filter>
      <condition attribute="statecode" operator="eq" value="0" />
    </filter>
    <order alias="occurrences" descending="true" />
  </entity>
</fetch>

For week grouping you get a number 1 – 53, quarter is 1 – 4 and day is 1 – 31.

SQL Equivalents

These date/time grouping options are equivalent to using the DATEPART function in SQL:

SELECT   DATEPART(month, createdon) AS month,
         count(*) AS count
FROM     contact
GROUP BY DATEPART(month, createdon)

There’s no direct SQL equivalent for the fiscal period / year grouping options as these are application specific. However, in SQL 4 CDS I’ve extended the DATEPART function to allow fiscalperiod or fiscalyear to be specified as the part type name parameter.

You’ll need SQL 4 CDS 1.0.5 (which I should be releasing soon) or later to use the DATEPART function.

Inside FetchXML pt 5 – aggregate

One feature of FetchXML that sets it apart from other CDS query options is aggregates. This is the option to get summary data (number of records, sum of opportunity value etc.) rather than the individual values from each record. If your only concept of how many records you have is “5,000+” then this is for you.

Let’s take a look at an example to get the most common first names of your active contacts:

<fetch aggregate="true">
  <entity name="contact">
    <attribute name="firstname" alias="firstname" groupby="true" />
    <attribute name="contactid" alias="occurrences" aggregate="count" />
    <filter>
      <condition attribute="statecode" operator="eq" value="0" />
    </filter>
    <order alias="occurrences" descending="true" />
  </entity>
</fetch>

There are 4 main points to note that are different from a regular query:

  1. The <fetch> element has the aggregate="true" attribute
  2. All the <attribute> elements have an alias. This is required for all aggregate queries, and is the one exception to my “don’t use attribute aliases” rule.
  3. All the <attribute> elements also have EITHER a groupby="true" or aggregate="xxx" attribute
  4. The <order> element uses an alias="xxx" attribute to refer to the previously aliased <attribute> instead of the normal attribute="xxx"

Run this query and you’ll get a list of different firstnames along with the number of active contacts that have that name. The list will be sorted with the most popular name at the top.

Available Aggregates

There are only a few aggregate functions you can apply:

  • count
  • countcolumn
  • min
  • max
  • sum
  • avg

If you’re familiar with SQL these will mostly be familiar as the ways of retrieving the number of records, and the lowest, highest, total and average values in a column.

The one that’s not so obvious is the distinction between count and countcolumn. The only reference I could find for this was a blog dating back to CRM 4.0. In brief, count will give the total number of records, while countcolumn gives the number of records that have a value in the attribute you apply it to. Note that it doesn’t matter which column you apply count to, you’ll get the same total number of records either way.

Distinct Counts

The countcolumn aggregate also supports a distinct attribute. This lets you get the number of different values in that column, e.g.

<fetch aggregate="true">
  <entity name="contact">
    <attribute name="address1_country" alias="country" groupby="true" />
    <attribute name="address1_city" alias="cities" aggregate="countcolumn" distinct="true" />
    <filter>
      <condition attribute="statecode" operator="eq" value="0" />
    </filter>
    <order alias="cities" descending="true" />
  </entity>
</fetch>

This will get a list of countries and the number of different cities in each country.

What “different” means in this context depends on the collation settings for your organisation. For the English installations that I have access to this defaults to ignoring case and accents, so “Münich” and “MUNICH” would be counted as being the same.

Filtering

You can use <filter> specifications in aggregate queries in the same way as normal. See part 3 of this series for more details about filters. The filter will be applied before the aggregate. There is no way to filter the results of the aggregation, e.g. show results where the count is greater than 1, as helpful as that would be.

Grouping

You can run an aggregate query without including any groupby="true" attributes. In this case, all the records that match your filter will be aggregated into a single result record. This is useful for finding the total number of records or highest/lowest values across all your records, e.g.

<fetch aggregate="true">
  <entity name="opportunity">
    <attribute name="opportunityid" alias="count" aggregate="count" />
    <attribute name="estimatedvalue" alias="min" aggregate="min" />
    <attribute name="estimatedvalue" alias="max" aggregate="max" />
    <filter>
      <condition attribute="statecode" operator="eq" value="0" />
    </filter>
  </entity>
</fetch>

This query gives the number of open opportunities, along with the minimum and maximum values of those opportunities.

However, it is often more useful to group the records to see the differences in the summary values in different groups. For example, you can see the pipelines of different sales people by grouping opportunities by their owners and getting the total value of the opportunities in each group:

<fetch aggregate="true">
  <entity name="opportunity">
    <link-entity name="systemuser" to="ownerid" from="systemuserid" link-type="inner">
      <attribute name="fullname" alias="fullname" groupby="true" />
      <order alias="fullname" />
    </link-entity>
    <attribute name="estimatedvalue" alias="value" aggregate="sum" />
  </entity>
</fetch>

This is the first example we’ve looked at that uses a <link-entity>, but they work in aggregate queries in exactly the same way as in regular queries.

Ordering

It’s very useful to sort the results of aggregate queries. For producing reports such as the opportunity pipeline query earlier, the results could be sorted by user to make it easy to find a particular individual, or by the pipeline value to highlight the best or worst achievers.

With the first example to find the number of contacts with each first name, sorting this to find the most and least popular names can often highlight data quality issues. Try it on your data and see what you find!

Limitations

There are a few limitations to aggregate queries you should be aware of

  1. Maximum of 50,000 records. If you try to run an aggregate query that uses more than 50,000 records you’ll get the error AggregateQueryRecordLimit exceeded
  2. No paging support. As for normal queries you can get back up to 5,000 records from your query, but if you have more groups than that there is no way to move on to the next page.

Alternatives

To get around these limitations you’ll need to break up your query with more restrictive filters and combine the results. In SQL 4 CDS I worked around these limits by retrieving all the individual records and applying the aggregation in code when necessary.

For one special case of getting the total number of records in an entity without any filters, i.e.

<fetch aggregate="true">
  <entity name="contact">
    <attribute name="contactid" alias="count" aggregate="count" />
  </entity>
</fetch>

there is a new RetrieveTotalRecordCountRequest message available since version 9 to get this more efficiently and bypassing the 50,000 record limit. SQL 4 CDS will use this optimisation where possible.

SQL Equivalents

These options are equivalent to the GROUP BY clause and aggregate functions in SQL.

The count aggregate is equivalent to count(*) in SQL, and countcolumn is equivalent to count(attributename).

The SQL HAVING clause doesn’t have a FetchXML equivalent – this is one improvement to FetchXML I’d love to see.

RetrieveTotalRecordCountRequest is equivalent to SELECT count(*) FROM entity, with no option to apply a filter. I have a feeling it may actually map to the sp_spaceused SQL Server stored procedure under the hood.

Inside FetchXML pt 4 – order

It’s often important to sort the results of your query, and in FetchXML you can do this with the <order> element. This is equivalent to the ORDER BY clause in SQL.

<fetch>
  <entity name="contact">
    <attribute name="fullname" />
    <order attribute="fullname" />
  </entity>
</fetch>

Sorting by multiple fields

By adding more <order> elements, the results will be sorted by each attribute in sequence. The orders are applied, well, in order. The results will be sorted by the first <order> attribute, then the second one, and so on.

<fetch>
  <entity name="contact">
    <attribute name="fullname" />
    <order attribute="fullname" />
    <order attribute="createdon" />
  </entity>
</fetch>

This will order the results by fullname, then by createdon. In this way we can ensure that two records with the same name will be sorted so that the earliest created record comes before later records.

Reversing ordering

Each <order> is sorted in ascending order by default, so numerically lower, alphabetically earlier or earlier date values come first.

This ordering is reversible by applying the descending attribute. This can be selectively applied to individual <order> elements, so some sorts are applied in ascending order and others descending.

<fetch>
  <entity name="contact">
    <attribute name="fullname" />
    <order attribute="fullname" />
    <order attribute="createdon" descending="true" />
  </entity>
</fetch>

In this way we can reverse the previous logic and prefer a more recent record over older ones where two have the same name, but keep the alphabetical ordering.

Sorting on lookup fields

You might expect an ordering applied to a lookup field to sort by the guid value in the lookup field itself. Indeed, you can write a sort as:

<fetch>
  <entity name="account">
    <attribute name="accountid" />
    <attribute name="primarycontactid" />
    <link-entity name="contact" to="primarycontactid" from="contactid">
      <attribute name="fullname" />
    </link-entity>
    <order attribute="primarycontactid" />
  </entity>
</fetch>

If you run this query however, you’ll see that the results are actually ordered according to the full name of the primary contact.

This is true of any lookup field – the results are sorted according to the primary name attribute of the related entity. This is even true for polymorphic lookup fields such as customer fields. This makes sense from the perspective of the user interface – click on the header of the lookup column and you’d expect it to sort by the name. But this does not guarantee that records with the same guid will be retrieved consecutively, which can cause problems if you are using these IDs in integrations.

If you have code that relies on getting records with the same guid from a lookup field next to each other, order by the primary key of the related record instead of the foreign key lookup field. For example, rewriting the previous query as follows will order the results by the contact ID instead of the name:

<fetch>
  <entity name="account">
    <attribute name="accountid" />
    <attribute name="primarycontactid" />
    <link-entity name="contact" to="primarycontactid" from="contactid">
      <attribute name="fullname" />
      <order attribute="contactid" />
    </link-entity>
  </entity>
</fetch>

Sorting on linked entities

As we’ve just seen in the last example, an <order> can be placed within a <link-entity> as well as within the main <entity> to apply the ordering to attributes within the linked entity.

Bear in mind there is no way to specify the sequence that orderings are applied across multiple entities. Orders from the main <entity> are always applied first, then the first <link-entity>, then any sub-link-entities, and so on. In this example:

<fetch>
  <entity name="account">
    <attribute name="accountid" />
    <link-entity name="contact" to="primarycontactid" from="contactid">
      <link-entity name="systemuser" to="ownerid" from="systemuserid">
        <attribute name="fullname" />
        <order attribute="fullname" />
      </link-entity>
      <attribute name="fullname" />
      <order attribute="fullname" />
    </link-entity>
    <link-entity name="systemuser" to="ownerid" from="systemuserid">
      <attribute name="fullname" />
      <order attribute="fullname" />
    </link-entity>
    <order attribute="name" />
    <order attribute="createdon" />
  </entity>
</fetch>

CDS sorts the results by:

  • Account name
  • Account creation date
  • Primary contact name
  • Primary contact owner name
  • Account owner name

In programming terms, this is a depth first search.

Sorting & Paging

When you execute any query, by default you will only get up to 5,000 records back. To get any more you need to use paging to get the next 5,000, and the next, and so on.

For paging to work correctly, the data must be sorted so that CDS can work out where the next page should start from.

Because a sort on any field apart from the primary key can result in two records with the same value being next to each other, it would be impossible to use just those values to determine where the next page should start. If you sort by the name of contacts and the last record in the first page has a name of Mark Carrington, the next page could start with another Mark Carrington record as well. There is no way of telling exactly which of these records should be the start of the next page.

To resolve this problem, CDS automatically adds an extra sort on the primary key field. This makes sure there is a unique value for the system to use to tell exactly where the next page should start.

Remember though that a sort can’t be applied to the top-level entity after one that was applied to a link entity. This means that if you’ve sorted the query by a link entity, this automatic sort by the primary key of the main entity can’t be added without giving a different sort order to what the user asked for, so it’s not added and you won’t get a paging cookie back. You can still ask for the next page, but it will be retrieved less efficiently.

Sorting & Aliases

If you choose to give aliases to the attributes in your query (I looked at this in part 1 of this series, and recommended against it), be careful about sorting by it. If you execute this query:

<fetch>
  <entity name="contact">
    <attribute name="fullname" alias="name" />
    <order attribute="fullname" />
  </entity>
</fetch>

you will get the expected results, but when you try to move on to the next page you’ll get the same records again. If you keep moving on to the next page you’ll get stuck in an infinite loop as you never actually move on to the next page.

If you have a look into the contents of the paging cookie you receive, you’ll see it shows that the last record in the page has a value of null for the fullname attribute, so the first record of the next page should be the next one that also has a null name or the first non-null value. This will obviously give the first page of data again.

I can see why this would happen – the record that was retrieved doesn’t have a fullname attribute, as it’s been aliased to name instead.

To avoid this problem, see my earlier advice and don’t use aliases unless you’re doing aggregate queries! If you must, add the attribute again without an alias, e.g.

<fetch>
  <entity name="contact">
    <attribute name="fullname" alias="name" />
    <attribute name="fullname" />
    <order attribute="fullname" />
  </entity>
</fetch>

When CDS calculates the paging cookie to give back, the final record in the page will have a fullname value to use and therefore the cookie will correctly identify where to start the next page.

FetchXML Builder Updates

I’m very proud to have some more of my contributions included in the latest release of the popular FetchXML Builder tool!

There are two main updates I’d like to show you:

Live Updates

As you edit your query, everything else now updates in real-time. No more clicking away to see your changes reflected in the query builder, FetchXML view or any other tool windows! Check it out in this video:

SQL 4 CDS Integration

If you’ve ever used the SQL Query window in FetchXML Builder, you might have noticed it missed various aspects of more complex queries. With the work I’ve done lately on the SQL 4 CDS tool it made sense to use that to enhance the SQL support in FXB too.

To get this to work you need to have SQL 4 CDS version 1.0.4 or later installed, then enable the integration in FXB under Options > Use SQL 4 CDS for SQL conversion. SQL 4 CDS will then be used to generate the SQL query instead of the built-in FXB logic:

FetchXML Builder / SQL 4 CDS integration

As well as showing the query, you can also use the new Edit in SQL 4 CDS button. Edit your query as SQL, then bring it back as FetchXML to continue using all the other great FXB features like converting the query to Power Automate parameters.