CDS T-SQL endpoint pt 2 – First Thoughts

Let me just say, this thing is going to be great. It’s the biggest advancement in the platform for some years, and I don’t think any competitor will come close.

However, it is a preview feature, and it does have some issues. Hopefully some of these at least will get looked at before it goes GA.

Supported Syntax

I expected simple SELECT col1, ..., coln FROM table1 INNER JOIN table2 ON ... to work, but I didn’t expect to be able to use PIVOT, APPLY, sub-queries etc which do all seem to work as expected 😀

In fact, the only query syntax I regularly use that I’ve spotted not being supported yet is CTEs. This means that querying hierarchical data is still easier in FetchXML (for now), but for other ad-hoc queries then SQL seems the obvious way to go for me.

You need aliases!

You can run this query quite happily:

SELECT *
FROM   account

This works perfectly too:

SELECT *
FROM   account AS a
       INNER JOIN
       contact AS c 
       ON a.primarycontactid = c.contactid

However, this perfectly valid SQL doesn’t:

SELECT *
FROM   account
       INNER JOIN
       contact
       ON account.primarycontactid = contact.contactid

In this case you get the error:

Msg 40000, Level 16, State 1, Line 2
The multi-part identifier "account.primarycontactid" could not be bound.
The multi-part identifier "contact.contactid" could not be bound.

It appears that, whenever you use a join, the tables MUST be aliased. Not the end of the world, but in easy gotcha that could confuse new users.

Missing tables & fields

Some data is not accessible through this endpoint. My best guess is that it’s only the data that’s stored in SQL that’s available, as the ones I’ve spotted that are missing are stored elsewhere now, e.g.

  • annotation.documentbody
  • activitymimeattachment.body
  • audit
  • plugintracelog

Even more confusingly, the documentbody and body fields appear in the results if you run SELECT * FROM annotation or SELECT * FROM activitymimeattachment respectively (although they are always null), but they do not appear in the metadata for the tables.

Ideally this data would be available too, but if not I’d like some consistency between the actual results and the metadata.

OAuth timeouts

I’m not sure if this is a general problem with Active Directory authentication for SQL Server or something specific to this implementation, but once your connection has been open for an hour any more queries will receive the error:

Msg 40000, Level 16, State 1, Line 7
The OAuth token has expired

At this point you need to force it to reconnect (in SSMS, clicking “New Query” appears to be sufficient) and carry on.

Given that this is (presumably) not running direct SQL commands, I’ve been pleasantly surprised by the range of SQL that it implements.

CDS T-SQL Endpoint pt 1 – Connecting

The biggest announcement of MBAS for me by a long way was the new T-SQL endpoint. For those who missed it, go and check out the recordings.

In short, Microsoft have added a (preview) way of querying your CDS data using T-SQL, which anyone familiar with SQL Server will already know. It was originally done to support live querying from Power BI, but because they used the same TDS protocol that SQL Server uses, immediately the entire ecosystem of SQL Server tools is opened up. So you can now use SQL Server Management Studio amongst others to query your CDS / Dynamics 365 data. Cool!

I can’t deny I didn’t feel a bit put-out by this announcement coming only a few days after I’d released the latest version of my own SQL 4 CDS tool, but I’m sure I’ll get over it eventually…

Connecting

Obviously the first thing I did was carefully read the documentation 😜 before moving on to try and connect to my own instances.

The key bit of information you need when connecting is the server name (the same as you see in the address bar of your browser when you’re in your app) and the port number 5558. Put these together with a comma, select the Active Directory - Password authentication option and you should be away.

The result was a bit of a let-down:

TDS protocol endpoint is disabled for this organization

I assumed at this point that this new preview feature just wasn’t available for me yet. A few people suggested creating new instances in the Canada (crm3) region as they seem to get new features earlier, but I got the same error again and eventually gave up for a few days.

Update 2020-05-24: This error message has been updated to give some more help on fixing it now:

Enabling the TDS protocol endpoint

Andrew Bibby then helpfully pointed me to some more documentation on how to enable the new endpoint. Unfortunately, even when I got the correct command line command to run it still didn’t work. Instead it gave me the error:

Error occurred in OrgDBOrgSettings and the error details are GDS resource provider is unable to get instances for tenantId: '<guid>', response status:'NotFound' and reason 'Not Found'

Instead I ended up actually using SQL 4 CDS to update the orgdborgsettings attribute of my organization entity:

select orgdborgsettings, organizationid from organization

I then added <EnableTDSEndpoint>true</EnableTDSEndpoint> at the end of the list of settings and updated it using:

update organization set orgdborgsettings = '<OrgSettings><IsCommandingModifiedOnEnabled>true</IsCommandingModifiedOnEnabled><EnableActivitiesTimeLinePerfImprovement>1</EnableActivitiesTimeLinePerfImprovement><EnableActivitiesFeatures>1</EnableActivitiesFeatures><CanCreateApplicationStubUser>false</CanCreateApplicationStubUser><AllowRoleAssignmentOnDisabledUsers>false</AllowRoleAssignmentOnDisabledUsers><EnableTDSEndpoint>true</EnableTDSEndpoint></OrgSettings>' where organizationid = '<guid>'

I could then immediately connect using SQL Server Management Studio, so now I’m off to play with it! More updates to come…

Update 2020-05-24

The official instructions appear to work for other people, but not for me. Hopefully you’ll have more luck than I did, but I’ve also included an update in SQL 4 CDS 2.1.0 to simplify this process too.

D365UG UK: Data Integrity & Quality – Tips & Tools

A bit of a mouthful of a title! I had a great time earlier today presenting at my first ever virtual event, D365UG UK. Many thanks to the organising team for a smoothly run event!

My topic today was using a variety of free tools to unearth some data quality issues in your CRM installation that you may be unaware of, and some suggestions on common causes and possible resolutions.

The tools I used were:

  • Advanced Find
  • FetchXML Builder
  • SQL 4 CDS
  • Data Export Validation Tool

Those last three are all free XrmToolBox tools, so head over to the tool store to install them today.

The slides should be available on the D365UG UK forum shortly, but in the meantime I promised to share the script that you can run in SQL 4 CDS to check many of the common scenarios we looked at. Many of these could also be run in either Advanced Find or FetchXML Builder if you prefer.

-- Accounts assigned to inactive users
select a.name
from
    account a
    inner join systemuser u on a.owninguser = u.systemuserid
where
    a.statecode = 0 and
    u.isdisabled = 1

-- Accounts with a disabled primary contact
select a.name
from
    account a
    inner join contact c on a.primarycontactid = c.contactid
where
    a.statecode = 0 and
    c.statecode = 1

-- Accounts with a primary contact that isn't part of the account
select a.name
from
    account a
    inner join contact c on a.primarycontactid = c.contactid
where
    a.statecode = 0 and
    (c.parentcustomerid is null or c.parentcustomerid <> a.accountid)

-- Contacts with the same firstname and lastname
select contactid, firstname, lastname
from
    contact
where
    statecode = 0 and
    firstname = lastname

-- Contacts with the same lastname and email address
select contactid, firstname, lastname
from
    contact
where
    statecode = 0 and
    lastname = emailaddress1
    
-- Common firstnames
select firstname, count(*)
from
    contact
where
    statecode = 0 and
    firstname is not null
group by
    firstname
order by
    count(*) desc
    
-- Common lastnames
select lastname, count(*)
from
    contact
where
    statecode = 0 and
    lastname is not null
group by
    lastname
order by
    count(*) desc
    
-- Contacts with a parent contact
select c1.contactid, c1.firstname, c1.lastname
from
    contact c1
    inner join contact parent on c1.parentcustomerid = parent.contactid
where
    c1.statecode = 0

-- Orphaned contacts
select contactid, firstname, lastname
from
    contact
where
    statecode = 0 and
    parentcustomerid is null

-- Active contacts in inactive accounts
select contactid, firstname, lastname, account.name
from
    contact
    inner join account on contact.parentcustomerid = account.accountid
where
    contact.statecode = 0 and
    account.statecode = 1

SQL 4 CDS 2.0.0 released!

I’m very pleased to release version 2.0.0 of my SQL 4 CDS tool today!

If you haven’t come across it before, SQL 4 CDS is an XrmToolBox tool to query and manipulate your CDS data using regular SQL queries, making it accessible to a much wider range of users. You can install it today from the XrmToolBox Tool Library.

It takes quite a lot to jump from a version 1.0.x to a version 2, so what’s new?

Beyond FetchXML

Version 1 translated your SQL query into FetchXML and executed it, which made it easier to run the queries but didn’t give you the full power of SQL. There were still plenty of queries you couldn’t run because there was no FetchXML equivalent.

While still not supporting every bit of T-SQL (that would be near-enough impossible!), this release brings it a lot closer. While it still uses FetchXML as much as possible, it can now also transform those results to implement more SQL queries than before.

For example, you can now run queries that compare two field values, execute common functions such as DATEDIFF and DATEADD, use a HAVING clause, sort data with linked tables without having to worry about the order of your joins and much more.

Some example queries that you can run now that you can’t do with FetchXML alone:

Find accounts with more than 10 contacts

This query would not normally be possible as FetchXML does not have an equivalent for the HAVING clause, but SQL 4 CDS 2 will handle it:

SELECT   a.name
FROM     account AS a
         INNER JOIN
         contact AS c
         ON a.accountid = c.parentcustomerid
GROUP BY a.name
HAVING   count(*) > 10

Find accounts that haven’t been modified since being created

FetchXML doesn’t have an option for comparing the values of two fields. SQL 4 CDS 2 handles this again, but watch out when running this on large data sets as it’s got to retrieve all the accounts in order to filter them itself.

SELECT name
FROM   account
WHERE  createdon = modifiedon

Find contacts created over a year after their account

A variation on the one above, you can now use more complex expressions such as selected functions, mathematical expressions etc. as part of your query too.

SELECT a.name,
       c.fullname
FROM   account AS a
       INNER JOIN
       contact AS c
       ON a.accountid = c.parentcustomerid
WHERE  c.createdon > DATEADD(year, 1, a.createdon)

Increment the score for all contacts called Mark

These same improvements can also be used as part of assignments in UPDATE statements, e.g.:

UPDATE contact
SET    cdi_score = cdi_score + 1
WHERE  firstname = 'Mark'

Sort by related tables

FetchXML imposes a restriction on sort orders. They have to be applied at the top-level entity first and then work down through joined tables. You can now apply sorts in whatever order you like:

SELECT a.name,
       c.fullname
FROM   account AS a
       INNER JOIN
       contact AS c
       ON a.accountid = c.parentcustomerid
ORDER BY
       c.fullname,
       a.name

Notifications

If you run a query that can’t be executed fully as FetchXML, you’ll see a warning like this:

“Extra processing required” notification

Be sure to not simply copy & paste this FetchXML into your own code. Running this FetchXML directly will give you all the information you need to calculate the results, but you will need to do some post-processing on the results.

Intellisense

This release also includes Intellisense-style suggestions as you type. There are a few caveats to be aware of – as you type a SELECT statement, the tool won’t offer suggestions until you start entering the FROM clause, so if you want suggestions on the SELECT clause itself then just enter SELECT * FROM, enter the FROM clause and then go back.

The metadata used to drive the suggestions is loaded in the background, so you may not get suggestions immediately. Give it a little time after entering the FROM clause and you should start seeing suggestions appear.

I’ve tried to get the suggestions to appear when they feel natural to me. I’m interested in any feedback on this feature and how you’d like to see it work.

Managing many-to-many relationships

You can already use INSERT, UPDATE and DELETE to edit your data, but this release improves on that to cover the intersect entities used in many-to-many relationships. For example, you can use this query to add all your contacts called Mark to a specific marketing list:

INSERT INTO listmember (listid, entityid)
SELECT '5610d244-aac2-4fa8-9b2f-56537c80bad0',
       contactid
FROM   contact
WHERE  firstname = 'Mark'

Managing polymorphic lookups

If you INSERT or UPDATE a lookup field you can use the guid of the related record, but this only works for lookups that can only reference one entity type. For example, this would work:

UPDATE account
SET    primarycontactid = 'e3dd3640-9890-4371-b1cb-fb1b26f20ff3'
WHERE  name = 'Data8 Ltd'

However, this would not work as the parentcustomerid field can reference both account and contact records, and there’s nothing to indicate which type is being referenced here:

UPDATE contact
SET    parentcustomerid = '65a7e7f7-0f09-4649-9182-56abc489a1e2'
WHERE  fullname = 'Mark Carrington'

This release introduces the CREATELOOKUP function which can be used to solve this:

UPDATE contact
SET    parentcustomerid = CREATELOOKUP('account', '65a7e7f7-0f09-4649-9182-56abc489a1e2')
WHERE  fullname = 'Mark Carrington'

Inside FetchXML pt 10 – link-entity (again)

A follow-up on my previous post on link-entity, prompted by this tweet from Daryl LaBar:

I’d previously stated (incorrectly):

FetchXML supports two link types – inner and outer.

Me, in more ignorant times

While these are the most common link types – and the only two that FetchXML Builder exposes – the platform has supported a third for some time, and now appears to have several more too!

TL;DR

There are some tantalising hints at new join types, but you probably shouldn’t be jumping into using them unless Microsoft support advises you to just yet!

Natural Join

The documentation for the natural join type is not exactly clear to me:

Only one value of the two joined attributes is returned if an equal-join operation is performed and the two values are identical

docs.microsoft.com

I really have no idea what this is trying to say – if you can enlighten me, please do! Given that all the joins it does are based on the attributes being equal, and therefore the two values must be identical, I can’t see what this is trying to describe.

As I can’t tell from the docs what it is trying to do, I’ve tried to work it out by running some queries and having a look at the generated SQL (thank you on-premise!) and the results.

My conclusion: natural is a synonym for inner.

If you substitute natural for inner, you will get exactly the same results, as it generates exactly the same SQL.

Many to many relationships

One thing I did wonder based on the docs for the natural join type was whether it would try to imply the join attributes by name, i.e. you could miss out the from and to attributes on the link-entity if they were both the same and it would imply them from the metadata. This failed, but it did produce this intriguing error message:

No system many-to-many relationship exists between account and contact.  If attempting to link through a custom many-to-many relationship ensure that you provide the from and to attributes.

This seemed interesting, so I gave it a go with this simple query:

<fetch>
  <entity name="account">
    <attribute name="name" />
    <link-entity name="list" link-type="inner">
      <attribute name="listname" />
    </link-entity>
  </entity>
</fetch>

And it worked! This is a simple shortcut for navigating built-in many-to-many relationships without having to use the more explicit version of joining via the intersect entity:

<fetch>
  <entity name="account">
    <attribute name="name" />
    <link-entity name="listmember" to="accountid" from="entityid" alias="listmember" link-type="inner">
      <link-entity name="list" to="listid" from="listid" alias="list" link-type="inner">
        <attribute name="listname" />
      </link-entity>
    </link-entity>
  </entity>
</fetch>

New Join Types

Anyway, onto the original challenge. The docs now lists a total of 10 different join types:

  • Inner
  • LeftOuter
  • Natural
  • MatchFirstRowUsingCrossApply
  • In
  • Exists
  • Any
  • NotAny
  • All
  • NotAll

As there’s no documentation on any of these I headed over to the simple but trusty FetchXml Tester XrmToolBox tool and tried one at random:

<fetch>
  <entity name="account">
    <attribute name="name" />
    <link-entity name="contact" to="accountid" from="parentcustomerid" link-type="notall">
    </link-entity>
  </entity>
</fetch>

This gave the remarkably helpful error message:

An error occured: Invalid link-type specified, valid values are: 'natural', 'inner', 'in', 'matchfirstrowusingcrossapply','exists' and 'outer'. link-type = notall

So maybe CDS doesn’t support all 10 join types yet, but it does appear to support some we didn’t have before:

  • In
  • MatchFirstRowUsingCrossApply
  • Exists

I also tried the same thing on an on-premise installation, and that does only support the original 3: inner, outer and natural, so this is online-only for now at least.

So, in the absence of any documentation on these, or the ability to see what SQL gets generated for them, what are we to do? Well, these all seem to have directly equivalent SQL keywords, so starting from a simple query:

<fetch>
  <entity name="account">
    <attribute name="name" />
    <link-entity name="contact" from="parentcustomerid" to="accountid" link-type="inner">
      <attribute name="lastname" />
      <filter>
        <condition attribute="firstname" operator="eq" value="Mark" />
      </filter>
    </link-entity>
  </entity>
</fetch>

My guess for the generated SQL when changing the link-type to different options would be:

Inner

SELECT name,
       lastname
FROM   account
       INNER JOIN
       contact
       ON
       account.accountid = contact.parentcustomerid
WHERE  contact.lastname = 'Mark'

In

SELECT name
FROM   account
WHERE  accountid IN (
  SELECT parentcustomerid
  FROM   contact
  WHERE  firstname = 'Mark'
)

MatchFirstRowUsingCrossApply

SELECT name,
       contact.lastname
FROM   account
       CROSS APPLY (
         SELECT TOP 1 lastname
         FROM         contact
         WHERE firstname = 'Mark' AND
               parentcustomerid = account.accountid
       ) AS contact

Exists

SELECT name
FROM   account
WHERE  EXISTS (
  SELECT *
  FROM   contact
  WHERE  firstname = 'Mark' AND
         parentcustomerid = account.accountid
)

This is just conjecture at this point, but if correct it would mean:

  1. Each of the new join types would only produce one row per account that has a contact called Mark, regardless of whether there was one such contact or 100. The standard inner join type however would produce one row for each contact
  2. The in and exists versions would not be able to include any attributes from the linked entity (lastname in this example), as it is only referenced from the WHERE clause, but the inner and matchfirstrowusingcrossapply would be able to

So, let’s give it a go!

I’ve created two accounts, one with no contacts called Mark and one with two:

And the results?

Inner Join

No surprise here, but good for a simple baseline:

<fetch>
  <entity name="account">
    <attribute name="name" />
    <link-entity name="contact" to="accountid" from="parentcustomerid"  link-type="inner">
      <attribute name="lastname" />
      <filter>
        <condition attribute="firstname" operator="eq" value="Mark" />
      </filter>
    </link-entity>
    <filter>
      <condition attribute="name" operator="like" value="Join Operator %" />
    </filter>
  </entity>
</fetch>
namelastname
Join Operator 2Carrington
Join Operator 2Duplicate

In

Executing the exact same query but replacing inner with in gave the error:

An error occured: In doesn't support attribute inside linkentity expression.

Removing the <attribute> from inside the <link-entity> gave the expected result:

name
Join Operator 2

So far, so good!

Exists

Going through the same process with exists, again we get the same error about not being able to use an <attribute> within the <link-entity>. Removing that gives us the same results again!

MatchFirstRowUsingCrossApply

Trying again with matchfirstrowusingcrossapply, and it almost works as I expected. I can run the query with the <attribute name="lastname" /> element inside the <link-entity> without an error, but I don’t actually get that column back in the result set – I get the same results again as I did with in and exists!

Given how close the other predictions were, I’m going to chalk this down overall as a success, with the missing column in the results possibly being due to a CDS bug that will hopefully be fixed at some point. As always though, I’d be very happy to get any firm information on it one way or another!

Why?

So now we have a reasonable idea of what these new join operators are doing, why might we want to use them? We’ve managed quite happily with inner and outer until now, thank you very much.

Uniqueness

The main thing that these operators give us that we couldn’t do before is removing the duplication problem. We saw earlier that we can use distinct to remove duplicate rows in query results that were introduced by joining to multiple child records, but this also removed the automatic primary key from the results. Because these operators do not produce duplicate rows regardless of the number of child records, there is no need to use distinct. This was never that much of a problem though, as we could always include the primary key column explicitly anyway. So what else is there?

Single Child Entity

The matchfirstrowusingcrossapply option looks the most powerful, although as I haven’t managed to actually get any attributes out of the linked entity I can’t realise that yet. It would be most powerful if we could combine that with a sort order, so you could get a list of accounts along with the “best” contact for each account:

<fetch>
  <entity name="account">
    <attribute name="name" />
    <link-entity name="contact" to="accountid" from="parentcustomerid"  link-type="matchfirstrowusingcrossapply">
     <attribute name="contactid" />
     <attribute name="fullname" />
     <order attribute="cdi_score" descending="true" />
    </link-entity>
  </entity>
</fetch>

That looks like it could give us a unique list of accounts, along with the details of the top contact in the account according to the ClickDimensions score. Unfortunately, running that gives us an error:

MatchFirstRowUsingCrossApply doesn't support order clause inside linkentity expression.

So far, that looks like a missed opportunity, as the corresponding SQL would work perfectly fine:

SELECT name,
       contact.contactid,
       contact.fullname
FROM   account
       CROSS APPLY (
         SELECT TOP 1 contactid,
                      fullname
         FROM         contact
         WHERE        parentcustomerid = account.accountid
         ORDER BY     cdi_score DESC
       ) AS contact

Performance

So what else? How about performance? If we compare the four simple examples we looked at before, we get three quite different query plans:

Comparative performance of different join operators

Even when adding in a DISTINCT option to the simple inner join type, this takes effectively no time at all to execute. All three of the other new options take significantly longer, with the matchfirstrowusingcrossapply option taking the vast majority of the time. So there’s no quick performance win here by just turning one of these options on blindly.

My best guess then is that these options have been added following some performance analysis of other, more complex queries that Microsoft have seen from their telemetry. This could possibly be for their 1st party Dynamics 365 apps, AI or other applications that have a particular need for more generating specific queries. This would probably explain the lack of documentation as much as anything else.

As there are still some other join types listed in the documentation which the server doesn’t appear to support yet, it could also be the case of the documentation getting ahead of the code and we’ve got some more exciting platform extensions coming in a future release!

Inside FetchXML pt 9 – distinct

You can ensure your query only produces each row once using the distinct attribute. For example, if you have multiple entries for the same company name you can get the list of unique names using:

<fetch distinct="true">
  <entity name="account">
    <attribute name="name" />
  </entity>
</fetch>

This option gets applied automatically whenever you create a query using Advanced Find. This is particularly useful when you add a join onto a related child entity. For example, this query:

<fetch>
  <entity name="account">
    <attribute name="name" />
    <link-entity name="contact" from="parentcustomerid" to="accountid" link-type="inner">
      <filter>
        <condition attribute="firstname" operator="eq" value="Mark" />
      </filter>
    </link-entity>
  </entity>
</fetch>

The intent here is to get a list of accounts that have a contact called Mark. However, if an account has two Marks, that account will be in the resulting list twice. Adding the distinct attribute makes sure the user gets what they were probably expecting, which is a list of unique accounts.

Unique Identifiers

One other interesting side effect of using distinct is that you no longer get the primary key of the entity back automatically. If you run the query:

<fetch>
  <entity name="account">
    <attribute name="name" />
  </entity>
</fetch>

As well as getting the account name back, you’ll also automatically get the accountid as well.

When you apply distinct, this goes away. This makes sense if you think about it. Imagine you have two accounts with the same name. If the accountid was automatically included, that would make the records unique. You would still get two rows back from your query, regardless of whether or not you used distinct.

If the purpose of using distinct is to eliminate duplicate rows introduced by a join onto a child entity as shown above though, you probably do want to get a row back for each of your top-level entity type. In that case you will need to explicitly include the primary key attribute in your query, e.g.

<fetch distinct="true">
  <entity name="account">
    <attribute name="accountid" />
    <attribute name="name" />
    <link-entity name="contact" from="parentcustomerid" to="accountid" link-type="inner">
      <filter>
        <condition attribute="firstname" operator="eq" value="Mark" />
      </filter>
    </link-entity>
  </entity>
</fetch>

Truncation

One “gotcha” to be aware of. Any text values will be truncated to 2,000 characters before the distinct operation is applied. If you have longer values, e.g. descriptions or the contents of attachments in the annotation entity, any extra data after the first 2,000 characters will be lost.

Quite why this would be is a bit of a mystery. My best guess is that these fields used to be stored in a SQL Server ntext field which couldn’t be used in a SELECT DISTINCT query, so to work around this problem Dynamics CRM would cast these values to an nvarchar(2000) value which could be used instead. You can see this in the generated SQL:

<fetch distinct="true" top="10">
  <entity name="annotation">
    <attribute name="documentbody" />
  </entity>
</fetch>
select 
 DISTINCT  top 10  cast( "annotation0".DocumentBody as varchar(2000) ) as "documentbody" 
from
 AnnotationBase as "annotation0"

As these columns are now stored using an nvarchar(max) field which can be used directly in a SELECT DISTINCT query I guess this behaviour is still in place for backwards compatibility reasons, although it does give about a 35% performance improvement:

SQL Equivalent

Apart from this once caveat of long text fields, this is directly equivalent to the SQL DISTINCT keyword.

Inside FetchXML pt 8 – page, count & top

By default you’ll get up to 5,000 records that match your query. The top-level <fetch> element supports a few options to help you control this:

top will limit the number of records further, e.g.

<fetch top="10">
  <entity name="account">
    <all-attributes />
  </entity>
</fetch>

When you use top to limit the number of records, you don’t have the option to move on to subsequent pages. You can’t use this to increase the number of records beyond 5,000 – you can only use a number between 1 and 5,000.

In contrast, count and page allow you to move through multiple pages of results and control the size of each page:

<fetch count="100" page="3">
  <entity name="account">
    <all-attributes />
    <order attribute="createdon" />
  </entity>
</fetch>

This example will get records 201 – 300. If you are working you way through sequential pages you should also set the paging-cookie attribute with the value of the paging cookie from the previous page.

So when should you use top and when should you use count and page?

If you are only ever interested in the first n records (e.g. you just want the first record that matches your query), use top as you have no need to retrieve multiple result pages.

On the other hand, if you do need to retrieve all the results (or at least, more than “just” the first 5,000), use page to indicate which number page you want. In addition you can optionally use count to control the size of each page – you might set this after some trial-and-error testing of the performance of your query.

SQL Equivalents

In SQL terms, top is equivalent to the SELECT TOP (n) clause. count and page are similar to the OFFSET ... FETCH ... clause.

Inside FetchXML pt 7 – rowaggregate

If you’re using any hierarchies in your data, you’ve probably noticed the icon in your main grid views that you can click on to show the hierarchy view when records have a parent or children. It’s simple to work out if a record has a parent – just check if the parent lookup attribute contains data. But have you ever stopped to think about how it works out if a record has got any children?

Hierarchical records highlighted in grid view

It would be horribly inefficient to execute a separate query for each row to get the number of children. Instead, it will use a FetchXML feature called rowaggregate.

Similar to applying standard aggregations, this is applied to an <attribute> in your query, e.g.

<fetch>
  <entity name="account">
    <attribute name="name" />
    <attribute name="accountid" rowaggregate="countchildren" alias="NumberOfChildren" />
  </entity>
</fetch>

You don’t have any control within your query of what is meant by “children” in this context. This is defined by the hierarchy configuration of the entity being queried.

Hierarchical relationship setting

You can only apply the rowaggregate option to an attribute that is the primary key in a hierarchical relationship (the “Related (One)” side in the screenshot above). You’ll receive an error if you try to apply it to any other attribute, e.g. if you try to apply it on the telephone1 attribute:

Entity: Account 'rowaggregate' is not supported on attribute, telephone1. Hierarchy relationship accountid <- parentaccountid.

You can only use rowaggregate on the top-level entity in your query. You’ll receive an unhelpful error An unexpected error occurred if you use it within a link-entity.

SQL 4 CDS 1.0.9 Released!

I’ve just released SQL 4 CDS 1.0.9. Please update it when XrmToolBox prompts you to take advantage of these latest features:

Open Source!

I’m very pleased to be able to release SQL 4 CDS as open source, thanks to the support of Data8. You can now grab all the code on GitHub, submit your bug reports or suggestions as issues or even open a pull request with your own changes!

As well as the XrmToolBox tool, I’ve also separated out the core SQL ⇔ FetchXML conversions into the MarkMpn.Sql4Cds.Engine NuGet package you can include in your own projects. A simple code example to convert from SQL to FetchXML would look like:

public string ConvertSqlToFetchXml(IOrganizationService org, string selectStatement)
{
  var metadataCache = new AttributeMetadataCache(org);
  var converter = new Sql2FetchXml(metadataCache, true);
  var queries = converter.Convert(selectStatement);

  // queries could contain many queries, including SELECT, INSERT, UPDATE or DELETE
  // We'll assume it's just a single SELECT statement for now
  var select = (SelectQuery) queries[0];

  // The FetchXML is returned as an object model, convert it to a string using XmlSerializer
  using (var writer = new StringWriter())
  {
    var serializer = new XmlSerializer(typeof(FetchType));
    serializer.Serialize(select.FetchXml);

    return writer.ToString();
  }
}

XrmToolBox 1.2020.2.36

If you get an error when opening SQL 4 CDS:

An error occured when trying to display this tool: Method not found: 'McTools.Xrm.Connection.ConnectionDetail ConnectionUpdatedEventArgs.get_ConnectionDetail()'.

after installing the latest version of XrmToolBox, please update SQL 4 CDS to 1.0.9 as well to fix it.

Cancel running queries

If you need to stop a query while it’s running, just click the 🟥 button in the toolbar to stop it.

Improved results display

The grid view showing the results of a query is improved with:

  • total record count at the bottom
  • record numbers on the left
  • right-click options to copy data with or without headers
  • click on lookup guid values to open the related record or quickly create a SELECT query to get the details of the record
  • columns in a SELECT * query are sorted by name to make it easier to find what you’re looking for
  • option to show lookup values as the name of the related record instead of the GUID
  • option to show times as your local time zone instead of UTC

Improved query support

You will now get the expected results when running queries that combine * and a field name in the SELECT clause, e.g. SELECT name, * FROM account, and when ordering a SELECT * query, e.g. SELECT * FROM account ORDER BY name

Double quotes

By default, SQL treats double quotes as enclosing the name of a table or column, and single quotes for strings, so you can’t write:

SELECT name
FROM   account
WHERE  telephone1 = "01513554555"

as the account entity doesn’t contain a field called 01513554555

You can now switch this behaviour so double quotes and single quotes are both used to enclose strings by unticking the new “Quoted Identifiers” option in the settings window.

Feedback

Please open an issue in the GitHub repository to let me know of any problems you encounter or with any suggestions you might have for improvement.

Please also use the rating feature in XrmToolBox to let me know what you think!

FetchXML Multiple Links

One common pattern of queries I see about FetchXML is how to write queries to ask two different questions about the same related entity. For example:

  • Invoices that include product A AND product B?
  • Contacts that have pending emails AND no sent emails?
  • Visits that have a page view of the checkout page BUT NOT the “thank you” page?

Multiple Links

The key with these sorts of queries is that each question you want to ask of the related entity requires a separate link. Taking the first example:

Invoices that include product A AND product B?

If we write this as:

Advanced Find

Finding invoices with two products – attempt 1!

FetchXML

<fetch xmlns:generator="MarkMpn.SQL4CDS">
  <entity name="invoice">
    <link-entity name="invoicedetail" to="invoiceid" from="invoiceid" alias="id" link-type="inner">
      <link-entity name="product" to="productid" from="productid" alias="p" link-type="inner">
        <all-attributes />
      </link-entity>
      <all-attributes />
    </link-entity>
    <all-attributes />
    <filter>
      <condition attribute="productnumber" entityname="p" operator="eq" value="A" />
      <condition attribute="productnumber" entityname="p" operator="eq" value="B" />
    </filter>
  </entity>
</fetch>

SQL

SELECT *
FROM   invoice AS i
       INNER JOIN
       invoicedetail AS id
       ON i.invoiceid = id.invoiceid
       INNER JOIN
       product AS p
       ON id.productid = p.productid
WHERE  p.productnumber = 'A'
       AND p.productnumber = 'B';

This is a common mistake I see, and one you can make easily if you try to build this query in Advanced Find. When the query is run, it tries to find invoices that have an invoice item that has a product that has the product number “A” and “B”. Since it’s impossible for a product to have two different product numbers in the same field, you’ll never get any results for this query.

The key change we need to make is to have two separate links to the invoicedetail entity, one filtered for product A and the other for product B:

FetchXML

<fetch xmlns:generator="MarkMpn.SQL4CDS">
  <entity name="invoice">
    <link-entity name="invoicedetail" to="invoiceid" from="invoiceid" alias="id_a" link-type="inner">
      <link-entity name="product" to="productid" from="productid" alias="p_a" link-type="inner">
        <filter>
          <condition attribute="productnumber" operator="eq" value="A" />
        </filter>
        <all-attributes />
      </link-entity>
      <all-attributes />
    </link-entity>
    <link-entity name="invoicedetail" to="invoiceid" from="invoiceid" alias="id_b" link-type="inner">
      <link-entity name="product" to="productid" from="productid" alias="p_b" link-type="inner">
        <filter>
          <condition attribute="productnumber" operator="eq" value="B" />
        </filter>
        <all-attributes />
      </link-entity>
      <all-attributes />
    </link-entity>
    <all-attributes />
  </entity>
</fetch>

SQL

SELECT *
FROM   invoice AS i
       INNER JOIN
       invoicedetail AS id_a
       ON i.invoiceid = id_a.invoiceid
       INNER JOIN
       product AS p_a
       ON id_a.productid = p_a.productid and p_a.productnumber = 'A'
       INNER JOIN
       invoicedetail AS id_b
       ON i.invoiceid = id_b.invoiceid
       INNER JOIN
       product AS p_b
       ON id_b.productid = p_b.productid and p_b.productnumber = 'B';

With this change the query will now find invoices that have an invoice item that has a product that has the product number “A”, and also an invoice item that has a product that has the product number “B”. It’s more long-winded to write but gets us to the right answer.

Unfortunately it’s not possible to build this query in Advanced Find, so if you need to use a query like this in your app you’ll need to get the query ready in FetchXML Builder and save it from there as a view or marketing list (but see the workaround at the end).

Combining with “Not In” Queries

Taking another example:

Visits that have a page view of the checkout page BUT NOT the “thank you” page?

If you’re using ClickDimensions you get a lot of powerful web tracking information recorded in custom entities. Two of these are Page View, which records an individual web page that someone has viewed, and Visit, which represents a session that can cover multiple pages. Each Page View is linked to a Visit. We can find abandoned shopping carts by querying these to find sessions which include a visit to the shopping cart page but not to the page that it shown after a purchase is complete. This makes use of the “not in” pattern of using a left outer join and null query.

FetchXML

<fetch xmlns:generator="MarkMpn.SQL4CDS">
  <entity name="cdi_visit">
    <link-entity name="cdi_pageview" to="cdi_visitid" from="cdi_visitid" alias="checkout" link-type="inner">
      <filter>
        <condition attribute="cdi_uri" operator="eq" value="https://example.com/checkout" />
      </filter>
      <all-attributes />
    </link-entity>
    <link-entity name="cdi_pageview" to="cdi_visitid" from="cdi_visitid" alias="finished" link-type="outer">
      <filter>
        <condition attribute="cdi_uri" operator="eq" value="https://example.com/checkout/thankyou" />
      </filter>
      <all-attributes />
    </link-entity>
    <all-attributes />
    <filter>
      <condition attribute="cdi_pageviewid" entityname="finished" operator="null" />
    </filter>
  </entity>
</fetch>

SQL

SELECT *
FROM   cdi_visit AS v
       INNER JOIN
       cdi_pageview AS checkout
       ON v.cdi_visitid = checkout.cdi_visitid
          AND checkout.cdi_uri = 'https://example.com/checkout'
       LEFT OUTER JOIN
       cdi_pageview AS finished
       ON v.cdi_visitid = finished.cdi_visitid
          AND finished.cdi_uri = 'https://example.com/checkout/thankyou'
WHERE  finished.cdi_pageviewid IS NULL;

“All” / “Only” Queries

There’s also a related type of query that checks if something is true for all related entities, e.g.:

  • Accounts that only have contacts with an info@ email address?
  • Users that have sales on all their accounts this month?

It’s actually possible to re-phrase these queries into the same pattern as above of asking two separate questions of the link entity. For example:

Accounts that only have contacts with an info@ email address?

You could re-write this as:

Accounts that have a contact with an info@ email address and don’t have any contacts that don’t have an info@ email address?

Now we can apply the same pattern again:

FetchXML

<fetch xmlns:generator="MarkMpn.SQL4CDS">
  <entity name="account">
    <link-entity name="contact" to="accountid" from="parentcustomerid" alias="info" link-type="inner">
      <filter>
        <condition attribute="emailaddress1" operator="like" value="info@%" />
      </filter>
      <all-attributes />
    </link-entity>
    <link-entity name="contact" to="accountid" from="parentcustomerid" alias="noninfo" link-type="outer">
      <filter>
        <condition attribute="emailaddress1" operator="not-like" value="info@%" />
      </filter>
      <all-attributes />
    </link-entity>
    <all-attributes />
    <filter>
      <condition attribute="contactid" entityname="noninfo" operator="null" />
    </filter>
  </entity>
</fetch>

SQL

SELECT *
FROM   account AS a
       INNER JOIN
       contact AS info
       ON a.accountid = info.parentcustomerid
          AND info.emailaddress1 LIKE 'info@%'
       LEFT OUTER JOIN
       contact AS noninfo
       ON a.accountid = noninfo.parentcustomerid
          AND noninfo.emailaddress1 NOT LIKE 'info@%'
WHERE  noninfo.contactid IS NULL;

Advanced Find

I mentioned earlier that you couldn’t build these queries in Advanced Find, and that’s mostly true. If you try to create one of these queries exactly as I’ve shown you’ll get an error when you try to add the second instance of the related entity:

The relationship you are adding already exists in the query.

But with a bit of creativity we can switch the query around to something that Advanced Find can handle, although as a database person it’s not as “nice”.

Instead of having our main entity listed only once, we can join back to it from the related entity. Instead of joining from invoice to invoicedetail twice, we join from invoice to invoicedetail, back to invoice and on to invoicedetail again:


Finding invoices with two products – attempt 2. Longer but gets the right answer!

This works fine for this case, but you may not get the results you want if you try this in combination with a “not in” type query. That would also apply to all the other link entities you chain off it, and could have the effect of changing an “in” query to a “not in”.