A follow-up on my previous post on link-entity, prompted by this tweet from Daryl LaBar:
I’d previously stated (incorrectly):
FetchXML supports two link types –Me, in more ignorant times
While these are the most common link types – and the only two that FetchXML Builder exposes – the platform has supported a third for some time, and now appears to have several more too!
There are some tantalising hints at new join types, but you probably shouldn’t be jumping into using them unless Microsoft support advises you to just yet!
The documentation for the natural join type is not exactly clear to me:
Only one value of the two joined attributes is returned if an equal-join operation is performed and the two values are identicaldocs.microsoft.com
I really have no idea what this is trying to say – if you can enlighten me, please do! Given that all the joins it does are based on the attributes being equal, and therefore the two values must be identical, I can’t see what this is trying to describe.
As I can’t tell from the docs what it is trying to do, I’ve tried to work it out by running some queries and having a look at the generated SQL (thank you on-premise!) and the results.
natural is a synonym for
If you substitute
inner, you will get exactly the same results, as it generates exactly the same SQL.
One thing I did wonder based on the docs for the
natural join type was whether it would try to imply the join attributes by name, i.e. you could miss out the
to attributes on the
link-entity if they were both the same and it would imply them from the metadata. This failed, but it did produce this intriguing error message:
No system many-to-many relationship exists between account and contact. If attempting to link through a custom many-to-many relationship ensure that you provide the from and to attributes.
This seemed interesting, so I gave it a go with this simple query:
<fetch> <entity name="account"> <attribute name="name" /> <link-entity name="list" link-type="inner"> <attribute name="listname" /> </link-entity> </entity> </fetch>
And it worked! This is a simple shortcut for navigating built-in many-to-many relationships without having to use the more explicit version of joining via the intersect entity:
<fetch> <entity name="account"> <attribute name="name" /> <link-entity name="listmember" to="accountid" from="entityid" alias="listmember" link-type="inner"> <link-entity name="list" to="listid" from="listid" alias="list" link-type="inner"> <attribute name="listname" /> </link-entity> </link-entity> </entity> </fetch>
Anyway, onto the original challenge. The docs now lists a total of 10 different join types:
As there’s no documentation on any of these I headed over to the simple but trusty FetchXml Tester XrmToolBox tool and tried one at random:
<fetch> <entity name="account"> <attribute name="name" /> <link-entity name="contact" to="accountid" from="parentcustomerid" link-type="notall"> </link-entity> </entity> </fetch>
This gave the remarkably helpful error message:
An error occured: Invalid link-type specified, valid values are: 'natural', 'inner', 'in', 'matchfirstrowusingcrossapply','exists' and 'outer'. link-type = notall
So maybe CDS doesn’t support all 10 join types yet, but it does appear to support some we didn’t have before:
I also tried the same thing on an on-premise installation, and that does only support the original 3:
natural, so this is online-only for now at least.
So, in the absence of any documentation on these, or the ability to see what SQL gets generated for them, what are we to do? Well, these all seem to have directly equivalent SQL keywords, so starting from a simple query:
<fetch> <entity name="account"> <attribute name="name" /> <link-entity name="contact" from="parentcustomerid" to="accountid" link-type="inner"> <attribute name="lastname" /> <filter> <condition attribute="firstname" operator="eq" value="Mark" /> </filter> </link-entity> </entity> </fetch>
My guess for the generated SQL when changing the
link-type to different options would be:
SELECT name, lastname FROM account INNER JOIN contact ON account.accountid = contact.parentcustomerid WHERE contact.lastname = 'Mark'
SELECT name, contact.lastname FROM account CROSS APPLY ( SELECT TOP 1 lastname FROM contact WHERE firstname = 'Mark' AND parentcustomerid = account.accountid ) AS contact
This is just conjecture at this point, but if correct it would mean:
- Each of the new join types would only produce one row per account that has a contact called Mark, regardless of whether there was one such contact or 100. The standard
innerjoin type however would produce one row for each contact
existsversions would not be able to include any attributes from the linked entity (
lastnamein this example), as it is only referenced from the
WHEREclause, but the
matchfirstrowusingcrossapplywould be able to
So, let’s give it a go!
I’ve created two accounts, one with no contacts called Mark and one with two:
And the results?
No surprise here, but good for a simple baseline:
<fetch> <entity name="account"> <attribute name="name" /> <link-entity name="contact" to="accountid" from="parentcustomerid" link-type="inner"> <attribute name="lastname" /> <filter> <condition attribute="firstname" operator="eq" value="Mark" /> </filter> </link-entity> <filter> <condition attribute="name" operator="like" value="Join Operator %" /> </filter> </entity> </fetch>
|Join Operator 2||Carrington|
|Join Operator 2||Duplicate|
Executing the exact same query but replacing
in gave the error:
An error occured: In doesn't support attribute inside linkentity expression.
<attribute> from inside the
<link-entity> gave the expected result:
|Join Operator 2|
So far, so good!
Going through the same process with
exists, again we get the same error about not being able to use an
<attribute> within the
<link-entity>. Removing that gives us the same results again!
Trying again with
matchfirstrowusingcrossapply, and it almost works as I expected. I can run the query with the
<attribute name="lastname" /> element inside the
<link-entity> without an error, but I don’t actually get that column back in the result set – I get the same results again as I did with
Given how close the other predictions were, I’m going to chalk this down overall as a success, with the missing column in the results possibly being due to a CDS bug that will hopefully be fixed at some point. As always though, I’d be very happy to get any firm information on it one way or another!
So now we have a reasonable idea of what these new join operators are doing, why might we want to use them? We’ve managed quite happily with
outer until now, thank you very much.
The main thing that these operators give us that we couldn’t do before is removing the duplication problem. We saw earlier that we can use
distinct to remove duplicate rows in query results that were introduced by joining to multiple child records, but this also removed the automatic primary key from the results. Because these operators do not produce duplicate rows regardless of the number of child records, there is no need to use
distinct. This was never that much of a problem though, as we could always include the primary key column explicitly anyway. So what else is there?
matchfirstrowusingcrossapply option looks the most powerful, although as I haven’t managed to actually get any attributes out of the linked entity I can’t realise that yet. It would be most powerful if we could combine that with a sort order, so you could get a list of accounts along with the “best” contact for each account:
<fetch> <entity name="account"> <attribute name="name" /> <link-entity name="contact" to="accountid" from="parentcustomerid" link-type="matchfirstrowusingcrossapply"> <attribute name="contactid" /> <attribute name="fullname" /> <order attribute="cdi_score" descending="true" /> </link-entity> </entity> </fetch>
That looks like it could give us a unique list of accounts, along with the details of the top contact in the account according to the ClickDimensions score. Unfortunately, running that gives us an error:
MatchFirstRowUsingCrossApply doesn't support order clause inside linkentity expression.
So far, that looks like a missed opportunity, as the corresponding SQL would work perfectly fine:
SELECT name, contact.contactid, contact.fullname FROM account CROSS APPLY ( SELECT TOP 1 contactid, fullname FROM contact WHERE parentcustomerid = account.accountid ORDER BY cdi_score DESC ) AS contact
So what else? How about performance? If we compare the four simple examples we looked at before, we get three quite different query plans:
Even when adding in a
DISTINCT option to the simple
inner join type, this takes effectively no time at all to execute. All three of the other new options take significantly longer, with the
matchfirstrowusingcrossapply option taking the vast majority of the time. So there’s no quick performance win here by just turning one of these options on blindly.
My best guess then is that these options have been added following some performance analysis of other, more complex queries that Microsoft have seen from their telemetry. This could possibly be for their 1st party Dynamics 365 apps, AI or other applications that have a particular need for more generating specific queries. This would probably explain the lack of documentation as much as anything else.
As there are still some other join types listed in the documentation which the server doesn’t appear to support yet, it could also be the case of the documentation getting ahead of the code and we’ve got some more exciting platform extensions coming in a future release!