You can ensure your query only produces each row once using the distinct
attribute. For example, if you have multiple entries for the same company name you can get the list of unique names using:
<fetch distinct="true"> <entity name="account"> <attribute name="name" /> </entity> </fetch>
This option gets applied automatically whenever you create a query using Advanced Find. This is particularly useful when you add a join onto a related child entity. For example, this query:
<fetch> <entity name="account"> <attribute name="name" /> <link-entity name="contact" from="parentcustomerid" to="accountid" link-type="inner"> <filter> <condition attribute="firstname" operator="eq" value="Mark" /> </filter> </link-entity> </entity> </fetch>
The intent here is to get a list of accounts that have a contact called Mark. However, if an account has two Marks, that account will be in the resulting list twice. Adding the distinct
attribute makes sure the user gets what they were probably expecting, which is a list of unique accounts.
Unique Identifiers
One other interesting side effect of using distinct
is that you no longer get the primary key of the entity back automatically. If you run the query:
<fetch> <entity name="account"> <attribute name="name" /> </entity> </fetch>
As well as getting the account name
back, you’ll also automatically get the accountid
as well.
When you apply distinct
, this goes away. This makes sense if you think about it. Imagine you have two accounts with the same name. If the accountid
was automatically included, that would make the records unique. You would still get two rows back from your query, regardless of whether or not you used distinct
.
If the purpose of using distinct
is to eliminate duplicate rows introduced by a join onto a child entity as shown above though, you probably do want to get a row back for each of your top-level entity type. In that case you will need to explicitly include the primary key attribute in your query, e.g.
<fetch distinct="true"> <entity name="account"> <attribute name="accountid" /> <attribute name="name" /> <link-entity name="contact" from="parentcustomerid" to="accountid" link-type="inner"> <filter> <condition attribute="firstname" operator="eq" value="Mark" /> </filter> </link-entity> </entity> </fetch>
Truncation
One “gotcha” to be aware of. Any text values will be truncated to 2,000 characters before the distinct operation is applied. If you have longer values, e.g. descriptions or the contents of attachments in the annotation
entity, any extra data after the first 2,000 characters will be lost.
Quite why this would be is a bit of a mystery. My best guess is that these fields used to be stored in a SQL Server ntext
field which couldn’t be used in a SELECT DISTINCT
query, so to work around this problem Dynamics CRM would cast these values to an nvarchar(2000)
value which could be used instead. You can see this in the generated SQL:
<fetch distinct="true" top="10"> <entity name="annotation"> <attribute name="documentbody" /> </entity> </fetch>
select DISTINCT top 10 cast( "annotation0".DocumentBody as varchar(2000) ) as "documentbody" from AnnotationBase as "annotation0"
As these columns are now stored using an nvarchar(max)
field which can be used directly in a SELECT DISTINCT
query I guess this behaviour is still in place for backwards compatibility reasons, although it does give about a 35% performance improvement:

SQL Equivalent
Apart from this once caveat of long text fields, this is directly equivalent to the SQL DISTINCT
keyword.
Great post