Content Type Hub Packages

Every now and then I like to spend some time understanding the internals of some of the various components that make up SharePoint. This week, while troubleshooting an issue at a customer’s, I decided to crack open the Content Type Hub to see how exactly Content Types get published down to subscriber site collections. In this article I will explain in detail the process involved in publishing a Content Type from the Content Type Hub to the various site collections that consume it.

First off, let us be clear: the Content Type Hub is nothing more than a regular site collection on which the Content Type Syndication Hub site collection feature has been activated.

The moment you activate this feature on your site collection, a new hidden list called “Shared Packages” is created in the root web of that same site collection.

Right after activation, that list is empty. You can view it by navigating to http://&lt;Content Type Hub URL&gt;/Lists/PackageList/AllItems.aspx.
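
If you prefer PowerShell to the browser, a minimal sketch along these lines lets you dump the entries (the hub URL is an assumption, and the column display names are the ones discussed below):

$site = Get-SPSite "http://contenttypehub.contoso.com"   # assumption: your hub URL
$list = $site.RootWeb.Lists["Shared Packages"]
foreach ($item in $list.Items)
{
    # Column display names as they appear in the list views described below
    Write-Host $item["Published Package ID"] $item["Unpublished"]
}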

However, the moment you publish a Content Type in the Hub, you will see an entry for that Content Type appear.
Publish a SharePoint Content Type
SharePoint Content Type Hub Package

In the two figures above, we can see that we have published the Content Type “Message”, which has an ID of 0x0107. Therefore the entry that gets created in the Shared Packages list has that same ID value (0x0107) set as its “Published Package ID” column. “Published Package ID” will always represent the ID of the Content Type that was recently published/updated/unpublished and is waiting to be synchronized down to the subscriber site collections. The “Taxonomy Service Store ID” column contains the ID of the Managed Metadata Service that is used to syndicate the Content Type Hub changes to the subscriber sites. In my case, if I were to navigate to my instance of the Managed Metadata Service that takes care of the syndication process and look at its URL, I should see that its ID matches the value of that column (see the two figures below).

SharePoint Taxonomy Service Store ID
Managed Metadata Service Application ID
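
Rather than squinting at the service application’s URL, you can also list the IDs with PowerShell; a small sketch:

# List every Managed Metadata Service Application and its ID, which should
# match the "Taxonomy Service Store ID" column of the Shared Packages list
Get-SPServiceApplication | Where-Object { $_.TypeName -like "*Managed Metadata*" } |
    Select-Object DisplayName, Id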

The “Taxonomy Service Name” column is self-explanatory: it represents the display name of the Managed Metadata Service Application instance identified by the “Taxonomy Service Store ID” (see the two figures below).

SharePoint Taxonomy Service Name
SharePoint Managed Metadata Service Application Name

The “Published Package Type” column will always have a value of “{B4AD3A44-D934-4C91-8D1F-463ACEADE443}” which means it is a “Content Type Syndication Change”.
SharePoint Published Package Type

The last column, “Unpublished”, is a Boolean value (true or false) that indicates whether the operation that was added to the queue was a Publish/Update, in which case the value is set to “No”, or an Unpublish operation, in which case the value is set to “Yes”. The two figures below show the results of sending an “Unpublish” operation on a previously published Content Type to the queue.
SharePoint Unpublish a Content Type

Now what is really interesting is that even after the subscriber job (the Content Type Subscriber timer job) has finished running, entries in the “Shared Packages” list persist. In fact, these are required for certain operations in the Hub to work as expected. For example, when you navigate to the Content Type Publishing page, if there is no entry in the “Shared Packages” list for that content type, you will never get “Republish” and “Unpublish” as options. The page queries the list to see if there is an entry for the given Content Type proving it was published at some point, before offering the option to undo the publishing.

To better demonstrate what I am trying to explain, take the following scenario. Go to your Content Type Hub and publish the “Picture” Content Type. Once it is published, simply go back to that same “Content Type Publishing” page. You will be presented with only two options: Republish or Unpublish. The “Publish” option will be greyed out, because SharePoint assumes that, since you have an entry in the Shared Packages list marked with “Unpublished = No”, the Content Type has already been published. Therefore you can only “Republish” or “Unpublish” it. Now navigate to the “Shared Packages” list and delete the entry for that Content Type. Once the entry has been deleted, navigate back to the Content Type Publishing page for the Picture Content Type. The “Publish” option is now enabled, and the “Republish” and “Unpublish” ones are disabled. That is because SharePoint could not find proof in the “Shared Packages” list that this Content Type was ever published.

Also, if you were to publish a Content Type and later unpublish it, you would not see two entries in the “Shared Packages” list (one for the publish and one for the unpublish). The Unpublish operation simply updates the existing entry in the “Shared Packages” list and sets its “Unpublished” flag to “Yes”.

If you were to create a custom Content Type, publish it, and then delete it, SharePoint would not automatically remove its associated entry in the “Shared Packages” list. Instead, the next time the “Content Type Hub” timer job runs, it will update the associated entry and set its “Unpublished” flag to “Yes”, making sure that the deleted Content Type never makes it down to the subscriber site collections.

How Synchronization Works

By now you are probably wondering how the synchronization process works between the Hub and the subscriber site collections if entries are always persisted in the “Shared Packages” list. The way this process works is actually quite simple. The “Content Type Subscriber” timer job is responsible for that operation. By default, that timer job runs on an hourly basis and indirectly (via Web Services) queries the “Shared Packages” list to retrieve all changes that have to be synchronized. The root web of every site collection that subscribes to the Content Type Hub exposes a property called “metadatatimestamp” that represents the last time the Content Type gallery of that given site collection was synchronized. The following PowerShell script can help you obtain that value for any given subscriber site collection.

$url = Read-Host "URL for your subscriber Site Collection"
$site = Get-SPSite $url
Write-Host $site.RootWeb.Properties["metadatatimestamp"]

When the “Content Type Subscriber” timer job runs, it goes through every subscriber site collection, retrieves its “metadatatimestamp” value, and queries the “Shared Packages” list passing that timestamp. The query then returns only the entries whose “Last Modified” date is more recent than that timestamp. Upon receiving the list of changes to sync, the timer job retrieves the Content Type information associated with the listed changes from the Hub and applies it locally to the subscriber site collection’s Content Type gallery. Once it has finished synchronizing a site collection, it updates that site’s “metadatatimestamp” to reflect the new timestamp.
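
For testing purposes, you do not have to wait for the hourly schedule. A hedged sketch that starts every instance of the job on demand (one exists per web application, and the filter assumes the default English display name):

# Find every "Content Type Subscriber" timer job instance and start it now
Get-SPTimerJob | Where-Object { $_.DisplayName -like "*Content Type Subscriber*" } |
    ForEach-Object { Start-SPTimerJob $_ }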

If you really wanted to, you could force every single Content Type listed in the “Shared Packages” list to be synchronized against a given site collection by emptying the “metadatatimestamp” property on that site. As an example, when you create a new site collection, its root web won’t have that value set, and therefore every Content Type that was ever published in the Hub will make its way down to that new site collection. Through the interface, you can also blank out that property by going to the Content Type Publishing page and selecting the option to “Refresh all published content types on next update”. All that this option does is empty the value of that property on the site.

Refresh all published content types on next update
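
A minimal PowerShell equivalent of that option might look like this (the subscriber URL is an assumption):

# Clear the timestamp to force a full re-sync on the next run of the
# Content Type Subscriber timer job
$site = Get-SPSite "http://subscriber.contoso.com"   # assumption: your subscriber URL
$site.RootWeb.Properties["metadatatimestamp"] = [string]::Empty
$site.RootWeb.Properties.Update()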

Document Sets

Let’s now look at a more complex scenario (which is really why I started taking a closer look at the publishing process in the first place): what happens if we publish Content Types that inherit from the Document Set Content Type? In my example, I’ve created a new custom Content Type named “DocSetDemo”. This custom Content Type, since it inherits from its Document Set parent, defines a list of Allowed Content Types. In my case, I only allow “Picture” and “Image” as Allowed Content Types.
Allowed Content Types

The moment you try to publish this custom Content Type, you will get the following prompt, which lets you know that every Allowed Content Type identified within your custom Content Type will also be published.

What that truly means is that not only is SharePoint going to create an entry in the “Shared Packages” list for your custom Content Type, it will also create one for every Allowed Content Type identified. In the figure below, we can see that after publishing my custom Content Type “DocSetDemo”, which has an ID of 0x0120D5200049957D530FA0554787CFF785C7C5C693, there are three entries in the list: one for my custom Content Type itself, one for the Image Content Type (ID 0x0101009148F5A04DDD49CBA7127AADA5FB792B00AADE34325A8B49CDA8BB4DB53328F214) and one for the Picture Content Type (ID 0x010102).

How to use the ReverseDSC Core

The ReverseDSC.Core module is the heart of the ReverseDSC process. This module defines several functions that help you dynamically extract the DSC configuration script for each resource within a DSC module. The ReverseDSC Core is generic, meaning that it applies to any technology, not only SharePoint. In this blog article I will describe in detail how you can start using the ReverseDSC Core module today and integrate it into your existing solutions. To better illustrate the process, I will be using an example where I extract the properties of a given user within Active Directory using the ReverseDSC Core.

Getting Started

If you were to take a look at the content of the ReverseDSC.Core.psm1 module (https://github.com/NikCharlebois/SharePointDSC.Reverse/blob/master/ReverseDSC.Core.psm1), you would see about a dozen functions defined. The one we are truly interested in is Export-TargetResource. This function takes in two mandatory parameters: the name of the DSC resource we wish to “reverse”, and the list of mandatory parameters for the Get-TargetResource function of that same resource. The mandatory parameters are essential because without them, Get-TargetResource is not able to determine what instance of the resource we wish to obtain the current state for. A third, optional parameter lets you define a DependsOn clause in case the current instance depends on another one. However, let us not worry about that parameter for our current example.

As mentioned previously, for the sake of our example, we want to extract the information about the various users in our Active Directory. Active Directory users are represented by the MSFT_xADUser resource, so you will need to make sure the xActiveDirectory module is properly installed on the machine you are about to extract the information from.

Let us now take a look at the Get-TargetResource function of the MSFT_xADUser resource. The function only requires two mandatory parameters: DomainName and UserName.

Therefore we need to pass these two mandatory parameters to our Export-TargetResource function. Now, in my case, I know for a fact that I have a user in my Active Directory named “John Smith”, who has a username of “contoso\JSmith”. I also have a local copy of the ReverseDSC.Core.psm1 module located under C:\temp. I can then initiate the ReverseDSC process for that user by calling the following lines of PowerShell:

Import-Module -Name "C:\temp\ReverseDSC.Core.psm1" -Force
$mandatoryParameters = @{DomainName="contoso.com"; UserName="JSmith"}
Export-TargetResource -ResourceName xADUser -MandatoryParameters $mandatoryParameters

Executing these lines of code will print the resulting DSC resource block for that user to the console.

Since the Export-TargetResource function simply outputs the resulting DSC resource block as a string, you need to capture it in a variable somewhere and manually build your resulting DSC configuration. The following modifications to our script allow us to build the resulting Desired State Configuration script and save it locally on disk, in my case under C:\temp:

Import-Module -Name "C:\temp\ReverseDSC.Core.psm1" -Force
$output = "Configuration ReverseDSCDemo{`r`n    Import-DSCResource -ModuleName xActiveDirectory`r`n    Node localhost{`r`n"
$mandatoryParameters = @{DomainName="contoso.com"; UserName="JSmith"}
$output += Export-TargetResource -ResourceName xADUser -MandatoryParameters $mandatoryParameters
$output += "    }`r`n}`r`nReverseDSCDemo"
$output | Out-File "C:\Temp\ReverseDSCDemo.ps1"

Running this will generate the following DSC Configuration script:

Configuration ReverseDSCDemo{
    Import-DSCResource -ModuleName xActiveDirectory
    Node localhost{
        xADUser baf586dd-3c2c-4131-9267-d4d8fb1d5d01
        {
            CannotChangePassword = $True;
            HomePage = "http://Nikcharlebois.com";
            DisplayName = "John Smith";
            Description = "John's Account";
            Notes = "These are my notes";
            Office = "Basement of the building";
            State = "Quebec";
            Fax = "";
            JobTitle = "Aquatic Plant Watering";
            Country = "";
            Division = "";
            Initials = "";
            POBox = "23";
            HomeDirectory = "";
            EmployeeID = "";
            LogonScript = "";
            GivenName = "John";
            EmployeeNumber = "";
            UserPrincipalName = "jsmith@contoso.com";
            ProfilePath = "";
            StreetAddress = "55 Mighty Suite";
            CommonName = "John Smith";
            Path = "CN=Users,DC=contoso,DC=com";
            HomePhone = "555-555-5555";
            City = "Gatineau";
            Manager = "CN=Nik Charlebois,CN=Users,DC=contoso,DC=com";
            MobilePhone = "";
            Pager = "";
            Company = "Contoso inc.";
            HomeDrive = "";
            OfficePhone = "555-555-5555";
            Surname = "Smith";
            Enabled = $True;
            DomainController = "";
            PostalCode = "J8P 2A9";
            IPPhone = "";
            EmailAddress = "JSmith@contoso.com";
            PasswordNeverExpires = $True;
            UserName = "JSmith";
            DomainName = "contoso.com";
            Ensure = "Present";
            Department = "Plants";
        }
    }
}
ReverseDSCDemo

Executing the resulting ReverseDSCDemo.ps1 script will generate a MOF file that can be used with PowerShell Desired State Configuration.
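
For completeness, the last mile could look like this (paths follow my example; compiling creates a folder named after the configuration in the current directory):

# Compile the configuration; this produces a localhost.mof under .\ReverseDSCDemo
C:\Temp\ReverseDSCDemo.ps1
# Optionally push the compiled configuration to the local node right away
Start-DscConfiguration -Path .\ReverseDSCDemo -Wait -Verbose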

Recap

The ReverseDSC Core allows you to easily extract all the parameters of an instance of a resource by simply specifying the few mandatory parameters its Get-TargetResource function requires. It does not take care of scanning all instances in an environment; creating that script is left to the user. That is not to say that we will not be looking at ways of automating this in the future, but the current focus is to keep it as unit calls.

For a DSC resource to work with the ReverseDSC Core, you need to ensure it has a well-written Get-TargetResource function that returns the proper parameters. This should already be the case for any well-written resource out there, but it is not always so. In the past, most of the development effort for new DSC resources was put on the Set-TargetResource function to ensure the “forward” DSC route was working well. However, in order for the whole DSC process to work properly, it is crucial that your Get-TargetResource function be as complete as possible. After all, the Test-TargetResource function also depends on it to check whether or not your machine has drifted away from its desired state.

SharePoint Reverse DSC

In my previous blog article I introduced the concept of Reverse DSC, which is nothing more than a dynamic way of extracting a Desired State Configuration (DSC) script that represents the current state of any given environment. In this blog article, I will guide you through the process of executing the Reverse DSC script against an existing SharePoint 2013 or 2016 farm. Please note that while PowerShell v4 is supported by the SharePoint Reverse DSC, it is highly recommended that you upgrade your environment to PowerShell v5 to fully leverage the goodness of the DSC engine.

While it is still not officially decided how the SharePoint Reverse DSC script will be distributed, I have decided to go ahead and offer a temporary distribution via my blog. A copy of the current script can be obtained here:

This version of the script currently supports the latest available bits of the SharePointDSC Module (1.5.0.0). The package is made of two files:

  • ReverseDSC.Util.psm1, the core ReverseDSC module which is generic to all DSC Modules (not just SharePoint)
  • SharePointDSC.Reverse.ps1, the main SharePoint specific PowerShell script responsible for extracting the current state of a SharePoint Environment.

As mentioned above, this script is optimized to run under an environment that has PowerShell v5. To determine what version of PowerShell you are using, simply run the following PowerShell command:

$PSVersionTable.PSVersion.Major

If you are running version 4, no worries: you can upgrade to version 5 by downloading and installing the Windows Management Framework (WMF) 5.0 on your various servers (which will cause downtime). In case your organization is not yet ready to upgrade to WMF 5, you can either download and install the PackageManagement module for PowerShell 4, or simply manually install the SharePointDSC 1.5.0.0 module onto each of your servers. PackageManagement will simply be used to automatically download and install the proper version of the SharePointDSC module from the PowerShell Gallery, assuming your server has internet connectivity (which it most likely won’t anyway).
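
If you go the manual route, a single PowerShellGet command such as the following should do (assuming Gallery connectivity from wherever you download the module):

Install-Module -Name SharePointDSC -RequiredVersion 1.5.0.0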

How to Use

  1. Extract the content of the package onto one of the SharePoint servers (Web Front-End or Application server). Make sure that both the .psm1 and .ps1 files are in the same folder.
  2. In an elevated PowerShell session (running as administrator), execute the SharePointDSC.Reverse.ps1 script.
  3. If you do not have the required version of the SharePointDSC module installed, you will be prompted to choose whether or not to automatically download it. Note that this requires your server to have internet connectivity. (I recommend you manually get the module v1.5.0.0 onto your server.)
  4. When prompted to provide Farm admin credentials, simply enter credentials for any account that has Farm Admin privileges on your farm.
  5. The script may prompt you several times to enter credentials for various Managed Accounts in your environment. This is required in order for DSC to be able to retrieve any Password Change Schedules associated with your managed accounts. Simply provide the requested credentials for each prompt.
  6. The script will scan through all components supported by the SharePointDSC module and then compile the resulting DSC Configuration Script. Once finished, it will prompt you to specify the path to an existing folder where the resulting .ps1 DSC Configuration Script will be saved.
  7. The DSC Configuration Script will be saved with the name “SP-Farm.DSC.ps1” under the specified folder path. You can open the .ps1 file to take a close look at its content. The top comments section provides insights about the Operating System versions, the SQL Server versions, and all the patches installed in your farm.
  8. To validate that the Reverse DSC process was successful, simply execute the resulting SP-Farm.DSC.ps1 file. It will prompt you to pick a passphrase and will automatically compile a .meta.mof and a .mof file for each of the servers in your farm. A condensed recap of the whole run follows below.
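
As promised, here is a condensed recap of the run (folder paths are assumptions from my own setup):

# Run from the folder containing both the .psm1 and .ps1 files, elevated
Set-Location "C:\ReverseDSC"
.\SharePointDSC.Reverse.ps1      # answer the credential prompts as they come up
# Once the extraction completes, validate it by compiling the result
.\SP-Farm.DSC.ps1                # prompts for a passphrase, emits the .mof files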

Now that you have your resulting .MOF files, you can use them to replicate your environment to another location on-premises, upload the resulting .ps1 into Azure Automation to create a replica of your environment in the cloud, or on-board your existing environment onto DSC. The next Blog post in this series will go through the steps you need to take to on-board an existing SharePoint environment onto DSC using the ReverseDSC process.

Introducing Reverse DSC

Ever since becoming a Microsoft PowerShell MVP back in the summer of 2014, I have been heavily involved with various PowerShell Desired State Configuration (DSC) projects. The main initiative I have been involved with is the SharePointDSC module which is currently led by Brian Farnhill down in Australia. While my contributions to the core of the project have been limited, I have been spending numerous hours working on a new concept I came up with and which is very close to my heart. Reverse DSC is something I introduced back in late 2015 after spending some late night hours testing out my SharePointDSC scripts. It is the concept of extracting a DSC Configuration Script out of an existing environment in order to be able to better analyze it, replicate it or onboard it onto PowerShell DSC. Let me be very clear, this concept does not only apply to the SharePoint world; it applies to all software components that can be automated via DSC. I am of the opinion that this concept will be a game changer in the world of automation, and I strongly encourage you to read through this article to better understand the core concepts behind it.

Definitions

To get started, and to make sure we are all on the same page, let us define the following two terms:

  • Desired State: represents how we want a component to be configured. For example, the Desired State of a SharePoint site (SPWeb) could define its title: for a given site to be in its Desired State, it needs to have a title of “Intranet”.
  • Current State: represents how a component is currently configured. In many cases the Current State can be the same as the Desired State, which is completely fine. PowerShell DSC aims at making sure that whenever the Current State is not equal to the Desired State, we do everything in our power to bring the server node back to its Desired State.

Anatomy of a DSC Resource

Before we go any further, it is key to understand how DSC resources work internally. Just as a refresher, a DSC resource is responsible for configuring a specific component within a DSC module. For example, within the SharePointDSC module, the MSFT_SPWebApplication resource is responsible for configuring SharePoint Web Applications. Every DSC resource is made of three core functions: Get-TargetResource, Test-TargetResource, and Set-TargetResource (a minimal sketch of the three follows below).

  • Set-TargetResource is the function responsible for bringing the server into its Desired State by configuring the given component represented by the resource. It is called on the initial configuration call (e.g., Start-DSCConfiguration for Push mode), and when the Local Configuration Manager (LCM) is in ApplyAndAutocorrect mode and detects that the machine has drifted away from its Desired State.
  • Get-TargetResource is the function responsible for analyzing what the current state is for the component represented by the DSC Resource.
  • Test-TargetResource is responsible for calling the Get-TargetResource function to obtain the current state and comparing it with the Desired State contained within the Local Configuration Manager. If it detects that the current state doesn’t match the Desired State, and the LCM is in ApplyAndAutocorrect mode, it calls the Set-TargetResource method to ensure the machine is brought back to its Desired State.

The figure above details the process of PowerShell DSC where the Local Configuration Manager is configured in ApplyAndAutocorrect mode. The LCM checks on a regular basis (defined by the Configuration Mode Frequency) to see if the server is still in its Desired State. To do so, it calls into the Test-TargetResource function. This function is aware of what the Desired State should be because it is stored in the LCM’s memory (use the Get-DSCConfiguration cmdlet to see what is in the LCM’s memory), but needs to call into the Get-TargetResource function to figure out what the current state is. Once that is done, the Test-TargetResource method has information about what both the Desired and Current states are and will compare them. If they are the same, we are done and we will check again later. If they differ, then we need to call into the Set-TargetResource method to try to bring the Current State back to being the same as the Desired State.
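
To make this anatomy concrete, here is a minimal, purely illustrative skeleton of the three functions. It is not taken from any real module, and the resource property names are assumptions:

function Get-TargetResource
{
    param(
        [Parameter(Mandatory = $true)]
        [System.String]
        $Name
    )

    # Query the live system and return the Current State as a hashtable
    return @{
        Name   = $Name
        Ensure = "Present"
    }
}

function Test-TargetResource
{
    param(
        [Parameter(Mandatory = $true)]
        [System.String]
        $Name,

        [System.String]
        $Ensure = "Present"
    )

    # Compare the Current State (from Get) against the Desired State (parameters)
    $current = Get-TargetResource -Name $Name
    return ($current.Ensure -eq $Ensure)
}

function Set-TargetResource
{
    param(
        [Parameter(Mandatory = $true)]
        [System.String]
        $Name,

        [System.String]
        $Ensure = "Present"
    )

    # Bring the component into its Desired State here (create/remove/configure)
}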

The Reverse DSC Concept

The magic of the Reverse DSC concept lies within the Get-TargetResource function. As explained in the section above, this function is responsible for obtaining information about the current state of the server node for a given component. So you may ask: if, for example, I wanted to get information about all the Web Applications within my SharePoint environment, would calling into the Get-TargetResource function of the MSFT_SPWebApplication DSC resource be all I have to do? Well, that is absolutely correct, and this is what Reverse DSC is all about. A Reverse DSC script is a dynamic PowerShell script that calls into the Get-TargetResource function of each DSC resource contained within a DSC module. In the case of SharePoint, that Reverse DSC script would call into the Get-TargetResource function of all DSC resources listed in the following figure (note that the figure shows the components included in SharePointDSC v1.4).

The Reverse DSC script would then be responsible for compiling the current state of each DSC resource into a complete DSC Configuration Script that represents the Current State of each component within our environment. If that ain’t cool, I don’t know what is!

Real-Life Usage

I am a Microsoft Premier Field Engineer, which means that most of my days are spent troubleshooting issues with my clients’ environments. When I came up with the idea of Reverse DSC, my main intent was to ask my clients to run the Reverse DSC script against their environment and send me back the resulting DSC Configuration Script, so that I could replicate their exact environment within my own lab to make it easier to troubleshoot their issues with my own set of tools. However, as is often the case with innovation, the main use for it may end up being something totally different from what I originally anticipated. Here are some of the awesome real-life applications for Reverse DSC we can come up with:

  • Dev/Test: As mentioned above, one of the main uses of Reverse DSC is to allow an as-is replica of an existing on-premises environment. Most organizations I work with don’t have good DEV and Quality Assurance environments that match their Production environment. Running the Reverse DSC script against the Production environment allows users to take the resulting scripts and create exact copies of that environment for DEV and Test purposes.
  • Azure Automation: Organizations that have an on-premises Production environment and are looking at moving to the cloud (even if just for DEV/Test) can use the Reverse DSC script to generate the DSC Configuration matching their on-premises environment, and publish it to Azure Automation to have Azure Virtual Machines created that are an exact match of the on-premises environment.
  • Compare environments: How often have we heard the sentence “It works on my machine!”? With Reverse DSC, we can now run the script against two environments and compare the resulting scripts to see what configuration settings differ between the two.
  • Documentation: While I don’t foresee this as being the most popular reason for organizations to adopt Reverse DSC, it would still allow them to document (in DSC format) the configuration of an environment at any given point in time.
  • DSC On-boarding: This one is probably the key application for DSC adoption within an organization. Most companies today aren’t using DSC to automate the configuration of their environment and ensure they don’t drift away from the specified Desired State. Simply running the Reverse DSC script against an existing environment and then using the resulting script as that environment’s own Desired State Configuration script ensures the environment is now maintained by the DSC process. It is almost as if by running through this process you told the server: “Tell me what your Current State is. Oh, and by the way, that Current State you just told me about has just become your Desired State.” By doing this, organizations can then specify how the LCM should handle configuration drifts (ApplyAndMonitor or ApplyAndAutocorrect) and detect when the Current State (which is now also the Desired State) is drifting.

See it in Action

The Reverse DSC script for SharePoint is already a real thing. However, it is still awaiting final approval to officially become part of the SharePointDSC module. The following video shows the execution of the Reverse DSC script against my SharePoint 2016 dev server.

Next Blog post in this series-> SharePoint Reverse DSC

Unique Document ID Across the Farm with Custom Document ID Provider

The Document ID Service of SharePoint is probably the single most interesting component of the platform for everyone trying to achieve true records management with the software. This service allows unique document IDs to be assigned to documents within a site collection. The challenge many organizations encounter with this service, however, is the fact that if you want to ensure documents get a unique ID across your entire SharePoint farm, you need to specify a unique ID prefix for each site collection. For most organizations I work with, this means coming up with over 1,000 unique prefixes and applying them to each site collection using PowerShell or some other automation solution.
While this may seem like a show stopper for organizations wanting to use the service, it is important to understand that the Document ID Service is extensible, and that Microsoft makes it very easy for organizations to implement their own Custom Document ID Provider (CDIP). By creating such a CDIP, we can control the entire logic of how document IDs are generated and assigned to our documents, and therefore implement logic that ensures IDs are unique across our entire SharePoint farm, without having to come up with unique prefixes for each site collection. To get started with our CDIP, all we need to do is create a new SharePoint server-side solution project in Visual Studio.
Within our newly created project, add a new class named whatever you want (in my case I named it ContosoCDIP). Now let us add a reference to the Microsoft.Office.DocumentManagement.dll library to our project. This DLL file is normally found under /ISAPI.


Go back to the class we just created, and make sure it inherits from Microsoft.Office.DocumentManagement.DocumentIdProvider. By doing this, you should automatically get an error letting you know that in order for your class to inherit from that parent class, you need to override 3 methods and 1 property:


What we are going to do next is declare stubs for each of these 4 missing items. You can go ahead and manually retype all the missing override methods, or you can take the easy way and let Visual Studio do it all automatically for you. Using the lightbulb (Quick Actions) icon that appears when you hover over the error, you should see the option to let Visual Studio “Implement Abstract Class”. Select that option and Visual Studio will automatically create the stubs for the missing methods and property for you.
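
For reference, the generated stubs should look roughly like this (the namespace and class name come from my example; the member signatures belong to the DocumentIdProvider base class):

using System;
using Microsoft.Office.DocumentManagement;
using Microsoft.SharePoint;

namespace CDIPDemo
{
    public class ContosoCDIP : DocumentIdProvider
    {
        // Controls whether our custom lookup runs before the default search-based one
        public override bool DoCustomSearchBeforeDefaultSearch
        {
            get { throw new NotImplementedException(); }
        }

        public override string GenerateDocumentId(SPListItem listItem)
        {
            throw new NotImplementedException();
        }

        public override string[] GetDocumentUrlsById(SPSite site, string documentId)
        {
            throw new NotImplementedException();
        }

        public override string GetSampleDocumentIdText(SPSite site)
        {
            throw new NotImplementedException();
        }
    }
}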


We now have a fully valid Custom Document ID Provider. If you were to compile your project at this point, everything would come back fine and there should not be any compilation errors. Of course, all the items we just added will throw an error upon being called, and we will get back to this shortly, but first let us ensure our CDIP can be properly deployed. For our CDIP to be deployed to our environment, we need to create a new Feature with a feature event receiver. In the Solution Explorer window, right-click the Features folder and select Add.

Change the scope of your feature to be deployable at the Site Collection level.

You can now go ahead and right-click your feature in the Solution Explorer window, and select Add Event Receiver.


This will automatically create the Event Receiver class, with all the available methods commented out. Make sure you add a using statement at the top of your Event Receiver class to import the Microsoft.Office.DocumentManagement namespace.

using Microsoft.Office.DocumentManagement;

Let us start by uncommenting the “FeatureActivated” method. Inside that method, we need to globally register a Document ID provider on the current site collection where the feature was just activated. In there, type the following line of code, where CDIPDemo.ContosoCDIP is the Document ID Provider class we created previously. What we are effectively doing here is creating a new instance of our Custom Document ID Provider and setting it as the default provider for the current site collection.

DocumentId.SetProvider(properties.Feature.Parent as SPSite, new CDIPDemo.ContosoCDIP());

Because we are good coders, we also need to handle the case where the feature is deactivated, to ensure we remove our Custom Document ID Provider and set the given site collection back to the out-of-the-box one. To handle this scenario, uncomment the “FeatureDeactivating” method. What we need to do now is simply call into the SetDefaultProvider method of the DocumentId class to reset the default Document ID Provider:

DocumentId.SetDefaultProvider(properties.Feature.Parent as SPSite);

At this point in time, we have a fully functional CDIP, and we are able to deploy it to our environment. In order to make sure our provider is able to leverage the Document ID Service, we first need to activate the Document ID Service feature on our site collection.


If we were to deploy our CDIP right now and activate its feature, we could verify that the deployment was successful by going to Site Settings > Document ID Settings.


If everything worked as expected, you should see the following two messages in red at the top of your page:
  • Configuration of the Document ID feature is scheduled to be completed by an automated process.
  • A custom Document ID provider has been configured for this Site Collection. Refer to the documentation supplied with that custom provider to learn how to configure its settings.

The first one means that the Document ID enable/disable timer job has not yet run to completely configure your new CDIP. The second message confirms that a Custom Document ID Provider has been configured for the current Site Collection, which is exactly what we were looking for.


Now that we have confirmation that our Custom Document ID Provider was successfully deployed, we can go back into Visual Studio and start playing with the code. By default, the Document ID Service relies on the Search engine to retrieve a document based on an ID. When a Document ID is assigned to a document, a hyperlink with the ID is automatically placed in the Document ID field of your libraries. When you click on it, it sends you to http://&lt;site collection URL&gt;/_layouts/15/DocIdRedir.aspx?ID=&lt;Document ID&gt;. By default, the service queries the Search engine to retrieve the location of the document that has its Document ID field set to the ID received through the query string. We can, however, decide to override this logic and come up with our own way to retrieve a document’s URL based on its ID. Unless you have a good business rationale to change this behavior, I recommend letting it use the Search engine to do its retrieval.

However, if you do need to put in your own custom logic, it goes in the GetDocumentUrlsById method. This method returns a string array containing the URLs of the documents that match the received ID (because there can still be cases where you may want more than one document to have the same ID). In my case, I will simply have the method throw a NotImplementedException(), which indicates that I have not provided any logic for this method, so it will error out if called. You do not have to worry about the method throwing an exception; there is a way to make sure this method is never called.

Remember that when we created our CDIP class, we had to override 3 methods and 1 property. Well, that property (DoCustomSearchBeforeDefaultSearch) is what controls the “retrieval” behavior of our Custom Document ID Provider. If this value is set to false, our provider will always use the out-of-the-box search engine to retrieve the document, and the GetDocumentUrlsById method will never get called. If you set it to true, the method will be called every time a user clicks on the Document ID hyperlink. In my case, I will leave the value as false.


The next method we need to interact with is the GetSampleDocumentIdText method, which simply returns an example of what a Document ID generated by our CDIP should look like. For this example, I have decided that my Document IDs should be in the form of “&lt;DayOfWeek&gt; - &lt;GUID&gt;”. Therefore, my method will return any valid ID, for example “Wednesday - c32a6ab7-4c97-449b-a3d8-b5a82cf9eca7”. When you activate the Document ID Service on a site collection, it makes a new web part named “Find Document by ID” available under the Search category. If you drop that web part on a page, it will automatically call into this method to retrieve an example of what a valid Document ID could be and display it as a hint on the web part.
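
A possible implementation matching that format could be as simple as the following sketch (one of many valid approaches):

public override string GetSampleDocumentIdText(SPSite site)
{
    // Return an example ID shaped like "<DayOfWeek> - <GUID>", matching
    // what GenerateDocumentId produces below
    return DateTime.Now.DayOfWeek.ToString() + " - " + Guid.NewGuid().ToString();
}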


Last but not least is the GenerateDocumentId method, which is where all the magic happens. This is the method where you generate your unique ID. In my case, every ID generated should be unique in nature because it includes a GUID (in theory only, as it remains remotely possible to get two identical IDs by chance). If you wish to come up with a simpler naming convention but still keep the IDs unique across your entire farm, you can keep a reference to the last ID that was emitted in a list or in a property bag inside a given site, and increment that number every time you issue a new ID. In my case, my logic will simply be the following:

public override string GenerateDocumentId(SPListItem listItem)
{
    Guid guid = Guid.NewGuid();
    string dayName = DateTime.Now.DayOfWeek.ToString();
    return dayName + " - " + guid;
}

Now that all of our methods have been covered, we are ready to deploy our CDIP. Once the feature is activated, simply upload a document to any document library and take a look at the Document ID column (you will need to add it to your view). You should see the assigned ID matching the logic of your CDIP!