Every now and then I like to spend some time understanding the internals of the various components that make up SharePoint. This week, while troubleshooting an issue at a customer’s, I decided to crack open the Content Type Hub to see exactly how Content Types get published down to subscriber site collections. In this article I will explain in detail the process involved in publishing a Content Type from the Content Type Hub to the various site collections that consume it.
First off, let us be clear. The Content Type Hub is nothing more than a regular site collection on which the Content Type Syndication Hub feature (a site collection feature) has been activated.
The moment you activate this feature on your site collection, a new hidden list called “Shared Packages” is created in the root web of that same site collection.
Right after activation, that list is empty. You can view it by navigating to http://<Content Type Hub URL>/Lists/PackageList/AllItems.aspx.
However, the moment you publish a Content Type in the Hub, you will see an entry for that Content Type appear.
In the two figures above, we can see that we have published the Content Type “Message”, which has an ID of 0x0107. Therefore the entry that gets created in the Shared Packages list has that same ID value (0x0107) set in its “Published Package ID” column. “Published Package ID” always represents the ID of the Content Type that was recently Published/Updated/Unpublished and that is waiting to be synchronized down to the subscriber site collections. The “Taxonomy Service Store ID” column contains the ID of the Managed Metadata Service that is used to syndicate the Content Type Hub changes to the subscriber sites. In my case, if I navigate to the instance of the Managed Metadata Service that takes care of the syndication process and look at its URL, I can see that its ID matches the value of that column (see the two figures below).
The “Taxonomy Service Name” column is self-explanatory: it holds the display name of the Managed Metadata Service Application instance identified by the “Taxonomy Service Store ID” (see the two figures below).
The “Published Package Type” column will always have a value of “{B4AD3A44-D934-4C91-8D1F-463ACEADE443}” which means it is a “Content Type Syndication Change”.
The last column, “Unpublished”, is a Boolean value (true or false) that indicates whether or not the operation that was added to the queue is a Publish/Update, in which case that value will be set to “No”, or if it was an “Unpublish” operation, in which case the value would be set to “Yes”. The two figures below show the results of sending an “Unpublish” operation on a previously published Content Type to the queue.
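To make these columns easier to read in bulk, here is a minimal sketch of a helper that summarizes one “Shared Packages” entry. The function name Get-SharedPackageSummary is my own (it is not part of SharePoint), and I am assuming that the columns can be read by the display names shown above and that “Unpublished” may come back either as a Boolean or as the text “Yes”/“No”.

```powershell
# Hypothetical helper (not a SharePoint cmdlet): summarizes one "Shared Packages" entry.
# $Item can be an SPListItem or, for a dry run, a hashtable keyed by the same
# column display names used in this article.
function Get-SharedPackageSummary {
    param([Parameter(Mandatory = $true)] $Item)

    # Assumption: the Yes/No column may surface as a Boolean or as "Yes"/"No".
    $raw = $Item["Unpublished"]
    $unpublished = ($raw -eq $true) -or ("$raw" -eq "Yes")

    $operation = "Publish/Update"
    if ($unpublished) { $operation = "Unpublish" }

    New-Object PSObject -Property @{
        ContentTypeId = [string]$Item["Published Package ID"]
        ServiceName   = [string]$Item["Taxonomy Service Name"]
        ServiceId     = [string]$Item["Taxonomy Service Store ID"]
        Operation     = $operation
    }
}

# Possible usage on a farm (the hub URL is a placeholder, the list path is the one shown earlier):
#   $web  = Get-SPWeb "http://hub.example.local"
#   $list = $web.GetList($web.ServerRelativeUrl.TrimEnd('/') + "/Lists/PackageList")
#   $list.Items | ForEach-Object { Get-SharedPackageSummary -Item $_ }
#   $web.Dispose()
```

Because the helper only relies on a string indexer, you can try it against plain hashtables before pointing it at a real Hub.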
What is really interesting is that even after the subscriber job (the Content Type Subscriber timer job) has finished running, entries in the “Shared Packages” list persist. In fact, they are required for certain operations in the Hub to work as expected. For example, when you navigate to the Content Type Publishing page, if there is no entry in the “Shared Packages” list for that Content Type, you will never be offered “Republish” and “Unpublish” as options. Before offering the option to undo the publishing, the page queries the list for an entry that proves the given Content Type was published at some point.
To better demonstrate what I am trying to explain, take the following scenario. Go to your Content Type Hub and publish the “Picture” Content Type. Once it is published, go back to that same “Content Type Publishing” page. You will be presented with only two options: Republish or Unpublish. The “Publish” option is greyed out, because SharePoint assumes that an entry in the Shared Packages list marked with “Unpublished = No” means the Content Type has already been published. Therefore you can only “Republish” or “Unpublish” it. Now navigate to the “Shared Packages” list and delete the entry for that Content Type. Once the entry has been deleted, navigate back to the Content Type Publishing page for the Picture Content Type. The “Publish” option is now enabled, and the “Republish” and “Unpublish” ones are disabled. That is because SharePoint could not find any proof in the “Shared Packages” list that this Content Type had been published in the past.
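The behavior of the publishing page in that scenario can be modeled with a few lines of PowerShell. This is only a sketch of the decision as I observed it, not the page’s actual code: Get-PublishingOptions is a hypothetical name, and treating an entry flagged “Unpublished = Yes” the same as a missing entry is my own assumption.

```powershell
# Hypothetical model of the options offered by the Content Type Publishing page.
# $Entries are "Shared Packages" items (SPListItem objects or hashtables keyed
# by column display name).
function Get-PublishingOptions {
    param(
        [Parameter(Mandatory = $true)] [string] $ContentTypeId,
        [object[]] $Entries = @()
    )

    # Look for an entry that proves the Content Type was published at some point.
    $entry = $Entries |
        Where-Object { [string]$_["Published Package ID"] -eq $ContentTypeId } |
        Select-Object -First 1

    $isPublished = $false
    if ($null -ne $entry) {
        $raw = $entry["Unpublished"]
        $unpublished = ($raw -eq $true) -or ("$raw" -eq "Yes")
        # Assumption: an entry flagged as unpublished behaves like no entry at all.
        $isPublished = -not $unpublished
    }

    New-Object PSObject -Property @{
        Publish   = (-not $isPublished)
        Republish = $isPublished
        Unpublish = $isPublished
    }
}
```

With an entry for 0x010102 marked “Unpublished = No”, the function reports Republish and Unpublish as available; with no entry at all, only Publish is.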
Also, if you were to publish a Content Type and later unpublish it, you would not see two entries in the “Shared Packages” list (one for the publish and one for the unpublish). The Unpublish operation simply updates the existing entry in the “Shared Packages” list and sets its “Unpublished” flag to “Yes”.
If you were to create a custom Content Type, publish it, and then delete it, SharePoint is not going to automatically remove its associated entry from the “Shared Packages” list. Instead, the next time the “Content Type Hub” timer job runs, it will update the associated entry and set its “Unpublished” flag to “Yes”. This ensures that the deleted Content Type never makes it down to the Subscriber Site Collections.
How Synchronization Works
By now you are probably wondering how the synchronization process works between the Hub and the Subscriber Site Collections if entries always persist in the “Shared Packages” list. The process is actually quite simple. The “Content Type Subscriber” timer job is the one responsible for that operation. By default that timer job runs on an hourly basis and indirectly (via Web Services) queries the “Shared Packages” list to retrieve all changes that have to be synchronized. The root web of every Site Collection that subscribes to the Content Type Hub exposes a property called “metadatatimestamp” that represents the last time the Content Type Gallery for that given Site Collection was synchronized. The following PowerShell script can help you obtain that value for any given subscriber Site Collection.
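# Run from the SharePoint Management Shell (or load the Microsoft.SharePoint.PowerShell snap-in first).
$url = Read-Host "URL for your subscriber Site Collection"
$site = Get-SPSite $url
# Time stamp of the last Content Type synchronization for this Site Collection
Write-Host $site.RootWeb.Properties["metadatatimestamp"]
$site.Dispose()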
When the “Content Type Subscriber” timer job runs, it goes through every Subscriber Site Collection, retrieves its “metadatatimestamp” value, and queries the “Shared Packages” list, passing that time stamp. The query then returns only the entries whose “Last Modified” date is more recent than that time stamp, in other words the changes made since the last synchronization. Upon receiving the list of changes to sync, the timer job retrieves the Content Type information associated with the listed changes from the Hub and applies them locally to the Subscriber Site Collection’s Content Type gallery. Once it has finished synchronizing a Site Collection, it updates that Site Collection’s “metadatatimestamp” to reflect the new time stamp.
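The selection step can be sketched as a simple filter. This is a model under assumptions, not the timer job’s real implementation: Select-PendingPackage is a hypothetical name, I assume an entry needs to be synchronized when it was modified after the last synchronization, and I assume the time stamp has already been converted to a DateTime (how the property value is stored is not covered here).

```powershell
# Hypothetical model of the subscriber job's selection step.
# $Entries  : "Shared Packages" items (SPListItem objects or hashtables) exposing "Modified".
# $LastSync : DateTime of the last synchronization, or $null / empty if never synchronized.
function Select-PendingPackage {
    param(
        [object[]] $Entries = @(),
        $LastSync = $null
    )

    # An empty time stamp means every published entry must be pushed down.
    $neverSynced = [string]::IsNullOrEmpty([string]$LastSync)

    foreach ($entry in $Entries) {
        if ($neverSynced -or ([datetime]$entry["Modified"] -gt [datetime]$LastSync)) {
            $entry
        }
    }
}
```

Passing an empty time stamp returns every entry, which lines up with what happens for a Site Collection that has never been synchronized.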
If you really want to, you can make sure that every single Content Type listed in the “Shared Packages” list is synchronized against a given Site Collection by emptying the “metadatatimestamp” property on that site. As an example, when you create a new Site Collection, its root web won’t have that value set, and therefore every Content Type that was ever published in the Hub will make its way down to that new Site Collection. Using the interface, you can also blank out that property by going to the Content Type Publishing page and selecting the option to “Refresh all published content types on next update”. All that this option does is empty the value of that property on the site.
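If you would rather do this from PowerShell than from the interface, a minimal sketch could look like the following. Reset-ContentTypeSyncStamp is a name I made up; the function simply empties the property on whatever root web you hand it and saves the property bag. Test it on a non-production Site Collection first.

```powershell
# Hypothetical helper: empties the "metadatatimestamp" property so that the next
# run of the subscriber job treats the Site Collection as never synchronized.
# $Web is expected to be the root SPWeb of a subscriber Site Collection.
function Reset-ContentTypeSyncStamp {
    param([Parameter(Mandatory = $true)] $Web)

    $Web.Properties["metadatatimestamp"] = ""
    # Persist the change to the property bag.
    $Web.Properties.Update()
}

# Possible usage on a farm (the URL is a placeholder):
#   $site = Get-SPSite "http://subscriber.example.local"
#   Reset-ContentTypeSyncStamp -Web $site.RootWeb
#   $site.Dispose()
```

After the property is emptied, the Content Types are only pushed down the next time the “Content Type Subscriber” timer job runs.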
Document Sets
Let’s now look at a more complex scenario (which is really why I started taking a closer look at the Publishing process in the first place). Let us investigate what happens if we publish Content Types that inherit from the Document Set Content Type. In my example, I’ve created a new custom Content Type named “DocSetDemo”. This custom Content Type, since it inherits from its Document Set parent, defines a list of Allowed Content Types. In my case, I only allow for “Picture” and “Image” as Allowed Content Types.
The moment you try to publish this custom Content Type, you will get the following prompt, which lets you know that every Allowed Content Type identified within your custom Content Type will also be published.
What that truly means is that not only is SharePoint going to create an entry in the “Shared Packages” list for your custom Content Type, it will also create one for every Allowed Content Type identified. In the figure below, we can see that after publishing my custom Content Type “DocSetDemo”, which has an ID of 0x0120D5200049957D530FA0554787CFF785C7C5C693, there are 3 entries in the list. One for my custom Content Type itself, one for the Image Content Type (ID of 0x0101009148F5A04DDD49CBA7127AADA5FB792B00AADE34325A8B49CDA8BB4DB53328F214) and one for the Picture Content Type (ID of 0x010102).
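When troubleshooting, it can help to check that every expected entry actually made it into the list. Here is a minimal sketch, assuming you already know the IDs you expect (the Document Set Content Type plus its Allowed Content Types); Get-MissingPackageId is a hypothetical helper name of mine.

```powershell
# Hypothetical helper: returns the expected Content Type IDs that have no
# matching entry in the "Shared Packages" list.
# $Entries are list items (SPListItem objects or hashtables keyed by display name).
function Get-MissingPackageId {
    param(
        [Parameter(Mandatory = $true)] [string[]] $ExpectedIds,
        [object[]] $Entries = @()
    )

    # Collect the IDs that are present in the list.
    $publishedIds = @($Entries | ForEach-Object { [string]$_["Published Package ID"] })

    # Keep only the expected IDs that were not found (comparison is case-insensitive).
    $ExpectedIds | Where-Object { $publishedIds -notcontains $_ }
}

# IDs from the DocSetDemo example in this article.
$expected = @(
    "0x0120D5200049957D530FA0554787CFF785C7C5C693",
    "0x0101009148F5A04DDD49CBA7127AADA5FB792B00AADE34325A8B49CDA8BB4DB53328F214",
    "0x010102"
)
```

An empty result for $expected means all three entries are present in the list.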