Unique Document ID Across the Farm with Custom Document ID Provider

The Document ID Service in SharePoint is probably the single most interesting component of the platform for anyone trying to achieve true records management with the software. This service assigns unique document IDs to documents within a site collection. The challenge many organizations encounter, however, is that if you want documents to get an ID that is unique across your entire SharePoint farm, you need to specify a unique prefix for each site collection. For most organizations I work with, this means coming up with over 1,000 unique prefixes and applying them to each site collection, using PowerShell or some other automation solution.
While this may seem like a showstopper for organizations wanting to use the service, it is important to understand that the Document ID Service is extensible, and that Microsoft makes it very easy for organizations to implement their own Custom Document ID Provider (CDIP). By creating such a CDIP, we control the entire logic of how document IDs are generated and assigned to our documents, and can therefore implement logic that ensures IDs are unique across our entire SharePoint farm, without having to come up with unique prefixes for each site collection. To get started with our CDIP, all we need to do is create a new SharePoint server-side project in Visual Studio.
Within our newly created project, add a new class and name it whatever you want (in my case I named it ContosoCDIP). Now let us add a reference to the Microsoft.Office.DocumentManagement.dll library to our project. This DLL file is normally found under /ISAPI.

Go back to the class we just created, and make sure it inherits from Microsoft.Office.DocumentManagement.DocumentIdProvider. As soon as you do this, you should get an error letting you know that in order for your class to inherit from that parent class, you need to override 3 methods (GenerateDocumentId, GetDocumentUrlsById and GetSampleDocumentIdText) and 1 property (DoCustomSearchBeforeDefaultSearch).

What we are going to do next is declare stubs for each of these 4 missing items. You can go ahead and manually type all the missing overrides, or you can take the easy way and let Visual Studio do it for you. Using the Quick Actions icon (lightbulb) that appears when you mouse over the error, you should see the option to let Visual Studio “Implement Abstract Class”. Select that option and Visual Studio will create the stubs for the missing methods and property for you.

We now have a fully valid Custom Document ID Provider. If you were to compile your project at this point, there should not be any compilation errors. Of course, every stub we just added will throw a NotImplementedException when called, and we will get back to this shortly, but first let us ensure our CDIP can be properly deployed. For our CDIP to be deployed to our environment, we need to create a new feature receiver. First off, we need to create a new Feature which will allow us to deploy our Custom Document ID Provider. In the Solution Explorer window, right click on the Features folder and select Add Feature.

Change the scope of your feature to be deployable at the Site Collection level.

You can now go ahead and right click on your feature in the Solution Explorer window, and select Add Event Receiver.

This will automatically create the Event Receiver class, with all the available methods commented out. Make sure you add a using directive at the top of your Event Receiver class to import the Microsoft.Office.DocumentManagement namespace.

using Microsoft.Office.DocumentManagement;

Let us start by uncommenting the “FeatureActivated” method. Inside that method, what we need to do is globally register a DocumentID provider on the current Site Collection where the feature was just activated. In there, type in the following line of code, where CDIPDemo.ContosoCDIP is the Document ID Provider class we created previously. What we are effectively doing here is creating a new instance of our Custom Document ID Provider, and setting it as the Default Provider for the current site collection.

DocumentId.SetProvider(properties.Feature.Parent as SPSite, new CDIPDemo.ContosoCDIP());

Because we are good coders, we also need to handle the case where the feature is deactivated, to ensure we remove our Custom Document ID Provider and set it back to the out-of-the-box one for the given Site Collection. To handle this scenario, uncomment the “FeatureDeactivating” method. All we need to do now is call the SetDefaultProvider method of the DocumentId class to reset the default Document ID Provider:

DocumentId.SetDefaultProvider(properties.Feature.Parent as SPSite);

At this point in time, we have a fully functional CDIP, and we are able to deploy it to our environment. In order to make sure our provider is able to leverage the Document ID Service, we first need to activate the Document ID Service feature on our site collection.
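If you prefer the command line over the Site Settings page, the activation can be sketched in PowerShell. The site URL below is a placeholder, and "DocId" is assumed to be the internal name of the out-of-the-box Document ID Service feature; the sketch lists matching features first so you can confirm the name on your own farm before activating anything.

```powershell
Add-PSSnapin Microsoft.SharePoint.PowerShell -ErrorAction SilentlyContinue

# Placeholder URL: replace with your own site collection
$siteUrl = "http://sharepoint.contoso.com/sites/records"

# List Site-scoped features whose name mentions DocId, to confirm the assumed name
Get-SPFeature -Limit All |
    Where-Object { $_.Scope -eq "Site" -and $_.DisplayName -like "*DocId*" } |
    Select-Object DisplayName, Id

# Activate the Document ID Service feature (assumed internal name: DocId)
Enable-SPFeature -Identity "DocId" -Url $siteUrl
```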

If we deploy our CDIP right now and activate its feature, we can verify that the deployment was successful by going to Site Settings > Document ID Settings.

If everything worked as expected, you should see the following two messages in red at the top of the page:
Configuration of the Document ID feature is scheduled to be completed by an automated process.
A custom Document ID provider has been configured for this Site Collection. Refer to the documentation supplied with that custom provider to learn how to configure its settings.

The first one means that the Document ID enable/disable timer job has not yet run to completely configure your new CDIP. The second message confirms that a Custom Document ID Provider has been configured for the current Site Collection, which is exactly what we were looking for.
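Rather than waiting for the schedule, the timer job can usually be started by hand. This is only a sketch: it assumes the display names of the Document ID timer jobs start with "Document ID", so it prints the matches first. Check that list before starting anything, since the filter may also catch the assignment job.

```powershell
Add-PSSnapin Microsoft.SharePoint.PowerShell -ErrorAction SilentlyContinue

# Assumption: the Document ID timer jobs have display names starting with "Document ID"
$jobs = Get-SPTimerJob | Where-Object { $_.DisplayName -like "Document ID*" }

# Review what was found before running it
$jobs | Select-Object DisplayName, Name, LastRunTime

# Queue the matching jobs for immediate execution
$jobs | Start-SPTimerJob
```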

Now that we have confirmation that our Custom Document ID Provider was successfully deployed, we can go back into Visual Studio and start playing with the code. By default, the Document ID Service relies on the Search engine to retrieve a document based on an ID. When a Document ID is assigned to a document, a hyperlink with the ID is automatically placed in the Document ID field of your libraries. When you click on it, it sends you to http:///_layouts/15/DocIdRedir.aspx?ID=. By default, the service queries the Search engine to retrieve the location of the document whose Document ID field matches the ID received through the query string. We can, however, decide to override this and come up with our own logic to retrieve a document’s URL based on its ID. Unless you have a good business rationale to change this behavior, I would recommend letting it use the Search engine to do its retrieval.

However, if you do need your own custom logic, you will need to put it in the GetDocumentUrlsById method. This method returns a string array containing the URLs of the documents that match the received ID (an array, because there can still be cases where you may want more than one document to have the same ID). In my case, I will simply have the method throw a NotImplementedException, which indicates that I have not provided any logic for this method. You do not have to worry about the method throwing an exception, because there is a way to make sure it is never called.

Remember that when we created our CDIP class, we had to override 3 methods and 1 property. Well, that property (DoCustomSearchBeforeDefaultSearch) is what controls the “retrieval” behavior of our Custom Document ID Provider. If this value is set to false, then our provider will always use the out-of-the-box search engine to retrieve the document, and the GetDocumentUrlsById method will never get called. If you set it to true, then the method will be called every time a user clicks on the Document ID hyperlink. In my case, I will leave the value as false.

The next method we need to look at is GetSampleDocumentIdText, which simply returns an example of what a Document ID generated by our CDIP should look like. For this example, I have decided that my Document IDs should be made up of the name of the current day of the week followed by a GUID. Therefore, my method will return any valid ID, for example “Wednesday - c32a6ab7-4c97-449b-a3d8-b5a82cf9eca7”. When you activate the Document ID Service on a site collection, it makes a new web part named “Find Document by ID” available under the Search category. If you drop that web part on a page, it will call this method to retrieve an example of a valid document ID and display it as a hint in the web part.

Last but not least is the GenerateDocumentId method, which is where all the magic happens. This is the method where you generate your unique ID. In my case, every generated ID should be unique because it contains a GUID (that is only in theory, as it is possible that by chance two IDs come out the same). If you wish to come up with a simpler naming convention, but still keep the IDs unique across your entire farm, you can keep a reference to the last ID that was issued in a list or in a property bag inside a given site, and increment that number every time you issue a new ID. In my case, the logic will simply be the following:

Guid guid = Guid.NewGuid();
string dayName = DateTime.Now.DayOfWeek.ToString();
return dayName + " - " + guid;

Now that all of our methods have been covered, we are ready to deploy our CDIP. Once the feature is activated, simply upload a document to any document library and take a look at the Document ID column (you will need to add it to your view). You should see the assigned ID matching the logic of your CDIP!
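If you would rather check the assigned IDs without touching the view, a short PowerShell sketch can read them straight from the library. The site URL and library title below are placeholders for your own environment; the value is read from the _dlc_DocId field, which is the internal name of the Document ID column.

```powershell
Add-PSSnapin Microsoft.SharePoint.PowerShell -ErrorAction SilentlyContinue

# Placeholders: point these at the site and library you uploaded to
$web  = Get-SPWeb "http://sharepoint.contoso.com/sites/records"
$list = $web.Lists["Documents"]

# Show each file name next to the Document ID stored in the _dlc_DocId field
$list.Items | Select-Object Name, @{ Name = "DocumentId"; Expression = { $_["_dlc_DocId"] } }

$web.Dispose()
```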

PowerShell DSC at a Glance – Frequently asked Questions (FAQ)

PowerShell Desired State Configuration (DSC) is a great way to automate the deployment of any type of environment. The DSC engine has matured a lot over the past 2 years, and is now ready for prime time in the enterprise. However, I feel this is still a component of PowerShell that is too little known in the industry. Be it because people have not yet heard of it (it was only introduced in WMF 4 after all), or because people get scared by its apparent complexity, PowerShell DSC does not yet have the exposure it should. This article aims to answer the most frequent questions I get about PowerShell DSC when I introduce its concepts to my clients. The answers are my own, and I am trying to simplify the concepts as much as possible.

What is PowerShell DSC?

PowerShell Desired State Configuration (DSC) is an engine within PowerShell version 4 and greater that allows a user to specify an end result configuration (a Desired State) for a machine to be in, and have it automatically configure itself to match that end result configuration.

How does it work?

The Windows Management Framework (WMF) 4 and greater includes a service component called the Local Configuration Manager (LCM) that is responsible for orchestrating the configuration of the machine into its Desired State. Via PowerShell, users create what I refer to as a DSC Configuration Script (which you should think of as a definition/description of what the Desired State should be), and pass it on to the LCM for it to start automating the configuration.
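To make this concrete, here is a minimal sketch of such a DSC Configuration Script. It uses the built-in File resource to declare that a folder must exist; the configuration name, the folder and the output path are all placeholders I picked for illustration.

```powershell
# Describe the Desired State: a folder named C:\Demo must exist on this machine
Configuration DemoFolder
{
    Import-DscResource -ModuleName PSDesiredStateConfiguration

    Node "localhost"
    {
        File DemoDirectory
        {
            Ensure          = "Present"
            Type            = "Directory"
            DestinationPath = "C:\Demo"
        }
    }
}

# Compile the configuration into a MOF file, then hand it to the LCM
DemoFolder -OutputPath "C:\DSC\DemoFolder"
Start-DscConfiguration -Path "C:\DSC\DemoFolder" -Wait -Verbose
```

Calling the configuration by name is what produces the MOF file; Start-DscConfiguration is what passes it to the LCM.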

Why not use a simple PowerShell script to automate the configuration?

Fair question indeed. In the past, we could simply create a .ps1 PowerShell script and execute it manually on a bunch of machines to bring them to an end result state. However, what if somebody on the machine later changes that end result state? How can we ensure the machine is always kept in the end result state then? PowerShell DSC tries to solve this by introducing the concept of Configuration Modes. The LCM can automatically detect whenever a machine has steered away from its Desired State and take action.

What type of action can the Local Configuration Manager (LCM) take if a machine steered away from its Desired State?

There are 3 types of action the LCM can take if it ever detects that the machine steered away from its Desired State:

  • ApplyOnly: which tells the LCM to simply apply the desired configuration once, and then to do nothing even if the machine steers away from its Desired State.
  • ApplyAndMonitor: which tells the LCM to apply the desired configuration once, and then, if the machine steers away from the Desired State, to simply report the discrepancies in the logs. We could then use a tool like System Center to send notifications to the IT team when a server node is no longer in the Desired State, allowing them to go and contact the users who changed it to learn more about why a specific change was made.
  • ApplyAndAutoCorrect: which tells the LCM to apply the desired configuration, and whenever it detects that the machine is no longer in its Desired State, to automatically fix it so that it becomes compliant again. In this mode, the LCM still reports all discrepancies in the logs.

How frequently does the LCM check for changes to the Desired State?

By default the LCM will check every 15 minutes to see if the machine steered away from its Desired Configuration, but this can be changed as part of your DSC Configuration Script (definition of the Desired State).
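As an illustration, both the mode and the check frequency can be set through the LocalConfigurationManager block of a configuration. This is a sketch using the WMF 4 style syntax; the configuration name and output path are placeholders.

```powershell
# Tell the LCM to auto-correct drift and to check for it every 30 minutes
Configuration LcmSettings
{
    Node "localhost"
    {
        LocalConfigurationManager
        {
            ConfigurationMode              = "ApplyAndAutoCorrect"
            ConfigurationModeFrequencyMins = 30
        }
    }
}

# Compile the meta-configuration and apply it to the local LCM
LcmSettings -OutputPath "C:\DSC\LcmSettings"
Set-DscLocalConfigurationManager -Path "C:\DSC\LcmSettings" -Verbose

# Confirm what the LCM is now using
Get-DscLocalConfigurationManager | Select-Object ConfigurationMode, ConfigurationModeFrequencyMins
```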

What is a Pull Server?

A pull server is a central repository for all of your DSC Configuration Scripts. When configured in DSC Pull mode, the LCM will ping the Pull Server on a regular basis to check whether its Desired State has changed.

How can I install a PowerShell DSC Resource?

Go to the PowerShell Gallery (PowerShellGallery.com), which is supported by Microsoft. The easiest way from there is to make sure you have the PowerShellGet module available (included in WMF 5). There are instructions on the site on how to proceed from there.
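With PowerShellGet in place, the installation itself is only a couple of lines. The sketch below uses the SharePointDSC module as the example; swap in whichever module you need, and run it from an elevated session on a machine that can reach the gallery.

```powershell
# Look the module up in the PowerShell Gallery before installing it
Find-Module -Name SharePointDSC

# Download and install it for all users on this machine
Install-Module -Name SharePointDSC

# List the DSC resources the module makes available
Get-DscResource -Module SharePointDSC | Select-Object Name
```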

What is a DSC Resource?

Think of a PowerShell DSC resource as a software component that can be configured via PowerShell DSC. For example, if you are to build a new Domain Controller using DSC, one of the components you will need to configure is Active Directory. Well, there is a DSC Resource for AD that allows you to configure things like Users, OUs, etc. There are even DSC Resources for Firefox and Chrome, in case you need to install and configure specific browsers on your machine using DSC. DSC Resources are made up of modules that each take care of a specific component within the piece of software being configured by the resource.

Alright, this is normally the point where I’ve lost all of you, so let me try to summarize this further. If we take the Active Directory (AD) example, there is an AD DSC Resource. The component of the AD DSC Resource that allows us to create new users is a module called ADUser. This module contains all the logic required to interact with a User object (CRUD).

How do Modules work internally?

Every module needs to have at least 3 methods: Get-TargetResource, Set-TargetResource, and Test-TargetResource. There is nothing preventing a module from having more than these 3 methods, but these need to exist for a module to be valid.

Get-TargetResource: This method retrieves the current state of the machine. Let me simplify the process a bit here. The method performs a scan of the current environment and stores values in a variable (let’s assume it stores them in an XML format). So in our Active Directory example, it will scan all users in AD and return the current list of users as XML. This method does nothing more than scan the current environment and return its CURRENT state.

Test-TargetResource: This method is called by the LCM every time it checks whether the machine steered away from its Desired State. The LCM already knows how the machine should be configured (its Desired State), but it needs to compare it against the Current State to see if they match. To get the Current State, the Test-TargetResource method simply makes a call to the Get-TargetResource method. If the Current State and the Desired State match, then the machine is configured properly; if they don’t, the machine steered away from its Desired Configuration. The Test-TargetResource method simply returns True or False to the LCM. If the method returns False, meaning the machine is no longer in its Desired State, then the LCM will take one of the 3 actions mentioned above (ApplyOnly, ApplyAndMonitor, or ApplyAndAutoCorrect). In the case where the LCM is configured to ApplyAndAutoCorrect, it will call the Set-TargetResource method to bring the machine back to its Desired State.

Set-TargetResource: This method is responsible for bringing the machine to its Desired State Configuration based on the DSC Configuration Script (the definition of what the Desired State is). When called, this method doesn’t care at all about the Current State. All it wants to do is bring the environment back to its Desired State. In our Active Directory example, assume the Desired State mentions that a user named “John Smith” has to exist and be part of the Domain Admins group. If someone deletes that user by mistake, this method will recreate the user when called, so that the environment is back in its Desired State.
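The three methods can be sketched in plain PowerShell. This is a toy example and not an actual resource from the gallery: it manages a text file instead of an AD user, and the file name and content are made up for illustration. The last lines mimic what the LCM does in ApplyAndAutoCorrect mode.

```powershell
# Toy resource: keeps a text file present with an exact content

function Get-TargetResource {
    [OutputType([hashtable])]
    param(
        [Parameter(Mandatory = $true)] [string] $Path
    )
    # Only observe the current state; never change anything here
    if (Test-Path -LiteralPath $Path -PathType Leaf) {
        return @{ Path = $Path; Ensure = "Present"; Content = [System.IO.File]::ReadAllText($Path) }
    }
    return @{ Path = $Path; Ensure = "Absent"; Content = $null }
}

function Test-TargetResource {
    [OutputType([bool])]
    param(
        [Parameter(Mandatory = $true)] [string] $Path,
        [Parameter(Mandatory = $true)] [string] $Content
    )
    # Compare the Current State against the Desired State
    $current = Get-TargetResource -Path $Path
    return (($current.Ensure -eq "Present") -and ($current.Content -ceq $Content))
}

function Set-TargetResource {
    param(
        [Parameter(Mandatory = $true)] [string] $Path,
        [Parameter(Mandatory = $true)] [string] $Content
    )
    # Enforce the Desired State without looking at the Current State
    [System.IO.File]::WriteAllText($Path, $Content)
}

# What the LCM does in ApplyAndAutoCorrect mode: test first, set only on drift
$demoPath = Join-Path ([System.IO.Path]::GetTempPath()) "dsc-demo.txt"
$desired  = @{ Path = $demoPath; Content = "managed by DSC" }
if (-not (Test-TargetResource @desired)) {
    Set-TargetResource @desired
}
```

Running the last block a second time does nothing, because Test-TargetResource now returns True.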

Configure your SharePoint Environment for SharePointDSC

If you have been paying attention to the SharePointDSC project and always wanted to give it a go but never had time to actually sit down and get started, this blog post may save you several hours of frustration if you ever decide to start experimenting with the project. When dealing with SharePointDSC, most of the calls performed have to impersonate the Farm Account. Because of this, PowerShell has to make several remote calls using the Invoke-Command cmdlet, and has to authenticate using CredSSP. By default, CredSSP is not authorized on Windows Server.

Trying to simply run the SharePointDSC scripts against your environment will most likely result in one of the following errors being thrown:

Connecting to the remote server <server name> failed with the following error message : The WinRM client cannot process the request. the authentication mechanism requested by the client is not supported by the server or unencrypted traffic is disabled in the service configuration.

or

Connecting to remote server <server name>failed with the following error message : The WinRM
cannot process the request. CredSSP authentication is currently disabled in the client configuration. Change the
client configuration and try the request again. CredSSP authentication must also be enabled in the server
configuration. Also, Group Policy must be edited to allow credential delegation to the target computer.

In order to fix this, we need to enable CredSSP on both the server and the client. To achieve this, run the following two lines of PowerShell, where contoso.com is your domain name:

Enable-WSManCredSSP -Role Client -DelegateComputer "*.contoso.com" -Force
Enable-WSManCredSSP -Role Server -Force
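To confirm the change took effect, you can check the CredSSP settings and try a remote call the same way SharePointDSC would. The server name below is a placeholder, and the credential prompt stands in for your Farm Account.

```powershell
# Show the current CredSSP configuration for both the client and server roles
Get-WSManCredSSP

# Placeholder server name: use the FQDN of one of your SharePoint servers
$farmAccount = Get-Credential -Message "Enter the Farm Account credentials"
Invoke-Command -ComputerName "sp01.contoso.com" -Authentication CredSSP -Credential $farmAccount -ScriptBlock { whoami }
```

If CredSSP is working, the remote call returns the Farm Account name instead of one of the errors above.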

SharePoint 2016 Feature Packs

Today at the Ignite conference in Atlanta, Microsoft shared more information about the vision for SharePoint. With SharePoint 2016, it is now possible for organizations to obtain and enable new features within their on-premises environments through the use of “Feature Packs”. In the past, we pretty much had to wait for Service Packs to be released before seeing new features make their way into the product. With Feature Packs, organizations can now activate new features directly into the on-premises product.

The first Feature Pack, scheduled to be made generally available in November of 2016, will introduce the following new features:

For IT Pros

  • Administrative logging: Allowing users to audit actions made in Central Administration;
  • MinRole Changes: Addition of new workloads to support small environments;
  • Unified Logging: Ability to combine logging from both on-premises and Office 365 environments;

For Developers

  • OneDrive API Update: One Drive API 2.0 now available on-premises (allows for interaction with Drives and Items);

For Users

  • App Launcher Custom Tiles: Ability to add custom tiles to the App Launcher (waffle icon at the top left);
  • New OneDrive for Business UX: New User Experience in OneDrive for Business, matching the one introduced in Office 365 last year;
  • Hybrid Taxonomy: Allowing term stores to be unified between on-premises environments and Office 365;

Upload Random Documents to a SharePoint Environment

In my daily job I often need to kill a SharePoint farm and build a new one from scratch to match a client’s Production environment. When doing so, I also need to get rid of any test documents I had created in the previous environment, and start creating and uploading new documents from scratch for my new SharePoint environment. While working on a demo for one of my clients, I needed to demo a SharePoint migration, and also had to give them a live presentation of how the SharePoint 2013 search results would be displayed in the new environment.

Because of this requirement, I decided that I needed an automated way to upload thousands of documents, randomly, in my environment. Most of my dev SharePoint environments are single server farms (for simplicity and because they run directly on Hyper-V on my work laptop). On these local farms, I always install the following software: Visual Studio, Office (2016 or 2013) and SharePoint Designer (yes I said it). The idea behind my automated script was that it should generate Word documents with random content in them, and upload each one to a random site somewhere in my environment, in the “Shared Documents” library that is included by default in Team Sites.

The first part of the script would have to rely on the Office Interop COM object, which means that ideally Office would have to be installed on the machine running the script. The first thing I did is create an array containing various nouns & verbs. Our script will randomly pick some of these words to populate the content of our Word documents.

$words = @("Cat", "Dog","Cow","Giraffe", "Ball", "Cube", "Sphere", "Document", "Paper", "Computer", "The", "Person", "Black", "Blue", "Red", "SharePoint", "Microsoft", "Red", "Search", "Create", "Delete", "Article", "Web", "HTML", "C#", "VB", "Microsoft", "Job", "Office", "Table", "Star", "Pen", "Pencil", "Arm", "Leg", "Doctor", "Student", "Education", "Defense", "Army", "Navy", "Boat", "Plane", "Jet", "Hockey", "Baseball", "Soccer", "Volley-Ball", "Swim", "Pool", "Lake", "Ocean", "Sea", "Bird", "Fly", "Sky", "Planet", "Space", "Pig", "Owl", "Night", "Day", "Morning", "Evening", "Delete", "Create", "Update", "Great", "Super", "Awesome", "Mark", "Check", "Report", "Minutes", "Decisions", "Technology","Software", "Hardware", "Server", "Database", "Application", "Game", "Keyboard", "Mouse", "Screen", "Tower", "Building", "Street", "Song", "Music", "Car", "Bus")
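The picking itself can be sketched as a small standalone function. New-RandomSentence is a name I made up for this illustration and is not part of the script further down; it draws one word at a time so that the same word can come up more than once.

```powershell
# Hypothetical helper: builds a "sentence" of randomly picked words
function New-RandomSentence {
    param(
        [Parameter(Mandatory = $true)] [string[]] $Words,
        [ValidateRange(1, 100000)] [int] $WordCount = 10
    )

    # Draw one word per iteration; drawing them all at once would never repeat a word
    $picked = for ($i = 0; $i -lt $WordCount; $i++) {
        $Words | Get-Random
    }

    return ($picked -join " ")
}

# Example: a 20 word sentence built from a short word list
New-RandomSentence -Words @("Cat", "Dog", "SharePoint", "Search", "Document") -WordCount 20
```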

Next I will be prompting the users to ask them how many documents they wish to randomly upload across the environment. The script will then loop as many times as specified by the user, will generate a random document on each iteration, select a random SPWeb somewhere in the environment and upload the document in its Shared Documents library. Each document will be given a GUID as filename to make sure we don’t hit any conflicts.

Add-PSSnapin Microsoft.SharePoint.PowerShell -ErrorAction SilentlyContinue

$words = @("Cat", "Dog","Cow","Giraffe", "Ball", "Cube", "Sphere", "Document", "Paper", "Computer", "The", "Person", "Black", "Blue", "Red", "SharePoint", "Microsoft", "Red", "Search", "Create", "Delete", "Article", "Web", "HTML", "C#", "VB", "Microsoft", "Job", "Office", "Table", "Star", "Pen", "Pencil", "Arm", "Leg", "Doctor", "Student", "Education", "Defense", "Army", "Navy", "Boat", "Plane", "Jet", "Hockey", "Baseball", "Soccer", "Volley-Ball", "Swim", "Pool", "Lake", "Ocean", "Sea", "Bird", "Fly", "Sky", "Planet", "Space", "Pig", "Owl", "Night", "Day", "Morning", "Evening", "Delete", "Create", "Update", "Great", "Super", "Awesome", "Mark", "Check", "Report", "Minutes", "Decisions", "Technology","Software", "Hardware", "Server", "Database", "Application", "Game", "Keyboard", "Mouse", "Screen", "Tower", "Building", "Street", "Song", "Music", "Car", "Bus")

# Read-Host returns a string, so cast the answer to an integer for the loop below
[int]$number = Read-Host "How many documents do you want to randomly upload to SharePoint?"

$allWebApps = Get-SPWebApplication

[ref]$SaveFormat = "Microsoft.Office.Interop.Word.WdSaveFormat" -as [type]
$word = New-Object -ComObject word.application
$word.visible = $false
for($i = 0; $i -lt $number; $i++)
{
    $filePath = "C:\DND\Documents\" + [guid]::NewGuid().ToString() + ".docx"
    
    $doc = $word.documents.add()

    $selection = $word.selection
    # WholeStory is a method: without the parentheses it is never actually invoked
    $result = $selection.WholeStory()
    $selection.Style = "No Spacing"

    $rndNumberOfWords = Get-Random -Minimum 1 -Maximum 1000
    $sentence = ""
    for($j = 0; $j -le $rndNumberOfWords; $j++)
    {
        $sentence += $words[(Get-Random -Minimum 0 -Maximum ($words.Length))] + " "
    }

    $selection.font.size = 14
    $selection.font.bold = 1
    $selection.typeText($sentence)

    # Save in the Open XML format so the content matches the .docx extension
    $result = $doc.SaveAs([ref]$filePath, [ref]$SaveFormat::wdFormatXMLDocument)
    $result = $doc.Close()

    # Get-Random's -Maximum is exclusive, so pass the full count to make every item eligible
    $webApps = @($allWebApps)
    $randomWebApp = $webApps[(Get-Random -Minimum 0 -Maximum $webApps.Count)]
    $randomSite = $randomWebApp.Sites[(Get-Random -Minimum 0 -Maximum $randomWebApp.Sites.Count)]
    
    $url = $randomSite.Url
    Write-Host "Uploading a random document to $url"

    $spWeb = Get-SPWeb $url 
    $docLib = $spWeb.GetFolder("Shared Documents") 
    $allFiles = $docLib.Files 

    $fileName = $filePath.Substring($filePath.LastIndexOf("\")+1) 

    $fileObject = Get-ChildItem $filePath

    $stream = $fileObject.OpenRead()
    $result = $allFiles.Add("Shared Documents/" + $fileName, $stream, $false)
    $stream.Close()
    $fileObject = $null

    $spWeb.Dispose()
    # The SPSite returned by the Sites collection also has to be disposed
    $randomSite.Dispose()
    Remove-Item $filePath -ErrorAction SilentlyContinue -Force
}
# Close the hidden Word instance so WINWORD.EXE does not linger after the script ends
$word.Quit()
$word = $null