Follow me on Twitter @AntonioMaio2

Friday, January 31, 2014

Putting Metadata to Work: Developing with Metadata Using the SharePoint CSOM

SharePoint has always had fantastic built-in support for metadata, but many organizations have not yet harnessed the power of metadata to build new efficiency, productivity and security into their business.  This post is the fourth in a series that will take us on a tour of what metadata is, how organizations can take advantage of its benefits and the SharePoint 2013 features that support it.

Developing with Metadata Using the SharePoint CSOM
Last year I was assisting a friend with some SharePoint development.  It involved creating and managing SharePoint metadata columns programmatically.  As well, it involved populating data within those fields when files were uploaded to SharePoint outside of the regular SharePoint upload mechanisms.  I found at the time that some of the development approaches I had to take weren't exactly obvious or what you'd expect. 
 
So, within this article I'll discuss the basics of programmatically managing metadata columns within a SharePoint list or library using the SharePoint 2013 Client-Side Object Model (CSOM).
 
Basic Client Object Model Concepts
The SharePoint client object model allows you to build applications for SharePoint that access and interact with SharePoint data without having to install any code on the SharePoint server.  A client object model application is designed to run on a client computer, and not on the SharePoint server itself.  I have chosen to write my client object model code on the Microsoft .NET framework using C#, but you could also create a client object model application in Silverlight or JavaScript. 
 
The client object model provides classes and APIs that can access information about site collections, sites, lists/libraries and items.  An application will call the APIs to perform one of the following operations:
  • Call methods and get return values
  • Call methods with a CAML query (Collaborative Application Markup Language) and get the query results
  • Get and set properties
Once client object model APIs have been called to perform a specific task, the client object model bundles up the calls into XML and sends a request to the SharePoint server.  The SharePoint server receives the request, makes the appropriate calls to the SharePoint Server Object Model, bundles the results up into a JSON object and sends that object back to the client object model on the client computer.  The client object model then parses the returned JSON and provides the results back to the application as .NET Framework objects.
 
The client object model will bundle multiple method calls into a single request to the server in order to optimize performance and reduce network traffic.  It's important to note that as the developer you control when the SharePoint client object model sends XML requests to the server, and when it receives JSON back from the server: nothing is sent until you call ExecuteQuery().
 
Since a client object model application will run on a client computer, you do need to ensure that the user running the application has appropriate permissions to the data that the application will access.  Otherwise you will receive access denied exceptions.
 
Typically a class that is going to access the CSOM may look a little like this:

using System;
using System.Linq;
using Microsoft.SharePoint.Client;

class CSOM_Example
{

    private string _SPLocation;
    private ClientContext _context;
    private Web _rootweb;
    private Web _site;
    private List _list;

    public CSOM_Example(string strURL, string strSite, string strList)
    {
        _SPLocation = strURL;
        _context = new ClientContext(_SPLocation);     //assign the member, not a new local variable

        //Using the default authentication method means the user account that
        //the assembly is run under must have sufficient rights in SharePoint to create
        //sites, create libraries, add files and/or get and set metadata
        //Set this before the first call to ExecuteQuery
        _context.AuthenticationMode = ClientAuthenticationMode.Default;

        //the web at strURL; this is the root web when strURL is the site collection URL
        _rootweb = _context.Web;
        _context.Load(_rootweb);
        _context.ExecuteQuery();

        //Check for null, and if have a rootweb then we can get the site name passed in by title
        if (_rootweb != null)
        {
            _site = GetSiteByTitle(strSite);
        }

        //check for null if the site requested does not exist
        if (_site != null)
        {
            _list = GetListByTitle(strList);
        }

        //
        //Now we have a client context, the root web, the site and list passed in we can make
        //calls to access objects and information within the site collection, site or list
        //
        //...
        //

    }

    //
    //Helper method: get the site object by name; there are built-in methods for this
    //but I've found this a bit more versatile
    //
    private Web GetSiteByTitle(string strSite)
    {
        try
        {
            if (_rootweb != null)
            {
                var query = _context.LoadQuery(_rootweb.Webs.Where(p => p.Title == strSite));
                _context.ExecuteQuery();
                return query.FirstOrDefault();     //null if no sub-site has this title
            }
        }
        catch (Exception)
        {
            //perhaps do some logging here
            return null;
        }
        return null;
    }

    //
    //Helper method: get the list object by name; there are built-in methods for this
    //but I've found this a bit more versatile
    //
    private List GetListByTitle(string strList)
    {
        try
        {
            if (_site != null)
            {
                var query = _context.LoadQuery(_site.Lists.Where(p => p.Title == strList));
                _context.ExecuteQuery();
                return query.FirstOrDefault();     //null if no list has this title
            }
        }
        catch (Exception)
        {
            //perhaps do some logging here
            return null;
        }
        return null;
    }

    //
    //Now we have a client context, the root web, the site and list passed in we can add
    //additional methods to access objects and information within the site collection, site or list
    //
    //...
    //

}

Client object model applications usually follow this pattern: set up your context, get your root web, then make a query (like you're querying a database) or set properties (like you're setting database fields).

We now have a shell of a class that we can use to access information and objects within the site collection, site and list.  I have wrapped this code in a class with a constructor and private members that store the client context, the root web, the site and the list for the life of any instance of my CSOM_Example object.  You could just as easily call these APIs as shown here in a static method.

Now that we have the basic objects needed, let's see how we use them to interact with metadata.


Creating Metadata Fields
Let's start with some simple code that creates new metadata columns in an existing list.  We'll be using the _list member that we established by name in the class above.

    //First check _context and _list are not null before proceeding... leaving this code to you

    //Add metadata columns to our list - 4 different types of metadata columns

    //Text column type
    Field metadataField1 = _list.Fields.AddFieldAsXml(
        @"<Field Type='Text'
                 DisplayName='Address'
                 Name='Address'
                 Required='TRUE'
                 MaxLength='255'>
          </Field>",
          true,                                    //add to default list view
          AddFieldOptions.DefaultValue);

    //Choice column type
    Field metadataField2 = _list.Fields.AddFieldAsXml(
        @"<Field Type='Choice'
                 DisplayName='Classification'
                 Name='Classification'
                 Required='TRUE'
                 Format='Dropdown'>
            <Default>Public</Default>
            <CHOICES>
              <CHOICE>Public</CHOICE>
              <CHOICE>Confidential</CHOICE>
              <CHOICE>Restricted</CHOICE>
            </CHOICES>
          </Field>",
          true,                                    //add to default list view
          AddFieldOptions.DefaultValue);

    //Note column type
    Field metadataField3 = _list.Fields.AddFieldAsXml(
        @"<Field Type='Note'
                 DisplayName='Description'
                 Name='Description'
                 Required='TRUE'
                 MaxLength='500'
                 NumLines='10'>
          </Field>",
          true,                                    //add to default list view
          AddFieldOptions.DefaultValue);

    //Number column type
    Field metadataField4 = _list.Fields.AddFieldAsXml(
        @"<Field Type='Number'
                 DisplayName='Budget'
                 Name='Budget'>
          </Field>",
          true,                                    //add to default list view
          AddFieldOptions.DefaultValue);

    //nothing is created on the server until the query is executed
    _context.ExecuteQuery();

This code is intended to be added to a new method in the class.  There are 4 different column types we're setting up here, each with its own properties specified in the XML that is passed to the method AddFieldAsXml( ).  We've created a Text field, a Choice field, a Note field (which is the multi-line text field) and a Number field.  The second parameter, which we have always set to true above, is a useful one: it determines whether the metadata column is displayed as part of the default view.  For more information on SharePoint 2013 column types refer to: List Item Column Type Reference.

Note: we are passing both a DisplayName and a Name attribute when creating the column.  The Name attribute is the internal name.  A SharePoint metadata column has both, and it's important to know that when accessing column data the client object model will typically use the Internal Name.  When they're the same this isn't an issue, but read further.

Note: there are other useful properties available as well when setting up metadata columns that are not shown here.  Sometimes you may wish to create a metadata column which appears in a list's or library's default view and displays its currently set value, but you do not wish it to appear in the library settings column configuration panel, nor do you wish the user to interact with it when they Edit Properties on an item.  This is often needed when column values are only set programmatically and you would like the user to be able to view them but not edit them.  This can be done by passing in the following attributes within the XML as well:
  • ShowInEditForm='FALSE'
  • ShowInNewForm='FALSE'
  • ShowInDisplayForm='FALSE'
The XML for such a field would look something like this:

    Field metadataField4 = _list.Fields.AddFieldAsXml(
        @"<Field Type='Number'
                 DisplayName='Budget'
                 Name='Budget'
                 ShowInEditForm='FALSE'
                 ShowInNewForm='FALSE'
                 ShowInDisplayForm='FALSE'>
          </Field>",
          true,                                    //add to default list view
          AddFieldOptions.DefaultValue);
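
Hand-typed field XML is easy to get wrong (a missing bracket or comma is enough to break the call), so one option is to build the string with LINQ to XML instead.  The sketch below is only an illustration: the FieldXml class and its BuildChoiceField helper are hypothetical names of my own, and the element and attribute names simply follow the Choice column defined earlier.  It uses nothing from SharePoint itself, so it can be run and checked on any machine with .NET.

```csharp
using System;
using System.Xml.Linq;

public static class FieldXml
{
    // Hypothetical helper: builds the XML for a required Choice column
    // shown as a dropdown, with one <CHOICE> element per allowed value.
    public static string BuildChoiceField(string name, string displayName,
                                          string defaultChoice, params string[] choices)
    {
        var choicesElement = new XElement("CHOICES");
        foreach (string choice in choices)
        {
            choicesElement.Add(new XElement("CHOICE", choice));
        }

        var field = new XElement("Field",
            new XAttribute("Type", "Choice"),
            new XAttribute("DisplayName", displayName),
            new XAttribute("Name", name),
            new XAttribute("Required", "TRUE"),
            new XAttribute("Format", "Dropdown"),
            new XElement("Default", defaultChoice),
            choicesElement);

        // XElement guarantees well-formed output and escapes special characters
        return field.ToString(SaveOptions.DisableFormatting);
    }

    public static void Main()
    {
        string xml = BuildChoiceField("Classification", "Classification",
                                      "Public", "Public", "Confidential", "Restricted");
        Console.WriteLine(xml);
    }
}
```

The resulting string could then be passed as the first argument to AddFieldAsXml( ) in place of the literal XML.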


Getting List Item Metadata Fields
Next let's look at how we programmatically get metadata values for a specific list item from existing metadata columns.

    //First check _context and _list are not null before proceeding... leaving this code to you

    //get the list item for which we want to get metadata using a CAML query
    CamlQuery camlQuery = new CamlQuery();
    camlQuery.ViewXml =
        @"<View>
            <Query>
              <Where>
                <Eq>
                  <FieldRef Name='Title'/>
                  <Value Type='Text'>Fiscal2014</Value>
                </Eq>
              </Where>
            </Query>
            <RowLimit>1</RowLimit>
          </View>";

    Microsoft.SharePoint.Client.ListItemCollection listItems = _list.GetItems(camlQuery);

    //since RowLimit is set to 1 it will retrieve the first matching item; if items are not unique
    //then you can retrieve more by increasing the RowLimit value
    //if you are working with a large list make more use of the RowLimit element - you should
    //not attempt to retrieve more than 2000 items at a time; if you need to retrieve more
    //items than that, then implement a paging algorithm

    //we can specify the metadata columns to retrieve as part of the query in Load
    _context.Load(listItems, items => items.Include(
                 item => item["Address"],
                 item => item["Classification"],
                 item => item["Description"],
                 item => item["Budget"]));

    _context.ExecuteQuery();

    //let's assign the values of those metadata columns to our method variables - using a
    //loop in case we got more than 1 item returned by changing RowLimit
    string  strItemAddress;
    string  strItemClassification;
    string  strItemDescription;
    double? dBudget;

    foreach (ListItem listItem in listItems)
    {
        //field values come back as object, so convert each to the type we expect
        strItemAddress = listItem["Address"] as string;
        strItemClassification = listItem["Classification"] as string;
        strItemDescription = listItem["Description"] as string;
        dBudget = listItem["Budget"] as double?;      //Budget is a Number column - returns null or a double
    }


Again, this code is pretty straightforward.  We set up our CAML query looking for an item with a particular Title, we load the query specifying at the same time which metadata column data to retrieve with the item, we execute the query and then finally parse out the data.
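
If the Title to search for comes from user input or another system, it may be safer to build the CAML with LINQ to XML than to paste the value into a string, since characters such as & and < are then escaped for you.  This is a minimal sketch under that assumption; the CamlBuilder class and TitleEquals method are hypothetical names, and the XML structure is the same View/Query/Where/Eq shape used in this post.

```csharp
using System;
using System.Xml.Linq;

public static class CamlBuilder
{
    // Hypothetical helper: returns a CAML view that matches items whose
    // Title equals the given text, limited to rowLimit items.
    public static string TitleEquals(string title, int rowLimit)
    {
        var view = new XElement("View",
            new XElement("Query",
                new XElement("Where",
                    new XElement("Eq",
                        new XElement("FieldRef", new XAttribute("Name", "Title")),
                        new XElement("Value", new XAttribute("Type", "Text"), title)))),
            new XElement("RowLimit", rowLimit));

        return view.ToString(SaveOptions.DisableFormatting);
    }

    public static void Main()
    {
        // The ampersand is escaped automatically in the generated XML
        Console.WriteLine(TitleEquals("R&D Fiscal2014", 1));
    }
}
```

The returned string could be assigned directly to camlQuery.ViewXml.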

Note: We are specifying the metadata columns to access in the _context.Load() method using the column's Internal Name.  When the column was created, if only a Display Name was specified then SharePoint will create the Internal Name from the Display Name.  If the Display Name contained a space, then SharePoint will replace the space with _x0020_ and we must take this into account when specifying the column name in such calls.  For example, if the column's Display Name is 'Middle Name' and we let SharePoint create the Internal Name, then we must specify the column as such:

    _context.Load(listItems, items => items.Include(item => item["Middle_x0020_Name"]));

If, when creating the column, we specified an Internal Name that's different from the Display Name then we must use the Internal Name.  For example, if DisplayName='Middle Name' and Name='MiddleName' then you must use:

    _context.Load(listItems, items => items.Include(item => item["MiddleName"]));
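
The space rule is simple enough to capture in a small helper.  This sketch covers only the case described here, a space becoming _x0020_; the InternalNames class and FromDisplayName method are hypothetical names, and SharePoint may treat other characters or very long names differently, so when in doubt read the real value from the field's InternalName property rather than computing it.

```csharp
using System;

public static class InternalNames
{
    // Hypothetical helper: applies only the space rule described in the text.
    // Other characters and length limits are NOT handled here.
    public static string FromDisplayName(string displayName)
    {
        return displayName.Replace(" ", "_x0020_");
    }

    public static void Main()
    {
        Console.WriteLine(FromDisplayName("Middle Name"));   // prints Middle_x0020_Name
    }
}
```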
 


Setting List Item Metadata Fields
Finally, let's look at some simple code to update metadata fields for an existing item in an existing list.

    //First check _context and _list are not null before proceeding... leaving this code to you

    //get the list items for which we want to set metadata using a CAML query
    CamlQuery camlQuery = new CamlQuery();
    camlQuery.ViewXml =
        @"<View>
            <Query>
              <Where>
                <Eq>
                  <FieldRef Name='Title'/>
                  <Value Type='Text'>Fiscal</Value>
                </Eq>
              </Where>
            </Query>
            <RowLimit>100</RowLimit>
          </View>";

    Microsoft.SharePoint.Client.ListItemCollection listItems = _list.GetItems(camlQuery);

    //since RowLimit is set to 100 it will retrieve the first 100 matching items
    //you can retrieve more by increasing the RowLimit value
    //if you are working with a large list make more use of the RowLimit element - you should
    //not attempt to retrieve more than 2000 items at a time; if you need to retrieve more
    //items than that, then implement a paging algorithm
    _context.Load(listItems);
    _context.ExecuteQuery();

    //execute query first in order to get the items, and then execute again to update

    //for each matching list item returned set metadata values to some parameters passed in
    foreach (Microsoft.SharePoint.Client.ListItem listItem in listItems)
    {
        listItem["Address"] = strMetadataValue1;
        listItem["Classification"] = strMetadataValue2;
        listItem["Description"] = strMetadataValue3;
        listItem["Budget"] = dMetadataValue4;               //Number column - assign the double directly
        listItem["Middle_x0020_Name"] = strMetadataValue5;

        //include more metadata fields here if necessary...

        //mark this item as changed; the change is queued until ExecuteQuery is called
        listItem.Update();
    }

    //execute the query once to send the updates for all the list items retrieved
    _context.ExecuteQuery();

You can see that again we follow a similar pattern, where we set up our CAML query to get the items in question.  In this case we're getting the first 100 items that match our query.  Certainly our CAML query can get more complex and specific than this.  For more information on CAML query structure refer to the Collaborative Application Markup Language Reference.  Then we execute our query to get the items, we set our metadata field values, we update each list item and then we execute our query again.
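
The code comments in this post mention implementing a paging algorithm for large lists.  The loop itself can be sketched independently of SharePoint, as below.  Everything in it is an assumption for illustration: the Paging class, the ReadAll method and the fetchPage delegate are hypothetical, and the position token is modelled as a plain string.  In CSOM the equivalent token would be the ListItemCollectionPosition returned on the item collection, which is assigned to the CamlQuery before the next call to GetItems.

```csharp
using System;
using System.Collections.Generic;

public static class Paging
{
    // fetchPage is a hypothetical stand-in for one round trip to the server.
    // It receives the position token from the previous page (null for the
    // first page) and returns that page's items plus the token for the next
    // page (null when there are no more pages).
    public static List<T> ReadAll<T>(Func<string, Tuple<IList<T>, string>> fetchPage)
    {
        var all = new List<T>();
        string position = null;
        do
        {
            Tuple<IList<T>, string> page = fetchPage(position);
            all.AddRange(page.Item1);
            position = page.Item2;
        } while (position != null);
        return all;
    }

    public static void Main()
    {
        // Fake data source: 5 items served 2 at a time
        var source = new List<int> { 1, 2, 3, 4, 5 };
        const int pageSize = 2;

        List<int> result = ReadAll<int>(pos =>
        {
            int start = pos == null ? 0 : int.Parse(pos);
            IList<int> items = source.GetRange(start, Math.Min(pageSize, source.Count - start));
            int next = start + items.Count;
            return Tuple.Create(items, next < source.Count ? next.ToString() : null);
        });

        Console.WriteLine(string.Join(",", result));   // prints 1,2,3,4,5
    }
}
```

Keeping the loop separate from the fetch makes it easy to test the paging logic without a SharePoint server.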

There are many business applications to working with metadata programmatically within SharePoint, especially when metadata is populated from external sources or other line of business systems, or when metadata must be extracted from SharePoint and used in other external systems.  Hopefully these code examples can help you get started with integrating SharePoint metadata into your business processes.

     -Antonio

Tuesday, January 28, 2014

Putting Metadata to Work: Site Columns in SharePoint

SharePoint has always had fantastic built-in support for metadata, but many organizations have not yet harnessed the power of metadata to build new efficiency, productivity and security into their business.  This post is the third in a series that will take us on a tour of what metadata is, how organizations can take advantage of its benefits and the SharePoint 2013 features that support it.

Site Columns
Today let's look at the Site Column feature.  This feature allows you to create metadata columns at a higher scope than the library or list.  With site columns, you essentially create and manage columns similarly to the methods described in the previous post in this series, however you create and manage them for a site or site collection.  The columns created here do not ever appear on a site page, nor are they ever used by the site itself.
 
Site columns are still only used by libraries or lists.  The advantage of using site columns is that if you have a large site or site collection, with either many libraries/lists or many sub-sites, you can centralize the management (to some degree) of all metadata columns in all libraries and lists in the site or site collection.  For example, suppose you have a moderately sized site collection with 5 sub-sites, each sub-site has on average 20 libraries, and each library has a 'classification' column with 5 classification field values available.  If you create the 'classification' column on each library that's 100 metadata columns to create and manage.  The likelihood of errors or inconsistencies appearing in the 'classification' columns deployed to each library is quite high.  As well, the amount of work to create and manage the columns is also high.
 
With site columns, in this case, at the site collection level you can create 1 site column called 'classification' with the 5 field values that are available, and then add it to each library on which it is required.  You create the column only once and you manage it in 1 place.  When settings for that column need to change you simply edit the column at the site collection level, and the changes will be automatically deployed to all libraries which use the column. 
 
Note: when you create a site column, you still need to visit each library or list which requires that column and explicitly add the column from the site column list.
 
To create site columns you'll do the following:
  • Visit the site or site collection in which you wish to manage your site columns
  • Access the Site Settings page
  • Click the Site Columns link
The following page will appear and you'll see that many site columns already exist.  These are pre-created within SharePoint 2013 and are available to use on your libraries or lists. 
 
Site Column Page in SharePoint 2013
 
 
To create a new site column, simply click the Create button in the top left side of the page and you'll be taken through the same column creation pages used for list and library columns.  In order to access the site column page and create site columns for a particular site, you will need to be part of the site owners group on the site.
 
A last important note about site columns is that they can be created on any site or site collection, but the scope at which they are created is important.  If you create a site column on a site collection, the column will be available to any library or list within any sub-site in the site collection.  However, if you create a site column within a sub-site, the site column will only be available to libraries and lists within that sub-site or any sub-sites within that sub-site.  It will not be available to the site collection.
 
So, when creating site columns it's important to determine at which level you will be using those columns, either at the site collection level and downwards in the site hierarchy, or at a sub-site and downwards in the hierarchy starting from that site.
 
     -Antonio

Monday, January 27, 2014

Putting Metadata to Work: Working with SharePoint Metadata Features

SharePoint has always had fantastic built-in support for metadata, but many organizations have not yet harnessed the power of metadata to build new efficiency, productivity and security into their business.  This post is the second in a series that will take us on a tour of what metadata is, how organizations can take advantage of its benefits and the SharePoint 2013 features that support it.


Working with Metadata in SharePoint
In SharePoint when end users upload a document to a library or add an item to a list, a number of columns will appear beside the document or item – those columns are how SharePoint represents metadata fields.  Several columns are added by default, like date last modified, last modified by and file type.  Administrators or those with sufficient rights can also add custom columns for various purposes.

SharePoint 2013 Library with Standard Built-in Metadata Columns

SharePoint 2013 Library with Custom Metadata Columns


When custom columns are configured for a library, a user will be asked to fill in those metadata fields when they upload a document.  For example, if we have a library with the custom column 'Community' (or Department, or Classification or anything else) the user will be presented with the Edit Properties window asking them to fill in that column:

SharePoint 2013 Edit Properties Window

As mentioned above, metadata fields are sometimes referred to as tags, but they can also be referred to as properties.

Adding Custom Metadata Columns
Adding custom metadata columns is relatively easy.  You would typically navigate to the library or list in question, then click the 'Library' or 'List' tab in the ribbon bar and select the 'Create Column' button, which would bring up the following window:

Creating a Custom Column in SharePoint 2013
 
You would then give the column a name, select the type of column, give the column a description and select some useful settings like:
  • Must the user specify a value for the column when uploading a file
  • Must every item in the list or library have a unique value in this column
  • The maximum number of characters in the column value
  • If a value is selected by default
For a single line of text the maximum characters available is 255 characters.  If you need more characters entered for a metadata field then use a 'Multiple lines of text' column type, which allows for up to 2 GB of character data in a column value (that's 1,073,741,823 characters).  That said, although it is possible, it is not advisable to store such large amounts of text in a metadata field value as it could hamper overall performance.

As well, depending on the type of column selected, some additional options will appear.  For example, if you select a 'Choice' column which essentially is a dropdown list of choices you'll see the following control added to the page so that you can specify exactly which values are available to users for this metadata column:

Metadata Column Settings for Choice Columns in SharePoint 2013

Finally, you may also specify a validation formula for the metadata values entered.  Toward the bottom of this window is a +Column Validation section that can be revealed.  It provides the following options:

Validation Options for Metadata Columns in SharePoint 2013

This allows you to specify a formula that is used to validate the data entered whenever a new metadata field is populated for an item in the library or list.  This can be particularly useful when entering numerical data, in order to apply some checks to the data entered.  You can even specify the message shown to the user if they enter a value that doesn't pass validation.  Some great help is provided on the link shown in this section regarding the syntax to be used in specifying a formula.

Alternatively, to add a new column or edit an existing one, you can click 'Library Settings' on the SharePoint ribbon.  A list of all existing columns will be displayed.  From here you may select an existing column to edit or add a new column, which will take you through similar windows to those shown above.

Library Settings Page with Metadata Columns

Some common metadata functions performed in this page:
  • Simply click the name of a metadata column to edit its settings
  • Add a new column (takes you to the same process as above)
  • Change the order in which columns appear
  • Add columns from a list of site columns (read further below to learn about site columns)
Again, accessing and modifying these settings is pretty easy.  It's just a matter of working through the various options available, and Microsoft has provided a lot of control over these particular settings.


Permissions to Add or Modify SharePoint Metadata Columns
In order to configure metadata fields on a SharePoint library a user must have the ‘Manage Lists’ permission on that library.  The ‘Manage Lists’ permission depends on also having the following permissions on the library: Manage Personal Views, Open, View Pages and View Items.  These permissions are by default given as part of the ‘Design’ and ‘Full Control’ permission levels, so if you have these permission levels on the library you should be able to add, remove or edit columns (or metadata fields) on a library.

Of course, a ‘site collection administrator’ or members of a ‘site owners’ group are able to perform such configuration by default.

Limits on Metadata Columns
There are certain limits on the number of metadata columns that you can configure for each SharePoint list or library.  These are not in fact limits, but rather thresholds, meaning you can surpass these numbers but depending on your SharePoint architecture and infrastructure you may or may not see adverse effects.  With the built-in SQL Server configuration for SharePoint 2013, the following numbers of columns of each type are permitted within a SharePoint list or library:

Column Type               Number of Columns Threshold
Single line of text       276
Multiple lines of text    192
Choice                    276
Number                    72
Currency                  72
Date and time             48
Lookup                    96
Yes/No                    96
Person or Group           96
Hyperlink or picture      138
Calculated                48
GUID                      6
Int                       96
Managed Metadata          94

Each column once created will take up a certain number of bytes and this will vary based on the column type.  The threshold shown ultimately comes down to the fact that the sum of all columns in a SharePoint list cannot exceed 8,000 bytes.  Depending on column usage or which types of columns are created, users can reach the 8,000 byte limitation before other limitations are exceeded.  For a full explanation of how these limits are calculated, please visit the Microsoft SharePoint 2013 Software Boundaries and Limits page and read the section on Column Limits.
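
To make the 8,000 byte budget concrete, here is a rough sketch of the arithmetic.  The RowBudget class is a hypothetical name, and the per-column byte sizes passed in (28 bytes for a single line of text, 12 bytes for a number) are assumptions for illustration only; take the real figures from the Column Limits section of the Microsoft page before relying on the result.

```csharp
using System;
using System.Collections.Generic;

public static class RowBudget
{
    public const int RowLimitBytes = 8000;   // total from the text above

    // Sums (number of columns x bytes per column) across all column types.
    public static int TotalBytes(IDictionary<string, int> columnCounts,
                                 IDictionary<string, int> bytesPerColumn)
    {
        int total = 0;
        foreach (KeyValuePair<string, int> entry in columnCounts)
        {
            total += entry.Value * bytesPerColumn[entry.Key];
        }
        return total;
    }

    public static void Main()
    {
        // ASSUMED sizes, for illustration only
        var bytesPerColumn = new Dictionary<string, int>
        {
            { "Single line of text", 28 },
            { "Number", 12 }
        };
        var columnCounts = new Dictionary<string, int>
        {
            { "Single line of text", 100 },
            { "Number", 50 }
        };

        int used = TotalBytes(columnCounts, bytesPerColumn);
        Console.WriteLine("Used: " + used + " of " + RowLimitBytes + " bytes");
        Console.WriteLine("Remaining: " + (RowLimitBytes - used) + " bytes");
    }
}
```

With these assumed sizes, 100 text columns and 50 number columns would use 100 x 28 + 50 x 12 = 3,400 bytes, leaving 4,600 bytes for other columns.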

Auto-Populating SharePoint Metadata: Property Promotion
When adding Microsoft Office documents like Word, Excel and PowerPoint to SharePoint, metadata that is stored within those files can automatically be added to SharePoint metadata columns.  This occurs for the default metadata columns mentioned, but it can also occur for custom metadata columns in SharePoint. 

If a metadata field stored within a Microsoft Office file has a field name which exactly matches a SharePoint metadata column that is already configured on the library to which it is being added, then the value of the metadata field in the document will be automatically populated in the corresponding SharePoint metadata column.  This feature is called Property Promotion.  There are a few caveats to this feature though:

  • It only works with certain Microsoft Office formats, namely Word, Excel and PowerPoint.  This does not work for PDF files or other formats.
  • SharePoint only checks the ‘Document Properties’ fields (docProps section) within these Microsoft Office file formats.  It does not check any of the Open XML Document format fields (customXML section).  If you have 3rd party solutions storing metadata in MS Office files, ensure that the metadata is stored in the correct section of the file.
  • The SharePoint metadata column must have its name configured to exactly match the metadata field within the file, and this must be done when the column is created. You cannot take an existing metadata column and rename it to match the metadata field.  If you make a typo, you must delete the column and recreate it.
  • This feature does not work if Information Rights Management is configured on the library.

Often when working with Microsoft Office files metadata will automatically be added to the file, like the name of the document’s author, language or creation date.  Third party solutions integrated with Microsoft Office can also add metadata to our documents automatically.  This is a great feature for automatically populating metadata within SharePoint for the most common file formats that we work with.


Conclusion
This post has covered how to work with metadata using several of the basic built-in metadata features within SharePoint.  They are not complex, but for those new to SharePoint many of the concepts are new. 
 
There are several other features within SharePoint that allow you to work with metadata in more advanced ways; features like:
  • Site Columns
  • Managed Metadata Service
  • Metadata Navigation 
These features will be covered in future posts.  For now, I hope you enjoy working with metadata within SharePoint using the features described here.
 
     -Antonio
 

Tuesday, January 14, 2014

Putting Metadata to Work: An Introduction

SharePoint has always had fantastic built-in support for metadata, but many organizations have not yet harnessed the power of metadata to build new efficiency, productivity and security into their business.  This article and a few posts that follow will take us on a tour of what metadata is, how organizations can take advantage of its benefits and the SharePoint features that support it.

An Introduction to Metadata
In simplest terms metadata is structured information about our data.  If you are looking at a set of documents the ‘last modified’ date for each document is a form of metadata.  As well, the ‘author’ of each document is also a form of metadata.  One way in which we use these 2 types of metadata is when we want to sort that list of documents by the ‘last modified’ date in order to find the most recently updated document.  We could also sort by author in order to find a document written by a particular person.  We can see these fields available in the Windows Explorer.


Metadata fields in this form are sometimes referred to as Tags.  The ‘last modified date’ and ‘author’ are typically filled in automatically by the applications we use to edit those documents as is the case with Microsoft Word.  The values for other metadata fields can be selected by end users – for example, a ‘department’ may need to be filled by end users when saving a document to identify the department responsible for the document.


When users interact with documents and their metadata within SharePoint, for example saving a document or adding a document to SharePoint, they are typically presented with a limited set of values to select from for each metadata field.  This ensures consistency in how metadata fields are specified and simplifies the process for end users making it more likely that they will select the correct values. 

Creating a well-defined set of metadata fields and values for the business is often referred to as creating a Metadata Taxonomy.  Creating a corporate metadata taxonomy can be an extremely important task for many organizations, and it can turn out to be a simple or very complicated process which involves many stakeholders.  We’ll look at this process in a future post.
Other forms of metadata can also serve other purposes, for example to enhance security or business process.

Persistent Metadata
An important concept when working with metadata is the idea of Persistent Metadata.  Persistent metadata refers to metadata which is stored within the files or information objects to which they refer.  For example, if I start a new Microsoft Word document, when I save it my name (as configured on the Windows system) is stored within the document as the author of that document, as is the current date/time as the document’s creation date. 
I can typically see the persistent metadata within a file by right-clicking the file and accessing the document’s properties.  This will show me the document’s persistent metadata:
 


Persistent metadata is important because it allows metadata to travel with the document so that no matter how I distribute or transmit the document the metadata travels with it.  This has benefits from a security perspective because most network or gateway security systems can scan document metadata as documents pass through the network (via email, HTTP or FTP).  Those systems can perform simple validations to determine if documents are being transmitted inappropriately or against corporate policy.

There are several standardized formats or protocols for storing persistent metadata within information objects:
 
  • Document Properties within legacy Microsoft Office document formats like .doc, .xls, .ppt.
  • Open XML file properties (docProps) and custom XML parts (customXml) within current Microsoft Office document formats like .docx, .xlsx, .pptx.
  • Keywords field for simple name/value pairs within PDF files
  • XMP section within PDF or PDF/A files
  • Dublin Core for standardized metadata schema elements
  • Persistent metadata within emails is often stored as an xHeader

There are several more for specialized security and interoperability purposes.
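
To make the idea of persistent metadata a little more concrete, here is a minimal sketch in Python of reading the author and creation date that travel inside an Open XML file.  It relies on the fact that a .docx, .xlsx or .pptx file is a ZIP package whose core properties live in a part named docProps/core.xml, described using Dublin Core elements.  The function names and the in-memory sample package are my own assumptions for illustration; with a real document you would pass its file path instead.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

# XML namespaces used by the core properties part of an Open XML package.
NS = {
    "cp": "http://schemas.openxmlformats.org/package/2006/metadata/core-properties",
    "dc": "http://purl.org/dc/elements/1.1/",
    "dcterms": "http://purl.org/dc/terms/",
}


def read_core_properties(package):
    """Return the author and creation date stored inside an Open XML file.

    `package` may be a file path or a file-like object.
    """
    with zipfile.ZipFile(package) as zf:
        root = ET.fromstring(zf.read("docProps/core.xml"))
    return {
        "author": root.findtext("dc:creator", namespaces=NS),
        "created": root.findtext("dcterms:created", namespaces=NS),
    }


def build_sample_package():
    """Build a tiny in-memory stand-in for a .docx (hypothetical sample data)."""
    core_xml = (
        '<?xml version="1.0" encoding="UTF-8"?>'
        '<cp:coreProperties'
        ' xmlns:cp="http://schemas.openxmlformats.org/package/2006/metadata/core-properties"'
        ' xmlns:dc="http://purl.org/dc/elements/1.1/"'
        ' xmlns:dcterms="http://purl.org/dc/terms/"'
        ' xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">'
        '<dc:creator>Sample Author</dc:creator>'
        '<dcterms:created xsi:type="dcterms:W3CDTF">2014-01-14T09:00:00Z</dcterms:created>'
        '</cp:coreProperties>'
    )
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as zf:
        zf.writestr("docProps/core.xml", core_xml)
    buffer.seek(0)
    return buffer


props = read_core_properties(build_sample_package())
print(props)
```

Because the values are read straight out of the file, the same call gives the same answer wherever the document has been copied or emailed, which is exactly what makes the metadata persistent.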

Metadata which is not persistent is typically stored outside the file in a database of some kind.  For example, document metadata within SharePoint is stored by default in the SQL Server database upon which SharePoint sits.  Metadata stored within a database also has many benefits in terms of making content easily searchable.  This allows end users to more easily find the content they need, and it can assist auditors during eDiscovery processes.


Metadata Improves Information Security
In recent years many organizations have begun to attach one or more ‘classification’ metadata fields to their documents to identify the sensitivity of the information within the document.  Along with the sensitivity you often see a ‘community’ metadata field which helps identify the intended audience for that information.
 
Often the available values for a ‘classification’ metadata field are limited to a small number of terms that make sense to those end users that will be selecting the sensitivity classification for a particular document – terms like:
 
  • Public
  • Internal Distribution Only
  • Confidential
  • Highly Confidential
  • Restricted
  • Legal Restricted
  • Secret
  • Top Secret 
 
Not all of these classification terms may make sense to your business, but some likely will.  It’s important to only use the terms that make sense to you and your end users.  Often employees need to be trained on which term to use in specific cases, and on what the company policy is for classifying corporate information.
 
From a security perspective, the purpose for using a classification or sensitivity metadata field is typically related to controlling distribution of the information and ensuring corporate information is only viewed or accessed by those that are permitted to access it.  If we look at the metadata term list above, some identify the sensitivity of the data (ex. Confidential, Highly Confidential) while others define who should have access to the data (ex. Public, Internal Distribution Only, Legal Restricted).  It’s very tempting to combine terms from both sets into the same metadata field because they ultimately serve the same purpose (control distribution or access) but that can often cause confusion for end users – when do I use ‘Internal Distribution Only’ as opposed to ‘Confidential’?  Often, when several classification terms overlap in meaning as in the case here, organizations will use a ‘community’ metadata field to separate the concept of sensitivity of the information from the distribution of the information.
 

 
 
There are 2 main security benefits of applying ‘classification’ metadata to identify sensitivity of your information:
 
  • When end users access or receive information the ‘classification’ metadata can educate them on how to handle that information or how to control its distribution,
  • Automated policy systems can enforce access control policies based on that metadata
 
Of course, for these benefits to be realized other systems need to be in place.  End users must look at a document’s metadata, or some other system needs to be in place to add security markings to documents based on that metadata.  As well, one or more policy systems will need to be in place in order to take advantage of that metadata to control access or distribution.  SharePoint 2013 out of the box does not contain systems that would automate these processes, but the first step in securing such sensitive information should be to add metadata fields to SharePoint lists and libraries so that you can start capturing valuable metadata.
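
As a rough illustration of how an automated policy system might use separate ‘classification’ and ‘community’ fields, here is a small Python sketch.  It is not a SharePoint feature or API; the class names, the ordering of the sensitivity terms and the idea of a user ‘clearance’ are all assumptions I have made for the example.

```python
from dataclasses import dataclass

# Sensitivity terms ordered from least to most sensitive (ordering is assumed).
SENSITIVITY_LEVELS = ["Confidential", "Highly Confidential"]

# Community value that places no restriction on the audience (assumed).
OPEN_COMMUNITY = "Public"


@dataclass(frozen=True)
class DocumentTags:
    classification: str  # how sensitive the information is
    community: str       # who the information is intended for


@dataclass(frozen=True)
class User:
    clearance: str          # highest sensitivity level the user may see
    communities: frozenset  # communities the user belongs to


def can_access(user, tags):
    """Allow access only if both the sensitivity and the audience checks pass."""
    cleared = (SENSITIVITY_LEVELS.index(user.clearance)
               >= SENSITIVITY_LEVELS.index(tags.classification))
    in_audience = (tags.community == OPEN_COMMUNITY
                   or tags.community in user.communities)
    return cleared and in_audience


# Hypothetical users and documents.
employee = User("Confidential", frozenset({"Internal Distribution Only"}))
counsel = User("Highly Confidential",
               frozenset({"Internal Distribution Only", "Legal Restricted"}))
memo = DocumentTags("Confidential", "Internal Distribution Only")
contract = DocumentTags("Highly Confidential", "Legal Restricted")

print(can_access(employee, memo))
print(can_access(employee, contract))
print(can_access(counsel, contract))
```

Keeping the two fields separate means each check answers one question only: how sensitive is this, and who is it for.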
 
 
Metadata Improves Business Productivity
 
Business Workflows
In many business processes information objects or documents need to move through specific workflows, typically moving from one person to another.  As they move through that process the state of a document can change to show that one stage of the process is complete, and another is ready to begin.  This is pretty basic and already occurs in most businesses in one form or another.  The ‘state’ in this case can be viewed as document metadata.
 
When working with large numbers of documents, large user communities or many processes, metadata can provide some great benefits to our business by allowing us to streamline processes and target them directly at specific documents depending on the nature of the information or on its current state.  For example, many organizations will store ‘department’ and ‘status’ metadata fields with documents which must go through specific approvals; then, depending on the values of the ‘department’ and ‘status’ fields, an approval task will be automatically assigned to a specific manager.
 
This is pretty common, but we can see how it alleviates a user’s need to select the appropriate manager to approve things like expense reports, travel requisitions, budgets, etc.  It can also route tasks more accurately to the appropriate people, helping to avoid user error.  This can get a little more complex by also looking at the amount of an expense report (another piece of metadata) and if the total is over a certain amount then automatically route the approval task to a more senior level manager.
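
The routing rule described above can be sketched in a few lines of Python.  This is only an illustration of the decision logic, not how a SharePoint workflow is actually built; the approver names, the ‘Submitted’ status value and the threshold amount are hypothetical.

```python
# Hypothetical routing table: department -> (manager, senior manager).
APPROVERS = {
    "Finance": ("finance.manager", "finance.director"),
    "Sales": ("sales.manager", "sales.director"),
}

# Assumed amount above which a more senior manager must approve.
SENIOR_APPROVAL_THRESHOLD = 5000


def assign_approver(metadata):
    """Pick the approver for a document based only on its metadata."""
    if metadata["status"] != "Submitted":
        return None  # nothing to approve at this stage
    manager, senior_manager = APPROVERS[metadata["department"]]
    if metadata.get("amount", 0) > SENIOR_APPROVAL_THRESHOLD:
        return senior_manager
    return manager


small_expense = {"department": "Finance", "status": "Submitted", "amount": 320}
large_expense = {"department": "Finance", "status": "Submitted", "amount": 12000}
draft_budget = {"department": "Sales", "status": "Draft"}

print(assign_approver(small_expense))
print(assign_approver(large_expense))
print(assign_approver(draft_budget))
```

The user submitting the document never picks a manager; the metadata makes that choice for them.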
 
Another example that’s often seen is the implementation of a workflow which automatically moves a document from one site to another when the document is ready to be archived.  This type of workflow would be based on several pieces of metadata including:
 
  • Status (if the document is approved, published or in some other state identifying completion)
  • Date Last Modified (if a document has not been modified for a long time)
  • Department (some departments may not ever want content to be archived)
  • Retention Period (depending on the nature of the information and compliance laws it may require being retained for specific periods of time)
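
A minimal Python sketch of that archiving decision might look like the following.  The status values, the exempt department and the one-year inactivity window are assumptions for illustration, and I have left the retention period out because how it applies depends on your compliance rules.

```python
from datetime import date, timedelta

COMPLETED_STATUSES = {"Approved", "Published"}  # assumed "complete" states
NEVER_ARCHIVE_DEPARTMENTS = {"Legal"}           # hypothetical exemption
INACTIVITY_WINDOW = timedelta(days=365)         # assumed "long time"


def should_archive(metadata, today):
    """Decide whether a document is ready to move to the archive site."""
    if metadata["department"] in NEVER_ARCHIVE_DEPARTMENTS:
        return False
    if metadata["status"] not in COMPLETED_STATUSES:
        return False
    return today - metadata["last_modified"] >= INACTIVITY_WINDOW


today = date(2014, 1, 31)
old_report = {"department": "Finance", "status": "Approved",
              "last_modified": date(2012, 6, 1)}
recent_report = {"department": "Finance", "status": "Approved",
                 "last_modified": date(2013, 12, 15)}
legal_brief = {"department": "Legal", "status": "Published",
               "last_modified": date(2010, 1, 1)}

print(should_archive(old_report, today))
print(should_archive(recent_report, today))
print(should_archive(legal_brief, today))
```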
 
SharePoint builds in great functionality and flexibility to design and implement workflows.  Those workflows can take advantage of metadata fields within SharePoint to achieve the scenarios described above, as well as many others.  Workflows in SharePoint typically take advantage of metadata through Content Types.  Some great information from Microsoft on planning and implementing content types and workflows can be found here:  http://technet.microsoft.com/en-US/library/ff607870(v=office.14).aspx.
 
Making Content More Searchable
SharePoint 2013 has made great advancements in its ability to search large amounts of content.  It now provides a very flexible and very robust enterprise search application.  Deployment of SharePoint 2013 search can be fairly complex, requiring specific planning and in-depth knowledge.  The built-in search capability can take advantage of SharePoint metadata and provide users with additional structured data to use in finding the content they need.  As well, it can allow end users to refine their search queries and narrow in more quickly on the content they’re looking for.
 
An example of such a search could be if an auditor wishes to find all content from the Finance department which has been approved by a specific manager that is currently under investigation.  To make this a little more complicated let’s consider looking for all content from the Finance department, which is an expense report, approved by a specific manager, which is over a certain amount of money and which was approved between two specific dates.  If these values were all stored as SharePoint metadata with each document, such a search would be rather trivial.  We can find many such examples where metadata can provide the benefit of making content more searchable.
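
To show why such a search becomes trivial once the values are stored as metadata, here is a small Python sketch of the auditor’s query run over an in-memory list.  It does not use the SharePoint search API; the field names and sample documents are hypothetical.

```python
from datetime import date

# Hypothetical documents with metadata stored alongside each one.
documents = [
    {"name": "Q3-travel.xlsx", "department": "Finance",
     "content_type": "Expense Report", "approved_by": "j.smith",
     "amount": 7200, "approved_on": date(2013, 9, 12)},
    {"name": "Q3-supplies.xlsx", "department": "Finance",
     "content_type": "Expense Report", "approved_by": "j.smith",
     "amount": 150, "approved_on": date(2013, 9, 20)},
    {"name": "Q4-travel.xlsx", "department": "Finance",
     "content_type": "Expense Report", "approved_by": "a.jones",
     "amount": 9100, "approved_on": date(2013, 11, 3)},
    {"name": "Launch-plan.docx", "department": "Marketing",
     "content_type": "Plan", "approved_by": "j.smith",
     "amount": 0, "approved_on": date(2013, 9, 15)},
]


def find_expense_reports(docs, department, approver, min_amount, start, end):
    """Return names of expense reports matching every metadata criterion."""
    return [
        d["name"] for d in docs
        if d["department"] == department
        and d["content_type"] == "Expense Report"
        and d["approved_by"] == approver
        and d["amount"] > min_amount
        and start <= d["approved_on"] <= end
    ]


results = find_expense_reports(documents, "Finance", "j.smith", 5000,
                               date(2013, 7, 1), date(2013, 9, 30))
print(results)
```

Each condition is a simple comparison against a structured field, which is something a full-text keyword search cannot do reliably.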
 
That said, by default SharePoint 2013 Search does not use metadata as part of its search index.  Some configuration is needed to have metadata included as part of your search query.  When searching for content SharePoint must first crawl the content to build up an index of the content available - this is done prior to end users performing a search.  When crawling content, several different types of properties within each document are examined by the search engine.  For example the search engine can extract keywords from content – these are called ‘crawled properties’.  Only those properties that have been pre-configured as ‘managed properties’ are used as part of the search index.  This can be a complex task that could be an article in itself - for more information on configuring SharePoint 2013 search to use managed properties and metadata refer to: http://technet.microsoft.com/en-us/library/jj219667.aspx.
 
  

Saturday, January 11, 2014

SharePoint 2013 Encouraging Users to Share More

 
Configuring SharePoint Permissions in SharePoint 2013
One noticeable change related to permissions in SharePoint 2013 is how “power users” gain access to the controls for setting permissions on SharePoint content. If given sufficient rights, in previous versions of SharePoint a user could access the “Manage Permissions” window for a particular SharePoint item by activating the menu on an item and selecting “Manage Permissions”. Within the Manage Permissions window, the “power user” could allow access to other users, remove user access, and either break inheritance or re-inherit permissions from their parent. In SharePoint 2013, the Manage Permissions window is still available; however, it’s much more difficult to find. As well, new user interface elements have been added to allow users to more easily specify the permissions for an item or a document.

The first noticeable interface change occurs when a user clicks the ellipsis (…) beside an item. A new dialog, or “Share Window”, is presented to the user, containing a “SHARE” button and a “Shared with lots of people” link.


Figure 3. Accessing SharePoint sharing capabilities.

Clicking the “SHARE” button displays a window from which other users can be granted access to the item or document:


Figure 4. Users can invite people to access a document they want to share.

Users may share the document with others by specifying user names, email addresses or ‘Everyone’ in this window. Users can even select if the invited people can edit the document or if they can simply view the document. The “Can edit” and “Can view” options in the dropdown list relate to specific permission levels. Adding users to this window and clicking “Share” will automatically break permission inheritance on the document (if it is not already broken) and create unique permissions on the document, thereby creating a unique security scope if necessary. This is an important point because it makes it much easier for end users to break inheritance and create additional security scopes in SharePoint. While 50,000 security scopes is a high number, for high user access sites this could become an issue.

In Figure 3 above, you will notice that the “Share Window” generated by the ellipsis (…) also contains another ellipsis. Selecting this second ellipsis (…) in the “Share Window” brings up a secondary menu:



Figure 5. Accessing the Shared With menu.

Clicking the “Shared With” menu item (or the “lots of people” link) displays the window in Figure 6 below, which allows users to view the audience that the document has been shared with. If the document has been shared with any SharePoint groups, the members of those groups will be individually displayed here (as opposed to the group). From this window, users may invite additional users or email everyone with access to the document.


Figure 6. Viewing the audience that a document has been shared with.

Clicking the “ADVANCED” button in Figure 6 will display the traditional “Manage Permissions” window (Figure 7), which still looks very much like that of SharePoint 2010. From this window, given sufficient rights, users can view a document’s permissions, break inheritance, re-inherit permissions, add permissions and remove permissions. Users can even check if a specific user or group has appropriate permissions to access the document.


Figure 7. The Manage Permissions window.


As mentioned, these are new user interface elements that have been added to allow users to more easily specify the permissions for an item or a document, and thereby more easily share information at a fine-grained level.

This is a fundamental change by Microsoft! It means that more focus has been placed on encouraging users to share content than on restricting access to content. The traditional Manage Permissions window has always implied that users restrict permissions. As well, it means that Microsoft is willing to allow users to more easily create unique permissions on individual documents and items.  This tells us that Microsoft has more confidence in SharePoint 2013's ability to manage unique permissions on individual items.  This is very much in line with the increases in unique permissions that SharePoint 2013 supports from a performance perspective (see the following article: Updated SharePoint 2013 Software Boundaries and Limits: Unique Permissions).  All of this is a very good thing!

Happy sharing!
   -Antonio