Follow me on Twitter @AntonioMaio2

Tuesday, March 4, 2014

Notes from SPC14: SharePoint for large scale records management - hundreds of millions of documents and beyond!


Tuesday March 4, 2014 3:15-5:00pm
Speakers: Alex O'Donnell, Nishan DeSilva, Roberto Yglesias
ITPro Session 

This session covered a few very large SharePoint deployments, some of the largest in the world. I captured notes on the first example, which offers some great recommendations if you are ever dealing with SharePoint records management at very large scale.

The example was a very large bank: a global bank with global needs
  • 250k users, 370 million documents, 1.5 PB legacy content 

 Some considerations:
  • consider the cloud in everything that's done, but stay on-premises for now
  • refer to the SharePoint software boundaries and limits article to start
  • went with the Records Center site template instead of in-place records management

Crunching the Numbers based on SP2013 Boundaries and Limits recommendations
  • Content Database Size: no explicit limit for records
  • Content database items: 60 million
  • Site collections: 5K per database, 250K per farm
  • Documents: 30 million per library
  • Document versions: 400K per document
  • Security Scopes: 50K per list 
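
To get a feel for how these limits bite at the bank's scale, here is a rough back-of-the-envelope sketch in Python. It only uses the numbers from these notes (370 million documents, 60 million items per content database, 30 million documents per library). The function name is my own, and the results are lower bounds, not a sizing recommendation: a real design would leave headroom and also split by security, retention and SLA.

```python
import math

# Limits as quoted in the notes above (SP2013 boundaries and limits).
MAX_ITEMS_PER_CONTENT_DB = 60_000_000
MAX_DOCS_PER_LIBRARY = 30_000_000


def minimum_containers(total_documents: int) -> dict:
    """Return the smallest number of content databases and libraries
    that could hold total_documents without exceeding the limits.

    These are lower bounds only (hypothetical helper, not a sizing tool).
    """
    return {
        "content_databases": math.ceil(total_documents / MAX_ITEMS_PER_CONTENT_DB),
        "libraries": math.ceil(total_documents / MAX_DOCS_PER_LIBRARY),
    }


# The bank in the session had roughly 370 million documents.
print(minimum_containers(370_000_000))
# {'content_databases': 7, 'libraries': 13}
```

So even before thinking about growth, the design needs at least 7 content databases and 13 libraries, which is why the information architecture mapping further down matters so much.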

In-place versus Records Center
  • Perceived concern managing a separate center
  • Uncertainty of an active site's full lifecycle
  • Need to change retention at a granular level
  • Permissions needed on an item's records
  • Separate storage layer beneath records
  • Differing SLA for records 

  • Decided on using Records Center!

Information Architecture Mapping - very important to think this through
 

Two Content Types Selected in the End
 

Routing Records
Encouraged Model:
  • SharePoint 2013 content organizer rules allow document routing based on available metadata into libraries and folders
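
Conceptually, metadata-based routing of this kind boils down to "first matching rule wins". The Python sketch below illustrates the idea only; it is not the SharePoint content organizer API. The rule names, metadata fields, target paths and the "lower priority number wins" convention are all assumptions made up for the example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class RoutingRule:
    name: str
    priority: int      # assumption for this sketch: lower number wins
    conditions: dict   # metadata field -> required value
    target: str        # hypothetical library/folder path


def route(metadata: dict, rules: list) -> Optional[str]:
    """Return the target of the highest-priority rule whose conditions
    all match the document's metadata, or None if nothing matches."""
    for rule in sorted(rules, key=lambda r: r.priority):
        if all(metadata.get(k) == v for k, v in rule.conditions.items()):
            return rule.target
    return None


# Hypothetical rules: a specific rule first, then a catch-all for contracts.
rules = [
    RoutingRule("All contracts", 5, {"ContentType": "Contract"}, "Contracts/General"),
    RoutingRule("EMEA contracts", 1,
                {"ContentType": "Contract", "Region": "EMEA"}, "Contracts/EMEA"),
]

print(route({"ContentType": "Contract", "Region": "EMEA"}, rules))  # Contracts/EMEA
print(route({"ContentType": "Contract", "Region": "APAC"}, rules))  # Contracts/General
print(route({"ContentType": "Memo"}, rules))                        # None
```

The point of the sketch is that routing quality depends entirely on the metadata being present and consistent, which is why capturing it at provisioning time comes up again in the overview below.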

Current Model:
  • Concept of ownership defined at the site level - maps to records access
  • Permission on each site's library at provisioning 

High-Level Overview
  • On trigger, routing rules determine where the record should be stored:
    • Site provisioning captures metadata*
    • Document content type captures metadata
    • Major version publish triggers record*
    • Rules rename and move item to correct location*
    • Record repository permissions set for site owner(s)
    • Search and eDiscovery for review and legal hold 

* Denotes customization
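
To make the "major version publish triggers record" and "rename" steps a bit more concrete, here is a small Python sketch of that decision logic. It is my own illustration, not the code shown in the session: the naming scheme (site ID, file name, version) and all function names are hypothetical. The only SharePoint fact it leans on is that major versions are labelled 1.0, 2.0, and so on, while minor (draft) versions look like 0.1 or 2.3.

```python
from typing import Optional


def is_major_version(version_label: str) -> bool:
    """True for labels such as '1.0' or '2.0'; False for drafts like '0.1' or '2.3'."""
    major, _, minor = version_label.partition(".")
    return major.isdigit() and minor == "0" and int(major) >= 1


def record_name(site_id: str, file_name: str, version_label: str) -> str:
    """Build a record file name from site metadata (hypothetical naming scheme)."""
    stem, dot, ext = file_name.rpartition(".")
    if not dot:  # file name has no extension
        stem, ext = file_name, ""
    major = version_label.partition(".")[0]
    suffix = f".{ext}" if ext else ""
    return f"{site_id}_{stem}_v{major}{suffix}"


def on_publish(site_id: str, file_name: str, version_label: str) -> Optional[str]:
    """Return the record name to submit, or None if this publish is only a draft."""
    if not is_major_version(version_label):
        return None
    return record_name(site_id, file_name, version_label)


print(on_publish("HR-0042", "policy.docx", "3.0"))  # HR-0042_policy_v3.docx
print(on_publish("HR-0042", "policy.docx", "3.1"))  # None
```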

Development Core Principles
  • Did you consider configuration?
  • Use recommended extension points?
  • Consider future cloud migration
  • Use a mature release management process
    • Code review, analysis, patterns, frameworks
    • Make use of existing tooling
    • Automate and document everything
  • Performance test at extreme scale 

Relevant Customization Points
  • ItemUpdated in SPItemEventReceiver
    • SendToOfficialFile method to submit a record
  • OnSubmitFile in ICustomRouter
    • SaveFileToFinalLocation after making changes
  • ComputeExpireDate in IExpirationFormula
    • Return calculated date
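
The ComputeExpireDate extension point is essentially "given an item, return the date it expires". As a language-neutral illustration, here is a Python sketch of that kind of calculation. It is not the SharePoint interface itself: the item fields, the retention schedule and the function names are assumptions for the example, and an item without a known schedule simply gets no expiry date.

```python
from datetime import date
from typing import Optional


def add_years(start: date, years: int) -> date:
    """Add whole years to a date, mapping Feb 29 to Feb 28 when needed."""
    try:
        return start.replace(year=start.year + years)
    except ValueError:
        # start is Feb 29 and the target year is not a leap year
        return start.replace(year=start.year + years, day=28)


def compute_expire_date(item: dict, retention_years: dict) -> Optional[date]:
    """Return the calculated expiry date, or None if it cannot be determined.

    item: hypothetical metadata with 'content_type' and 'declared_on' keys.
    retention_years: hypothetical schedule mapping content type -> years.
    """
    declared = item.get("declared_on")
    years = retention_years.get(item.get("content_type"))
    if declared is None or years is None:
        return None
    return add_years(declared, years)


schedule = {"Contract": 10}  # made-up retention period for illustration
item = {"content_type": "Contract", "declared_on": date(2014, 3, 4)}
print(compute_expire_date(item, schedule))  # 2024-03-04
```

Keeping the formula a pure function of the item's metadata makes it easy to test against edge cases (leap days, missing fields) before running it across hundreds of millions of records.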

Potential End State Architecture
 

Best Practices Recommended
  • review boundaries and limits article in its entirety - some of the numbers will pull and push on each other
  • consider security, sizing and granularity to determine information architecture
  • configure before customize 

Conclusions
  • Consideration of SharePoint Boundaries and Limits
  • Records Information Architecture is as important as Active Information Architecture
  • Relevant development can meet complex requirements without additional products
