With the global economy in the toilet (washroom, WC, commode, crapper, bidet, the john), companies are looking to save money. (I still miss all the controls on my toilet in Japan: the spray vs. the bidet, and the warm-to-hot water controls. Why are all toilets pretty much the same in the US? Do we need all these options... Maybe not in every bathroom.)
If your SharePoint deployment was an ice cream stand, how many flavors would you be offering? Is each customer a custom hand-holding experience, or do you have self service? Sure, customers will spill every once in a while (especially the untrained and inexperienced), but you don't have to be involved with every purchase, and you can focus on your other, more proactive duties. Would you pick up a couple of well-known brands, or go with 31 Flavors? Well, you're operating on a shoestring budget, and those flavors will definitely melt away if you can't figure out how to manage the one or two best-selling flavors. You must achieve economies of scale or the other dessert vendors will put you out of business. Maybe you go with the prepackaged option: you simply provide the freezer and someone else does all the work. You get the value, and the choices are good, but limited. (Maybe you go with a prepackaged offering like SharePoint Online.)
I was in a consulting engagement recently when someone piped up and said they wanted to save money. They asked what business benefit they would get out of splitting the site collections and databases, and then said they didn't want to use quotas. While it was hard at first to articulate the business value of splitting databases and sites, I came to realize that the value is in performance and consistency. It's about achieving economies of scale. It's about getting to the SharePoint Rockstar Dream state of proactive bliss, where the SharePoint ops team is focused on consulting and on addressing business process, scorecard, integration and dashboarding scenarios, rather than chasing down perf and locking issues and taking days or weeks on brain-dead restores. Is it the latency, or is it the information architecture and poor site design?
If you've done much SharePoint administration, you'll realize that the sooner you can turn your site collections into repeatable, sustainable objects with consistency and standardized administration, the sooner you'll be able to achieve economies of scale.
If all you are doing is administering one .com publishing site, some of these same principles apply, but understand the challenges around information architecture, sizing, and planning.
The assembly line helped Ford beat out the competition and push out more cars than anyone could have dreamed of in the early 20th century. When it first began, many thought of it as a disaster. What about the personal touch? I don't want black, it's ugly. If he had listened, the industrial age would have taken longer, and we'd be set back. Did they realize cost savings... You bet! The more options you have, the more expensive it gets. Why is it so hard to realize this in IT? We're afraid to push back, and when the budget gets cut, we try to do more with less. Why not work smarter rather than harder? Can we not figure out what is core, do those things well, and build sustainable services and platforms with consistency? Shouldn't all IT companies look at the hosting models and figure out how they can do the same? How do you host Exchange, AD, and SharePoint inside your company? What does your service look like...
Maybe Ford and other car makers today need to take a lesson from the past: build fewer types of cars, focus on those that meet 80% of what people want, need, and demand, and optimize for fuel efficiency and cost. (Can you really commoditize the 60's Ford Mustang GT convertible for me and make it electric or run on hydro?) Remember the cars that require you to order them months in advance? Why not make more of those, except make them look good, rinse and repeat.
In SharePoint the way to achieve low Total Cost of Ownership or TCO is to figure out what is manageable and to figure out supportability around what scales.
First off, why doesn't the product group give you these formulas and all the answers? Well, it's because all of these answers "depend." The defaults in the box are not appropriate for all environments.
With Site Collections it is always a tough question to ask: what is the max size? I've heard of a Site Collection that grew to nearly 1TB. The company was complaining about product scalability and issues with security and ACLs, and they found SQL blocking and locking issues (when they started monitoring for them). First it was the nightly STSADM backups, which they ran so they could avoid ever having to restore that 1TB database. It was taking more than 72 hours to back up the site collections before the job would start over. Performance was horrible. Backups were running 24 hours a day, and both tape and SQL backups were taking forever. How could SharePoint scale if they couldn't even support 1TB? Add indexing on top of this, and the SQL servers were running out of memory all the time (another thing the SQL team wasn't watching for).
After a number of outages they decided they needed to do something. First they decided that granular restores would be much easier if the databases weren't spanning multiple tapes. Other companies found solutions such as Quest Recovery Manager (just saw a demo this week, still fresh in my mind), which integrated with their backup solution to provide granular restore from the databases on disk without needing a recovery farm, or they leveraged DPM or something like AvePoint DocAve for granular restore. They determined that if they split their database into multiple content databases, they could run a multi-threaded backup (about 6 threads max) and speed up backups by more than 3X. They also found that having hundreds of databases was a pain, so keeping the databases large and not too small had its benefits as well.
Sounds a lot like Goldilocks. Not too large, not too small... Just right... So what is just right? In our years of debating this question we ended up at 50-100GB: easy enough to fit into a 4 hour window, and small enough to copy over the WAN if they had to.
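To make the "fits into a 4 hour window" reasoning concrete, here is a rough back-of-the-envelope sketch. The 10 MB/s sustained throughput is purely an assumed number for illustration (measure your own), the function names are made up, and the parallel estimate optimistically assumes every backup thread gets full throughput, which real disks and networks won't give you.

```python
def backup_hours(size_gb: float, throughput_mb_per_s: float) -> float:
    """Hours needed to stream size_gb at a sustained throughput in MB/s."""
    return (size_gb * 1024) / throughput_mb_per_s / 3600


def parallel_backup_hours(db_sizes_gb, throughput_mb_per_s, threads=6):
    """Optimistic estimate for backing up several databases in parallel.

    Databases are handed out largest-first to the least-loaded thread.
    Assumption: each thread sustains the full throughput on its own.
    """
    loads = [0.0] * threads
    for size in sorted(db_sizes_gb, reverse=True):
        least_loaded = loads.index(min(loads))
        loads[least_loaded] += size
    # The whole job finishes when the busiest thread finishes.
    return backup_hours(max(loads), throughput_mb_per_s)


if __name__ == "__main__":
    assumed_throughput = 10  # MB/s, hypothetical
    print(f"One 100GB database: {backup_hours(100, assumed_throughput):.1f} h")
    print(f"One 1000GB database: {backup_hours(1000, assumed_throughput):.1f} h")
    split = parallel_backup_hours([100] * 10, assumed_throughput, threads=6)
    print(f"Ten 100GB databases, 6 threads: {split:.1f} h")
```

Under those assumptions a single 100GB database lands inside the 4 hour window and a single huge one does not. The idealized speedup from splitting comes out higher than the "more than 3X" the company actually saw, which is what you'd expect once contention is in the picture.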
Putting site collections into their own database came as a result of needing to allow exceptions. 100GB is not a cliff. If you have 1 db that's growing past 200-300GB, it doesn't freak me out, but it does increase manageability cost and has a performance impact. It is a new flavor of ice cream, Rocky Road. Where's the limit? It's harder to recover, it's less reliable... so we treat it special... It's on its own spindles, on its own partition, etc... It is monitored and DOES impact IT by making them spend more time on it. The approach also came with recommendations from product support, who were finding inefficiencies in queries when small site collections were combined with large ones.
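One way to picture the sizing guidance above is as a simple packing exercise. This is only a sketch, not how any SharePoint tool actually assigns site collections: `plan_databases` and its parameters are hypothetical, the 100GB target is the top of the 50-100GB range, and the 15GB cut-off for giving a site collection its own database is the figure discussed later in this post.

```python
def plan_databases(site_sizes_gb, target_db_gb=100, dedicated_threshold_gb=15):
    """Group site collections into shared content databases.

    site_sizes_gb: dict of site collection name -> size in GB.
    Returns (shared, dedicated), where shared is a list of
    {"used": GB, "sites": [names]} and dedicated is a list of names
    that each get their own database.
    """
    dedicated = [name for name, size in site_sizes_gb.items()
                 if size > dedicated_threshold_gb]
    small = sorted(((size, name) for name, size in site_sizes_gb.items()
                    if size <= dedicated_threshold_gb), reverse=True)

    shared = []
    for size, name in small:
        # First-fit decreasing: use the first database with room left.
        for db in shared:
            if db["used"] + size <= target_db_gb:
                db["used"] += size
                db["sites"].append(name)
                break
        else:
            shared.append({"used": size, "sites": [name]})
    return shared, dedicated


if __name__ == "__main__":
    sites = {"hr": 4, "sales": 12, "wiki": 0.1, "kb": 220}  # made-up sizes
    shared, dedicated = plan_databases(sites)
    print("shared:", shared)
    print("dedicated:", dedicated)
```

In real life you'd leave headroom for growth rather than packing to the brim, so treat the target as a ceiling, not a goal.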
What is a large farm these days? 10 TB is on the hefty side. Microsoft is around 20TB these days, but this is spread across 3 regional deployments, with the bulk in the Puget Sound (Redmond area) region. There are deployments with over 1 million users, but that's a different challenge altogether.
Why split up databases?
- One massive database offers no flexibility
- SQL Backups take too long
- Tape Backups/Restore take too long
- Tape Reliability goes down fast when spanning multiple tapes
- Database Restores take too long
- Snapshotting copy on write reliability goes down with larger diff areas (experience will differ based on products)
- Disk I/O and disk contention gets hammered with disk queuing for long periods of time affecting global performance
- Network Interface Cards get saturated, affecting global performance (off hours local, but peak hours regional) - Lessen the impact to the network with backup VLANs, which you can then leverage with multiple threads to reduce impact.
- Disk pivoting solutions are more flexible and better optimized with more disks and more databases to spread across those disks for better throughput
- Mixing small site collections with large site collections results in inefficient queries (ask Keith Richie)
What does the business get out of it? Is this all just technical?
It means better performance. It means being able to upload your document faster. It means when you have to run to a meeting, but that doc needs to get on the team site first, you make it to your meeting rather than retrying it 10 times and missing your meeting. It means better reliability for both pages and documents.
Why split up site collections?
- SQL blocking/locking and table locking - You don't want one site to impact another
- Site Collections over 15GB (some conservative companies even say limit to 5GB) are less reliable when using STSADM (Backup/Restore)
- Larger Site Collections become more difficult to support and more error prone
- Permissions structures become gnarly as groups and business purposes are mixed (Common mistakes)
- They are more likely to have large lists that are out of control
- The time it takes to upgrade does not scale linearly with size
- Upgrades of smaller site collections will be more reliable, smoother, and easier
Is 15GB a magic number? No. It is based on testing the reliability of site collections over 15GB vs. under 15GB. You want to use 25GB? It's trade-offs. The 15GB number was determined by reliability and by the time to backup/restore with STSADM. These were the major factors, but re-factoring and restructuring the sites really paid off with better site and page rendering performance on a day to day basis, not just in STSADM operations.
In this blog I've had more posts than anyone else on List, Site, Site Collection and Database Scalability, even storage.
In the figure above, you see 2 major buckets:
Commodity Databases ($ Vanilla):
The commodity databases host the common collaboration, projects, workspaces, document sharing, blogs, wikis, and so on. While many of these sites are likely under 100MB, the larger ones should be watched as they grow beyond 5GB toward 15GB. My recommendation would be a 5GB quota, at which point the site is analyzed for super large files that aren't appropriate, and the group or team is consulted on information architecture and scale. Sites that business requirements push past 15GB are considered exceptions and are moved to dedicated databases. For sites under 15GB, STSADM is the primary tool for backing up, managing, and so on. Site administrators of site collections in the commodity space should sit through a few hours of training on how to manage and maintain site security, provision and manage sites and lists for scale, etc...
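The tiers described above boil down to a couple of thresholds. Here is a minimal sketch of that policy; the function name and tier labels are my own for illustration, and where exactly the 5GB and 15GB boundaries fall (inclusive or not) is an assumption you should settle in your own policy.

```python
def classify_site_collection(size_gb, review_quota_gb=5.0,
                             dedicated_threshold_gb=15.0):
    """Return (tier, backup_tool) for a site collection of the given size.

    Tiers (labels are hypothetical):
      "commodity" - under the review quota, business as usual
      "review"    - hit the quota: look for oversized files, talk IA
      "dedicated" - past the threshold: exception, own database
    """
    if size_gb < review_quota_gb:
        return ("commodity", "stsadm")
    if size_gb <= dedicated_threshold_gb:
        return ("review", "stsadm")
    # Past the threshold, database-level tools take over.
    return ("dedicated", "database tools")


if __name__ == "__main__":
    for size in (0.1, 4.9, 5, 12, 15, 40):
        tier, tool = classify_site_collection(size)
        print(f"{size:>5} GB -> {tier:<9} (backup via {tool})")
```

Run something like this against a nightly size report and you have the start of a watch list, long before anyone complains about performance.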
Achieving Economies of Scale!
- Mandatory Site Collection Admin Commodity Training and acceptance of policies
- Agrees to live within capacity boundaries and understands scale and performance management
- Training on permissions, roles, delegation rules, and use of AD security groups for scale
- Knows how to use Storage Manager, Life Cycle Retention Policies and Recycle Bin
- Trained on the Service Center and understands and accepts community support responsibilities
- Site Usage Scenarios: Extranet, Supplier, Partner vs. Intranet, Product Sites, Product Wikis (separate internal content from external content)
- Encourage Community Champion/Lead Incentive Programs
Leave the User with a laminated SharePoint Information card on policies and procedures as well as training. Just found this one, I'm not sponsoring them, but it is pretty cool. I do suggest you put in your corporate policies, not just the plain ordinary training stuff.
Large Dedicated Databases ($$ Chocolate Special - treat with Care):
The dedicated databases are special exceptions where the business units or groups have shown that, for business reasons, they need to continue to scale. These special exceptions are commonly knowledge repositories. Their owners should have training on tagging, metadata management, content types, properties, and search, and should understand how to take advantage of virtual navigation using lists, top nav, and left nav rather than deeply nesting sites. Structured data should be separate from ad hoc collaborative data. These special site collections should not be dumping grounds, and special care should be taken to optimize repositories with folders and indexed columns. Note: Paging does not solve the problem. If a view has 5000 items and displays 100 items at a time, it will query against the entire view (all 5000 items to get the next 100) before rendering each page. Sites in dedicated databases now use the database tools for backup/restore and for moving them around. Even other farms are happy to take on site collections in dedicated databases without running the STSADM tools. Essentially, limiting the STSADM operations, where everything has to run through iterations, saves performance.
- Mandatory Knowledge Document Repository Training on top of existing - Help them be your eyes and ears, knowing what to watch for in terms of performance
- Site Structure training - Information Architecture planning essentials
- Training on Managing and Monitoring Site and Site Collections Permissions and 64K ACL issues and use of AD Security Groups
- Community Champion incentive programs
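To put a number on the paging note above, here is a tiny sketch that takes the description at face value: every page render queries the whole view. It is a simplified cost model for illustration only, not a measurement of what SQL Server actually does, and the function name is made up.

```python
import math


def items_queried_when_paging(view_items: int, page_size: int) -> int:
    """Total items queried to page through an entire view.

    Model assumption: each page render queries every item in the view.
    """
    pages = math.ceil(view_items / page_size)
    return pages * view_items


if __name__ == "__main__":
    total = items_queried_when_paging(5000, 100)
    print(f"Paging through 5000 items, 100 at a time: {total} items queried")
```

Under that model a user clicking through the whole view costs 50 times what a single pass over the items would. That is why folders, indexed columns, and smaller views matter more than the page size.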
More Flavors ($$$) Custom Development and Customization
Customization. While the focus of this post is simply on size in relation to the SharePoint site hierarchy, custom development and deep customizations are another way to hurt your support economies of scale. Customization does have its place in the application space, but here we're talking about running out of the box commodity hosting scenarios.
If you are seeing perf issues related to any features or solutions, you should look at the new SPDispose tool, also called SPDisposeCheck, to check for memory issues introduced into the code by accident. Memory leaks are one of the top call generators on the dev side. Devs just assume objects will be disposed automatically, and they aren't!
I recommend a separate service for true line of business deployments. If you are just trying to apply some look and feel and can do it without custom solutions, then don't worry about splitting up the content. The problem comes when one business/department/division/team/group etc... wants something and everyone else takes the risk/hit. By isolating that group through separate web apps and separate memory (app pool and worker process), you can minimize the impact to the rest of the server.
SharePoint Online and Hosted
With SharePoint Online's published rate sheet and its cheap prices, companies are looking at these services and asking themselves what the tradeoffs are. Why would we run this on premises, with the headaches and all, when we can let Microsoft host this for us, and let them upgrade the servers, support the users, and so on?
With Standard, you'll see a very consistent one-scoop ice cream cone. Very vanilla, extra vanilla. The rules around quotas and users and security are set policies.
With Dedicated you'll find more flexibility, but I'd still encourage you to ask them what max size of site collection and max size of database they will support. Their numbers may differ slightly from what I've provided here as guidance, but it wouldn't surprise me if they are either the same or even more conservative. If your plan is to ultimately have your vanilla service hosted so you can focus on the Sales App that integrates with Siebel or SAP, then you're going to want to make sure your flavor of vanilla is similar. Net new isn't the challenge, it's migration. With planning they will hear what the service offers and pay attention. That rate sheet, and then the technical guidance on what they support, is what you should provide as a service internally. Be consistent. Even if you are supporting one department, consistency, policies, and standards are critical to achieving your goals of supportability, sustainability, and scale. You adjust the policies as you figure out new things, and restructure to address the weak points. People don't know everything, but those who have figured things out attempt to share best practices... Like at the upcoming SharePoint Best Practices conference, where you will find me pouring my heart out.