Tuesday, April 1, 2014

SQL Server Installation Checklist

Many people have published SQL Server installation checklists before. Here is one I have been using, without issue, for several installations: six for SQL Server 2012 and three for SQL Server 2008 R2. This list has evolved over the years and draws from many sources, including my own experience as well as the work of Jonathan Kehayias, Glenn Berry, and Paul Randal.
 
This list is just a base configuration and is by no means complete. It is intended to be a starting point, not only to standardize your SQL Server installations, but also to make sure you do not miss anything.
 
Pre-SQL Installation
 
1.     Service Account
a.     Procure all AD Service accounts. Preferably one for each SQL Server service.
b.    Register an SPN. DBAs often do not have permission to do this. Have this done by a domain admin now so it is ready to go after the installation:
                                          i.    Grant the Read servicePrincipalName and Write servicePrincipalName permissions to the SQL Server service account in AD.
                                         ii.    Run the following:
setspn -S MSSQLSvc/ServerName SQLServiceAccountName
setspn -S MSSQLSvc/ServerName:1433 SQLServiceAccountName
setspn -S MSSQLSvc/ServerName.root.DomainName:1433 SQLServiceAccountName
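Once the commands above have run, the registration can be confirmed by listing the SPNs on the account. A sketch, using the same placeholder account name:

```
setspn -L SQLServiceAccountName
```

The -L switch lists every SPN currently registered on the account, so the three MSSQLSvc entries should appear in the output.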
2.     Server Configuration
a.     Install the latest applicable Windows Updates. Ensure there are no pending reboots before continuing. Run Disk Defragmenter afterwards.
b.    Add the SQL Server service account to the following Local Security Policy.
                                          i.    Replace a process level token
                                         ii.    Lock Pages in Memory (LPIM).  May not be necessary in Windows 2008 or newer. Do it anyway.
                                        iii.    Perform Volume Maintenance Task
c.     Set the queue depth to an appropriate level. Preferably 128 or higher.
d.    Set any virus scan software to not scan any MDF, NDF, LDF, BAK or TRN files.
e.     Set all NICs, physical or VM, to full duplex.
f.     Set BIOS or VM power management to OS managed, then set to full power.
g.    Format data drives with an allocation unit size of 64k, and 8192 bytes for log drives.
Verify using the fsutil utility: fsutil fsinfo ntfsinfo [drive letter]:
h.     Set the pagefile to 150% of the total RAM. If this is too large, then set it to 100% + 1 MB.
i.      Create local “SQL Server Admins” group, add applicable users.
j.      Add applicable users to local Administrators and RDP groups.
3.     Drives and Directories - Create the following directory structure for a single SQL instance. Also consider directory structure requirements for other services, such as Analysis Services.
a.     OS – C:
b.    SQL  Installation – D:
c.     Data drive(s) – E:\SQLData, add more as needed.
d.    Log drive – L:\SQLLogs
e.     Backup drive – N:\Backups
f.     TempDb drive – T:\TempDb
4.     Service Account Permissions
a.     Remove “Everyone” from non-C drives.
b.    Grant the following permission to the SQL Server and SQL Agent service accounts
                                          i.    SQL Server
1.     D: - Full control
2.     E: - List folder contents
3.     L: - List folder contents
4.     T: - List folder contents
5.     E:\SQLData - Full control
6.     L:\SQLLogs - Full control
7.     T:\TempDb - Full control
                                         ii.    SQL Server Agent
1.     D: - Full control
2.     N: - List folder contents
3.     N:\Backups - Full control
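If you prefer to script these grants rather than set them through each folder's Security tab, icacls can apply them. A sketch with placeholder domain and account names; (OI)(CI)F is Full control inherited by subfolders and files, and "List folder contents" corresponds roughly to (CI)(RX):

```
icacls "E:\SQLData" /grant "DOMAIN\SQLServiceAccount:(OI)(CI)F"
icacls "E:" /grant "DOMAIN\SQLServiceAccount:(CI)(RX)"
```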
SQL Installation
1.     Features - Install the features needed for the instance. Do not install something “just in case.”
2.     Shared feature directory: D:\Program Files\Microsoft SQL Server\
3.     Shared feature directory (x86): D:\Program Files (x86)\Microsoft SQL Server\
4.     Instance root: D:\Program Files\Microsoft SQL Server\
5.     Service Accounts: add applicable AD service accounts with Automatic Startup
6.     SQL Server Browser: Disabled if installing only one instance.
7.     Authentication Mode: Mixed, or as appropriate.
8.     Add “SQL Server Admins” local group to SQL Server Administrators
9.     Data Directories
a.     Root: E:\SQLData
b.    User database: E:\SQLData\
c.     User database logs: L:\SQLLogs
d.    Tempdb data and logs: T:\TempDb
e.     Backup: N:\Backups
Post-SQL Installation
1.     Install any SQL Server or Visual Studio service packs and updates.
2.     Check the SQL Server error log to confirm that Lock Pages in Memory is being used. Restart the SQL Server service if that was not done after the updates.
3.     Confirm the SPN was registered:
a.     Restart the SQL Server service if that was not done after the updates.
b.    The query below should return “KERBEROS” if registration was successful. Run it from a remote connection:
SELECT auth_scheme
FROM sys.dm_exec_connections
WHERE session_id = @@SPID;
4.     Add applicable users to the sysadmin server role.
5.     Set Model database settings. Adjust the sizes as appropriate. Here is a good start:
a.     SIMPLE recovery
b.    Data file: 2048 MB Initial Size, 1024 MB autogrowth, unlimited size
c.     Log file: 1024 MB Initial Size, 512 MB autogrowth, unlimited size
d.    Move .mdf and .ldf to desired data and log folders if necessary.
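The model settings above can be scripted. A sketch using the default logical file names (modeldev, modellog); adjust the sizes as appropriate for your environment:

```sql
-- Set model to SIMPLE recovery so new databases inherit it
ALTER DATABASE model SET RECOVERY SIMPLE;

-- Data file: 2048 MB initial size, 1024 MB autogrowth, unlimited max size
ALTER DATABASE model MODIFY FILE
    (NAME = modeldev, SIZE = 2048MB, FILEGROWTH = 1024MB, MAXSIZE = UNLIMITED);

-- Log file: 1024 MB initial size, 512 MB autogrowth, unlimited max size
ALTER DATABASE model MODIFY FILE
    (NAME = modellog, SIZE = 1024MB, FILEGROWTH = 512MB, MAXSIZE = UNLIMITED);
```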
6.     Set master and msdb database settings. Adjust the sizes as appropriate. Here is a good start:
a.     SIMPLE recovery
b.    Data file: 1024 MB autogrowth, unlimited size
c.     Log file: 512 MB autogrowth
7.     Set tempdb database settings. Use Glenn Berry’s suggested settings for a starting point.
a.     SIMPLE recovery
b.    Add additional data files as appropriate:
Logical name = tempdev2, File name = tempdb2.ndf.
c.     All tempdb data files: 4096 MB Initial Size, 1024 MB autogrowth, unlimited size
d.    Log file: 1024 MB Initial Size, 512 MB autogrowth, unlimited size
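A sketch of resizing the existing tempdb files and adding a second data file; the T:\TempDb path matches the drive layout earlier in this checklist, and the sizes are the starting values above:

```sql
-- Resize the original data file
ALTER DATABASE tempdb MODIFY FILE
    (NAME = tempdev, SIZE = 4096MB, FILEGROWTH = 1024MB);

-- Add a second data file, matching the naming convention above
ALTER DATABASE tempdb ADD FILE
    (NAME = tempdev2, FILENAME = 'T:\TempDb\tempdb2.ndf',
     SIZE = 4096MB, FILEGROWTH = 1024MB);

-- Resize the log file
ALTER DATABASE tempdb MODIFY FILE
    (NAME = templog, SIZE = 1024MB, FILEGROWTH = 512MB);
```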
8.     Enable trace flag 1118 (-T1118). This helps alleviate allocation contention in tempdb. There is some debate on this, but it does not appear to hurt to have it on. Paul Randal’s blog goes into detail.
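To make the flag permanent, add -T1118 as a startup parameter in SQL Server Configuration Manager. It can also be turned on immediately without a restart; a sketch, assuming sysadmin rights (note that DBCC TRACEON alone does not persist across restarts):

```sql
-- Enable trace flag 1118 globally for the running instance
DBCC TRACEON (1118, -1);

-- Verify that the flag is active
DBCC TRACESTATUS (1118, -1);
```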
9.     Enable the DAC. (sp_configure 'remote admin connections', 1)
10.  Set Optimize for Ad hoc Workloads to True
11.  Set Min and Max Server Memory to an appropriate level. Use Glenn Berry’s suggested settings for a starting point.
12.  Set Cost Threshold for parallelism as appropriate.
13.  Set Max Degree of Parallelism as appropriate.
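Items 9 through 13 can be applied in one pass with sp_configure. A sketch; the memory, cost threshold, and MAXDOP values below are placeholders, so adjust them for your hardware and workload:

```sql
-- Several of these are advanced options, so expose them first
EXEC sp_configure 'show advanced options', 1;
RECONFIGURE;

EXEC sp_configure 'remote admin connections', 1;        -- enable the DAC
EXEC sp_configure 'optimize for ad hoc workloads', 1;
EXEC sp_configure 'min server memory (MB)', 4096;       -- placeholder value
EXEC sp_configure 'max server memory (MB)', 28672;      -- placeholder value
EXEC sp_configure 'cost threshold for parallelism', 25; -- placeholder value
EXEC sp_configure 'max degree of parallelism', 4;       -- placeholder value
RECONFIGURE;
```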
14.  Set up Database Mail.
15.  Create Administrator Operators.
16.  Enable Alert System in SQL Agent using the profile just created.
a.     Enable fail-safe operator for notification types.
17.  Configure SQL Error Log retention for 31 log files, or whatever is appropriate.
18.  Install Brent Ozar’s sp_blitz. Address issues.
19.  Create a job to cycle the error logs: sp_cycle_agent_errorlog, sp_cycle_errorlog.
20.  Configure alerts for severity 16 through 25 as well as specific alerts for 823, 824 and 825 errors.
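A sketch of one severity alert and one error-number alert; repeat the pattern for severities 16 through 25 and errors 823, 824, and 825. The 'DBA Team' operator name is a placeholder for the operator created in step 15:

```sql
-- Severity-based alert with a 15-minute delay between notifications
EXEC msdb.dbo.sp_add_alert @name = N'Severity 16',
     @severity = 16, @delay_between_responses = 900;
EXEC msdb.dbo.sp_add_notification @alert_name = N'Severity 16',
     @operator_name = N'DBA Team', @notification_method = 1; -- 1 = email

-- Error-number-based alert for 823 (severity must be 0 when @message_id is set)
EXEC msdb.dbo.sp_add_alert @name = N'Error 823',
     @message_id = 823, @severity = 0, @delay_between_responses = 900;
EXEC msdb.dbo.sp_add_notification @alert_name = N'Error 823',
     @operator_name = N'DBA Team', @notification_method = 1;
```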
21.  Confirm SQL Server service account is NOT in the local Administrators group.
22.  Install the Microsoft Baseline Configuration Analyzer and the SQL Server 2012 Best Practices Analyzer on your workstation, run them, and address any issues.
23.  Set up maintenance jobs:
a.     Backups
b.    Integrity checks
c.     Index maintenance
d.    History maintenance

Friday, February 22, 2013

Memory - Buffer Manager Free Pages

This script will display the current value of the number of free pages available in the buffer pool for a SQL Server instance.

Having a sustained value below 640 could mean that new memory requests may stall and performance will suffer. Looking at Buffer Manager Free Pages over a period of time can give a general indication of possible memory pressure. You will also want to look at Page Life Expectancy and Buffer Cache Hit Ratio.

Note that the Free Pages counter is no longer available in SQL Server 2012.

SELECT  cntr_value ,
        GETDATE() AS [CollectionDt]
FROM    sys.dm_os_performance_counters WITH (NOLOCK)
WHERE   [object_name] = N'SQLServer:Buffer Manager'
        AND counter_name = N'Free pages'
OPTION  (RECOMPILE);

To learn more about the Buffer Manager and Free Pages and memory monitoring, please take a look at these links:

Wednesday, February 20, 2013

Operation is not valid due to the current state of the object

I discovered how to fix this, but there was not a whole lot of information out there yet, so I thought I would write a post with all the information I found. Hopefully you can add to the discussion with anything else you find useful.

Solution

Before getting into the details, the quick solution is to add the aspnet:MaxHttpCollectionKeys key to the appSettings section in Web.config. The default value is 1000, which appears to have changed in a recent security update. Set the value to something higher than 1000, but understand the security implications before doing so. There is a reason Microsoft made the change. I recommend reading the KBs mentioned in this post.

Here is the key I added to <appSettings> in Web.config for Reporting Services. In the Report Server Web.config you may have to add the <appSettings> section yourself.

<add key="aspnet:MaxHttpCollectionKeys" value="1500" />
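For context, the key sits inside the appSettings section, which in turn sits directly under the configuration root. A sketch of the surrounding structure (the rest of your Web.config stays as it is):

```xml
<configuration>
  <!-- existing configuration sections -->
  <appSettings>
    <add key="aspnet:MaxHttpCollectionKeys" value="1500" />
  </appSettings>
</configuration>
```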


Symptoms

After installing several security updates from Microsoft Update on one of our Reporting Services instances, we started receiving complaints that some reports would not run and would display the following message.
Operation is not valid due to the current state of the object
Our Symptoms
  • Reports with a large number of parameters, or form fields, would not return and would display “Operation is not valid due to the current state of the object” in Reporting Services and Report Server.
  • Applications that called the Report Server for a report with a large number of parameters, or form fields, would display a generic “rsInternalError” message.
  • We installed Security Update for Microsoft .NET Framework 4 on XP, Server 2003, Vista, Windows 7, Server 2008 x86 (KB2656351). http://support.microsoft.com/kb/2656351
  • Event ID 1309 in the Application Event Viewer from source ASP.NET with: Event code: 3005, Event message: An unhandled exception has occurred.

Others’ Symptoms
  • Source Error: System.InvalidOperationException
  • Stack Trace: InvalidOperationException: Operation is not valid due to the current state of the object.

Cause

This appears to be caused by Microsoft’s security bulletin MS11-100 (http://technet.microsoft.com/en-us/security/bulletin/ms11-100) for the .Net Framework. In short, the update resolves vulnerabilities in how .Net handles requests, authenticates users, and caches content.

It looks like this has impacted ASP.Net environments that have installed KB 2656351. Any of the security updates described in security bulletin MS11-100 may also cause this issue, although I have only researched KB 2656351.

  • KB 2656351
  • KB 2656356
  • KB 2657424
  • KB 2656352
  • KB 2656362
  • KB 2656355
  • KB 2656358
  • KB 2656353

Here are some links I used to bring this information together.


Hope this helps. Any contribution is appreciated.

Regards,
Jon

Monday, August 8, 2011

Replication Agent Failure

I recently spent several days investigating a replication failure. Finding the answer specific to my problem was difficult, so I thought I would post my findings in the hope that someone else may find them useful.

The day after installing a few security patches, replication to the subscriber database began to fail. The Distribution Agent had the following error message:

Executed as user: <UserAccount>. Replication-Replication Distribution Subsystem: agent <AgentName> failed. Executed as user: <UserAccount>. A required privilege is not held by the client. The step failed. (Error 14151). The step failed.

This message is usually caused by changing the SQL Server service account through the Windows Services console, which cannot grant the permissions required to start the service appropriately or to run SQL Agent jobs. SQL Server Configuration Manager should be used instead, and the correct way to fix it is to set the service account to the Local System account and then back to the domain account, using SQL Server Configuration Manager. (http://support.microsoft.com/kb/911305/)

Since I had not changed the service account recently, I thought this could not be the correct answer to the problem. But since I did get the same error message, I thought it would not hurt to restart the service with SQL Server Configuration Manager anyway. Of course, this did not work, but the error message did suggest the problem was security related.

Then I changed all the replication agents to run with the maximum privileges possible. This did not work either. This led me to think the problem was not related to the articles themselves, and maybe not even to replication or SQL Server. I then found the following error message in Event Viewer:

Log Name: System
Source: Security-Kerberos
Event ID: 4

The Kerberos client received a KRB_AP_ERR_MODIFIED error from the server <servername>$. The target name used was <servername>$. This indicates that the target server failed to decrypt the ticket provided by the client. This can occur when the target server principal name (SPN) is registered on an account other than the account the target service is using. Please ensure that the target SPN is registered on, and only registered on, the account used by the server. This error can also happen when the target service is using a different password for the target service account than what the Kerberos Key Distribution Center (KDC) has for the target service account. Please ensure that the service on the server and the KDC are both updated to use the current password. If the server name is not fully qualified, and the target domain (<domainname.com>) is different from the client domain (<domainname.com>), check if there are identically named server accounts in these two domains, or use the fully-qualified name to identify the server.

This error message occurs when two or more computer accounts have the same SPN registered.
(http://support.microsoft.com/kb/321044)
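On Windows Server 2008 and later, setspn can search for duplicate SPNs directly, which is a quicker first check than reviewing computer accounts by hand. A sketch, with ServerName as a placeholder:

```
setspn -X
setspn -Q MSSQLSvc/ServerName*
```

The -X switch reports duplicate SPNs across the domain, and -Q queries for a specific SPN so you can see which account(s) it is registered on.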

To fix this, we deleted the computer account entries in AD, then removed the server from the domain and rejoined it. Except for a few expired articles, all replication articles synchronized on their own without intervention. I reinitialized the expired articles.

This has fixed our problem so far, but if anyone has experience with this, your feedback is welcome.

 

Regards,

Jon

Friday, May 20, 2011

Managing Data Change

Often DBAs receive requests to manually manipulate production data. Ideally, all data changes should be performed through the appropriate application or interface. Unfortunately, sometimes an application lacks a feature or has a bug that does not allow the user to modify the data. Fixing the bug or adding the feature may require another software release, and sometimes the business need is too great to wait for the next release.

Manually manipulating data can pose various problems for both the DBA who designed the database and knows everything about the schema, as well as the DBA who recently inherited a database and knows nothing about the schema. Some of these problems are:

  1. Data cannot simply be changed whenever we want, to any value we want. Most tables are related to other tables. Some data is designed to represent a record’s state at a certain point in time. Changing a value may break that relationship or state and, therefore, break the application and corrupt data.
  2. In most cases, with a well designed and documented database that has all the appropriate keys and constraints, a DBA can quickly determine whether modifying data will break the application. In the real world, this is often not the case. The schema does not necessarily reveal the application logic, or how the data is used or modified. Sometimes only the original developer or an experienced analyst knows the application well enough.
  3. Changing data manually is very labor intensive, and DBA and developer labor is not cheap. Because of this, continuing to manipulate data manually will, at some point, become more expensive than coding the application appropriately.
  4. Changing data manually is error prone - Despite their appearance, DBAs are human. Even a highly skilled DBA with years of experience can write an UPDATE statement incorrectly. Also, the DBA can be given the wrong record to update.

If requests for data modification are frequent, policy and procedures to manage data change are needed. These policies and procedures should have the following objectives:

  1. Retain old data - Retain data prior to the change so the changes can be tracked and reversed, if needed.
  2. Document Purpose - Document the purpose of the data change in order to build a case for a software update. By doing this you may find that you are updating the same column on the same table several times a month. Now you will have the documentation to prove it, and why.
  3. Notify Interested Parties - There may be other interested parties in your company that need to be aware of the change. Operations managers may not want you changing records during the day or after an order is shipped. Software Development and Product managers may want to know about a missing feature or bug in the software.
  4. Only certain data can be changed - Because the database does not necessarily reveal the application logic and table relationships, only approved tables and columns should be changed.
  5. Test all changes - All changes should first be done in a development environment.
  6. Verify changes - All changes are verified in a QA or testing environment.
  7. Use the best person to develop an update script - The DBA may not be the best person to develop an update script. Sure, he or she may be the best skilled, but as I mentioned above, the DBA may not completely understand the application logic or business implications. In such cases the developer, maybe with the help of a business analyst, can develop an appropriate update script.

How does your company manage data change requests? Besides my list above, what other problems do you see with manually manipulating data? Can you think of other objectives a data change request policy should have?

Monday, May 2, 2011

Poor Man’s SQL Monitor

Here is a quick tip that can help you track SQL Job and Alert notifications in Microsoft Outlook. You may already do this, but if you arrange your inbox by conversation instead of date, emails will be grouped by subject. This will enable you to easily “arrow down” through the group and quickly compare messages.

Take the standard SQL Server Alert notification below as an example:

Subject: SQL Server Alert System: 'Batch Requests/sec'
DATE/TIME: 5/2/2011 5:35:01 AM
DESCRIPTION: The SQL Server performance counter 'Batch Requests/sec' (instance 'N/A') of object 'SQLServer:SQL Statistics' is now above the threshold of 1000.00 (the current value is 1077.30).

By sorting through several of these “Batch Requests/sec“ messages you can quickly compare important information such as the date/time and the current Batch Request/sec value.

In my particular case, I received this email exactly every 30 minutes. I may have found this pattern anyway, but by arranging my inbox by conversation, it was quick and easy to spot. The exact 30-minute interval suggested some type of scheduled job or process, which is exactly what the problem turned out to be.

Friday, April 29, 2011

Managing your Backup Schedules

Have you ever noticed your backup job duration growing faster than your database? This recently happened to me, so I thought I would share.

Not managing your backup schedules can cause you several problems later on. This is especially the case when you do not have any centralized backup management software, such as Red Gate's SQL Backup. Managing backups is fundamental for a database administrator, but I believe revisiting the basics is good practice, especially if you are a new DBA.

For the past several months, I had noticed the duration of several backups growing by up to 50%. At first I assumed that since most of these databases are growing, it made sense the backups would take longer too, so I did not do anything about it. Eventually, I noticed most of these databases had grown only about 15% over the same period, which did not add up. I then looked at my backup schedules on the different servers and noticed that some of them started at the same time, and some were not completing before the next one started. These jobs were competing for SQL Server resources, primarily disk contention on the local backup drive as well as on the network drive where the backup file was later copied. There was also a developer database restore job that ran at the same time other backup jobs were running, causing disk contention on the network drive.

Of course, the answer is to just schedule the backups appropriately, but there is a lesson here. Here are a few things I do to keep my backup schedules from tripping on each other:
  • Periodically review your backup schedules. If your backup jobs use the same drive, local or network, plan your schedules to work with all jobs on all servers.
  • Consider other jobs or processes that will be accessing the same drive.
  • Monitor your database growth. Your backups will grow too.
  • Monitor your backup duration and adjust your schedules accordingly.
  • Use a spreadsheet to list all your schedules for all servers to see if any overlap. List Server Name, Process/Job Name, Start Time, Average Duration, and Average End Time.
Again, this is fairly basic, but I have been a SQL Server Database Administrator for over 9 years and this one got me.