Saturday, November 10, 2012

PASS Summit Day 3

The last day of PASS is pretty exhausting. My head was still spinning with ideas from the previous days and the tracks were a bit less balanced. To let the ideas settle in I investigated some different areas.


Microsoft is really getting into Hadoop in a big way. I am glad that they have partnered with Hortonworks (the orginal Yahoo Hadoop team) for the Windows implementation.  Microsoft is making a serious attempt to be a good opensource citizen and is actively contributing back to the Apache project. Their Hadoop product now has a name, HDInsight and some interesting characteristics, such as:

  • Available via NuGet
  • Web Management console
  • Map-Reduce can be written in C# and JavaScript
  • ODBC Driver to connect to Hive
    • Normal .Net
    • LINQ 
    • Excel (with a friendly plug-in!)

There are two versions announced

  • Azure Edition
    • This version actually substitutes the Azure BlobStore for HDFS. Presumably this was done because they share many characteristics but the BlobStore is already tuned for the MS Datacenters and has presumably avoided the NameNode issue somehow.
    • A popular model is to gather data (such as web or application logs) into the blobstore and on a regular schedule spin up a large HDInsight cluster, run the analysis and then spin it back down. This technique is used by Halo 4.
  • On-Premises Edition
    • This sounds as though it is targeted at preconfigured Hyper-V Virtual Machines (I may be way off here).
    • The filesystem is HDFS, not NTFS or version of the BlobStore
    • Enterprise infratructure support coming
      • Microsoft Virtual Machine Manager
      • System Center HDInsight Management Pack
      • Active Directory for unified authentication

Additional features (maybe)

  • Scheduled
    • Oozi Worksflow Scheduler
    • Mahout
  • UnScheduled (but popular requests)
    • SSIS Integration
    • SQL Server Management Studio


The Azure data integration story continues to improve. The model to move data is either using SSIS or Data Sync. However, the SSIS development best practices are a little bit different. To transfer data efficiently there are a combination of things to do:

  1. Compress your data being sent over the network
  2. Parallelism is good but adding thread will reduce the transfer time
  3. Robustness must be maintained by the package. A "blip" in th connection cannot force the entire operation to restart. This is long and painful process and should be able to pickup where the blip happened. This is achieved by using a loop (with sleep) to keep retrying and sending small sets into staging tables and then pushing loaded sets from staging into final tables.


Encryption is becoming more of an issue as PII regulations become more prominent. Some things to keep in mind when planning security

  • Always salt the hash and encryption. 
  • Be aware of CPU scaling concerns
  • Encryption makes everything bigger
  • Only 4000 characters at a time work with hashing and encryption functions built into SQL Server.
  • Encrypted and hashed values will not compress well

What types on encryption should be considered?

  • Field Level
  • Data at rest (on disk or on backup)
  • Connection Level
  • SAN level

Field Level

At the field level this can get complex because it is either managed in the DB or the Application. Each has tradeoffs

  • The DB is hard to scale out by adding nodes if it is responsible for all encryption. 
  • The DB could manage to invisibly decrypt columns based on account permissions
  • The application would have to wrap all sensitive data with encrypt / decrypt logic if it was responsible for doing it.
  • The application can scale out to additional servers if needed to distribute the CPU load.

Data at Rest

When the data is at rest the database can encrypt it with minimal overhead using Transparent Data Encryption (TDE). This does mean that data is not encrypted in memory so an attacker could view data sitting through an exploit. TDE does do some changes silently. Instant File Initialization becomes disabled as soon as TDE is initiated. The shared database TEMPDB also must become encrypted.

Backup encryption is critical because there is no telling what will happen when tapes leave the security of a private datacenter and goes touring through the outside world.

Connection Level

There are two ways to make sure you connection is encrypted. A simple one to setup is having Active Directory enact a policy requiring IPSec for all connections to the DB Server. The other one is to use certificates to establish the classic SSL connection encryption. This can be controlled at the connection string level so you can configure it for only a subset of the traffic and reduce the CPU load.


SAN Level Encryption (MPIO) is fascinating because it will stop the LUN from being restored to a machine without the proper certificates. This tends to require a lot of CPU resources but it is possible to offload that encryption work to special (such as the Emulex HPA) that can be integrated into the RSA Key Manager.


I did discover something a bit disappointing about DQS today. There are curerntly no matching services available. That means that there are no internal web services for identifying matches nor are they any SSIS operations available. Matching work can ONLY be done through the client GUI. Without automation this will have a real challenge working with some types of data..

Friday, November 9, 2012

PASS Summit Day 2

Another great day filled with learning new techniques and discussing different challenges. There was more of a focus on Query optimization than anything else today. After the sessions We all went to the Experience Music Project where there was food, live band karaoke and a Dalek!

Great fun all around, I just need to get back to a normal sleeping schedule, this 4 hours a night thing is exhausting.

Technical notes below!

The session that Adam Machanic did about forcing better parallel execution plans was astounding. He took a 28 minute query and optimized it to about 3 seconds and by improving the parallelization. 

The Optimizer Issue

The premise is that optimizer is outdated because the Memory to Data Volume ratio (size or speed) is nothing like the way it was 1997 (when the last big optimizer rewrite happened). This means the optimizer needs to be coaxed into doing the right thing. 

The optimizer tends to serialize a lot of processing for a number of reasons that may not make sense now. The problem with serializing all of these processes is that Amdahl's Law (not theory, LAW) says that the number of serial processes substantially impacts performance as the number of processors increases. The goal is to make as many parallel as possible. The reason the optimizer does not default to this behaviour is because it is expecting greater disk latency than we have now that would have rendered this point moot.

Possible Approach

An example of a common item that inhibit can parallelism is the Scalar Function. Potential solutions to this are: 
  • Convert it to an in-line TVF and CROSS APPLY it. 
  • Create a CLR scalar function (weird but true, it works much faster)

To check how balanced the parallelism is you can open the properties of any step of the query plan and check view the actual number of rows returned for each thread (there is a little arrow thing hiding there).

There is a traceflag that will force the optimizer to try to make some part of query parallel (Trace Flag 8649) but this is not a good answer. The issue is that it doesn't ensure that you are actually removing the serial parts to balance the load.

One way to encourage the optimizer to seek the parallelism path without the Trace Flag is to use a TOP(@i) with an OPTION(OPTIMIZE FOR (@i = 2000000000)). This will make the optimizer expect 2 billion rows even if there are only 10,000 in the table. This will top the estimated cost calculate to the side of making the query parallel.

MAXDOP should be configured to less than the number of processing cores for each numa node. This avoids making for foreign memory requests which may be slow. It is ok to go higher, but this is a very sensible starting spot, be sure to test carefully.

Parallel Apply Pattern

The Parallel Apply Pattern can be used to enourage working with an managing efficient streams with the optimizer. The abstract form of this query is:


To execute this pattern the DRIVER of your query must be identified, for example is it really just calculating information about product stock levels? Then the driver table would be Product and the CORRELATED PAYLOAD would be the sophisticated joins, aggregations and calculations. 

Roughly speaking, for each record in the DRIVER TABLE the CORRELATED PAYLOAD executes all of it's heavy set based work as a smaller piece rather than against the full DRIVER TABLE set. This is more efficient for parallel loads.


If the driver table is not going parallel there are a couple of choices after turning it into a derived table
1. Use the TOP techique mentioned above. The optimizer will think there are more rows.
2. Use a DISTINCT 0%ColA + ColA calculation in the select list to prevent a calculation from being extracted from the parallel stream.


If the CORRELATED PAYLOAD is still having it's calculations pulled away from the parallel stream that change the CROSS APPLY to an OUTER APPLY. This prevents the optimizer from knowing if it is safe to pull the calculations up (as a poor optimization choice).

Thursday, November 8, 2012

PASS Summit Day 1

Summit officially began today. I had a great time meeting new people, talking to vendors and learning stuff.

The big things today were they Keynote, Extended Events, Enterprise Information Management and Performance Tuning.

MS is clearly taking the threat from SAP's HANA seriously with making a new in-memory storage engine fr SQL Server. It will be in the next major release and is expected to speed up existing applications substantially.

Extended Events are interesting and really are complex enough that there would be a real benefit from having an actual book assembled about them. It is replacing a very simple tool (Profiler) with a very complex and powerful one. I learned just enough to know that I need to send a lot of quality time to understand it as much  as I'd like.

So far it looks like Microsoft is taking the approach of going after the smaller, simpler data to Master instead of trying to battle in the arena of CDI or PIM where there are large established player. This follows the approach MS has had for a while. Something like Informatica is just overkill to manage Jill's spreadsheet that everyone shares, but MDS is perfect for managing and distributing the information throughout the company.

Some of my notes from today are below:


MS Announced Hekaton : An in-memory engine for SQL Server

  • There will be an adviser on when to use it
  • Live as part of the DBMS server as a different storage engine. Heterogeneous
  • Lock and latch free
  • The storage engine for tables can be changed to move them in and out of memory. If the table no longer fits it switches to the conventional model. Because this can be done seamlessly the data is likely persisted to disk in the conventional DBMS way.
    • Exadata uses flash as a sort of cache fusion
    • HANA is designed as Memory primary and disk secondary so has huge performance drops when memory limits are exceeded.
  • Stored procedure optimization compiled to native code
  • Included in the next version of SQL Server

Parallel Data Warehouse
* Polybase allows seems connection to Hadoop / Hive and SQL Server DB to allow a single point of federated querying

Extended Events

SQL Trace and Profiler are deprecated. With SQL 2012 every traceable is now available in Extended Events

Extended events are much lighter than Profiler (similar to trace) of the previous options with support for multiple targets.
The available targets are:

  • Ringbuffer
  • Event Bucketing (Bucketizer) is a specialized ringbuffer, storing aggregates
  • Event Pairing allows the tracking of things like everything happening between lock_acquired and lock_released
  • Event file (xml)
  • Synchronous Event Counter only tracks number of times and event fired, no details

The SQL Server audit events have been merged into the normal extended events.

System Overhead (as tested by SQLCAT)

  • Query Time increases by 2 microseconds. As more actions are tracked this does increase.
  • Memory increases by 4MB for the RingBuffer
  • Disk storage is more verbose, always try to avoid logging to the Data / Log volumes
  • If the event tracking causes too much of a slowdown, it will be automatically disconnected until it has less impact on performance.

In general the learning curve is steeper but you can do more with Extended Events than with Profiler.

The GUI is not as nice and profiler and there is a larger learning curve. Also the SQL 2012 GUI does not work with SQL 2008.

Event information is recorded in an XML structure that is fairly consistent. Use XQuery to extract the results in a tabular form.

One mental model for understanding the different parts of the setup is to think of it like an insert statement


The predicate will short circuit, this is means that Criteria should be placed with the most restrictive rules first!

More good stuff on Jonathan Kehayias's blog 

Because traces are very light they can run in the background. For example, the old SQL Trace Blackbox has returned as an Extended Event session called "system_health" and is visible under: Server>Management>Extended Events>Sessions and is using two targets, an event file and the ring_buffer.

Performance Tuning

5 Mains areas to consider when there is a problem

  • IO
  • CPU
  • Memory
  • Network
  • Locks and Blocks

Activity Monitor is MUCH better than it was in SQL 2005. Most of the common troubleshooting DMVs are presented there.

The only system process with a SPID > 50 is... SERVICE BROKER! That is a feature that really gets no respect and it turns out the other system process don't want to be associated with it either

When a WAIT x:y:z is displayed, the variables represent

  • x : DBID
  • y : File ID
  • z : Page
    • page 1 contains the page free space bitmap, contnention here is typically for heaps (most temp tables and table variables are heaps). Scalar and non-inlined TVFs can abuse this by creating and destroying objects swiftly.

Wednesday, November 7, 2012

Pre-Con Day 2 T-SQL Querying and Query Tuning Enhancements in SQL Server

This was the second (and last) day of the pre-conference events. I selected the T-SQL session for two reasons, I know that there were a lot of enhancements to core and it was being given by Itzik Ben-Gan, who is a SQL Master (he literally wrote the book on T-SQL query techniques).

I was a bit concerned because he enumerated the items he wanted to cover and I've used nearly all of them so I didn't expect to learn much. Much like yesterday, I was wrong.

There were two areas that I struggled to keep up with him during the session, those were the Aggregate Windowing (which I never used) and all the crazy ways you can use the APPLY operator. The rest of the session was good but I kept coming back to (and testing out) those two areas.

I didn't get much of a chance for socializing because I was tinkering with the new stuff I learned whenever I had a few free minutes and I did something bad to my ankle so I just hung around my room for the evening.

Below are my notes from this session:

T-SQL Querying and Query Tuning Enhancements in SQL Server

Extended Events Change

This was the last thing we discussed but it is SO important that I need to talk about it first. Extended Events have anew Global Field called Query Hash. This is a hash of the auto-paramterized query, capturing this allows the detection of all sorts of performance problems. All of the tiny, fast queries that execute ALL the time can be evaluated to see if your system is suffering the "death by a million cuts". This is HUGE.


CTEs join the rank of derived tables, views and table valued functions as a way to make the query syntax cleaner. By themselves, all of these options generate the exact same query plan and do nothing improve or degrade query performance. The biggest immediate benefit to non-recursive CTEs is the improved legibility they gain by allowing query reuse.

CTEs are evaluated Top to Bottom. This allows the first query to be referenced by the second query, Both the first and second queries could be referneced by the third query, etc.

To improve performance consider using something to actually materialize the data and reference that single result set. Examples of this would be actual tables, temp tables and table variables.

Recursive CTEs do not follow the rules mentioned above, instead they need to keep a running tally of the data. It does not persist it to a normal table but instead creates a B-Tree structure for the data on disk that has been optimized for the Recursive CTE's usage patterns.

Windowing Functions

Craziness. Absolute, amazing crazy that has been hiding from me since SQL 2005 that can be applied to any aggregate (but works best with cumulative aggregates).

This is work applied to the result set.

The Windowing Functions are made up of a bunch of different parts

  • OVER : Establishes the window of data over the result sets
  • PARTITION BY : Defines the size of the sliding window by referencing what columns cause the aggregate to reset. (Note - FILTER operator is not supported, which would be an explicit where to filter the partition)
  • ORDER BY : Sorting sequence to be used when calculating aggregates. This important for things like running totals.
  • Window Frame : 
    • ROWS BETWEEN ... AND ...
      • Specify the number of result rows to consider, regardless of missing data 
      • Duplicate ORDER BY values are calculated individually
      • Always Specify Row, it is not the default (Damn you ISO SQL Standards!) but it is much more efficient
      • UNBOUNDED PRECEDING : From the beginning of the Partition (Default)
      • UNBOUNDED FOLLOWING : To the end of the Partition
      • n PRECEDING : From the beginning of the Partition
      • n FOLLOWING : To the end of the Partition
      • CURRENT ROW : To the current row (think of a running total) (Default)
      • Specify the value range to consider, this is not fully supported in SQL 2012 
      • Duplicate ORDER BY values are grouped together for the aggregate calculation
      • Cannot tell the difference between? 
      • Specify the number of result rows to consider (may skip missing values)
      • Use with extreme caution, it is the default when there is a partition
      • UNBOUNDED PRECEDING : From the beginning of the Partition (Default)
      • CURRENT ROW : To the current row (think of a running total) (Default)


This is NOT syntactic sugar. The query plan has a new operation called Window Spool which allows a "Fast Track" of results (for Linear Scaling) under two very common sets of conditions.

  1. "BETWEEN Unbounded Preceding" looks only at directly previous row instead of maintaining a full list for each record. 
  2. The aggregate is a Cumulative Aggregate (such as SUM or COUNT, not a MIN or MAX) it is calculated by using the "Unbounded Preceding" behind the scenes and doing appropriate.

If the Fast-Track conditions are not met it turns from a 2n cost to an n^2 cost.

For indexing the window aggregate a POC pattern should be followed.

  • P: Partition (Quirk: Is nearly always ASC unless the same column is in the order by as DESC)
  • O: Order By
  • C: Coverage (can be in the INCLUDE)

To check for performance issues, use Extended Events or Statistics IO (which show the worktable)

New SQL 2012 Window Aggregates

  • LAG : Element from previous row in the Partition
  • LEAD : Element from next row in the Partition
  • FIRST_VALUE : Element from the FIRST row in the FRAME
    • Typically the frame and partition are the same when using FIRST_VALUE
  • LAST_VALUE : Element from the LAST row in the FRAME
    • Typically the frame and partition are the same when using LAST_VALUE


The new Offset - Fetch syntax was added as an extension of the ORDER BY clause. The implementation is not SQL Compliant yet because it is missing PERCENTAGE and WITH TIES options still


It performs similar to TOP with a discard of the first X rows (the offset). This is fine for the first few "pages" of data but does not scale deeply. To scale it, use a CTE on an indexed column and join to that, such as:

SELECT orderid
FROM dbo.orders
ORDER BY orderid
OFFSET (@pagenum - 1) * @pagesize ROWS
, O.orderdate
, O.custid
, O.empid
Keys as K
INNER JOIN Orders as O
ON O.orderid = K.orderid
ORDER BY K.orderid


This is not part of the SQL Standard. These are syntactical sugar for some common pivot requirements. As a pivot becomes more complex (such as multiple columns) they break down and require a more sophisticated model.

An efficient UNPIVOT can be created with the CROSS APPLY using a VALUES clause (which creates a derived table behind the scenes). This method uses VALUES as a constant scan which feeds the stream aggregate avoiding any worktables.

CREATE TABLE #t (id int , [2008] int, [2009] int , [2010] int, [2011] int)
INSERT into #t
(1, 100, 101,102,104)
,(2, 200, 201,202,204)
,(3, 300, 301,302,304)
,(4, 400, 401,402,404)

u AS 
SELECT, sum(val) AS total
#t t
    ( VALUES
        (, 2008, t.[2008])
        , (, 2009, t.[2009])
        , (, 2010, t.[2010])
        , (, 2011, t.[2011])
    ) AS v (id, [year], val)
SELECT * FROM #t t inner join u on =


The new seqeunce object uses the same internals as the identity feature but is more configurable with different defaults.


  • MIN value: Most sequences are not user visible values so it starts at the lowest value possible for the data type.
  • DATATYPE : BIGINT. This is effectively an alternative to using the GUID keys so it should be support a OT of values. BEWARE! This datatype SHOULD match the datatype in the table (especially if being used as a constraint), otherwise there will be a lot of conversions that may not have been expected.
  • CYCLE : OFF is the default value but the reason it exists is for IDs used in staging data that have a relatively short lifespan to avoid running out of keys.

Extended beyond the SQL Standard

  • It is able to be bound as a default constraint on a table. This will help with being able to eliminate the IDENTITY columns without impacting application code or adding triggers (which is the normal model).
  • There is an "OVER" clause available when generating IDs for a set to ensure the expected sequence.


  • This has all of the dangers on the IDENTITY when it comes to sequential values. The process of getting IDs is outside of the transaction, so once it's been handed off it is gone (unless the sequence cycles).
  • A single sequence can be used across multiple tables but this does increase the gap likelihood.


This is commonly thought of as an UPSERT but it can do a bit more. It compares the source (stg) to the target (active) and will cause different DML actions to be taken.

The "USING ... ON" clause is used matching the table from the source and left joining to the target. This is not a "WHERE" but left join conditions so it DOES NOT filter the source table. To filter the source table use a CTE as the source.

MS extended this to support executing an action if the record exists in the Target but not in the Source. This is great when doing full table refreshes.


  • Use a HOLDLOCK to keep the data consistent because the merge executes multiple actions.
  • Review any Target tables to ensure there are no triggers referrencing @@ROWCOUNT in the beginning. This value is set for the ENTIRE Merge command, not the single operation. Typically I've seen this used to do a quick eject from the trigger or switch the path from row parocessing to set processing. Look at the INSERTED / DELETED tables instead.
  • Unlike the OUTPUT from an INSERT statement, you can return the any generated values from both the source and target. If you need these values then make a merge with a non-mathcing condition (1=2).
  • If you only want to update the table when there has been a change there are two approaches, adding each column to the WHEN or adding an exists to the WHEN with an EXCEPT, like this:

MERGE INTO dbo.Customers AS TGT
USING dbo.CustomersStage AS SRC
  ON TGT.custid = SRC.custid
     EXISTS  (    
            SELECT * FROM CustomersStage AS S WHERE s.custID = src.custid
            SELECT * FROM dbo.Customers AS T WHERE t.custid = tgt.custid
        ) THEN
    TGT.companyname = SRC.companyname, =,
    TGT.address = SRC.address

Misc cool stuff

  • If the goal is to have a non-deterministic sort but an ORDER BY is required, the field list can replaced by a (SELECT NULL)
  • Left to Right reading of the Execution Plan is how the API works
  • The lack of a ; after each SQL Statement is officially Deprecated! 
  • Avoid using IIF and CHOOSE statements. They were added as a stopgap to help with conversions from MS Access. Behind the scenes they are converted into CASE statements by the optimizer.
  • Interesting trick that can be embedded in other queries to get a delimited list:

SELECT N',' + QUOTENAME(orderyear) AS [text()]
   FROM (SELECT DISTINCT YEAR(orderdate) AS orderyear FROM Sales.Orders) AS Y
   ORDER BY orderyear

Tuesday, November 6, 2012

PreCon Day 1

So the conference began. Registration was a snap, I got a nifty backpack that is big enough to hold my workstation (an Elitebook 8560w) and was remarkably comfortable. I don't think it will replace my Tom Bihn bag for travel but it may be my new commute backpack.

The session I attended today was SQL Server 2012 in a Highly Available World and, while it was not what I expected I found a lot to learn. I was expecting a focus on Availability Groups with maybe a section on Failover Cluster Instances (FCI). Instead this was a much broader session. It started with Log Shipping, Database Mirroring (deprecated in SQL 2012), FCI and then finally Availability Groups. While I didn't want the first bits (Log Shipping has been around forever and Mirroring is going away) it was a good refresher and I learned a lot.

The best part of today's session was working through a problem that the person next to me was having during the breaks. There were two real questions

  1. Can you mirror upstream from SQL 2008R2 to SQL 2012?
    • It turns out you can, if the patches / services packs are correct. In fact this will be demonstrated later this week in one of the sessions! 
  2. Can you combine Failover Clusters with Availability Groups?
    • Yes, but it prevents automatic failover between datacenters, this may be fixed in a service pack.
I now have an intense need to get VMWare going on my laptop and start messing with cluster configurations.

My notes from this session are below.

PASS Pre-Con 1: SQL Server 2012 in a Highly Available World

This was an overview of the High Availability options typically used in SQL 2012. Topics ranged from the venerable Log Shipping to the newer and shinier Availability Groups.

Log Shipping

What happens when the database is set to "Bulk Logged" mode and log shipping is enabled? It turns out that the data never makes it over to the target system unless it is updated afterward. This is a good example of why data should only be bulk loaded into staging tables, cleaned and moved to final tables using normal, logged commands.

Test log shipping Disaster Recovery (with an outage window) 

  1. Server 1: "BACKUP LOG ... NORECOVERY" 
  2. Server 2: "RESTORE LOG ... WITH RECOVERY"
  3. Do stuff (nothing bad) and flip it back
  4. Server 2: "BACKUP LOG ... NORECOVERY" 
  5. Server 1: "RESTORE LOG ... WITH RECOVERY"
This maintains allows you to maintain the log chain and test DR as long as only legitimate transactions happen as part of the test.


The reason that this still matters is that is was JUST deprecated. That means it will still be there for at least the next two releases (SQL Server 2014 and 2016 presumably). The last version of SQL Server with mirroring will be supported for at least 10 years after release, that means it will be 2026 before it will be gone.

Unsupported DB Objects

  • FileStream 
  • FileTable


The Log Stream endpoint hangs out on port 5022. This endpoint will ONLY talk to other SQL Servers and is always encrypted, typically by a server signed certificate.


If everything is working well, Synchronous replication is expected to only add 5ms to the DML commands at distances < 100 km. If the distance is > 100km then Asynchronous Replication is recommended. The Mirroring overhead time is viewable in the Mirroring Monitor (which presumably fronts a DMV).

The greatest visible lag in mirroring is when the application realizes that it must switch to the secondary node. The application needs to have a connection timeout happen (5 - 10 secs) before flipping to the next server (which will of course work).

Mirroring is not impacted by single user mode (other than nobody doing transactions) because it is using a log reader and is not "in" the database.

Always On - Clustering (Local)

Cluster Configuration

The dreaded Hardware Compartibility List (HCL) is gone! The reason for this is that the storage options for the cluster now include network shares (not just iSCSI). Now who in their right mind thinks that network shares over SMB are a good idea for the DB? Well the enhancements to SMB (starting in 2.2 with Win 2008R2 but much improved in 3.0 with Win2012) are pretty amazing. They have removed the overly chatty aspects of the protocol and made it a viable alternative to SAN storage. There is a lot of value going Win2012 instead of Win2008R2 for the SMB 3.0 enhancements.

Sadly, the validation wizard still needs to execute to make sure that the node can be part of a cluster. The good news is that it is a lot less invasive (doesn't do things like take SAN Luns offline). It is still slow, while a 2 node cluster may take 10 minutes, larger ones can take a weekend or longer.

When setting up the cluster, if you are not a domain admin you will need to either get access to add computers to AD or get the new cluster name added to the directory before installing.

The heartbeat for the cluster is using IPv6, if it is not a dedicated connection between the servers the network must be able to accommodate it (it's 2012, probably time to update to IPv6 anyway).

Windows Core is completely supported for the database engine. An interesting thing about core is that (in case of emergency) you can now activate the full GUI when required and then flip it off again when done. The core instance actually keeps all the DLLs and extra files on disk, but in a packed form that is unpacked (into memory?) when needed.

Best Practice tip: Try to have your resource group name match the network group name for your cluster. It will improve manageability a lot.

Quorum Modes configuration

Node Majority Set should be chosen for large clusters (> 2 nodes) with an odd number of nodes
Node and File Share Quorum should be chosen for large clusters that may not have an odd number of nodes
Quorum Drive can be used for 2 node clusters but remember, this is a single point of failure.
Win2012 improvement: Dead Nodes are removed from voting after they are kicked out of the cluster so the quorum requirement is recalculated.

Database Cluster Configuration

TempDB can finally use direct storage! This allows the use of local SSD or even Fusion-IO cards (gasp!). This can be a lifesaver in systems with heavy TempDB usage.

The number of available SQL Server nodes per cluster is getting absurd in Enterprise Edition (but is still stuck at 2 for Std and the new BI Editions). The cap is somewhere around 64, but is really limited by the number of drive letters (26 - A,B,C,D = 22). Technically, this can probably be worked around with mount points but requires further research.

One thing that seemed to cause confusion was the idea that even named instances can be treated like a default instance if you left it hanging around on port 1433. I've always thought of instance naming as just a human readable shortcut for remembering the port number (I know for other protocols the name is more important but most systems are TCP/IP focused). So, if the instance is locked to port 1433, then when you try to connect without an instance name that is the door it knocks at.

When reviewing the cluster resource groups make sure that the mount points are explicitly listed and set as proper dependencies, this does not happen automatically at this time. This can be complex for mount points because the label can be a bit obscure, use the Disk ID to match the volume in Disk Management to the one in the Cluster Manager.

Slipstream installs are now deprecated. I didn't see that one coming. The reason for it is that every patch that comes out need to get an updated slipstream image generated. Now you can store the patches in a centralized location and point the installer at it. It will pickup and apply the patches automatically. This is a big improvement if you haven't automated your image generation yet, otherwise it's a minor change.

Only the Database Engine, Analysis Services and SQL Agent are cluster aware. The other SQL Server goodies are STILL not cluster aware. This includes the MDM solution (MDS + DQS).

Always On - Geo Distributed Clusters

There have always been a LOT of obstacles when doing Geo-Clustering, primarily around moving the data and managing the network configuration. As far as I know, MS is not touching the data side, they are leaving that to the SAN experts for this option. However, MS now has options for Multi-Subnet Failover where the IP Address can be set dynamically based on the available subnets. This feature may stop the network admins from wanting to kill you.

Always On - Availability Groups

Different from Mirroring

This is really a fork of the mirroring technology used previously. It is still an encrypted transaction log stream replayed from the primary to the secondaries with the same overhead.

There can only be a total of 4 replicas of any single availability group.

  • Synchronous Replicas are limited to 2
  • Asynchronous Replicas are limited to 4

This means you can have 4 Async Replics or 1 Sync + 3 Async or 2 Sync + 2 Async but NOT 3 Sync + 1 Async.


  • Read-Only replica (independent statistics kept in tempdb)
  • Routing based on application intent
  • Increased number of worker threads
  • Increased cap from 20 to 200 Replicas / instance 
  • Replication Publishers and Subscribers are supported
  • Filestream / Filetable
  • Extended events are supported


  • Cannot be used across domains
  • Does not support the replication distributor database
  • No SQL Server Profiler Events are available for Availability Groups

Fancy stuff. You can set your read-intent route to point at a load balancer that will bounce you between several read-only instances if desired.


The combination of Failover Cluster Instances can be used in conjunction with Availability Groups for a very comprehensive solution (licenses may get pricey though). The sequence for doing this is:

  1. Install Windows Cluster Services
  2. Install SQL Server Cluster
  3. Enable Always On Availability Groups
  4. Create / Restore Databases
  5. Configure Availiability Groups

SQLCAT has a whitepaper on implementing this combined cluster.


  • Full and Differential backups shold be taken from the primary
  • Transaction Log backups can be taken from the secondary. The info is pushed up to the primary so it knows that the logs have been archived.
  • Only maintenance plans (and Ola's script) support backup rules so far.


  • An Availability Group Listener is created (more Active Directory fun) and all systems should connect to that name.
  • The Availability Group Listener is not supported by the SQL Server Browser and is recommended to live on port 1433 to avoid confusion.
  • If there is a read intent connection and no secondaries are available to service the request it will execute on the primary.

Monday, November 5, 2012

PASS Summit 2012 Pre-PreCon

I flew into Seattle last night and I'm here for PASS Summit 2012 this week. I can hardly wait for things to get going (hence blogging at 5am).

Why is it so exciting? Well this is what I liked so much last time.

This is a SQL Server conference, but it is not run by Microsoft and SQL Server (and Data in all it's varied forms) is the ONLY topic. This means that the party line was not always followed and DB folk were not playing second fiddle to Developers or Server Admins.

The presenters were all top-notch. Not just good presenters but some of the best SQL Server presenters from across the country (world?) are here. They are experts in their arena, not only specific parts of SQL Server but on the broader topics of data modeling, data integration, data warehousing or enterprise data architecture. There were discussions that continued long after the sessions were over and private dialogues sometimes continued past the conference over email.

The people, not just the presenters were great too. There is something special about being surrounded by a totally new group of peers with different and exciting experiences. Everyone was having fun sharing war stories about the weird corner-cases that were encountered.

I learned more from that week at PASS Summit 2009 than I ever expected.

This week is gonna be great.