SQL Server 2012 : SQL Server Architecture – THE LIFE CYCLE OF A QUERY (part 2) – Plan Cache

1. Plan Cache

Creating execution plans can be time consuming and resource intensive, so it makes sense that if SQL Server has already found a good way to execute a piece of code that it should try to reuse it for subsequent requests.

If no cached plan is found, then the Command Parser generates a query tree based on the T-SQL. A query tree is an internal structure whereby each node in the tree represents an operation in the query that needs to be performed. This tree is then passed to the Query Optimizer to process. Our basic query didn’t have an existing plan so a query tree was created and passed to the Query Optimizer.

Figure 3 shows the plan cache added to the diagram, which is checked by the Command Parser for an existing query plan. Also added is the query tree output from the Command Parser being passed to the optimizer because nothing was found in cache for our query.



Query Optimizer

The Query Optimizer is the most prized possession of the SQL Server team and one of the most complex and secretive parts of the product. Fortunately, it’s only the low-level algorithms and source code that are so well protected (even within Microsoft), and research and observation can reveal how the Optimizer works.

It is what’s known as a “cost-based” optimizer, which means that it evaluates multiple ways to execute a query and then picks the method that it deems will have the lowest cost to execute. This “method” of executing is implemented as a query plan and is the output from the Query Optimizer.

Based on that description, you would be forgiven for thinking that the Optimizer’s job is to find the best query plan because that would seem like an obvious assumption. Its actual job, however, is to find a good plan in a reasonable amount of time, rather than the best plan. The optimizer’s goal is most commonly described as finding the most efficient plan.

If the Optimizer tried to find the “best” plan every time, it might take longer to find the plan than it would to just execute a slower plan (some built-in heuristics actually ensure that it never takes longer to find a good plan than it does to just find a plan and execute it).

As well as being cost based, the Optimizer also performs multi-stage optimization, increasing the number of decisions available to find a good plan at each stage. When a good plan is found, optimization stops at that stage.

The first stage is known as pre-optimization, and queries drop out of the process at this stage when the statement is simple enough that there can only be one optimal plan, removing the need for additional costing. Basic queries with no joins are regarded as “simple,” and plans produced as such have zero cost (because they haven’t been costed) and are referred to as trivial plans.

The next stage is where optimization actually begins, and it consists of three search phases:

  • Phase 0 — During this phase the optimizer looks at nested loop joins and won’t consider parallel operators .
The optimizer will stop here if the cost of the plan it has found is < 0.2. A plan generated at this phase is known as a transaction processing, or TP, plan.
  • Phase 1 — Phase 1 uses a subset of the possible optimization rules and looks for common patterns for which it already has a plan.
The optimizer will stop here if the cost of the plan it has found is < 1.0. Plans generated in this phase are called quick plans.
  • Phase 2 — This final phase is where the optimizer pulls out all the stops and is able to use all of its optimization rules. It also looks at parallelism and indexed views (if you’re running Enterprise Edition).
Completion of Phase 2 is a balance between the cost of the plan found versus the time spent optimizing. Plans created in this phase have an optimization level of “Full.”
The term cost doesn’t translate into seconds or anything meaningful; it is just an arbitrary number used to assign a value representing the resource cost for a plan. However, its origin was a benchmark on a desktop computer at Microsoft early in SQL Server’s life.
In a plan, each operator has a baseline cost, which is then multiplied by the size of the row and the estimated number of rows to get the cost of that operator — and the cost of the plan is the total cost of all the operators.
Because cost is created from a baseline value and isn’t related to the speed of your hardware, any plan created will have the same cost on every SQL Server installation (like-for-like version).

Because our SELECT query is very simple, it drops out of the process in the pre-optimization phase because the plan is obvious to the optimizer (a trivial plan). Now that there is a query plan, it’s on to the Query Executor for execution.

Query Executor

The Query Executor’s job is self-explanatory; it executes the query. To be more specific, it executes the query plan by working through each step it contains and interacting with the Storage Engine to retrieve or modify data.

The interface to the Storage Engine is actually OLE DB, which is a legacy from a design decision made in SQL Server’s history. The development team’s original idea was to interface through OLE DB to allow different Storage Engines to be plugged in. However, the strategy changed soon after that.
The idea of a pluggable Storage Engine was dropped and the developers started writing extensions to OLE DB to improve performance. These customizations are now core to the product; and while there’s now no reason to have OLE DB, the existing investment and performance precludes any justification to change it.

The SELECT query needs to retrieve data, so the request is passed to the Storage Engine through an OLE DB interface to the Access Methods.

Figure 4 shows the addition of the query plan as the output from the Optimizer being passed to the Query Executor. Also introduced is the Storage Engine, which is interfaced by the Query Executor via OLE DB to the Access Methods (coming up next).


Access Methods

Access Methods is a collection of code that provides the storage structures for your data and indexes, as well as the interface through which data is retrieved and modified. It contains all the code to retrieve data but it doesn’t actually perform the operation itself; it passes the request to the Buffer Manager.

Suppose our SELECT statement needs to read just a few rows that are all on a single page. The Access Methods code will ask the Buffer Manager to retrieve the page so that it can prepare an OLE DB rowset to pass back to the Relational Engine.

Buffer Manager

The Buffer Manager, as its name suggests, manages the buffer pool, which represents the majority of SQL Server’s memory usage. If you need to read some rows from a page (you’ll look at writes when we look at an UPDATE query), the Buffer Manager checks the data cache in the buffer pool to see if it already has the page cached in memory. If the page is already cached, then the results are passed back to the Access Methods.

If the page isn’t already in cache, then the Buffer Manager gets the page from the database on disk, puts it in the data cache, and passes the results to the Access Methods.

The key point to take away from this is that you only ever work with data in memory. Every new data read that you request is first read from disk and then written to memory (the data cache) before being returned as a result set.

This is why SQL Server needs to maintain a minimum level of free pages in memory; you wouldn’t be able to read any new data if there were no space in cache to put it first.

The Access Methods code determined that the SELECT query needed a single page, so it asked the Buffer Manager to get it. The Buffer Manager checked whether it already had it in the data cache, and then loaded it from disk into the cache when it couldn’t find it.

Data Cache

The data cache is usually the largest part of the buffer pool; therefore, it’s the largest memory consumer within SQL Server. It is here that every data page that is read from disk is written to before being used.

The sys.dm_os_buffer_descriptors DMV contains one row for every data page currently held in cache. You can use this script to see how much space each database is using in the data cache:

SELECT count(*)*8/1024 AS 'Cached Size (MB)'
   ,CASE database_id
        WHEN 32767 THEN 'ResourceDb'
        ELSE db_name(database_id)
        END AS 'Database'
FROM sys.dm_os_buffer_descriptors
GROUP BY db_name(database_id),database_id
ORDER BY 'Cached Size (MB)' DESC

The output will look something like this (with your own databases, obviously):

Cached Size (MB)  Database
3287              People
34                tempdb
12                ResourceDb
4                 msdb

In this example, the People database has 3,287MB of data pages in the data cache.

The amount of time that pages stay in cache is determined by a least recently used (LRU) policy.

The header of each page in cache stores details about the last two times it was accessed, and a periodic scan through the cache examines these values. A counter is maintained that is decremented if the page hasn’t been accessed for a while; and when SQL Server needs to free up some cache, the pages with the lowest counter are flushed first.

The process of “aging out” pages from cache and maintaining an available amount of free cache pages for subsequent use can be done by any worker thread after scheduling its own I/O or by the lazy writer process, covered later in the section “Lazy Writer.”

You can view how long SQL Server expects to be able to keep a page in cache by looking at the MSSQL$<instance>:Buffer ManagerPage Life Expectancy counter in Performance Monitor. Page life expectancy (PLE) is the amount of time, in seconds, that SQL Server expects to be able to keep a page in cache.

Under memory pressure, data pages are flushed from cache far more frequently. Microsoft has a long standing recommendation for a minimum of 300 seconds for PLE but a good value is generally considered to be 1000s of seconds these days. Exactly what your acceptable threshold should be is variable depending on your data usage, but more often than not, you’ll find servers with either 1000s of seconds PLE or a lot less than 300, so it’s usually easy to spot a problem.

The database page read to serve the result set for our SELECT query is now in the data cache in the buffer pool and will have an entry in the sys.dm_os_buffer_descriptors DMV. Now that the Buffer Manager has the result set, it’s passed back to the Access Methods to make its way to the client.

A Basic SELECT Statement Life Cycle Summary

Figure 5 shows the whole life cycle of a SELECT query, described here:



1. The SQL Server Network Interface (SNI) on the client established a connection to the SNI on the SQL Server using a network protocol such as TCP/IP. It then created a connection to a TDS endpoint over the TCP/IP connection and sent the SELECT statement to SQL Server as a TDS message.
2. The SNI on the SQL Server unpacked the TDS message, read the SELECT statement, and passed a “SQL Command” to the Command Parser.
3. The Command Parser checked the plan cache in the buffer pool for an existing, usable query plan that matched the statement received. When it didn’t find one, it created a query tree based on the SELECT statement and passed it to the Optimizer to generate a query plan.
4. The Optimizer generated a “zero cost” or “trivial” plan in the pre-optimization phase because the statement was so simple. The query plan created was then passed to the Query Executor for execution.
5. At execution time, the Query Executor determined that data needed to be read to complete the query plan so it passed the request to the Access Methods in the Storage Engine via an OLE DB interface.
6. The Access Methods needed to read a page from the database to complete the request from the Query Executor and asked the Buffer Manager to provision the data page.
7. The Buffer Manager checked the data cache to see if it already had the page in cache. It wasn’t in cache so it pulled the page from disk, put it in cache, and passed it back to the Access Methods.
8. Finally, the Access Methods passed the result set back to the Relational Engine to send to the client.