On Nomenclature, and Execution Plan Plus
Nomenclature, or the science of using words to name things using a logical system, is important since accurate names portray with clarity what the thing does. Recently, there’s been a suggestion, posited by Erin Stellato and Hugo Kornelis in a detailed and well-thought-out blog post, to rename the Actual Execution Plan used by SQL Server1 to “Execution Plan Plus Runtime Statistics” or “Execution Plan Plus”, for short.
Arguably, many people, especially those new to looking at plans, simply refer to them as either plans or execution plans, when talking about both the estimated and actual plans. The terms “Actual” and “Estimated” are perhaps not the most explicit names possible, however they are widely used, and concise, which is why people understand what they imply. I can see why it might be worthwhile to rename the plans to something more understandable, however adding “Plus” to the name, and removing “Actual” does not accomplish that.
It might be worthwhile renaming the “Actual Execution Plan” to “Runtime Execution Plan”, and retaining the “Estimated Execution Plan” name for estimated plans. Live Query plans, that can be seen with statistics as they happen via the SSMS GUI, could be called, somewhat intuitively and perhaps unsurprisingly, “Live Query Plans”. There are many technical reasons why the names for actual and estimated plans should be differentiated; I believe adding “Plus” to the name is simply meaningless to most people, especially for folks that are new to SQL Server.
I don’t agree that the proposed names offer the clarity that Grant Fritchey claims. “Estimated Execution Plan”, “Runtime Execution Plan”, and “Live Execution Plan” make more sense, and confer much-needed clarity.
Clarity is defined by the Cambridge Dictionary as the quality of being clear and easy to understand. In that light, the “Estimated Execution Plan” shows what SQL Server estimates it will do when executing the given query. The word “estimated” clearly denotes this may not be what is actually decided-upon for the execution plan at runtime.
“Runtime Execution Plan” denotes that this plan is the one actually used by SQL Server when processing the given query. “Runtime” is concise, clear, and easy to understand for even the most novice SQL Server user.
In the name “Live Execution Plan”, the term “live” is very easily interpreted to mean “that which is currently happening”, as in the definition contained in the Cambridge Dictionary.
The problem I see with the existing name, “Actual Execution Plan”, is that the term “actual”, when heard in a spoken sentence, can easily be misinterpreted as if the term didn’t exist. For example, the spoken sentence “show me the actual execution plan” could be taken as “show me the execution plan”, since actual is not easily recognizable as a noun in that context. Adding “Plus Runtime Statistics” to the name does eliminate the possibility of confusion since it contains the term runtime, however “Execution Plan Plus Runtime Statistics” is wordier than it needs to be. This is confirmed by the desire to shorten the name to “Execution Plan Plus”. The name I am proposing, “Runtime Execution Plan,” places emphasis on the term runtime, which is unmistakable in its meaning.
Others have already discussed, ad nauseam, the important differences between estimated plans and runtime plans. Suffice it to say, it is important that there be a readily apparent differentiation between the plans generated as an estimate, and those used at runtime. Differences between estimated and runtime plans can be easily seen with this sample code showing one interesting feature of the SQL Server query engine, Parameter Embedding Optimization2:
1 2 3 4 5 6 7 8 9 10 |
DECLARE @dummy int = 1; SELECT * FROM master..spt_values WHERE @dummy = 1 UNION ALL SELECT * FROM master..spt_values WHERE @dummy = 0 OPTION (RECOMPILE); |
The code above generates the following Estimated Execution Plan:
The Runtime Execution Plan for that query:
While they’re at it, perhaps Microsoft should consider renaming Extended Events, Profiler Plus. 😃
1 – The Actual Execution Plan conveys details of the work performed by SQL Server while running the T-SQL code declared in the query passed to it. It’s named the “actual” execution plan since it’s the plan that was actually used during execution. SQL Server doesn’t actually run the execution plan, it’s merely a graphical or textual representation of the steps taken by SQL Server encapsulated in an easy to understand format. I use the term “Actual” here to denote that the plan is what was actually used; another important aspect of that name is in the distinction between estimated rows, estimated rebinds and rewinds, estimated I/O costs, versus the actual, recorded values for number of rows read, rebinds, rewinds, etc.
2 – Thanks, Joe, for that succinct example!
First of all: thanks for reading what Grant, Erin and I have written, and for taking it seriously. I am happy with all the exposure we can get for our proposal.
After reading your reasons why you disagree with our proposal, I am going to cite one fragment from your post, because that fragment gives me the perfect opportunity to clarify our proposals, and the reasons for it:
“In that light, the “Estimated Execution Plan” shows what SQL Server estimates it will do when executing the given query. The word “estimated” clearly denotes this may not be what is actually decided-upon for the execution plan at runtime.”
And this is EXACTLY the incorrect assumption that is currently being by the terms Microsoft used.
The so-called “estimated execution plan” is NOT what SQL Server estimates it will do when executing the query. It is what SQL Server WILL do (if I had formatting tools in this comment, I’d have used bold and italics there).
SQL Server is not architected to first go “oh, I think I’ll do the query that way”, and then later at run-time reconsider. It’s architected to create a plan, and then follow that plan.
Yes, there are cases where a query gets recompiled. A recompile causes the old plan to be discarded and a new plan to be generated. If anything has changed in the input to the query optimizer, then a new plan might be different.
If you look at what is currently called an estimated execution plan and think “this is what SQL Server expects it will do when I run the query”, then you are getting it wrong. And then you are proving exactly why these names need to change.
When you look at an “estimated” execution plan, then you SHOULD think “this is what SQL Server WILL do when I run the query, except when (a) the plan was cached AND there is reason for recompile AND at recompile time reality has changed enough to get a different plan; or (b) the plan was not cached AND by the time I run this query reality has changed enough to get a different plan”.
The example at the end is actually pretty interesting. It showcases an artefact of variable inlining, where the variable is actually replaced by a hardcoded value in the query before the execution plan is generated (look at the query text at the top of the execution plans in the screenshot to see what I mean). This is not showcasing a difference between “estimated execution plan” and “actual execution plan”. This is showcasing a difference between “execution plan created normally” and “execution plan created using the special process, that has one extra step, and that can only be used for a run-time recompile of a query that has the OPTION(RECOMPILE) hint”.
Bottom line, and I’ll repeat this here to reinforce this: the term “estimated execution plan” clearly causes people, such as you, to think that this plan is what SQL Server “estimates it will do when executing the given query”. And the term “actual execution plan” then completes the picture by suggesting it means to say “oh, but here’s what it really did”.
If words cause confusion, then the best course of action is to change those words. That’s what Erin, Grant, and I have now proposed.
(And while I agree that “execution plan plus” is by itself a rather vague term, let’s not forget that it’s actually a short form for “execution plan plus run-time statistics”, and I hope you can agree with me that there’s nothing unclear about THAT term)
Are you saying SQL Server doesn’t have the ability to change the plan at Execution Time? Seems to me there are some newer features built in that allow SQL Server to switch to different plan operators as it encounters situations recognized as sub-optimal at runtime. Adaptive Joins comes to mind – while the operator does show up in estimated execution plans, you won’t see what is actually chosen until runtime. What about memory grant feedback? Table variable deferred compilation?
There are probably (undoubtedly) capabilities I’ve not thought of here. However, that’s not really the point. The names “Estimated” and “Runtime” clearly denote the differences necessary for new users (your apparent target audience for this change) to more easily and readily understand why we’re constantly referring to the “real” plan.
Whew! Lots of new comments. Thanks to all who commented.
Let’s see if I can address all feedback, one comment at a time.
“Are you saying SQL Server doesn’t have the ability to change the plan at Execution Time?” – yes, that’s exactly what I am saying.
“Seems to me there are some newer features built in that allow SQL Server to switch to different plan operators as it encounters situations recognized as sub-optimal at runtime.” – exactly BECAUSE SQL Server does not have the ability to change the plan at execution time, some features have been added (starting SQL2017) to allow at least some (very limited) adaptation at run-time.
You mention Adaptive Join. This is an operator that at run-time decides to not execute one branch and do execute another. Very similar to the functionality already provided since many versions ago by the Startup predicate in a Filter operator, or the Pass-Through predicate in a Nested Loops. None of these change the plan. They just at runtime decide not to call a child operator,
You mention Memory Grant Feedback. This is implemented by changing the cached execution plan post-execution. You will not see this change in the execution plan plus after execution, since the change is made after returning the plan, You will see the change if you request the execution plan, or run the query again with execution plan plus run-time statistics.
Table-variable deferred compilation should actually be called table-variable deferred recompilation, because that’s what it does: it forces a query that uses a table variable to recompile before execution. So it uses the standard recompile mechanism.
You didn’t mention interleaved execution; I will because this is so far the only feature that actually can cause an execution plan to change during execution. It’s still the standard recompile mechanism, but the unique element here is that this recompile is triggered mid-execution. Run with execution plan plus, and you’ll see the plan after recompile because the plan was changed before execution finished. Request execution plan after the query has finished, and (assuming standard plan caching rules and no weird exceptions) you’ll see that changed version of the plan, Run the same query again, and that changed version of the plan will be used (this time without recompile). Apart from the rather unusual timing of the recompile, it still behaves like a standard recompile in each and every way.
“The names “Estimated” and “Runtime” clearly denote the differences necessary for new users” – the whole point is that we want to denote that there are no differences between plans. When (if!) a query gets recompiled, there will be older/previous versions of the plan versus the current/most recent version of the plan. If a query does not get recompiled, then there is only “the” plan.
“to more easily and readily understand why we’re constantly referring to the “real” plan” – but what do you mean by “the ‘real’ plan”? To me, this term implies that there are also … I don’t know, fake plans? imaginary plans?
All plans are real. Yes, there is a chance that they get discarded in favour of a recompile before the next execution. This may be after the current plan has been used a million times already, or after it has been used just once, or even before it ever had the chance to get used. But the fact that not a single plan is guaranteed to remain current for the next execution does not mean that they are not real.
You said “let’s not forget that it’s actually a short form for “execution plan plus run-time statistics””, but not only people will absolutely forget that, but there will be much more people that will never know that it’s a short form of that. In my opinion the important part of “execution plan plus run-time statistics” is actually “runtime”, so it makes more sense to me that the short form would be “Runtime Execution Plan” rather than “Execution Plan Plus”
Hi Lamak,
I’ll be the first to admit that Execution Plan Plus is probably my least favourite of the proposed new names. And I’m looking at all proposals with an open and welcoming eye.
However, my fear with your suggestion of “Runtime Execution Plan” is that it might cause the same misunderstanding the current naming does,
Currently: “Yeah, I see that execution plan, but it’s only the estimated plan. Perhaps the actual plan will be different”.
My fear for your suggestion: “Yeah, I see that execution plan, but the query didn’t run yet. Let’s execute it and look at the runtime execution plan instead”.
The naming still (to me, at least) suggests that the runtime execution plan is a different type of plan, and that it can be different by virtue of being the runtime plan. And that’s exactly the kind of misunderstanding that our proposal tries to prevent.
The main issue I have is that the terminology you’ve chosen is clunky and completely ineffective, and no matter how many times you hear that you don’t seem to accept it. It doesn’t sound like you actually want feedback, here. Perhaps you estimated feedback would be all positive? Knowing people who gave you feedback during the process and were ignored certainly doesn’t make it seem like you’ve got any real interest in it.
I imagine every conversation going something like this:
“It’s the execution plan plus”
“Plus what?”
“Plus actual rows, number executions…”
“So it’s the actual plan?”
Leaving aside the technical inaccuracies and ambiguities in your comment, these terms do not make teaching people about execution plans any easier.
Erik, I’m sorry if I gave you or others the impression of not genuinely looking for feedback.
We (Erin, Grant, and I) did indeed reach out to several people with requests for feedback. And with hindsight I will admit that we didn’t do a good job of looping back to those people to tell them which suggestions we took, which we didn’t, and why we didn’t. As the main author of the blog, I’ll take that to heart and apologize to all who feel ignored,
However, we did look at all feedback carefully. We did incorporate a lot of suggestions into the text of the blog. And we didn’t reject anything without consideration.
I’m not claiming that our suggested names are perfect. I am saying that so far, the alternatives I have seen have (in my opinion) bigger drawbacks than the one we suggested. And I’m also saying that they are definitely better than what we have now.
“I imagine every conversation going something like this: (…)”
Yes, for now that will probably be the case. Because it’s a new term and people need to relate that to familiar terms.
But my hope is that in two or three years time, conversations will instead go like:
“I saw an old blog post somewhere, it talks about an actual execution plan … ever heard of that?”
“Yeah, that’s the execution plan after execution, you know, with all the runtime statistics added.”
“Oh, you mean the execution plan plus?”
“Yes, exactly. That blog post must be from before 2020.”
“these terms do not make teaching people about execution plans any easier.” – if MS does not change the terminology in their tools, and if the community remains decided and doesn’t all settle on a single terminology, then you are 100% right. It will be confusing as hell if multiple terms are used for the same thing. And if that happens, if the aggregate of community and Microsoft (and, to an extent, tool vendors) doesn’t all embrace the change, then I know that at one point we’ll have to admit defeat and accept that the old terms are here to stay, with all their drawbacks.
But if Microsoft DOES decide to change the terms in their tools and official documentation to new terminology, and if all those who produce community content switches to that same terminology, then the end effect will be that, after a tough transition period, the terms have changed but there are no multiple terms for the same thing.
Then the only confusion possible from terminology is “does the term accurately describe the thing, or does it convey unwanted and inaccurate associations in the reader’s mind”. I still maintain that “execution plan”, “execution plan plus run-time statistics”, and “execution plan with live statistics” do a better job in that area than “estimated execution plan”, “actual execution plan”, and “live query statistics”.
And I’m still open to hear better suggestions. But I’ll weigh them on the same scale I weighed the current names, and our own suggestions.
Thanks for your comment!
“This is showcasing a difference between “execution plan created normally” and “execution plan created using the special process, that has one extra step, and that can only be used for a run-time recompile of a query that has the OPTION(RECOMPILE) hint”.”
So… this scenario is so rare that it doesn’t matter if it happens at all?
That would be subjective, as I’ve seen enough of them (catch-all/reporting queries anyone?), and what this showcases is that the estimated plan is different than the actual plan – not just the same plan plus runtime statistics.
“So… this scenario is so rare that it doesn’t matter if it happens at all?”
Absolutely not, and I’m sorry if it came across that way!
I actually think that OPTION(RECOMPILE) is the only hint I have ever used outside of demo and test code. It is a great tool to solve certain specific problems.
The current “actual” / “estimated” terminology suggests that these are two different types of plan. The code sample at the end of the blog post above seems to support this idea, because the plan you see is different depending on which of the two options you use.
My response was intended to explain that this difference is not caused by pressing another button. It is caused by the very specific combination of (1) using OPTION(RECOMPILE), and (2) having a variable in the query.
If one believes that this specific scenario is important enough that it requires a special adjective for the execution plan in this case, then it should be an adjective that hopefully permeates, and at least is restricted to, this very specific combination of circumstances. The “actual” / “estimated” adjectives cannot do that, because “actual” applies to ALL cases where an execution plan is returned post-execution and with run-time statistics.
I personally don’t think we should introduce an adjective for the specific case of variable inlining. Not because it’s not important, ut because it’s too specific. Instead, I think we should have a clear understanding of what an execution plan is, when and how it is compiled, recompiled, cached, used, and removed from cache. And we should also have a clear understanding of what variable inlining is, when it can be applied and when not, and how it can affect execution plans.
FYI, comments here do support the comment markup provided by JetPack
Here’s a question. What kind of plan is in Query Store? Is it an “estimated plan” or an “actual plan” using the old terminology?
Well, it has no runtime metrics, no “actuals”, so it’s an “estimated plan” right? However, it’s stored after recompiles are completed, after execution, so it’s an “actual” plan right? But, not runtime metrics, so it can’t be an actual plan, can it?
What about the plans in cache? Are they estimated or actual? Again, no runtime metrics, therefore, estimates, but, they’re stored in cache because they’re the plans used, therefore, actuals, but, no runtime metrics. My head hurts.
The core of the problem is that “estimated’ and “actual” are grossly inaccurate. They lead people to thinking that these are literally different things, when they are not. What is currently called an “actual” plan is just the same plan as the “estimated” plan with the addition of runtime metrics. The arguments people keep making (after making fun of Plus) is that “well, those estimated plans are different” when they’re not. They simply are not. Now, might you, under circumstances laid out above, see different plan shapes? Sure. Are those different plan types? This is the core.
Honestly, Runtime Execution Plan isn’t a bad nomenclature. In fact, Execution Plan, Runtime Execution Plan, Live Execution Plan, that works. Maybe this can be the naming going forward. However, arguing that Estimated and Actual are accurate, that doesn’t fly. They’re not accurate. At all. As Hugo points out, in the blog post you argue that an estimated plan is a different thing, when it just is not. This is caused by the inaccuracies that this nomenclature engenders into the brain. Naming does matter. Estimated and Actual are broken. Picking the right new names should be the argument, not that Estimated & Actual work.
The whole discussion (here and elsewhere) is a little overwrought.
Use the plan term(s) that make most sense in context. Add extra explanation when necessary. There’s no single word or terminology that will satisfy everyone and encompass every nuance of what is a very complex topic.
Most of the time I am happy to say estimated or actual plan at least as a starting point.
How often are people encouraged to compare estimated plan data with actual values? That’s partly where the name came from, and why it has stuck. It is a useful shorthand, especially when talking about query tuning.
An estimated plan does not affect persistent user data or return query results. The estimated plan is based on optimizer estimates and contains only estimated numbers. It’s an estimate that might or might not turn out to be ‘good enough’ (accurate enough) in practice. All in all, the word estimated seems a reasonable enough choice to me.
A plan that is executed (to affect the database or return results) may contain actual numbers relating to that single execution. It is the plan that was actually used at runtime. It still contains the optimizer estimates, but adds values actually encountered during execution. If we need a word to contrast with estimated, then actual seems like a reasonable choice, all things considered.
People often request an estimated plan to see how the system estimates it will perform the requested actions. The plan might be the same (shape) when actually executed later, or it might not. What does it mean to say that an estimated plan that isn’t cached is the same plan as the one produced by a later execution of the same statement? It might be close to identical, but it’s not the same plan because that was thrown away.
We cannot simply handwave away recompilations or relegate them to a footnote as if they were an uncommon thing, with only a single cause. It is tautological to say, “The actual plan is the same as the estimated plan unless something changed”.
Moreover, if the estimated (compiled) and actual (executed) plans look different, then they are different for most practical purposes. If I can see the plan operators for a UDF in an estimated plan but not in an actual plan, why can I not regard these plans as “different” in practical terms?
It only makes sense to assert there is only one “type” (whatever that means) of execution plan in a narrow technical sense, and even then we can argue about it. What’s the use of it? Same goes for talking of ‘decorating’ the One True Plan with runtime data. Fair enough (plans that change, and display differences aside!) but what advantage is there? Which serious problems have been caused by the current common terminology? The authors of the original post didn’t say or at least they didn’t give compelling practical examples.
Now, we are not always talking about interactive SQL or query tuning. If we’re talking about the plan cache or query store, it may make more sense to talk about compiled or stored plans. But I won’t berate anyone too much for referring to them as estimated plans because they do indeed contain only estimates and the term conveys a familiar meaning at little cost.