Skip to content

Daniel Crawford's Data Blog

Tips for performant database designs

Menu
  • Home
  • Blog
  • Resources
  • About
  • Contact
Menu

Synapse Fundamentals for Tuning – Nesting

Posted on November 23, 2022September 5, 2025 by Daniel Crawford

Calling stored procedures from other stored procedures or referencing a view within another view can introduce performance complexities that are difficult to troubleshoot.  I will breakdown views and stored proc nesting issues individually, but the same message applies to both.  Avoid heavily nested objects in Synapse if reliable performance is a priority.  Understandably, we like the idea of nesting objects for security purposes, reusability, and as building blocks for larger processes and queries.  There are, however, other ways of accomplishing the same tasks but they do regrettably require more code to maintain better control of query plans, tempdb consumption, and memory consumption.

Views

Query engines (both SQL and DSQL) attempt to do predicate pushdowns to get a query plan that is the most efficient.  The challenge to optimizing nested views is that many times there are no statistics for intermediate datasets or there are calculated columns (think of case statements, field manipulation, or date type conversions in the select clause of the nested view) that are needed to complete the larger query that appears to only be querying a single view.  As discussed before, statistics are key to getting the best query plan and nested views work against us in tuning efforts.  Trying to detangle query plans from one view to the next is difficult, if not impossible, when nested views involve many tables, and sometimes are referenced multiple times.  The query plans that are produced from your estimated query plans or even the actual plan post execution (when reviewing execution steps from DMV’s) are confusing.  And make it nearly impossible to understand why a table is being broadcast or shuffled.  Tuning this type of query is a very time-consuming task that can have varying return on investment since the execution may not even be consistent from one execution to the next. In these cases, it is recommended to rewrite the query to meet the business need.  This means going back to the business to understand what they are trying to accomplish and then dissecting the end user query and the associated views to understand why data is being manipulated.  Rewriting the query without views also means that we are no longer reusing code… or does it?  It is likely that if we are nesting views, there is some data transformation or reusable code that should be pushed further upstream.  Nesting views typically means we are applying business logic repeatedly at time of read instead of pushing that logic into our transformations during data preparation of the warehouse.  Remember that one of the main reasons for a data warehouse is data preparation for performant analytics and reporting.  If you have nested views, you may not actually be preparing your data for read performance. 

What about security?

There are some customers that leverage views to abstract tables for end users and then apply a layer of security for data access.  In essence, they are creating a virtual datamart over the top of the warehouse. In some cases, filter criteria is applied to a view based upon the login to further restrict data access at a row level.  If you are applying filter criteria in a view based upon a username, I beg you to explore the Row Level Security features.  For anyone else that is simply using views as a way to abstract tables and apply security to tables, I have a few comments and recommendations:

1.) Review if the views are adding any benefit over applying the same security (select permissions) to the base tables.  (Remember that Row Level Security and Column Level Encryption are additional options if there is concern about fine grain data security within a table.) 

2.) Review how users are accessing the data.  If you are applying filters, manipulating the data, and applying business logic within the view specific to a particular user group, you may want to consider materializing that data in a separate table instead.  (Think of creating a physical data mart rather than a virtual data mart.)  This may go against some core design principles that are proliferated today. Engineers and architects are looking for reusability, a single source of truth, and a single copy of the data but I will point you back to my previous comment that part of data warehousing is preparing data for performant analytics and reporting.  Also, if you wanted to really adopt the “one copy of data” approach, we would just query directly from the transactional source system.  I don’t want to see data duplicated no more than the next engineer/architect, but I would rather the design be performant and non-impactful to the business than a completely sterile environment with strict one copy rules that may not benefit the business.

Stored Procedures

In Synapse Dedicated SQL Pools, I recommend taking a train car approach with small units of work that can be combined to get to your desired outcome (data transformation or query result set).  Trying to do too much all in a single stored proc leaves session level temp tables and caches intact until the parent session is deallocated.  In some extreme cases, I have seen parent stored procs run not just hours but days.  This creates tempdb and memory pressure and is compounded by poor performance of queries within the stored procs.  It is best to break up the end-to-end unit of work into bite size pieces.  Manage your own temp tables (and stats on those temp tables if needed) instead of letting the DW query engine create its own QTables for data movement.  Thoughtfully leverage temp tables in a way that will minimize long term storage. Purposefully deallocate/drop temp tables when done using them.  If you are doing a multi-step process, consider re-runnability of the stored proc.  It may make sense to execute the large stored proc in smaller pieces that are orchestrated via a Synapse pipeline to guarantee sequential execution of steps which would allow for failed processes to be rerun independently of the parent process. 

As an example, instead of creating a master stored proc to sequentially loop through reprocessing partitions in a large fact table by calling child stored procs, consider having a single stored proc with the base code to process a single partition then orchestrate the processing of partitions via a Synapse pipeline so multiple partitions can be processed in parallel.  This allows for re-runnability of specific partitions if needed as well as parallel processing.  If you are worried about data consistency for end users in between multiple stored procs being called separately or being able to rollback, remember that Synapse has a great RENAME feature which could allow you to segregate the processing of your data from your production end-user consumable tables.  You can transform/process your data in a set of tables identical to your serving layer and isolate them for just processing.  When processing completes, simply perform a rename of the tables to expose them as prepared tables with no impact to end users.

In Summary

Nesting relinquishes control of your queries and processes back to the DW engine.  While the DW engine works well for most scenarios, it doesn’t know your exact scenario; therefore, I always recommend maintaining control over your own database tuning.  This means de-normalizing and making copies of tables for performance or security purposes.  It also means creating re-runnable isolated and atomic processes that are manageable for developers, easy to maintain, and easy to tune.

Loading

Categories

  • Architecture Patterns
  • Fabric
  • Performance Tuning
  • Synapse
  • Top 10 Performance Considerations

Archives

  • July 2023
  • January 2023
  • December 2022
  • November 2022
  • September 2022
  • August 2022

Recent Synapse Videos

In this video Bogdan joins Stijn to talk about Microsoft Fabric performance and what happens underneath the hood while processing a query! <br /><br />  <br /><br />Polaris white paper: https://www.vldb.org/pvldb/vol13/p3204-saborit.pdf <br /><br />  <br /><br />Bogdan Crivat - VP Synapse Analytics<br /><br />https://twitter.com/bogdanC_guid <br /><br />https://www.linkedin.com/in/bogdanc/ <br /><br />  <br /><br />Stijn Wynants - Senior Product Manager <br />https://www.linkedin.com/in/stijn-wynants-ba528660/ <br />https://sql-stijn.com/ <br />https://twitter.com/SQLStijn
Performance at Scale with Microsoft Fabric: Query Processing!
As part of our Fabric Espresso series, we're diving deep into the realm of Data Engineering and Data Science. Join us as our senior product managers - Estera Kot, Ted Vilutis, and Stijn Wynants discuss the crucial features of Microsoft Fabric that every data engineer should know about! <br /><br />From insights into data engineering within Microsoft Fabric to decision guides for copying data into Fabric, and shortcuts that point to other storage locations - we've got it all covered in this exciting new video. <br /><br />Check out these key resources for a deep dive: <br /><br />👉 Data Engineering in Microsoft Fabric: https://learn.microsoft.com/en-us/fabric/data-engineering/data-engineering-overview  <br /><br />👉 Decision Guide to Copy Data into Fabric: https://learn.microsoft.com/en-us/fabric/get-started/decision-guide-pipeline-dataflow-spark <br /><br />👉 Shortcuts to Other Storage Locations: https://learn.microsoft.com/en-us/fabric/onelake/onelake-shortcuts <br /><br />  <br /><br />Meet the Speakers: <br /><br />1️⃣ Stijn Wynants: Senior Product Manager at Microsoft <br /><br />LinkedIn: https://www.linkedin.com/in/stijn-wynants-ba528660/ <br /><br />Twitter: https://twitter.com/SQLStijn <br /><br />Blog: https://sql-stijn.com/ <br /><br />  <br /><br />2️⃣ Estera Kot: Senior Product Manager at Microsoft <br /><br />LinkedIn: https://www.linkedin.com/in/esterakot/ <br /><br />Twitter: https://twitter.com/estera_kot <br /><br />  <br /><br />3️⃣ Ted Vilutis: Senior Product Manager at Microsoft <br /><br />LinkedIn: https://www.linkedin.com/in/tedvilutis/ <br /><br />Twitter: https://twitter.com/tvilutis <br /><br />  <br /><br />Don't miss out on this opportunity to learn directly from the Microsoft Fabric Product Group Team and elevate your data engineering skills with Microsoft Fabric! 🚀
Top Microsoft Fabric Features that Every Data Engineer Should Know
Welcome to our Fabric Espresso series! In this video Ambika joins Stijn to talk about table clone in Warehouse within Microsoft Fabric. We will talk a bit more deep on what is a table clone, when and how to use a table clone.<br /><br />Clone table: https://learn.microsoft.com/fabric/data-warehouse/clone-table<br />Clone table using T-SQL: https://learn.microsoft.com/fabric/data-warehouse/tutorial-clone-table<br />CREATE TABLE AS CLONE OF: https://learn.microsoft.com/sql/t-sql/statements/create-table-as-clone-of-transact-sql?view=fabric<br /><br />Ambika Jagadish - Product Manager for Warehouse in Microsoft Fabric<br />www.linkedin.com/in/ambikajagadish
Fabric Espresso: Table clone in Warehouse within Microsoft Fabric
Welcome to the fourth video in our How To series for Real-Time Analytics in Microsoft Fabric!<br /><br />In this video, Guy Reginiano, a Product Manager for Synapse Real-Time Analytics in Microsoft Fabric, will show you how to get data from Azure Event Hubs into a Real-Time Analytics KQL database in Microsoft Fabric.<br /><br />00:07 Introduction<br />00:25 Basic definitions<br />01:12 Event Hubs ingestion demo<br />06:34 Interact with the ingested data using KQL<br />07:54 Create a Cloud Connection in Fabric<br />10:24 Wrap-Up<br /><br />#microsoftfabric #synapserealtimeanalytics
Synapse Real-Time Analytics: Stream data from Event Hubs into a KQL database for powerful analytics
Welcome to our Fabric Espresso series! In this video Bogdan joins Stijn to talk about Microsoft Fabric and what place Azure Synapse has in all of this. We will talk a bit more deep on the engine that was created for Fabric Warehouse and how it compares to Dedicated SQL Pools!<br /><br />Polaris white paper: https://www.vldb.org/pvldb/vol13/p3204-saborit.pdf<br /><br />Bogdan Crivat - VP Synapse Analytics<br />https://twitter.com/bogdanC_guid<br />https://www.linkedin.com/in/bogdanc/<br /><br />Stijn Wynants - Senior Program Manager<br />https://www.linkedin.com/in/stijn-wynants-ba528660/<br />https://sql-stijn.com/<br />https://twitter.com/SQLStijn
Fabric Espresso: Will Fabric replace Azure Synapse?
Load More... Subscribe
This error message is only visible to WordPress admins

Cannot collect videos from this channel. Please make sure this is a valid channel ID.