Daily Loads: ~551 minutes (over 9 hours)
Hourly Loads: ~299 minutes (almost 5 hours)
Reduce overall compute cost without increasing infrastructure spend.
Improve runtime performance to make data more readily available to consumers.
Scale effectively with anticipated data growth.
Minimize disruption to existing pipelines and the development team.
No radical re-engineering of the solution.
Cost-sensitive adjustments recommendations had to decrease, not increase, spend.
Preserving referential integrity and business logic already embedded in the ETL.

Primary Key Optimization: Enabled OPTIMIZE_FOR_SEQUENTIAL_KEY on clustered indexes. This reduced insert contention, cutting ~1 minute from a 10M-row load.
Foreign Key Handling: Deferred FK enforcement until after full load completion, removing constraint overhead during inserts.
Index Management: Retained existing disable/rebuild cycle for non-clustered indexes.
Shifted compute from Azure IR to Self-Hosted IR (SHIR), reusing idle clusters and avoiding Azure IR startup costs.
Introduced Copy Activities with bulk insert locks and Degree of Parallelism (DOP=4).
Prototyped Deductible Pipeline Results:
Runtime reduced from 28m22s → 24m56s (10% faster).
Azure IR activity time reduced from 28m22s → 20m9s (28% less).


Moved schema swaps, truncates, and index management to SHIR where possible to further reduce Azure IR vCore usage.
Added performance monitoring at job and process-group levels (average runtimes, rolling 30-day views) to highlight creep and ensure early detection of bottlenecks.
Introduced wait/suppress logic for raw-to-structured transforms to prevent overruns and wasted extracts, ensuring business-day data availability.
Cost Savings: By shifting major workloads from Azure IR to SHIR and reducing compute intensity, we cut the largest cost driver (~90% of subscription spend) significantly.
Performance Gains: End-to-end runtime reductions of 10–28% were achieved in large pipelines, with parallelization improving throughput.
Reliability: Edge-case failures due to concurrent jobs were reduced, improving data availability within the same business day.
Scalability: The framework now supports extension of these optimizations to additional large pipelines, enabling incremental improvements without disruption.

Targeted tuning delivers outsized results: simple database index adjustments combined with workload reallocation yielded double-digit performance gains.
Self-Hosted IR is a powerful lever: when used correctly, it provides cost-effective compute and avoids the hidden costs of Azure IR vCore uptime.
Visibility prevents regressions: ongoing performance reporting allows the team to spot inefficiencies early.
Incremental adoption is key: rolling out changes pipeline-by-pipeline minimized risk while capturing savings steadily.
© 2025 Macer Consulting • All Rights Reserved.