Top Tips to Improve Your SQL Server Database Performance

Ehud Eshet, Chief Architect, Precise Software Solutions

A series of blogs about the critical problem faced by today’s DBAs

#1: The Silent Killer: The Impact of I/O Performance on the Database

Most people know that I/O performance is critical. In many environments, I/O is the main bottleneck of a database, whose main task, after all, is to retrieve and store records within files. When encountering I/O performance issues, many DBAs will blame their storage administrator, demanding expensive storage hardware upgrades or changes to the way they manage the database-to-storage connection. But new hardware is very expensive, and, in our experience, simple changes and small investments can often yield performance boosts on par with the most expensive upgrades.

This blog is part of a series of posts targeting SQL Server DBAs who want to improve I/O and overall performance of their databases.  While it primarily deals with SQL Server terminology and concepts, many of the lessons apply just as well to other database systems.

Identifying slow I/O

When your database is encumbered with what seem to be I/O performance issues, your first step should be to determine whether I/O is indeed the performance bottleneck of your database, and, if so, the root cause of its slowness.

There are many ways to identify I/O statistics related to your SQL Server performance.

  1. You can use Windows tools such as “perfmon” to keep track of disk statistics. Per Windows volume (identified by drive letter), you can get the average read latency, average write latency, and average disk queue length.
  2. SQL Server itself provides a wealth of I/O-related information in its Dynamic Management Views (DMVs) and functions:
     • File I/O statistics are reported by the “sys.dm_io_virtual_file_stats” function.
     • More granular table and index I/O statistics are reported by the “sys.dm_db_index_operational_stats” function.
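As a rough illustration of how such counters turn into latency figures, the sketch below computes average read and write latency for one file over an interval. It assumes two snapshots of the cumulative counters that “sys.dm_io_virtual_file_stats” exposes (num_of_reads, io_stall_read_ms, num_of_writes, io_stall_write_ms); the FileIoSnapshot class, the helper names and the sample numbers are made up for the example.

```python
from dataclasses import dataclass


@dataclass
class FileIoSnapshot:
    """Cumulative counters for one database file at one point in time."""
    num_of_reads: int
    io_stall_read_ms: int
    num_of_writes: int
    io_stall_write_ms: int


def average_latency_ms(stall_ms: int, operations: int) -> float:
    """Average wait per I/O; 0.0 when no I/O happened in the interval."""
    return stall_ms / operations if operations else 0.0


def interval_latencies(before: FileIoSnapshot, after: FileIoSnapshot) -> tuple:
    """Return (avg read ms, avg write ms) for the interval between two snapshots."""
    read_ms = average_latency_ms(
        after.io_stall_read_ms - before.io_stall_read_ms,
        after.num_of_reads - before.num_of_reads,
    )
    write_ms = average_latency_ms(
        after.io_stall_write_ms - before.io_stall_write_ms,
        after.num_of_writes - before.num_of_writes,
    )
    return read_ms, write_ms


# Hypothetical sample values, not measurements from a real server.
before = FileIoSnapshot(1000, 8000, 500, 1000)
after = FileIoSnapshot(3000, 38000, 700, 1400)
read_ms, write_ms = interval_latencies(before, after)
print(f"avg read latency:  {read_ms:.1f} ms")
print(f"avg write latency: {write_ms:.1f} ms")
```

Working with the difference between two snapshots, rather than the raw totals, keeps the result tied to a recent interval, because the counters are cumulative.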

While identifying slow I/O may be easy, you still can’t make informed decisions without knowing how much of your SQL execution time is actually spent waiting for it. For example, fixing slow I/O will not make SQL queries run considerably faster if only 10% of their execution time is spent waiting for I/O. The opposite is also true: even if the average I/O latency is about one millisecond (seemingly pretty fast), if 80% of SQL query execution time is spent waiting for I/O, then you’ve still got an I/O performance problem on your hands.
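The arithmetic behind this point can be sketched in a few lines. The function below is an illustration, not part of any SQL Server tooling: it takes the fraction of execution time spent waiting for I/O and a hypothetical factor by which I/O becomes faster, and returns the best-case overall speedup.

```python
def best_case_speedup(io_wait_fraction: float, io_speedup: float) -> float:
    """Overall speedup if only the I/O wait portion gets `io_speedup` times faster."""
    if not 0.0 <= io_wait_fraction <= 1.0:
        raise ValueError("io_wait_fraction must be between 0 and 1")
    if io_speedup <= 0:
        raise ValueError("io_speedup must be positive")
    # Time that is not I/O wait is unchanged; the I/O wait part shrinks.
    remaining_time = (1.0 - io_wait_fraction) + io_wait_fraction / io_speedup
    return 1.0 / remaining_time


# Assumed scenario: storage that makes every I/O ten times faster.
for fraction in (0.10, 0.80):
    speedup = best_case_speedup(fraction, 10.0)
    print(f"{fraction:.0%} I/O wait -> at most {speedup:.2f}x faster overall")
```

With only 10% of the time spent on I/O, even a tenfold faster disk yields roughly a 10% gain; with 80%, the same upgrade makes statements more than three times faster.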

Even when you know how much time is spent waiting for I/O, the answer on how to fix it may not always be obvious.

Improving I/O performance is challenging

Now that you know how I/O affects your database’s performance, it’s time to improve it. Surely, the quickest way to improve I/O performance is to move files to quick (and expensive) Flash storage, right? Well, not always.

Here’s a counter-intuitive example. Let’s examine the case of a massive INSERT INTO … VALUES ( … ) statement. Here are a few facts to consider:

  • DML statements (INSERT, UPDATE, DELETE, MERGE) do not wait for updated data pages to be written to disk.
  • Most write requests are processed asynchronously by background processes, such as the Lazy Writer.
  • Even when committing a transaction, data pages are not necessarily written to disk (only the log buffer is flushed to disk on commit).

In this case, you may actually need to improve the I/O performance of the log file.

Suppose you need a frequently executed transaction, which inserts hundreds of rows, to run faster. The INSERT statement may not wait for I/O at all, since updated data pages are written asynchronously by the Lazy Writer. These data pages are new pages created in memory, so there is no need to wait for them to be read into the buffer cache. In this case, purchasing an expensive hardware upgrade and moving data files to faster storage will not help at all! Moving log files to faster storage, on the other hand, may help much more, and should be much less expensive. Some additional considerations may also apply:

  • If the inserted table already contains millions of rows and has indexes, the INSERT statement may wait for I/O.
  • The key columns of each inserted row must be inserted into the correct index page.
  • The right index page (based on key order) may not reside in buffer cache and synchronous read I/O is required in order to update it.
  • You may consider reducing synchronous reads during INSERT by increasing buffer cache size (real memory) or by removing unused indexes.
  • Alternatively, range partitioning the table by date (directing all inserted rows to a single partition) will require keeping a smaller index in buffer cache. This way, smaller buffer cache and reduced server real memory might be enough to avoid synchronous reads during INSERT.
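To make the trade-off concrete, here is a deliberately crude back-of-the-envelope model of the I/O wait of one INSERT transaction. It assumes that each inserted row touches one page per index, that a fixed fraction of those pages is missing from the buffer cache and costs one synchronous read, and that the commit waits for a single log flush. The function name, parameters and numbers are all hypothetical; real behaviour depends on key order, caching and many other factors.

```python
def estimated_insert_io_wait_ms(
    rows: int,
    indexes: int,
    cache_miss_ratio: float,
    read_latency_ms: float,
    log_flush_ms: float,
) -> float:
    """Crude estimate: synchronous index page reads plus one log flush at commit."""
    synchronous_reads = rows * indexes * cache_miss_ratio
    return synchronous_reads * read_latency_ms + log_flush_ms


# Hypothetical transaction: 300 rows, 5 ms reads, 2 ms log flush, 5% cache misses.
with_four_indexes = estimated_insert_io_wait_ms(300, 4, 0.05, 5.0, 2.0)
with_two_indexes = estimated_insert_io_wait_ms(300, 2, 0.05, 5.0, 2.0)
fully_cached = estimated_insert_io_wait_ms(300, 4, 0.0, 5.0, 2.0)

print(f"4 indexes, 5% misses: {with_four_indexes:.0f} ms")
print(f"2 indexes, 5% misses: {with_two_indexes:.0f} ms")
print(f"4 indexes, no misses: {fully_cached:.0f} ms")
```

Under these assumptions, the only I/O wait left once the index pages are cached is the log flush, which is why the log file is the place to look for a small, freshly populated table, while index count and cache size matter for a large one.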

Measuring Time Spent

How do you measure time spent waiting for I/O by SQL statements? Actually, it’s pretty tricky. Some may try to identify top wait-events by querying the “sys.dm_os_wait_stats” view:

  • The events “ASYNC_IO_COMPLETION”, “IO_COMPLETION” and “IO_RETRY”, and all events that start with “PAGEIOLATCH”, should be considered I/O waits.

Unfortunately, the total wait time of all these I/O wait events cannot be the answer. It includes time spent waiting for I/O by background processes (not affecting your application’s throughput).
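The classification above can be sketched as a small filter. The example assumes rows of (wait_type, wait_time_ms) as they might be read from “sys.dm_os_wait_stats”; the helper names and sample values are invented for illustration. As noted, the resulting total still includes waits by background processes, so it is an upper bound rather than the answer.

```python
from typing import Iterable, Tuple

# Wait types named in the text; PAGEIOLATCH is matched by prefix.
IO_WAIT_TYPES = {"ASYNC_IO_COMPLETION", "IO_COMPLETION", "IO_RETRY"}
IO_WAIT_PREFIX = "PAGEIOLATCH"


def is_io_wait(wait_type: str) -> bool:
    """True if the wait type counts as an I/O wait under the rule above."""
    return wait_type in IO_WAIT_TYPES or wait_type.startswith(IO_WAIT_PREFIX)


def total_io_wait_ms(rows: Iterable[Tuple[str, int]]) -> int:
    """Sum wait_time_ms over all rows classified as I/O waits."""
    return sum(wait_ms for wait_type, wait_ms in rows if is_io_wait(wait_type))


# Hypothetical sample rows, not taken from a real server.
sample_rows = [
    ("PAGEIOLATCH_SH", 120000),
    ("PAGEIOLATCH_EX", 30000),
    ("IO_COMPLETION", 5000),
    ("LCK_M_X", 9000),
    ("SOS_SCHEDULER_YIELD", 70000),
]
io_total = total_io_wait_ms(sample_rows)
print(f"I/O waits (foreground and background): {io_total} ms")
```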

A closer estimate is the sum of all read-wait times of data files plus all write-wait times of log files, since these are the I/Os that statements actually wait for.

You can gather these statistics by executing the “sys.dm_io_virtual_file_stats” function. Then, in order to gauge the impact of I/O waits on your statements, divide the total I/O wait time experienced by statements by the total SQL activity time. Total SQL activity time can be queried from the “sys.dm_exec_query_stats” view. It reports total response time and CPU consumption per SQL statement. However, it does not report time spent waiting for I/O.
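The estimate can be sketched as follows. The example assumes that each file's counters from “sys.dm_io_virtual_file_stats” have already been labelled with a file type (for instance “ROWS” or “LOG”, as “sys.master_files” reports in its type_desc column) and that total SQL activity is the sum of total_elapsed_time from “sys.dm_exec_query_stats”, which is reported in microseconds. The dictionary layout, function names and numbers are hypothetical.

```python
def foreground_io_wait_ms(files: list) -> int:
    """Read stalls of data files plus write stalls of log files, in milliseconds."""
    total = 0
    for f in files:
        if f["type_desc"] == "ROWS":
            total += f["io_stall_read_ms"]   # statements wait for data page reads
        elif f["type_desc"] == "LOG":
            total += f["io_stall_write_ms"]  # commits wait for log writes
    return total


def io_wait_share(files: list, total_elapsed_us: int) -> float:
    """Estimated fraction of SQL activity time spent waiting for I/O."""
    total_elapsed_ms = total_elapsed_us / 1000.0
    return foreground_io_wait_ms(files) / total_elapsed_ms


# Hypothetical sample values, not measurements from a real server.
files = [
    {"type_desc": "ROWS", "io_stall_read_ms": 600_000, "io_stall_write_ms": 900_000},
    {"type_desc": "LOG", "io_stall_read_ms": 1_000, "io_stall_write_ms": 200_000},
]
share = io_wait_share(files, total_elapsed_us=4_000_000_000)
print(f"estimated I/O wait share: {share:.0%}")
```

This remains an approximation: the two sources accumulate over different periods (file statistics are cumulative, while query statistics cover only plans still in the cache), so the ratio is best read as a rough indicator.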

A new approach for tuning I/O performance

Precise Software’s newest release, Precise SQLyzer, provides a clear and accurate report of wait time breakdown of SQL statements. It reports the true amount of time spent waiting for I/O out of the true total SQL activity (while ignoring background processes that do not contribute to SQL performance).

Total SQL activity, broken down into “Using CPU”, “I/O wait”, “Lock wait”, “Log wait”, etc., is reported for the entire instance, per database, per SQL statement, per program, per login, and per machine. After you have identified that I/O is the bottleneck of your application or database, it’s time to drill down. When using local disk drives, SQLyzer will also present I/O statistics for all DB files placed on the same disk.

The DBA will be able to identify contention between DB files of different databases. For example, if the same SQL Server instance stores several database files on the same disk, that disk may experience contention. This also applies to DB files of other SQL Server instances running on the same Windows server.

Summary

Both storage and I/O performance have a huge impact on database efficiency. Until now, however, it hasn’t been easy for practitioners to gauge the impact of I/O on SQL query performance or to understand the potential impact of changes to storage and file configuration.

Precise SQLyzer provides DBAs with all the power that they need to quickly identify whether or not their database or statements suffer from I/O performance issues, and to understand and correct I/O related problems.

Our next blog: How to detect and resolve the root cause of storage-related performance issues

Five things In Memory database administrators need to remember

Lawrence Baisch, Vice President, Customer Solutions, Precise Software

Often in Enterprise IT, when the hype dies down, the real benefits become visible. So it is with ‘In Memory databases’ and ‘Big Data’, two technology concepts whose time, it seems, has come. Now, with careful planning and good advice, the buzzwords are turning into competitive advantages. Greenplum can be very fruitful, your important data really can be Highly Available with HANA, and Exalytics may well aid your business analytics exponentially.

We are indeed beginning to see true enterprise-wide deployments of SAP, Oracle and EMC’s ‘shiny new’ In Memory offerings. All of these products were announced quite a while ago, but the temptation to keep robust SQL-based production systems hard at work is difficult to resist. No enterprise IT manager wants to be the one who tries to replace a trusted production system with unproven technology.

There are two primary drivers which, when they come together, typically prompt a move to In Memory. The first is the need for speed. When data needs to be analyzed more rapidly than ever, databases accessing traditional storage are too slow. The second is the phenomenal growth in data volumes. If only one of these is a concern, throwing more cheap storage at the issue can be a temporary fix. When both size and speed matter, the answer may well be In Memory.

What we are not yet seeing are wholesale migrations away from traditional transactional databases to In Memory technology – although they may happen in the future. Enterprise IT teams know more than anyone that running several infrastructures in parallel is painful. Precise offers a solution which covers the old and the new through ‘a single pane of glass’.  So, with no need to worry about losing the metrics you need to run your business, here are five things we recommend considering when moving to In Memory:

1. Migrate only what you need. Most organizations are not pushing everything to In Memory due to costs and complexity. Moving only what merits the added horsepower of In-Memory makes sense. This will very much depend on your business processes. The answer also varies by sector, e.g., a high-volume e-tailer versus a manufacturer.

2. Do a cost benefit analysis. When some applications work well in SQL, why tinker? A valid set of metrics allows you to justify both the move to In-Memory and the right performance management solution for your organization.

3. Understand the specifics. You need to know what you are doing. Using new technologies like In Memory to speed up processing can just expose issues and bottlenecks elsewhere in your system. Just like every layer in the existing application stack from storage to web front ends, In-Memory databases have specific performance issues. Understanding these issues across the architecture speeds up resolutions.

4. Avoid point products. Incorporating smart monitoring of newer components like In Memory is as important as keeping proven components highly available. In fact, unless there is a system-wide overview, operations teams are likely to be hopping across IT management technologies and system layers to account for new and old. Many of today’s solutions for In Memory monitoring claim to be all-encompassing but need to be patched into enterprise systems; avoid them if possible. Ideally, the perspective should be that of the user, and to get this, your chosen solution must tie business implications to user experiences.

5. Keep it in context. If monitoring IT systems end-to-end is a win, monitoring the effects of IT performance in a business context is a bigger win. With so many In Memory use cases being business-driven, In-Memory systems are valued more when monitored in ways the business cares about. Relating performance to business transactions and being able to predict the business outcomes of In Memory performance issues is the best way to keep your business colleagues happy.

If you bear all this in mind, In Memory stands a great chance of delivering what it promises. If you expect this new generation of database technology to have fewer issues than those which preceded it, you will be ignoring the lessons of history. That is why Precise is looking to combine established best practices with smart ways to exploit the potential of new technologies.

Chatting with Assaf Sagi about Precise Software’s newest product, SQLyzer™

Assaf Sagi, Director, Product Management, Precise Software

At Precise Software, Assaf Sagi is a Director of Product Management. He’s responsible for leading the end-to-end vision and roadmap and driving the company’s development initiatives, including its newest product, Precise SQLyzer.

Assaf has been with Precise for over seven years, fulfilling various management roles in Engineering and in Product Management.

Q: Congratulations on the new SQLyzer product release. How’s the product been received so far?

A: So far it’s been fantastic. I have to admit that the reception of Precise SQLyzer has been even better than we had anticipated. Within two business days after the product’s release, we acquired our first Precise SQLyzer customer, and interest has only been picking up since then. Every day we get more leads downloading the free trial and showing interest, and we are all very excited here about this!

Q: Why did you release a new SQL Server performance tuning product, in addition to Precise for SQL Server?

A: We looked at who our current customers of Precise for SQL Server are, and we realized that our typical customer is a mid-to-large enterprise with expert DBAs who need the most sophisticated expert tools for database tuning. What we concluded was that there is an under-served segment of the market, made up of DBAs, analysts, developers and architects who need an accessible SQL Server performance tuning solution. These individuals are super-talented. They typically play multiple roles and share responsibility for the performance of business-critical databases, yet they have neither the resources nor the time to make big and expensive software acquisitions and implementations, nor can they afford the steep learning curve that is often associated with sophisticated expert tools.

So what we set out to do was to democratize SQL Server performance tuning: make it affordable and easy for all types of hands-on SQL Server individuals and organizations. Our strategy was simple: take Precise’s existing intellectual property around database performance tuning, which we have accumulated for over 20 years, and re-package it into a new product that would be extremely easy to deploy, use and administer.

Q: Does Precise’s partnership with EMC extend into SQLyzer as well?

A: Absolutely! Precise SQLyzer gives the DBA visibility into how storage affects the performance of their SQL Server database. This is the result of deep and unique years-long cooperation between Precise and EMC.

DBAs know that storage I/O is a major contributor to database performance. After all, that’s what the database is essentially all about: retrieving and updating records stored in files. The problem is that even though everybody knows that, there has been little anybody could do about it! To DBAs, storage is a big black box that they spend a lot of time waiting on, but when they work with the storage administrator to improve performance, the conversation can get lost in unfamiliar topics such as IOPS and policies.

SQLyzer’s integration with EMC storage provides the missing link between the application and storage. So now DBAs can get visibility into storage performance and approach storage administrators with facts in hand to quickly pinpoint performance issues that stem from contention or misaligned storage tiering policies.

Q: What does your roadmap for SQLyzer look like?

A: The focus of Precise SQLyzer is all about easy SQL Server performance tuning. There’s a lot we can still do here in order to make expert SQL Server tuning available to everyone. This includes for example more automated advice and recommendations, special provisions for Data Warehouses (DW) and Business Intelligence (BI) modules, simplifying Precise’s unique objects tuning approach, and more.
We have set up a community for SQLyzer customers and pundits, and we are listening intensely to the discussions there, where our customers are telling us what they like about the product, and what they’d like to see.

Q: So how can users try out SQLyzer and buy it?

A: That one is easy.
To give SQLyzer a quick spin, go to http://www.precise.com/sqlyzer and download our free 30-day trial. In less than 20 minutes you should already be able to understand what we are about, and how Precise SQLyzer helps solve SQL Server performance issues quickly.
Buying SQLyzer is also a no-brainer. Simply visit http://www.precise.com/buysql and follow the instructions for buying Precise SQLyzer licenses.

Solving the SQL problems of modern business

Today’s Database Administrators (DBAs) have to operate under increasing pressure. While in the past they might have been responsible for close to a dozen databases, now they are expected to manage 100 or more, all of which are becoming progressively more critical to the business. Even more frustrating, CIOs and CEOs have a tendency to point the finger at DBAs for poor performance when in reality the usual culprit is misuse of the system, such as storage misalignment. Add to this the additional headaches created by virtualisation and database offshoring, and you begin to get an idea of the challenges today’s DBAs face.

Whilst many may be tempted to simply throw hardware at a database issue, this is an expensive, and often ineffective, option. In our experience, the majority of systems require just a few small tweaks to effectively solve performance issues.

SQLyzer™, Precise Software’s latest offering to SQL Server developers and DBAs, has been designed specifically to make the lives of DBAs easier – freeing them up to focus on strategic and innovative projects. Our latest SQL product was designed to not only bring transparency and diagnostics to a system but also provide users with an actionable remedy to issues, turning them into experts.

Often when performance issues occur, a great deal of time is spent locating and isolating the problem, as well as prioritising which issues require instant resolution and which can wait. To solve them, DBAs need in-depth knowledge of how their programs consume resources, as well as historical patterns showing who is using what and why. We have designed SQLyzer to analyze and diagnose SQL Server instances 24/7. Now users can dive in and focus on an issue as soon as they notice a performance bottleneck, and understand which issues require immediate action.

While database monitoring systems are nothing new, SQLyzer differentiates itself by giving the user a recommendation on how to resolve the issue. As soon as a problem is detected, users are simply two clicks away from a solution, benefitting from more than 20 years of R&D in SQL statistical analysis algorithms.

We have designed the UI to let users set customisable reporting periods, meaning DBAs can rapidly report back to the CIO about the causes of performance fluctuations. We have also taken steps to simplify complex queries: for example, when storage and device contention issues occur, SQLyzer can examine down to a single device to see exactly what resources are competing for space.

That so many of our existing enterprise customers are renewing their licenses is a reflection on our ability to ‘close the loop’ on database issues – identifying a problem, locating and isolating it and feeding back to the user on how to resolve it. It is our aim at Precise to free up DBAs from daily fire fighting, to instead be able to focus on adding value to the business, and we feel we have achieved this with SQLyzer.

For more information and to sign up to a free 30 day trial click here

15 Top Reasons Why APM Deployments Fail

This is an epic list of mistakes. These are the top reasons why APM deployments fail, according to many of the industry’s top experts – including consultants, analysts, vendors and users – and also some of the solutions that help you overcome these challenges.

The list is not meant to be in order of importance. There is quite a lot of variety in the list, and any one of these pitfalls can cause your APM deployment to fail, if you are not careful…

See whole story at: APM Digest, “15 Top Reasons Why APM Deployments Fail”, Oct 25th, 2012.

“One of the biggest reasons for APM failure is when the deployment doesn’t deliver the ROI – not just the software cost but the whole effort of deploying the APM environ and maintaining it.” Sherman Wood, VP of Products, Precise

What Lies Beneath

Precise 9.5

Resource contention in cloud environments is often the source of application misbehavior. Precise is helping attack that problem with its latest software release.

Cloud and mobility are the game-changing IT trends of this decade. But to fully take advantage of the openness and flexibility that cloud computing, mobile apps and BYOD present, CIOs must be able to manage performance in a constantly changing and fluid environment. Unfortunately, too many companies are drowning in a sea of change from app and device overload; the cloud and mobility are producing chaos, not competitive advantage.

Service level expectations remain high, and users neither know nor care that the application infrastructure has become inherently more complex to manage. As IT organizations transition from traditional architectures to private and public clouds, performance problems can appear out of nowhere. In some cases, the cloud exacerbates inefficient use of resources, such as the network, by legacy applications. In other cases, the advantages of dynamic provisioning are not being realized because of load balancing problems.

Server contention is a common problem. Virtual machines competing for the same physical CPU and memory can result in serious performance degradation. This type of performance problem is elusive in a virtual environment, making it difficult to reproduce and correct later. It’s difficult, sometimes even impossible, to predict when resource contention will occur. Precise 9.5 solves this problem by discovering and mapping relationships between virtual machines and their physical hosts in real time, and correlating their behavior to end-user actions.

For example, Precise can tell you that last night, a specific group of users suffered poor performance that was due to a slowdown in the Java tier, and that the slowdown occurred due to a starved Virtual Machine that didn’t get enough CPU from its ESX host. Taking it one step further, Precise analyzes which other Virtual Machine stole the CPU, and what particular resource-heavy transaction it was running at the time. This creates actionable advice to segregate the virtual machines by the business priorities of the transactions they are running.

Storage virtualization is also tough to manage as it relates to application performance. Earlier this year, we discussed how storage is a silent killer for application performance. Modern storage tiering technologies move data around without user intervention, between virtual and physical infrastructure layers.

This saves time yet makes it difficult for IT managers to know where application data resides and whether a slowdown is the result of data moving to slower storage devices or something else. Precise 9.5 helps by mapping storage patterns and contention issues as they occur and recommending ways to correct problems.

Let’s say that an auto company using Precise discovered that Flash disks were not allocated to their most important application, despite a policy which defined otherwise. The policy violation occurred because of another application’s I/O-heavy batch which was automatically promoted to Flash. Precise is able to pinpoint this issue and provide recommendations for moving the critical application to a different storage pool, for more reliable performance.

Large companies today need APM technologies that can track transactions across physical and virtual layers and discover resource conflicts, so that high-touch, customer-facing cloud and mobile applications perform as expected.

Enter mobility and the situation gets even more cloudy. Managing the devices is hard enough. IT organizations also struggle to manage the growing volume of users and unpredictable traffic patterns that are occurring from mobile users. Mobility is fantastic for user productivity but for IT, it’s getting expensive.

To regain control, IT must be able to quickly visualize and understand mobile access patterns. Precise 9.5 tracks every click from a mobile user and monitors activity as it flows from the mobile device through the application tiers to database and storage. The software provides dashboard visualizations of hotspots and trends and correlates performance with parameters such as the device’s location, user and time of day. This kind of detailed analysis on mobile usage helps IT managers configure applications and infrastructure to improve the mobile user experience.

As technology evolves and IT organizations transition from internally managed, on-premise applications to SaaS, cloud and mobile everything, system and application performance management technologies must also evolve. Point solutions that focus on specific aspects of the infrastructure, such as servers or networks, are no longer viable. Precise 9.5 was developed to arm IT organizations with information that can make all the difference in reaping the benefits from cloud computing and mobile IT.

Talking with Tom Wailgum about SAP

Precise has been monitoring SAP environments for 20 years. Today we talk with Tom Wailgum, editorial director at ASUGNews.com, the online news publication of the Americas’ SAP Users’ Group (ASUG). Wailgum is a veteran technology journalist with more than 15 years of reporting, writing, and analysis experience. Most recently, he was a senior editor for CIO.com and CIO Magazine, where he covered SAP, Oracle and the enterprise software market, including upgrade planning, maintenance and support tactics, and analysis of on-demand and cloud computing strategies. ASUG has more than 100,000 individual members, making it the largest independent SAP user group in the world.

What do SAP customers struggle with the most today?

It’s kind of shocking that after all these years, change management is still a challenge. The SAP software is pretty rock-solid. But customers often do a lot of customizations to software and then don’t provide initial or ongoing training for the users. And all those customizations make upgrades a lot harder to manage later on. SAP has been doing a lot of marketing around HANA, their supercharged analytics platform, but many of their customers are behind on this because they’re still dealing with upgrades to the latest version of their enterprise applications or trying to get more value out of the software they already own.

How about managing the performance of what they own?

Understanding the reliability and performance of your SAP software is pretty critical. Think about it: SAP customers spend millions on licenses, installations and upgrades, but after it’s all in, how do customers know how well their applications are actually doing? Plus, if customers start expanding their SAP footprint and adding more of the SAP pillars to their IT portfolios, this challenge becomes even more important.

Is SAP still seen as innovative in the software space?

Traditionally, they’ve always been strong in applications and analytics. In HANA, they have an excellent database product that’s still maturing. They’re getting into cloud and mobile but there’s still a lot of work to do there. In this so-called “five pillar” strategy that execs talk about, SAP is trying to be everything to everybody. The cloud is one area of particular interest. In 2007, SAP’s experiment with the cloud—Business ByDesign—didn’t do all that well, but now with the SuccessFactors acquisition, that’s probably going to turn around. It’s a tough climb though, because Workday, Salesforce and others have been doing this for about 10 years now.

Is cloud-based ERP really going to replace on-premise applications for large companies?

I wrote a story on Burberry and how they have SAP at the core and on the edge they have cloud-based apps such as Salesforce Chatter and Force.com. A lot of big companies are doing this hybrid strategy. They are not going to rip and replace their huge SAP investments. One of SAP’s newest strategies is to sell them the on-premise ERP version for their headquarters and the cloud version for their remote branches and smaller offices. Kimberly-Clark is another company that is running SAP at the core but also using Workday and Salesforce to tap into all that good SAP data that they have. But I think most large companies are not going to be running their core ERP applications in the cloud anytime soon.

Supposedly, it’s a hot time for mobile enterprise applications. How is SAP doing there?

SAP is pushing mobility really hard. They just recently opened up their developer ecosystem, which is a very good thing, because there just aren’t many SAP mobile apps available. Gartner recently put SAP/Sybase in the upper right quadrant for enterprise mobility, which speaks to SAP’s progress on its mobility offerings but also to the relative immaturity of the market. Getting customers to buy the mobile apps and Afaria platform is what’s been hard for SAP. The mobile versions of SAP software are very narrow today, because you’re really just looking at one small chunk of the larger package. And there’s a big difference between power and casual users when it comes to mobile applications. Segmenting apps for these different audiences will be important. So companies are being a bit cautious. Will mobility really pay off? What’s the business case? What’s the ROI? Those are very critical questions for SAP customers right now.

What are SAP’s greatest challenges in winning and keeping customers?

One thing they’re doing really well is learning how to sell to business people, because they are the ones who are buying the technology now. SAP knows that the era of selling to CIOs is gone. SAP also finally woke up to the fact that it’s worthwhile to create strong customer engagement programs and tap into the existing user groups. For instance, ASUG communities give feedback to SAP, and they have direct access to SAP product people, whereas previously, most customers only had the option of going through their account manager. SAP has another new program where customers can get early access to products before they go GA. Encouraging that feedback channel is important for retention, and SAP understands that now.

It’ll also be interesting to watch how this five-pillar strategy plays out with customers. If SAP is selling apps, analytics, database software, cloud software and mobility solutions and ensuring that these are high-quality products, will customers feel comfortable going “all in” with SAP? I know what SAP wants to happen, but customers will ultimately have to make that decision.

Talking with Sherman Wood

Sherman Wood, VP of Products for Precise, has been delivering technology products and services to global markets for over 25 years. His expertise spans across product management, marketing and as a technical architect. Prior to Precise, he was VP Engineering and Product Management for the service sales division of Synnex called Concentrix, which he joined through the acquisition of Encover Inc. Sherman was also Director, Business Intelligence for Jaspersoft, a leading open source business intelligence company.

Precise: There’s been much ado about cloud computing in the last two years, yet many organizations are still knee-deep in managing their current enterprise IT investments – data centers, databases, applications, ERP and storage. Is it tough for companies to straddle both worlds?

Sherman Wood: Enterprises have committed to the cloud message of using virtualization to increase services and scalability, while at the same time optimizing budgets. The success of SaaS, such as Salesforce.com and NetSuite, is based on enterprises moving en masse to the cloud. But there are few organizations that can completely rely on SaaS now, as they have existing processes and data that are either not available as SaaS or not immediately replaceable. A typical scenario I have seen is an organization using Salesforce.com for CRM and connecting it to an internally hosted ERP system, like SAP.

So organizations have committed to integrating SaaS and public cloud offerings with their internal systems. They are expanding services to employees and customers, while reducing their infrastructure costs. At the same time, however, they are also increasing the complexity of their environments and their set of vendors and touch points. It is tough for organizations to manage cloud environments, but the increase in services they can offer and cost savings is making the complexity palatable.

Precise: What are the biggest problems that Precise customers discuss when it comes to managing the IT environment?

SW: The amount of business running through the IT environment – transactions, emails, web pages served – has a direct relationship to revenue and operational efficiency of the organization. Time literally is money. Performance needs to be visible and understood, so it can be controlled and doesn’t slow down the pace of business. Some of the things that customers often tell us include:

  • I don’t have the complete picture or “performance intelligence” that I need.
  • What parts of my business are the most important? What should their service-level agreements be? How can I track those SLAs?
  • Where are the root causes of performance problems–in the network, application, database, storage, connections to the cloud and external service providers? And what exactly is going on within those areas?
  • How do I fix those problems?
  • How much should we spend on performance? We can add capacity or bandwidth to improve performance – but is this worthwhile?
  • The organization and the IT environment change all the time, and so performance changes. How do I keep up?

Sherman Wood is VP of Products at Precise.

The Silent Killer Part II: The Batch Process

By John Kelly

Previously, I wrote about how companies don’t have adequate visibility into transactions from a business perspective. IT can’t tell which transactions are high priority when troubleshooting issues, which means that companies aren’t consistently allocating the most critical data to the fastest Flash storage arrays. This week, I continue that discussion with a look at batch processing, also known as repeatable work flows.

Many years ago, batch processing was scheduled for off-hours while online transaction processing (OLTP) ran during normal business hours. Batch processing and OLTP coexisted just fine. But with the onset of global ecommerce, the concept of off-hours no longer exists.

Applications need to be up and available 24×7 and they need to simultaneously handle batch processing and OLTP. To address this challenge, organizations deployed clustering or replication strategies to separate the different data demands on the application storage subsystem.

Fast forward to today: organizations are in major cost-cutting mode. Server consolidation is now putting extra pressure on storage. As a result, IT has deployed enterprise flash drives (EFD) to lessen the demand on the storage subsystems.

Conflict of interest: batch v OLTP

Let’s take a look at a batch job for a banking application. Every day at 5pm EST, a calculate-interest batch job kicks off. But during the execution window of the job, online users are still accessing the bank application to pay bills, transfer funds and so on. This creates a unique bottleneck that stresses the storage subsystem. Prior to the start of the batch job, the online user transactions are executing against data that is typically on EFD. Once the batch job starts, batch data becomes hot and is promoted to EFD, while the OLTP data is demoted to Fibre Channel drives. The data movement typically takes several minutes to complete. As a result, the batch job gets off to a slow start. The opposite effect occurs once the batch job completes: the batch data takes several minutes to move off of EFD, so OLTP transactions run slowly until their data is promoted again. The latter issue is a silent killer.

Immediately after the batch job completes, if a user complains about transaction performance, IT will look at what’s currently running in the system. IT will only see OLTP running and will have a hard time understanding the root cause of the performance issue. To avoid this problem, IT must develop a just-in-time (JIT) storage-tiering strategy. Both pre-staging and re-staging of batch data are critical to maximize transaction performance.
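The timing side of a JIT strategy can be sketched in a few lines of Python. This is only an illustration of the arithmetic, not how any particular storage array or Precise product schedules data movement. The function name, the 45-minute run time and the 5-minute move time are all assumptions made for the example, and it further assumes that the batch finish time is predictable and that OLTP data can start moving back before the job ends.

```python
from datetime import datetime, timedelta

def staging_windows(batch_start, batch_duration, move_time):
    """Return when to start moving data so it is on flash when needed.

    Hypothetical helper: assumes a fixed tier-move time and a
    predictable batch run time.
    """
    batch_end = batch_start + batch_duration
    return {
        # Promote batch data early so it is already on EFD at job start.
        "pre_stage_batch_at": batch_start - move_time,
        # Promote OLTP data back so it is on EFD again when the job ends.
        "re_stage_oltp_at": batch_end - move_time,
        "batch_end": batch_end,
    }

plan = staging_windows(
    batch_start=datetime(2024, 1, 15, 17, 0),  # the 5pm job from the example
    batch_duration=timedelta(minutes=45),      # assumed run time
    move_time=timedelta(minutes=5),            # assumed tier-move time
)
for step, when in plan.items():
    print(step, when.strftime("%H:%M"))
```

With these assumed figures, pre-staging would begin at 16:55 and re-staging at 17:40, so neither workload waits for the tiering engine to notice that its data has become hot.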

In large companies, batch processing-related performance issues are common and complex, since IT is often running many batch jobs simultaneously. Managing all those workloads and moving data back and forth between storage drives is sometimes too much of a manual, error-prone effort. Often, IT wastes too much time trying to determine what happened with performance levels, and if the batch job was the culprit. How do organizations effectively manage this?

Automating JIT is the answer. Precise can help IT departments automate JIT and make decisions more proactively about when to pre-stage and re-stage data. Our software understands workflows and business prioritization from managing transactions over time.

With an automated JIT strategy, batch processing and OLTP can once again coexist peacefully.

John Kelly is the Chief Architect at Precise.

The Insight Your DBA Needs

By Assaf Sagi

It’s not uncommon to find a single DBA responsible for hundreds of different database instances. As companies increase efforts in Big Data initiatives, the problem will only worsen. Asking DBAs to deliver the same level of service to all transactions and applications isn’t logical. In the words of TV psychologist Dr. Phil: Does that make sense? No, it doesn’t.

It’s time that companies figure out how their DBAs and application managers are going to manage this growing volume of transactions, applications, databases and shifting business priorities. The answer is part technology, part process — and probably part culture.

Let’s say I am a DBA responsible for five hundred instances; each of the instances I manage may have a performance problem or two. This amounts to an overwhelming number of problems, alerts and blinking red lights that I need to analyze. How can I pick my battles?

Traditional application performance management (APM) tools focus on the technology silos rather than holistic end-to-end business context, and therefore don’t give advice as to where the DBA should focus his efforts first. For instance, the transaction that runs the longest may seem to be the culprit but it’s not necessarily the most critical transaction in business terms. Too often, the squeaky wheel gets the grease–but the business does not benefit if critical transactions are handled the same as less-critical ones.

APM systems need to do a better job of delivering business context along with the performance data.

Here are two examples:

1. SQL Statements: DBAs typically tune SQL statements. However, SQL statements rarely give a good indication of the transaction’s importance. So how should the DBA choose what to tune and which performance trade-offs to accept? APM tools must provide the business context here, enabling the DBA to view each statement and table in terms of which applications and end-user transactions (including from mobile devices) invoked them, and what the overall response time was. With this information, for example, the DBA may avoid working hard to tune a SQL statement that ultimately amounts to only 5% of the overall end-user transaction performance. Similarly, the DBA may choose to focus on a statement that is being run by the most important business transaction (e.g. quote to cash), even if it is not the heaviest statement on the database.

2. Indexes: Sophisticated DBAs often focus on tuning objects, such as tables, rather than individual SQL statements. Suppose I want to add an index to a table. Typically, some queries would run faster as a result, while others may actually suffer. Tables may be accessed by both important and trivial transactions. It’s hard to know whether the trade-off is acceptable without understanding the business context of the affected transactions. So, if one of the affected transactions is a batch process that runs at night, then it’s okay if performance lags a bit. If it’s a critical online transaction that directly generates revenue, then it’s not. Having the business context allows the DBA to make informed decisions and ensure that the correct trade-offs are being made, thereby improving the overall performance of the business.
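One rough way to picture the index trade-off is to weight each affected transaction's expected change in response time by its business importance. The Python sketch below does only that. The transaction names, the millisecond estimates and the weights are invented for illustration; in practice they would come from measurement and from the business, and a single weighted sum is a simplification rather than a recommended decision rule.

```python
# Hypothetical estimates: change in response time (ms) per transaction if
# the index is added, and a business weight chosen by the organization.
affected = [
    {"txn": "online payment", "delta_ms": -120, "weight": 10},
    {"txn": "funds transfer", "delta_ms": -40, "weight": 8},
    {"txn": "nightly batch", "delta_ms": 300, "weight": 1},
]

def weighted_impact(transactions):
    """Sum of response-time change scaled by business weight.

    A negative result means the change helps the business overall.
    """
    return sum(t["delta_ms"] * t["weight"] for t in transactions)

score = weighted_impact(affected)
print(score)
print("add index" if score < 0 else "skip index")
```

Here the nightly batch slows down the most in raw milliseconds, yet its low weight means the index still comes out as a net gain for the revenue-generating online transactions.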

Connecting the business perspective to the database silo does not just help DBAs focus their attention and scarce resources on the business priorities. It also facilitates teamwork and a common language between the silo teams. For example, when the Java admin walks over to the DBA and asks why the ‘withdraw funds’ transaction is slow, the DBA can connect this immediately to the set of SQL statements and DB objects which participated in the transaction, and examine their performance and overall contribution to the business transaction. Similarly, the DBA can approach the application team and point to a specific transaction that throttles the database by throwing too many queries at it.
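The idea of a statement's contribution to a business transaction can be made concrete with a small calculation. The Python sketch below uses made-up timings for a single ‘withdraw funds’ request; the statement labels and numbers are hypothetical and stand in for whatever an APM tool would actually report.

```python
# Hypothetical timings (ms) for one 'withdraw funds' end-user transaction.
transaction_total_ms = 2000
statement_ms = {
    "balance lookup": 100,
    "account update": 900,
    "audit insert": 300,
}

def contribution(statement_times, total_ms):
    """Share of end-user response time spent in each statement, largest first."""
    shares = {name: ms / total_ms for name, ms in statement_times.items()}
    return sorted(shares.items(), key=lambda item: item[1], reverse=True)

for name, share in contribution(statement_ms, transaction_total_ms):
    print(f"{share:.0%}  {name}")
```

In this invented case the balance lookup accounts for only 5% of the response time, so tuning it would barely be noticed by the end user, while the account update is the obvious place to start.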

There are many more examples of how IT is forced to manage performance in the dark, without the necessary insight into transactional importance to make the right decisions for the business. In future posts, we’ll talk about how IT managers can have a broader, more contextual view of application performance management.

Assaf Sagi is a Product Manager at Precise.