Top Tips to Improve Your SQL Server Database Performance
A series of blog posts about critical problems faced by today’s DBAs
By Ehud Eshet, Chief Architect, Precise Software Solutions
#1: The Silent Killer: The Impact of I/O Performance on the Database
Most people know that I/O performance is critical. In many environments, I/O is the main bottleneck of a database, whose main task, after all, is to retrieve and store records within files. When encountering I/O performance issues, many DBAs blame their storage administrator, demanding expensive storage hardware upgrades or changes to the way the database-to-storage connection is managed. But new hardware is very expensive, and, in our experience, simple changes and small investments can often yield performance boosts on par with the most expensive upgrades.
This blog is part of a series of posts targeting SQL Server DBAs who want to improve I/O and overall performance of their databases. While it primarily deals with SQL Server terminology and concepts, many of the lessons apply just as well to other database systems.
Identifying slow I/O
When your database is encumbered with what seem to be I/O performance issues, your first step should be to determine whether I/O is indeed the performance bottleneck of your database and, if so, to find the root cause of its slowness.
There are many ways to identify I/O statistics related to your SQL Server performance.
- You can use Windows tools such as “perfmon” to keep track of disk statistics. Per Windows volume (identified by drive letter), you can get average read latency, average write latency, and average disk queue length.
- SQL Server itself provides a wealth of I/O related information in its Dynamic Management Views (DMV) and functions.
- File I/O statistics are reported by the “sys.dm_io_virtual_file_stats” function.
- More granular table and index I/O statistics are reported by the “sys.dm_db_index_operational_stats” function.
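As a starting point, the file-statistics function mentioned above can be queried directly to see the average read and write latency of every database file. A minimal sketch (it assumes the login has VIEW SERVER STATE permission; column names are per the SQL Server DMV documentation):

```sql
-- Average I/O latency per database file since instance startup.
SELECT DB_NAME(vfs.database_id) AS database_name,
       mf.physical_name,
       vfs.num_of_reads,
       vfs.num_of_writes,
       CASE WHEN vfs.num_of_reads = 0 THEN 0
            ELSE vfs.io_stall_read_ms / vfs.num_of_reads END AS avg_read_latency_ms,
       CASE WHEN vfs.num_of_writes = 0 THEN 0
            ELSE vfs.io_stall_write_ms / vfs.num_of_writes END AS avg_write_latency_ms
FROM sys.dm_io_virtual_file_stats(NULL, NULL) AS vfs
JOIN sys.master_files AS mf
  ON mf.database_id = vfs.database_id
 AND mf.file_id = vfs.file_id
ORDER BY avg_read_latency_ms DESC;
```

Note that these counters are cumulative since the last instance restart, so for an accurate picture of current behavior you should sample twice and compare the deltas.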
While identifying slow I/O may be easy, without visibility into the performance of your I/O you still can’t make informed decisions. For example, fixing slow I/O will not make SQL queries run considerably faster if only 10% of their execution time is spent waiting for I/O. The opposite is also true: even if the average I/O latency is about one millisecond (seemingly pretty fast), if 80% of SQL query execution time is spent waiting for I/O, then you’ve still got an I/O performance problem on your hands.
Even when you know how much time is spent waiting for I/O, how to fix it may not always be obvious.
Improving I/O performance is challenging
Now that you know how I/O affects your database’s performance, it’s time to improve it. Surely, the quickest way to improve I/O performance is to move files to fast (and expensive) flash storage, right? Well, not always.
Here’s a counter-intuitive example. Let’s examine the case of a massive INSERT INTO … VALUES ( … ) statement. Here are a few facts to consider:
- DML statements (INSERT, UPDATE, DELETE, MERGE) do not wait for updated data pages to be written to disk.
- Most write-requests are processed asynchronously by background processes, such as the Lazy Writer.
- Even when committing a transaction, data pages are not necessarily written to disk (only the log buffer is flushed to disk on commit).
In this case, you may actually need to improve the I/O performance of the log file.
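One inexpensive improvement follows directly from the facts above: since the log buffer is flushed on every commit, committing once per inserted row forces a synchronous log write per row, while grouping many rows into a single transaction amortizes that wait across the whole batch. A minimal sketch (the table and column names here are hypothetical):

```sql
-- Committing per row would flush the log buffer once per row.
-- One transaction around the batch means a single log flush at COMMIT.
BEGIN TRANSACTION;

INSERT INTO dbo.Orders (OrderId, CustomerId, OrderDate)
VALUES (1, 100, '2024-01-15'),
       (2, 101, '2024-01-15'),
       (3, 102, '2024-01-15');  -- ...hundreds of rows per batch

COMMIT TRANSACTION;  -- the log buffer is flushed to disk here, once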
Suppose you need a frequently executed transaction that inserts hundreds of rows to run faster. The INSERT statement may not wait for I/O at all, since updated data pages are written asynchronously by the Lazy Writer. These data pages are new pages created in memory (there is no need to wait for them to be read into the buffer cache). In this case, purchasing an expensive hardware upgrade and moving data files to faster storage will not help at all! Moving log files to faster storage, on the other hand, may help much more, and should be much less expensive. Some additional considerations may also apply:
- If the inserted table already contains millions of rows and has indexes, the INSERT statement may wait for I/O.
- The key columns of each inserted row must be inserted into the correct index page.
- The correct index page (based on key order) may not reside in the buffer cache, in which case a synchronous read is required in order to update it.
- You may consider reducing synchronous reads during INSERT by increasing the buffer cache size (real memory) or by removing unused indexes.
- Alternatively, range partitioning the table by date (directing all inserted rows to a single partition) means only a smaller index needs to be kept in the buffer cache. This way, a smaller buffer cache, and less server real memory, might be enough to avoid synchronous reads during INSERT.
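To act on the “removing unused indexes” suggestion, the “sys.dm_db_index_usage_stats” view can identify indexes that the server maintains on every INSERT but that no query ever reads. A sketch (run it in the target database; the counters reset at instance restart, so verify over a representative workload period before dropping anything):

```sql
-- Nonclustered indexes that are written to (user_updates) but never read
-- since the last restart are candidates for removal.
SELECT OBJECT_NAME(ius.object_id) AS table_name,
       i.name AS index_name,
       ius.user_updates,
       ius.user_seeks + ius.user_scans + ius.user_lookups AS total_reads
FROM sys.dm_db_index_usage_stats AS ius
JOIN sys.indexes AS i
  ON i.object_id = ius.object_id
 AND i.index_id = ius.index_id
WHERE ius.database_id = DB_ID()
  AND i.type_desc = 'NONCLUSTERED'
  AND ius.user_seeks + ius.user_scans + ius.user_lookups = 0
ORDER BY ius.user_updates DESC;
```

Every index that appears here is pure maintenance cost: its pages compete for buffer cache and may trigger the synchronous reads described above.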
Measuring Time Spent
How do you measure the time SQL statements spend waiting for I/O? Actually, it’s pretty tricky. Some may try to identify the top wait events by querying the “sys.dm_os_wait_stats” view:
- The events ASYNC_IO_COMPLETION, IO_COMPLETION, and IO_RETRY, along with all events that start with PAGEIOLATCH, should be considered I/O waits.
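Pulling those wait types out of the view is straightforward; a sketch:

```sql
-- Cumulative I/O-related waits since instance startup. Keep in mind these
-- totals also include waits incurred by background processes.
SELECT wait_type,
       waiting_tasks_count,
       wait_time_ms
FROM sys.dm_os_wait_stats
WHERE wait_type IN ('ASYNC_IO_COMPLETION', 'IO_COMPLETION', 'IO_RETRY')
   OR wait_type LIKE 'PAGEIOLATCH%'
ORDER BY wait_time_ms DESC;
```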
Unfortunately, the total wait time of all these I/O wait events is not the answer: it includes time spent waiting for I/O by background processes, which does not affect your application’s throughput.
A closer estimate would be to sum the read-wait times of data files plus the write-wait times of log files.
You can gather these statistics by executing the “sys.dm_io_virtual_file_stats” function. Then, to gauge the impact of I/O waits on your statements, divide the total I/O wait time experienced by statements by the total SQL activity time. Total SQL activity time can be queried from the “sys.dm_exec_query_stats” view, which reports total response time and CPU consumption per SQL statement. However, it does not report I/O wait time.
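The estimate described above can be computed from the same file-statistics function by classifying each file as data or log; a sketch:

```sql
-- A closer estimate of application-visible I/O wait, per database:
-- read stalls on data files plus write stalls on log files.
SELECT DB_NAME(vfs.database_id) AS database_name,
       SUM(CASE WHEN mf.type_desc = 'ROWS'
                THEN vfs.io_stall_read_ms ELSE 0 END) AS data_read_stall_ms,
       SUM(CASE WHEN mf.type_desc = 'LOG'
                THEN vfs.io_stall_write_ms ELSE 0 END) AS log_write_stall_ms
FROM sys.dm_io_virtual_file_stats(NULL, NULL) AS vfs
JOIN sys.master_files AS mf
  ON mf.database_id = vfs.database_id
 AND mf.file_id = vfs.file_id
GROUP BY vfs.database_id;
```

This is still an approximation — it cannot attribute waits to individual statements — but it filters out most of the background-process noise that distorts the raw wait-event totals.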
A new approach for tuning I/O performance
Precise Software’s newest release, Precise SQLyzer, provides a clear and accurate report of wait time breakdown of SQL statements. It reports the true amount of time spent waiting for I/O out of the true total SQL activity (while ignoring background processes that do not contribute to SQL performance).
Total SQL activity, broken down into “Using CPU”, “I/O wait”, “Lock wait”, “Log wait”, etc., is reported for the entire instance, per database, per SQL statement, per program, per login, and per machine. After you have identified that I/O is the bottleneck of your application or database, it’s time to drill down. When using local disk drives, SQLyzer will also present I/O statistics for all DB files placed on the same disk.
The DBA will be able to identify contention between DB files of different databases. For example, if the same SQL Server instance stores several database files on the same disk, that disk may experience contention. This also applies to DB files of other SQL Server instances running on the same Windows server.
Both storage and I/O performance have a huge impact on database efficiency. Until now, however, it hasn’t been easy for practitioners to gauge the impact of I/O on SQL query performance or to understand the potential impact of changes to storage and file configuration.
Precise SQLyzer provides DBAs with all the power that they need to quickly identify whether or not their database or statements suffer from I/O performance issues, and to understand and correct I/O related problems.
Our next blog: How to detect and resolve the root cause of storage-related performance issues