System Characteristics

Performance is influenced by characteristics of the system hosting the database server, for example:

- Disk input/output (I/O) speed.
- Amount of memory available.
- Processor speed.
- Network bandwidth.

All of these factors have an impact on performance. For example, the performance of a SQL statement depends on whether a complete table can fit in memory, the time it takes to load the table into memory, the time it takes to process the data in memory, and the network bandwidth available for interaction between the user and the database server.

Disk I/O Speed

Database performance depends heavily on disk I/O speed. Disk accesses are very slow compared to memory operations. In addition to using faster disks, database performance can be improved by minimizing disk accesses. Also, once data is brought into memory (the data buffer), keeping it there for future use as long as memory is available reduces disk accesses, thereby speeding up query execution. The larger the data buffer, the more data can be kept in memory.

Disks are mechanical devices with moving parts, which makes them slow compared to electronic memory. The time to read or write data consists of three components, listed in order of decreasing time consumption:

- Seek Time: Time taken by the disk head to move to the appropriate track.
- Rotation Time: Time taken by the platter to rotate so that the appropriate block on the track is under the disk head.
- Transfer Time: Time to read or write the data block.

Optimizing Disk Accesses

- Database systems try to optimize disk access time by storing tables, indexes, and other database items so as to minimize seek and rotational times. Rows inserted into a table are stored contiguously in a block with other rows or in adjacent blocks, as appropriate. The execution time of a query that needs to access a complete table, such as a SELECT statement without a WHERE clause, is minimized if the table is stored in contiguous blocks. Data placement decisions are made by the database system and are hidden from users.
- A disk block is the unit of disk I/O and storage. The size of the block can be explicitly specified by a database designer or a DBA, overriding the default size. A typical block size ranges from 2K to 16K bytes. The block size is specified when a database is created and typically cannot be altered once defined.
- Large disk block sizes increase the number of bytes transferred per expensive seek, thus reducing the number of seeks. But using large blocks for applications with updates involving several blocks causes each update to take longer, because larger blocks are more expensive to write. The default block size should suffice for most applications. For databases with special needs, such as multimedia databases with large items, a larger block size may be beneficial because it reduces the number of items spread over more than one disk block.
- The disk-seek problem becomes more pronounced as data volumes grow so large that effective caching becomes harder. For large databases with random data access, you will likely need at least one disk seek to read and a couple of disk seeks to write. For database servers, use disks with low seek times.
- Increase the number of available disk spindles, and thereby reduce seek overhead, by either symlinking files to different disks or striping the disks.

Using Symbolic Links

You can move tables and databases from the database directory to other locations and replace them with symbolic links to the new locations. You might want to do this, for example, to move a database to a file system with more free space or to increase the speed of your system by spreading your tables across different disks.
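The move-and-link step described above can be sketched in a few lines of Python. This is only an illustration and not a MySQL tool: temporary directories stand in for the data directory and a second disk, the names `relocate_with_symlink`, `sales`, and `orders.MYD` are made up for the example, and the files of a real server should not be moved while it is running.

```python
import os
import shutil
import tempfile


def relocate_with_symlink(src_dir, dest_parent):
    """Move src_dir under dest_parent and leave a symlink at the old path."""
    dest_dir = os.path.join(dest_parent, os.path.basename(src_dir))
    shutil.move(src_dir, dest_dir)   # move the directory to the new location
    os.symlink(dest_dir, src_dir)    # the old path now points at the new one
    return dest_dir


# Temporary directories stand in for the data directory and a second disk.
datadir = tempfile.mkdtemp(prefix="datadir_")
other_disk = tempfile.mkdtemp(prefix="disk2_")

# Hypothetical database directory containing one table file.
db_dir = os.path.join(datadir, "sales")
os.mkdir(db_dir)
with open(os.path.join(db_dir, "orders.MYD"), "w") as f:
    f.write("row data")

new_location = relocate_with_symlink(db_dir, other_disk)

# The file is still reachable through the original path.
with open(os.path.join(db_dir, "orders.MYD")) as f:
    content = f.read()

print(os.path.islink(db_dir), content)
```

Because the old path still resolves, anything that opens files under the original directory keeps working while the data itself lives on the other device.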
The recommended way to do this is simply to symlink databases to a different disk.

Striping

If an installation has several disks, striping means putting the first block on the first disk, the second block on the second disk, and the N-th block on the (N MOD number_of_disks) disk, and so on. If your average data size is less than or equal to the stripe size, performance for multiple reads will improve. Striping may be suitable for some parallel disk-access systems such as RAID.

You can also tune the operating system settings for the filesystem on which the database is deployed.

Memory

- If the needed data is in memory (the data buffers), then no disk accesses are required, which saves time. If not, the disk blocks containing the data must be read from disk into the data buffer. Larger data buffers increase the chance of finding the needed data in the buffer.
- If the data buffer is running out of space for new items, then some items in the buffer must be removed. If these memory blocks have been altered by updates, they must be written to disk before removal. To determine which items should be removed, database systems use algorithms similar to those used in operating systems, such as the least-recently-used (LRU) and first-in-first-out (FIFO) algorithms and their variants. Note that the data buffer is conceptually similar to a cache in an operating system.
- The DBA has the primary responsibility for determining and specifying the memory needs of a database system to ensure optimal performance. To tune performance, the DBA must determine and specify buffer sizes. In MySQL, it is possible to configure the buffer sizes for:
  - Indexes
  - Joins
  - Data
  - Sorting
  - Caching queries
  - Caching query results

Buffers for caching queries and their results are very useful in situations where the database does not change often and the same queries are executed repeatedly. This is typical for many web servers that serve dynamic pages using database content.

How MySQL Uses Memory

- All threads share the same base memory. When a thread is no longer needed, the memory allocated to it is released and returned to the system unless the thread goes back into the thread cache. In that case, the memory remains allocated.
- The key buffer is shared by all threads; its size is determined by the key_buffer_size variable.
- Each thread that is used to manage client connections uses some thread-specific space. The following list indicates these and which variables control their size:
  - A stack (default 192KB, variable thread_stack)
  - A connection buffer (variable net_buffer_length)
  - A result buffer (variable net_buffer_length)
  The connection buffer and result buffer both begin with a size given by net_buffer_length but are dynamically enlarged up to max_allowed_packet bytes as needed. The result buffer shrinks to net_buffer_length after each SQL statement. While a statement is running, a copy of the current statement string is also allocated.
- Most requests that perform a sort allocate a sort buffer and zero to two temporary files, depending on the result set size.
- Each request that performs a sequential scan of a table allocates a read buffer whose size is given by the read_buffer_size variable.
- For each table having BLOB columns, a buffer as large as the largest BLOB value is allocated when the table is scanned.
- Some memory is allocated for the query itself, for column structures, parsing, and calculation.
- mysqld is tested with several memory-leak detectors to prevent memory leaks.
- All joins are executed in a single pass, and most joins can be done without even using a temporary table.
- In some cases, the server creates internal temporary tables while processing queries. A temporary table can be held in memory and processed by the MEMORY storage engine, or stored on disk and processed by the MyISAM storage engine.

Processor Speed

Faster processors generally help. Typically, however, disks are the more likely cause of performance problems because disk I/O is very slow compared to processor execution time. If there are multiple disks and I/O is being performed in parallel, then processors can become a bottleneck. In such cases, faster processors can help speed up response time and improve throughput.

Network Bandwidth

In a client-server architecture, one factor that can affect response time is available bandwidth. On intranets and broadband networks, network bandwidth is usually not an issue. However, even in such cases, transferring large amounts of data between the client and the database server can be time consuming and will adversely affect response time.

Tuning Server Parameters

You can determine the default buffer sizes used by the mysqld server using this command:

Code Sample: TuneMySQL/Demos/Show-Vars.bat

    mysqld --verbose --help

This command produces a list of all mysqld options and configurable system variables. The output includes the default variable values.

When tuning a MySQL server, the two foremost variables to configure are key_buffer_size and table_cache. You should set these appropriately before varying other variables. The following examples indicate some typical variable values for different runtime configurations.
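As a rough aid when weighing such values, the shared key buffer can be added to the per-connection buffers multiplied by the expected number of connections. The sketch below assumes that simplified model (one sort buffer and one read buffer per connection, all allocated at once), which is an approximation and not how mysqld accounts for memory. The function name and the client counts are made up for illustration; the buffer sizes mirror the command-line examples in this section.

```python
def estimate_memory_bytes(key_buffer, sort_buffer, read_buffer, connections):
    """Simplified estimate: shared key buffer plus per-connection sort and read buffers."""
    return key_buffer + connections * (sort_buffer + read_buffer)


KB = 1024
MB = 1024 * KB

# Larger buffers with an assumed 20 simultaneous clients.
moderate = estimate_memory_bytes(64 * MB, 4 * MB, 1 * MB, connections=20)

# Small per-connection buffers with an assumed 500 simultaneous clients.
many_clients = estimate_memory_bytes(512 * KB, 100 * KB, 100 * KB, connections=500)

print(f"moderate clients: {moderate / MB:.1f} MB")
print(f"many clients:     {many_clients / MB:.1f} MB")
```

Under this model the per-connection term dominates as soon as the number of clients grows, which is why the low-memory configurations shrink sort_buffer_size and read_buffer_size so aggressively.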

If you have at least 256MB of memory and many tables and want maximum performance with a moderate number of clients, you should use something like this:

    shell> mysqld_safe --key_buffer_size=64m --table_cache=256 --sort_buffer_size=4m --read_buffer_size=1m &

If you have only 128MB of memory and only a few tables, but you still do a lot of sorting, you can use something like this:

    shell> mysqld_safe --key_buffer_size=16m --sort_buffer_size=1m

If there are very many simultaneous connections, swapping problems may occur unless mysqld has been configured to use very little memory for each connection. mysqld performs better if you have enough memory for all connections. With little memory and lots of connections, use something like this:

    shell> mysqld_safe --key_buffer_size=512k --sort_buffer_size=100k --read_buffer_size=100k &

Or even this:

    shell> mysqld_safe --key_buffer_size=512k --sort_buffer_size=16k --table_cache=32 --read_buffer_size=8k --net_buffer_length=1k &

For GROUP BY or ORDER BY operations on tables larger than available memory, you should increase the value of read_rnd_buffer_size to speed up the reading of rows after sorting.

Some other parameters are discussed in other lessons, and more details can be found in the MySQL documentation.

MySQL Query Cache

- The query cache stores the text of a SELECT statement together with the corresponding result that was sent to the client. If an identical statement is received later, the server retrieves the results from the query cache rather than parsing and executing the statement again.
- The query cache is extremely useful in an environment where you have tables that do not change very often and for which the server receives many identical queries. This is a typical situation for many Web servers that generate many dynamic pages based on database content.
- The query cache does not return stale data. When tables are modified, any relevant entries in the query cache are flushed.
- The query cache does not work in an environment where you have multiple mysqld servers updating the same MyISAM tables.
- A query is also not cached under these conditions:
  - It is a server-side prepared statement, it is a subquery of an outer query, or it is executed within the body of a stored function or trigger.
  - It refers to user-defined functions (UDFs) or stored functions, refers to user or internal variables, uses explicit locking, uses TEMPORARY tables, or generates warnings, or the user has a column-level privilege for any of the involved tables.
- A query cannot be cached if it contains some of the system-information functions, such as DATABASE() or LAST_INSERT_ID().
- Searches for a single row in a single-row table are 238% faster with the query cache than without it.
- To disable the query cache at server startup, set the query_cache_size system variable to 0.
- Incoming queries are compared to those in the query cache before parsing.
- Queries must be exactly the same (byte for byte) to be seen as identical. In addition, query strings that are identical may be treated as different for other reasons. Queries that use different databases, different protocol versions, or different default character sets are considered different queries and are cached separately.
- For example, the following two queries are regarded as different by the query cache because they differ in lettercase:
      SELECT * FROM film
      Select * from film
- If a table changes, all cached queries that use the table become invalid and are removed from the cache.
- The query cache also works within transactions when using InnoDB tables.
- Two query cache-related options may be specified in SELECT statements:
  - SQL_CACHE: The query result is cached if it is cacheable and the value of the query_cache_type system variable is ON or DEMAND.
  - SQL_NO_CACHE: The query result is not cached.
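The matching and invalidation rules in the list above can be modelled with a toy cache. The sketch below is a simplified illustration and not MySQL's implementation: the class `ToyQueryCache` and the sample row are made up, and the model ignores the database, protocol version, and character set that the real cache also takes into account.

```python
class ToyQueryCache:
    """Illustrative model only: results are keyed by the exact query bytes."""

    def __init__(self):
        self._results = {}   # query bytes -> cached result
        self._by_table = {}  # table name -> set of query bytes that use it

    def get(self, query):
        """Return the cached result for byte-identical query text, else None."""
        return self._results.get(query.encode("utf-8"))

    def put(self, query, tables, result):
        """Cache a result and remember which tables the query reads."""
        key = query.encode("utf-8")
        self._results[key] = result
        for table in tables:
            self._by_table.setdefault(table, set()).add(key)

    def invalidate(self, table):
        """Drop every cached query that uses the modified table."""
        for key in self._by_table.pop(table, set()):
            self._results.pop(key, None)


cache = ToyQueryCache()
cache.put("SELECT * FROM film", ["film"], [("ACADEMY DINOSAUR",)])

hit = cache.get("SELECT * FROM film")    # identical bytes: served from cache
miss = cache.get("Select * from film")   # different lettercase: not found
cache.invalidate("film")                 # for example, after an UPDATE on film
after_update = cache.get("SELECT * FROM film")

print(hit, miss, after_update)
```

Keying on raw bytes makes the lookup cheap enough to run before parsing, at the cost of treating trivially different spellings of the same query as distinct entries.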

Here are a few examples of the SQL_CACHE and SQL_NO_CACHE options:

Code Sample: TuneMySQL/Demos/Query-Cache-In-Select.sql

    SELECT SQL_CACHE customer_id, first_name, last_name FROM customer;
    SELECT SQL_NO_CACHE customer_id, first_name, last_name FROM customer;

The result of the first query is cached if query_cache_type is ON or DEMAND. The result of the second query is not cached at all.

Query Cache Parameters

By default, the query cache is not enabled. You can set up query caching on your system by using the following three system variables:

- query_cache_type: Specifies the operating mode of the query cache. The values are explained below.
- query_cache_limit: Specifies the maximum size for a result set that can be cached. For example, if the limit is set to 2M, no result set larger than 2M will be cached. The default limit is 1M.
- query_cache_size: Specifies the amount of memory allocated for caching queries. By default, this variable is set to 0, which means that query caching is turned off.

To implement query caching, you should specify a query_cache_size setting in the [mysqld] section of your option file. Setting this variable to a nonzero value is the only action you need to take to enable query caching.

If the query cache size is greater than 0, the query_cache_type variable influences how the cache works. This variable can be set to the following values:

- 0 or OFF: Prevents caching or retrieval of cached results.
- 1 or ON: Allows caching of all statements except those that begin with SELECT SQL_NO_CACHE.
- 2 or DEMAND: Caches only those statements that begin with SELECT SQL_CACHE.

Code Sample: TuneMySQL/Demos/set-Query-Cache-Type.sql

    SET SESSION query_cache_type = OFF;
    SET SESSION query_cache_type = 2;

The first SET turns off query caching for the current session. The second SET caches only queries that request caching with the SQL_CACHE option.

You can defragment the query cache to make better use of its memory with:

Code Sample: TuneMySQL/Demos/Flush-Query-Cache.sql

    FLUSH QUERY CACHE;

This statement does not remove any queries from the cache.

To remove all query results from the query cache:

Code Sample: TuneMySQL/Demos/Reset-Query-Cache.sql

    RESET QUERY CACHE;

The FLUSH TABLES statement also removes all query results from the query cache.

To monitor query cache performance:

Code Sample: TuneMySQL/Demos/Show-Qcache.sql

    SHOW STATUS LIKE 'Qcache%';

Descriptions of each of these variables can be found in the MySQL documentation.

The system variables related to the query cache are not the only variables that can affect performance. There are other system variables related to the table and index caches, as well as to other components of MySQL. Refer to the MySQL product documentation for more.

Warning: query_cache_size is a global variable, so it cannot be set with SET SESSION; changing it at runtime requires SET GLOBAL.

The following exercise walks you through the process of viewing the settings for each of these variables and enabling the query cache.

The MyISAM Key Cache

To minimize disk I/O, the MyISAM storage engine uses a strategy that is common to many database management systems: it employs a cache mechanism to keep the most frequently accessed index blocks in memory. A special structure called the key cache (or key buffer) is maintained. The structure contains a number of block buffers where the most-used index blocks are placed. When data from any table index block must be accessed, the server first checks whether it is available in some block buffer of the key cache.

You can set up multiple key caches and assign table indexes to specific caches.

Multiple threads can access the cache concurrently; a thread that needs a block buffer that is being updated waits until the update completes. Shared access to the key cache enables the server to improve throughput significantly.

To control the size of the key cache, use the key_buffer_size system variable.
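The check-the-cache-first behaviour described above can be illustrated with a small block cache. The sketch assumes a plain least-recently-used replacement policy and a made-up `fake_disk_read` function; the class name `LRUBlockCache` is hypothetical, and the real key cache is more elaborate than this model.

```python
from collections import OrderedDict


class LRUBlockCache:
    """Fixed number of block buffers with least-recently-used replacement."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.blocks = OrderedDict()  # block id -> block contents
        self.hits = 0
        self.misses = 0

    def read(self, block_id, load_from_disk):
        """Return a block, loading it through load_from_disk on a miss."""
        if block_id in self.blocks:
            self.hits += 1
            self.blocks.move_to_end(block_id)   # mark as most recently used
            return self.blocks[block_id]
        self.misses += 1
        data = load_from_disk(block_id)         # simulated disk read
        self.blocks[block_id] = data
        if len(self.blocks) > self.capacity:
            self.blocks.popitem(last=False)     # evict the least recently used block
        return data


def fake_disk_read(block_id):
    """Stand-in for reading an index block from disk."""
    return f"index block {block_id}"


cache = LRUBlockCache(capacity=2)
for block_id in [1, 2, 1, 3, 2]:
    cache.read(block_id, fake_disk_read)

print(cache.hits, cache.misses, list(cache.blocks))
```

With only two buffers, the access pattern above evicts block 2 just before it is needed again, which shows how a cache that is too small for the working set turns repeat accesses back into disk reads.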

For example, the following statements set the size of a key cache named keycache1 and then display it:

Code Sample: TuneMySQL/Demos/Set-Key-Cache.sql

    SET GLOBAL keycache1.key_buffer_size=128*1024;
    -- Read the size back from the named cache;
    -- SHOW VARIABLES LIKE 'key_buffer_size' reports only the default key cache.
    SELECT @@global.keycache1.key_buffer_size;

The key cache is set to 128K.

To destroy a key cache, set its size to zero:

Code Sample: TuneMySQL/Demos/Destroy-Key-Cache.sql

    SET GLOBAL keycache1.key_buffer_size=0;

Index Preloading

If there are enough blocks in a key cache to hold the blocks of an entire index, or at least the blocks corresponding to its non-leaf nodes, you may preload the key cache with index blocks. Preloading reads the index blocks from disk sequentially, which is more efficient than the normal random reads. Without preloading, the blocks are still placed into the key cache as needed by queries. They stay in the cache if there are enough buffers, but they are fetched from disk in random order.

To preload an index into a cache, use the LOAD INDEX INTO CACHE statement. For example, the following statement preloads nodes (index blocks) of indexes of the tables table1 and table2:

Code Sample: TuneMySQL/Demos/Load-Index-Into-Cache.sql

    CREATE TABLE table1 ( id1 SMALLINT NOT NULL AUTO_INCREMENT PRIMARY KEY ) Engine=MyISAM;
    CREATE TABLE table2 ( id2 SMALLINT NOT NULL AUTO_INCREMENT PRIMARY KEY ) Engine=MyISAM;
    LOAD INDEX INTO CACHE table1, table2 IGNORE LEAVES;

The script creates two MyISAM tables, table1 and table2, each with a primary-key index, and then preloads their indexes.

The IGNORE LEAVES modifier causes only blocks for the non-leaf nodes of the index to be preloaded. The statement shown preloads all index blocks from table1, but only blocks for the non-leaf nodes from table2.

Examining Thread Information

It may be worthwhile to examine what your MySQL server is doing in order to identify bottlenecks. You can examine the process list, which is the set of threads currently executing within the server:

Code Sample: TuneMySQL/Demos/Show-Process-List.sql

    SHOW FULL PROCESSLIST;
    -- Can also use (on the command line):
    -- mysqladmin processlist

This shows detailed information about currently running MySQL threads.

Each process list entry contains several pieces of information:

- Id is the connection identifier for the client associated with the thread.
- User and Host indicate the account associated with the thread.
- db is the default database for the thread, or NULL if none is selected.
- Command and State indicate what the thread is engaged in. Most states correspond to very quick operations. If a thread stays in a given state for an unusually long time, there might be a problem that needs further investigation.
- Time indicates how long the thread has been in its current state.
- Info contains the text of the statement being executed by the thread, or NULL if it is not executing one. By default, this value contains only the first 100 characters of the statement. To see the complete statements, use SHOW FULL PROCESSLIST.

See the MySQL documentation for the various Command and State values and their meanings.
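Once process list rows have been fetched, spotting threads that have stayed in one state for a long time is a simple filter. The sketch below works on made-up rows shaped like the columns listed above; the function name `long_running`, the sample data, and the 30-second threshold are assumptions for illustration, and fetching the rows from a server is left out.

```python
def long_running(processlist, min_seconds):
    """Return non-sleeping rows whose Time is at least min_seconds."""
    return [
        row for row in processlist
        if row["Command"] != "Sleep" and row["Time"] >= min_seconds
    ]


# Made-up rows shaped like SHOW FULL PROCESSLIST output.
rows = [
    {"Id": 11, "User": "app", "Host": "web1:51234", "db": "sakila",
     "Command": "Query", "Time": 42, "State": "Sending data",
     "Info": "SELECT * FROM film"},
    {"Id": 12, "User": "app", "Host": "web2:51300", "db": "sakila",
     "Command": "Sleep", "Time": 300, "State": "", "Info": None},
    {"Id": 13, "User": "root", "Host": "localhost", "db": None,
     "Command": "Query", "Time": 0, "State": None,
     "Info": "SHOW FULL PROCESSLIST"},
]

suspects = long_running(rows, min_seconds=30)
for row in suspects:
    print(row["Id"], row["State"], row["Info"])
```

Idle connections are excluded because a large Time value on a sleeping thread only means the client has been quiet, not that a statement is stuck.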

Tuning MySQL for Performance Conclusion

This lesson covered aspects of tuning a MySQL installation at the operating system level.