To really solve a problem, it is imperative to know what the problem is. This applies to database experts greatly, who use an array of techniques to identify MySQL performance issues. These techniques involve taking into account badly designed queries, lack of proper schema, level of normalization, storage, OS and RDBMS settings. We will discuss more about these below.
1. Identifying badly designed queries are a big part of optimization. This is because the major function of a database, such as MySQL is to execute the queries and yield the a) expected results and b) at the correct time. So, how does a MySQL database administrator figure out ‘badly’ designed queries?
In essence, badly designed queries are actually obvious during various testing procedures. A badly designed query is noticeable when:
i) They make the application slow and give a horrible UX to the end-user
ii) They add unnecessary stress on the MySQL server
iii) They prevent other queries from executing/running
iv) Their logic has an infinite loop
Here is one example of a bad vs. good MySQL query:
Bad Practice: “SELECT * FROM user WHERE user_id = 1”
Good Practice: “SELECT username FROM user WHERE user_id = 1”
The differences are more significant with bigger result sets. It is important to identify the MySQL queries where “SELECT *” can be avoided. As more data is read from the tables, the query tends to become slower. This not only increases ‘query execution’ time but also puts great stress on the I/O or disk operations on the database server.
Moreover, since MySQL allows you to use a very useful keyword called “EXPLAIN”. The use of the “EXPLAIN” keyword on a “SELECT” statement allows MySQL database administrators to identify the processes/paths MySQL undergoes to execute the query. This is very useful in identifying bottlenecks and other issues in the table (and query) structures. For web developers, phpmyadmin is a very good tool for this purpose.
In addition to the use of “EXPLAIN” keyword for identifying MySQL performance issues, another trick of the trade is to use indexes. For example, if the MySQL application contains multiple JOIN queries, it is important to ensure that the columns are indexed, and also use the same data type. The other tricks of the trade to allow MySQL itself to help you analyse queries is by using “PROCEDURE ANALYSE ()”. It is useful to use this operation with existing table data so MySQL can make suggestions.
The usage instructions for “PROCEDURE ANALYSE ()” from MySQL developer documentation is as:
“SELECT col1, col2 FROM table1 PROCEDURE ANALYSE (10,2000)”. In this case, the arguments, in order are, max_elements and max_memory. The end results generated from the PROCEDURE ANALYSE operation will allow MySQL experts to actually find out the recommended data types, like for instance SMALLINT instead of INT.
2. Identifying a bad schema of a MySQL database can be one of the major discoveries for a database administrator for his/her quest to performance optimization.
Although there are database schema analyzers in place, it is, in essence, really hard to actually quantify a conceptual entity, such as a database schema, and claim whether it is good, bad or worse. However, there are some basic reasons that can be attributed to bad database designs.
Here are a few examples of badly designed databases:
This is a table called Drinkers. It has the following rows.
|John||10 Odd Lane||Budweiser||A.B||WickedAle|
a) Database is redundant. Whenever there is duplicate data, you know something is really wrong. In this case, the beer names are appearing twice.
b) There seems to be no primary key such an ID that could distinguish the unique row in this particular table. There can obviously be 100 Johns.
The father of databases, Codd, describes this as lack of normalization. The goal of normalization is to avoid anomalies in the relational schema design, such as redundancy and inability of maintaining row’s uniqueness. This can be resolved by normal form or normalization. In an ideal case, the table schema would look like this below:
|1||John||10 Odd Lane||1||2|
The use of two tables, for beers and the drinkers normalized the schema and removes the redundancy. Therefore, the drinkers table has and ID, a primary Key, and BeerLiked and FavoriteBeer, foreign keys, relating/linking both tables.
3. Identifying software issues for improvement of MySQL performance is crucial on one part. But, what about the other side of the story, that’s the hardware? True, to get an all round perspective of all the potential issues for performance of a MySQL database, it is necessary to look at the hardware side. Without it, the analysis is incomplete.
Here are a few scenarios when the OS, Servers, Storage and other hardware components need to be looked at for potential performance issues:
i. The software optimization procedures have been thoroughly carried out and all that could be done has been accomplished on the software level. By quantifying the performance improvements over a period of time in terms of %, it can be obvious when no more software performance improvement would actually bring a major increase in performance of the MySQL database.
This calls for a hardware upgrade, which can be anything ranging from increase storage, RAM, processing power, new operating system or maybe an entire architecture. No one scenario has a specific or standard setup!
The economics of hardware upgrades is itself a trivial topic in its own right, which involves caveats such as incremental upgrades, costs, hardware goals, performance goals and strategies.
ii. In some scenarios, the hardware is pretty much sufficient to handle the load. However, the problem lies balancing hardware usage rather than hardware upgrades.
For example, say you have 10 servers, and only one of them is overloaded, while the rest 9 are almost idle. That’s really bad! Once identified, this can be resolved using the proper architectural measures in place.
iii. When there is a scalability issue, there seems to be a clear goal of X amount of users. This implies that hardware resources be improved to meet the demands of the growing application.
For example, here are the cases where one should upgrade a hardware component:
a) Lookups is slow: Add memory to avoid disk IO.
b) Million users coming soon: Add more storage.
c) Slow disk IO: Add SSD.