My SQL Interview Questions and Answers
Question - 71 : - Using Symbolic Links for Databases and Tables?
Answer - 71 : - You can move tables and databases from the database directory to other locations and replace them with symbolic links to the new locations. You might want to do this, for example, to move a database to a file system with more free space.
If MySQL notices that a table is symbolically linked, it will resolve the symlink and use the table it points to instead. This works on all systems that support the realpath() call (at least Linux and Solaris support realpath())! On systems that don't support realpath(), you should not access the table through the real path and through the symlink at the same time! If you do, the table will be inconsistent after any update.
MySQL doesn't that you link one directory to multiple databases. Replacing a database directory with a symbolic link will work fine as long as you don't make a symbolic link between databases. Suppose you have a database db1 under the MySQL data directory, and then make a symlink db2 that points to db1:
shell> cd /path/to/datadir
shell> ln -s db1 db2
Now, for any table tbl_a in db1, there also appears to be a table tbl_a in db2. If one thread updates db1.tbl_a and another thread updates db2.tbl_a, there will be problems.
If you really need this, you must change the following code in `mysys/mf_format.c':
if (flag & 32 || (!lstat(to,&stat_buff) && S_ISLNK(stat_buff.st_mode)))
to
if (1)
On Windows you can use internal symbolic links to directories by compiling MySQL with -DUSE_SYMDIR. This allows you to put different databases on different disks.
Question - 72 : - How MySQL Opens and Closes Tables ?
Answer - 72 : - table_cache, max_connections, and max_tmp_tables affect the maximum number of files the server keeps open. If you increase one or both of these values, you may run up against a limit imposed by your operating system on the per-process number of open file descriptors. However, you can increase the limit on many systems. Consult your OS documentation to find out how to do this, because the method for changing the limit varies widely from system to system.
table_cache is related to max_connections. For example, for 200 concurrent running connections, you should have a table cache of at least 200 * n, where n is the maximum number of tables in a join.
The cache of open tables can grow to a maximum of table_cache (default 64; this can be changed with the -O table_cache=# option to mysqld). A table is never closed, except when the cache is full and another thread tries to open a table or if you use mysqladmin refresh or mysqladmin flush-tables.
When the table cache fills up, the server uses the following procedure to locate a cache entry to use:
Tables that are not currently in use are released, in least-recently-used order.
If the cache is full and no tables can be released, but a new table needs to be opened, the cache is temporarily extended as necessary.
If the cache is in a temporarily-extended state and a table goes from in-use to not-in-use state, the table is closed and released from the cache.
A table is opened for each concurrent access. This means that if you have two threads accessing the same table or access the table twice in the same query (with AS) the table needs to be opened twice. The first open of any table takes two file descriptors; each additional use of the table takes only one file descriptor. The extra descriptor for the first open is used for the index file; this descriptor is shared among all threads.
You can check if your table cache is too small by checking the mysqld variable opened_tables. If this is quite big, even if you haven't done a lot of FLUSH TABLES, you should increase your table cache.
MySQL - Drawbacks to Creating Large Numbers of Tables in the Same Database
If you have many files in a directory, open, close, and create operations will be slow. If you execute SELECT sta
Question - 73 : - Why So Many Open tables?
Answer - 73 : - When you run mysqladmin status, you'll see something like this:
Uptime: 426 Running threads: 1 Questions: 11082 Reloads: 1 Open tables: 12
This can be somewhat perplexing if you only have 6 tables.
MySQL is multithreaded, so it may have many queries on the same table simultaneously. To minimize the problem with two threads having different states on the same file, the table is opened independently by each concurrent thread. This takes some memory and one extra file descriptor for the data file. The index file descriptor is shared between all threads.
Question - 74 : - How MySQL Locks Tables ?
Answer - 74 : - You can find a discussion about different locking methods in the appendix.
All locking in MySQL is deadlock-free. This is managed by always requesting all needed locks at once at the beginning of a query and always locking the tables in the same order.
The locking method MySQL uses for WRITE locks works as follows:
If there are no locks on the table, put a write lock on it. Otherwise, put the lock request in the write lock queue. The locking method MySQL uses for READ locks works as follows:
If there are no write locks on the table, put a read lock on it. Otherwise, put the lock request in the read lock queue. When a lock is released, the lock is made available to the threads in the write lock queue, then to the threads in the read lock queue.
This means that if you have many updates on a table, SELECT statements will wait until there are no more updates.
To work around this for the case where you want to do many INSERT and SELECT operations on a table, you can insert rows in a temporary table and update the real table with the records from the temporary table once in a while.
This can be done with the following code:
mysql> LOCK TABLES real_table WRITE, insert_table WRITE;
mysql> insert into real_table select * from insert_table;
mysql> TRUNCATE TABLE insert_table;
mysql> UNLOCK TABLES;
You can use the LOW_PRIORITY options with INSERT if you want to prioritize retrieval in some specific cases.
You could also change the locking code in `mysys/thr_lock.c' to use a single queue. In this case, write locks and read locks would have the same priority, which might help some applications.
MySQL - Table Locking Issues
The table locking code in MySQL is deadlock free.
MySQL uses table locking (instead of row locking or column locking) on all table types, except BDB tables, to achieve a very high lock speed. For large tables, table locking is MUCH better than row locking for most applications, but there are, of course, some pitfalls.
For BDB tables, MySQL only uses ta
Question - 75 : - How MySQL uses DNS ?
Answer - 75 : - When a new threads connects to mysqld, mysqld will span a new thread to handle the request. This thread will first check if the hostname is in the hostname cache. If not the thread will call gethostbyaddr_r() and gethostbyname_r() to resolve the hostname.
If the operating system doesn't support the above thread-safe calls, the thread will lock a mutex and call gethostbyaddr() and gethostbyname() instead. Note that in this case no other thread can resolve other hostnames that is not in the hostname cache until the first thread is ready.
You can disable DNS host lookup by starting mysqld with --skip-name-resolve. In this case you can however only use IP names in the MySQL privilege tables.
If you have a very slow DNS and many hosts, you can get more performance by either disabling DNS lookop with --skip-name-resolve or by increasing the HOST_CACHE_SIZE define (default: 128) and recompile mysqld.
You can disable the hostname cache with --skip-host-cache. You can clear the hostname cache with FLUSH HOSTS or mysqladmin flush-hosts.
If you don't want to allow connections over TCP/IP, you can do this by starting mysqld with --skip-networking.
MySQL - Get Your Data as Small as Possible
One of the most basic optimization is to get your data (and indexes) to take as little space on the disk (and in memory) as possible. This can give huge improvements because disk reads are faster and normally less main memory will be used. Indexing also takes less resources if done on smaller columns.
MySQL supports a lot of different table types and row formats. Choosing the right table format may give you a big performance gain.
You can get better performance on a table and minimize storage space using the techniques listed below:
Use the most efficient (smallest) types possible. MySQL has many specialized types that save disk space and memory.
Use the smaller integer types if possible to get smaller tables. For example, MEDIUMINT is often better than INT.
Declare columns to be NOT NULL if possible. It makes everything faster and you save one bit per column. Note that if you really need NULL in your application you should definitely use
Question - 76 : - How MySQL Uses Indexes ?
Answer - 76 : - Indexes are used to find rows with a specific value of one column fast. Without an index MySQL has to start with the first record and then read through the whole table until it finds the relevant rows. The bigger the table, the more this costs. If the table has an index for the colums in question, MySQL can quickly get a position to seek to in the middle of the data file without having to look at all the data. If a table has 1000 rows, this is at least 100 times faster than reading sequentially. Note that if you need to access almost all 1000 rows it is faster to read sequentially because we then avoid disk seeks.
All MySQL indexes (PRIMARY, UNIQUE, and INDEX) are stored in B-trees. Strings are automatically prefix- and end-space compressed.
Indexes are used to:
Quickly find the rows that match a WHERE clause.
Retrieve rows from other tables when performing joins.
Find the MAX() or MIN() value for a specific indexed column. This is optimized by a preprocessor that checks if you are using WHERE key_part_# = constant on all key parts < N. In this case MySQL will do a single key lookup and replace the MIN() expression with a constant. If all expressions are replaced with constants, the query will return at once:
SELECT MIN(key_part2),MAX(key_part2) FROM table_name where key_part1=10
Sort or group a table if the sorting or grouping is done on a leftmost prefix of a usable key (for example, ORDER BY key_part_1,key_part_2 ). The key is read in reverse order if all key parts are followed by DESC. The index can also be used even if the ORDER BY doesn't match the index exactly, as long as all the unused index parts and all the extra are ORDER BY columns are constants in the WHERE clause. The following queries will use the index to resolve the ORDER BY part:
SELECT * FROM foo ORDER BY key_part1,key_part2,key_part3;
SELECT * FROM foo WHERE column=constant ORDER BY column, key_part1;
SELECT * FROM foo WHERE key_part1=const GROUP BY key_part2;
In some cases a query can be optimized to retrieve values without consulting the data file. If all used columns for some table are numeric and form a leftmost prefix for some key, the values may be retrieved from the index tree for greater speed:
SELECT key_part3 FROM table_name WHERE key_
Question - 77 : - Speed of Queries that Access or Update Data ?
Answer - 77 : - First, one thing that affects all queries: The more complex permission system setup you have, the more overhead you get.
If you do not have any GRANT statements done, MySQL will optimize the permission checking somewhat. So if you have a very high volume it may be worth the time to avoid grants. Otherwise more permission check results in a larger overhead.
If your problem is with some explicit MySQL function, you can always time this in the MySQL client:
mysql> select benchmark(1000000,1+1);
+------------------------+
| benchmark(1000000,1+1) |
+------------------------+
| 0 |
+------------------------+
1 row in set (0.32 sec)
The above shows that MySQL can execute 1,000,000 + expressions in 0.32 seconds on a PentiumII 400MHz.
All MySQL functions should be very optimized, but there may be some exceptions, and the benchmark(loop_count,expression) is a great tool to find out if this is a problem with your query.
Question - 78 : - How MySQL Optimizes WHERE Clauses ?
Answer - 78 : - The WHERE optimizations are put in the SELECT part here because they are mostly used with SELECT, but the same optimizations apply for WHERE in DELETE and UPDATE statements.
Also note that this section is incomplete. MySQL does many optimizations, and we have not had time to document them all.
Some of the optimizations performed by MySQL are listed below:
Removal of unnecessary parentheses:
((a AND b) AND c OR (((a AND b) AND (c AND d))))
-> (a AND b AND c) OR (a AND b AND c AND d)
Constant folding:
(a b>5 AND b=c AND a=5
Constant condition removal (needed because of constant folding):
(B>=5 AND B=5) OR (B=6 AND 5=50) OR (B=7 AND 5=6)
-> B=5 OR B=6
Constant expressions used by indexes are evaluated only once.
COUNT(*) on a single table without a WHERE is retrieved directly from the table information. This is also done for any NOT NULL expression when used with only one table.
Early detection of invalid constant expressions. MySQL quickly detects that some SELECT statements are impossible and returns no rows.
HAVING is merged with WHERE if you don't use GROUP BY or group functions (COUNT(), MIN()...).
For each sub-join, a simpler WHERE is constructed to get a fast WHERE evaluation for each sub-join and also to skip records as soon as possible.
All constant tables are read first, before any other tables in the query. A constant table is:
An empty table or a table with 1 row.
A table that is used with a WHERE clause on a UNIQUE index, or a PRIMARY KEY, where all index parts are used with constant expressions and the index parts are defined as NOT NULL.
All the following tables are used as constant tables:
mysql> SELECT * FROM t WHERE primary_key=1;
mysql> SELECT * FROM t1,t2
WHERE t1.primary_key=1 AND t2.primary_key=t1.id;
The best join combination to join the tables is found by trying all possibilities. If all columns in ORDER BY and in GROUP BY come from the same table, then this table is preferred first when joining.
If there is an ORDER BY clause and a different GROUP BY clause, or if the ORDER BY or GROUP BY contains columns from
Question - 79 : - How MySQL Optimizes LIMIT ?
Answer - 79 : - In some cases MySQL will handle the query differently when you are using LIMIT # and not using HAVING:
If you are selecting only a few rows with LIMIT, MySQL will use indexes in some cases when it normally would prefer to do a full table scan.
If you use LIMIT # with ORDER BY, MySQL will end the sorting as soon as it has found the first # lines instead of sorting the whole table.
When combining LIMIT # with DISTINCT, MySQL will stop as soon as it finds # unique rows.
In some cases a GROUP BY can be resolved by reading the key in order (or do a sort on the key) and then calculate summaries until the key value changes. In this case LIMIT # will not calculate any unnecessary GROUP BY's.
As soon as MySQL has sent the first # rows to the client, it will abort the query.
LIMIT 0 will always quickly return an empty set. This is useful to check the query and to get the column types of the result columns.
The size of temporary tables uses the LIMIT # to calculate how much space is needed to resolve the query
Question - 80 : - Speed of INSERT Queries ?
Answer - 80 : - The time to insert a record consists approximately of:
Connect: (3)
Sending query to server: (2)
Parsing query: (2)
Inserting record: (1 x size of record)
Inserting indexes: (1 x number of indexes)
Close: (1)
where the numbers are somewhat proportional to the overall time. This does not take into consideration the initial overhead to open tables (which is done once for each concurrently running query).
The size of the table slows down the insertion of indexes by N log N (B-trees).
Some ways to speed up inserts:
If you are inserting many rows from the same client at the same time, use multiple value lists INSERT statements. This is much faster (many times in some cases) than using separate INSERT statements.
If you are inserting a lot of rows from different clients, you can get higher speed by using the INSERT DELAYED statement.
Note that with MyISAM you can insert rows at the same time SELECTs are running if there are no deleted rows in the tables. When loading a table from a text file, use LOAD DATA INFILE. This is usually 20 times faster than using a lot of INSERT statements.
It is possible with some extra work to make LOAD DATA INFILE run even faster when the table has many indexes. Use the following procedure:
Optionally create the table with CREATE TABLE. For example, using mysql or Perl-DBI.
Execute a FLUSH TABLES statement or the shell command mysqladmin flush-tables.
Use myisamchk --keys-used=0 -rq /path/to/db/tbl_name. This will remove all usage of all indexes from the table.
Insert data into the table with LOAD DATA INFILE. This will not update any indexes and will therefore be very fast.
If you are going to only read the table in the future, run myisampack on it to make it smaller.
Re-create the indexes with myisamchk -r -q /path/to/db/tbl_name. This will create the index tree in memory before writing it to disk, which is much faster because it avoids lots of disk seeks. The resulting index tree is also perfectly balanced. Execute a FLUSH TABLES statement or the shell command mysqladmin flush-tables.
This procedure will be built into LOAD DATA INFILE in some future version of MySQL.
You can