Poor man’s Online Optimize in 5.6

Tablespace fragmentation generally has two origins:

  • File system fragmentation: the data file is physically spread over many non-contiguous locations on the disk.
  • Internal fragmentation: the data and index pages contain “holes”, which appear when rows are deleted or updated, especially at random.

As a result, tablespace fragmentation hurts performance: the data typically takes more space on disk and in memory, and the disk is busier than it should be.

File system fragmentation can be detected with the filefrag command on Linux (similar tools exist on other operating systems). With MyISAM, the MYI files are usually far more fragmented on the file system than the MYD files.

ls -al frag
total 883304
drwx------  2 aadant common      4096 Sep 30 18:41 .
drwxr-xr-x 17 aadant common      4096 Sep 30 18:59 ..
-rw-rw----  1 aadant common        65 Sep 30 18:40 db.opt
-rw-rw----  1 aadant common      8608 Sep 30 18:41 t.frm
-rw-rw----  1 aadant common 551944192 Sep 30 19:41 t.MYD
-rw-rw----  1 aadant common 426150912 Sep 30 19:41 t.MYI
filefrag frag/t.MYD
frag/t.MYD: 23 extents found
filefrag frag/t.MYI
frag/t.MYI: 4949 extents found

There is no general formula to measure internal InnoDB fragmentation. It is possible to compute the theoretical size of a table and compare it to its actual size on disk, but that only works for fixed-length data types. I use the following trick, mentioned in 50 Tips for Boosting MySQL Performance [CON2655]: compare the fragmented table to a much smaller table that has the same table definition and holds sample data. It works well whenever the average row length can be estimated from the sample data, even for variable-length rows.

create table t_defrag like t; 
insert into t_defrag select * from t limit 20000;

Here, 20,000 rows (15-20 MB on disk) give a good idea. Inserting the sample data into an empty copy of the table effectively defragments the sample, since the copy ends up containing only densely packed, non-empty rows.

The table is likely internally fragmented

if Avg_row_length(t) > Avg_row_length(t_defrag)

where Avg_row_length comes from show table status.
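
This comparison is easy to script. Here is a minimal sketch, assuming the t / t_defrag names from above (information_schema.tables exposes the same Avg_row_length value that show table status reports):

select t.avg_row_length as fragmented,
       d.avg_row_length as sample,
       round(t.avg_row_length / d.avg_row_length, 2) as ratio
  from information_schema.tables t
  join information_schema.tables d
    on d.table_schema = t.table_schema
   and d.table_name = concat(t.table_name, '_defrag')
 where t.table_schema = database()
   and t.table_name = 't';

A ratio well above 1 suggests the table is worth rebuilding.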

Example: some random data is created with uniformly distributed random row lengths: 1M rows, 640 MB. Internal fragmentation is then introduced by deleting the even rows (one row out of every two).

drop database if exists test_case2;
create database test_case2;
use test_case2;
set global query_cache_size = 0;
set global innodb_flush_log_at_trx_commit = 2;
drop table if exists t1;
create table t1(
id int primary key auto_increment,
k  varchar(50),
value varchar(1000), unique key(k)) engine=InnoDB;
set @t1 =(select now());
set @i := 1;
-- two seed rows, then a 20-way self-join generates 2^20 = 1,048,576 more
insert into t1(k,value) values(@i:=@i+1,repeat('a',rand()*1000));
insert into t1(k,value) values(@i:=@i+1,repeat('a',rand()*1000));
replace into t1(k,value) select @i:=@i+1,repeat('a',rand()*1000) from t1 t1, t1 t2, t1 t3, t1 t4, t1 t5, t1 t6, t1 t7, t1 t8, t1 t9, t1 t10, t1 t11, t1 t12, t1 t13, t1 t14, t1 t15, t1 t16, t1 t17,
t1 t18, t1 t19, t1 t20;
set @t2 =(select now());
select @@version,timediff(@t2,@t1) duration;

analyze table t1\G
show table status like 't1'\G
delete from t1 where mod(id,2) = 0; 
show table status like 't1'\G
drop table if exists t1_defrag;
create table t1_defrag like t1; 
insert into t1_defrag select * from t1 limit 20000;
analyze table t1\G
analyze table t1_defrag\G
show table status like 't1'\G
show table status like 't1_defrag'\G

mysql> show table status like 't1'\G
*************************** 1. row ***************************
           Name: t1
         Engine: InnoDB
        Version: 10
     Row_format: Compact
           Rows: 476676
 Avg_row_length: 1293
    Data_length: 616562688
Max_data_length: 0
   Index_length: 37257216
      Data_free: 5242880
 Auto_increment: 1114098
    Create_time: 2013-09-30 15:19:01
    Update_time: NULL
     Check_time: NULL
      Collation: latin1_swedish_ci
       Checksum: NULL
 Create_options:
        Comment:
1 row in set (0.01 sec)

mysql> show table status like 't1_defrag'\G
*************************** 1. row ***************************
           Name: t1_defrag
         Engine: InnoDB
        Version: 10
     Row_format: Compact
           Rows: 19773
 Avg_row_length: 610
    Data_length: 12075008
Max_data_length: 0
   Index_length: 1589248
      Data_free: 4194304
 Auto_increment: 40000
    Create_time: 2013-09-30 15:23:38
    Update_time: NULL
     Check_time: NULL
      Collation: latin1_swedish_ci
       Checksum: NULL
 Create_options:
        Comment:
1 row in set (0.01 sec)

ls -al data/test_case2/t1*
-rw-rw---- 1 aadant common      8612 Sep 30 20:39 data/test_case2/t1_defrag.frm
-rw-rw---- 1 aadant common  22020096 Sep 30 20:39 data/test_case2/t1_defrag.ibd
-rw-rw---- 1 aadant common      8612 Sep 30 20:40 data/test_case2/t1.frm
-rw-rw---- 1 aadant common 671088640 Sep 30 20:44 data/test_case2/t1.ibd

As you can see, the table t1 is fragmented: its average row length (1293) is more than twice that of t1_defrag (610).

Fortunately, there is a command that fixes both file system and internal fragmentation for most storage engines:

optimize table t1;
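
For an InnoDB table, optimize typically returns a note confirming what really happens under the hood (the message text is the documented one; the output shape is sketched here, timings omitted):

mysql> optimize table t1;
+---------------+----------+----------+--------------------------------------------------------------------+
| Table         | Op       | Msg_type | Msg_text                                                           |
+---------------+----------+----------+--------------------------------------------------------------------+
| test_case2.t1 | optimize | note     | Table does not support optimize, doing recreate + analyze instead  |
| test_case2.t1 | optimize | status   | OK                                                                 |
+---------------+----------+----------+--------------------------------------------------------------------+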

For InnoDB, optimize actually rebuilds the table and then analyzes it, so this command works as well:

alter table t1 engine=InnoDB;

mysql> alter table t1 engine=InnoDB;
Query OK, 524289 rows affected (5 min 58.99 sec)
Records: 524289  Duplicates: 0  Warnings: 0

mysql> show table status like 't1'\G
*************************** 1. row ***************************
           Name: t1
         Engine: InnoDB
        Version: 10
     Row_format: Compact
           Rows: 500516
 Avg_row_length: 611
    Data_length: 305971200
Max_data_length: 0
   Index_length: 18399232
      Data_free: 5242880
 Auto_increment: 1048578
    Create_time: 2013-09-30 19:53:47
    Update_time: NULL
     Check_time: NULL
      Collation: latin1_swedish_ci
       Checksum: NULL
 Create_options:
        Comment:
1 row in set (0.26 sec)

After the table rebuild, the average row length is back to normal and the table size on disk is minimal: here roughly half the original size, 324 MB.

ls -al data/test_case2/t1*
-rw-rw---- 1 aadant common      8612 Sep 30 20:35 data/test_case2/t1.frm
-rw-rw---- 1 aadant common 339738624 Sep 30 20:37 data/test_case2/t1.ibd

Note that the optimize and alter table engine=InnoDB commands take an exclusive lock on the table.
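
Here is a minimal sketch of that blocking behavior before 5.6.17, using two mysql sessions (the statements are illustrative):

-- session 1: the rebuild holds the lock for its whole duration
optimize table t1;

-- session 2, started meanwhile: the write hangs until session 1 finishes,
-- showing "Waiting for table metadata lock" in show processlist
insert into t1(k,value) values ('9999999','blocked');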

Update: as of MySQL 5.6.17, optimize table and alter table ... engine=InnoDB are online operations. See the blog post below and the MySQL manual for more information.

http://mysqlserverteam.com/mysql-5-6-17-improved-online-optimize-table-for-innodb-and-partitioned-innodb-tables/

It means that if you use these commands before MySQL 5.6.17, your application is blocked during the table rebuild: writes, other DDL, and eventually even reads (on the metadata lock). So most of the time these commands were run in maintenance windows, or replaced with special tricks such as pt-online-schema-change or Online Schema Change at Facebook.
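
For reference, the pt-online-schema-change variant looks roughly like this (a sketch; the DSN below assumes the test schema from this post and omits connection options):

pt-online-schema-change --alter "ENGINE=InnoDB" D=test_case2,t=t1 --execute

The tool copies the rows into a shadow table and atomically swaps it in, using triggers to capture concurrent writes.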

In MySQL 5.6, this is no longer required in most cases, thanks to InnoDB online DDL, a major 5.6 feature. Even though optimize table is not an online operation, in practice you can use this online DDL instead:

alter table t1 row_format = <row_format>;

where <row_format> is the row format reported by show table status. Usually:

alter table t1 row_format = Compact;
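
To make the online behavior explicit, MySQL 5.6 lets you state the requirements in the statement itself; the DDL then fails immediately instead of silently falling back to a blocking table copy. A sketch:

alter table t1 row_format = Compact, algorithm = inplace, lock = none;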

Make sure to also check innodb_sort_buffer_size, which can speed up online DDL.
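
Note that innodb_sort_buffer_size is not a dynamic variable: it must be set in the configuration file and takes effect at restart. A sketch matching the value used in the timing at the end of this post:

[mysqld]
# buffer used to sort index entries during online DDL and index builds
innodb_sort_buffer_size = 32M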

mysql> alter table t1 row_format=Compact;
Query OK, 0 rows affected (5 min 26.53 sec)
Records: 0  Duplicates: 0  Warnings: 0

mysql> show table status like 't1'\G
*************************** 1. row ***************************
           Name: t1
         Engine: InnoDB
        Version: 10
     Row_format: Compact
           Rows: 502312
 Avg_row_length: 613
    Data_length: 308068352
Max_data_length: 0
   Index_length: 9977856
      Data_free: 0
 Auto_increment: 1114098
    Create_time: 2013-09-30 20:53:41
    Update_time: NULL
     Check_time: NULL
      Collation: latin1_swedish_ci
       Checksum: NULL
 Create_options: row_format=COMPACT
        Comment:
1 row in set (0.02 sec)

The disk usage is even slightly smaller than after the classical rebuild (304 MB vs 324 MB), and the operation is also slightly faster.

ls -al data/test_case2/t1*
-rw-rw---- 1 aadant common      8612 Sep 30 20:51 data/test_case2/t1.frm
-rw-rw---- 1 aadant common 318767104 Sep 30 20:53 data/test_case2/t1.ibd

With innodb_sort_buffer_size = 32M, the alter is faster:

mysql> alter table t1 row_format=Compact;
Query OK, 0 rows affected (2 min 51.96 sec)
Records: 0  Duplicates: 0  Warnings: 0

 

