Saturday, January 21, 2012

InnoDB plugin row format performance

Here is a quick comparison of InnoDB plugin performance across the recently introduced row formats and compression settings.

The test table is a simple one (shown here in its 8K compressed variant):

CREATE TABLE `sbtest` (
`id` int(10) unsigned NOT NULL,
`k` int(10) unsigned NOT NULL DEFAULT '0',
`c` char(120) NOT NULL DEFAULT '',
`pad` char(60) NOT NULL DEFAULT '',
PRIMARY KEY (`id`),
KEY `k` (`k`)
) ENGINE=InnoDB ROW_FORMAT=COMPRESSED KEY_BLOCK_SIZE=8;

The table is populated with 10M rows, with an average row length of 224 bytes. The tests cover the Compact, Dynamic and Compressed (8K and 4K) row formats, using MySQL-5.1.24 with InnoDB plugin-1.0.0-5.1 on a Dell PE2950 (one quad-core Xeon, 16G RAM, RAID-10) running 64-bit RHEL-4.

Here are the four test scenarios:

1. No compression, ROW_FORMAT=Compact
2. ROW_FORMAT=Compressed with KEY_BLOCK_SIZE=8
3. ROW_FORMAT=Compressed with KEY_BLOCK_SIZE=4
4. ROW_FORMAT=Dynamic

All of the above tests are repeated with innodb_buffer_pool_size set to 6G and to 512M, so that in one case everything fits in memory and in the other the data overflows the buffer pool. The remaining InnoDB settings are left at their defaults, except for innodb_thread_concurrency=32.

Here is the summary of the test results:

Table Load:

Load time from a SQL dump script containing 10M rows (inserts not batched):

Compact   Compressed (8K)   Compressed (4K)   Dynamic
28m 18s   29m 46s           36m 43s           27m 55s
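
To put the load times on a common scale, the sketch below converts them to seconds and expresses each format as a percentage change against Compact. The figures are copied from the table above; the helper names (`to_seconds`, `overhead_vs_baseline`) are made up for this illustration and are not part of any MySQL tooling.

```python
# Load times as reported in the table above, as (minutes, seconds).
LOAD_TIMES = {
    "Compact": (28, 18),
    "Compressed (8K)": (29, 46),
    "Compressed (4K)": (36, 43),
    "Dynamic": (27, 55),
}


def to_seconds(minutes, seconds):
    """Convert a (minutes, seconds) pair to a number of seconds."""
    return minutes * 60 + seconds


def overhead_vs_baseline(times, baseline="Compact"):
    """Percentage change in load time relative to the baseline row format."""
    base = to_seconds(*times[baseline])
    return {
        name: round(100.0 * (to_seconds(*t) - base) / base, 1)
        for name, t in times.items()
    }


if __name__ == "__main__":
    for name, pct in overhead_vs_baseline(LOAD_TIMES).items():
        print(f"{name:16s} {pct:+.1f}%")
```

On these numbers the 8K format loads about 5% slower than Compact and the 4K format about 30% slower, while Dynamic is within a couple of percent. This is a single run per format, so small differences should not be over-interpreted.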

File Sizes:

Here is the size of the .ibd file after each data load:

Compact   Compressed (8K)   Compressed (4K)   Dynamic
2.3G      1.2G              592M              2.3G
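
The ratios behind these sizes can be checked with a few lines of arithmetic. The sketch below assumes the sizes are in binary units (1G = 1024M), as typically printed by `ls -lh` or `du -h`; since the inputs are already rounded, the results are approximate. `parse_size` and `size_ratio` are illustrative names only.

```python
# Multipliers for the size suffixes used in the table (binary units assumed).
UNITS = {"M": 1024 ** 2, "G": 1024 ** 3}

# .ibd file sizes as reported above.
IBD_SIZES = {
    "Compact": "2.3G",
    "Compressed (8K)": "1.2G",
    "Compressed (4K)": "592M",
    "Dynamic": "2.3G",
}


def parse_size(text):
    """Convert a size such as '2.3G' or '592M' to bytes."""
    return float(text[:-1]) * UNITS[text[-1]]


def size_ratio(sizes, baseline="Compact"):
    """File size of each format as a fraction of the baseline format."""
    base = parse_size(sizes[baseline])
    return {name: round(parse_size(s) / base, 2) for name, s in sizes.items()}


if __name__ == "__main__":
    for name, ratio in size_ratio(IBD_SIZES).items():
        print(f"{name:16s} {ratio:.2f} of Compact")
```

With these inputs the 8K file is roughly half the size of the Compact file and the 4K file roughly a quarter.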

Data and Index Size from Table Status:

Here are the data and index sizes in bytes from SHOW TABLE STATUS. Note that they reflect the original (uncompressed) data size rather than the compressed size:

       Compact      Compressed (8K)  Compressed (4K)  Dynamic
Data   2247098368   2247098368       2249195520       2247098368
Index  137019392    137035776        160301056        137019392
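
These figures also allow a rough check of the two buffer pool sizes used in the tests. The sketch below adds the Compact data and index sizes and compares the total with each pool. It is only an approximation: it ignores other buffer pool consumers, and for compressed tables InnoDB may keep both compressed and uncompressed copies of a page in memory.

```python
MIB = 1024 ** 2
GIB = 1024 ** 3

# Compact row format figures from SHOW TABLE STATUS (bytes).
DATA_LENGTH = 2247098368
INDEX_LENGTH = 137019392

# The two innodb_buffer_pool_size settings used in the tests.
BUFFER_POOLS = {"512M": 512 * MIB, "6G": 6 * GIB}

TOTAL = DATA_LENGTH + INDEX_LENGTH


def fits_in_pool(working_set, pool_bytes):
    """True if the whole data set could be cached in the buffer pool."""
    return working_set <= pool_bytes


if __name__ == "__main__":
    print(f"data + index = {TOTAL / GIB:.2f} GiB")
    for label, size in BUFFER_POOLS.items():
        verdict = "fits" if fits_in_pool(TOTAL, size) else "overflows"
        print(f"{label:>4s} buffer pool: {verdict} ({TOTAL / size:.2f}x pool size)")
```

The table is a little over 2.2 GiB, so it fits comfortably in the 6G pool and is more than four times the size of the 512M pool.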

Compression Stats:

Here are the compression statistics from information_schema.INNODB_CMP after the table is populated. Note that 4K takes more operations and more time for both compression and decompression:

      Page_size  Compress_ops  Compress_ops_ok  Compress_time  Uncompress_ops  Uncompress_time
8K    8192       446198        445598           73             300             0
4K    4096       1091421       1012917          463            38801           13
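
Two derived numbers make the difference easier to see: the share of compression attempts that did not succeed (Compress_ops minus Compress_ops_ok) and the average time per attempt. The sketch below computes both from the rows above. It assumes Compress_time is reported in whole seconds, so the per-operation time is coarse; `CmpRow` and `summarize` are names invented for this example.

```python
from collections import namedtuple

CmpRow = namedtuple(
    "CmpRow",
    "page_size compress_ops compress_ops_ok compress_time "
    "uncompress_ops uncompress_time",
)

# Rows copied from the INNODB_CMP output above.
ROWS = [
    CmpRow(8192, 446198, 445598, 73, 300, 0),
    CmpRow(4096, 1091421, 1012917, 463, 38801, 13),
]


def summarize(row):
    """Derive failure rate and average compression time for one page size."""
    failed = row.compress_ops - row.compress_ops_ok
    return {
        "page_size": row.page_size,
        "failed_ops": failed,
        "failure_pct": round(100.0 * failed / row.compress_ops, 2),
        # Assumes compress_time is in seconds; converted to milliseconds.
        "ms_per_compress": round(1000.0 * row.compress_time / row.compress_ops, 3),
    }


if __name__ == "__main__":
    for row in ROWS:
        print(summarize(row))
```

On these figures about 0.1% of compression attempts fail at 8K against roughly 7% at 4K. A failed attempt generally means the page did not fit in the compressed block and had to be split or reorganized before trying again, which would be consistent with the longer 4K load time.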

Performance:

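Here is the performance of the various row formats with 1 to 512 threads, for both the 512M and 6G buffer pool sizes, under concurrent reads and writes.

[Figure: compress512m (performance by row format, 512M buffer pool)]

[Figure: compress6g (performance by row format, 6G buffer pool)]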

Observations:

A few key observations from the performance tests. I made them without looking at any of the sources, so I could be wrong; corrections are welcome. It is hard to draw firm conclusions from these input scenarios alone, but they help to estimate the trade-offs.

* The load time is almost the same across formats, except that 4K compression takes noticeably longer than the rest; compression in general costs a little INSERT/load performance.
* With Compact or Dynamic there is no compression, so the data and index file sizes are almost the same.
* SHOW TABLE STATUS for a compressed table reports the original Data_Length and Index_Length statistics rather than the compressed ones (this may be a bug, or InnoDB needs to extend SHOW TABLE STATUS or provide some other means to show compressed sizes; right now the only option is to check the file sizes manually).
* 8K compression reduced the .ibd file by nearly 50% (1.2G versus 2.3G), and 4K compression reduced it to about one quarter of the original size (592M versus 2.3G); this can vary with table structure and data.
* 8K compression takes fewer operations and less time for both compression and decompression than 4K, as expected.
* When the InnoDB buffer pool is large enough to hold the data in memory, compression adds a little overhead, but you save disk space.
* When the data overflows the buffer pool (IO bound), compression seems to really help.
* 4K compression in general seems to be slower than 8K or any other row format.

IPv4 vs IPv6

I compiled these differences between IPv4 and IPv6 a long time ago. I hope someone finds them useful.

IPv4: Addresses are 32 bits (4 bytes) in length.
IPv6: Addresses are 128 bits (16 bytes) in length.

IPv4: Address (A) resource records in DNS map host names to IPv4 addresses.
IPv6: Address (AAAA) resource records in DNS map host names to IPv6 addresses.

IPv4: Pointer (PTR) resource records in the IN-ADDR.ARPA DNS domain map IPv4 addresses to host names.
IPv6: Pointer (PTR) resource records in the IP6.ARPA DNS domain map IPv6 addresses to host names.

IPv4: IPsec support is optional.
IPv6: IPsec support was mandatory in the original specification (RFC 6434 later relaxed this to a recommendation).

IPv4: The header does not identify packet flows for QoS handling by routers.
IPv6: The header contains a Flow Label field, which identifies packet flows for QoS handling by routers.

IPv4: Both routers and the sending host fragment packets.
IPv6: Routers do not fragment packets; only the sending host does.

IPv4: The header includes a checksum.
IPv6: The header does not include a checksum.

IPv4: The header includes options.
IPv6: Optional data is carried in extension headers.

IPv4: ARP uses broadcast ARP Request frames to resolve an IP address to a MAC (hardware) address.
IPv6: Multicast Neighbor Solicitation messages resolve IP addresses to MAC addresses.

IPv4: Internet Group Management Protocol (IGMP) manages membership in local subnet groups.
IPv6: Multicast Listener Discovery (MLD) messages manage membership in local subnet groups.

IPv4: Broadcast addresses are used to send traffic to all nodes on a subnet.
IPv6: There are no broadcast addresses; a link-local scope all-nodes multicast address is used instead.

IPv4: Must be configured either manually or through DHCP.
IPv6: Does not require manual configuration or DHCP (stateless address autoconfiguration is available).

IPv4: Must support a 576-byte packet size (possibly fragmented).
IPv6: Must support a 1280-byte packet size (without fragmentation).
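
A few of the rows above (address length, the reverse DNS domains and the all-nodes multicast address) can be verified with Python's standard `ipaddress` module. This is a minimal sketch: the sample addresses come from the documentation ranges 192.0.2.0/24 and 2001:db8::/32, and `describe` is a helper made up for this example.

```python
import ipaddress


def describe(address):
    """Summarize address length and reverse DNS name for an IPv4/IPv6 address."""
    ip = ipaddress.ip_address(address)
    return {
        "version": ip.version,
        "bits": ip.max_prefixlen,       # 32 for IPv4, 128 for IPv6
        "bytes": len(ip.packed),        # 4 for IPv4, 16 for IPv6
        "ptr_name": ip.reverse_pointer,  # name used for the PTR lookup
    }


# Documentation-range sample addresses (not real hosts).
V4 = describe("192.0.2.1")
V6 = describe("2001:db8::1")

# ff02::1 is the link-local scope all-nodes multicast address.
ALL_NODES = ipaddress.ip_address("ff02::1")

if __name__ == "__main__":
    print(V4)
    print(V6)
    print("ff02::1 is multicast:", ALL_NODES.is_multicast)
```

The IPv4 PTR name reverses the four octets under in-addr.arpa, while the IPv6 name reverses all 32 hexadecimal nibbles under ip6.arpa, which is why IPv6 reverse names are so much longer.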

Network Sorcery is a great place to find RFCs.

Refer to http://www.networksorcery.com/enp/protocol/ipv6.htm and http://www.networksorcery.com/enp/protocol/ip.htm for the RFCs related to IPv6 and IPv4 respectively.

There is also a good reference for understanding IPv6 at http://technet.microsoft.com/en-us/library/cc786127.aspx