9. A deep look at MySQL 5.5 partitioning enhancements

Source: http://dev.mysql.com/tech-resources/articles/mysql_55_partitioning.html

The release of MySQL 5.5 has brought several enhancements. While most of the coverage went, understandably, to semi-synchronous replication[1], the partitioning enhancements were neglected, and their true meaning was sometimes misunderstood. With this article, we want to explain these cool enhancements, especially the parts that were not fully understood.

The intuitive part: partition by non-integer columns

Anyone who has used partitions so far (see MySQL 5.1 partitions in practice) has experienced some frustration at the number of problems to face when using non-integer columns. Partitions in 5.1 can only deal with integer values, meaning that if you want to partition on dates or strings, you have to convert these columns with a function.

The new additions work with RANGE and LIST partitioning. There is a new COLUMNS keyword that introduces the new functionality.

Let's assume a table like this one:

CREATE TABLE expenses (
  expense_date DATE NOT NULL,
  category VARCHAR(30),
  amount DECIMAL (10,3)
);

If you want to partition by category in MySQL 5.1, you have to convert categories into integers, with an additional lookup table. As of MySQL 5.5, you can simply do:

ALTER TABLE expenses
PARTITION BY LIST COLUMNS (category)
(
  PARTITION p01 VALUES IN ('lodging', 'food'),
  PARTITION p02 VALUES IN ('flights', 'ground transportation'),
  PARTITION p03 VALUES IN ('leisure', 'customer entertainment'),
  PARTITION p04 VALUES IN ('communications'),
  PARTITION p05 VALUES IN ('fees')
);

This statement, in addition to being clearly readable and to organizing the data into efficient chunks, has the beneficial side effect of ensuring that only the listed categories are accepted.
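
As a hedged illustration of that side effect, the sketch below assumes the expenses table has been partitioned as above; the sample rows and the 'office supplies' category are made up. A row whose category appears in one of the partition lists is stored, while a row with an unlisted category is expected to be rejected with an error rather than stored.

```sql
-- Accepted: 'food' is listed in partition p01.
INSERT INTO expenses (expense_date, category, amount)
VALUES ('2010-01-15', 'food', 25.500);

-- Expected to fail: 'office supplies' (a made-up category) is not listed
-- in any partition, so there is no partition that can hold the row.
INSERT INTO expenses (expense_date, category, amount)
VALUES ('2010-01-16', 'office supplies', 12.000);
```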

Another pain point in MySQL 5.1 is the handling of date columns. You can't use them directly; you need to convert such columns using either YEAR or TO_DAYS, in situations like this one:

/* with MySQL 5.1*/
CREATE TABLE t2
(
  dt DATE
)
PARTITION BY RANGE (TO_DAYS(dt))
(
  PARTITION p01 VALUES LESS THAN (TO_DAYS('2007-01-01')),
  PARTITION p02 VALUES LESS THAN (TO_DAYS('2008-01-01')),
  PARTITION p03 VALUES LESS THAN (TO_DAYS('2009-01-01')),
  PARTITION p04 VALUES LESS THAN (MAXVALUE));
 
SHOW CREATE TABLE t2 \G
*************************** 1. row ***************************
       Table: t2
Create Table: CREATE TABLE `t2` (
  `dt` date DEFAULT NULL
) ENGINE=MyISAM DEFAULT CHARSET=latin1
/*!50100 PARTITION BY RANGE (TO_DAYS(dt))
(PARTITION p01 VALUES LESS THAN (733042) ENGINE = MyISAM,
 PARTITION p02 VALUES LESS THAN (733407) ENGINE = MyISAM,
 PARTITION p03 VALUES LESS THAN (733773) ENGINE = MyISAM,
 PARTITION p04 VALUES LESS THAN MAXVALUE ENGINE = MyISAM) */

How dreadful. A real pain in the ... code. Of course, there were workarounds, but they took a lot of trouble. Not to mention that it was really puzzling to define a partition using YEAR or TO_DAYS and then have to query by the bare column, because queries by function did not trigger partition pruning[2].
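
The puzzle can be made concrete with a small sketch, assuming the MySQL 5.1 table t2 defined above; the date literal is arbitrary. Both queries ask for the same rows, but only the one that filters on the bare column is expected to be pruned. EXPLAIN PARTITIONS shows which partitions each query touches.

```sql
-- Filter on the bare column: the optimizer can prune, so the
-- "partitions" column should list only part of the table.
EXPLAIN PARTITIONS
SELECT * FROM t2 WHERE dt < '2007-06-01'\G

-- Filter through the partitioning function: no pruning is expected,
-- so all partitions (p01 through p04) should be listed.
EXPLAIN PARTITIONS
SELECT * FROM t2 WHERE TO_DAYS(dt) < TO_DAYS('2007-06-01')\G
```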

Now it's a different story. Partitioning by date has become easy and immediate.

/*With MySQL 5.5*/
CREATE TABLE t2
(
  dt DATE
)
PARTITION BY RANGE COLUMNS (dt)
(
  PARTITION p01 VALUES LESS THAN ('2007-01-01'),
  PARTITION p02 VALUES LESS THAN ('2008-01-01'),
  PARTITION p03 VALUES LESS THAN ('2009-01-01'),
  PARTITION p04 VALUES LESS THAN (MAXVALUE));

SHOW CREATE TABLE t2 \G
*************************** 1. row ***************************
       Table: t2
Create Table: CREATE TABLE `t2` (
  `dt` date DEFAULT NULL
) ENGINE=MyISAM DEFAULT CHARSET=latin1
/*!50500 PARTITION BY RANGE  COLUMNS(dt)
(PARTITION p01 VALUES LESS THAN ('2007-01-01') ENGINE = MyISAM,
 PARTITION p02 VALUES LESS THAN ('2008-01-01') ENGINE = MyISAM,
 PARTITION p03 VALUES LESS THAN ('2009-01-01') ENGINE = MyISAM,
 PARTITION p04 VALUES LESS THAN (MAXVALUE) ENGINE = MyISAM) */

Partition pruning kicks in as in the previous case. There is no confusion between defining by function and querying by column, because the definition is by column; the values we insert in the definition are preserved, making the DBA's job much easier.

Everyone's happy then? Well, almost. Let's have a look at a more obscure trait of the COLUMNS feature.

The counter-intuitive part: multiple columns

The COLUMNS keyword does more than allow string and date columns as partition definers. It also allows using multiple columns to define a partition.

You probably have seen some examples in the official docs, with something like the ones below:

CREATE TABLE p1 (
  a INT,
  b INT,
  c INT
)
PARTITION BY RANGE COLUMNS (a,b)
(
  PARTITION p01 VALUES LESS THAN (10,20),
  PARTITION p02 VALUES LESS THAN (20,30),
  PARTITION p03 VALUES LESS THAN (30,40),
  PARTITION p04 VALUES LESS THAN (40,MAXVALUE),
  PARTITION p05 VALUES LESS THAN (MAXVALUE,MAXVALUE)
);

CREATE TABLE p2 (
  a INT,
  b INT,
  c INT
)
PARTITION BY RANGE COLUMNS (a,b)
(
  PARTITION p01 VALUES LESS THAN (10,10),
  PARTITION p02 VALUES LESS THAN (10,20),
  PARTITION p03 VALUES LESS THAN (10,30),
  PARTITION p04 VALUES LESS THAN (10,MAXVALUE),
  PARTITION p05 VALUES LESS THAN (MAXVALUE,MAXVALUE)
);

There are also examples with PARTITION BY RANGE COLUMNS (a,b,c), and more. If you are the kind of reader who gets the whole idea from looking at such examples, feel free to make fun of me, because I didn't.

Having used MySQL 5.1 partitions for a long time, I failed to immediately grasp the significance of partitioning by multiple columns. What is the meaning of LESS THAN (10,10)? And what happens if the next partition is LESS THAN (10,20)? What if, instead, it is a completely different pair, like (20,30)?

All these questions need an answer, and before an answer they need a better understanding of what we are dealing with.

In the beginning, there was some confusion, even among MySQL engineers. And it has fooled me as well! It was believed that, when all the partitions have different first range values, for all practical purposes it was the same as if the table were partitioned on one column only. But this is not the case. In the following example:

CREATE TABLE p1_single (
  a INT,
  b INT,
  c INT
)
PARTITION BY RANGE COLUMNS (a)
(
  PARTITION p01 VALUES LESS THAN (10),
  PARTITION p02 VALUES LESS THAN (20),
  PARTITION p03 VALUES LESS THAN (30),
  PARTITION p04 VALUES LESS THAN (40),
  PARTITION p05 VALUES LESS THAN (MAXVALUE)
);

This is not equivalent to the table p1 above. If you insert (10, 1, 1) in p1, it will go to the first partition. In p1_single, instead, it will go to the second one.
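
One way to check this claim is sketched below. It assumes p1 and p1_single were created as above with ENGINE=MyISAM (an assumption; the listings do not name an engine), so that TABLE_ROWS in INFORMATION_SCHEMA is an exact count.

```sql
-- Insert the same row into both tables.
INSERT INTO p1        VALUES (10, 1, 1);
INSERT INTO p1_single VALUES (10, 1, 1);

-- List the partitions that now hold at least one row.
SELECT table_name, partition_name, table_rows
FROM INFORMATION_SCHEMA.partitions
WHERE table_schema = SCHEMA()
  AND table_name IN ('p1', 'p1_single')
  AND table_rows > 0;
-- Expected on otherwise empty tables: p01 for p1, p02 for p1_single.
```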

The reason is that (10,1) is LESS THAN (10,20), the upper bound of the first partition of p1. If you only focus on the first value, you will fail to realize that you are comparing a tuple, not a single value.

Now for the difficult part. What happens when you need to determine where a row will be placed? How do you evaluate an operation like (10,9) < (10,10)?

The answer is simple: the same way you evaluate two records when you are sorting them.


a=10
b=9
(a,b) < (10,10) ?
 
# evaluates to:
 
(a < 10)
OR
((a = 10) AND ( b < 10))
 
# which translates to:
 
(10 < 10)
OR((10 = 10) AND ( 9 < 10))

If you have three columns, the expression is longer, but not more complex. You first test for less than on the first item. If two or more partitions match that, then you test the second item. If after that you still have more than one candidate partition, then you test the third item.
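
MySQL's row comparison follows the same ordering, so the rule can be tried directly from the client. The sketch below uses only literals (the three-column values are arbitrary); it illustrates the comparison rule, not the partitioning code itself, and it cannot express MAXVALUE.

```sql
-- Two columns: the row comparison and its expanded form agree.
SELECT (10, 9) < (10, 10)                    AS tuple_cmp,
       (10 < 10) OR ((10 = 10) AND (9 < 10)) AS expanded;
-- Both columns return 1 (true).

-- Three columns: the third item decides only when the first two are equal.
SELECT (10, 20, 5)  < (10, 20, 30) AS third_item_smaller,  -- 1
       (10, 20, 30) < (10, 20, 30) AS all_items_equal;     -- 0
```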

The figures below will walk you through the evaluation of three records being inserted into a table with a partition definition of

(10,10),
(10,20),
(10,30),
(10, MAXVALUE)

Fig. 1. Comparing tuples. When the first value is less than the first range in the partition definition, things are easy. The row belongs here.

Fig. 2. Comparing tuples. When the first value is equal to the first range in the partition definition, then we need to compare the second item. If that one is less than the second range, then the row belongs here.

Fig. 3. Comparing tuples. Both the first and second values are equal to their corresponding ranges. The tuple is not LESS THAN the defined range, and thus it doesn't belong here. Next step.

Fig. 4. Comparing tuples. At the next range, the first item is equal, and the second item is smaller. Thus the tuple is smaller, and the row belongs here.

With the help of these figures, we now have a better understanding of the procedure to insert a record into a multi-column partitioned table. That was the theory.
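
The walk-through can also be reproduced without creating a table. The sketch below is an approximation under stated assumptions: the three sample rows are hypothetical (they are not the rows shown in the figures), the boundaries are those of table p2, and the CASE expression only mimics the server's partition selection by testing each boundary in order.

```sql
SELECT a, b,
  CASE
    WHEN (a, b) < (10, 10) THEN 'p01'
    WHEN (a, b) < (10, 20) THEN 'p02'
    WHEN (a, b) < (10, 30) THEN 'p03'
    WHEN a <= 10           THEN 'p04'  -- stands in for (10, MAXVALUE)
    ELSE 'p05'                         -- (MAXVALUE, MAXVALUE)
  END AS expected_partition
FROM (SELECT 5 AS a, 99 AS b     -- first item smaller: p01
      UNION ALL SELECT 10, 9     -- first item equal, second smaller: p01
      UNION ALL SELECT 10, 10    -- both equal to the p01 bound: moves on to p02
     ) AS sample_rows;
```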

To help you grasp the new feature better than I did at the beginning, let me offer a different example, which should make more sense to the non-theoretically oriented readers. I will use a table taken from the MySQL test employees database on Launchpad, with some modifications.


CREATE TABLE employees (
  emp_no int(11) NOT NULL,
  birth_date date NOT NULL,
  first_name varchar(14) NOT NULL,
  last_name varchar(16) NOT NULL,
  gender char(1) DEFAULT NULL,
  hire_date date NOT NULL
) ENGINE=MyISAM
PARTITION BY RANGE  COLUMNS(gender,hire_date)
(PARTITION p01 VALUES LESS THAN ('F','1990-01-01') ,
 PARTITION p02 VALUES LESS THAN ('F','2000-01-01') ,
 PARTITION p03 VALUES LESS THAN ('F',MAXVALUE) ,
 PARTITION p04 VALUES LESS THAN ('M','1990-01-01') ,
 PARTITION p05 VALUES LESS THAN ('M','2000-01-01') ,
 PARTITION p06 VALUES LESS THAN ('M',MAXVALUE) ,
 PARTITION p07 VALUES LESS THAN (MAXVALUE,MAXVALUE)
);

Unlike the above examples, which lean too much on the theoretical side, this one is understandable. The first partition will store female employees hired before 1990, the second one female employees hired between 1990 and 2000, and the third one all the remaining female employees. For partitions p04 to p06 we get the same cases, but for male employees. The last partition is a control case: if anyone ends up in this partition, there must have been a mistake somewhere.

Reading the last sentence, you may rightfully ask: how do I know in which partition the rows are stored?

There are two ways actually. The first one is to use a query with the same conditions used to define the partitions.

SELECT
CASE
  WHEN gender = 'F' AND hire_date < '1990-01-01'
  THEN 'p1'
  WHEN gender = 'F' AND hire_date < '2000-01-01'
  THEN 'p2'
  WHEN gender = 'F' AND hire_date < '2999-01-01'
  THEN 'p3'
  WHEN gender = 'M' AND hire_date < '1990-01-01'
  THEN 'p4'
  WHEN gender = 'M' AND hire_date < '2000-01-01'
  THEN 'p5'
  WHEN gender = 'M' AND hire_date < '2999-01-01'
  THEN 'p6'
ELSE
  'p7'
END as p,
COUNT(*) AS rows
FROM employees
GROUP BY p;

+------+-------+
| p    | rows  |
+------+-------+
| p1   | 66212 |
| p2   | 53832 |
| p3   |     7 |
| p4   | 98585 |
| p5   | 81382 |
| p6   |     6 |
+------+-------+

The second way is to query the INFORMATION_SCHEMA. If the table is MyISAM or ARCHIVE, you can trust the statistics it provides.

SELECT
  partition_name part,
  partition_expression expr,
  partition_description descr,
  table_rows
FROM
  INFORMATION_SCHEMA.partitions
WHERE
  TABLE_SCHEMA = schema()
  AND TABLE_NAME='employees';

+------+------------------+-------------------+------------+
| part | expr             | descr             | table_rows |
+------+------------------+-------------------+------------+
| p01  | gender,hire_date | 'F','1990-01-01'  |      66212 |
| p02  | gender,hire_date | 'F','2000-01-01'  |      53832 |
| p03  | gender,hire_date | 'F',MAXVALUE      |          7 |
| p04  | gender,hire_date | 'M','1990-01-01'  |      98585 |
| p05  | gender,hire_date | 'M','2000-01-01'  |      81382 |
| p06  | gender,hire_date | 'M',MAXVALUE      |          6 |
| p07  | gender,hire_date | MAXVALUE,MAXVALUE |          0 |
+------+------------------+-------------------+------------+

If the engine is InnoDB, then the above values are approximated, and you can't trust them if you need exact values.

One more question may still be floating in the air after all the above explanation, and it's about performance. Do these enhancements trigger partition pruning? The answer is an unequivocal yes. Unlike 5.1, where partitioning by date only works with two functions, in 5.5 every partition defined with the COLUMNS keyword will use partition pruning. Let's try:


select count(*) from employees where gender='F' and hire_date < '1990-01-01';
+----------+
| count(*) |
+----------+
|    66212 |
+----------+
1 row in set (0.05 sec)
 
explain partitions select count(*) from employees where gender='F' and hire_date < '1990-01-01'\G
*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: employees
   partitions: p01
         type: ALL
possible_keys: NULL
          key: NULL
      key_len: NULL
          ref: NULL
         rows: 300024
        Extra: Using where

Using the conditions that define the first partition, we get a very optimized query. Not only that: a partial condition also benefits from partition pruning:


select count(*) from employees where gender='F';
+----------+
| count(*) |
+----------+
|   120051 |
+----------+
1 row in set (0.12 sec)
 
explain partitions select count(*) from employees where gender='F'\G
*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: employees
   partitions: p01,p02,p03,p04
         type: ALL
possible_keys: NULL
          key: NULL
      key_len: NULL
          ref: NULL
         rows: 300024
        Extra: Using where

This is the same algorithm used for composite indexes. If your condition refers to the leftmost part of the index, MySQL will use it. Similarly, if you refer to the leftmost part of the partition definition, MySQL will prune as much as possible. As it happens with composite indexes, if you only use the rightmost condition, the partition pruning doesn't work:


select count(*) from employees where hire_date < '1990-01-01';
+----------+
| count(*) |
+----------+
|   164797 |
+----------+
1 row in set (0.18 sec)
 
explain partitions select count(*) from employees where hire_date < '1990-01-01'\G
*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: employees
   partitions: p01,p02,p03,p04,p05,p06,p07
         type: ALL
possible_keys: NULL
          key: NULL
      key_len: NULL
          ref: NULL
         rows: 300024
        Extra: Using where

Referring to the second part of the partition definition without using the first one will generate a full table scan. This is always something to keep in mind when designing partitions and writing queries.

Usability enhancements: TRUNCATE PARTITION

One of the most appealing features of partitions is the ability to remove large amounts of records almost instantly. In a scheme that has become quite popular, DBAs use partitions to rotate historical records in tables partitioned by date, dropping the partition with the oldest records at regular intervals. This method works very well. You drop the first partition (i.e. the one with the oldest records) and add a new one at the end (i.e. the one which will get the newest records).

All is well as long as you only need to trim from the bottom. But when you need to remove records from a partition in between, things are not as smooth. You can drop the partition, no problem about that. But if you just want to empty it, you face quite a painful problem. To remove all records from a partition you can:

- use the DELETE statement, thus relinquishing most of the advantages of partition trimming;

- use DROP PARTITION, followed by REORGANIZE PARTITION to re-create it, but this is often more costly than the previous choice.

Note: in 5.1, a partition dropped from the middle of the range cannot simply be added back with ADD PARTITION; new partitions can only be added after the range of the last partition.

MySQL 5.5 introduces TRUNCATE PARTITION, a statement that works like DROP PARTITION but leaves the partition in place, ready to be filled in again.

TRUNCATE PARTITION is a statement that should be in every DBA's tool chest.
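
As a minimal sketch, assuming the MySQL 5.5 table t2 partitioned by RANGE COLUMNS (dt) shown earlier, emptying the 2007 partition could look like this. The DELETE is included only for contrast with the older approach.

```sql
-- Older approach: delete the rows of one partition's range one by one.
DELETE FROM t2 WHERE dt >= '2007-01-01' AND dt < '2008-01-01';

-- MySQL 5.5: empty partition p02 but keep its definition in place.
ALTER TABLE t2 TRUNCATE PARTITION p02;

-- Several partitions can be named in one statement.
ALTER TABLE t2 TRUNCATE PARTITION p01, p02;
```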

More fine tuning: TO_SECONDS

As a bonus, the partitions enhancement package has a new function to manipulate DATE and DATETIME columns. With the TO_SECONDS function you can convert a date/time column into the number of seconds from year "0". It is useful if you want to partition on time intervals smaller than one day.

Like the rest of the enhancements, TO_SECONDS triggers the partition pruning, thus raising to three the number of date functions that you can efficiently use with partitions.

Unlike TO_DAYS, which can be reversed with FROM_DAYS, there is no such function for TO_SECONDS, but it is not that hard to create one.
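
One possible way to write such a function is sketched below. The name from_seconds is hypothetical (it is not a built-in), and the sketch relies on TO_SECONDS being the day number times 86400 plus the seconds elapsed within the day.

```sql
-- Hypothetical helper, not a built-in: converts a TO_SECONDS() value
-- back to a DATETIME by splitting it into whole days and leftover seconds.
CREATE FUNCTION from_seconds(secs BIGINT)
RETURNS DATETIME
DETERMINISTIC
RETURN FROM_DAYS(secs DIV 86400) + INTERVAL (secs MOD 86400) SECOND;

-- Round trip: the result should match the original literal.
SELECT from_seconds(TO_SECONDS('2009-11-30 08:00:00'));
```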

Armed with these new weapons, we can confidently create a table with temporal partitions shorter than one day, as follows:
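
A minimal sketch of such a table, assuming a hypothetical table name and eight-hour boundaries; neither is taken from the original listing.

```sql
-- Hypothetical table and boundaries, chosen only for illustration.
CREATE TABLE t_intraday
(
  dt DATETIME NOT NULL
)
PARTITION BY RANGE (TO_SECONDS(dt))
(
  PARTITION p01 VALUES LESS THAN (TO_SECONDS('2009-11-30 08:00:00')),
  PARTITION p02 VALUES LESS THAN (TO_SECONDS('2009-11-30 16:00:00')),
  PARTITION p03 VALUES LESS THAN (TO_SECONDS('2009-12-01 00:00:00')),
  PARTITION p04 VALUES LESS THAN (MAXVALUE)
);
```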

Since we aren't using the COLUMNS keyword (and we can't, because mixing COLUMNS and functions is not allowed), the values recorded in the table definitions are the results of the TO_SECONDS function.

But thanks to the new function, we can reverse the value and get a human readable value, as shown in this old blog post.

Summing up

MySQL 5.5 is definitely good news for partition users. While there is no direct improvement in performance (if you evaluate performance as response time), the ease of use of the enhancements and the new TRUNCATE PARTITION statement will save the DBA, and sometimes the end users, a lot of time.

These additions will still be updated in the next milestone release, and will eventually be GA in mid 2010. Time for all partition users to give it a try!

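[1] Semi-synchronous replication: one of the new features in MySQL 5.5. In the existing asynchronous replication, the master simply sends binlog events without regard to the state of the slaves. With semi-synchronous replication, after a transaction commits the master waits until a slave acknowledges that it has received the transaction's events, and only then completes the transaction and proceeds to the next one.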

[2] Pruning on partitioned table : It is easy to see that none of the rows which ought to be returned will be in either of the partitions p0 or p3; that is, we need to search only in partitions p1 and p2 to find matching rows. By doing so, it is possible to expend much less time and effort in finding matching rows than would be required to scan all partitions in the table. This “cutting away” of unneeded partitions is known as pruning. When the optimizer can make use of partition pruning in performing a query, execution of the query can be an order of magnitude faster than the same query against a nonpartitioned table containing the same column definitions and data.
