usally we can use the dbc.TABLESIZE table to see perm size of each AMP on a table to find that if there is any significant data skew exist
SELECT
VPROC, SUM(CURRENTPERM)
FROM
DBC.TABLESIZE
WHERE
DATABASENAME = 'CVDTST'
AND TABLENAME = 'K_CALENDER'
GROUP BY
VPROC
Vproc Sum(CurrentPerm)
0 5,632.00
1 5,632.00
2 4,608.00
3 4,608.00
4 5,120.00
5 5,632.00
.
.
.
.
81 5,120.00
82 6,144.00
83 5,632.00
84 5,120.00
85 5,120.00
86 5,120.00
87 5,120.00
More accurate method is, count table's row count on each amp but not perm size.
it is more accurate, but much more costly
SELECT
HASHAMP (HASHBUCKET(HASHROW ( BOT_CD ,KAISYA_CD ,REKI_YMD))) AS "AMP NO", COUNT(*)
FROM
CVDTST.K_CALENDER
GROUP BY
1
AMP NO Count(*)
0 57
1 60
2 45
3 43
4 51
5 55
.
.
.
.
.
80 50
81 49
82 62
83 56
84 48
85 52
86 54
87 52
一般来说,我们可以通过查看DBC.Tablesize中某个表在各个AMP上占用的PERM空间来判定是否有显著的SKEW。
更精确的方法是利用PI字段依次计算出row hash,bucket,amp,通过统计每个amp上分配的行数来精确判定SKEW。
这个方法更精确,当然也需要更大的代价
No comments:
Post a Comment