2010/06/15

Built-in Functions in Teradata

http://readvitamin.com/2007/08/09/how-to-use-built-in-functions-in-teradata/


Written by admin on Thursday, August 9th, 2007 under IT  

First of all, what are built-in functions? They are functions that take no arguments yet return information about the system: user, date, time, session, and so on. Built-in functions are sometimes referred to as special registers and are mostly used with the SELECT statement. Usage:

SELECT

• ACCOUNT: - If you are wondering about your Teradata account information, use this function to see your current account string.

• CURRENT_DATE: - Returns the current system date; same as the DATE function.

• CURRENT_TIME: - Returns the current system time and the current session's Time Zone displacement. It is similar to the TIME function, with the extra Time Zone displacement.

• CURRENT_TIMESTAMP: - Returns the current system timestamp (including year, month and day) and the current session's Time Zone displacement.

• DATABASE: - If you are wondering how to find your current database, this function returns the name of the default database for the current user.

• DATE: - As mentioned earlier, returns the current date.

• SESSION: - Returns a number identifying the session the current user is in. No two sessions have the same number, so this is the function to use to identify your session.

• TIME: - As mentioned, returns the current time on a 24-hour clock; for 4:00 pm you would see 16:00:00.

• USER: - If you have forgotten which user you logged in as, this function comes to the rescue: it returns the current user name.
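All of the registers above can be combined in one statement; a quick sanity check (output varies by system and session):

```sql
SELECT USER, DATABASE, SESSION, ACCOUNT,
       CURRENT_DATE, CURRENT_TIME, CURRENT_TIMESTAMP;
```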

Import/Export Data by SQL Assistant

Export
SQL Assistant -> Export Data -> output to a file
Import
SQL Assistant -> Import Data
insert into dwhtst.TMSL_CUST_MAJ  values (?,?,?,'Y',NULL,NULL,NULL)
Run the statement and choose the input source file.

Quite slow, though.

2010/06/14

SSRS: setting a URL dynamically

It turns out whether a URL is present at all can also be controlled:


=IIF(Fields!qty_act.Value Is Nothing or Fields!customer.Value = "-" or Fields!customer.Value = "" or UCASE(Fields!customer.Value) = "OTHERS",
Nothing,
"MBI_TVSIRPTRS017")

2010/06/12

Notes from Teradata performance-tuning and physical-design training, part 2

Original post:
http://blog.chinaunix.net/u3/103331/showart_2050758.html

Summary 2
1. Parcels
   Client to server: Request Parcel (1), Data Parcel (0 or N), Response Parcel.
   Parcel processing steps: REQUEST PARCEL -> SYNTAXER -> RESOLVER -> SECURITY -> OPTIMIZER -> GENERATOR -> GNCApply -> AMP STEPS
   
   The benefit of Data Parcels is that the PE can cache the execution plan and reuse it, improving efficiency.
   Practical advice: for repetitive work, such as inserting a large batch of rows, put the data into a file and submit it with USING in BTEQ; this can improve efficiency considerably.
   How do you check whether an executed SQL statement was cached?
   Answer: via the DBQLogTbl table, whose CacheFlag column shows whether the statement hit the cache when it ran. (The PE's parse cache uses an LRU algorithm with a four-hour maximum lifetime, i.e. after four hours the statement is unmarked.)
   The RESOLVER step checks the SQL statement against the system dictionary tables; does it lock the system tables?
   Answer: no. The RESOLVER reads the dictionary information from the Data Dictionary Cache, so the system tables are not locked.
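The CacheFlag check described above can be sketched as a query against DBQLogTbl (assumes query logging is enabled and you have SELECT rights on DBC; the date filter is illustrative):

```sql
SELECT QueryID, CacheFlag, QueryText
FROM   DBC.DBQLogTbl
WHERE  CAST(StartTime AS CHAR(10)) = '2010-06-12'
ORDER  BY StartTime;
```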
2. Join strategies
   Teradata's design philosophy is shared-nothing: the AMPs are independent and no AMP knows what data the others hold, so joins and comparisons must be done locally on each AMP; hence all the redistribution and duplicate-to-all-AMPs activity.
   Product join: duplicate the smaller table to all AMPs (no sort).
   Merge join: duplicate the smaller table, or redistribute the table that is not joined on its PI; sort on the row hash of the join column.
   Because a merge join sorts on the join column's row hash, it reduces the number of comparisons (unlike a product join, which must compare every pair).
   
   If a join is abnormally slow, look for these patterns in the EXPLAIN output:
   "which is redistributed by hash code to all AMPs": a skewed table was redistributed.
   "which is duplicated on all AMPs": a large table was duplicated.
   This comes down to statistics. If statistics are stale, say you create a table, collect statistics while it is still empty, and then load it without recollecting, the optimizer still believes the table is small and may duplicate it to every AMP, which is a huge amount of work.
   
   For an OR condition in the join, the query can be rewritten as a UNION of two SQL statements instead of letting the optimizer perform a product join.
    
SELECT  Department_Name, Last_Name, First_Name 
FROM     Employee INNER JOIN Department 
ON           Employee.Department_Number = Department.Department_Number 
OR           Employee.Employee_Number = Department.Manager_Employee_Number; 
becomes
SELECT  Department_Name, Last_Name, First_Name
FROM     Employee INNER JOIN Department
ON          Employee.Department_Number = Department.Department_Number
UNION
SELECT  Department_Name, Last_Name, First_Name
FROM     Employee INNER JOIN Department
ON          Employee.Employee_Number = Department.Manager_Employee_Number;

   Nested join (the simplest join): one table in the join is accessed through a fully qualifying WHERE condition (UPI or USI), so only a single row can qualify.
3. PPI
  Benefits of a PPI: stronger support for particular queries, e.g. queries by date (with a PPI on the date), avoiding a full-table scan.
  Possible problems with a PPI: two extra bytes per row (the partition number), and PI access may be slower than on a table without a PPI (the partition must be located before the row hash is probed within it).
  If the partitioning column is not part of the PI, the PI cannot be defined as a UPI; to guarantee uniqueness, define a USI on the columns instead.
  Question: why can't the PI be a UPI when the partitioning column is not part of it?
  Answer: a UPI must guarantee uniqueness, but if the partitioning column is not part of the PI, a PI access would have to scan every partition to locate the row, so it has already lost the point of being a UPI.
  
  In DBC.IndexConstraints, a ConstraintType of 'Q' marks a PPI.
  When a PPI is appropriate:
  1. Large tables.
  2. The partitioning column is frequently used for access.
  3. The number of partitions is limited (65,535).
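A minimal sketch of the situation discussed above (table and column names invented): a date-partitioned table whose partitioning column is not part of the PI, with a USI restoring the uniqueness guarantee:

```sql
CREATE TABLE sales_ppi
( order_id   INTEGER NOT NULL
, order_date DATE    NOT NULL
, amount     DECIMAL(12,2)
)
PRIMARY INDEX (order_id)          -- NUPI: cannot be unique, order_date is not part of it
PARTITION BY RANGE_N(order_date BETWEEN DATE '2000-01-01' AND DATE '2020-12-31'
                     EACH INTERVAL '1' DAY)
UNIQUE INDEX (order_id);          -- USI enforces the uniqueness the PI cannot
```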
4. Aggregation
  The ARSA algorithm:
  Aggregate - locally on the AMPs
  Redistribute - the local aggregations
  Sort - the redistributed aggregations
  Aggregate - globally for the final aggregation
  GROUP BY vs. DISTINCT
  1. When to use which:
     GROUP BY aggregates on the local AMP first, then aggregates globally.
     DISTINCT redistributes the data to all AMPs first, then removes duplicates.
     So when there are many duplicate values, GROUP BY gives somewhat better performance; in other cases DISTINCT is a little faster. (Not absolute.)
  2. On duplicate elimination:
     GROUP BY performs no separate eliminate step, while DISTINCT does.
     (IN, NOT IN, UNIQUE, and DISTINCT all remove duplicates.)
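The two forms compared above, side by side on a hypothetical orders table:

```sql
SELECT DISTINCT cust_id FROM orders;            -- redistribute first, then de-duplicate
SELECT cust_id FROM orders GROUP BY cust_id;    -- aggregate locally on each AMP first
```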
5. Dates
  Teradata stores dates as integers: 2009-09-09 is stored as the number 1090909 (20090909 - 19000000, i.e. (year - 1900) * 10000 + month * 100 + day).
  Dates before 1900-01-01 are stored as negative values.
  
        SELECT EXTRACT(YEAR FROM DATE '2004-03-31') ||
        EXTRACT(MONTH FROM DATE '2004-03-31')
        result:       2004          3
        Each integer is converted to an 11-character string before the || concatenation.
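The encoding can be confirmed by casting a DATE to INTEGER, which exposes Teradata's internal value:

```sql
SELECT CAST(DATE '2009-09-09' AS INTEGER);   -- 1090909 = (2009-1900)*10000 + 9*100 + 9
```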
 

Notes from Teradata performance-tuning and physical-design training, part 1

Original post:
1. Distinguishing the index types:
Join Index: a pre-joined table, similar to a materialized view; it occupies table space and is maintained automatically.
Global Index: a join index whose select list includes the ROWID.
Sparse Index: a join index defined with a WHERE condition.
Aggregate Join Index: performs the aggregation inside the index. (Difference from a summary table: a summary table must be refreshed manually, while an aggregate join index is maintained automatically.)
2. Stored procedures vs. macros:
   Both macros and stored procedures can take parameters and run SQL statements. A macro is generally simpler; a stored procedure additionally supports SPL control flow such as IF...THEN...ELSE.
3. TDWM: Teradata Dynamic Workload Manager, for mixed-workload management; it dynamically manages resource allocation and priorities. (Currently being trialled.)
4. Views
  Benefits of views:
  1. SQL reuse: a complex SQL statement can be saved and run directly.
  2. Confidentiality: columns can be masked inside the view's SQL.
  3. Access control and locking: most importantly, a view restricts access to the base table, and an ACCESS lock can be applied when the base table is read.
  Views built on top of other views are extremely inefficient:
  Try CREATE TABLE tmp AS (SELECT * FROM t03_agreement); you will find that tmp's primary index is no longer t03_agreement's primary index. Nested views behave similarly: when a nested view is accessed, the inner view's index is no longer used, which can cause severe skew and slow down data retrieval. (From recent work, nested views are in fact still used in a few places in production.)
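One way to sidestep the CTAS pitfall just described is to name the primary index explicitly (the PI column here is assumed for illustration):

```sql
CREATE TABLE tmp AS (SELECT * FROM t03_agreement) WITH DATA
PRIMARY INDEX (agreement_id);   -- hypothetical PI column; without this clause Teradata picks a default
```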

5. Transactions and locks
   Transaction-processing systems are characterized by the ACID properties: Atomic, Consistent, Isolated, and Durable.
   Teradata has four lock levels: Access, Read, Write, and Exclusive. DDL statements generally take an Exclusive lock, INSERT and UPDATE a Write lock, and SELECT a Read lock. Locks of the same level can coexist, except for Exclusive. An Access lock is compatible with everything except an Exclusive lock, at the cost of possible dirty reads; views are typically designed with this trade-off in mind. (Analysis: a data warehouse can tolerate dirty reads, since it mostly runs large read queries and rarely updates outside of batch windows, unlike an OLTP system with strict consistency requirements.)
   Teradata is an MPP architecture in which the AMPs are independent, so the same request could be processed in different orders with no inherent synchronization. Teradata handles this with a pseudo table shared by all AMPs: to lock an object, the pseudo table is locked first rather than locking on every AMP individually. (This is visible in the EXPLAIN output.)
   BT/ET and multi-statement requests lock differently:
   1. With BT/ET, locks are acquired statement by statement and accumulate until the ET releases them.
   2. A multi-statement request acquires all of its locks at once.
Teradata's deadlock handling:
   The database checks for deadlocks every four minutes; when one is found, the most recent session is rolled back.
   
Pseudo-table question: does every access need the pseudo table?
   Answer:
      Compare EXPLAIN SELECT * FROM table WHERE pi_column = ? with EXPLAIN SELECT * FROM table WHERE non_pi_column = ?.
      The PI-based query does not lock the pseudo table. The pseudo table exists for the MPP architecture; a single-AMP operation has nothing to synchronize, so no pseudo-table lock is needed.
      In other words, single-AMP operations skip the pseudo table.
To submit a multi-statement request, place the ';' at the start of each subsequent statement and submit them together, e.g.:
                  Update Dept 
                        set Salary_Change_Date = Date 
                 ; Update Manager 
                        set Salary_Amt = Salary_Amt *1.08 
                 ; Update Employee 
                        set Salary_Amt = Salary_Amt * 1.04;
    These statements are submitted in one session, but the database does not necessarily process them in order; the optimizer may even run them in parallel. Do not rely on a multi-statement request to force sequential execution.
6. The DBQLogTbl table
   DBQLogTbl is a DBC system table with FALLBACK protection, so it takes double the storage. Once query logging is enabled, it grows quickly.
   DBQLogTbl contains a lot of useful information:
   the CacheFlag column shows whether a statement hit the cache; CPU time and elapsed time let you separate wait time from actual execution time; and the logged SQL statements themselves can be used for tuning.
   A workable strategy for DBQLogTbl:
     1. Enable query logging.
     2. Purge DBQLogTbl daily, moving the data into a table without FALLBACK (dropping columns you do not need).
     3. Analyze the exported data.
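Steps 2 and 3 might be sketched like this (history table name invented; an actual purge should follow the DBQL maintenance procedure for your release):

```sql
INSERT INTO histdb.dbql_hist            -- a no-FALLBACK copy with only the needed columns
SELECT QueryID, UserName, StartTime, TotalCPUTime, CacheFlag
FROM   DBC.DBQLogTbl
WHERE  CAST(StartTime AS CHAR(10)) < '2010-06-12';

DELETE FROM DBC.DBQLogTbl
WHERE  CAST(StartTime AS CHAR(10)) < '2010-06-12';
```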

Hash Collision 和 Skew

1. Count the rows per hash value
SELECT                 HASHROW (CHARACTER_COLUMN) 
                        AS "Hash Value"
                        ,COUNT(*)
FROM                DB.TABLE
GROUP BY        1
ORDER BY        2 DESC;

2. Count the rows per AMP
SELECT                 HASHAMP (HASHBUCKET
                        (HASHROW (CHARACTER_COLUMN))) 
                        AS "AMP #"
                        ,COUNT(*)
FROM                DB.TABLE
GROUP BY        1
ORDER BY        2 DESC; 
3. Find badly skewed tables

SELECT 
 DatabaseName, TableName, SUM(CurrentPerm) AS TableSize,
 MAX(CurrentPerm) * (HASHAMP()+1) AS VirtualSize,   -- HASHAMP()+1 = number of AMPs
 100 - AVG(CurrentPerm)/MAX(CurrentPerm) * 100 AS TableSkew,
 MAX(CurrentPerm) * (HASHAMP()+1) - SUM(CurrentPerm) AS SkewSavings
 FROM DBC.TableSize
 WHERE CurrentPerm > 0
 GROUP BY 1,2
 HAVING (SkewSavings > 1000000000 AND TableSkew > 10)
 OR TableSkew > 50;

2010/06/11

TUNING TIPS

BETWEEN is more efficient than IN.
IN is more efficient than EXISTS.
If a LIKE pattern does not start with a wildcard, the optimizer may be able to use an index; otherwise a full-table scan is performed.
DISTINCT is resource-intensive (though how does it compare with GROUP BY?).
Use UNION ALL instead of UNION where possible.
Computing totals for different ranges with separate SQL statements scans the same table several times; merging them into one statement and separating the ranges with CASE cuts the number of scans, which can greatly improve performance.
DISTINCT creates a temporary table; restructure the SQL to avoid DISTINCT where you can.
Use derived tables rather than temporary tables to reduce disk I/O.
Avoid mixed-type expressions. For example, when a string is compared with a number, Teradata converts the string to a number; this disables index use and forces a full-table scan. Conclusion: cast explicitly when comparing mixed types.
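The CASE technique above, sketched on a hypothetical sales table: three range totals in a single scan instead of three separate queries:

```sql
SELECT SUM(CASE WHEN amt <  100              THEN amt ELSE 0 END) AS small_total
     , SUM(CASE WHEN amt BETWEEN 100 AND 999 THEN amt ELSE 0 END) AS mid_total
     , SUM(CASE WHEN amt >= 1000             THEN amt ELSE 0 END) AS large_total
FROM   sales;
```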
SI
JI
GLOBAL JI
PPI

TBL01.A LIKE TBL02.B

SELECT DISTINCT
id_int_org_pty,
nm_ext_org_grp_pty
FROM
VMSL_CUST_MAJ
) TBL01
ON
TBL02.id_int_org_pty = TBL01.id_int_org_pty
WHERE
INDEX(TBL02.nm_ext_org_eng,TBL01.nm_ext_org_grp_pty) > 0

or


SELECT
AB.A
FROM
AB
LEFT OUTER JOIN
BB
ON
AB.A LIKE '%' || BB.B || '%'

EXTRACT

EXTRACT
(
YEAR
MONTH
DAY
HOUR
MINUTE
SECOND
TIMEZONE_HOUR
TIMEZONE_MINUTE
FROM value)

ANSI SQL-2003

SSRS一处BUG

Setting a cell's background color based on the contents of another cell in the same row: the report behaves correctly when run locally, but does not work after deployment to the server.

http://www.sqldev.org/sql-server-reporting-services/background-color-differences-between-visual-studio-amp-reporting-services-39081.shtml


SSRS条件语句

switch

=Switch(Fields!Orders.Value < 100,"Red", Fields!Orders.Value > 100, "Green")

iif
=IIF(Fields!LineTotal.Value > 100, True, False)

Expression Examples (Reporting Services)
http://msdn.microsoft.com/en-us/library/ms157328.aspx

2010/06/09

JOIN INDEX

A JOIN B
When tables A and B have different PIs, A or B must be redistributed for the join. Consider creating a single-table join index on A with the same PI as B, avoiding the redistribution of a large amount of data.
Alternatively, use an intermediate table or a VOLATILE table with the same PI as B.
In short: avoid redistributing data wherever possible.
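As a sketch (names invented), a single-table join index that re-hashes A on B's PI column so the join becomes AMP-local:

```sql
CREATE JOIN INDEX ji_a_by_cust AS
SELECT cust_id, order_amt      -- only the columns the join actually needs
FROM   A
PRIMARY INDEX (cust_id);       -- same PI as B, so no redistribution at join time
```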

Types of Tables in Teradata

1 SET
2 MULTI SET
3 GLOBAL TEMPORARY TABLES
4 TEMPORARY TABLES
5 VOLATILE TEMPORARY TABLES

If a SET table has a UPI, the UPI itself guarantees there are no duplicate rows (no separate duplicate-row check is needed).
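A minimal illustration of the SET/MULTISET difference (hypothetical tables):

```sql
CREATE SET TABLE t_set (c1 INTEGER) PRIMARY INDEX (c1);
CREATE MULTISET TABLE t_multi (c1 INTEGER) PRIMARY INDEX (c1);

INSERT INTO t_set VALUES (1);
INSERT INTO t_set VALUES (1);    -- duplicate-row error (INSERT...SELECT would discard it silently)
INSERT INTO t_multi VALUES (1);
INSERT INTO t_multi VALUES (1);  -- both rows are kept
```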

Column definition:
ename CHAR(10) CHARACTER SET LATIN NOT CASESPECIFIC
select * from scott.emp where ename(CASESPECIFIC) = 'FAN'
select * from scott.emp_case where ename(not CASESPECIFIC) = 'FAN';
select * from scott.emp_case where upper(ename) = 'FAN';

CREATE GLOBAL TEMPORARY MULTISET TABLE scott.gt_deptsal
(deptno SMALLINT
,avgsal DEC(9,2)
,maxsal DEC(9,2)
,minsal DEC(9,2)
,sumsal DEC(9,2)
,empcnt SMALLINT)
PRIMARY INDEX (deptno)
ON COMMIT PRESERVE ROWS;

Query the data dictionary: select * from dbc.tables where tablename = 'gt_deptsal'
Materialize the table: insert into scott.gt_deptsal values (22,33,33,22,22,22);
Check the materialized table: after materialization, DBC.TempTables holds one row for it; the row is removed once the materialized table is dropped:
select * from DBC.TempTables;

From a CSDN blog; please credit the source when reposting: http://blog.csdn.net/fanxiaoliang83/archive/2009/07/30/4395457.aspx

IS-NUMERIC FUNCTIONS IN TERADATA

SELECT * FROM TABLE WHERE TRIM(FIELD) NOT BETWEEN '0' AND '9999999999999999'
OR
(UPPER(TRIM(cd_itm)) (CASESPECIFIC) = LOWER(TRIM(cd_itm)) (CASESPECIFIC))

http://forums.teradata.com/forum/analytics/is-numeric-functions

AVG OVER

SELECT
nm_bu,
id_cal_mth ,
AVG(qty) OVER (PARTITION BY nm_bu ORDER BY id_cal_mth ROWS 2 PRECEDING)
FROM
(
SELECT
nm_bu,
id_cal_mth,
SUM(qty_itm) AS qty
FROM
vugd_sal_act_mth
WHERE
nm_bu = 'TV'
GROUP BY
nm_bu,id_cal_mth
) AS TBL
ORDER BY
id_cal_mth

CONNECT FIELD BY RECURSIVE TABLE

CREATE MULTISET VOLATILE TABLE VTBL_BASE,
NO FALLBACK ,
CHECKSUM = DEFAULT,
NO LOG
AS(
SELECT
id,
name,
ROW_NUMBER() OVER(PARTITION BY id ORDER BY name) AS row_no
FROM
DWHDEV.TBL_TEST_RECURSIVE
) WITH DATA
PRIMARY INDEX
(id)
ON COMMIT PRESERVE ROWS
;

WITH RECURSIVE temp_table (id, name, row_no, depth) AS
(
SELECT
id,
name,
row_no,
depth
FROM
(SELECT DISTINCT
id,
CAST('' AS VARCHAR(1000)) AS name,
0 AS row_no,
0 AS depth
FROM
VTBL_BASE
)tbl

UNION ALL

SELECT
temp_table.id,
temp_table.name || ' ' || base_table.name,
temp_table.row_no + 1 AS row_no,
temp_table.depth + 1 AS newdepth
FROM
temp_table
INNER JOIN
VTBL_BASE AS base_table
ON
temp_table.id = base_table.id
AND
temp_table.row_no + 1 = base_table.row_no
WHERE
newdepth < 10
)

SELECT
id,
name
FROM
temp_table
WHERE
(id, row_no)
IN
(SELECT
id,
MAX(row_no)
FROM
temp_table
GROUP BY
id
)
ORDER BY
id
;

DROP TABLE VTBL_BASE
;

RECURSIVE TABLE

WITH RECURSIVE temp_table (id, name, row_no, depth) AS
(
SELECT
id,
name,
row_no,
depth
FROM
(
SELECT DISTINCT
id,
CAST('' AS VARCHAR(2000)) AS name,
0 AS row_no,
0 AS depth
FROM
TBL_FOR_RECURSIVE_TEST
)tbl
UNION ALL
SELECT
temp_table.id,
temp_table.name || ', ' || base_table.name,
temp_table.row_no + 1 AS row_no,
temp_table.depth + 1 AS newdepth
FROM
temp_table
INNER JOIN
TBL_FOR_RECURSIVE_TEST AS base_table
ON
temp_table.id = base_table.id
AND
temp_table.row_no + 1 = base_table.row_no
WHERE
newdepth < 10
)

SELECT
*
FROM
temp_table
WHERE
(id, row_no) IN (SELECT id,MAX(row_no) FROM temp_table GROUP BY id)
ORDER BY
id

CASE

CASE WHEN NM_INT_ORG_PTY = 'TOTAL'
THEN SUM(T_TRAN.QTY_ACT) OVER (PARTITION BY M_MST.ID_CAL)
ELSE T_TRAN.QTY_ACT
END AS QTY_ACT

CREATE TABLE FROM TABLE

CREATE TABLE TBL_TST AS (SELECT DISTINCT ID_OGO_FRM_INT_ORG_PTY FROM DWHDEV.WK_QDDGD11010) WITH DATA

REPLACE VIEW

REPLACE VIEW View_name AS

LOCKING TABLE Table_name FOR ACCESS

SELECT
field1
,field2
FROM
Table_name

HAVING

SELECT Sum(qty),itm_id FROM table_name GROUP BY itm_id HAVING SUM(qty) >= 6000000;

QUALIFY

SELECT
id_ver
,nm_ver
,dttm_ver_upd
FROM
VUGD_VER_INFO
GROUP BY
id_ver
,nm_ver
,dttm_ver_upd
QUALIFY row_number() over(order by dttm_ver_upd DESC) = 1

COALESCE

COALESCE(Field1,Field2)

ADD_MONTHS

SELECT ADD_MONTHS(CAST(id_cal_mth AS DATE FORMAT 'YYYYMM'),-5)(FORMAT 'YYYYMM')(VARCHAR(6))

Resnode

exec Resnode ('2009-11-19' ,'2009-11-19' ,'17:55:00','17:57:00')

AMP SKEW

SELECT
HASHAMP (HASHBUCKET(HASHROW (
dt_dat_cre ,cd_prod_div ,cd_dpt ,cd_main_acct ,cd_dtl_acct ,cd_sub_acct ,cd_bu ,id_itm ,cd_loc ,cd_crcy
))) AS "AMP NO"
,COUNT(*)
FROM dwhtst.TAPD_BLNC_DAY
GROUP BY 1
ORDER BY 2 DESC;

DBQLogTbl

2009-09-15

SQL① Distribution of issued SQL statement counts, by user

sel
cast(StartTime as char(10)) AS TheDate,
substr( cast( StartTime as char(19) ), 12,3)||
(CASE WHEN substr( cast(StartTime as char(16)), 15,2) between '00' and '29' THEN '00' ELSE '30' END)||':00' AS TheTime,
CASE GROUPING(UserName) WHEN 1 THEN '--ALLUSER--' ELSE Trim(UserName) END AS UserName,
count(distinct QueryText) AS QueryCount
from DBC.DBQLogTbl
WHERE cast(StartTime as char(10)) = '2009-09-15'
GROUP BY GROUPING SETS ((TheDate,TheTime,UserName),(TheDate,TheTime))
order by 1,2,3
;


SQL② Distribution of issued SQL statement counts, all users
* Paste the result into table ② on the "10. SQL statement count distribution" sheet.
sel
cast(StartTime as char(10)) AS TheDate,
substr( cast( StartTime as char(19) ), 12,3)||
(CASE WHEN substr( cast(StartTime as char(16)), 15,2) between '00' and '29' THEN '00' ELSE '30' END)||':00' AS TheTime,
'--ALLUSER--' AS UserName,
count(distinct QueryText) AS QueryCount
from DBC.DBQLogTbl
WHERE cast(StartTime as char(10)) = '2009-09-15'
GROUP BY 1,2,3
order by 1,2,3
;


SQL③ Long-running queries
Threshold: > 1200 seconds
* Paste the result into the "11. Long-running queries" sheet.
sel
A.UserName,
A.ExeYMD,
A.StartTime,
A.EndTime,
A.ExeTime,
A.SessionID,
A.QueryID,
T3.C_JOB_ID,
( substr( cast(T3.J_END as char(19)), 12, 2)*3600 + substr(cast( T3.J_END as char(19)), 15, 2)*60 + substr( cast(T3.J_END as char(19)), 18, 2) )
-
( substr( cast(T3.J_ST as char(19)), 12, 2)*3600 + substr(cast( T3.J_ST as char(19)), 15, 2)*60 + substr( cast( T3.J_ST as char(19)), 18, 2) )
AS SHELL_ExeTime
from
(
SELECT
Trim(T1.UserName) UserName,
substr( cast( T1.StartTime as char(19) ), 1,10) AS ExeYMD,
substr( cast( T1.StartTime as char(19) ), 12,8) AS StartTime,
substr( cast( T1.LastRespTime as char(19) ), 12,8) AS EndTime,
(
substr( cast(T1.LastRespTime as char(19)), 12, 2)*3600 + substr(cast( T1.LastRespTime as char(19)), 15, 2)*60 + substr( cast(T1.LastRespTime as char(19)), 18, 2)
)
-
(
substr( cast(T1.StartTime as char(19)), 12, 2)*3600 + substr(cast( T1.StartTime as char(19)), 15, 2)*60 + substr( cast( T1.StartTime as char(19)), 18, 2)
) AS ExeTime,
T1.SessionID,
T1.QueryID,
Cast(T1.TotalCPUTime AS INTEGER) AS TotalCPUTime
FROM DBC.DBQLogTbl T1
LEFT OUTER JOIN DBC.DBQLSqlTbl T2
ON (T1.QueryID=T2.QueryID)
WHERE CAST(T1.StartTime AS CHAR(10))='2009-09-15' AND
ExeTime> 1200
GROUP BY 1,2,3,4,5,6,7,8
) A
LEFT OUTER JOIN DWHRUN.BCH_PGM_LOG T3
ON (
A.ExeYMD = substr( cast( T3.J_ST as char(19) ), 1, 10)
AND
A.StartTime
BETWEEN substr( cast( T3.J_ST as char(19) ), 12, 9) AND substr( cast( T3.J_END as char(19) ), 12, 9)
AND
( substr( cast(T3.J_END as char(19)), 12, 2)*3600 + substr(cast( T3.J_END as char(19)), 15, 2)*60 + substr( cast(T3.J_END as char(19)), 18, 2) )
-
( substr( cast(T3.J_ST as char(19)), 12, 2)*3600 + substr(cast( T3.J_ST as char(19)), 15, 2)*60 + substr( cast( T3.J_ST as char(19)), 18, 2) ) > 900
)
where SHELL_ExeTime-ExeTime between 0 and 100
group by UserName,ExeYMD,StartTime,EndTime,ExeTime,SessionID,QueryID,TotalCPUTime,C_JOB_ID,SHELL_ExeTime
ORDER BY StartTime, SHELL_ExeTime desc
;


Data Skew

SELECT
trim(DatabaseName) AS DatabaseName,
trim(TableName) AS TableName,
max(CurrentPerm)*1.00/min(CurrentPerm) AS Vproc_Skew,
sum(CurrentPerm) AS AllVproc_Perm,
max(CurrentPerm) AS MaxPerm_PerVproc,
min(CurrentPerm) AS MinPerm_PerVproc
/*
sum(CASE WHEN Vproc=00 THEN CurrentPerm ELSE 0 END) AS Vproc00_CurrentPerm,
sum(CASE WHEN Vproc=01 THEN CurrentPerm ELSE 0 END) AS Vproc01_CurrentPerm,
sum(CASE WHEN Vproc=02 THEN CurrentPerm ELSE 0 END) AS Vproc02_CurrentPerm,
sum(CASE WHEN Vproc=03 THEN CurrentPerm ELSE 0 END) AS Vproc03_CurrentPerm,
sum(CASE WHEN Vproc=04 THEN CurrentPerm ELSE 0 END) AS Vproc04_CurrentPerm,
sum(CASE WHEN Vproc=05 THEN CurrentPerm ELSE 0 END) AS Vproc05_CurrentPerm,
sum(CASE WHEN Vproc=06 THEN CurrentPerm ELSE 0 END) AS Vproc06_CurrentPerm,
sum(CASE WHEN Vproc=07 THEN CurrentPerm ELSE 0 END) AS Vproc07_CurrentPerm,
sum(CASE WHEN Vproc=08 THEN CurrentPerm ELSE 0 END) AS Vproc08_CurrentPerm,
sum(CASE WHEN Vproc=09 THEN CurrentPerm ELSE 0 END) AS Vproc09_CurrentPerm,
sum(CASE WHEN Vproc=10 THEN CurrentPerm ELSE 0 END) AS Vproc10_CurrentPerm,
sum(CASE WHEN Vproc=11 THEN CurrentPerm ELSE 0 END) AS Vproc11_CurrentPerm,
sum(CASE WHEN Vproc=12 THEN CurrentPerm ELSE 0 END) AS Vproc12_CurrentPerm,
sum(CASE WHEN Vproc=13 THEN CurrentPerm ELSE 0 END) AS Vproc13_CurrentPerm,
sum(CASE WHEN Vproc=14 THEN CurrentPerm ELSE 0 END) AS Vproc14_CurrentPerm,
sum(CASE WHEN Vproc=15 THEN CurrentPerm ELSE 0 END) AS Vproc15_CurrentPerm,
sum(CASE WHEN Vproc=16 THEN CurrentPerm ELSE 0 END) AS Vproc16_CurrentPerm,
sum(CASE WHEN Vproc=17 THEN CurrentPerm ELSE 0 END) AS Vproc17_CurrentPerm,
sum(CASE WHEN Vproc=18 THEN CurrentPerm ELSE 0 END) AS Vproc18_CurrentPerm,
sum(CASE WHEN Vproc=19 THEN CurrentPerm ELSE 0 END) AS Vproc19_CurrentPerm
*/
FROM DBC.TableSize
WHERE DatabaseName in ('etlrun', 'dwhrun')
Group by DatabaseName, TableName
HAVING AllVproc_Perm > 100000000 AND Vproc_Skew >= 1.25
Order by DatabaseName, TableName, Vproc_Skew
;


DBC.DiskSpace

sel
databasename,
sum(maxperm) AS MAXPERM,
sum(currentperm) AS CURRENTPERM,
sum(currentperm)*1.0/nullifzero(sum(maxperm)) AS USAGE
from
DBC.DiskSpace
group by 1
order by 2 desc

Res Usage

SELECT
TheDate (FORMAT'yy/mm/dd', TITLE 'Date'),
TheTime (FORMAT'99:99:99', TITLE 'Time'),
/* Percent of time the CPUs were busy doing work */
SUM(CPUBusy) / SUM(Secs) (FORMAT 'ZZ9', TITLE 'CPU使用率(%)'),
/* Parallel efficiency of total pct of time CPUs were busy */
(AVG(CPUBusy)*100)/NULLIFZERO(MAX(CPUBusy)) (FORMAT 'ZZ9', TITLE 'CPUの並列効率(%)'),
/* Percent of time spent waiting for I/O */
SUM(CPUWaitIO)/SUM(Secs) (FORMAT 'ZZ9', TITLE 'I/O待ち(%)'),
/* Pct logical device I/Os done for paging */
SUM(PageOrSwapIO)*100/NULLIFZERO(SUM(LogicalDeviceIO)) (FORMAT 'ZZ9', TITLE 'I/Oに占めるページング・スワップの割合(%)'),
/* Percent of memory that is unused */
SUM(MemFreeKB)*100/NULLIFZERO(SUM(MemSize(FLOAT))*1024) (FORMAT 'ZZ9', TITLE '空きメモリ(%)'),
/* Percent of memory allocation attempts that failed */
SUM(MemFails)*100/NULLIFZERO(SUM(MemFails+MemTextAllocs+MemVprAllocs)) (FORMAT 'ZZ9', TITLE 'メモリ割り当て失敗(%)')
FROM DBC.ResGeneralInfoView
WHERE TheDate BETWEEN '2009-11-13' AND '2009-11-13'
/* and SUBSTR( cast(TheTime AS VARCHAR(8) ), 4, 2 ) in ('00', '15', '30', '45') */
/* and SUBSTR( cast(TheTime AS VARCHAR(8) ), 5, 1 ) in ('0', '5') */
GROUP BY TheDate, TheTime
ORDER BY TheDate, TheTime ;

ResGeneralInfoView


Target date: 2009-09-15
SQL for building the usage report


SELECT
TheDate (FORMAT'yy/mm/dd', TITLE 'Date'),
TheTime (FORMAT'99:99:99', TITLE 'Time'),
/* Percent of time the CPUs were busy doing work */
SUM(CPUBusy) / SUM(Secs) (FORMAT 'ZZ9', TITLE 'CPU使用率(%)'),
/* Parallel efficiency of total pct of time CPUs were busy */
(AVG(CPUBusy)*100)/NULLIFZERO(MAX(CPUBusy)) (FORMAT 'ZZ9', TITLE 'CPUの並列効率(%)'),
/* Percent of time spent waiting for I/O */
SUM(CPUWaitIO)/SUM(Secs) (FORMAT 'ZZ9', TITLE 'I/O待ち(%)'),
/* Pct logical device I/Os done for paging */
SUM(PageOrSwapIO)*100/NULLIFZERO(SUM(LogicalDeviceIO)) (FORMAT 'ZZ9', TITLE 'I/Oに占めるページング・スワップの割合(%)'),
/* Percent of memory that is unused */
SUM(MemFreeKB)*100/NULLIFZERO(SUM(MemSize(FLOAT))*1024) (FORMAT 'ZZ9', TITLE '空きメモリ(%)'),
/* Percent of memory allocation attempts that failed */
SUM(MemFails)*100/NULLIFZERO(SUM(MemFails+MemTextAllocs+MemVprAllocs)) (FORMAT 'ZZ9', TITLE 'メモリ割り当て失敗(%)')
FROM
DBC.ResGeneralInfoView
WHERE
TheDate BETWEEN '2009-09-15' AND '2009-09-15'
/* and SUBSTR( cast(TheTime AS VARCHAR(8) ), 4, 2 ) in ('00', '15', '30', '45') */
/* and SUBSTR( cast(TheTime AS VARCHAR(8) ), 5, 1 ) in ('0', '5') */
GROUP BY
TheDate, TheTime
ORDER BY
TheDate, TheTime ;
* To display the columns under their physical names, run the SQL below.
SELECT
TheDate (FORMAT'yy/mm/dd', TITLE 'Date'),
TheTime (FORMAT'99:99:99', TITLE 'Time'),
/* Percent of time the CPUs were busy doing work */
SUM(CPUBusy) / SUM(Secs) (FORMAT 'ZZ9', TITLE 'CPU Bsy %'),
/* Parallel efficiency of total pct of time CPUs were busy */
(AVG(CPUBusy)*100)/NULLIFZERO(MAX(CPUBusy)) (FORMAT 'ZZ9', TITLE 'CPU Eff %'),
/* Percent of time spent waiting for I/O */
SUM(CPUWaitIO)/SUM(Secs) (FORMAT 'ZZ9', TITLE 'WIO %'),
/* Pct logical device I/Os done for paging */
SUM(PageOrSwapIO)*100/NULLIFZERO(SUM(LogicalDeviceIO)) (FORMAT 'ZZ9', TITLE 'P+S % of IOs'),
/* Percent of memory that is unused */
SUM(MemFreeKB)*100/NULLIFZERO(SUM(MemSize(FLOAT))*1024) (FORMAT 'ZZ9', TITLE 'Fre Mem %'),
/* Percent of memory allocation attempts that failed */
SUM(MemFails)*100/NULLIFZERO(SUM(MemFails+MemTextAllocs+MemVprAllocs)) (FORMAT 'ZZ9', TITLE 'Mem Fai %')
FROM
DBC.ResGeneralInfoView
WHERE
TheDate BETWEEN '2009-09-15' AND '2009-09-15'
/* and SUBSTR( cast(TheTime AS VARCHAR(8) ), 4, 2 ) in ('00', '15', '30', '45') */
/* and SUBSTR( cast(TheTime AS VARCHAR(8) ), 5, 1 ) in ('0', '5') */
GROUP BY
TheDate, TheTime
ORDER BY
TheDate, TheTime ;
Reference: logical column name (physical column name): what it measures
CPU busy (%) (CPU Bsy %): the share of time the CPUs spend doing work.
I/O wait (%) (WIO %): the CPUs sit idle while waiting for I/O.
CPU parallel efficiency (CPU Eff %): whether CPU load is balanced across the nodes.
Paging/swap share of I/O (%) (P+S % of IOs): paging and swapping overhead degrades system performance.
Free memory (%) (Fre Mem %): low free memory means resource exhaustion.
Memory allocation failures (%) (Mem Fai %): failed allocations must be retried, causing overhead.

VOLATILE TABLE


CREATE MULTISET VOLATILE TABLE VTBL_01,
NO FALLBACK ,
CHECKSUM = DEFAULT,
NO LOG
AS(
SELECT * FROM ${_table_name2} WHERE nm_tran = 'GDN_C_ITEM_BALANCE'
) WITH DATA
PRIMARY INDEX
(dt_dat_cre ,cd_prod_div )
PARTITION BY RANGE_N(dt_dat_cre BETWEEN DATE '1950-01-01' AND DATE '2100-12-31' EACH INTERVAL '1' DAY )
ON COMMIT PRESERVE ROWS;

STATISTICS

Collects statistics on columns, group of columns or indexes.

collect statistics on table_1 index (field_1, field_2)
collect statistics on table_name column field_1;
collect statistics on table_name index(field_1 ,field_2 ,field_3)

help statistics table_name

Create/Drop SI

CREATE INDEX SI1(COL1_NM) ON TABLE_NM
DROP INDEX SI_NM ON TABLE_NM

---------------------------------
Hi Farid,
you'll find all the details in the manuals:
Database Design
Chapter 11: Secondary Indexes
Unique Secondary Indexes
Nonunique Secondary Indexes
Chapter 15: Database-Level Capacity Planning Considerations
Sizing a Unique Secondary Index Subtable
Sizing a Nonunique Secondary Index Subtable

There's no DDL for an index subtable, but it's similar to base tables with the following definition:
USI:
create unique index USI(i int) on tab;
create table USI(
internal_overhead byte(7),
i int,
baseROWID byte(8) -- or byte(10) for a partitioned table
) unique primary index(i);

NUSI:
create index NUSI(i int) on tab;
create table NUSI(
internal_overhead byte(7),
i int,
baseROWID array of byte(8) -- or byte(10) for a partitioned table
) [non-hashed AMP-local primary] index(i);
That kind of NUSI-PI can't be created on a base table, that's why there's no easy way to calculate the exact size of a NUSI.
Dieter