2014/08/08

about hash index

http://forums.teradata.com/forum/database/hash-index

http://forums.teradata.com/forum/general/difference-between-single-table-join-index-nusi-and-hash-index

http://teradatardbms.blogspot.jp/2011/03/hash-index-examples.html

17 Nov 2006

A hash index is simply a restricted form of join index. It was introduced mainly to satisfy some rules in the TPC-H benchmark specification that did not allow regular join indexes. The syntax for creating a hash index is simpler than for creating an equivalent join index. As far as I know, that is the only reason one might prefer to use a hash index instead of a join index. The underlying technology is the same. There should be no difference in performance between a hash index and an equivalent join index.
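
To make the syntax comparison in the quoted post concrete, here is a rough sketch of the two forms side by side. The table orders and its columns (o_orderkey as primary index, o_custkey, o_totalprice) and both index names are hypothetical and do not come from the forum threads; treat the join index as only roughly equivalent, not as an exact match for what the hash index stores.

```sql
-- Hypothetical base table: orders(o_orderkey, o_custkey, o_totalprice),
-- with o_orderkey as its primary index.

-- Hash index: list the columns, pick the distribution (BY) and ordering.
CREATE HASH INDEX ord_hidx (o_custkey, o_totalprice) ON orders
BY (o_custkey)
ORDER BY HASH (o_custkey);

-- Roughly equivalent single-table join index: the base table key and
-- ROWID have to be spelled out in the SELECT list.
CREATE JOIN INDEX ord_jidx AS
SELECT o_custkey, o_totalprice, o_orderkey, ROWID
FROM orders
PRIMARY INDEX (o_custkey);
```

You would normally create only one of the two on a given table; they are shown together here just for comparison.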

Teradata Diagnostic Commands

DIAGNOSTIC HELPSTATS ON FOR SESSION;
DIAGNOSTIC COSTPRINT ON FOR SESSION;
DIAGNOSTIC OLDREWRITES ON FOR SESSION;
DIAGNOSTIC DUMP COSTS ON FOR SESSION;
DIAGNOSTIC HELP COSTS ON FOR SESSION;
DIAGNOSTIC SET COSTS ON FOR SESSION;
DIAGNOSTIC HELP PROFILE ON FOR SESSION;
DIAGNOSTIC SET PROFILE ON FOR SESSION;
DIAGNOSTIC DUMP SAMPLES ON FOR SESSION;
DIAGNOSTIC HELP SAMPLES ON FOR SESSION;
DIAGNOSTIC SET SAMPLES ON FOR SESSION;
DIAGNOSTIC DUMPCACHE ON FOR SESSION;
DIAGNOSTIC JOINPLAN ON FOR SESSION;
DIAGNOSTIC NOJIND ON FOR SESSION;
DIAGNOSTIC PSTEPS ON FOR SESSION;
DIAGNOSTIC REQUEST ON FOR SESSION;
DIAGNOSTIC SPOIL ON FOR SESSION;
DIAGNOSTIC STEPS ON FOR SESSION;
DIAGNOSTIC USESTATS ON FOR SESSION;
DIAGNOSTIC VALIDATEINDEX ON FOR SESSION;
DIAGNOSTIC WHITE ON FOR SESSION;
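
As a small usage sketch for one of the commands above: HELPSTATS is switched on for the session, an EXPLAIN is run, and the diagnostic is switched off again. While it is on, the EXPLAIN output ends with a list of recommended statistics. The query reuses a table and column from the count(*) notes in this file purely as an example.

```sql
-- Turn on statistics recommendations for this session.
DIAGNOSTIC HELPSTATS ON FOR SESSION;

-- The EXPLAIN text now ends with suggested COLLECT STATISTICS statements.
EXPLAIN
SELECT COUNT(*)
FROM SV_RY_CAMP_KIN
WHERE HIZUKE >= CAST(DATE '2014-05-01' AS TIMESTAMP);

-- Turn the diagnostic off again.
DIAGNOSTIC HELPSTATS NOT ON FOR SESSION;
```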

2014/06/04

count(*) performance improved since Teradata 14

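Since Teradata 14, count(*) without a WHERE condition is computed by a cylinder index scan instead of an all-rows scan. With a WHERE condition (the second EXPLAIN below), it still uses an all-rows scan.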

Explain sel count(*)  from SV_RY_CAMP_KIN
  1) First, we lock a distinct DEV_TRVL_RAW."pseudo table" for read on
     a RowHash to prevent global deadlock for
     DEV_TRVL_RAW.SV_RY_CAMP_KIN.
  2) Next, we lock DEV_TRVL_RAW.SV_RY_CAMP_KIN for read.
  3) We do an all-AMPs SUM step to aggregate from
     DEV_TRVL_RAW.SV_RY_CAMP_KIN by way of a cylinder index scan with
     no residual conditions.  Aggregate Intermediate Results are
     computed globally, then placed in Spool 3.  The size of Spool 3 is
     estimated with high confidence to be 1 row (23 bytes).  The
     estimated time for this step is 2.83 seconds.
  4) We do an all-AMPs RETRIEVE step from Spool 3 (Last Use) by way of
     an all-rows scan into Spool 1 (group_amps), which is built locally
     on the AMPs.  The size of Spool 1 is estimated with high
     confidence to be 1 row (25 bytes).  The estimated time for this
     step is 0.01 seconds.
  5) Finally, we send out an END TRANSACTION step to all AMPs involved
     in processing the request.
  -> The contents of Spool 1 are sent back to the user as the result of
     statement 1.  The total estimated time is 2.83 seconds.

Explain sel count(*)  from SV_RY_CAMP_KIN   where HIZUKE >= cast(date'2014-05-01'   as timestamp) ;
  1) First, we lock a distinct DEV_TRVL_RAW."pseudo table" for read on
     a RowHash to prevent global deadlock for
     DEV_TRVL_RAW.SV_RY_CAMP_KIN.
  2) Next, we lock DEV_TRVL_RAW.SV_RY_CAMP_KIN for read.
  3) We do an all-AMPs SUM step to aggregate from
     DEV_TRVL_RAW.SV_RY_CAMP_KIN by way of an all-rows scan with a
     condition of ("DEV_TRVL_RAW.SV_RY_CAMP_KIN.HIZUKE >= TIMESTAMP
     '2014-05-01 00:00:00.000000'").  Aggregate Intermediate Results
     are computed globally, then placed in Spool 3.  The size of Spool
     3 is estimated with high confidence to be 1 row (23 bytes).  The
     estimated time for this step is 16.23 seconds.
  4) We do an all-AMPs RETRIEVE step from Spool 3 (Last Use) by way of
     an all-rows scan into Spool 1 (group_amps), which is built locally
     on the AMPs.  The size of Spool 1 is estimated with high
     confidence to be 1 row (25 bytes).  The estimated time for this
     step is 0.01 seconds.
  5) Finally, we send out an END TRANSACTION step to all AMPs involved
     in processing the request.
  -> The contents of Spool 1 are sent back to the user as the result of
     statement 1.  The total estimated time is 16.24 seconds.

In older versions, the same count(*) without a WHERE condition was done by an all-rows scan:

Explain sel count(*)  from SV_RY_CAMP_KIN
  1) First, we lock a distinct DWHRUN."pseudo table" for read on a
     RowHash to prevent global deadlock for DWHRUN.SV_RY_CAMP_KIN.
  2) Next, we lock DWHRUN.SV_RY_CAMP_KIN for read.
  3) We do an all-AMPs SUM step to aggregate from DWHRUN.SV_RY_CAMP_KIN
     by way of an all-rows scan with no residual conditions.  Aggregate
     Intermediate Results are computed globally, then placed in Spool 3.
     The input table will not be cached in memory, but it is eligible
     for synchronized scanning.  The size of Spool 3 is estimated with
     high confidence to be 1 row (23 bytes).  The estimated time for
     this step is 3 minutes and 57 seconds.
  4) We do an all-AMPs RETRIEVE step from Spool 3 (Last Use) by way of
     an all-rows scan into Spool 1 (group_amps), which is built locally
     on the AMPs.  The size of Spool 1 is estimated with high
     confidence to be 1 row (25 bytes).  The estimated time for this
     step is 0.00 seconds.
  5) Finally, we send out an END TRANSACTION step to all AMPs involved
     in processing the request.
  -> The contents of Spool 1 are sent back to the user as the result of
     statement 1.  The total estimated time is 3 minutes and 57 seconds.

2014/05/19

table name limit changed since Teradata 14.10

Since Teradata 14.10, the table name limit was raised from 30 bytes to 128 bytes.
That may be good news, but DO NOT FORGET that when you reference the DBC.Tables view,
the TableName column still shows only the first 30 bytes of the table name. You should use DBC.TablesV to see the "REAL" 128-byte table name.

So when you reference DBC.Tables and find two or more tables with the same name in the same database, do not be confused. Use HELP DATABASE or DBC.TablesV and you will find that the names were cut off at 30 bytes and only look like "the same name".
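
A minimal sketch of how to spot such collisions from DBC.TablesV. The database name is just the one from the EXPLAIN examples in this file, and SUBSTR cuts at 30 characters, which matches the 30-byte cut only for single-byte names, so treat the result as an approximation.

```sql
-- Find full-length names in one database that collapse to the same
-- name once only the first 30 characters are kept.
SELECT DatabaseName,
       SUBSTR(TableName, 1, 30) AS TruncatedName,
       COUNT(*)                 AS NameCount
FROM DBC.TablesV
WHERE DatabaseName = 'DEV_TRVL_RAW'
GROUP BY 1, 2
HAVING COUNT(*) > 1;
```

Any row returned is a group of objects that would show up under one name in DBC.Tables.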