Teradata Scenario - interview question

A scenario like ,there is a BTEQ where it has delete statement followed by an insert and update.Suppose the delete statement has been executed successfully and there occurs an error during insert. So in this scenario it should exit the BTEQ and the table should get the previously loaded value which was present before Delete.(handling rollback in Bteqs)

Answer

The Delete and the Insert statements should be written within the BT and ET statemants and rollback should be given before end transaction. if insert fails then it should go to the rollback step. This works only in teradata mode and not in ANSI (please confirm the ans for this question before consolidation)


What do you mean by skewness?

Skewness refers to the variation in the number of rows among the various AMPs. The higher the variation (some have high number of rows and some have very much less number of rows) then it is said to have high skewness. This affects the effective parallel execution in teradata. The way to get around is by assigning proper indexes to bring about even distribution of rows across the AMPs.

Skewfactor:
It tells the distribution of the rows.
-->If uniformly distributed(normal distribution) skew
factor is 0.


There is aformula to identify the skewness.
Please find the below.
SELECT TableName,SUM(CurrentPerm) AS
CurrentPerm,SUM(PeakPerm) AS PeakPerm,(100 -
(AVG(CurrentPerm)/MAX(CurrentPerm)*100)) AS SkewFactor
FROM Dbc.TableSize
WHERE DataBaseName = 'SALES'
GROUP BY 1 ORDER BY 1

What is the process to restart the multiload if it fails?

MULTILOAD will creates 2 error tables, 1 work table When MULTILOAD fails We have to unlock the Main Table, here error tables and work tables are not locked like FLOAD.

To Unlock the Main Table in case of acquisation Phase :
Mload RELEASE MLOAD ;
To release lock in application phase failure :
RELEASE MLOAD .IN APPLY;

How would you execute the BTEQ?

Consider a BTEQ script contains the following code in it.

    .logon tpid/user, pass; /* Login credentials */
    SEL  * FROM tab_a; /* Query */
    .quit
You can submit the bteq script in batch mode by .RUN FILE command.

 .RUN FILE = BTEQfile.btq

At unix prompt,Please execute in the below way.

$Sh>BTEQ<[Script Path]>[Logfile Path]
or
$Sh>BTEQ<[Script Path]| TEE [Logfile Path]

Teradata Realtime Question - Query

The question was how to grade the students according to the total marks.

sel ,stu_id,
CASE WHEN marks>=400 THEN 'GRADE_A'
WHEN marks>300 AND marks < 400 THEN 'GRADE_B' WHEN marks>250 AND marks < 300 THEN 'GRADE_C' ELSE 'FAIL' END

Teradata Realtime Scenario Question

There is a scenario like the load should start after a particular table fetches a value else it should Bteq session should exit

there is a table A(flag_table) and main table B

No flag

table_a 1

Only when the Flag value is 1 in this table A, the main table B load should start else it should quit.


Answer

SELECT flag FROM table_a WHERE flag =1;

.IF ACTIVITYCOUNT = 0 THEN .GOTO .EXIT

so if the record is not found then activitycount will be zero and it ll exit the BTEQ

Query to find the highest 3 salaries in a department

SELECT emp_no ,RANK() OVER (ORDER BY salary DESC)"rank" FROM emp_table QUALIFY rank < 3( if there is only one dept) SELECT emp_no,RANK() OVER(PARTITION BY dept_id ORDER BY salary DESC) as hig_sal FROM emp_table QUALIFY high_sal < 3(mul depts)

Write a Query to find the last date of last(previous) month.

SELECT CURRENT_DATE-EXTRACT(DAY FROM CURRENT_DATE);

How long ROW ID is?

64 Bits...
AMP receives 32 bit hash value and row data from PE. AMP then append a 32 bit unique value to row HASH to make 64 bit ROW ID.
ROW ID is used to sort rows in AMP.

Teradata performance tuning and optimization steps?

1. collecting statistics
2. Explain Statements
3. Avoid Product Joins when possible
4. select appropriate primary index to avoid skewness in storage
5. Avoid Redistribution when possible
6. Use sub-selects instead of big "IN" lists
7. Use derived tables
8. Use GROUP BY instead of DISTINCT ( GROUP BY sorts the data locally on the VPROC. DISTINCT sorts the data after it is redistributed)
9. Use Compression on large tables.

What is TENACITY? What is its default value?

TENACITY specifies the amount of time in hours, to retry to obtain a loader slot or to establish all requested sessions to logon.
The default for Fast Load is “no tenacity”, meaning that it will not retry at all.
If several FastLoad jobs are executed at the same time, we recommend setting the TENACITY to 4, meaning that the system will continue trying to logon for the number of sessions requested for up to four hours.

what is FALLBACK in teradata?

FALLBACK is a table protection feature used in case an AMP fails. You can use FALLBACK on all tables.
FALLBACK tables use twice as much disk space as NON-FALLBACK rows

PERM space is a pre allocated space? True/ False?

False. PERM space is not pre allocated space. its a maximum allowed space
to user or a database.

what is the use of TEMP space?

TEMPORARY (TEMP) space :
A database may or may not have TEMP space, however, it is required if Global Temporary Tables are used.

what is FILLER command in Teradata?

while running Fastload or Multiload if you dont want to load a particular field from the datafile to the target
table then use the FILLER command to achieve.

what is the differnce between Sub Query and Corelated Sub Query?

Sub Query:
If Queries written in a nested manner then its termed as a sub query.
Inner query executed First and executed Only one time.
Corelated Sub Query:
Co-Related Sub query get executed once for each row of the parent query.

If collect stats is not done on the table what will happen?

Teradata uses a cost based optimizer and cost estimates are done based on statistics.
So if you donot have statistics collected then optimizer will use a Dynamic AMP Sampling method to get the stats.
If your table is big and data was unevenly distributed then dynamic sampling may not get right information and yourperformance will suffer.

Define Explain plan?

Explain plan displays the execution plan of SQL statement that is going to be executed by the database.
This plan will be specified by the component called optimiser.Generaly it displays below information:
a.Number of Amps
b.Amount of spool memory it is occupying.
c.Number of Rows its affecting.
d.Type of Join strategy it is taking.
e.Time it takes to execute.
f.Locks it is Using etc.

Syntax : EXPLAIN
Example : EXPLAIN SEL * FROM PARTY;

What is the use of Spool Space?

Teradata spool space is unused Perm space that is used for running queries.Teradata recommend 20 % of the available perm space
is allocated for spool space.This is used to hold intermittent results of the queries and volatile tables.

How many AMPs involved in case of Primary Index?

How many AMPs involved in case of Primary Index?

Its always one AMP.