Here are the reasons:
n When you are using DISTINCT Clause, data would be spooled first and then duplicate rows would be eliminated. This is not a very efficient way unless you have less number of rows present in individual AMPs.
n In case of GROUP BY, the elimination happens as a first step which makes a lot of difference in case of a large dataset.
n As you know, in a sub query, rows returned are automatically DISTICT so to make this sub query run faster make it GROUP BY.
n Using DISTINCT rows would be sorted automatically but not with the GROUP BY. I guess this might be another reason DISTINCT is less efficient.
To test the performance, create a temp table mentioned in the previous articles and insert some 50-100 rows and run the following queries and check the elapsed time.
SELECT emp_id, COUNT(*)
FROM emp GROUP BY 1 ORDER BY 1;
SELECT COUNT(distinct emp_id)
FROM emp
Our Random Articles
- Somdev Devvarman ousts Xavier Malisse 6-1, 3-6,7-6(5)
- How To Choose A Good Online Traffic School
- New Year Wishes 2010
- New Year Wishes
- How to use EXTRACT function with date and time columns
More Links




4 Comments
Material that you are providing is very easy to learn. But it is much better than this , if you ll give the full material of this at ONE LINK.
Thank you
Suresh.
That is right Suresh. This is the reason I have given all the links in one page, please visit this
http://readvitamin.com/teradata/
Hope this helps.
Hi. Can anyone tell me how to extract time from timestamp?
For Example
timestamp=’2008-03-28 14:11:04′
i want time ’14:11:04′ from timestamp through query.
Thanks in advance.
Hi Vaideesh,
If you need only time, you can probably try the following;
SELECT time;
To answer your question you should do something like this.
SELECT CAST(VaideeshTime as time(0)) from timestamp;
VaideeshTime
——–
14:11:04
Hope that helps.