Here are the reasons:
n When you are using DISTINCT Clause, data would be spooled first and then duplicate rows would be eliminated. This is not a very efficient way unless you have less number of rows present in individual AMPs.
n In case of GROUP BY, the elimination happens as a first step which makes a lot of difference in case of a large dataset.
n As you know, in a sub query, rows returned are automatically DISTICT so to make this sub query run faster make it GROUP BY.
n Using DISTINCT rows would be sorted automatically but not with the GROUP BY. I guess this might be another reason DISTINCT is less efficient.
To test the performance, create a temp table mentioned in the previous articles and insert some 50-100 rows and run the following queries and check the elapsed time.
SELECT emp_id, COUNT(*)
FROM emp GROUP BY 1 ORDER BY 1;
SELECT COUNT(distinct emp_id)
FROM emp
Popularity: 14% [?]
Our Random Articles
- How To Choose A Good Online Traffic School
- New Year Wishes 2010
- New Year Wishes
- How to use EXTRACT function with date and time columns
- How to Post on a Separate/New Page in WordPress
More Links




No Comment
Popular Articles