Categories
Development

Finding duplicate rows in a SQL table and printing out their ids

Today I was trying to find duplicate rows in db table rows, but besides finding them, I needed to get their indexes in order to address them. Given The table structure is simple: id – index, unique name – varchar To find all the duplicate names you just hit the following: select name, count(*) as duplicate_number […]

Today I was trying to find duplicate rows in db table rows, but besides finding them, I needed to get their indexes in order to address them.

Given
The table structure is simple:
id – index, unique
name – varchar
To find all the duplicate names you just hit the following:

select name, count(*) as duplicate_number
from table
group by name
having duplicate_number > 1

So, adding id into the above SELECT statement will only return one of the several duplicate rows’ indexes.

Concat ids
So, let us rather concatenate ids using GROUP_CONCAT:

SELECT name, count(*) as duplicates_number, GROUP_CONCAT(DISTINCT id SEPARATOR ',') as ids
FROM table
GROUP BY name
HAVING duplicate_number > 1

 *DISTINCT inside the function is redundant here, but it exposes the function’s ability.
Result:

+--------------+------------------+-----------+
| name         | duplicatesNumber | ids       |
+--------------+------------------+-----------+
| 1breFNXNlhE7 |                2 | 1245,1246 | 
| 1xwIndSMW6FJ |                2 | 1265,1266 |
| 896rAED3uIyv |                2 | 1239,1240 |
| 9D4R59oppCbG |                2 | 1189,1190 |
| aCbXKA8qm6fl |                2 | 1178,1179 |
| amF6Xm0qqPh9 |                2 | 1201,1202 |
| avOjvJvCq2Uo |                2 | 1148,1149 |
| bXtDoD25VE0J |                2 | 1418,1419 |
| dMbqoMoTsv8W |                2 | 1229,1230 |
| DTMSjVkoqlpM |                2 | 1416,1417 |
+--------------+------------------+-----------+

So, name is one, while the ids where there are duplications are concatenated, returned as a string.

One reply on “Finding duplicate rows in a SQL table and printing out their ids”

In SELECT HAVING you use duplicate_number but in COUNT you use duplicateNumber. Just fix it and works very well. Congratulations.

Leave a Reply

Your email address will not be published.

This site uses Akismet to reduce spam. Learn how your comment data is processed.