Today I was trying to find duplicate rows in db table rows, but besides finding them, I needed to get their indexes in order to address them.
Given
The table structure is simple:
id – index, unique
name – varchar
To find all the duplicate names you just hit the following:
select name, count(*) as duplicate_number from table group by name having duplicate_number > 1
So, adding id into the above SELECT statement will only return one of the several duplicate rows’ indexes.
Concat ids
So, let us rather concatenate ids using GROUP_CONCAT:
SELECT name, count(*) as duplicates_number, GROUP_CONCAT(DISTINCT id SEPARATOR ',') as ids FROM table GROUP BY name HAVING duplicate_number > 1
*DISTINCT inside the function is redundant here, but it exposes the function’s ability.
Result:
+--------------+------------------+-----------+ | name | duplicatesNumber | ids | +--------------+------------------+-----------+ | 1breFNXNlhE7 | 2 | 1245,1246 | | 1xwIndSMW6FJ | 2 | 1265,1266 | | 896rAED3uIyv | 2 | 1239,1240 | | 9D4R59oppCbG | 2 | 1189,1190 | | aCbXKA8qm6fl | 2 | 1178,1179 | | amF6Xm0qqPh9 | 2 | 1201,1202 | | avOjvJvCq2Uo | 2 | 1148,1149 | | bXtDoD25VE0J | 2 | 1418,1419 | | dMbqoMoTsv8W | 2 | 1229,1230 | | DTMSjVkoqlpM | 2 | 1416,1417 | +--------------+------------------+-----------+
So, name is one, while the ids where there are duplications are concatenated, returned as a string.
One reply on “Finding duplicate rows in a SQL table and printing out their ids”
In SELECT HAVING you use duplicate_number but in COUNT you use duplicateNumber. Just fix it and works very well. Congratulations.