I am getting duplicate terms in my term table (in some cases with the same taxonomy). It only happens under certain circumstances, and I was wondering whether this is a bug (and if so, how to fix it).
To recreate this: go to post -> categories and create a new category whose name does not already exist in the term table, leaving the slug blank (I'll use 'apples' as an example). You should now have a new category (and a new term in the term table) called 'apples' with the slug 'apples' (if that slug was not already taken). Now try to do the exact same thing again with 'apples'. You should get the message "A term with the name provided already exists with this parent", which makes sense. Now create a new category called 'Apples', making sure the 'A' is upper case, again with no slug. The term/category should be created as 'Apples' with a slug of 'apples-2'. So far so good. Now try to create the category 'apples' again (same case as the first time around). You should get the same message as earlier: "A term with the name provided already exists with this parent."
So far we have the term 'apples' with a slug of 'apples' and a term 'Apples' with a slug of 'apples-2' and post -> categories will not allow us to create a new category/term 'apples' because it already exists in the term table.
Now the bug. Try creating a new category 'Apples' (respecting case) and leave the slug blank. It creates a new category/term with the name 'Apples' and slug 'apples-3'. At this point you can keep creating new categories/terms called 'Apples', which keeps filling up the term table with new terms called 'Apples', each pointing at a new row in the term_taxonomy table whose taxonomy is 'category'.
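To make the sequence above easier to reason about, here is a toy model in Python that produces the same results. It is not WordPress code. It only assumes that the existence check fetches the first row matching the name case-insensitively and then compares that row's name case-sensitively. All function names are made up for illustration.

```python
# Toy model only: NOT WordPress code. All names here are hypothetical.
terms = []  # rows of (name, slug), in insertion order


def ci_lookup(name):
    """Assumed behaviour: case-insensitive match, first matching row wins."""
    for row in terms:
        if row[0].lower() == name.lower():
            return row
    return None


def unique_slug(name):
    """Lower-case the name and append -2, -3, ... until the slug is free."""
    base = name.lower()
    taken = {slug for _, slug in terms}
    if base not in taken:
        return base
    n = 2
    while f"{base}-{n}" in taken:
        n += 1
    return f"{base}-{n}"


def insert_term(name):
    """Return the new slug, or None when the term is rejected as a duplicate."""
    row = ci_lookup(name)
    # Assumed mismatch: the row was found case-insensitively,
    # but the duplicate test below is case-sensitive.
    if row is not None and row[0] == name:
        return None
    slug = unique_slug(name)
    terms.append((name, slug))
    return slug


results = [insert_term(n) for n in ["apples", "apples", "Apples", "apples", "Apples"]]
print(results)
```

In this model the lookup for 'Apples' always returns the older 'apples' row first, so the case-sensitive test never sees the existing 'Apples' row and the insert goes through every time.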
Is this a bug, and if so, is there a way to fix it? I have been poking around in the code and the WP tables and noticed that in the term table the names are stored with a case-insensitive collation (utf8_general_ci). I've also noticed that, in the functions term_exists and wp_insert_term in wp-includes/taxonomy.php, some term name comparisons are done in PHP and others in MySQL. The potential problem is that the PHP comparisons will be case sensitive, whereas the MySQL comparisons won't be with the collation used (unless a cast such as BINARY is applied).
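The difference between the two kinds of comparison can be seen in a small self-contained sketch. It uses Python's built-in sqlite3 module, with SQLite's NOCASE collation standing in for utf8_general_ci and COLLATE BINARY standing in for a MySQL BINARY cast. That substitution is an assumption made for the demo; the table here is a throwaway in-memory one, not the real WP term table.

```python
import sqlite3

# In-memory stand-in table; NOCASE plays the role of a case-insensitive collation.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE terms (term_id INTEGER PRIMARY KEY, name TEXT COLLATE NOCASE)"
)
conn.executemany("INSERT INTO terms (name) VALUES (?)", [("apples",), ("Apples",)])

# Comparison done by the database with the column's collation: case-insensitive.
ci_rows = conn.execute(
    "SELECT name FROM terms WHERE name = ? ORDER BY term_id", ("Apples",)
).fetchall()

# Same query with an explicit binary collation: case-sensitive.
cs_rows = conn.execute(
    "SELECT name FROM terms WHERE name = ? COLLATE BINARY ORDER BY term_id",
    ("Apples",),
).fetchall()

# Comparison done in application code: case-sensitive.
app_equal = ci_rows[0][0] == "Apples"

print(ci_rows)    # both rows match
print(cs_rows)    # only the exact-case row matches
print(app_equal)  # the first case-insensitive hit is 'apples', so this is False
```

If code takes the first row of the case-insensitive result and then compares it in the application language, the two checks can disagree, which is the situation described above.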
Also, here is a link to the culprit that was causing the issues which led me here (terms joined to term_taxonomy on term_id): http://i.imgur.com/bFZkbFq.jpg and here is the change that "fixed" it (I also removed the extra terms with term_id 662 and 664): http://i.imgur.com/myLBZ7O.png