JavaScript – Unduplicating a List of Names

This is such an unbelievably good tip, I couldn’t wait to share it with you. It has to do with removing duplicates from a list. Tuck this one away for later; you WILL need it, it’s just a matter of WHEN.

Let’s talk about associative arrays (hashes, in Perl parlance), because this is essential to understanding the Power Tip for this column. JavaScript, as you may recall, can do associative arrays, i.e., arrays of a type where the index into the array is a string rather than a number:

var MovieStars = new Object;
MovieStars['Robert Downey Jr.'] = 'drug offender';

In this example, we use the string ‘Robert Downey Jr.’ as the index into what amounts to an array (although in JavaScript, it’s just a generic Object). The value at that index is ‘drug offender’. You could (alternatively) assign a numeric value to MovieStars['Robert Downey Jr.'], or, space permitting, you could assign a very long string containing the star’s entire rap sheet for drug arrests and parole violations. (JavaScript strings can hold a great deal of text, so space is unlikely to be the limiting factor.)
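To make the idea concrete, here is a small sketch of a generic Object used as an associative array. The second entry (‘Example Actor’) and the variable names status, known, and unknown are invented for illustration only.

```javascript
// A generic Object used as an associative array (string index -> value).
var movieStars = new Object();
movieStars['Robert Downey Jr.'] = 'drug offender';
movieStars['Example Actor'] = 42;   // hypothetical entry; values can be numbers too

// Look a value up by its string index.
var status = movieStars['Robert Downey Jr.'];

// Check whether an index has been set directly on this object.
var known = Object.prototype.hasOwnProperty.call(movieStars, 'Example Actor');
var unknown = Object.prototype.hasOwnProperty.call(movieStars, 'Nobody');
```

Here status holds the string stored at that index, known is true, and unknown is false because nothing was ever stored under ‘Nobody’.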

Now comes the tip I want to share with you.

If you’ve ever done much work revolving around mailing list maintenance (or any kind of database maintenance), you know what a pain it can be to unduplicate (remove duplicate entries from) a long list. You typically start by sorting the list, which by itself can take a long time depending on the size of the list and the inefficiency of the sort algorithm; then you go through and whack out adjacent identical entries.
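For comparison, the sort-then-scan approach just described might look roughly like this. It is only a sketch: the function name undupeBySorting and the sample names are made up for illustration, and it relies on JavaScript’s default string sort.

```javascript
// Sort a copy of the list, then keep an entry only if it differs
// from the entry immediately before it.
function undupeBySorting(list) {
  var sorted = list.slice().sort();   // slice() copies, so the caller's array is untouched
  var result = [];
  for (var i = 0; i < sorted.length; i++) {
    if (i === 0 || sorted[i] !== sorted[i - 1]) {
      result.push(sorted[i]);
    }
  }
  return result;
}

// Hypothetical sample data.
var sample = ['Carol', 'Alice', 'Bob', 'Alice', 'Carol'];
var sortedUniques = undupeBySorting(sample);
```

Note that the result comes back in sorted order, not in the order the names originally appeared.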

Well, there’s a super-easy way to unduplicate lists in JavaScript (and a corresponding technique in Perl), relying on associative array properties. Suppose you have a long list that needs dupes removed, stored as an array called Names. Here’s how to undupe it:

var unduped = new Object;
for (var i = 0; i < Names.length; i++) {
    unduped[Names[i]] = Names[i];
}

That’s it. The unduped object now holds a list of names, with duplicate entries removed. How do you get the names back out? Simple:

var uniques = new Array;
for (var k in unduped) {
    uniques.push(unduped[k]);
}

Now uniques is an Array containing the names, with dupes removed.
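Putting the two loops together, a minimal sketch might wrap them in a single function. The name unduplicate and the sample list are hypothetical, and the sketch assumes the entries are ordinary name strings.

```javascript
// Remove duplicates by using each name as an index into a generic Object.
function unduplicate(names) {
  var unduped = new Object();
  for (var i = 0; i < names.length; i++) {
    // Storing a name that was seen before just overwrites the same slot.
    unduped[names[i]] = names[i];
  }

  // Copy the surviving values back out into a regular Array.
  var uniques = new Array();
  for (var k in unduped) {
    uniques.push(unduped[k]);
  }
  return uniques;
}

// Hypothetical sample data.
var mailingList = ['Alice', 'Bob', 'Alice', 'Carol', 'Bob'];
var cleaned = unduplicate(mailingList);
```

Nothing here sorts the list, so no sort step is needed before the duplicates disappear. One caveat worth knowing: the index is always a string, so a list that mixes the number 1 and the string ‘1’ would treat them as the same entry.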

The reason this trick works is that in JavaScript (and Perl, too) an associative array can only hold one value per index. Suppose you do:

Hues['PMS 179'] = 'brick red';
Hues['PMS 179'] = 'dark red';

Now Hues['PMS 179'] will contain ‘dark red’, because you overwrote ‘brick red’ with it. Simple, right? You can’t store two different values in one array slot simultaneously, in any language that I know of.

Incidentally, if the syntax for (var k in unduped) looked strange to you, this is a legitimate JavaScript looping syntax for Objects. It lets you enumerate the properties attached to an object (strictly speaking, its enumerable ones). See p. 98 of Flanagan’s JavaScript book.
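One detail about this loop is worth a sketch. A for...in loop also visits enumerable properties inherited through the prototype chain, so if other code has added properties to Object.prototype, a guard like the one below keeps them out of the results. The sample names and the keys variable are made up for illustration.

```javascript
// Hypothetical sample data stored in a generic Object.
var unduped = new Object();
unduped['Alice'] = 'Alice';
unduped['Bob'] = 'Bob';

var keys = [];
for (var k in unduped) {
  // Keep only properties set directly on this object,
  // skipping anything inherited from a prototype.
  if (Object.prototype.hasOwnProperty.call(unduped, k)) {
    keys.push(k);
  }
}
```

With a plain Object and an unmodified Object.prototype, the guard changes nothing; it only matters when something else has extended the prototype.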

About the Author: Kas Thomas