Reader Level:
ARTICLE

A Dictionary Class Which Permits Duplicate Keys

Posted by Vulpes Articles | .NET 4.5 November 08, 2011
The .NET framework contains a number of ‘Dictionary’ classes.
  • 0
  • 0
  • 18042
Download Files:
 

Introduction

The .NET framework contains a number of 'Dictionary' classes including:

  • Dictionary<K, V>
  • SortedDictionary<K, V>
  • Hashtable
  • ListDictionary
  • OrderedDictionary

All of these associate a key with a value and are represented internally by a collection of key/value pairs. However, none of them permit duplicate keys and so, if you try to add an item whose key is already present, an exception is thrown.

There is a Lookup<K, V> class which is a collection of keys mapped to one or more values. However, this is intended for use in LINQ queries and is not really suitable for general purpose use.

A strategy for a workaround

The absence of a general purpose dictionary class which permits duplicate keys is a serious shortcoming of the framework, in my opinion.

The most usual way to work around this shortcoming is to associate a list of values, rather than a single value, with each key in the dictionary.

In fact, the GroupBy extension method which is used in LINQ works by reading the input elements into a temporary dictionary of lists so that all elements with the same key end up in the list associated with that key. The lists are then emitted as a sequence of objects which implement the IGrouping interface.

Although this strategy works, it is tedious to employ in practice because you need to write code to manipulate the list associated with each key every time an item is added, removed or checked for containership. Also, when enumerating the dictionary, you have to enumerate first the keys and then their associated lists.

The Lexicon class

It would be nice if we had a class which did all the 'housekeeping' of maintaining the lists for us and which could be used in a natural way. I have therefore written such a class.

As Dictionary is already a long name, I decided to simply use the synonym 'Lexicon' for the new class rather than append yet another prefix to 'Dictionary'.

The Lexicon<K, V> class contains all the same members as the Dictionary<K, V> class but has some new ones to deal with the possibility that a key could have multiple values. The usage of some existing members has had to be modified for the same reason and some additional overloads have been added.

Although the Lexicon<K, V> class is a wrapper for a Dictionary<K, List<V>>, I decided that it would not be appropriate to formally implement the generic and non-generic IDictionary interfaces because these interfaces implicitly assume that classes which implement them will not support duplicate keys. However, it does implement all the other interfaces which Dictionary implements.

It also implements a new interface I've created called ILexicon<K, V> which contains signatures for its essential members.

This table lists its main constructors with a brief description of what they do:

Constructor Signature

 

Description

Lexicon()

creates a new empty Lexicon

Lexicon(dictionary)

creates a new Lexicon from a generic IDictionary

Lexicon(lexicon)

creates a new Lexicon from an ILexicon

Lexicon(capacity)

creates a new Lexicon with the specified initial capacity

Lexicon(comparer)

creates a new Lexicon which uses a specific IEqualityComparer  to test for equality of keys

This table lists its main properties:

Property Name

 

Description

Count

gets the total number of items in the Lexicon

KeyCount

gets the total number of unique keys

this[key]

gets or sets the list of values with this key

this[key, index]

gets or sets the value of the item at this index within this key's list of values

Keys

gets a collection of all unique keys  in the Lexicon

ValueLists

gets the lists of values of all items in the Lexicon in the same order as the Keys property

Values

gets an enumeration of all values  in the Lexicon in the same order as the Keys and Values properties

And, finally, this table lists its main methods:

Method Name and Signature

 

Description

Add(key, value)

adds an item to the Lexicon with this key and value

AddList(key, list)

adds  items  to the Lexicon with this key from this list of values

AddRange(keyValuePairs)

adds an enumerable collection of KeyValuePairs to the Lexicon

ChangeValue(key, oldValue, newValue)

changes the value of the first item with this key and this oldValue to this newValue

ChangeValueAt(key, index, newValue)

changes the value of the item with this index in  this key's list of values to newValue

Clear()

clears the Lexicon of all items

Contains(key, value)

indicates if an item with this key/value pair is within the Lexicon

ContainsKey(key)

indicates if an item with this key is within the Lexicon

ContainsValue(value)

indicates if an item with this value is within the Lexicon

ContainsValue(value, out key)

indicates if an item with this value is within the Lexicon and, if so, returns the first key found with that value  as an output parameter

CopyTo(array, index)

copies the key /value pairs of the Lexicon to an array starting at the specified index

FindKeyIndexPairs(value)

finds all key/index pairs in the Lexicon which have this value

GetValueCount(key)

gets the  number of all values in this key's list

IndexOfValue(key, value)

gets the first index of this value within this key's list

Remove(key, value)

removes the first item in the Lexicon with this key and value

RemoveAt(key, index)

removes the item at this index in this key's list of values

RemoveKey(key)

removes all items in the Lexicon with this key

TryGetValueList(key, out value)

tries to get the list of values with this key

TryGetValueAt(key, index, out value)

tries to get the value of the item at this index within this key's list of values

Notes on members

Lexicon's constructors mirror those of the generic Dictionary class except that there is an additional constructor to create a new Lexicon from an existing ILexicon.

The Count property returns the total number of key/value pairs in the Lexicon.

The indexer which takes a sole key argument gets or sets the whole list of values for that key. If there's already a list for that key, it is replaced (not merged) by the 'setter'. Otherwise, a new entry for that key is created.

The indexer which takes an additional index parameter gets or sets the individual value at that index in the key's value list. If the key doesn't exist or there's no value at that index (and it's the next index to be assigned) then a new entry is created.

The Keys property returns just the unique keys (however many times they're duplicated) and the KeyCount property returns how many such keys there are.

The ValueLists property returns the lists of value corresponding to the unique keys and the Values property returns an enumeration of all values in all key/value pairs in the Lexicon.

The Add, Contains and Remove methods also have overloads (not shown) which take a KeyValuePair structure as a parameter.

The AddList method adds a key with a list of values to the Lexicon. If the key already exists, its existing list of values is merged with the new one, not replaced by it. This differs from the indexer behaviour.

The FindAllKeyIndexPairs method does what it says on the tin for a given value. Notice, though, that this method will be slow for a large Lexicon as it needs to iterate through every key/value pair within it. It has therefore been implemented using deferred execution (i.e. an iterator).

The 'ChangeValue' and 'Remove' family of methods all return a bool value to indicate whether the operation was a success or not.

The other members are largely self-explanatory.

When Lexicons are created (or added to) from other objects, they are copied first so that subsequent changes don't affect the original objects. However, when lists of values are retrieved by external code, they are not copied and so may be manipulated directly by that code.

Enumerating the Lexicon

The best way to enumerate a Lexicon object is to use the enumerator returned by the GetEnumerator method which returns all key/value pairs in the same order as the keys are enumerated in the underlying Dictionary object

However, it's also possible to enumerate using the Keys property and then use the indexer to get the list of values for that key and iterate through those. The ValueLists property enables the lists of values for each key to be enumerated separately.

As mentioned in the previous section, the Values property and FindKeyIndexPairs method are implemented as generic IEnumerables and so can also be enumerated using the foreach statement.

Example of usage

The code for the Lexicon class and ILexicon interface can be downloaded from the link accompanying this article as it is too long to include in the body of the article itself. Both these types are included in a namespace called Lexicons and can be used in .NET 2.0 or later.

The following console application shows how to use some of its principal members:

using System;
using System.Collections.Generic;
using Lexicons;

class Program
{
    static void Main(string[] args)
    {
       
// create and populate a Lexicon object

        Lexicon<string, int> lex = new Lexicon<string, int>();
        lex.Add("Dave", 1);
        lex.Add("John", 2);
        lex.Add("Dave", 3);
        lex.Add("Stan", 4);
        lex.Add("Dave", 5);
        lex.Add(new KeyValuePair<string, int>("Fred", 6));

        // iterate through key/value pairs

        Console.WriteLine("The lexicon initially contains the following key/value pairs\n");

        foreach (KeyValuePair<string, int> kvp in lex)
        {
            Console.WriteLine("{0} : {1}", kvp.Key, kvp.Value);
        }

        // add a new entry to the Lexicon
        lex["Alan"] = new List<int> { 7 };

        // add some more values for the new key
        lex.AddList("Alan", new List<int> { 8, 9 });

        // add another new entry
        lex.Add("Mary", 10);

        // iterate the Lexicon again, this time using the Keys collection

        Console.WriteLine("\nFollowing the addition of new entries, the lexicon now contains\n");

        foreach (string key in lex.Keys)
        {
            foreach (int value in lex[key])
            {
                Console.WriteLine("{0} : {1}", key, value);
            }
        }

        Console.WriteLine("\nDave has {0} values", lex.GetValueCount("Dave"));

        lex.RemoveKey("Dave"); // remove key and all its values           
        lex.Remove("Alan", 8); // remove a single value
        lex.ChangeValue("Fred", 6, 5);
// change a value

        // iterate the Lexicon again

        Console.WriteLine("\nFollowing some removals and a change, the lexicon now contains\n");
        foreach (KeyValuePair<string, int> kvp in lex)
        {
            Console.WriteLine("{0} : {1}", kvp.Key, kvp.Value);
        }

        if (lex.Contains("Stan", 4))
        {
            Console.WriteLine("\nStan has a value of 4");
        }

        // create an array of key/value pairs and copy the Lexicon's contents to it

        KeyValuePair<string, int>[] kvpa = new KeyValuePair<string, int>[lex.Count];
        lex.CopyTo(kvpa, 0);

        Console.WriteLine("There are currently {0} key value pairs in the Lexicon", kvpa.Length);

        // try and get the value at index 1 for Alan

        int val;
        bool b = lex.TryGetValueAt("Alan", 1, out val);
        if (b) Console.WriteLine("Alan has a value of {0} at index 1", val);

        // create a new dictionary

        Dictionary<string, int> dict = new Dictionary<string, int>();
        dict["Nora"] = 3;
        dict["John"] = 4;
// uses a key already in the Lexicon

        // create a new Lexicon from the Dictionary

        Lexicon<string, int> lex2 = new Lexicon<string, int>(dict);

        // add some more members to it

        lex2["Zena"] = new List<int> { 11 };
        lex2["Myra", 0] = 12;

        // merge with existing Lexicon

        lex.AddRange(lex2);

        lex.Remove(new KeyValuePair<string, int>("Stan", 4)); // effectively remove Stan
        lex.RemoveAt("Mary", 0);
// effectively remove Mary

        // iterate the Lexicon again
        Console.WriteLine("\nFollowing a number of changes, the lexicon now contains\n");
        foreach (KeyValuePair<string, int> kvp in lex)
        {
            Console.WriteLine("{0} : {1}", kvp.Key, kvp.Value);
        }

        Console.WriteLine("\nNora has a value of 3 at index {0}", lex.IndexOfValue("Nora", 3));

        lex["Zena", 1] = 1; // add a new value for Zena

        if (lex.ContainsValue(12))
        {
            Console.WriteLine("The lexicon contains a value of 12");
        }

        string k;
        if (lex.ContainsValue(5, out k)) Console.Write("{0} had a value of 5 ", k);
        lex.ChangeValue(k, 5, 2);
        if (lex[k, 0] == 2) Console.WriteLine("but now has a value of 2", k);

        Console.WriteLine("\nThe following key/index pairs have a value of 2\n");
        foreach (KeyValuePair<string, int> kip in lex.FindKeyIndexPairs(2))
        {
            Console.WriteLine("Key : {0}  Value : 2  Index : {1}", kip.Key, kip.Value);
        }

        Console.ReadKey();
    }
}


Example output

A screenshot of the output is shown below:

Enum.gif

Conclusion

I hope you will find the Lexicon class to be a useful adjunct to the other 'Dictionary' classes in the .NET Framework.

It is not intended as a replacement for the generic Dictionary class and should only be used in situations where keys may be duplicated. As a List object needs to be assigned to each key, it will clearly use more memory and be slightly slower than an 'ordinary' Dictionary. This is not ideal in situations where the duplicated keys are relatively sparse but an inevitable consequence of the strategy used.

I am currently working on sorted versions of the Lexicon and a 'ToLexicon' extension method for use in LINQ queries . I am also considering the use of a different strategy to address the issue of Lexicons with sparse duplicate keys. The results will be the subject of a future article.

COMMENT USING

Trending up