A Dictionary Class Which Permits Duplicate Keys

Introduction

 
The .NET framework contains a number of 'Dictionary' classes including:
  • Dictionary<K, V>
  • SortedDictionary<K, V>
  • Hashtable
  • ListDictionary
  • OrderedDictionary
All of these associate a key with a value and are represented internally by a collection of key/value pairs. However, none of them permit duplicate keys and so, if you try to add an item whose key is already present, an exception is thrown.
 
There is a Lookup<K, V> class which is a collection of keys mapped to one or more values. However, this is intended for use in LINQ queries and is not really suitable for general purpose use.
 
A strategy for a workaround
 
The absence of a general-purpose dictionary class which permits duplicate keys is a serious shortcoming of the framework, in my opinion.
 
The most usual way to work around this shortcoming is to associate a list of values, rather than a single value, with each key in the dictionary.
 
In fact, the GroupBy extension method which is used in LINQ works by reading the input elements into a temporary dictionary of lists so that all elements with the same key end up in the list associated with that key. The lists are then emitted as a sequence of objects which implement the IGrouping interface.
 
Although this strategy works, it is tedious to employ in practice because you need to write code to manipulate the list associated with each key every time an item is added, removed, or checked for containership. Also, when enumerating the dictionary, you have to enumerate first the keys and then their associated lists.
 
The Lexicon class
 
It would be nice if we had a class which did all the 'housekeeping' of maintaining the lists for us and which could be used in a natural way. I have therefore written such a class.
 
As Dictionary is already a long name, I decided to simply use the synonym 'Lexicon' for the new class rather than append yet another prefix to 'Dictionary'.
 
The Lexicon<K, V> class contains all the same members as the Dictionary<K, V> class but has some new ones to deal with the possibility that a key could have multiple values. The usage of some existing members has had to be modified for the same reason and some additional overloads have been added.
 
Although the Lexicon<K, V> class is a wrapper for a Dictionary<K, List<V>>, I decided that it would not be appropriate to formally implement the generic and non-generic IDictionary interfaces because these interfaces implicitly assume that classes which implement them will not support duplicate keys. However, it does implement all the other interfaces which Dictionary implements.
 
It also implements a new interface I've created called ILexicon<K, V> which contains signatures for its essential members.
 
This table lists its main constructors with a brief description of what they do:
 
Constructor Signature Description
Lexicon()
creates a new empty Lexicon
Lexicon(dictionary)
creates a new Lexicon from a generic IDictionary
Lexicon(lexicon)
creates a new Lexicon from an ILexicon
Lexicon(capacity)
creates a new Lexicon with the specified initial capacity
Lexicon(comparer)
creates a new Lexicon which uses a specific IEqualityComparer  to test for equality of keys
 
This table lists its main properties:
 
Property Name Description
Count
gets the total number of items in the Lexicon
KeyCount
gets the total number of unique keys
this[key]
gets or sets the list of values with this key
this[key, index]
gets or sets the value of the item at this index within this key's list of values
Keys
gets a collection of all unique keys  in the Lexicon
ValueLists
gets the lists of values of all items in the Lexicon in the same order as the Keys property
Values
gets an enumeration of all values  in the Lexicon in the same order as the Keys and Values properties
 
And, finally, this table lists it's main methods:
 
Method Name and Signature Description
Add(key, value)
adds an item to the Lexicon with this key and value
AddList(key, list)
adds  items  to the Lexicon with this key from this list of values
AddRange(keyValuePairs)
adds an enumerable collection of KeyValuePairs to the Lexicon
ChangeValue(key, oldValue, newValue)
changes the value of the first item with this key and this oldValue to this newValue
ChangeValueAt(key, index, newValue)
changes the value of the item with this index in  this key's list of values to newValue
Clear()
clears the Lexicon of all items
Contains(key, value)
indicates if an item with this key/value pair is within the Lexicon
ContainsKey(key)
indicates if an item with this key is within the Lexicon
ContainsValue(value)
indicates if an item with this value is within the Lexicon
ContainsValue(value, out key)
indicates if an item with this value is within the Lexicon and, if so, returns the first key found with that value  as an output parameter
CopyTo(array, index)
copies the key /value pairs of the Lexicon to an array starting at the specified index
FindKeyIndexPairs(value)
finds all key/index pairs in the Lexicon which have this value
GetValueCount(key)
gets the  number of all values in this key's list
IndexOfValue(key, value)
gets the first index of this value within this key's list
Remove(key, value)
removes the first item in the Lexicon with this key and value
RemoveAt(key, index)
removes the item at this index in this key's list of values
RemoveKey(key)
removes all items in the Lexicon with this key
TryGetValueList(key, out value)
tries to get the list of values with this key
TryGetValueAt(key, index, out value)
tries to get the value of the item at this index within this key's list of values
 
Notes on members
 
Lexicon's constructors mirror those of the generic Dictionary class except that there is an additional constructor to create a new Lexicon from an existing ILexicon.
 
The Count property returns the total number of key/value pairs in the Lexicon.
 
The indexer which takes a sole key argument gets or sets the whole list of values for that key. If there's already a list for that key, it is replaced (not merged) by the 'setter'. Otherwise, a new entry for that key is created.
 
The indexer which takes an additional index parameter gets or sets the individual value at that index in the key's value list. If the key doesn't exist or there's no value at that index (and it's the next index to be assigned) then a new entry is created.
 
The Keys property returns just the unique keys (however many times they're duplicated) and the KeyCount property returns how many such keys there are.
 
The ValueLists property returns the lists of values corresponding to the unique keys and the Values property returns an enumeration of all values in all key/value pairs in the Lexicon.
 
The Add, Contains, and Remove methods also have overloads (not shown) which take a KeyValuePair structure as a parameter.
 
The AddList method adds a key with a list of values to the Lexicon. If the key already exists, its existing list of values is merged with the new one, not replaced by it. This differs from the indexer behavior.
 
The FindAllKeyIndexPairs method does what it says on the tin for a given value. Notice, though, that this method will be slow for a large Lexicon as it needs to iterate through every key/value pair within it. It has therefore been implemented using deferred execution (i.e. an iterator).
 
The 'ChangeValue' and 'Remove' family of methods all return a bool value to indicate whether the operation was a success or not.
 
The other members are largely self-explanatory.
 
When Lexicons are created (or added to) from other objects, they are copied first so that subsequent changes don't affect the original objects. However, when lists of values are retrieved by external code, they are not copied and so maybe manipulated directly by that code.
 
Enumerating the Lexicon
 
The best way to enumerate a Lexicon object is to use the enumerator returned by the GetEnumerator method which returns all key/value pairs in the same order as the keys are enumerated in the underlying Dictionary object
 
However, it's also possible to enumerate using the Keys property and then use the indexer to get the list of values for that key and iterate through those. The ValueLists property enables the lists of values for each key to be enumerated separately.
 
As mentioned in the previous section, the Values property and FindKeyIndexPairs method are implemented as generic IEnumerables and so can also be enumerated using the foreach statement.
 
Example of usage
 
The code for the Lexicon class and ILexicon interface can be downloaded from the link accompanying this article as it is too long to include in the body of the article itself. Both these types are included in a namespace called Lexicons and can be used in .NET 2.0 or later.
 
The following console application shows how to use some of its principal members:
  1. using System;  
  2. using System.Collections.Generic;  
  3. using Lexicons;  
  4.   
  5. class Program {  
  6.     static void Main(string[] args) {  
  7.         // create and populate a Lexicon object  
  8.         Lexicon < stringint > lex = new Lexicon < stringint > ();  
  9.         lex.Add("Dave", 1);  
  10.         lex.Add("John", 2);  
  11.         lex.Add("Dave", 3);  
  12.         lex.Add("Stan", 4);  
  13.         lex.Add("Dave", 5);  
  14.         lex.Add(new KeyValuePair < stringint > ("Fred", 6));  
  15.   
  16.         // iterate through key/value pairs  
  17.         Console.WriteLine("The lexicon initially contains the following key/value pairs\n");  
  18.   
  19.         foreach(KeyValuePair < stringint > kvp in lex) {  
  20.             Console.WriteLine("{0} : {1}", kvp.Key, kvp.Value);  
  21.         }  
  22.   
  23.         // add a new entry to the Lexicon  
  24.         lex["Alan"] = new List < int > { 7 };  
  25.   
  26.         // add some more values for the new key  
  27.         lex.AddList("Alan"new List < int > { 8, 9 });  
  28.   
  29.         // add another new entry  
  30.         lex.Add("Mary", 10);  
  31.   
  32.         // iterate the Lexicon again, this time using the Keys collection  
  33.         Console.WriteLine("\nFollowing the addition of new entries, the lexicon now contains\n");  
  34.   
  35.         foreach(string key in lex.Keys) {  
  36.             foreach(int value in lex[key]) {  
  37.                 Console.WriteLine("{0} : {1}", key, value);  
  38.             }  
  39.         }  
  40.   
  41.         Console.WriteLine("\nDave has {0} values", lex.GetValueCount("Dave"));  
  42.   
  43.         lex.RemoveKey("Dave"); // remove key and all its values             
  44.         lex.Remove("Alan", 8); // remove a single value  
  45.         lex.ChangeValue("Fred", 6, 5); // change a value  
  46.   
  47.         // iterate the Lexicon again  
  48.         Console.WriteLine("\nFollowing some removals and a change, the lexicon now contains\n");  
  49.         foreach(KeyValuePair < stringint > kvp in lex) {  
  50.             Console.WriteLine("{0} : {1}", kvp.Key, kvp.Value);  
  51.         }  
  52.   
  53.         if (lex.Contains("Stan", 4)) {  
  54.             Console.WriteLine("\nStan has a value of 4");  
  55.         }  
  56.   
  57.         // create an array of key/value pairs and copy the Lexicon's contents to it  
  58.         KeyValuePair < stringint > [] kvpa = new KeyValuePair < stringint > [lex.Count];  
  59.         lex.CopyTo(kvpa, 0);  
  60.   
  61.         Console.WriteLine("There are currently {0} key value pairs in the Lexicon", kvpa.Length);  
  62.   
  63.         // try and get the value at index 1 for Alan  
  64.         int val;  
  65.         bool b = lex.TryGetValueAt("Alan", 1, out val);  
  66.         if (b) Console.WriteLine("Alan has a value of {0} at index 1", val);  
  67.   
  68.         // create a new dictionary  
  69.         Dictionary < stringint > dict = new Dictionary < stringint > ();  
  70.         dict["Nora"] = 3;  
  71.         dict["John"] = 4; // uses a key already in the Lexicon  
  72.   
  73.         // create a new Lexicon from the Dictionary  
  74.         Lexicon < stringint > lex2 = new Lexicon < stringint > (dict);  
  75.   
  76.         // add some more members to it  
  77.         lex2["Zena"] = new List < int > { 11 };  
  78.         lex2["Myra", 0] = 12;  
  79.   
  80.         // merge with existing Lexicon  
  81.         lex.AddRange(lex2);  
  82.   
  83.         lex.Remove(new KeyValuePair < stringint > ("Stan", 4)); // effectively remove Stan  
  84.         lex.RemoveAt("Mary", 0); // effectively remove Mary  
  85.   
  86.         // iterate the Lexicon again  
  87.         Console.WriteLine("\nFollowing a number of changes, the lexicon now contains\n");  
  88.         foreach(KeyValuePair < stringint > kvp in lex) {  
  89.             Console.WriteLine("{0} : {1}", kvp.Key, kvp.Value);  
  90.         }  
  91.   
  92.         Console.WriteLine("\nNora has a value of 3 at index {0}", lex.IndexOfValue("Nora", 3));  
  93.   
  94.         lex["Zena", 1] = 1; // add a new value for Zena  
  95.   
  96.         if (lex.ContainsValue(12)) {  
  97.             Console.WriteLine("The lexicon contains a value of 12");  
  98.         }  
  99.   
  100.         string k;  
  101.         if (lex.ContainsValue(5, out k)) Console.Write("{0} had a value of 5 ", k);  
  102.         lex.ChangeValue(k, 5, 2);  
  103.         if (lex[k, 0] == 2) Console.WriteLine("but now has a value of 2", k);  
  104.   
  105.         Console.WriteLine("\nThe following key/index pairs have a value of 2\n");  
  106.         foreach(KeyValuePair < stringint > kip in lex.FindKeyIndexPairs(2)) {  
  107.             Console.WriteLine("Key : {0}  Value : 2  Index : {1}", kip.Key, kip.Value);  
  108.         }  
  109.         Console.ReadKey();  
  110.     }  
Example output
 
A screenshot of the output is shown below:
 
Enum.gif
 

Conclusion

 
I hope you will find the Lexicon class to be a useful adjunct to the other 'Dictionary' classes in the .NET Framework.
 
It is not intended as a replacement for the generic Dictionary class and should only be used in situations where keys may be duplicated. As a List object needs to be assigned to each key, it will clearly use more memory and be slightly slower than an 'ordinary' Dictionary. This is not ideal in situations where the duplicated keys are relatively sparse but an inevitable consequence of the strategy used.
 
I am currently working on sorted versions of the Lexicon and a 'ToLexicon' extension method for use in LINQ queries. I am also considering the use of a different strategy to address the issue of Lexicons with sparse duplicate keys. The results will be the subject of a future article.