Exploring the Collections Library in Python

Introduction

The Python collections module provides specialized container data types that serve as alternatives to the general-purpose built-in containers list, set, dict, and tuple. These types extend the functionality of the built-in containers.

The collections module provides the following data structures:

  1. Counter
  2. namedtuple
  3. deque
  4. ChainMap
  5. OrderedDict
  6. defaultdict
  7. UserDict
  8. UserList
  9. UserString

In this article, we will focus on three important data structures from the collections module:

  1. Counter
  2. namedtuple
  3. OrderedDict

Let’s explore all 3 of them.

Counter

Counter is a subclass of dict. In this collection, elements are stored as dictionary keys and their counts as dictionary values.

Syntax: collections.Counter([iterable-or-mapping]). Since Counter accepts any iterable, it can take a string, a list, or another collection as its argument, and it also accepts a mapping of elements to counts. Let's understand it with an example:

from collections import Counter

counter = Counter([5, 2, 3, 5, 5, 3])
counter

returns: Counter({5: 3, 3: 2, 2: 1})

The output matches the description above: the keys are the list elements and the values are their counts. 5 appears 3 times, 3 appears 2 times, and 2 appears once.

We can access elements exactly like a dictionary: counter[5] returns 3. If the element does not exist in the Counter, the lookup returns 0 instead of raising a KeyError, so counter[50] returns 0.
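The lookup behaviour described above can be sketched as follows. The variable names and the extra string and mapping inputs are illustrative choices, not part of the original example.

```python
from collections import Counter

# Same sample data as the example above.
counter = Counter([5, 2, 3, 5, 5, 3])

present = counter[5]    # count of an element that exists
missing = counter[50]   # a missing element yields 0 instead of raising KeyError

# Looking up a missing key does not add it to the Counter.
still_absent = 50 not in counter

# Counter also accepts a string (an iterable of characters) or a mapping.
letters = Counter("banana")
from_mapping = Counter({"a": 2, "b": 1})

print(present, missing, still_absent)    # 3 0 True
print(letters["a"], from_mapping["a"])   # 3 2
```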

Counter has many useful functions; we will explore the three most important ones.

  1. most_common
  2. elements
  3. subtract

most_common(): The most_common function returns the elements of the Counter with their counts, ordered from the most common to the least. It accepts an optional integer parameter n for returning only the top n elements; if the parameter is not provided, the function returns all the elements of the Counter.

counter.most_common(2) 

returns: [(5,3), (3,2)]

counter.most_common()

returns: [(5,3), (3,2), (2,1)]
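A typical use of most_common is finding the most frequent words in a piece of text. The sketch below uses a made-up sentence purely for illustration.

```python
from collections import Counter

# Hypothetical sample text, chosen only to demonstrate most_common.
text = "the quick brown fox jumps over the lazy dog the fox"

# Split on whitespace and count each word.
word_counts = Counter(text.split())

# Ask for the two most frequent words with their counts.
top_two = word_counts.most_common(2)
print(top_two)  # [('the', 3), ('fox', 2)]
```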

elements(): The elements function returns an iterator that yields each element as many times as its count. We can pass it to list or sorted to retrieve the elements.

counter.elements() # <itertools.chain at 0x21ffe3eba30>
list(counter.elements()) # [5, 5, 5, 2, 3, 3]
sorted(counter.elements()) # [2, 3, 3, 5, 5, 5]
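One detail worth knowing is that elements() skips any element whose count is less than one. A minimal sketch, using a hypothetical stock counter built from keyword arguments:

```python
from collections import Counter

# Hypothetical counts: one positive, one zero, one negative.
stock = Counter(apple=2, pear=0, plum=-1)

# Only elements with a count of at least 1 are yielded.
expanded = list(stock.elements())
print(expanded)  # ['apple', 'apple']

# The zero and negative entries are still stored in the Counter itself.
print(stock["pear"], stock["plum"])  # 0 -1
```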

subtract(): The subtract function of the Counter class subtracts the counts of another counter, mapping, or iterable from the current counter. It modifies the counter in place. We already have one counter with elements:

counter = Counter([5,2,3,5,5,3])

Create another counter and call it counter2:

counter2 = Counter([5,4,6])
counter.subtract(counter2)  # updates counter in place and returns None
counter  # Counter({5: 2, 3: 2, 2: 1, 4: -1, 6: -1})

In the original counter, 5 appears three times, and in counter2 only once; after subtracting (3 - 1), the new count of 5 is 2.

The counts of 2 and 3 are unchanged because neither is present in counter2.

4 has a count of -1 because it is not present in the counter from which we are subtracting counter2, so its count starts at 0; the same applies to 6.
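The sketch below replays the subtraction and contrasts it with the - operator, which builds a new Counter and drops entries whose count is zero or negative. The comparison with the operator is an addition for illustration; the variable names are arbitrary.

```python
from collections import Counter

counter = Counter([5, 2, 3, 5, 5, 3])
counter2 = Counter([5, 4, 6])

# subtract() changes counter in place and returns None.
result = counter.subtract(counter2)
after_subtract = dict(counter)
print(result)          # None
print(after_subtract)  # {5: 2, 2: 1, 3: 2, 4: -1, 6: -1}

# The - operator returns a new Counter and keeps only positive counts.
difference = Counter([5, 2, 3, 5, 5, 3]) - Counter([5, 4, 6])
print(dict(difference))  # {5: 2, 2: 1, 3: 2}
```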

namedtuple

One disadvantage of a Python tuple is that we have to use an index to access its elements:

names = ('Sam', 'David', 'Mick')
names[0]  # 'Sam'

If the tuple has only a few elements, using the index approach is OK, but as the number of elements grows it becomes error-prone. The alternative is to use namedtuple, which acts as a lightweight container (similar to a bean) whose elements can be accessed by name.

from collections import namedtuple
Book = namedtuple('Book', 'title author price')

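namedtuple takes two required parameters: the first is the name of the new tuple type and the second lists the field names, given either as a whitespace- or comma-separated string or as a sequence of strings.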

pyBook = Book('Pragmatic Python', 'Sameer', '10')
scalaBook = Book('Scala in Action', 'Oderskey', '200')
print(pyBook.author)  # Sameer

This approach avoids mistakes caused by using the wrong index and improves code readability.
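A namedtuple is still a regular tuple, so indexing and unpacking keep working alongside access by name. The sketch below also shows the _replace and _asdict helpers; the numeric price and the variable names are illustrative assumptions rather than part of the original example.

```python
from collections import namedtuple

# Field names may also be given as a list of strings.
Book = namedtuple("Book", ["title", "author", "price"])
py_book = Book(title="Pragmatic Python", author="Sameer", price=10)

# Tuple behaviour is preserved: indexing and unpacking still work.
first_field = py_book[0]
title, author, price = py_book

# Instances are immutable; assigning to a field raises AttributeError.
try:
    py_book.price = 12
except AttributeError:
    is_immutable = True
else:
    is_immutable = False

# _replace returns a new instance with the given fields changed.
discounted = py_book._replace(price=8)

# _asdict converts the instance to a mapping of field names to values.
as_dict = py_book._asdict()

print(first_field, author, is_immutable)  # Pragmatic Python Sameer True
print(discounted.price, py_book.price)    # 8 10
```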

OrderedDict

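The OrderedDict is a dictionary that maintains insertion order, much like Java’s LinkedHashMap. Once a key is inserted, its position does not change, even if its value is updated later. Since Python 3.7 the built-in dict also preserves insertion order, but OrderedDict additionally offers order-aware features such as move_to_end() and order-sensitive equality.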

from collections import OrderedDict
student = OrderedDict()
student['Sameer'] = 35
student['David'] = 45
student['Mick'] = 20

print(student) # OrderedDict([('Sameer', 35), ('David', 45), ('Mick', 20)])

Updating the value of an existing key does not change the insertion order. For example:

student['Mick'] = 55
print(student) #OrderedDict([('Sameer', 35), ('David', 45), ('Mick', 55)])
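Beyond preserving insertion order, OrderedDict has a few order-aware operations of its own. The following is a minimal sketch of move_to_end, popitem, and order-sensitive equality; the updated score and the small a and b dictionaries are made up for illustration.

```python
from collections import OrderedDict

student = OrderedDict()
student["Sameer"] = 35
student["David"] = 45
student["Mick"] = 20

# Updating a value keeps the key in its original position.
student["Sameer"] = 40
keys_after_update = list(student)   # ['Sameer', 'David', 'Mick']

# move_to_end explicitly moves a key to the last position.
student.move_to_end("Sameer")
keys_after_move = list(student)     # ['David', 'Mick', 'Sameer']

# popitem removes from the end by default, or from the front with last=False.
last_item = student.popitem()             # ('Sameer', 40)
first_item = student.popitem(last=False)  # ('David', 45)

# Two OrderedDicts are equal only if their items are in the same order.
a = OrderedDict([("x", 1), ("y", 2)])
b = OrderedDict([("y", 2), ("x", 1)])
ordered_equal = a == b            # False
plain_equal = dict(a) == dict(b)  # True

print(keys_after_update, keys_after_move)
print(last_item, first_item, ordered_equal, plain_equal)
```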

I highly recommend using the collections module.