Have you ever needed more robust data structures while programming in Python? If so, the Python “collections” module is here to help. This module offers a range of specialized container types and helper classes that can improve how you manage data.
From namedtuple, deque, Counter, OrderedDict, and defaultdict to user-friendly wrapper classes, the collections module gives developers essential tools for simplifying coding tasks and handling data efficiently.
What Is the Collections Module in Python
The Python collections module is part of the standard library and provides additional container types for managing data efficiently. It builds on Python's standard data structures, such as dictionaries, lists, and tuples, by offering specialized container types and utility classes.
Components of Python Collections Module
Here are the essential components of the Python collections module along with their descriptions.
Component Name | Description |
namedtuple | Factory function for creating tuple subclasses with named fields. |
deque | List-like container that supports fast appends and pops on both ends. |
ChainMap | Dict-like class for creating a single view of multiple mappings. |
Counter | Dict subclass for counting hashable objects and calculating frequencies. |
OrderedDict | Dict subclass that maintains the insertion order of key-value pairs. |
defaultdict | Dict subclass that invokes a factory function to supply default values for missing keys. |
UserDict | Wrapper around dictionary objects for easier subclassing of dictionaries. |
UserList | Wrapper around list objects for easier subclassing of lists. |
UserString | Wrapper around string objects for easier subclassing of strings. |
Let’s check out each component individually.
1. namedtuple() Function
In the Python collections module, “namedtuple” lets you create self-explanatory and lightweight data structures. It is a factory function that creates tuple subclasses with named fields, offering a more readable alternative to plain tuples and a more concise alternative to regular classes.
In the given program, we first import “namedtuple” from the “collections” module. The next step is to define a “Point” namedtuple that comprises the named fields “a” and “b”.
Then, we create an instance of Point with “a=4” and “b=8” and access these named fields with dot notation. This enables clear, intuitive referencing of particular values within the namedtuple.
from collections import namedtuple

# Defining a namedtuple for representing a Point with a and b coordinates
Point = namedtuple('Point', ['a', 'b'])

# Creating an instance of Point
p = Point(4, 8)

# Accessing the named fields
print(p.a)
print(p.b)
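Beyond dot notation, a namedtuple still behaves like an ordinary tuple. The following is a minimal sketch, not part of the original program, that reuses the same “Point” definition to show unpacking along with the standard “_replace()” and “_asdict()” methods.

```python
from collections import namedtuple

Point = namedtuple('Point', ['a', 'b'])
p = Point(4, 8)

# A namedtuple unpacks and indexes like a regular tuple
a, b = p
first = p[0]
print(a, b, first)  # 4 8 4

# Instances are immutable; _replace() returns a modified copy
moved = p._replace(a=10)
print(moved)  # Point(a=10, b=8)

# _asdict() maps field names to values
as_dict = dict(p._asdict())
print(as_dict)  # {'a': 4, 'b': 8}
```

Because “_replace()” returns a new object, the original “p” is left unchanged.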
2. deque – Double-Ended Queues
The “deque” class of the Python collections module offers a list-like container with fast append and pop operations on both the left and right ends. More specifically, deques are optimized for use cases that need efficient insertion or deletion at either end of the sequence. This makes deque useful for implementing queues, stacks, and similar data structures.
To use it, first import the “deque” class from the collections module. Then, create an empty deque and append three elements to the right end with the “append()” method. Similarly, we append two elements to the left end by invoking the “appendleft()” method.
Next, we call “pop()” and “popleft()” to remove one element from the right end and one from the left end. Lastly, the first and last remaining elements are accessed by index and printed on the console with the print() function.
from collections import deque

# Creating an empty deque
d = deque()

# Appending elements to the right end
d.append(7)
d.append(3)
d.append(9)

# Appending elements to the left end
d.appendleft(6)
d.appendleft(8)

# Pop elements from the right end
print(d.pop())

# Pop elements from the left end
print(d.popleft())

# Accessing elements
print(d[0])
print(d[-1])
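Two further deque features are often useful when building queues: the “maxlen” argument and the “rotate()” method. The sketch below is an illustrative addition with made-up sample values. It shows a bounded deque that keeps only the most recent items, followed by a rotation.

```python
from collections import deque

# A deque created with maxlen discards items from the opposite end when full
recent = deque(maxlen=3)
for reading in [1, 2, 3, 4, 5]:  # sample values for illustration
    recent.append(reading)
print(recent)  # deque([3, 4, 5], maxlen=3)

# rotate(n) shifts items n steps to the right (negative n shifts left)
d = deque([1, 2, 3, 4])
d.rotate(1)
print(d)  # deque([4, 1, 2, 3])
```

A bounded deque like this is a simple way to keep a "last N items" history without trimming the container manually.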
3. ChainMap in Python
This class offers an efficient approach for combining multiple dictionaries into a single view. It acts as a dict-like class that presents several mappings as one, searching the underlying dictionaries in order during lookups without copying them.
Now, let’s import the ChainMap class from the collections module. Then, we create two dictionaries named “dict1” and “dict2”. After that, we create a ChainMap object named “chain_map” by passing the newly created dictionaries as arguments.
The values can be read through the ChainMap as if it were a single dictionary. Updates and deletions apply only to the first mapping, so changing “b” through “chain_map” also changes “dict1”.
from collections import ChainMap

# Creating two dictionaries
dict1 = {'a': 9, 'b': 8}
dict2 = {'c': 7, 'd': 6}

# Creating a ChainMap from the dictionaries
chain_map = ChainMap(dict1, dict2)

# Accessing values
print("Original Values:")
print(chain_map['a'])
print(chain_map['c'])

# Updating values
chain_map['b'] = 5
print("Updated Values:")
print(chain_map['b'])
print(dict1['b'])
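The lookup order matters when the same key appears in more than one dictionary. The following sketch uses hypothetical “defaults” and “overrides” dictionaries (not part of the original program) to show that the first mapping wins on lookup and that writes always land in the first mapping.

```python
from collections import ChainMap

# Hypothetical settings dictionaries used only for illustration
defaults = {'theme': 'light', 'lang': 'en'}
overrides = {'theme': 'dark'}

# Lookups search the mappings from left to right, so overrides wins
settings = ChainMap(overrides, defaults)
print(settings['theme'])  # dark
print(settings['lang'])   # en

# Writes go to the first mapping, even if the key lives in a later one
settings['lang'] = 'fr'
print(overrides)  # {'theme': 'dark', 'lang': 'fr'}
print(defaults)   # {'theme': 'light', 'lang': 'en'}
```

This layering is a common way to combine user settings with fallback defaults without merging the dictionaries.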
4. Python’s Counter
The Counter class provides a convenient way of counting the occurrences of elements in an iterable. It is a specialized dict subclass that counts hashable objects and makes frequency calculations easy.
In this program, we first import the Counter class from the collections module. Then, we create a Counter object named “c” from a list of elements. This object automatically counts the occurrences of each element, which are single characters in this scenario.
Then, we update the counts by passing another iterable to the “update()” method. Lastly, the “most_common()” method retrieves the most common elements and their counts.
from collections import Counter

# Creating a Counter from a list
c = Counter(['x', 'y', 'z', 'x', 'y', 'x'])

# Counting occurrences
print(c)
print("Original:", c['x'])

# Updating counts
c.update(['x', 'y', 'z'])
print("Updated:", c)

# print most common elements
print("Most Common:", c.most_common(2))
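Counter objects also support arithmetic, which is handy for combining or comparing frequencies. The sketch below is an illustrative addition with made-up fruit counts. It adds and subtracts two counters and reads a key that was never counted.

```python
from collections import Counter

# Hypothetical tallies used only for illustration
morning = Counter(apple=3, pear=1)
evening = Counter(apple=1, pear=2, plum=4)

# Addition sums the counts key by key
total = morning + evening
print(total['apple'], total['pear'], total['plum'])  # 4 3 4

# Subtraction keeps only the positive results
diff = morning - evening
print(diff)  # Counter({'apple': 2})

# A key that was never counted returns 0 instead of raising KeyError
print(morning['kiwi'])  # 0
```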
5. OrderedDict in Python
The Python collections module provides another dict subclass named OrderedDict. This class remembers the order in which key-value pairs were inserted, enabling you to iterate over the dictionary’s keys or items in the order they were added. Since Python 3.7, regular dictionaries also preserve insertion order, but OrderedDict still adds order-aware features such as move_to_end() and order-sensitive equality comparisons.
For instance, we import the OrderedDict class from the collections module. Then, we create an empty ordered dictionary named “ordered_dict” and add three key-value pairs to it. Next, the “items()” method is used to iterate over the dictionary in a for loop. Lastly, the print() function displays the items on the terminal.
from collections import OrderedDict

# Creating an empty OrderedDict
ordered_dict = OrderedDict()

# Adding key-value pairs
ordered_dict['x'] = 7
ordered_dict['y'] = 8
ordered_dict['z'] = 9

# Iterating over the OrderedDict
for key, value in ordered_dict.items():
    print(key, value)
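OrderedDict also offers operations that depend on order. As a minimal sketch (an illustrative addition, reusing the same keys), the example below reorders entries with “move_to_end()”, removes the first entry with “popitem(last=False)”, and compares two ordered dictionaries.

```python
from collections import OrderedDict

od = OrderedDict(x=7, y=8, z=9)

# Move a key to the end, or to the front with last=False
od.move_to_end('x')
print(list(od))  # ['y', 'z', 'x']
od.move_to_end('z', last=False)
print(list(od))  # ['z', 'y', 'x']

# popitem(last=False) removes and returns the first entry
print(od.popitem(last=False))  # ('z', 9)

# Equality between two OrderedDicts is order-sensitive, unlike plain dicts
print(OrderedDict(a=1, b=2) == OrderedDict(b=2, a=1))  # False
print(dict(a=1, b=2) == dict(b=2, a=1))                # True
```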
6. defaultdict in Python
The defaultdict class is a dict subclass that calls a factory function to supply a value whenever the accessed key does not exist. It offers a convenient way of handling missing keys without raising KeyError exceptions.
Here, we import the defaultdict class from the collections module. Then, a new defaultdict object named “d” is created with “int” as the factory function.
So, whenever a missing key is accessed, it returns the default value produced by “int”, which is 0. After that, the count for the key “y” is incremented by 1.
from collections import defaultdict

# Defining a defaultdict with int as the factory function
d = defaultdict(int)

# Accessing a missing key returns the default value
print(d['x'])

# Incrementing the count for 'y' by 1
d['y'] += 1
print(d['y'])
As the next part of the program, a new defaultdict named “d_list” is created with “list” as the factory function. Then, we append two elements under a key that did not exist yet.
When a missing key is accessed, an empty list is created and returned, permitting us to add values to it right away.
from collections import defaultdict

# Creating a defaultdict with list as the factory function
d_list = defaultdict(list)

# Appending to the list for a missing key
d_list['colors'].append('black')
d_list['colors'].append('blue')
print(d_list['colors'])
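A common use of the “list” factory is grouping items under a shared key. The following sketch is an illustrative addition with a made-up word list. It groups words by first letter and also shows that merely reading a missing key with square brackets inserts it, whereas “get()” does not.

```python
from collections import defaultdict

# Sample data used only for illustration
words = ['apple', 'avocado', 'banana', 'blueberry', 'cherry']

# Group words by their first letter without checking for existing keys
by_letter = defaultdict(list)
for word in words:
    by_letter[word[0]].append(word)
print(dict(by_letter))
# {'a': ['apple', 'avocado'], 'b': ['banana', 'blueberry'], 'c': ['cherry']}

# Reading a missing key with [] calls the factory and stores the result
print('z' in by_letter)  # False
by_letter['z']
print('z' in by_letter)  # True

# get() does not call the factory and does not insert the key
print(by_letter.get('q'))  # None
print('q' in by_letter)    # False
```

If accidental key creation is a concern, use “get()” or the “in” operator for read-only checks.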
7. UserDict in Python
The “UserDict” class in the Python collections module is a wrapper class that simplifies subclassing of dictionaries.
In this program, we import the “UserDict” class from the collections module. After that, we create a custom subclass named “CustomDict” by inheriting from the wrapper class. Then, the “__setitem__” method is overridden to add custom behavior, which runs whenever a key is assigned.
from collections import UserDict

# Creating a custom dictionary subclass
class CustomDict(UserDict):
    def __setitem__(self, key, value):
        print("Setting key-value pair")
        super().__setitem__(key, value)

# Creating an instance of the custom dictionary
d = CustomDict()
d['a'] = 1
print(d)
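One reason to prefer UserDict over subclassing dict directly is that, in CPython, the built-in dict methods such as “update()” do not call an overridden “__setitem__”. The sketch below illustrates the difference with two hypothetical classes, “UpperDict” and “UpperUserDict”, that are not part of the original program.

```python
from collections import UserDict

# Hypothetical subclass of the built-in dict that upper-cases keys
class UpperDict(dict):
    def __setitem__(self, key, value):
        super().__setitem__(key.upper(), value)

# The same idea built on UserDict
class UpperUserDict(UserDict):
    def __setitem__(self, key, value):
        super().__setitem__(key.upper(), value)

plain = UpperDict()
plain['a'] = 1
plain.update({'b': 2})  # dict.update() bypasses the override in CPython
print(plain)  # {'A': 1, 'b': 2}

wrapped = UpperUserDict()
wrapped['a'] = 1
wrapped.update({'b': 2})  # UserDict.update() goes through __setitem__
print(wrapped)  # {'A': 1, 'B': 2}

# The underlying dictionary is available through the data attribute
print(type(wrapped.data).__name__)  # dict
```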
8. UserList in Python
“UserList” is another wrapper class of the collections module that makes subclassing of list objects in Python easier.
In the provided code, we import the “UserList” class from the collections module and create a custom list subclass named “CustomList”. Then, the “append()” method is overridden so that it prints a message before adding the item to the list.
from collections import UserList

# Creating a custom list subclass
class CustomList(UserList):
    def append(self, item):
        print("Appending item")
        super().append(item)

# Creating an instance of the custom list
lst = CustomList()
lst.append(10)
print(lst)
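A UserList stores its contents in a regular list that is exposed through the “data” attribute. As a small sketch (an illustrative addition that reuses the “CustomList” class from above), the example below inspects that attribute and shows that a UserList compares equal to a plain list without being a “list” instance itself.

```python
from collections import UserList

class CustomList(UserList):
    def append(self, item):
        print("Appending item")
        super().append(item)

# Initial items are copied in without calling append()
lst = CustomList([1, 2])
lst.append(3)

# The wrapped list lives in the data attribute
print(lst.data)                 # [1, 2, 3]
print(type(lst.data).__name__)  # list

# A UserList compares equal to a plain list but is not a list subclass
print(lst == [1, 2, 3])         # True
print(isinstance(lst, list))    # False
```

Code that checks “isinstance(obj, list)” will therefore not accept a UserList, which is worth keeping in mind when passing it to other libraries.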
9. UserString in Python
Last but not least, the “UserString” wrapper class of the Python collections module can be used for easier subclassing of string objects.
For instance, here we import the “UserString” class from the collections module and create a custom string subclass named “CustomString”. Then, the “upper()” method is overridden so that it prints a message before converting the string to uppercase, as follows.
from collections import UserString

# Creating a custom string subclass
class CustomString(UserString):
    def upper(self):
        print("Converting to uppercase")
        return super().upper()

# Creating an instance of the custom string
s = CustomString("hello geeksveda user")
print(s.upper())
It can be observed that the added functionality has been performed successfully.
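Methods inherited from UserString return instances of the subclass rather than plain strings, so custom methods stay available on the results. The sketch below is an illustrative addition, and “shout()” is a hypothetical helper that does not appear in the original program.

```python
from collections import UserString

class CustomString(UserString):
    def shout(self):
        # Hypothetical helper: upper-case the text and add an exclamation mark
        return self.upper() + "!"

s = CustomString("hello")
result = s.shout()
print(result)                 # HELLO!

# String methods and concatenation return the subclass, not a plain str
print(type(result).__name__)  # CustomString

# The wrapped str is available through the data attribute
print(type(s.data).__name__)  # str
print(isinstance(s, str))     # False
```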
Best Practices for Using Python Collections Module
Keep the following best practices in mind when using the Python collections module.
- Import only the particular functions or classes from the module that are required.
- Use “deque” for efficient insertion or deletion operations on both ends.
- Use “namedtuple” for readable and concise data structures with named fields.
- Use “OrderedDict” when you need order-aware operations on key-value pairs.
- Use “Counter” for counting and frequency calculation.
- Use “defaultdict” to easily handle missing keys.
- Use “UserDict”, “UserString”, and “UserList” for easier subclassing of dictionaries, strings, and lists.
That’s all from today’s guide related to the Python collections module.
Conclusion
The Python Collections module supports robust container types and functions that can improve data structure management. By using namedtuples, deques, Counter, OrderedDict, defaultdict, and the UserDict, UserList, and UserString wrappers, you can achieve efficient, concise, and maintainable code.
So, explore it and use it in your projects where it fits.
Want to learn more about Python? Check out our dedicated Python Tutorial Series!