Internal working of Set in Python

Last Updated : 28 Jan, 2024

Sets and their working

A set in Python is a collection of unique items. Sets are mainly used for membership testing and for eliminating duplicate entries. The underlying data structure is a hash table, which supports insertion, deletion and lookup in O(1) on average. When several items hash to the same slot, the table has to resolve the collision, for example by chaining the items in a linked list. Sets in Python are unordered collections with duplicate elements removed.

Basic Methods on Sets are:-

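Creating Set:- In Python, sets are created through the set() function, which returns an empty set when called without arguments. Note that an empty set cannot be created through {}, because that creates an empty dictionary. Non-empty sets can be written with braces, e.g. {1, 2, 3}.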
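As a quick check of the point about {}, the following minimal snippet compares the types produced by each form. The variable names are illustrative only.

```python
# set() gives an empty set; bare braces give an empty dictionary.
empty_set = set()
empty_braces = {}

print(type(empty_set).__name__)     # set
print(type(empty_braces).__name__)  # dict

# Braces with elements do create a set, and duplicates are dropped.
numbers = {1, 2, 2, 3}
print(len(numbers))  # 3
```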

Checking if an item is in a set:- The in operator takes O(1) time on average. In the worst case, when many elements collide in the hash table, it can become O(n).

Adding elements:- Insertion is done through the set.add() function, which hashes the element and stores an entry for it in the hash table. As with membership testing, this is O(1) on average and O(n) in the worst case.
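To illustrate where the O(n) worst case comes from, the sketch below uses a hypothetical Key class (invented for this example) whose hash can be forced to a constant, and counts how often __eq__ is called while filling a set. It is a rough experiment rather than a benchmark, and the exact counts depend on the interpreter.

```python
class Key:
    """Hypothetical key type that can be made to collide on purpose."""
    eq_calls = 0  # class-wide counter of equality comparisons

    def __init__(self, value, constant_hash):
        self.value = value
        self.constant_hash = constant_hash

    def __hash__(self):
        # A constant hash forces every key into the same probe sequence.
        return 1 if self.constant_hash else hash(self.value)

    def __eq__(self, other):
        Key.eq_calls += 1
        return isinstance(other, Key) and self.value == other.value


def count_comparisons(n, constant_hash):
    """Add n distinct keys to a set and return how many __eq__ calls it took."""
    Key.eq_calls = 0
    s = set()
    for i in range(n):
        s.add(Key(i, constant_hash))
    return Key.eq_calls


print("distinct hashes:", count_comparisons(500, constant_hash=False))
print("constant hash:  ", count_comparisons(500, constant_hash=True))
```

With a constant hash, each insertion has to be compared against the colliding keys already stored, so the number of comparisons grows much faster than with well-distributed hashes.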

Union:- Two sets can be merged using the union() function or the | operator. The entries of both hash tables are traversed and combined into a new set, with duplicates removed along the way. The time complexity is O(len(s1) + len(s2)), where s1 and s2 are the two sets being united.
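Conceptually, union visits every element of both sets once, which is where the O(len(s1) + len(s2)) bound comes from. The helper union_sketch below is a made-up pure-Python illustration of that idea, not CPython's actual C implementation.

```python
def union_sketch(s1, s2):
    """Illustrative union: one pass over each input set."""
    result = set()
    for item in s1:   # len(s1) insertions
        result.add(item)
    for item in s2:   # len(s2) insertions; duplicates are ignored by add()
        result.add(item)
    return result


print(union_sketch({1, 2}, {2, 3}) == ({1, 2} | {2, 3}))  # True
```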

Intersection:- This can be done through the intersection() function or the & operator, which selects the elements common to both sets. It amounts to iterating over one hash table and keeping the values that are also present in the other. The average time complexity is O(min(len(s1), len(s2))), where s1 and s2 are the two sets being intersected.
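The min(...) in the bound can be understood as iterating over the smaller set and doing an average O(1) lookup in the larger one. The function intersection_sketch is a hypothetical pure-Python rendering of that strategy, offered as a sketch rather than as the interpreter's real code.

```python
def intersection_sketch(s1, s2):
    """Illustrative intersection: loop over the smaller set only."""
    small, large = (s1, s2) if len(s1) <= len(s2) else (s2, s1)
    # Each membership test against `large` is O(1) on average.
    return {item for item in small if item in large}


print(sorted(intersection_sketch({1, 2, 3}, {2, 3, 4, 5})))  # [2, 3]
```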

Difference:- Finds the elements that are in one set but not in the other. This is done through the difference() function or the - operator. The time complexity of finding the difference s1 - s2 is O(len(s1)).
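The O(len(s1)) bound for s1 - s2 follows from visiting each element of s1 once and testing it against s2. A minimal sketch of that idea, with difference_sketch as an illustrative name:

```python
def difference_sketch(s1, s2):
    """Illustrative difference: keep the items of s1 that are absent from s2."""
    return {item for item in s1 if item not in s2}


print(sorted(difference_sketch({1, 2, 3}, {2, 3, 4})))  # [1]
```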

Symmetric Difference:- Finds the elements that are in either set but not in both. The ^ operator or the symmetric_difference() function is used. The average time complexity of s1 ^ s2 is O(len(s1)).

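Symmetric Difference Update:- s1.symmetric_difference_update(s2) updates s1 in place so that it contains the symmetric difference of the two sets. It does not return a new set. Time complexity is O(len(s2)).

clear:- Removes all elements from the set, emptying the hash table.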
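The difference between the ^ operator and the in-place update is easy to miss, so here is a small example using only built-in set methods. The variable names are arbitrary.

```python
s1 = {1, 2, 3}
s2 = {3, 4}

# The ^ operator builds a new set and leaves s1 untouched.
new_set = s1 ^ s2
print(sorted(new_set))  # [1, 2, 4]
print(sorted(s1))       # [1, 2, 3]

# symmetric_difference_update() mutates s1 and returns None.
returned = s1.symmetric_difference_update(s2)
print(returned)         # None
print(sorted(s1))       # [1, 2, 4]

# clear() empties the set in place.
s1.clear()
print(len(s1))          # 0
```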

Time complexity source : Python Wiki 

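If multiple values map to the same index position in a hash table, one classic way to handle the collision is to append the value to a linked list at that position. CPython's set does not use linked lists, though: it resolves collisions with open addressing, probing other slots of the same table. Python sets were originally implemented as dictionaries with dummy values, where the keys are the members of the set. The current CPython set is a dedicated hash table derived from the dictionary code, with further optimizations for set operations.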
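The dictionary-with-dummy-values idea can be sketched in a few lines. DictBackedSet is a hypothetical teaching class written for this example; it is not how the built-in set is implemented today.

```python
class DictBackedSet:
    """Hypothetical set built on a dict: members are keys, values are dummies."""

    _DUMMY = None  # placeholder value stored for every member

    def __init__(self, iterable=()):
        self._data = {}
        for item in iterable:
            self.add(item)

    def add(self, item):
        # Re-adding an existing member just overwrites the dummy value.
        self._data[item] = self._DUMMY

    def discard(self, item):
        self._data.pop(item, None)

    def __contains__(self, item):
        return item in self._data  # dict key lookup, O(1) on average

    def __len__(self):
        return len(self._data)

    def __iter__(self):
        return iter(self._data)


letters = DictBackedSet('hello')
print(len(letters))      # 4
print('h' in letters)    # True
print('z' in letters)    # False
```

Because dictionary keys are unique and hashed, duplicates disappear and membership tests stay fast without any extra logic in the wrapper.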
Sets with numerous operations on a single hash table:- Examples:

# empty set; do not use {} here, because that creates a dictionary
x = set()

# set {'e', 'h', 'l', 'o'} is created in unordered way
B = set('hello')

# set{'a', 'c', 'd', 'b', 'e', 'f', 'g'} is created
A = set('abcdefg')

# set{'a', 'b', 'h', 'c', 'd', 'e', 'f', 'g'}
A.add('h')

fruit = {'orange', 'banana', 'pear', 'apple'}

# fast membership testing in sets
'pear' in fruit # True

'mango' in fruit # False

A == B # A is equal to B (same elements)

A != B # A is not equal to B

A <= B # A is a subset of B

A < B # A is a proper subset of B

A >= B # A is a superset of B

A > B # A is a proper superset of B

A | B # the union of A and B

A & B # the intersection of A and B

A - B # the set of elements in A but not B

A ^ B # the symmetric difference

a = {x for x in A if x not in 'abc'} # Set Comprehension
