
Python Strings encode() method

Last Updated : 17 Aug, 2022

Python String encode() converts a string into a bytes object, using the encoding specified by the caller (UTF-8 if none is given).

Python String encode() Method Syntax:

Syntax: str.encode(encoding='utf-8', errors='strict')

Parameters: 

  • encoding: Optional. The name of the encoding to use. Defaults to 'utf-8'.
  • errors: Optional. Decides how characters that cannot be encoded are handled. Defaults to 'strict'. Commonly used values include:
    • strict – the default; raises a UnicodeEncodeError exception on failure
    • ignore – drops each unencodable character from the result
    • replace – replaces each unencodable character with a question mark ?
    • xmlcharrefreplace – inserts an XML character reference (such as &#182;) in place of each unencodable character
    • backslashreplace – inserts a backslash escape sequence (\xNN, \uNNNN or \UNNNNNNNN) in place of each unencodable character
    • namereplace – inserts a \N{…} escape sequence in place of each unencodable character

Return:  Returns the encoded version of the string as a bytes object
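
As a rough illustration of the error handlers listed above, the sketch below encodes one sample string to ASCII with each non-strict handler. The sample text and the variable names (text, handlers, results) are chosen for this illustration only.

```python
# Encode a string containing one non-ASCII character (the pilcrow sign)
# to ASCII with each of the non-strict error handlers.
text = "123-¶"
handlers = ["ignore", "replace", "xmlcharrefreplace",
            "backslashreplace", "namereplace"]

results = {}
for handler in handlers:
    results[handler] = text.encode("ascii", errors=handler)
    print(f"{handler:18} {results[handler]}")
```

Each handler returns a bytes object; only the treatment of the unencodable character differs.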

Python String encode() Method Example:

Python3




print("¶".encode('utf-8'))


Output:

b'\xc2\xb6'

Example 1: Code to print encoding schemes available

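Python ships codecs for many encodings. The aliases dictionary in encodings.aliases maps alternative names to the codec each one refers to, so printing its keys lists the alias names that the encode() method recognises. Canonical codec names such as utf_8 are accepted as well, and the exact list varies between Python versions.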

Python3




from encodings.aliases import aliases
 
# Print the available encoding aliases
print("The available encodings are : ")
print(aliases.keys())


Output: 

The available encodings are : 
dict_keys(['ibm039', 'iso_ir_226', '1140', 'iso_ir_110', '1252', 'iso_8859_8', 'iso_8859_3', 'iso_ir_166', 'cp367', 'uu', 'quotedprintable', 'ibm775', 'iso_8859_16_2001', 'ebcdic_cp_ch', 'gb2312_1980', 'ibm852', 'uhc', 'macgreek', '850', 'iso2022jp_2', 'hz_gb_2312', 'elot_928', 'iso8859_1', 'eucjp', 'iso_ir_199', 'ibm865', 'cspc862latinhebrew', '863', 'iso_8859_5', 'latin4', 'windows_1253', 'csisolatingreek', 'latin5', '855', 'windows_1256', 'rot13', 'ms1361', 'windows_1254', 'ibm863', 'iso_8859_14_1998', 'utf8_ucs2', '500', 'iso8859', '775', 'l7', 'l2', 'gb18030_2000', 'l9', 'utf_32be', 'iso_ir_100', 'iso_8859_4', 'iso_ir_157', 'csibm857', 'shiftjis2004', 'iso2022jp_1', 'iso_8859_2_1987', 'cyrillic', 'ibm861', 'ms950', 'ibm437', '866', 'csibm863', '932', 'iso_8859_14', 'cskoi8r', 'csptcp154', '852', 'maclatin2', 'sjis', 'korean', '865', 'u32', 'csshiftjis', 'dbcs', 'csibm037', 'csibm1026', 'bz2', 'quopri', '860', '1255', '861', 'iso_ir_127', 'iso_celtic', 'chinese', 'l8', '1258', 'u_jis', 'cspc850multilingual', 'iso_2022_jp_2', 'greek8', 'csibm861', '646', 'unicode_1_1_utf_7', 'ibm862', 'latin2', 'ecma_118', 'csisolatinarabic', 'zlib', 'iso2022jp_3', 'ksx1001', '858', 'hkscs', 'shiftjisx0213', 'base64', 'ibm857', 'maccentraleurope', 'latin7', 'ruscii', 'cp_is', 'iso_ir_101', 'us_ascii', 'hebrew', 'ansi_x3.4_1986', 'csiso2022jp', 'iso_8859_15', 'ibm860', 'ebcdic_cp_us', 'x_mac_simp_chinese', 'csibm855', '1250', 'maciceland', 'iso_ir_148', 'iso2022jp', 'u16', 'u7', 's_jisx0213', 'iso_8859_6_1987', 'csisolatinhebrew', 'csibm424', 'quoted_printable', 'utf_16le', 'tis260', 'utf', 'x_mac_trad_chinese', '1256', 'cp866u', 'jisx0213', 'csiso58gb231280', 'windows_1250', 'cp1361', 'kz_1048', 'asmo_708', 'utf_16be', 'ecma_114', 'eucjis2004', 'x_mac_japanese', 'utf8', 'iso_ir_6', 'cp_gr', '037', 'big5_tw', 'eucgb2312_cn', 'iso_2022_jp_3', 'euc_cn', 'iso_8859_13', 'iso_8859_5_1988', 'maccyrillic', 'ks_c_5601_1987', 'greek', 'ibm869', 'roman8', 'csibm500', 'ujis', 'arabic', 
'strk1048_2002', '424', 'iso_8859_11_2001', 'l5', 'iso_646.irv_1991', '869', 'ibm855', 'eucjisx0213', 'latin1', 'csibm866', 'ibm864', 'big5_hkscs', 'sjis_2004', 'us', 'iso_8859_7', 'macturkish', 'iso_2022_jp_2004', '437', 'windows_1255', 's_jis_2004', 's_jis', '1257', 'ebcdic_cp_wt', 'iso2022jp_2004', 'ms949', 'utf32', 'shiftjis', 'latin', 'windows_1251', '1125', 'ks_x_1001', 'iso_8859_10_1992', 'mskanji', 'cyrillic_asian', 'ibm273', 'tis620', '1026', 'csiso2022kr', 'cspc775baltic', 'iso_ir_58', 'latin8', 'ibm424', 'iso_ir_126', 'ansi_x3.4_1968', 'windows_1257', 'windows_1252', '949', 'base_64', 'ms936', 'csisolatin2', 'utf7', 'iso646_us', 'macroman', '1253', '862', 'iso_8859_1_1987', 'csibm860', 'gb2312_80', 'latin10', 'ksc5601', 'iso_8859_10', 'utf8_ucs4', 'csisolatin4', 'ebcdic_cp_be', 'iso_8859_1', 'hzgb', 'ansi_x3_4_1968', 'ks_c_5601', 'l3', 'cspc8codepage437', 'iso_8859_7_1987', '8859', 'ibm500', 'ibm1026', 'iso_8859_6', 'csibm865', 'ibm866', 'windows_1258', 'iso_ir_138', 'l4', 'utf_32le', 'iso_8859_11', 'thai', '864', 'euc_jis2004', 'cp936', '1251', 'zip', 'unicodebigunmarked', 'csHPRoman8', 'csibm858', 'utf16', '936', 'ibm037', 'iso_8859_8_1988', '857', 'csibm869', 'ebcdic_cp_he', 'cp819', 'euccn', 'iso_8859_2', 'ms932', 'iso_2022_jp_1', 'iso_2022_kr', 'csisolatin6', 'iso_2022_jp', 'x_mac_korean', 'latin3', 'csbig5', 'hz_gb', 'csascii', 'u8', 'csisolatin5', 'csisolatincyrillic', 'ms_kanji', 'cspcp852', 'rk1048', 'iso2022jp_ext', 'csibm273', 'iso_2022_jp_ext', 'ibm858', 'ibm850', 'sjisx0213', 'tis_620_2529_1', 'l10', 'iso_ir_109', 'ibm1125', '1254', 'euckr', 'tis_620_0', 'l1', 'ibm819', 'iso2022kr', 'ibm367', '950', 'r8', 'hex', 'cp154', 'tis_620_2529_0', 'iso_8859_16', 'pt154', 'ebcdic_cp_ca', 'ibm1140', 'l6', 'csibm864', 'csisolatin1', 'csisolatin3', 'latin6', 'iso_8859_9_1989', 'iso_8859_3_1988', 'unicodelittleunmarked', 'macintosh', '273', 'latin9', 'iso_8859_4_1988', 'iso_8859_9', 'ebcdic_cp_nl', 'iso_ir_144'])
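
To check a single name rather than scan the whole list, one option is codecs.lookup(), which raises LookupError for unknown names. The sketch below is only an illustration; is_known_encoding is a hypothetical helper, not part of the standard library.

```python
import codecs
from encodings.aliases import aliases

# Each alias maps to the name of the codec module that implements it.
print(aliases["utf8"])
print(aliases["u8"])

# codecs.lookup() resolves any accepted name to a codec description.
print(codecs.lookup("utf8").name)


def is_known_encoding(name):
    """Hypothetical helper: True if Python can find a codec for name."""
    try:
        codecs.lookup(name)
        return True
    except LookupError:
        return False


print(is_known_encoding("utf-8"))
print(is_known_encoding("no-such-encoding"))
```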

Example 2: Code to encode the string

Python3




string = "¶"  # pilcrow sign, U+00B6 (a non-ASCII character)
 
# encode using the utf-8 scheme
print(string.encode('utf-8'))


Output:

b'\xc2\xb6'
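
The same character can produce different bytes depending on the encoding chosen. The following sketch compares three encodings for the pilcrow sign; the variable names are illustrative.

```python
string = "¶"

# One character, but a different byte sequence under each encoding.
utf8_bytes = string.encode("utf-8")        # two bytes
latin1_bytes = string.encode("latin-1")    # one byte
utf16_bytes = string.encode("utf-16-le")   # two bytes, little-endian, no BOM

print(len(string), utf8_bytes, latin1_bytes, utf16_bytes)

# Calling encode() with no arguments uses UTF-8.
print(string.encode() == utf8_bytes)
```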

Errors when using the wrong encoding scheme

Example 1: Python String encode() method raises a UnicodeEncodeError if the string contains a character that the chosen encoding cannot represent

Python3




string = "¶"  # pilcrow sign, U+00B6 (not representable in ASCII)
 
# trying to encode using the ascii scheme
print(string.encode('ascii'))


Output:

UnicodeEncodeError: 'ascii' codec can't encode character '\xb6' in position 0: ordinal not in range(128)
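
If the failure should be handled rather than stop the program, the exception can be caught. The sketch below reads the details carried by the exception and then falls back to UTF-8; the fallback is an assumption made for this example, not a general recommendation.

```python
string = "¶"

try:
    encoded = string.encode("ascii")
except UnicodeEncodeError as exc:
    # The exception records the codec, the offending slice and the reason.
    bad = exc.object[exc.start:exc.end]
    print(f"Cannot encode {bad!r} with {exc.encoding}: {exc.reason}")
    # Assumed fallback for this example: use UTF-8 instead.
    encoded = string.encode("utf-8")

print(encoded)
```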

Example 2: Using the ‘errors’ parameter to ignore errors while encoding

With the errors parameter set to ‘ignore’, Python String encode() method silently drops every character that the specified encoding cannot represent instead of raising an exception. The dropped characters cannot be recovered from the result.

Python3




string = "123-¶"  # ends with a non-ASCII character
 
# drop any characters that cannot be encoded as ASCII
print(string.encode('ascii', errors='ignore'))


Output:

b'123-'
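
Because ‘ignore’ discards data, decoding the result does not give back the original string. A small round-trip sketch, with illustrative variable names, makes the difference visible:

```python
string = "123-¶"

# Lossy: the pilcrow sign is dropped before the bytes are decoded again.
lossy = string.encode("ascii", errors="ignore").decode("ascii")

# Lossless: UTF-8 can represent every character in the string.
lossless = string.encode("utf-8").decode("utf-8")

print(repr(lossy))
print(repr(lossless))
print(lossless == string)
```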

