Why Python?
In Python, we represent text in the form of string [1], which are objects of the str
[2] class. They are an immutable sequence of Unicode code points or characters. It is important to make a careful distinction here, though; in Python 3, all strings are by default Unicode, but in Python 2, the str
class is limited to ASCII code, and there is a Unicode class to deal with Unicodes.
Unicode is merely an encoding language or a way we handle text. For example, the Unicode value for the letter Z is U+005A. There are many encoding types, and historically in Python, developers were expected to deal with different encodings on their own, with all the low-level action happening in bytes. In fact, the shift in the way Python handles Unicode has led to a lot of discussions [3], criticism [4], and praise [5] within the community. It also remains an important point of contention when we are porting code from Python 2 and Python 3.
We said earlier on that the low-level action was going on in...