summaryrefslogtreecommitdiffstats
path: root/Python/Dependencies/future-0.18.2/docs/str_object.rst
diff options
context:
space:
mode:
authoryum <yum.food.vr@gmail.com>2022-12-17 17:26:16 -0800
committeryum <yum.food.vr@gmail.com>2022-12-17 17:26:16 -0800
commit4d836989720523cd0363927e3e066f56b9dc445c (patch)
treef7a9ff7cb50eda1ff29e91c78067dcc5e0ce6233 /Python/Dependencies/future-0.18.2/docs/str_object.rst
parentda754e9cf5b192239826aa1619e1ada3c98daa45 (diff)
Check in `future` package
I hit some issues installing Whisper and had to embed this package. I haven't taken the time to deeply understand what's going on. I think that embedded Python follows different rules about resolving module paths than regular system Python. Basically, `future`'s setup.py has a line like `import src`, where `src` is a module inside future (like `future/src/__init__.py`). This doesn't work unless we put that directory on the search path.
Diffstat (limited to 'Python/Dependencies/future-0.18.2/docs/str_object.rst')
-rw-r--r--Python/Dependencies/future-0.18.2/docs/str_object.rst99
1 files changed, 99 insertions, 0 deletions
diff --git a/Python/Dependencies/future-0.18.2/docs/str_object.rst b/Python/Dependencies/future-0.18.2/docs/str_object.rst
new file mode 100644
index 0000000..4c5257a
--- /dev/null
+++ b/Python/Dependencies/future-0.18.2/docs/str_object.rst
@@ -0,0 +1,99 @@
+.. _str-object:
+
+str
+-----
+
+The :class:`str` object in Python 3 is quite similar but not identical to the
+Python 2 :class:`unicode` object.
+
+The major difference is the stricter type-checking of Py3's ``str`` that
+enforces a distinction between unicode strings and byte-strings, such as when
+comparing, concatenating, joining, or replacing parts of strings.
+
+There are also other differences, such as the ``repr`` of unicode strings in
+Py2 having a ``u'...'`` prefix, versus simply ``'...'``, and the removal of
+the :func:`str.decode` method in Py3.
+
+:mod:`future` contains a :class:`newstr`` type that is a backport of the
+:mod:`str` object from Python 3. This inherits from the Python 2
+:class:`unicode` class but has customizations to improve compatibility with
+Python 3's :class:`str` object. You can use it as follows::
+
+ >>> from __future__ import unicode_literals
+ >>> from builtins import str
+
+On Py2, this gives us::
+
+ >>> str
+ future.types.newstr.newstr
+
+(On Py3, it is simply the usual builtin :class:`str` object.)
+
+Then, for example, the following code has the same effect on Py2 as on Py3::
+
+ >>> s = str(u'ABCD')
+ >>> assert s != b'ABCD'
+ >>> assert isinstance(s.encode('utf-8'), bytes)
+ >>> assert isinstance(b.decode('utf-8'), str)
+
+ These raise TypeErrors:
+
+ >>> bytes(b'B') in s
+ Traceback (most recent call last):
+ File "<stdin>", line 1, in <module>
+ TypeError: 'in <string>' requires string as left operand, not <type 'str'>
+
+ >>> s.find(bytes(b'A'))
+ Traceback (most recent call last):
+ File "<stdin>", line 1, in <module>
+ TypeError: argument can't be <type 'str'>
+
+Various other operations that mix strings and bytes or other types are
+permitted on Py2 with the :class:`newstr` class even though they
+are illegal with Python 3. For example::
+
+ >>> s2 = b'/' + str('ABCD')
+ >>> s2
+ '/ABCD'
+ >>> type(s2)
+ future.types.newstr.newstr
+
+This is allowed for compatibility with parts of the Python 2 standard
+library and various third-party libraries that mix byte-strings and unicode
+strings loosely. One example is ``os.path.join`` on Python 2, which
+attempts to add the byte-string ``b'/'`` to its arguments, whether or not
+they are unicode. (See ``posixpath.py``.) Another example is the
+:func:`escape` function in Django 1.4's :mod:`django.utils.html`.
+
+
+.. For example, this is permissible on Py2::
+..
+.. >>> u'u' > 10
+.. True
+..
+.. >>> u'u' <= b'u'
+.. True
+..
+.. On Py3, these raise TypeErrors.
+
+In most other ways, these :class:`builtins.str` objects on Py2 have the
+same behaviours as Python 3's :class:`str`::
+
+ >>> s = str('ABCD')
+ >>> assert repr(s) == 'ABCD' # consistent repr with Py3 (no u prefix)
+ >>> assert list(s) == ['A', 'B', 'C', 'D']
+ >>> assert s.split('B') == ['A', 'CD']
+
+
+The :class:`str` type from :mod:`builtins` also provides support for the
+``surrogateescape`` error handler on Python 2.x. Here is an example that works
+identically on Python 2.x and 3.x::
+
+ >>> from builtins import str
+ >>> s = str(u'\udcff')
+ >>> s.encode('utf-8', 'surrogateescape')
+ b'\xff'
+
+This feature is in alpha. Please leave feedback `here
+<https://github.com/PythonCharmers/python-future/issues>`_ about whether this
+works for you.