summaryrefslogtreecommitdiffstats
path: root/Python/Dependencies/future-0.18.2/docs/bytes_object.rst
diff options
context:
space:
mode:
Diffstat (limited to 'Python/Dependencies/future-0.18.2/docs/bytes_object.rst')
-rw-r--r--Python/Dependencies/future-0.18.2/docs/bytes_object.rst80
1 files changed, 80 insertions, 0 deletions
diff --git a/Python/Dependencies/future-0.18.2/docs/bytes_object.rst b/Python/Dependencies/future-0.18.2/docs/bytes_object.rst
new file mode 100644
index 0000000..110280a
--- /dev/null
+++ b/Python/Dependencies/future-0.18.2/docs/bytes_object.rst
@@ -0,0 +1,80 @@
+.. _bytes-object:
+
+bytes
+-----
+
+Handling ``bytes`` consistently and correctly has traditionally been one
+of the most difficult tasks in writing a Py2/3 compatible codebase. This
+is because the Python 2 :class:`bytes` object is simply an alias for
+Python 2's :class:`str`, rather than a true implementation of the Python
+3 :class:`bytes` object, which is substantially different.
+
+:mod:`future` contains a backport of the :mod:`bytes` object from Python 3
+which passes most of the Python 3 tests for :mod:`bytes`. (See
+``tests/test_future/test_bytes.py`` in the source tree.) You can use it as
+follows::
+
+ >>> from builtins import bytes
+ >>> b = bytes(b'ABCD')
+
+On Py3, this is simply the builtin :class:`bytes` object. On Py2, this
+object is a subclass of Python 2's :class:`str` that enforces the same
+strict separation of unicode strings and byte strings as Python 3's
+:class:`bytes` object::
+
+ >>> b + u'EFGH' # TypeError
+ Traceback (most recent call last):
+ File "<stdin>", line 1, in <module>
+ TypeError: argument can't be unicode string
+
+ >>> bytes(b',').join([u'Fred', u'Bill'])
+ Traceback (most recent call last):
+ File "<stdin>", line 1, in <module>
+ TypeError: sequence item 0: expected bytes, found unicode string
+
+ >>> b == u'ABCD'
+ False
+
+ >>> b < u'abc'
+ Traceback (most recent call last):
+ File "<stdin>", line 1, in <module>
+ TypeError: unorderable types: bytes() and <type 'unicode'>
+
+
+In most other ways, these :class:`bytes` objects have identical
+behaviours to Python 3's :class:`bytes`::
+
+ b = bytes(b'ABCD')
+ assert list(b) == [65, 66, 67, 68]
+ assert repr(b) == "b'ABCD'"
+ assert b.split(b'B') == [b'A', b'CD']
+
+Currently the easiest way to ensure identical behaviour of byte-strings
+in a Py2/3 codebase is to wrap all byte-string literals ``b'...'`` in a
+:func:`~bytes` call as follows::
+
+ from builtins import bytes
+
+ # ...
+
+ b = bytes(b'This is my bytestring')
+
+ # ...
+
+This is not perfect, but it is superior to manually debugging and fixing
+code incompatibilities caused by the many differences between Py3 bytes
+and Py2 strings.
+
+
+The :class:`bytes` type from :mod:`builtins` also provides support for the
+``surrogateescape`` error handler on Python 2.x. Here is an example that works
+identically on Python 2.x and 3.x::
+
+ >>> from builtins import bytes
+ >>> b = bytes(b'\xff')
+ >>> b.decode('utf-8', 'surrogateescape')
+ '\udcc3'
+
+This feature is in alpha. Please leave feedback `here
+<https://github.com/PythonCharmers/python-future/issues>`_ about whether this
+works for you.