diff options
Diffstat (limited to 'FOSS/Python/Dependencies/future-0.18.2/docs/bytes_object.rst')
| -rw-r--r-- | FOSS/Python/Dependencies/future-0.18.2/docs/bytes_object.rst | 80 |
1 files changed, 80 insertions, 0 deletions
diff --git a/FOSS/Python/Dependencies/future-0.18.2/docs/bytes_object.rst b/FOSS/Python/Dependencies/future-0.18.2/docs/bytes_object.rst new file mode 100644 index 0000000..110280a --- /dev/null +++ b/FOSS/Python/Dependencies/future-0.18.2/docs/bytes_object.rst @@ -0,0 +1,80 @@ +.. _bytes-object: + +bytes +----- + +Handling ``bytes`` consistently and correctly has traditionally been one +of the most difficult tasks in writing a Py2/3 compatible codebase. This +is because the Python 2 :class:`bytes` object is simply an alias for +Python 2's :class:`str`, rather than a true implementation of the Python +3 :class:`bytes` object, which is substantially different. + +:mod:`future` contains a backport of the :mod:`bytes` object from Python 3 +which passes most of the Python 3 tests for :mod:`bytes`. (See +``tests/test_future/test_bytes.py`` in the source tree.) You can use it as +follows:: + + >>> from builtins import bytes + >>> b = bytes(b'ABCD') + +On Py3, this is simply the builtin :class:`bytes` object. On Py2, this +object is a subclass of Python 2's :class:`str` that enforces the same +strict separation of unicode strings and byte strings as Python 3's +:class:`bytes` object:: + + >>> b + u'EFGH' # TypeError + Traceback (most recent call last): + File "<stdin>", line 1, in <module> + TypeError: argument can't be unicode string + + >>> bytes(b',').join([u'Fred', u'Bill']) + Traceback (most recent call last): + File "<stdin>", line 1, in <module> + TypeError: sequence item 0: expected bytes, found unicode string + + >>> b == u'ABCD' + False + + >>> b < u'abc' + Traceback (most recent call last): + File "<stdin>", line 1, in <module> + TypeError: unorderable types: bytes() and <type 'unicode'> + + +In most other ways, these :class:`bytes` objects have identical +behaviours to Python 3's :class:`bytes`:: + + b = bytes(b'ABCD') + assert list(b) == [65, 66, 67, 68] + assert repr(b) == "b'ABCD'" + assert b.split(b'B') == [b'A', b'CD'] + +Currently the easiest way to ensure identical behaviour of byte-strings +in a Py2/3 codebase is to wrap all byte-string literals ``b'...'`` in a +:func:`~bytes` call as follows:: + + from builtins import bytes + + # ... + + b = bytes(b'This is my bytestring') + + # ... + +This is not perfect, but it is superior to manually debugging and fixing +code incompatibilities caused by the many differences between Py3 bytes +and Py2 strings. + + +The :class:`bytes` type from :mod:`builtins` also provides support for the +``surrogateescape`` error handler on Python 2.x. Here is an example that works +identically on Python 2.x and 3.x:: + + >>> from builtins import bytes + >>> b = bytes(b'\xff') + >>> b.decode('utf-8', 'surrogateescape') + '\udcc3' + +This feature is in alpha. Please leave feedback `here +<https://github.com/PythonCharmers/python-future/issues>`_ about whether this +works for you. |
