Diffstat (limited to 'Python/Dependencies/future-0.18.2/docs/unicode_literals.rst')
-rw-r--r-- Python/Dependencies/future-0.18.2/docs/unicode_literals.rst | 197
1 files changed, 0 insertions, 197 deletions
diff --git a/Python/Dependencies/future-0.18.2/docs/unicode_literals.rst b/Python/Dependencies/future-0.18.2/docs/unicode_literals.rst
deleted file mode 100644
index 7252e4d..0000000
--- a/Python/Dependencies/future-0.18.2/docs/unicode_literals.rst
+++ /dev/null
@@ -1,197 +0,0 @@
-.. _unicode-literals:
-
-Should I import unicode_literals?
----------------------------------
-
-The ``future`` package can be used with or without ``unicode_literals``
-imports.
-
-In general, it is more compelling to use ``unicode_literals`` when
-back-porting new or existing Python 3 code to Python 2/3 than when porting
-existing Python 2 code to 2/3. In the latter case, explicitly marking up all
-unicode string literals with ``u''`` prefixes would help to avoid
-unintentionally changing the existing Python 2 API. However, if changing the
-existing Python 2 API is not a concern, using ``unicode_literals`` may speed up
-the porting process.
-
-This section summarizes the benefits and drawbacks of using
-``unicode_literals``. To avoid confusion, we recommend using
-``unicode_literals`` everywhere across a code-base or not at all, instead of
-turning it on for only some modules.
-
-
-
-Benefits
-~~~~~~~~
-
-1. String literals are unicode on Python 3. Making them unicode on Python 2
- leads to more consistency of your string types across the two
- runtimes. This can make it easier to understand and debug your code.
-
-2. Code without ``u''`` prefixes is cleaner, one of the claimed advantages
- of Python 3. Even though some unicode strings would require a function
- call to convert them to native strings for some Python 2 APIs (see
- :ref:`stdlib-incompatibilities`), the incidence of these function calls
- would usually be much lower than the incidence of ``u''`` prefixes for text
- strings in the absence of ``unicode_literals``.
-
-3. The diff when porting to a Python 2/3-compatible codebase may be smaller,
- less noisy, and easier to review with ``unicode_literals`` than if an
- explicit ``u''`` prefix is added to every unadorned string literal.
-
-4. If support for Python 3.2 is required (e.g. for Ubuntu 12.04 LTS or
- Debian wheezy), ``u''`` prefixes are a ``SyntaxError``, making
- ``unicode_literals`` the only option for a Python 2/3 compatible
- codebase. [However, note that ``future`` doesn't support Python 3.0-3.2.]
-
-
-Drawbacks
-~~~~~~~~~
-
-1. Adding ``unicode_literals`` to a module amounts to a "global flag day" for
- that module, changing the data types of all strings in the module at once.
- Cautious developers may prefer an incremental approach. (See
- `here <http://lwn.net/Articles/165039/>`_ for an excellent article
- describing the superiority of an incremental patch-set in the case
- of the Linux kernel.)
-
-.. This is a larger-scale change than adding explicit ``u''`` prefixes to
-.. all strings that should be Unicode.
-
-2. Changing to ``unicode_literals`` will likely introduce regressions on
- Python 2 that require an initial investment of time to find and fix. The
- APIs may be changed in subtle ways that are not immediately obvious.
-
- An example on Python 2::
-
- ### Module: mypaths.py
-
- ...
- def unix_style_path(path):
-     return path.replace('\\', '/')
- ...
-
- ### User code:
-
- >>> path1 = '\\Users\\Ed'
- >>> unix_style_path(path1)
- '/Users/Ed'
-
- On Python 2, adding a ``unicode_literals`` import to ``mypaths.py`` would
- change the return type of the ``unix_style_path`` function from ``str`` to
- ``unicode`` in the user code, which is difficult to anticipate and probably
- unintended.
-
- The counter-argument is that this code is broken, in a portability
- sense; we see this from Python 3 raising a ``TypeError`` upon passing the
- function a byte string. The code needs to be changed to make explicit
- whether the ``path`` argument is to be a byte string or a unicode string.
-
-3. With ``unicode_literals`` in effect, there is no way to specify a native
- string literal (``str`` type on both platforms). This can be worked around as follows::
-
- >>> from __future__ import unicode_literals
- >>> ...
- >>> from future.utils import bytes_to_native_str as n
-
- >>> s = n(b'ABCD')
- >>> s
- 'ABCD' # on both Py2 and Py3
-
- although this incurs a performance penalty (a function call and, on Py3,
- a ``decode`` method call).
-
- This is a little awkward because various Python library APIs (standard
- and non-standard) require a native string to be passed on both Py2
- and Py3. (See :ref:`stdlib-incompatibilities` for some examples. WSGI
- dictionaries are another.)
-
-4. If a codebase already explicitly marks up all text with ``u''`` prefixes,
- and if support for Python versions 3.0-3.2 can be dropped, then
- removing the existing ``u''`` prefixes and replacing these with
- ``unicode_literals`` imports (the porting approach Django used) would
- introduce more noise into the patch and make it more difficult to review.
- However, note that the ``futurize`` script takes advantage of PEP 414 and
- does not remove explicit ``u''`` prefixes that already exist.
-
-5. Turning on ``unicode_literals`` converts even docstrings to unicode, but
- Pydoc breaks with unicode docstrings containing non-ASCII characters for
- Python versions < 2.7.7. (`Fix
- committed <http://bugs.python.org/issue1065986#msg207403>`_ in Jan 2014.)::
-
- >>> def f():
- ...     u"Author: Martin von Löwis"
-
- >>> help(f)
-
- /Users/schofield/Install/anaconda/python.app/Contents/lib/python2.7/pydoc.pyc in pipepager(text, cmd)
- 1376 pipe = os.popen(cmd, 'w')
- 1377 try:
- -> 1378 pipe.write(text)
- 1379 pipe.close()
- 1380 except IOError:
-
- UnicodeEncodeError: 'ascii' codec can't encode character u'\xf6' in position 71: ordinal not in range(128)
-
-See `this Stack Overflow thread
-<http://stackoverflow.com/questions/809796/any-gotchas-using-unicode-literals-in-python-2-6>`_
-for other gotchas.
-
-
-Others' perspectives
-~~~~~~~~~~~~~~~~~~~~
-
-In favour of ``unicode_literals``
-*********************************
-
-Django recommends importing ``unicode_literals`` as its top `porting tip <https://docs.djangoproject.com/en/dev/topics/python3/#unicode-literals>`_ for
-migrating Django extension modules to Python 3. The following `quote
-<https://groups.google.com/forum/#!topic/django-developers/2ddIWdicbNY>`_ is
-from Aymeric Augustin on 23 August 2012 regarding why he chose
-``unicode_literals`` for the port of Django to a Python 2/3-compatible
-codebase:
-
- "... I'd like to explain why this PEP [PEP 414, which allows explicit
- ``u''`` prefixes for unicode literals on Python 3.3+] is at odds with
- the porting philosophy I've applied to Django, and why I would have
- vetoed taking advantage of it.
-
- "I believe that aiming for a Python 2 codebase with Python 3
- compatibility hacks is a counter-productive way to port a project. You
- end up with all the drawbacks of Python 2 (including the legacy `u`
- prefixes) and none of the advantages Python 3 (especially the sane
- string handling).
-
- "Working to write Python 3 code, with legacy compatibility for Python
- 2, is much more rewarding. Of course it takes more effort, but the
- results are much cleaner and much more maintainable. It's really about
- looking towards the future or towards the past.
-
- "I understand the reasons why PEP 414 was proposed and why it was
- accepted. It makes sense for legacy software that is minimally
- maintained. I hope nobody puts Django in this category!"
-
-
-Against ``unicode_literals``
-****************************
-
- "There are so many subtle problems that ``unicode_literals`` causes.
- For instance lots of people accidentally introduce unicode into
- filenames and that seems to work, until they are using it on a system
- where there are unicode characters in the filesystem path."
-
- -- Armin Ronacher
-
- "+1 from me for avoiding the unicode_literals future, as it can have
- very strange side effects in Python 2.... This is one of the key
- reasons I backed Armin's PEP 414."
-
- -- Nick Coghlan
-
- "Yeah, one of the nuisances of the WSGI spec is that the header values
- IIRC are the str or StringType on both py2 and py3. With
- unicode_literals this causes hard-to-spot bugs, as some WSGI servers
- might be more tolerant than others, but usually using unicode in python
- 2 for WSGI headers will cause the response to fail."
-
- -- Antti Haapala