diff options
| author | yum <yum.food.vr@gmail.com> | 2023-01-01 21:05:27 -0800 |
|---|---|---|
| committer | yum <yum.food.vr@gmail.com> | 2023-01-01 21:44:45 -0800 |
| commit | e25bdba3a3a53b09be5269d8b065c13b73ab55c3 (patch) | |
| tree | 1d1dc1d94cde92c2f4f8ce86017395054787515d /FOSS/Python/Dependencies/future-0.18.2/docs/unicode_literals.rst | |
| parent | 0d408cc812a094a708edbe4baf536e928731cfc3 (diff) | |
Embed git in package
package.ps1 fetches PortableGit and embeds it in the package. This
eliminates all but one runtime dependency (MSVC++ Redistributable).
* Move Python into a new FOSS folder.
Diffstat (limited to 'FOSS/Python/Dependencies/future-0.18.2/docs/unicode_literals.rst')
| -rw-r--r-- | FOSS/Python/Dependencies/future-0.18.2/docs/unicode_literals.rst | 197 |
1 files changed, 197 insertions, 0 deletions
diff --git a/FOSS/Python/Dependencies/future-0.18.2/docs/unicode_literals.rst b/FOSS/Python/Dependencies/future-0.18.2/docs/unicode_literals.rst new file mode 100644 index 0000000..7252e4d --- /dev/null +++ b/FOSS/Python/Dependencies/future-0.18.2/docs/unicode_literals.rst @@ -0,0 +1,197 @@ +.. _unicode-literals: + +Should I import unicode_literals? +--------------------------------- + +The ``future`` package can be used with or without ``unicode_literals`` +imports. + +In general, it is more compelling to use ``unicode_literals`` when +back-porting new or existing Python 3 code to Python 2/3 than when porting +existing Python 2 code to 2/3. In the latter case, explicitly marking up all +unicode string literals with ``u''`` prefixes would help to avoid +unintentionally changing the existing Python 2 API. However, if changing the +existing Python 2 API is not a concern, using ``unicode_literals`` may speed up +the porting process. + +This section summarizes the benefits and drawbacks of using +``unicode_literals``. To avoid confusion, we recommend using +``unicode_literals`` everywhere across a code-base or not at all, instead of +turning on for only some modules. + + + +Benefits +~~~~~~~~ + +1. String literals are unicode on Python 3. Making them unicode on Python 2 + leads to more consistency of your string types across the two + runtimes. This can make it easier to understand and debug your code. + +2. Code without ``u''`` prefixes is cleaner, one of the claimed advantages + of Python 3. Even though some unicode strings would require a function + call to invert them to native strings for some Python 2 APIs (see + :ref:`stdlib-incompatibilities`), the incidence of these function calls + would usually be much lower than the incidence of ``u''`` prefixes for text + strings in the absence of ``unicode_literals``. + +3. The diff when porting to a Python 2/3-compatible codebase may be smaller, + less noisy, and easier to review with ``unicode_literals`` than if an + explicit ``u''`` prefix is added to every unadorned string literal. + +4. If support for Python 3.2 is required (e.g. for Ubuntu 12.04 LTS or + Debian wheezy), ``u''`` prefixes are a ``SyntaxError``, making + ``unicode_literals`` the only option for a Python 2/3 compatible + codebase. [However, note that ``future`` doesn't support Python 3.0-3.2.] + + +Drawbacks +~~~~~~~~~ + +1. Adding ``unicode_literals`` to a module amounts to a "global flag day" for + that module, changing the data types of all strings in the module at once. + Cautious developers may prefer an incremental approach. (See + `here <http://lwn.net/Articles/165039/>`_ for an excellent article + describing the superiority of an incremental patch-set in the the case + of the Linux kernel.) + +.. This is a larger-scale change than adding explicit ``u''`` prefixes to +.. all strings that should be Unicode. + +2. Changing to ``unicode_literals`` will likely introduce regressions on + Python 2 that require an initial investment of time to find and fix. The + APIs may be changed in subtle ways that are not immediately obvious. + + An example on Python 2:: + + ### Module: mypaths.py + + ... + def unix_style_path(path): + return path.replace('\\', '/') + ... + + ### User code: + + >>> path1 = '\\Users\\Ed' + >>> unix_style_path(path1) + '/Users/ed' + + On Python 2, adding a ``unicode_literals`` import to ``mypaths.py`` would + change the return type of the ``unix_style_path`` function from ``str`` to + ``unicode`` in the user code, which is difficult to anticipate and probably + unintended. + + The counter-argument is that this code is broken, in a portability + sense; we see this from Python 3 raising a ``TypeError`` upon passing the + function a byte-string. The code needs to be changed to make explicit + whether the ``path`` argument is to be a byte string or a unicode string. + +3. With ``unicode_literals`` in effect, there is no way to specify a native + string literal (``str`` type on both platforms). This can be worked around as follows:: + + >>> from __future__ import unicode_literals + >>> ... + >>> from future.utils import bytes_to_native_str as n + + >>> s = n(b'ABCD') + >>> s + 'ABCD' # on both Py2 and Py3 + + although this incurs a performance penalty (a function call and, on Py3, + a ``decode`` method call.) + + This is a little awkward because various Python library APIs (standard + and non-standard) require a native string to be passed on both Py2 + and Py3. (See :ref:`stdlib-incompatibilities` for some examples. WSGI + dictionaries are another.) + +3. If a codebase already explicitly marks up all text with ``u''`` prefixes, + and if support for Python versions 3.0-3.2 can be dropped, then + removing the existing ``u''`` prefixes and replacing these with + ``unicode_literals`` imports (the porting approach Django used) would + introduce more noise into the patch and make it more difficult to review. + However, note that the ``futurize`` script takes advantage of PEP 414 and + does not remove explicit ``u''`` prefixes that already exist. + +4. Turning on ``unicode_literals`` converts even docstrings to unicode, but + Pydoc breaks with unicode docstrings containing non-ASCII characters for + Python versions < 2.7.7. (`Fix + committed <http://bugs.python.org/issue1065986#msg207403>`_ in Jan 2014.):: + + >>> def f(): + ... u"Author: Martin von Löwis" + + >>> help(f) + + /Users/schofield/Install/anaconda/python.app/Contents/lib/python2.7/pydoc.pyc in pipepager(text, cmd) + 1376 pipe = os.popen(cmd, 'w') + 1377 try: + -> 1378 pipe.write(text) + 1379 pipe.close() + 1380 except IOError: + + UnicodeEncodeError: 'ascii' codec can't encode character u'\xf6' in position 71: ordinal not in range(128) + +See `this Stack Overflow thread +<http://stackoverflow.com/questions/809796/any-gotchas-using-unicode-literals-in-python-2-6>`_ +for other gotchas. + + +Others' perspectives +~~~~~~~~~~~~~~~~~~~~ + +In favour of ``unicode_literals`` +********************************* + +Django recommends importing ``unicode_literals`` as its top `porting tip <https://docs.djangoproject.com/en/dev/topics/python3/#unicode-literals>`_ for +migrating Django extension modules to Python 3. The following `quote +<https://groups.google.com/forum/#!topic/django-developers/2ddIWdicbNY>`_ is +from Aymeric Augustin on 23 August 2012 regarding why he chose +``unicode_literals`` for the port of Django to a Python 2/3-compatible +codebase.: + + "... I'd like to explain why this PEP [PEP 414, which allows explicit + ``u''`` prefixes for unicode literals on Python 3.3+] is at odds with + the porting philosophy I've applied to Django, and why I would have + vetoed taking advantage of it. + + "I believe that aiming for a Python 2 codebase with Python 3 + compatibility hacks is a counter-productive way to port a project. You + end up with all the drawbacks of Python 2 (including the legacy `u` + prefixes) and none of the advantages Python 3 (especially the sane + string handling). + + "Working to write Python 3 code, with legacy compatibility for Python + 2, is much more rewarding. Of course it takes more effort, but the + results are much cleaner and much more maintainable. It's really about + looking towards the future or towards the past. + + "I understand the reasons why PEP 414 was proposed and why it was + accepted. It makes sense for legacy software that is minimally + maintained. I hope nobody puts Django in this category!" + + +Against ``unicode_literals`` +**************************** + + "There are so many subtle problems that ``unicode_literals`` causes. + For instance lots of people accidentally introduce unicode into + filenames and that seems to work, until they are using it on a system + where there are unicode characters in the filesystem path." + + -- Armin Ronacher + + "+1 from me for avoiding the unicode_literals future, as it can have + very strange side effects in Python 2.... This is one of the key + reasons I backed Armin's PEP 414." + + -- Nick Coghlan + + "Yeah, one of the nuisances of the WSGI spec is that the header values + IIRC are the str or StringType on both py2 and py3. With + unicode_literals this causes hard-to-spot bugs, as some WSGI servers + might be more tolerant than others, but usually using unicode in python + 2 for WSGI headers will cause the response to fail." + + -- Antti Haapala |
