- Author: Guido van Rossum <guido at python.org>,
  Barry Warsaw <barry at python.org>,
  Nick Coghlan <ncoghlan at gmail.com>
- Status: Active
- Type: Process
- Created: 05-Jul-2001
- Post-History: 05-Jul-2001, 01-Aug-2013
- Introduction
- A Foolish Consistency is the Hobgoblin of Little Minds
- Code Lay-out
- Indentation
- Tabs or Spaces?
- Maximum Line Length
- Should a Line Break Before or After a Binary Operator?
- Blank Lines
- Source File Encoding
- Imports
- Module Level Dunder Names
- String Quotes
- Whitespace in Expressions and Statements
- Pet Peeves
- Other Recommendations
- When to Use Trailing Commas
- Comments
- Block Comments
- Inline Comments
- Documentation Strings
- Naming Conventions
- Overriding Principle
- Descriptive: Naming Styles
- Prescriptive: Naming Conventions
- Names to Avoid
- ASCII Compatibility
- Package and Module Names
- Class Names
- Type Variable Names
- Exception Names
- Global Variable Names
- Function and Variable Names
- Function and Method Arguments
- Method Names and Instance Variables
- Constants
- Designing for Inheritance
- Public and Internal Interfaces
- Programming Recommendations
- Function Annotations
- Variable Annotations
- References
- Copyright
Introduction
This document gives coding conventions for the Python code comprising
the standard library in the main Python distribution. Please see the
companion informational PEP describing style guidelines for the C code
in the C implementation of Python.
This document and PEP 257 (Docstring Conventions) were adapted from
Guido’s original Python Style Guide essay, with some additions from
Barry’s style guide [2].
This style guide evolves over time as additional conventions are
identified and past conventions are rendered obsolete by changes in
the language itself.
Many projects have their own coding style guidelines. In the event of any
conflicts, such project-specific guides take precedence for that project.
A Foolish Consistency is the Hobgoblin of Little Minds
One of Guido’s key insights is that code is read much more often than
it is written. The guidelines provided here are intended to improve
the readability of code and make it consistent across the wide
spectrum of Python code. As PEP 20 says, “Readability counts”.
A style guide is about consistency. Consistency with this style guide
is important. Consistency within a project is more important.
Consistency within one module or function is the most important.
However, know when to be inconsistent – sometimes style guide
recommendations just aren’t applicable. When in doubt, use your best
judgment. Look at other examples and decide what looks best. And
don’t hesitate to ask!
In particular: do not break backwards compatibility just to comply with
this PEP!
Some other good reasons to ignore a particular guideline:
- When applying the guideline would make the code less readable, even
  for someone who is used to reading code that follows this PEP.
- To be consistent with surrounding code that also breaks it (maybe
  for historic reasons) – although this is also an opportunity to
  clean up someone else’s mess (in true XP style).
- Because the code in question predates the introduction of the
  guideline and there is no other reason to be modifying that code.
- When the code needs to remain compatible with older versions of
  Python that don’t support the feature recommended by the style guide.
Code Lay-out
Indentation
Use 4 spaces per indentation level.
Continuation lines should align wrapped elements either vertically
using Python’s implicit line joining inside parentheses, brackets and
braces, or using a hanging indent [1]. When using a hanging
indent the following should be considered; there should be no
arguments on the first line and further indentation should be used to
clearly distinguish itself as a continuation line:
    # Correct:

    # Aligned with opening delimiter.
    foo = long_function_name(var_one, var_two,
                             var_three, var_four)

    # Add 4 spaces (an extra level of indentation) to distinguish arguments from the rest.
    def long_function_name(
            var_one, var_two, var_three,
            var_four):
        print(var_one)

    # Hanging indents should add a level.
    foo = long_function_name(
        var_one, var_two,
        var_three, var_four)

    # Wrong:

    # Arguments on first line forbidden when not using vertical alignment.
    foo = long_function_name(var_one, var_two,
        var_three, var_four)

    # Further indentation required as indentation is not distinguishable.
    def long_function_name(
        var_one, var_two, var_three,
        var_four):
        print(var_one)
The 4-space rule is optional for continuation lines.
Optional:
    # Hanging indents *may* be indented to other than 4 spaces.
    foo = long_function_name(
      var_one, var_two,
      var_three, var_four)
When the conditional part of an if-statement is long enough to require
that it be written across multiple lines, it’s worth noting that the
combination of a two character keyword (i.e. if), plus a single space,
plus an opening parenthesis creates a natural 4-space indent for the
subsequent lines of the multiline conditional. This can produce a visual
conflict with the indented suite of code nested inside the if-statement,
which would also naturally be indented to 4 spaces. This PEP takes no
explicit position on how (or whether) to further visually distinguish such
conditional lines from the nested suite inside the if-statement.
Acceptable options in this situation include, but are not limited to:
    # No extra indentation.
    if (this_is_one_thing and
        that_is_another_thing):
        do_something()

    # Add a comment, which will provide some distinction in editors
    # supporting syntax highlighting.
    if (this_is_one_thing and
        that_is_another_thing):
        # Since both conditions are true, we can frobnicate.
        do_something()

    # Add some extra indentation on the conditional continuation line.
    if (this_is_one_thing
            and that_is_another_thing):
        do_something()
(Also see the discussion of whether to break before or after binary
operators below.)
The closing brace/bracket/parenthesis on multiline constructs may
either line up under the first non-whitespace character of the last
line of list, as in:
    my_list = [
        1, 2, 3,
        4, 5, 6,
        ]
    result = some_function_that_takes_arguments(
        'a', 'b', 'c',
        'd', 'e', 'f',
        )
or it may be lined up under the first character of the line that
starts the multiline construct, as in:
    my_list = [
        1, 2, 3,
        4, 5, 6,
    ]
    result = some_function_that_takes_arguments(
        'a', 'b', 'c',
        'd', 'e', 'f',
    )
Tabs or Spaces?
Spaces are the preferred indentation method.
Tabs should be used solely to remain consistent with code that is
already indented with tabs.
Python disallows mixing tabs and spaces for indentation.
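As a rough illustration of that rule, the sketch below compiles two small source strings and reports whether the compiler rejects them with TabError. The helper name mixes_tabs_and_spaces and both sample strings are made up for this example; they are not part of this PEP.

```python
def mixes_tabs_and_spaces(source):
    """Return True if compiling `source` fails with TabError."""
    try:
        compile(source, "<example>", "exec")
    except TabError:
        return True
    return False


# Hypothetical snippets: the first indents one line with a tab and the
# next with eight spaces; the second uses four spaces throughout.
mixed_source = "if True:\n\tx = 1\n        y = 2\n"
spaces_source = "if True:\n    x = 1\n    y = 2\n"

print(mixes_tabs_and_spaces(mixed_source))
print(mixes_tabs_and_spaces(spaces_source))
```

Only the first string should be reported as inconsistent, since its two indented lines cannot be compared without assuming a tab width.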
Maximum Line Length
Limit all lines to a maximum of 79 characters.
For flowing long blocks of text with fewer structural restrictions
(docstrings or comments), the line length should be limited to 72
characters.
Limiting the required editor window width makes it possible to have
several files open side by side, and works well when using code
review tools that present the two versions in adjacent columns.
The default wrapping in most tools disrupts the visual structure of the
code, making it more difficult to understand. The limits are chosen to
avoid wrapping in editors with the window width set to 80, even
if the tool places a marker glyph in the final column when wrapping
lines. Some web based tools may not offer dynamic line wrapping at all.
Some teams strongly prefer a longer line length. For code maintained
exclusively or primarily by a team that can reach agreement on this
issue, it is okay to increase the line length limit up to 99 characters,
provided that comments and docstrings are still wrapped at 72
characters.
The Python standard library is conservative and requires limiting
lines to 79 characters (and docstrings/comments to 72).
The preferred way of wrapping long lines is by using Python’s implied
line continuation inside parentheses, brackets and braces. Long lines
can be broken over multiple lines by wrapping expressions in
parentheses. These should be used in preference to using a backslash
for line continuation.
Backslashes may still be appropriate at times. For example, long,
multiple with-statements could not use implicit continuation
before Python 3.10, so backslashes were acceptable for that case:
    with open('/path/to/some/file/you/want/to/read') as file_1, \
         open('/path/to/some/file/being/written', 'w') as file_2:
        file_2.write(file_1.read())
(See the previous discussion on multiline if-statements for further
thoughts on the indentation of such multiline with-statements.)
Another such case is with assert statements.
Make sure to indent the continued line appropriately.
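For code that only needs to run on Python 3.10 or later, the same pair of context managers can be wrapped in parentheses instead of using a backslash. The following is a minimal sketch under that version assumption; the function name copy_text and the temporary file names are hypothetical.

```python
import os
import tempfile


def copy_text(source_path, target_path):
    """Copy text between two files using a parenthesized with-statement."""
    # Requires Python 3.10+: the parentheses allow implicit continuation.
    with (
        open(source_path, encoding='utf-8') as source_file,
        open(target_path, 'w', encoding='utf-8') as target_file,
    ):
        target_file.write(source_file.read())


# Exercise the helper inside a throwaway directory.
with tempfile.TemporaryDirectory() as workdir:
    source_path = os.path.join(workdir, 'source.txt')
    target_path = os.path.join(workdir, 'target.txt')
    with open(source_path, 'w', encoding='utf-8') as handle:
        handle.write('spam and eggs')
    copy_text(source_path, target_path)
    with open(target_path, encoding='utf-8') as handle:
        copied_text = handle.read()
```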
Should a Line Break Before or After a Binary Operator?
For decades the recommended style was to break after binary operators.
But this can hurt readability in two ways: the operators tend to get
scattered across different columns on the screen, and each operator is
moved away from its operand and onto the previous line. Here, the eye
has to do extra work to tell which items are added and which are
subtracted:
    # Wrong:
    # operators sit far away from their operands
    income = (gross_wages +
              taxable_interest +
              (dividends - qualified_dividends) -
              ira_deduction -
              student_loan_interest)
To solve this readability problem, mathematicians and their publishers
follow the opposite convention. Donald Knuth explains the traditional
rule in his Computers and Typesetting series: “Although formulas
within a paragraph always break after binary operations and relations,
displayed formulas always break before binary operations” [3].
Following the tradition from mathematics usually results in more
readable code:
    # Correct:
    # easy to match operators with operands
    income = (gross_wages
              + taxable_interest
              + (dividends - qualified_dividends)
              - ira_deduction
              - student_loan_interest)
In Python code, it is permissible to break before or after a binary
operator, as long as the convention is consistent locally. For new
code Knuth’s style is suggested.
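To make the suggested style concrete, here is a small runnable sketch of the income calculation with each operator leading its operand. The numeric values are invented purely for illustration.

```python
# Hypothetical figures, chosen only so the example can run.
gross_wages = 50_000
taxable_interest = 300
dividends = 1_200
qualified_dividends = 800
ira_deduction = 2_000
student_loan_interest = 500

# Each line starts with its operator, so additions and subtractions
# line up in a single column next to the value they apply to.
income = (gross_wages
          + taxable_interest
          + (dividends - qualified_dividends)
          - ira_deduction
          - student_loan_interest)

print(income)
```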
Blank Lines
Surround top-level function and class definitions with two blank
lines.
Method definitions inside a class are surrounded by a single blank
line.
Extra blank lines may be used (sparingly) to separate groups of
related functions. Blank lines may be omitted between a bunch of
related one-liners (e.g. a set of dummy implementations).
Use blank lines in functions, sparingly, to indicate logical sections.
Python accepts the control-L (i.e. ^L) form feed character as
whitespace; many tools treat these characters as page separators, so
you may use them to separate pages of related sections of your file.
Note, some editors and web-based code viewers may not recognize
control-L as a form feed and will show another glyph in its place.
Source File Encoding
Code in the core Python distribution should always use UTF-8, and should not
have an encoding declaration.
In the standard library, non-UTF-8 encodings should be used only for
test purposes. Use non-ASCII characters sparingly, preferably only to
denote places and human names. If using non-ASCII characters as data,
avoid noisy Unicode characters like z̯̯͡a̧͎̺l̡͓̫g̹̲o̡̼̘ and byte order
marks.
All identifiers in the Python standard library MUST use ASCII-only
identifiers, and SHOULD use English words wherever feasible (in many
cases, abbreviations and technical terms are used which aren’t
English).
Open source projects with a global audience are encouraged to adopt a
similar policy.
Imports
- Imports should usually be on separate lines:
      # Correct:
      import os
      import sys
  It’s okay to say this though:
      # Correct:
      from subprocess import Popen, PIPE
- Imports are always put at the top of the file, just after any module
  comments and docstrings, and before module globals and constants.
  Imports should be grouped in the following order:
  - Standard library imports.
  - Related third party imports.
  - Local application/library specific imports.
  You should put a blank line between each group of imports.
- Absolute imports are recommended, as they are usually more readable
  and tend to be better behaved (or at least give better error
  messages) if the import system is incorrectly configured (such as
  when a directory inside a package ends up on sys.path):
      import mypkg.sibling
      from mypkg import sibling
      from mypkg.sibling import example
  However, explicit relative imports are an acceptable alternative to
  absolute imports, especially when dealing with complex package layouts
  where using absolute imports would be unnecessarily verbose:
      from . import sibling
      from .sibling import example
  Standard library code should avoid complex package layouts and always
  use absolute imports.
- When importing a class from a class-containing module, it’s usually
  okay to spell this:
      from myclass import MyClass
      from foo.bar.yourclass import YourClass
  If this spelling causes local name clashes, then spell them explicitly:
      import myclass
      import foo.bar.yourclass
  and use “myclass.MyClass” and “foo.bar.yourclass.YourClass”.
- Wildcard imports (from <module> import *) should be avoided, as
  they make it unclear which names are present in the namespace,
  confusing both readers and many automated tools. There is one
  defensible use case for a wildcard import, which is to republish an
  internal interface as part of a public API (for example, overwriting
  a pure Python implementation of an interface with the definitions
  from an optional accelerator module and exactly which definitions
  will be overwritten isn’t known in advance).
  When republishing names this way, the guidelines below regarding
  public and internal interfaces still apply.
Module Level Dunder Names
Module level “dunders” (i.e. names with two leading and two trailing
underscores) such as __all__, __author__, __version__,
etc. should be placed after the module docstring but before any import
statements except from __future__ imports. Python mandates that
future-imports must appear in the module before any other code except
docstrings:
    """This is the example module.

    This module does stuff.
    """

    from __future__ import barry_as_FLUFL

    __all__ = ['a', 'b', 'c']
    __version__ = '0.1'
    __author__ = 'Cardinal Biggles'

    import os
    import sys
String Quotes
In Python, single-quoted strings and double-quoted strings are the
same. This PEP does not make a recommendation for this. Pick a rule
and stick to it. When a string contains single or double quote
characters, however, use the other one to avoid backslashes in the
string. It improves readability.
For triple-quoted strings, always use double quote characters to be
consistent with the docstring convention in PEP 257.
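A short sketch of the quoting advice: the two spellings below produce the same string, but the second avoids the backslash by switching quote characters. The variable names and sample strings are made up for this example.

```python
# Same text, two spellings; the second needs no escape.
with_backslash = 'don\'t panic'
without_backslash = "don't panic"

# When the text itself contains double quotes, single quotes on the
# outside keep it free of backslashes.
html_attribute = '<a href="index.html">home</a>'

print(with_backslash == without_backslash)
```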
Whitespace in Expressions and Statements
Pet Peeves
Avoid extraneous whitespace in the following situations:
- Immediately inside parentheses, brackets or braces:
      # Correct:
      spam(ham[1], {eggs: 2})

      # Wrong:
      spam( ham[ 1 ], { eggs: 2 } )
- Between a trailing comma and a following close parenthesis:
- Immediately before a comma, semicolon, or colon:
      # Correct:
      if x == 4: print(x, y); x, y = y, x

      # Wrong:
      if x == 4 : print(x , y) ; x , y = y , x
- However, in a slice the colon acts like a binary operator, and
  should have equal amounts on either side (treating it as the
  operator with the lowest priority). In an extended slice, both
  colons must have the same amount of spacing applied. Exception:
  when a slice parameter is omitted, the space is omitted:
      # Correct:
      ham[1:9], ham[1:9:3], ham[:9:3], ham[1::3], ham[1:9:]
      ham[lower:upper], ham[lower:upper:], ham[lower::step]
      ham[lower+offset : upper+offset]
      ham[: upper_fn(x) : step_fn(x)], ham[:: step_fn(x)]
      ham[lower + offset : upper + offset]

      # Wrong:
      ham[lower + offset:upper + offset]
      ham[1: 9], ham[1 :9], ham[1:9 :3]
      ham[lower : : step]
      ham[ : upper]
- Immediately before the open parenthesis that starts the argument
  list of a function call:
- Immediately before the open parenthesis that starts an indexing or
  slicing:
      # Correct:
      dct['key'] = lst[index]

      # Wrong:
      dct ['key'] = lst [index]
- More than one space around an assignment (or other) operator to
  align it with another:
      # Correct:
      x = 1
      y = 2
      long_variable = 3

      # Wrong:
      x             = 1
      y             = 2
      long_variable = 3
Other Recommendations
- Avoid trailing whitespace anywhere. Because it’s usually invisible,
  it can be confusing: e.g. a backslash followed by a space and a
  newline does not count as a line continuation marker. Some editors
  don’t preserve it and many projects (like CPython itself) have
  pre-commit hooks that reject it.
- Always surround these binary operators with a single space on either
  side: assignment (=), augmented assignment (+=, -= etc.),
  comparisons (==, <, >, !=, <>, <=, >=, in, not in, is, is not),
  Booleans (and, or, not).
- If operators with different priorities are used, consider adding
  whitespace around the operators with the lowest priority(ies). Use
  your own judgment; however, never use more than one space, and
  always have the same amount of whitespace on both sides of a binary
  operator:
      # Correct:
      i = i + 1
      submitted += 1
      x = x*2 - 1
      hypot2 = x*x + y*y
      c = (a+b) * (a-b)

      # Wrong:
      i=i+1
      submitted +=1
      x = x * 2 - 1
      hypot2 = x * x + y * y
      c = (a + b) * (a - b)
- Function annotations should use the normal rules for colons and
  always have spaces around the -> arrow if present. (See
  Function Annotations below for more about function annotations.):
      # Correct:
      def munge(input: AnyStr): ...
      def munge() -> PosInt: ...

      # Wrong:
      def munge(input:AnyStr): ...
      def munge()->PosInt: ...
- Don’t use spaces around the = sign when used to indicate a
  keyword argument, or when used to indicate a default value for an
  unannotated function parameter:
      # Correct:
      def complex(real, imag=0.0):
          return magic(r=real, i=imag)

      # Wrong:
      def complex(real, imag = 0.0):
          return magic(r = real, i = imag)
  When combining an argument annotation with a default value, however, do use
  spaces around the = sign:
      # Correct:
      def munge(sep: AnyStr = None): ...
      def munge(input: AnyStr, sep: AnyStr = None, limit=1000): ...

      # Wrong:
      def munge(input: AnyStr=None): ...
      def munge(input: AnyStr, limit = 1000): ...
- Compound statements (multiple statements on the same line) are
  generally discouraged:
      # Correct:
      if foo == 'blah':
          do_blah_thing()
      do_one()
      do_two()
      do_three()
  Rather not:
      # Wrong:
      if foo == 'blah': do_blah_thing()
      do_one(); do_two(); do_three()
- While sometimes it’s okay to put an if/for/while with a small body
  on the same line, never do this for multi-clause statements. Also
  avoid folding such long lines!
  Rather not:
      # Wrong:
      if foo == 'blah': do_blah_thing()
      for x in lst: total += x
      while t < 10: t = delay()
  Definitely not:
      # Wrong:
      if foo == 'blah': do_blah_thing()
      else: do_non_blah_thing()

      try: something()
      finally: cleanup()

      do_one(); do_two(); do_three(long, argument,
                                   list, like, this)

      if foo == 'blah': one(); two(); three()
When to Use Trailing Commas
Trailing commas are usually optional, except they are mandatory when
making a tuple of one element. For clarity, it is recommended to
surround the latter in (technically redundant) parentheses:
    # Correct:
    FILES = ('setup.cfg',)

    # Wrong:
    FILES = 'setup.cfg',
When trailing commas are redundant, they are often helpful when a
version control system is used, when a list of values, arguments or
imported items is expected to be extended over time. The pattern is
to put each value (etc.) on a line by itself, always adding a trailing
comma, and add the close parenthesis/bracket/brace on the next line.
However it does not make sense to have a trailing comma on the same
line as the closing delimiter (except in the above case of singleton
tuples):
    # Correct:
    FILES = [
        'setup.cfg',
        'tox.ini',
        ]
    initialize(FILES,
               error=True,
               )

    # Wrong:
    FILES = ['setup.cfg', 'tox.ini',]
    initialize(FILES, error=True,)
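The reason the comma is mandatory for a one-element tuple can be checked directly: parentheses alone only group an expression. In this minimal sketch, NOT_A_TUPLE is a name invented for the comparison.

```python
# The trailing comma, not the parentheses, is what makes the tuple.
FILES = ('setup.cfg',)

# Without the comma the parentheses merely group a string expression.
NOT_A_TUPLE = ('setup.cfg')

print(type(FILES).__name__, type(NOT_A_TUPLE).__name__)
```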
Naming Conventions
The naming conventions of Python’s library are a bit of a mess, so
we’ll never get this completely consistent – nevertheless, here are
the currently recommended naming standards. New modules and packages
(including third party frameworks) should be written to these
standards, but where an existing library has a different style,
internal consistency is preferred.
Overriding Principle
Names that are visible to the user as public parts of the API should
follow conventions that reflect usage rather than implementation.
Descriptive: Naming Styles
There are a lot of different naming styles. It helps to be able to
recognize what naming style is being used, independently from what
they are used for.
The following naming styles are commonly distinguished:
- b (single lowercase letter)
- B (single uppercase letter)
- lowercase
- lower_case_with_underscores
- UPPERCASE
- UPPER_CASE_WITH_UNDERSCORES
- CapitalizedWords (or CapWords, or CamelCase – so named because
  of the bumpy look of its letters [4]). This is also sometimes known
  as StudlyCaps.
  Note: When using acronyms in CapWords, capitalize all the
  letters of the acronym. Thus HTTPServerError is better than
  HttpServerError.
- mixedCase (differs from CapitalizedWords by initial lowercase
  character!)
- Capitalized_Words_With_Underscores (ugly!)
There’s also the style of using a short unique prefix to group related
names together. This is not used much in Python, but it is mentioned
for completeness. For example, the os.stat() function returns a
tuple whose items traditionally have names like st_mode,
st_size, st_mtime and so on. (This is done to emphasize the
correspondence with the fields of the POSIX system call struct, which
helps programmers familiar with that.)
The X11 library uses a leading X for all its public functions. In
Python, this style is generally deemed unnecessary because attribute
and method names are prefixed with an object, and function names are
prefixed with a module name.
In addition, the following special forms using leading or trailing
underscores are recognized (these can generally be combined with any
case convention):
- _single_leading_underscore: weak “internal use” indicator.
  E.g. from M import * does not import objects whose names start
  with an underscore.
- single_trailing_underscore_: used by convention to avoid
  conflicts with Python keyword, e.g.
      tkinter.Toplevel(master, class_='ClassName')
- __double_leading_underscore: when naming a class attribute,
  invokes name mangling (inside class FooBar, __boo becomes
  _FooBar__boo; see below).
- __double_leading_and_trailing_underscore__: “magic” objects or
  attributes that live in user-controlled namespaces.
  E.g. __init__, __import__ or __file__. Never invent
  such names; only use them as documented.
Prescriptive: Naming Conventions
Names to Avoid
Never use the characters ‘l’ (lowercase letter el), ‘O’ (uppercase
letter oh), or ‘I’ (uppercase letter eye) as single character variable
names.
In some fonts, these characters are indistinguishable from the
numerals one and zero. When tempted to use ‘l’, use ‘L’ instead.
ASCII Compatibility
Identifiers used in the standard library must be ASCII compatible
as described in the policy section of PEP 3131.
Package and Module Names
Modules should have short, all-lowercase names. Underscores can be
used in the module name if it improves readability. Python packages
should also have short, all-lowercase names, although the use of
underscores is discouraged.
When an extension module written in C or C++ has an accompanying
Python module that provides a higher level (e.g. more object oriented)
interface, the C/C++ module has a leading underscore
(e.g. _socket).
Class Names
Class names should normally use the CapWords convention.
The naming convention for functions may be used instead in cases where
the interface is documented and used primarily as a callable.
Note that there is a separate convention for builtin names: most builtin
names are single words (or two words run together), with the CapWords
convention used only for exception names and builtin constants.
Type Variable Names
Names of type variables introduced in PEP 484 should normally use CapWords
preferring short names: T, AnyStr, Num. It is recommended to add
suffixes _co or _contra to the variables used to declare covariant
or contravariant behavior correspondingly:
    from typing import TypeVar

    VT_co = TypeVar('VT_co', covariant=True)
    KT_contra = TypeVar('KT_contra', contravariant=True)
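As a sketch of how such a name might be used, the class below is parameterized by a covariant type variable carrying the _co suffix. ReadOnlyBox and T_co are hypothetical names chosen for this illustration; only TypeVar and Generic come from the typing module.

```python
from typing import Generic, TypeVar

# Short CapWords name plus the _co suffix marks the variable as covariant.
T_co = TypeVar('T_co', covariant=True)


class ReadOnlyBox(Generic[T_co]):
    """Hypothetical container that only ever hands its item back out."""

    def __init__(self, item: T_co) -> None:
        self._item = item

    def get(self) -> T_co:
        return self._item


box = ReadOnlyBox(42)
print(box.get())
```

The suffix lets a reader see the declared variance at every use site without looking up the TypeVar definition.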
Exception Names
Because exceptions should be classes, the class naming convention
applies here. However, you should use the suffix “Error” on your
exception names (if the exception actually is an error).
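A minimal sketch of this convention follows. The exception classes and the read_option helper are invented for illustration: both classes use CapWords and end in “Error” because they represent actual errors.

```python
class ConfigError(Exception):
    """Base class for hypothetical configuration errors."""


class MissingOptionError(ConfigError):
    """Raised when a required option is absent."""

    def __init__(self, option):
        super().__init__(f"missing option: {option}")
        self.option = option


def read_option(options, name):
    """Return options[name], translating KeyError into a domain error."""
    try:
        return options[name]
    except KeyError:
        raise MissingOptionError(name) from None


try:
    read_option({}, 'port')
except ConfigError as exc:
    caught = exc
```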
Global Variable Names
(Let’s hope that these variables are meant for use inside one module
only.) The conventions are about the same as those for functions.
Modules that are designed for use via from M import * should use
the __all__ mechanism to prevent exporting globals, or use the
older convention of prefixing such globals with an underscore (which
you might want to do to indicate these globals are “module
non-public”).
Function and Variable Names
Function names should be lowercase, with words separated by
underscores as necessary to improve readability.
Variable names follow the same convention as function names.
mixedCase is allowed only in contexts where that’s already the
prevailing style (e.g. threading.py), to retain backwards
compatibility.
Function and Method Arguments
Always use self for the first argument to instance methods.
Always use cls for the first argument to class methods.
If a function argument’s name clashes with a reserved keyword, it is
generally better to append a single trailing underscore rather than
use an abbreviation or spelling corruption. Thus class_ is better
than clss. (Perhaps better is to avoid such clashes by using a
synonym.)
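The three argument conventions above can be seen together in one small class. This is only a sketch; the Widget class, its from_spec constructor, and the 'name:class' spec format are assumptions made for the example.

```python
class Widget:
    """Hypothetical class showing self, cls and a trailing underscore."""

    def __init__(self, name, class_=None):
        # class_ avoids a clash with the reserved keyword `class`.
        self.name = name
        self.class_ = class_

    @classmethod
    def from_spec(cls, spec):
        # cls is the first argument of a class method.
        name, _, class_ = spec.partition(':')
        return cls(name, class_=class_ or None)

    def describe(self):
        # self is the first argument of an instance method.
        return f"{self.name} ({self.class_})"


widget = Widget.from_spec('button:primary')
print(widget.describe())
```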
Method Names and Instance Variables
Use the function naming rules: lowercase with words separated by
underscores as necessary to improve readability.
Use one leading underscore only for non-public methods and instance
variables.
To avoid name clashes with subclasses, use two leading underscores to
invoke Python’s name mangling rules.
Python mangles these names with the class name: if class Foo has an
attribute named __a, it cannot be accessed by Foo.__a. (An
insistent user could still gain access by calling Foo._Foo__a.)
Generally, double leading underscores should be used only to avoid
name conflicts with attributes in classes designed to be subclassed.
Note: there is some controversy about the use of __names (see below).
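The mangling described above can be observed with a short sketch. Foo and __a come from the text; the subclass Bar and the read method are added here only to show that a subclass attribute with the same spelling does not collide.

```python
class Foo:
    __a = 1  # stored on the class as _Foo__a

    def read(self):
        # Inside Foo, self.__a is compiled as self._Foo__a.
        return self.__a


class Bar(Foo):
    __a = 2  # stored as _Bar__a, so it does not replace Foo's value


print(hasattr(Foo, '__a'))   # the unmangled name is not an attribute
print(Foo._Foo__a)           # an insistent user can still reach it
print(Bar().read())          # Foo.read still sees Foo's own value
```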
Constants
Constants are usually defined on a module level and written in all
capital letters with underscores separating words. Examples include
MAX_OVERFLOW and TOTAL.
Designing for Inheritance
Always decide whether a class’s methods and instance variables
(collectively: “attributes”) should be public or non-public. If in
doubt, choose non-public; it’s easier to make it public later than to
make a public attribute non-public.
Public attributes are those that you expect unrelated clients of your
class to use, with your commitment to avoid backwards incompatible
changes. Non-public attributes are those that are not intended to be
used by third parties; you make no guarantees that non-public
attributes won’t change or even be removed.
We don’t use the term “private” here, since no attribute is really
private in Python (without a generally unnecessary amount of work).
Another category of attributes are those that are part of the
“subclass API” (often called “protected” in other languages). Some
classes are designed to be inherited from, either to extend or modify
aspects of the class’s behavior. When designing such a class, take
care to make explicit decisions about which attributes are public,
which are part of the subclass API, and which are truly only to be
used by your base class.
With this in mind, here are the Pythonic guidelines:
- Public attributes should have no leading underscores.
- If your public attribute name collides with a reserved keyword,
  append a single trailing underscore to your attribute name. This is
  preferable to an abbreviation or corrupted spelling. (However,
  notwithstanding this rule, ‘cls’ is the preferred spelling for any
  variable or argument which is known to be a class, especially the
  first argument to a class method.)
  Note 1: See the argument name recommendation above for class methods.
- For simple public data attributes, it is best to expose just the
  attribute name, without complicated accessor/mutator methods. Keep
  in mind that Python provides an easy path to future enhancement,
  should you find that a simple data attribute needs to grow
  functional behavior. In that case, use properties to hide
  functional implementation behind simple data attribute access
  syntax.
  Note 1: Try to keep the functional behavior side-effect free,
  although side-effects such as caching are generally fine.
  Note 2: Avoid using properties for computationally expensive
  operations; the attribute notation makes the caller believe that
  access is (relatively) cheap.
- If your class is intended to be subclassed, and you have attributes
  that you do not want subclasses to use, consider naming them with
  double leading underscores and no trailing underscores. This
  invokes Python’s name mangling algorithm, where the name of the
  class is mangled into the attribute name. This helps avoid
  attribute name collisions should subclasses inadvertently contain
  attributes with the same name.
  Note 1: Note that only the simple class name is used in the mangled
  name, so if a subclass chooses both the same class name and attribute
  name, you can still get name collisions.
  Note 2: Name mangling can make certain uses, such as debugging and
  __getattr__(), less convenient. However the name mangling
  algorithm is well documented and easy to perform manually.
  Note 3: Not everyone likes name mangling. Try to balance the
  need to avoid accidental name clashes with potential use by
  advanced callers.
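The guideline on simple public data attributes can be sketched as follows. Suppose a hypothetical Circle class started out with a plain radius attribute and later needed validation; a property adds the behavior while callers keep using ordinary attribute syntax. The class and its validation rule are assumptions for this example.

```python
class Circle:
    """Hypothetical class whose `radius` grew validation over time."""

    def __init__(self, radius):
        self.radius = radius  # goes through the property setter below

    @property
    def radius(self):
        # Cheap and side-effect free, as attribute access should be.
        return self._radius

    @radius.setter
    def radius(self, value):
        if value < 0:
            raise ValueError("radius must be non-negative")
        self._radius = value


circle = Circle(2)
circle.radius = 3      # same syntax as a plain data attribute
print(circle.radius)
```

The non-public _radius attribute holds the value, so existing callers that read or assign circle.radius need no changes.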
Public and Internal Interfaces
Any backwards compatibility guarantees apply only to public interfaces.
Accordingly, it is important that users be able to clearly distinguish
between public and internal interfaces.
Documented interfaces are considered public, unless the documentation
explicitly declares them to be provisional or internal interfaces exempt
from the usual backwards compatibility guarantees. All undocumented
interfaces should be assumed to be internal.
To better support introspection, modules should explicitly declare the
names in their public API using the __all__ attribute. Setting
__all__ to an empty list indicates that the module has no public API.
Even with __all__ set appropriately, internal interfaces (packages,
modules, classes, functions, attributes or other names) should still be
prefixed with a single leading underscore.
An interface is also considered internal if any containing namespace
(package, module or class) is considered internal.
Imported names should always be considered an implementation detail.
Other modules must not rely on indirect access to such imported names
unless they are an explicitly documented part of the containing module’s
API, such as os.path or a package’s __init__ module that exposes
functionality from submodules.
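To see the effect of __all__ on a wildcard import, the sketch below builds a throwaway module in memory and imports from it. The module name demo_module and everything defined inside it are hypothetical; a real project would simply put these definitions in a file.

```python
import sys
import types

# Source of a hypothetical module that declares its public API.
MODULE_SOURCE = '''
__all__ = ['public_api']

def public_api():
    return _helper().upper()

def _helper():
    return 'ok'

undeclared = 'not listed in __all__'
'''

# Build the module object and register it so it can be imported.
demo = types.ModuleType('demo_module')
exec(MODULE_SOURCE, demo.__dict__)
sys.modules['demo_module'] = demo

# Run a wildcard import in a fresh namespace and see what arrives.
namespace = {}
exec('from demo_module import *', namespace)
exported = sorted(name for name in namespace if not name.startswith('__'))
print(exported)
```

Only the name listed in __all__ is copied; the underscore-prefixed helper and the undeclared global stay behind, although the exported function can still call the helper internally.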
Programming Recommendations
- Code should be written in a way that does not disadvantage other
  implementations of Python (PyPy, Jython, IronPython, Cython, Psyco,
  and such).
  For example, do not rely on CPython’s efficient implementation of
  in-place string concatenation for statements in the form a += b
  or a = a + b. This optimization is fragile even in CPython (it
  only works for some types) and isn’t present at all in implementations
  that don’t use refcounting. In performance sensitive parts of the
  library, the ''.join() form should be used instead. This will
  ensure that concatenation occurs in linear time across various
  implementations.
- Comparisons to singletons like None should always be done with
  is or is not, never the equality operators.
  Also, beware of writing if x when you really mean if x is not None
  – e.g. when testing whether a variable or argument that
  defaults to None was set to some other value. The other value might
  have a type (such as a container) that could be false in a boolean
  context!
- Use is not operator rather than not ... is. While both
  expressions are functionally identical, the former is more readable
  and preferred:
      # Correct:
      if foo is not None:

      # Wrong:
      if not foo is None:
- When implementing ordering operations with rich comparisons, it is
  best to implement all six operations (__eq__, __ne__, __lt__, __le__,
  __gt__, __ge__) rather than relying on other code to only exercise a
  particular comparison.

  To minimize the effort involved, the functools.total_ordering()
  decorator provides a tool to generate missing comparison methods.

  PEP 207 indicates that reflexivity rules are assumed by Python.
  Thus, the interpreter may swap y > x with x < y, y >= x with x <= y,
  and may swap the arguments of x == y and x != y. The sort() and min()
  operations are guaranteed to use the < operator and the max()
  function uses the > operator. However, it is best to implement all
  six operations so that confusion doesn’t arise in other contexts.

- Always use a def statement instead of an assignment statement that binds
  a lambda expression directly to an identifier:

      # Correct:
      def f(x): return 2*x

      # Wrong:
      f = lambda x: 2*x

  The first form means that the name of the resulting function object is
  specifically ‘f’ instead of the generic ‘<lambda>’. This is more
  useful for tracebacks and string representations in general. The use
  of the assignment statement eliminates the sole benefit a lambda
  expression can offer over an explicit def statement (i.e. that it can
  be embedded inside a larger expression)

- Derive exceptions from Exception rather than BaseException.
  Direct inheritance from BaseException is reserved for exceptions
  where catching them is almost always the wrong thing to do.

  Design exception hierarchies based on the distinctions that code
  catching the exceptions is likely to need, rather than the locations
  where the exceptions are raised. Aim to answer the question
  “What went wrong?” programmatically, rather than only stating that
  “A problem occurred” (see PEP 3151 for an example of this lesson being
  learned for the builtin exception hierarchy)

  Class naming conventions apply here, although you should add the
  suffix “Error” to your exception classes if the exception is an
  error. Non-error exceptions that are used for non-local flow control
  or other forms of signaling need no special suffix.

- Use exception chaining appropriately. raise X from Y
  should be used to indicate explicit replacement without losing the
  original traceback.

  When deliberately replacing an inner exception (using
  raise X from None), ensure that relevant details are transferred to
  the new exception (such as preserving the attribute name when converting
  KeyError to AttributeError, or embedding the text of the original
  exception in the new exception message).

- When catching exceptions, mention specific exceptions whenever
  possible instead of using a bare except: clause:

      try:
          import platform_specific_module
      except ImportError:
          platform_specific_module = None

  A bare except: clause will catch SystemExit and
  KeyboardInterrupt exceptions, making it harder to interrupt a
  program with Control-C, and can disguise other problems. If you
  want to catch all exceptions that signal program errors, use
  except Exception: (bare except is equivalent to except
  BaseException:).

  A good rule of thumb is to limit use of bare ‘except’ clauses to two
  cases:

  - If the exception handler will be printing out or logging the
    traceback; at least the user will be aware that an error has
    occurred.
  - If the code needs to do some cleanup work, but then lets the
    exception propagate upwards with raise. try...finally
    can be a better way to handle this case.

- When catching operating system errors, prefer the explicit exception
  hierarchy introduced in Python 3.3 over introspection of errno
  values.

- Additionally, for all try/except clauses, limit the try clause
  to the absolute minimum amount of code necessary. Again, this
  avoids masking bugs:

      # Correct:
      try:
          value = collection[key]
      except KeyError:
          return key_not_found(key)
      else:
          return handle_value(value)

      # Wrong:
      try:
          # Too broad!
          return handle_value(collection[key])
      except KeyError:
          # Will also catch KeyError raised by handle_value()
          return key_not_found(key)

- When a resource is local to a particular section of code, use a
  with statement to ensure it is cleaned up promptly and reliably
  after use. A try/finally statement is also acceptable.

- Context managers should be invoked through separate functions or methods
  whenever they do something other than acquire and release resources:

      # Correct:
      with conn.begin_transaction():
          do_stuff_in_transaction(conn)

      # Wrong:
      with conn:
          do_stuff_in_transaction(conn)

  The latter example doesn’t provide any information to indicate that
  the __enter__ and __exit__ methods are doing something other
  than closing the connection after a transaction. Being explicit is
  important in this case.

- Be consistent in return statements. Either all return statements in
  a function should return an expression, or none of them should. If
  any return statement returns an expression, any return statements
  where no value is returned should explicitly state this as return
  None, and an explicit return statement should be present at the
  end of the function (if reachable):

      # Correct:

      def foo(x):
          if x >= 0:
              return math.sqrt(x)
          else:
              return None

      def bar(x):
          if x < 0:
              return None
          return math.sqrt(x)

      # Wrong:

      def foo(x):
          if x >= 0:
              return math.sqrt(x)

      def bar(x):
          if x < 0:
              return
          return math.sqrt(x)

- Use ''.startswith() and ''.endswith() instead of string
  slicing to check for prefixes or suffixes.

  startswith() and endswith() are cleaner and less error prone:

      # Correct:
      if foo.startswith('bar'):

      # Wrong:
      if foo[:3] == 'bar':

- Object type comparisons should always use isinstance() instead of
  comparing types directly:

      # Correct:
      if isinstance(obj, int):

      # Wrong:
      if type(obj) is type(1):

- For sequences, (strings, lists, tuples), use the fact that empty
  sequences are false:

      # Correct:
      if not seq:
      if seq:

      # Wrong:
      if len(seq):
      if not len(seq):

- Don’t write string literals that rely on significant trailing
  whitespace. Such trailing whitespace is visually indistinguishable
  and some editors (or more recently, reindent.py) will trim them.

- Don’t compare boolean values to True or False using ==:

      # Wrong:
      if greeting == True:

  Worse:

      # Wrong:
      if greeting is True:

- Use of the flow control statements return/break/continue
  within the finally suite of a try...finally, where the flow control
  statement would jump outside the finally suite, is discouraged. This
  is because such statements will implicitly cancel any active exception
  that is propagating through the finally suite:

      # Wrong:
      def foo():
          try:
              1 / 0
          finally:
              return 42
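The exception chaining recommendation in the list above has no example of its own, so here is a minimal sketch of both forms. The Record class and the parse_port function are hypothetical names invented for this illustration; only the raise ... from syntax itself comes from the language.

```python
class Record:
    """Hypothetical class that stores its fields in a private dict."""

    def __init__(self, **fields):
        self._fields = dict(fields)

    def __getattr__(self, name):
        # __getattr__ is only called when normal attribute lookup fails.
        try:
            return self.__dict__["_fields"][name]
        except KeyError:
            # Deliberate replacement: the KeyError is suppressed, so the
            # attribute name is carried over into the new message.
            raise AttributeError(
                f"{type(self).__name__!r} object has no attribute {name!r}"
            ) from None


def parse_port(text):
    """Hypothetical helper that converts text to a port number."""
    try:
        return int(text)
    except ValueError as exc:
        # Explicit chaining: the original error stays reachable
        # through the new exception's __cause__ attribute.
        raise RuntimeError(f"invalid port: {text!r}") from exc
```

In the first case the traceback shows only the AttributeError, which is why the message has to name the missing attribute. In the second case the traceback shows both exceptions, so the original ValueError is not lost.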
Function Annotations
With the acceptance of PEP 484, the style rules for function
annotations have changed.
- Function annotations should use PEP 484 syntax (there are some
  formatting recommendations for annotations in the previous section).
- The experimentation with annotation styles that was recommended
  previously in this PEP is no longer encouraged.
- However, outside the stdlib, experiments within the rules of PEP 484
  are now encouraged. For example, marking up a large third party
  library or application with PEP 484 style type annotations,
  reviewing how easy it was to add those annotations, and observing
  whether their presence increases code understandability.
- The Python standard library should be conservative in adopting such
  annotations, but their use is allowed for new code and for big
  refactorings.
- For code that wants to make a different use of function annotations
  it is recommended to put a comment of the form:

      # type: ignore

  near the top of the file; this tells type checkers to ignore all
  annotations. (More fine-grained ways of disabling complaints from
  type checkers can be found in PEP 484.)
- Like linters, type checkers are optional, separate tools. Python
  interpreters by default should not issue any messages due to type
  checking and should not alter their behavior based on annotations.
- Users who don’t want to use type checkers are free to ignore them.
  However, it is expected that users of third party library packages
  may want to run type checkers over those packages. For this purpose
  PEP 484 recommends the use of stub files: .pyi files that are read
  by the type checker in preference of the corresponding .py files.
  Stub files can be distributed with a library, or separately (with
  the library author’s permission) through the typeshed repo [5].
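To make the points above concrete, here is a small sketch of a function annotated in PEP 484 style. The function find_user and its data are hypothetical; the example is only meant to show the syntax and the fact that the interpreter stores annotations without enforcing them.

```python
from typing import Optional


def find_user(user_id: int, default: Optional[str] = None) -> Optional[str]:
    """Return the name for user_id, or default if it is unknown."""
    known_users = {1: "alice", 2: "bob"}  # hypothetical data
    return known_users.get(user_id, default)


# Annotations are recorded on the function object for tools to read.
print(find_user.__annotations__["user_id"])

# The interpreter does not check them: passing a str is not an error
# at runtime, although a type checker would be expected to flag it.
print(find_user("1"))
```

The second call simply finds no matching key and returns the default, which is consistent with the statement that interpreters should not alter their behavior based on annotations.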
Variable Annotations
PEP 526 introduced variable annotations. The style recommendations for them are
similar to those on function annotations described above:
- Annotations for module level variables, class and instance variables,
  and local variables should have a single space after the colon.
- There should be no space before the colon.
- If an assignment has a right hand side, then the equality sign should have
  exactly one space on both sides:

      # Correct:

      code: int

      class Point:
          coords: Tuple[int, int]
          label: str = '<unknown>'

      # Wrong:

      code:int  # No space after colon
      code : int  # Space before colon

      class Test:
          result: int=0  # No spaces around equality sign

- Although the PEP 526 is accepted for Python 3.6, the variable annotation
  syntax is the preferred syntax for stub files on all versions of Python
  (see PEP 484 for details).
Copyright
This document has been placed in the public domain.
When it comes to writing large projects or maintaining existing code, following established coding standards becomes very important for keeping the code readable, understandable, and maintainable.
One such standard is PEP8, which sets out recommendations for the style and formatting of Python code.
Contents
- Introduction
- Reasons to use it
- Basic rules
- Indentation
- Maximum line length
- Whitespace
- Naming
- Comments
- Imports
- Whitespace around operators
- Function and method names
- Variable names
- Placement of functions and classes
- Line length
- Triple quotes
- Useful tools
- Additional resources
- Conclusion
Introduction
PEP8 is a document describing the conventions that developers should follow when writing Python code. Following these recommendations can significantly improve the quality of your code and make it more readable and understandable for other developers.
In this article we look at the main rules described in PEP8 and explain how to follow them in order to write clean, readable, and maintainable Python code.
Reasons to use it
PEP8 is the style guide for the Python programming language. It describes recommendations and rules for writing readable, understandable, and consistent Python code.
PEP8 matters for writing quality Python code for several reasons. First, it helps make code more readable and understandable for other programmers who may work with it. This is especially important if you work in a team or if your code will be used by other people.
Second, following PEP8 helps make your code more consistent. The code looks uniform throughout, which makes it easier to understand and maintain.
Third, following PEP8 can help you spot errors and potential problems in your code. For example, nonstandard variable naming or inconsistent indentation can lead to mistakes or make the code harder to read.
In addition, many development tools support PEP8 and can automatically check whether your code conforms to it.
Overall, following PEP8 is an important practice for writing quality Python code. It helps make your code more readable, consistent, and less error-prone.
Basic rules
The basic rules of PEP8 are a set of recommendations for laying out Python code that help make it more readable and understandable. Below I list several of them:
Indentation
Use 4 spaces per indentation level. This rule helps blocks of code stand out visually and makes the code easier to read. Spaces are the preferred indentation method; mixing tabs and spaces is not recommended, because it can cause display problems across different text editors.
For example:
    # Correct:
    if x == 1:
        print("x is 1")

    # Wrong:
    if x == 1:
      print("x is 1")
Maximum line length
Limit lines to at most 79 characters. If a line is longer, split it across several lines. Long lines can be hard to read, especially when they run past the edge of the editor window. A long line can be split with the backslash continuation character \, although PEP 8 prefers implied line continuation inside parentheses, brackets, and braces where that is possible.
For example:
    # Correct:
    long_string = "This is a really long string that "\
                  "spans multiple lines."

    # Wrong:
    long_string = "This is a really long string that spans multiple lines."
Whitespace
Put a single space on each side of a binary operator. Do not put spaces immediately inside parentheses. This rule keeps the code simpler and more readable.
For example:
    # Correct:
    x = 2 + 3
    y = (1 + 2) * 3

    # Wrong:
    x=2+3
    y = ( 1 + 2 ) * 3
Naming
Use clear, descriptive names for variables, functions, and methods. Variable, function, and method names should all be lowercase, with words separated by underscores. This rule helps make your code more readable and understandable for other programmers.
For example:
    # Correct:
    age = 25
    name = "John"

    def calculate_sum(numbers):
        return sum(numbers)

    # Wrong:
    a = 25
    b = "John"

    def calc_sum(nums):
        return sum(nums)
Comments
Add comments to your code to explain its complicated parts.
Comments should be short, concise, and descriptive, and they should help other programmers understand your code. Do not use comments to describe obvious things, such as assigning a value to a variable, and use comments at the end of a line sparingly.
For example:
    # Correct:
    # Get the current date and time
    current_time = datetime.datetime.now()

    # Wrong:
    current_time = datetime.datetime.now()  # Get the current date and time
Imports
Put each import on its own line, group imports in the order standard library, third-party, local, separate the groups with a blank line, and avoid wildcard imports (*). Sorting imports alphabetically within a group is a common convention, although PEP 8 itself does not require it. This rule keeps imports simple and improves readability.
For example:
    # Correct:
    import datetime
    import os
    from math import sqrt

    import requests

    # Wrong:
    import requests, os, datetime
    from math import *
    import my_module
Whitespace around operators
Use blank lines to separate logical sections of code. Do not put several statements on one line.
Put spaces around operators (=, +, -, *, /, //, %, and so on), but do not put a space before the opening bracket of an index or slice.
For example:
    # Correct:
    x = 2 + 3
    y = x * 4
    z = items[0]

    # Wrong:
    x=2+3
    y = x*4
    z = items [0]
Function and method names
Use verbs in the names of functions and methods, and separate words with underscores. This rule makes the code easier to understand and read.
For example:
    # Correct:
    def calculate_sum(numbers):
        return sum(numbers)

    def get_user_name(user):
        return user.name

    # Wrong:
    def numbersSum(nums):
        return sum(nums)

    def user(user):
        return user.name
Variable names
Use clear, descriptive names for variables, avoid single-character names, and separate words with underscores.
For example:
    # Correct:
    total_sum = 0
    list_of_numbers = [1, 2, 3, 4]
    user_name = "John"

    # Wrong:
    t = 0
    n = [1, 2, 3, 4]
    un = "John"
Placement of functions and classes
Arrange functions and classes in a logical order. Separate top-level functions and classes with two blank lines to improve readability. Defining a helper function before the code that calls it makes the file easier to read from top to bottom (Python itself only requires that the function exist by the time it is called).
For example:
    # Correct:
    def calculate_sum(numbers):
        return sum(numbers)


    def main():
        list_of_numbers = [1, 2, 3, 4]
        total_sum = calculate_sum(list_of_numbers)
        print(f"The total sum is: {total_sum}")


    if __name__ == "__main__":
        main()

    # Wrong:
    def main():
        list_of_numbers = [1, 2, 3, 4]
        total_sum = calculate_sum(list_of_numbers)
        print(f"The total sum is: {total_sum}")
    def calculate_sum(numbers):
        return sum(numbers)
    if __name__ == "__main__":
        main()
Line length
Lines should not be longer than 79 characters. If a line is too long, it can be split across several lines using parentheses, commas, or concatenation operators.
For example:
    # Correct:
    message = "This is a very long message that should be split " \
              "into multiple lines for better readability."
    total_sum = (100 + 200 + 300 + 400 + 500 + 600)

    # Wrong:
    message = "This is a very long message that should be split into multiple lines for better readability."
    total_sum = 100 + 200 + 300 + 400 + 500 + 600
Triple quotes
Use triple-quoted strings (docstrings) to document your code. This helps other programmers understand your code and use it in their own projects.
For example:
    # Correct:
    def calculate_sum(numbers):
        """
        This function calculates the sum of the numbers in the given list.

        Parameters:
        numbers (list): A list of numbers to calculate the sum of.

        Returns:
        float: The sum of the numbers in the list.
        """
        return sum(numbers)

    # Wrong:
    def calculate_sum(numbers):
        # This function calculates the sum of the numbers in the given list.
        return sum(numbers)
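One practical reason to prefer a docstring over a comment in the example above is that a docstring stays attached to the function at runtime. The sketch below reuses the calculate_sum name from that example; calculate_sum_commented is a hypothetical counterpart added only for contrast.

```python
import inspect


def calculate_sum(numbers):
    """Return the sum of the numbers in the given iterable."""
    return sum(numbers)


def calculate_sum_commented(numbers):
    # Return the sum of the numbers in the given iterable.
    return sum(numbers)


# The docstring is stored on the function object, so help() and
# documentation tools can find it. The comment is discarded.
print(inspect.getdoc(calculate_sum))
print(calculate_sum_commented.__doc__)
```

The first print shows the documentation text, while the second shows that the commented version has no documentation attached at all.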
Useful tools
There are many tools that help Python developers follow PEP8. They include linters and automatic formatters.
Linters are tools that analyze code and check whether it conforms to PEP8. They warn developers about PEP8 violations and other problems in their code. Some of the most popular linters for Python are:
- pylint is a Python linter that checks code against PEP8. It also detects other problems, such as syntax errors, unused variables, and duplicated code.
- flake8 is a linter that checks code against PEP8 and also detects other problems, such as unused imports and badly formatted lines.
- PyCharm is a Python IDE with a built-in linter that checks code for PEP8 conformance and other problems. It also suggests fixes for PEP8 violations.
Automatic formatters are tools that format code according to PEP8 for you. They simplify the formatting process and let developers concentrate on what the code does. Some of the most popular automatic formatters for Python are:
- Black is an automatic Python code formatter that formats code in a consistent style compatible with PEP8. It removes ambiguity about formatting and makes the code easier to read.
- autopep8 is a tool that automatically formats code to conform to PEP8. It fixes most of the formatting issues that the pycodestyle checker can report.
- YAPF is an automatic Python code formatter that formats code according to PEP8 and other style guidelines.
Using linters and automatic formatters can make it much easier to follow PEP8 and can improve code quality. For example, linters can warn about PEP8 violations before the code is sent for review, which helps avoid style-related mistakes. Automatic formatters let you bring code in line with PEP8 quickly and easily.
In addition, many of these tools integrate with popular development environments such as PyCharm and VS Code, which makes them easier to use and to fit into a development workflow.
It is important to note that linters and automatic formatters cannot fully replace manual formatting and manual checking of code against PEP8. They sometimes make mistakes or suggest suboptimal solutions. It is therefore important to combine these tools with manual code review and good practices when writing Python code.
Additional resources
There are many resources and tools for learning more about PEP8 and applying it in projects.
Some of them are listed below:
- The official PEP8 document: PEP8 can be found on the official Python website. It contains all the main rules and recommendations for writing clean, understandable Python code.
- Flake8: Flake8 is a static analysis tool for Python that checks code against PEP8. It also checks for syntax errors, use of undefined variables, and other violations.
- PyCharm: PyCharm is an integrated development environment (IDE) for Python with a built-in PEP8 checker. It warns you when your code violates PEP8 and suggests fixes.
- Black: Black is an automatic Python code formatter that follows PEP8. It reformats your code to conform to PEP8 and can also be used to format code automatically in large projects.
- pylint: pylint is another static analysis tool for Python that checks code against PEP8. It also warns you about deprecated or unsafe constructs in your code.
- Real Python: Real Python is an online platform for learning Python that contains many articles, tutorials, and video lessons. It offers guides on writing clean, understandable Python code in line with PEP8.
- Python Code Quality Authority: the Python Code Quality Authority is an organization that maintains several tools for checking Python code quality, including Flake8 and pylint. It also supports a number of standards and recommendations for writing clean, understandable Python code.
Conclusion
Following the rules of PEP8 will help you write cleaner, more understandable, and more maintainable Python code. However, keep in mind that these rules are not absolute and can be broken in some cases. What matters is to pay attention to your style and to stick to the standards generally accepted in the Python community.
PEP 8, sometimes spelled PEP8 or PEP-8, is a document that provides guidelines and best practices on how to write Python code. It was written in 2001 by Guido van Rossum, Barry Warsaw, and Nick Coghlan. The primary focus of PEP 8 is to improve the readability and consistency of Python code.
PEP stands for Python Enhancement Proposal, and there are several of them. A PEP is a document that describes new features proposed for Python and documents aspects of Python, like design and style, for the community.
This tutorial outlines the key guidelines laid out in PEP 8. It’s aimed at beginner to intermediate programmers, and as such I have not covered some of the most advanced topics. You can learn about these by reading the full PEP 8 documentation.
By the end of this tutorial, you’ll be able to:
- Write Python code that conforms to PEP 8
- Understand the reasoning behind the guidelines laid out in PEP 8
- Set up your development environment so that you can start writing PEP 8 compliant Python code
Why We Need PEP 8
“Readability counts.”
— The Zen of Python
PEP 8 exists to improve the readability of Python code. But why is readability so important? Why is writing readable code one of the guiding principles of the Python language, according to the Zen of Python?
As Guido van Rossum said, “Code is read much more often than it is written.” You may spend a few minutes, or a whole day, writing a piece of code to process user authentication. Once you’ve written it, you’re never going to write it again. But you’ll definitely have to read it again. That piece of code might remain part of a project you’re working on. Every time you go back to that file, you’ll have to remember what that code does and why you wrote it, so readability matters.
If you’re new to Python, it can be difficult to remember what a piece of code does a few days, or weeks, after you wrote it. If you follow PEP 8, you can be sure that you’ve named your variables well. You’ll know that you’ve added enough whitespace so it’s easier to follow logical steps in your code. You’ll also have commented your code well. All this will mean your code is more readable and easier to come back to. As a beginner, following the rules of PEP 8 can make learning Python a much more pleasant task.
Following PEP 8 is particularly important if you’re looking for a development job. Writing clear, readable code shows professionalism. It’ll tell an employer that you understand how to structure your code well.
If you have more experience writing Python code, then you may need to collaborate with others. Writing readable code here is crucial. Other people, who may have never met you or seen your coding style before, will have to read and understand your code. Having guidelines that you follow and recognize will make it easier for others to read your code.
Naming Conventions
“Explicit is better than implicit.”
— The Zen of Python
When you write Python code, you have to name a lot of things: variables, functions, classes, packages, and so on. Choosing sensible names will save you time and energy later. You’ll be able to figure out, from the name, what a certain variable, function, or class represents. You’ll also avoid using inappropriate names that might result in errors that are difficult to debug.
Naming Styles
The table below outlines some of the common naming styles in Python code and when you should use them:
Type | Naming Convention | Examples |
---|---|---|
Function | Use a lowercase word or words. Separate words by underscores to improve readability. | function, my_function |
Variable | Use a lowercase single letter, word, or words. Separate words with underscores to improve readability. | x, var, my_variable |
Class | Start each word with a capital letter. Do not separate words with underscores. This style is called camel case or pascal case. | Model, MyClass |
Method | Use a lowercase word or words. Separate words with underscores to improve readability. | class_method, method |
Constant | Use an uppercase single letter, word, or words. Separate words with underscores to improve readability. | CONSTANT, MY_CONSTANT, MY_LONG_CONSTANT |
Module | Use a short, lowercase word or words. Separate words with underscores to improve readability. | module.py, my_module.py |
Package | Use a short, lowercase word or words. Do not separate words with underscores. | package, mypackage |
These are some of the common naming conventions and examples of how to use them. But to write readable code, choosing the correct naming style is not enough: you also have to choose the names themselves carefully. Below are a few pointers on how to do this as effectively as possible.
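The naming styles from the table can be seen side by side in a short sketch. Every name below (MAX_RETRIES, HttpClient, build_url, and so on) is hypothetical and chosen only to illustrate the style it belongs to.

```python
# Constant: uppercase words separated by underscores.
MAX_RETRIES = 3


class HttpClient:
    """Class: each word starts with a capital letter, no underscores."""

    def send_request(self, target_url):
        """Method: lowercase words separated by underscores."""
        return f"GET {target_url}"


def build_url(host_name, page_path):
    """Function: lowercase words separated by underscores."""
    # Variable: lowercase words separated by underscores.
    full_url = f"https://{host_name}/{page_path}"
    return full_url
```

Reading the names alone is enough to tell which objects are constants, classes, or functions, which is the practical payoff of applying the styles consistently.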
How to Choose Names
Choosing names for your variables, functions, classes, and so forth can be challenging. You should put a fair amount of thought into your naming choices when writing code as it will make your code more readable. The best way to name your objects in Python is to use descriptive names to make it clear what the object represents.
When naming variables, you may be tempted to choose simple, single-letter lowercase names, like x. But, unless you’re using x as the argument of a mathematical function, it’s not clear what x represents. Imagine you are storing a person’s name as a string, and you want to split that string to format their name differently. You could end up with something like this:
>>> # Not recommended
>>> x = 'John Smith'
>>> y, z = x.split()
>>> print(z, y, sep=', ')
Smith, John
This will work, but you’ll have to keep track of what x, y, and z represent. It may also be confusing for collaborators. A much clearer choice of names would be something like this:
>>> # Recommended
>>> name = 'John Smith'
>>> first_name, last_name = name.split()
>>> print(last_name, first_name, sep=', ')
Smith, John
Similarly, to reduce the amount of typing you do, it can be tempting to use abbreviations when choosing names. In the example below, I have defined a function db() that takes a single argument x and doubles it:
# Not recommended
def db(x):
    return x * 2
At first glance, this could seem like a sensible choice. db() could easily be an abbreviation for double. But imagine coming back to this code in a few days. You may have forgotten what you were trying to achieve with this function, and that would make guessing how you abbreviated it difficult.
The following example is much clearer. If you come back to this code a couple of days after writing it, you’ll still be able to read and understand the purpose of this function:
# Recommended
def multiply_by_two(x):
    return x * 2
The same philosophy applies to all other data types and objects in Python. Always try to use the most concise but descriptive names possible.
Code Layout
“Beautiful is better than ugly.”
— The Zen of Python
How you lay out your code has a huge role in how readable it is. In this section, you’ll learn how to add vertical whitespace to improve the readability of your code. You’ll also learn how to handle the 79 character line limit recommended in PEP 8.
Blank Lines
Vertical whitespace, or blank lines, can greatly improve the readability of your code. Code that’s bunched up together can be overwhelming and hard to read. Similarly, too many blank lines in your code make it look very sparse, and the reader might need to scroll more than necessary. Below are three key guidelines on how to use vertical whitespace.
Surround top-level functions and classes with two blank lines. Top-level functions and classes should be fairly self-contained and handle separate functionality. It makes sense to put extra vertical space around them, so that it’s clear they are separate:
class MyFirstClass:
    pass


class MySecondClass:
    pass


def top_level_function():
    return None
Surround method definitions inside classes with a single blank line. Inside a class, functions are all related to one another. It’s good practice to leave only a single blank line between them:
class MyClass:
    def first_method(self):
        return None

    def second_method(self):
        return None
Use blank lines sparingly inside functions to show clear steps. Sometimes, a complicated function has to complete several steps before the return statement. To help the reader understand the logic inside the function, it can be helpful to leave a blank line between each step.
In the example below, there is a function to calculate the variance of a list. This is a two-step problem, so I have indicated each step by leaving a blank line between them. There is also a blank line before the return statement. This helps the reader clearly see what’s returned:
def calculate_variance(number_list):
    sum_list = 0
    for number in number_list:
        sum_list = sum_list + number
    mean = sum_list / len(number_list)

    sum_squares = 0
    for number in number_list:
        sum_squares = sum_squares + number**2
    mean_squares = sum_squares / len(number_list)

    return mean_squares - mean**2
If you use vertical whitespace carefully, it can greatly improve the readability of your code. It helps the reader visually understand how your code splits up into sections, and how those sections relate to one another.
Maximum Line Length and Line Breaking
PEP 8 suggests lines should be limited to 79 characters. This is because it allows you to have multiple files open next to one another, while also avoiding line wrapping.
Of course, keeping statements to 79 characters or less is not always possible. PEP 8 outlines ways to allow statements to run over several lines.
Python will assume line continuation if code is contained within parentheses, brackets, or braces:
def function(arg_one, arg_two,
             arg_three, arg_four):
    return arg_one
If it is impossible to use implied continuation, then you can use backslashes to break lines instead:
from mypkg import example1, \
    example2, example3
However, if you can use implied continuation, then you should do so.
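As it happens, an import like the one above can also be wrapped in parentheses, so it does not strictly need a backslash. The sketch below shows the implied-continuation form. It uses os.path from the standard library in place of the placeholder package mypkg, and the path components are invented for illustration.

```python
# Implied continuation: the parentheses let the import span several
# lines without a backslash.
from os.path import (
    basename,
    dirname,
    join,
)

# Hypothetical path components, used only to exercise the imports.
path = join("project", "src", "module.py")
file_name = basename(path)
folder = dirname(path)

print(file_name)
```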
If line breaking needs to occur around binary operators, like + and *, it should occur before the operator. This rule stems from mathematics. Mathematicians agree that breaking before binary operators improves readability. Compare the following two examples.
Below is an example of breaking before a binary operator:
# Recommended
total = (first_variable
         + second_variable
         - third_variable)
You can immediately see which variable is being added or subtracted, as the operator is right next to the variable being operated on.
Now, let’s see an example of breaking after a binary operator:
# Not Recommended
total = (first_variable +
         second_variable -
         third_variable)
Here, it’s harder to see which variable is being added and which is subtracted.
Breaking before binary operators produces more readable code, so PEP 8 encourages it. Code that consistently breaks after a binary operator is still PEP 8 compliant. However, you’re encouraged to break before a binary operator.
Indentation
“There should be one—and preferably only one—obvious way to do it.”
— The Zen of Python
Indentation, or leading whitespace, is extremely important in Python. The indentation level of lines of code in Python determines how statements are grouped together.
Consider the following example:
x = 3
if x > 5:
    print('x is larger than 5')
The indented print statement lets Python know that it should only be executed if the condition in the if statement evaluates to True. The same indentation applies to tell Python what code to execute when a function is called or what code belongs to a given class.
The key indentation rules laid out by PEP 8 are the following:
- Use 4 consecutive spaces to indicate indentation.
- Prefer spaces over tabs.
Tabs vs. Spaces
As mentioned above, you should use spaces instead of tabs when indenting code. You can adjust the settings in your text editor to output 4 spaces instead of a tab character, when you press the Tab key.
If you’re using Python 2 and have used a mixture of tabs and spaces to indent your code, you won’t see errors when trying to run it. To help you to check consistency, you can add a -t flag when running Python 2 code from the command line. The interpreter will issue warnings when you are inconsistent with your use of tabs and spaces:
$ python2 -t code.py
code.py: inconsistent use of tabs and spaces in indentation
If, instead, you use the -tt flag, the interpreter will issue errors instead of warnings, and your code will not run. The benefit of using this method is that the interpreter tells you where the inconsistencies are:
$ python2 -tt code.py
File "code.py", line 3
print(i, j)
^
TabError: inconsistent use of tabs and spaces in indentation
Python 3 does not allow mixing of tabs and spaces. Therefore, if you are using Python 3, then these errors are issued automatically:
$ python3 code.py
File "code.py", line 3
print(i, j)
^
TabError: inconsistent use of tabs and spaces in indentation
You can write Python code with either tabs or spaces indicating indentation. But, if you’re using Python 3, you must be consistent with your choice. Otherwise, your code will not run. PEP 8 recommends that you always use 4 consecutive spaces to indicate indentation.
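The Python 3 behaviour can be reproduced without a separate file. The sketch below compiles a small source string in which one line is indented with a tab and the next with eight spaces. The string and the name mixed_source are made up for illustration.

```python
# Source text whose second line is indented with a tab and whose third
# line is indented with eight spaces.
mixed_source = "if True:\n\tx = 1\n        y = 2\n"

try:
    compile(mixed_source, "<mixed>", "exec")
except TabError as error:
    message = str(error)
else:
    message = "no error"

print(message)
```

Because the error is raised while the source is compiled, none of the code runs, which matches the behaviour described above.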
Indentation Following Line Breaks
When you’re using line continuations to keep lines to 79 characters or fewer, it is useful to use indentation to improve readability. It allows the reader to distinguish between two lines of code and a single line of code that spans two lines. There are two styles of indentation you can use.
The first of these is to align the indented block with the opening delimiter:
def function(arg_one, arg_two,
             arg_three, arg_four):
    return arg_one
Sometimes you can find that only 4 spaces are needed to align with the opening delimiter. This will often occur in if statements that span multiple lines as the if, space, and opening bracket make up 4 characters. In this case, it can be difficult to determine where the nested code block inside the if statement begins:
x = 5
if (x > 3 and
    x < 10):
    print(x)
In this case, PEP 8 provides two alternatives to help improve readability:
- Add a comment after the final condition. Due to syntax highlighting in most editors, this will separate the conditions from the nested code:
    x = 5
    if (x > 3 and
        x < 10):
        # Both conditions satisfied
        print(x)
- Add extra indentation on the line continuation:
    x = 5
    if (x > 3 and
            x < 10):
        print(x)
An alternative style of indentation following a line break is a hanging indent. This is a typographical term meaning that every line but the first in a paragraph or statement is indented. You can use a hanging indent to visually represent a continuation of a line of code. Here’s an example:
var = function(
    arg_one, arg_two,
    arg_three, arg_four)
When using a hanging indent, add extra indentation to distinguish the continued line from code contained inside the function. The following example is difficult to read because the code inside the function is at the same indentation level as the continued lines:
# Not Recommended
def function(
    arg_one, arg_two,
    arg_three, arg_four):
    return arg_one
Instead, it’s better to use a double indent on the line continuation. This helps you to distinguish between function arguments and the function body, improving readability:
def function(
        arg_one, arg_two,
        arg_three, arg_four):
    return arg_one
When you write PEP 8 compliant code, the 79 character line limit forces you to add line breaks in your code. To improve readability, you should indent a continued line to show that it is a continued line. There are two ways of doing this. The first is to align the indented block with the opening delimiter. The second is to use a hanging indent. You are free to choose which method of indentation you use following a line break.
Where to Put the Closing Brace
Line continuations allow you to break lines inside parentheses, brackets, or braces. It’s easy to forget about the closing brace, but it’s important to put it somewhere sensible. Otherwise, it can confuse the reader. PEP 8 provides two options for the position of the closing brace in implied line continuations:
- Line up the closing brace with the first non-whitespace character of the previous line:
    list_of_numbers = [
        1, 2, 3,
        4, 5, 6,
        7, 8, 9
        ]
- Line up the closing brace with the first character of the line that starts the construct:
    list_of_numbers = [
        1, 2, 3,
        4, 5, 6,
        7, 8, 9
    ]
You are free to choose which option you use. But, as always, consistency is key, so try to stick to one of the above methods.
“If the implementation is hard to explain, it’s a bad idea.”
— The Zen of Python
You should use comments to document code as it’s written. It is important to document your code so that you, and any collaborators, can understand it. When you or someone else reads a comment, they should be able to easily understand the code the comment applies to and how it fits in with the rest of your code.
Here are some key points to remember when adding comments to your code:
- Limit the line length of comments and docstrings to 72 characters.
- Use complete sentences, starting with a capital letter.
- Make sure to update comments if you change your code.
Use block comments to document a small section of code. They are useful when you have to write several lines of code to perform a single action, such as importing data from a file or updating a database entry. They are important as they help others understand the purpose and functionality of a given code block.
PEP 8 provides the following rules for writing block comments:
- Indent block comments to the same level as the code they describe.
- Start each line with a # followed by a single space.
- Separate paragraphs by a line containing a single #.
Here is a block comment explaining the function of a for loop. Note that the sentence wraps to a new line to preserve the 79 character line limit:
for i in range(0, 10):
    # Loop over i ten times and print out the value of i, followed by a
    # new line character
    print(i, '\n')
Sometimes, if the code is very technical, then it is necessary to use more than one paragraph in a block comment:
def quadratic(a, b, c, x):
    # Calculate the solution to a quadratic equation using the quadratic
    # formula.
    #
    # There are always two solutions to a quadratic equation, x_1 and x_2.
    x_1 = (- b+(b**2-4*a*c)**(1/2)) / (2*a)
    x_2 = (- b-(b**2-4*a*c)**(1/2)) / (2*a)
    return x_1, x_2
If you’re ever in doubt as to what comment type is suitable, then block comments are often the way to go. Use them as much as possible throughout your code, but make sure to update them if you make changes to your code!
Inline comments explain a single statement in a piece of code. They are useful to remind you, or explain to others, why a certain line of code is necessary. Here’s what PEP 8 has to say about them:
- Use inline comments sparingly.
- Write inline comments on the same line as the statement they refer to.
- Separate inline comments by two or more spaces from the statement.
- Start inline comments with a # and a single space, like block comments.
- Don’t use them to explain the obvious.
Below is an example of an inline comment:
x = 5  # This is an inline comment
Sometimes, inline comments can seem necessary, but you can use better naming conventions instead. Here’s an example:
x = 'John Smith'  # Student Name
Here, the inline comment does give extra information. However, using x as a variable name for a person’s name is bad practice. There’s no need for the inline comment if you rename your variable:
student_name = 'John Smith'
Finally, inline comments such as these are bad practice as they state the obvious and clutter code:
empty_list = []  # Initialize empty list

x = 5
x = x * 5  # Multiply x by 5
Inline comments are more specific than block comments, and it’s easy to add them when they’re not necessary, which leads to clutter. You could get away with only using block comments so, unless you are sure you need an inline comment, your code is more likely to be PEP 8 compliant if you stick to block comments.
Documentation Strings
Documentation strings, or docstrings, are strings enclosed in triple double (""") or triple single (''') quotation marks that appear on the first line of any function, class, method, or module. You can use them to explain and document a specific block of code. There is an entire PEP, PEP 257, that covers docstrings, but you’ll get a summary in this section.
The most important rules applying to docstrings are the following:
- Surround docstrings with three double quotes on either side, as in """This is a docstring""".
- Write them for all public modules, functions, classes, and methods.
- Put the """ that ends a multiline docstring on a line by itself:
    def quadratic(a, b, c, x):
        """Solve quadratic equation via the quadratic formula.

        A quadratic equation has the following form:
        ax**2 + bx + c = 0

        There are always two solutions to a quadratic equation: x_1 & x_2.
        """
        x_1 = (- b+(b**2-4*a*c)**(1/2)) / (2*a)
        x_2 = (- b-(b**2-4*a*c)**(1/2)) / (2*a)

        return x_1, x_2
- For one-line docstrings, keep the """ on the same line:
    def quadratic(a, b, c, x):
        """Use the quadratic formula"""
        x_1 = (- b+(b**2-4*a*c)**(1/2)) / (2*a)
        x_2 = (- b-(b**2-4*a*c)**(1/2)) / (2*a)

        return x_1, x_2
For a more detailed article on documenting Python code, see Documenting Python Code: A Complete Guide by James Mertz.
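One practical reason to write docstrings is that Python keeps them on the object, where tools such as help() can find them. The sketch below is a simplified variant of the quadratic example with the unused x parameter dropped (an assumption made for this illustration). It reads the docstring back with inspect.getdoc from the standard library.

```python
import inspect


def quadratic(a, b, c):
    """Solve a quadratic equation via the quadratic formula.

    Return the two roots of ax**2 + bx + c = 0 as a tuple.
    """
    root = (b**2 - 4*a*c) ** (1/2)
    return (-b + root) / (2*a), (-b - root) / (2*a)


# inspect.getdoc() returns the docstring with its indentation cleaned up;
# the first line is the one-sentence summary.
summary = inspect.getdoc(quadratic).splitlines()[0]
print(summary)
```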
Whitespace in Expressions and Statements
“Sparse is better than dense.”
— The Zen of Python
Whitespace can be very helpful in expressions and statements when used properly. If there is not enough whitespace, then code can be difficult to read, as it’s all bunched together. If there’s too much whitespace, then it can be difficult to visually combine related terms in a statement.
Whitespace Around Binary Operators
Surround the following binary operators with a single space on either side:
- Assignment operators (=, +=, -=, and so forth)
- Comparisons (==, !=, >, <, >=, <=) and (is, is not, in, not in)
- Booleans (and, not, or)
When there’s more than one operator in a statement, adding a single space before and after each operator can look confusing. Instead, it is better to only add whitespace around the operators with the lowest priority, especially when performing mathematical manipulation. Here are a couple of examples:
# Recommended
y = x**2 + 5
z = (x+y) * (x-y)
# Not Recommended
y = x ** 2 + 5
z = (x + y) * (x - y)
You can also apply this to if statements where there are multiple conditions:
# Not recommended
if x > 5 and x % 2 == 0:
    print('x is larger than 5 and divisible by 2!')
In the above example, the and operator has lowest priority. It may therefore be clearer to express the if statement as below:
# Recommended
if x>5 and x%2==0:
    print('x is larger than 5 and divisible by 2!')
You are free to choose which is clearer, with the caveat that you must use the same amount of whitespace either side of the operator.
The following is not acceptable:
# Definitely do not do this!
if x >5 and x% 2== 0:
    print('x is larger than 5 and divisible by 2!')
In slices, colons act as binary operators. Therefore, the rules outlined in the previous section apply, and there should be the same amount of whitespace either side. The following examples of list slices are valid:
list[3:4]
# Treat the colon as the operator with lowest priority
list[x+1 : x+2]
# In an extended slice, both colons must be
# surrounded by the same amount of whitespace
list[3:4:5]
list[x+1 : x+2 : x+3]
# The space is omitted if a slice parameter is omitted
list[x+1 : x+2 :]
In summary, you should surround most operators with whitespace. However, there are some caveats to this rule, such as in function arguments or when you’re combining multiple operators in one statement.
When to Avoid Adding Whitespace
In some cases, adding whitespace can make code harder to read. Too much whitespace can make code overly sparse and difficult to follow. PEP 8 outlines very clear examples where whitespace is inappropriate.
The most important place to avoid adding whitespace is at the end of a line. This is known as trailing whitespace. It is invisible and can produce errors that are difficult to trace.
The following list outlines some cases where you should avoid adding whitespace:
- Immediately inside parentheses, brackets, or braces:
    # Recommended
    my_list = [1, 2, 3]

    # Not recommended
    my_list = [ 1, 2, 3, ]
- Before a comma, semicolon, or colon:
    x = 5
    y = 6

    # Recommended
    print(x, y)

    # Not recommended
    print(x , y)
- Before the open parenthesis that starts the argument list of a function call:
    def double(x):
        return x * 2

    # Recommended
    double(3)

    # Not recommended
    double (3)
- Before the open bracket that starts an index or slice:
    # Recommended
    list[3]

    # Not recommended
    list [3]
- Between a trailing comma and a closing parenthesis:
    # Recommended
    tuple = (1,)

    # Not recommended
    tuple = (1, )
- To align assignment operators:
    # Recommended
    var1 = 5
    var2 = 6
    some_long_var = 7

    # Not recommended
    var1          = 5
    var2          = 6
    some_long_var = 7
Make sure that there is no trailing whitespace anywhere in your code. There are other cases where PEP 8 discourages adding extra whitespace, such as immediately inside brackets, as well as before commas and colons. You should also never add extra whitespace in order to align operators.
Programming Recommendations
“Simple is better than complex.”
— The Zen of Python
You will often find that there are several ways to perform a similar action in Python (and any other programming language for that matter). In this section, you’ll see some of the suggestions PEP 8 provides to remove that ambiguity and preserve consistency.
Don’t compare Boolean values to True or False using the equality operator. You’ll often need to check if a Boolean value is True or False. When doing so, it is intuitive to do this with a statement like the one below:
# Not recommended
my_bool = 6 > 5
if my_bool == True:
    return '6 is bigger than 5'
The use of the equality operator, ==, is unnecessary here. bool can only take values True or False. It is enough to write the following:
# Recommended
if my_bool:
    return '6 is bigger than 5'
This way of performing an if statement with a Boolean requires less code and is simpler, so PEP 8 encourages it.
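Comparing with == True can also give a different answer from a plain truth test when the value is not actually a bool. The sketch below illustrates this with an integer. The helper describe and the variable names are made up for this example.

```python
def describe(value):
    # Hypothetical helper: truth-tests the value directly.
    if value:
        return "truthy"
    return "falsy"


count = 2

# 2 is truthy, but it is not equal to True (which compares equal to 1).
equals_true = (count == True)
label = describe(count)

print(equals_true, label)
```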
Use the fact that empty sequences are falsy in if statements. If you want to check whether a list is empty, you might be tempted to check the length of the list. If the list is empty, its length is 0, which is equivalent to False when used in an if statement. Here’s an example:
# Not recommended
my_list = []
if not len(my_list):
    print('List is empty!')
However, in Python any empty list, string, or tuple is falsy. We can therefore come up with a simpler alternative to the above:
# Recommended
my_list = []
if not my_list:
    print('List is empty!')
While both examples will print out List is empty!, the second option is simpler, so PEP 8 encourages it.
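The same truth test works for the other built-in containers as well. The sketch below checks a handful of empty values with bool(). The dictionary and set entries go beyond the list, string, and tuple named above, and the variable names are made up for illustration.

```python
# Empty built-in containers, keyed by a descriptive label.
empty_values = {
    "list": [],
    "string": "",
    "tuple": (),
    "dict": {},
    "set": set(),
}

# bool() applies the same truth test that an if statement uses.
truthiness = {name: bool(value) for name, value in empty_values.items()}

# A container holding a falsy element is not empty, so it is truthy.
non_empty_is_truthy = bool([0])

print(truthiness)
```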
Use is not rather than not ... is in if statements. If you are trying to check whether a variable has a defined value, there are two options. The first is to evaluate an if statement with x is not None, as in the example below:
# Recommended
if x is not None:
    return 'x exists!'
A second option would be to evaluate x is None and then have an if statement based on the negation of the outcome:
# Not recommended
if not x is None:
    return 'x exists!'
While both options will be evaluated correctly, the first is simpler, so PEP 8 encourages it.
Don’t use if x: when you mean if x is not None:. Sometimes, you may have a function with arguments that are None by default. A common mistake when checking if such an argument, arg, has been given a different value is to use the following:
# Not Recommended
if arg:
    # Do something with arg...
This code checks that arg is truthy. Instead, you want to check that arg is not None, so it would be better to use the following:
# Recommended
if arg is not None:
    # Do something with arg...
The mistake being made here is assuming that not None and truthy are equivalent. You could have set arg = []. As we saw above, empty lists are evaluated as falsy in Python. So, even though the argument arg has been assigned, the condition is not met, and so the code in the body of the if statement will not be executed.
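The difference becomes visible as soon as a caller passes an empty list. The sketch below contrasts the two checks. Both helper functions and the variable names are made up for this illustration.

```python
def append_truthy(item, bucket=None):
    # Not recommended: an empty list passed by the caller is falsy,
    # so it is silently replaced by a new list.
    if bucket:
        bucket.append(item)
        return bucket
    return [item]


def append_not_none(item, bucket=None):
    # Recommended: only a missing argument triggers the default.
    if bucket is not None:
        bucket.append(item)
        return bucket
    return [item]


shared = []
ignored = append_truthy("a", shared) is not shared
shared_used = append_not_none("a", shared) is shared

print(ignored, shared_used, shared)
```

With the truthiness check, the caller’s empty list is never touched. With the is not None check, the item ends up in the list the caller supplied.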
Use .startswith() and .endswith() instead of slicing. If you were trying to check if a string word was prefixed, or suffixed, with the word cat, it might seem sensible to use string slicing. However, slicing is prone to error, and you have to hardcode the number of characters in the prefix or suffix. It is also not clear to someone less familiar with Python slicing what you are trying to achieve:
# Not recommended
if word[:3] == 'cat':
    print('The word starts with "cat"')
This is not as readable as using .startswith():
# Recommended
if word.startswith('cat'):
    print('The word starts with "cat"')
The same principle applies when you’re checking for suffixes. The example below outlines how you might check whether a string ends in jpg:
# Not recommended
if file_name[-3:] == 'jpg':
    print('The file is a JPEG')
While the outcome is correct, the notation is a bit clunky and hard to read. Instead, you could use .endswith() as in the example below:
# Recommended
if file_name.endswith('jpg'):
    print('The file is a JPEG')
As with most of these programming recommendations, the goal is readability and simplicity. In Python, there are many different ways to perform the same action, so guidelines on which methods to choose are helpful.
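A further advantage of these methods is that both accept a tuple of candidates, which slicing cannot express in a single comparison. The sketch below uses a made-up helper, is_jpeg, and lowercases the file name first. The lowercasing is an assumption about what a caller might want, not something PEP 8 prescribes.

```python
def is_jpeg(file_name):
    # Hypothetical helper: lowercase the name so "PHOTO.JPG" also matches,
    # then test both common JPEG suffixes in one call.
    return file_name.lower().endswith((".jpg", ".jpeg"))


print(is_jpeg("photo.JPG"), is_jpeg("notes.txt"))
```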
When to Ignore PEP 8
The short answer to this question is never. If you follow PEP 8 to the letter, you can guarantee that you’ll have clean, professional, and readable code. This will benefit you as well as collaborators and potential employers.
However, some guidelines in PEP 8 are inconvenient in the following instances:
- If complying with PEP 8 would break compatibility with existing software
- If code surrounding what you’re working on is inconsistent with PEP 8
- If code needs to remain compatible with older versions of Python
Tips and Tricks to Help Ensure Your Code Follows PEP 8
There is a lot to remember to make sure your code is PEP 8 compliant. It can be a tall order to remember all these rules when you’re developing code. It’s particularly time consuming to update past projects to be PEP 8 compliant. Luckily, there are tools that can help speed up this process. There are two classes of tools that you can use to enforce PEP 8 compliance: linters and autoformatters.
Linters
Linters are programs that analyze code and flag errors. They provide suggestions on how to fix the error. Linters are particularly useful when installed as extensions to your text editor, as they flag errors and stylistic problems while you write. In this section, you’ll see an outline of how the linters work, with links to the text editor extensions at the end.
The best linters for Python code are the following:
- pycodestyle is a tool to check your Python code against some of the style conventions in PEP 8.
  Install pycodestyle using pip:
    $ pip install pycodestyle
  You can run pycodestyle from the terminal using the following command:
    $ pycodestyle code.py
    code.py:1:17: E231 missing whitespace after ','
    code.py:2:21: E231 missing whitespace after ','
    code.py:6:19: E711 comparison to None should be 'if cond is None:'
- flake8 is a tool that combines a static error checker, pyflakes, with pycodestyle.
  Install flake8 using pip:
    $ pip install flake8
  Run flake8 from the terminal using the following command:
    $ flake8 code.py
    code.py:1:17: E231 missing whitespace after ','
    code.py:2:21: E231 missing whitespace after ','
    code.py:3:17: E999 SyntaxError: invalid syntax
    code.py:6:19: E711 comparison to None should be 'if cond is None:'
  An example of the output is also shown.
These are also available as extensions for Atom, Sublime Text, Visual Studio Code, and VIM. You can also find guides on setting up Sublime Text and VIM for Python development, as well as an overview of some popular text editors at Real Python.
Autoformatters
Autoformatters are programs that refactor your code to conform with PEP 8 automatically. One such program is black, which autoformats code following most of the rules in PEP 8. One big difference is that it limits line length to 88 characters, rather than 79. However, you can override this by adding a command line flag, as you’ll see in an example below.
Install black using pip. It requires Python 3.6+ to run:
$ pip install black
It can be run via the command line, as with the linters. Let’s say you start with the following code that isn’t PEP 8 compliant in a file called code.py:
for i in range(0,3):
    for j in range(0,3):
        if (i==2):
            print(i,j)
You can then run the following command via the command line:
$ black code.py
reformatted code.py
All done! ✨ 🍰 ✨
code.py will be automatically reformatted to look like this:
for i in range(0, 3):
    for j in range(0, 3):
        if i == 2:
            print(i, j)
If you want to alter the line length limit, then you can use the --line-length flag:
$ black --line-length=79 code.py
reformatted code.py
All done! ✨ 🍰 ✨
Two other autoformatters, autopep8 and yapf, perform actions that are similar to what black does.
Another Real Python tutorial, Python Code Quality: Tools & Best Practices by Alexander van Tol, gives a thorough explanation of how to use these tools.
Conclusion
You now know how to write high-quality, readable Python code by using the guidelines laid out in PEP 8. While the guidelines can seem pedantic, following them can really improve your code, especially when it comes to sharing your code with potential employers or collaborators.
In this tutorial, you learned:
- What PEP 8 is and why it exists
- Why you should aim to write PEP 8 compliant code
- How to write code that is PEP 8 compliant
On top of all this, you also saw how to use linters and autoformatters to check your code against PEP 8 guidelines.
If you want to learn more about PEP 8, then you can read the full documentation, or visit pep8.org, which contains the same information but has been nicely formatted. In these documents, you will find the rest of the PEP 8 guidelines not covered in this tutorial.
Python Enhancement Proposal #8 (PEP 8 for short) contains recommendations for styling the code of Python programs. Generally speaking, you are free to format your code however you see fit. However, a uniform style makes code easier for other people to study and improves its readability. Sharing a common style with other Python programmers in the wider community improves the quality of programs when collaborating on projects. But even if the only person who will ever read your code is you, following the PEP 8 recommendations will make later changes to the code easier.
PEP 8 contains detailed rules for writing Python code. The document is updated continually as the language evolves. It is worth reading the whole guide.
Below are some rules that you should be sure to follow.
In Python, whitespace is syntactically significant. Python programmers pay particular attention to the effect of whitespace on code readability.
PEP 8 suggests a distinct naming style for different language elements. This makes it easy to tell, while reading code, what kind of thing each name refers to:
One of the tenets of the Zen of Python states: “There should be one, and preferably only one, obvious way to do it.” The PEP 8 recommendations attempt to codify such a style for writing expressions and statements.
Within each subsection, modules should be arranged in alphabetical order.
If you ask Python programmers what they like most about Python, they will often cite its high readability. Indeed, a high level of readability is at the heart of the design of the Python language, following the recognized fact that code is read much more often than it is written.
One reason for the high readability of Python code is its relatively complete set of code style guidelines (PEP 8) and “Pythonic” idioms.
When a veteran Python developer (a Pythonista) calls portions of code not “Pythonic”, they usually mean that these lines of code do not follow the common guidelines and fail to express their intent in what is considered the best (that is, the most readable) way.
In some borderline cases, no agreement has been reached on how to express an intent in Python code, but these cases are rare.
While any kind of black magic is possible with Python, the most explicit and straightforward manner is preferred.
In the good code above, x and y are explicitly received from the caller, and an explicit dictionary is returned. The developer using this function knows exactly what to do by reading the first and last lines, which is not the case with the bad example.
One statement per line
While some compound statements, such as list comprehensions, are allowed and appreciated for their brevity and expressiveness, it is bad practice to have two disjoint statements on the same line of code.
Bad
print('one'); print('two')

if x == 1: print('one')

if <complex comparison> and <other complex comparison>:
    # do something
Good
print('one')
print('two')

if x == 1:
    print('one')

cond1 = <complex comparison>
cond2 = <other complex comparison>
if cond1 and cond2:
    # do something
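The placeholders in the example above can be made concrete. The sketch below gives each condition its own named variable before combining them. The function can_ship and its parameters are invented purely for illustration.

```python
def can_ship(order_total, items_in_stock, address):
    # Hypothetical example: each condition sits on its own line and
    # gets a descriptive name before the conditions are combined.
    has_valid_total = order_total > 0
    has_stock = items_in_stock > 0
    has_address = bool(address.strip())

    if has_valid_total and has_stock and has_address:
        return True
    return False


print(can_ship(10.0, 3, "1 Main St"))
```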
Аргументы функции
Аргументы могут быть переданы в функции четырьмя различными способами.
- Позиционные аргументы являются обязательными и не имеют значений по умолчанию. Они являются простейшей формой аргументов и могут использоваться для нескольких аргументов функции, которые полностью являются частью значения функции, и их порядок является естественным. Например, пользователь или пользователь функции без труда помнит, что эти две функции требуют двух аргументов и в каком порядке.
send(message, recipient)
point(x, y)
В этих двух случаях, можно использовать имена аргументов при вызове функции и, делая это, можно изменить порядок аргументов, вызывая, например , и , но это снижает читаемость и излишне многословные, по сравнению с более простыми вызовами к и .send(recipient='World', message='Hello')
point(y=2, x=1)
send('Hello', 'World')
point(1, 2)
- Аргументы ключевых слов не являются обязательными и имеют значения по умолчанию. Они часто используются для необязательных параметров, отправляемых в функцию. Когда функция имеет более двух или трех позиционных параметров, ее сигнатуру труднее запомнить, и полезно использовать аргументы ключевых слов со значениями по умолчанию. Например, более полная
send
функция может быть определена как . Здесь и не являются обязательными, и оценивают, когда им не передается другое значение.send(message, to, cc=None, bcc=None)
cc
bcc
None
Вызов функции с аргументами ключевых слов может быть выполнен несколькими способами в Python; например, можно следовать порядку аргументов в определении, не называя аргументы в явном виде, как, например , отправляя слепую копию для Бога. Также можно было бы назвать аргументы в другом порядке, например, в . Эти две возможности лучше избегать без каких — либо веских причин , чтобы не следить за синтаксис , который ближе всего к определению функции: .send('Hello', 'World', 'Cthulhu', 'God')
send('Hello again', 'World', bcc='God', cc='Cthulhu')
send('Hello', 'World', cc='Cthulhu', bcc='God')
В качестве примечания, следуя принципу YAGNI , зачастую сложнее удалить необязательный аргумент (и его логику внутри функции), который был добавлен «на всякий случай» и, по-видимому, никогда не используется, чем добавить новый необязательный аргумент и его логика, когда это необходимо.
- Список произвольных аргументов — это третий способ передачи аргументов в функцию. Если намерение функции лучше выражается сигнатурой с расширяемым числом позиционных аргументов, ее можно определить с помощью
*args
конструкций. В теле функцииargs
будет кортеж всех оставшихся позиционных аргументов. Например, может быть вызван с каждым получателем в качестве аргумента:, а в теле функции будет равно .send(message, *args)
send('Hello', 'God', 'Mom', 'Cthulhu')
args
('God', 'Mom', 'Cthulhu')
Однако эта конструкция имеет некоторые недостатки и должна использоваться с осторожностью. Если функция получает список аргументов одинаковой природы, часто более понятно определить ее как функцию одного аргумента, причем этот аргумент является списком или любой последовательностью. Здесь, если send
есть несколько получателей, лучше определить это явно: и вызвать его с помощью . Таким образом, пользователь функции может заранее манипулировать списком получателей как списком, и это открывает возможность для передачи любой последовательности, включая итераторы, которая не может быть распакована как другие последовательности.send(message, recipients)
send('Hello', ['God', 'Mom', 'Cthulhu'])
- The arbitrary keyword argument dictionary is the last way to pass arguments to functions. If the function requires an undetermined series of named arguments, it is possible to use the **kwargs construct. In the function body, kwargs will be a dictionary of all the passed named arguments that have not been caught by other keyword arguments in the function signature.
The same caution as in the case of the arbitrary argument list is necessary, for similar reasons: these powerful techniques are to be used when there is a proven necessity to use them, and they should not be used if a simpler and clearer construct is sufficient to express the function's intention.
It is up to the programmer writing the function to determine which arguments are positional arguments and which are optional keyword arguments, and to decide whether to use the advanced techniques of arbitrary argument passing. If the advice above is followed wisely, it is possible and enjoyable to write Python functions that are:
- easy to read (the name and arguments need no explanations)
- easy to change (adding a new keyword argument does not break other parts of the code)
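As a minimal sketch of the trade-off between *args and an explicit sequence parameter, the two forms of send can be compared side by side. The function bodies, the send_varargs name, and the greeting format are assumptions made for illustration; the guide only gives the signatures.

```python
def send_varargs(message, *args):
    """Variadic form: every extra positional argument is a recipient."""
    return [f"{message}, {name}!" for name in args]


def send(message, recipients):
    """Explicit form: recipients is any iterable of names."""
    return [f"{message}, {name}!" for name in recipients]


names = ["God", "Mom", "Cthulhu"]

# The variadic form makes callers who already hold a list unpack it.
varargs_result = send_varargs("Hello", *names)

# The explicit form takes the list directly, or a lazy generator.
list_result = send("Hello", names)
lazy_result = send("Hello", (name.upper() for name in names))

print(list_result)
print(lazy_result)
```

Both forms produce the same result for the same recipients; the explicit form simply keeps the collection visible in the signature.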
Avoid the magical wand
A powerful tool for hackers, Python comes with a very rich set of hooks and tools allowing you to do almost any kind of trick. For instance, it is possible to do each of the following:
- change how objects are created and instantiated
- change how the Python interpreter imports modules
- It is even possible (and recommended if needed) to embed C routines in Python.
However, all these options have many drawbacks and it is always better to use the most straightforward way to achieve your goal. The main drawback is that readability suffers greatly when using these constructs. Many code analysis tools, such as pylint or pyflakes, will be unable to parse this "magic" code.
We consider that a Python developer should know about these nearly infinite possibilities, because it instills confidence that no impassable problem will be on the way. However, knowing how and particularly when not to use them is very important.
Like a kung fu master, a Pythonista knows how to kill with a single finger, and never to actually do it.
We are all responsible users
As seen above, Python allows many tricks, and some of them are potentially dangerous. A good example is that any client code can override an object's properties and methods: there is no "private" keyword in Python. This philosophy, very different from highly defensive languages like Java, which give a lot of mechanisms to prevent any misuse, is expressed by the saying: "We are all responsible users".
This doesn't mean that, for example, no properties are considered private, and that no proper encapsulation is possible in Python. Rather, instead of relying on concrete walls erected by the developers between their code and others', the Python community prefers to rely on a set of conventions indicating that these elements should not be accessed directly.
The main convention for private properties and implementation details is to prefix all "internals" with an underscore. If the client code breaks this rule and accesses these marked elements, any misbehavior or problems encountered if the code is modified is the responsibility of the client code.
Using this convention generously is encouraged: any method or property that is not intended to be used by client code should be prefixed with an underscore. This will guarantee a better separation of duties and easier modification of existing code; it will always be possible to publicize a private property, but making a public property private might be a much harder operation.
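The underscore convention can be sketched with a small class. Account, its attributes, and the deposit rule are hypothetical names chosen for illustration, not part of the guide.

```python
class Account:
    """Hypothetical example class illustrating the underscore convention."""

    def __init__(self, owner, balance=0):
        self.owner = owner          # public attribute
        self._balance = balance     # internal: leading underscore

    def deposit(self, amount):
        """Public API: validates input before touching internal state."""
        if amount <= 0:
            raise ValueError("amount must be positive")
        self._balance += amount

    @property
    def balance(self):
        """Read-only public view of the internal value."""
        return self._balance


account = Account("Mom")
account.deposit(50)
print(account.balance)

# Nothing stops client code from reaching in; the underscore only signals
# that doing so is at the client's own risk.
account._balance = -1
print(account.balance)
```

The interpreter enforces nothing here: the leading underscore is a message to the reader, and the read-only property is the supported way to look at the value.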
Returning values
When a function grows in complexity, it is not uncommon to use multiple return statements inside the function's body. However, in order to keep a clear intent and a sustainable readability level, it is preferable to avoid returning meaningful values from many output points in the body.
There are two main cases for returning values in a function: the result returned when the function has completed normally, and the error cases that indicate a wrong input parameter or any other reason for the function to not be able to complete its computation or task.
If you do not wish to raise exceptions for the second case, then returning a value, such as None or False, indicating that the function could not perform correctly might be needed. In this case, it is better to return as early as the incorrect context has been detected. It will help to flatten the structure of the function: all the code after the return-because-of-error statement can assume the condition is met to further compute the function's main result. Having multiple such return statements is often necessary.
However, when a function has multiple main exit points for its normal course, it becomes difficult to debug the returned result, so it may be preferable to keep a single exit point. This will also help factoring out some code paths, and the multiple exit points are a probable indication that such a refactoring is needed.
def complex_function(a, b, c):
    if not a:
        return None  # Raising an exception might be better
    if not b:
        return None  # Raising an exception might be better
    # Some complex code trying to compute x from a, b and c
    x = None  # placeholder for that computation
    # Resist the temptation to return x here if it succeeded
    if not x:
        x = None  # placeholder for some Plan-B computation of x
    return x  # One single exit point for the returned value x will help
              # when maintaining the code.
Idioms
A programming idiom, put simply, is a way to write code. The notion of programming idioms is discussed amply at c2 and at Stack Overflow.
Idiomatic Python code is often referred to as being Pythonic.
Although there usually is one (and preferably only one) obvious way to do it, the way to write idiomatic Python code can be non-obvious to Python beginners. So, good idioms must be consciously acquired.
Some common Python idioms follow:
Unpacking
If you know the length of a list or tuple, you can assign names to its elements with unpacking. For example, since enumerate() will provide a tuple of two elements for each item in the list:
for index, item in enumerate(some_list):
    # do something with index and item
You can use this to swap variables as well:
a, b = b, a
Nested unpacking works too:
a, (b, c) = 1, (2, 3)
In Python 3, a new method of extended unpacking was introduced by PEP 3132:
a, *rest = [1, 2, 3]
# a = 1, rest = [2, 3]
a, *middle, c = [1, 2, 3, 4]
# a = 1, middle = [2, 3], c = 4
Create an ignored variable
If you need to assign something (for instance, in Unpacking) but will not need that variable, use __:
filename = 'foobar.txt'
basename, __, ext = filename.rpartition('.')
Note
Many Python style guides recommend the use of a single underscore "_" for throwaway variables rather than the double underscore "__" recommended here. The issue is that "_" is commonly used as an alias for the gettext() function, and is also used at the interactive prompt to hold the value of the last operation. Using a double underscore instead is just as clear and almost as convenient, and eliminates the risk of accidentally interfering with either of these other use cases.
Create a length-N list of the same thing
Use the Python list * operator:
four_nones = [None] * 4
Create a length-N list of lists
Because lists are mutable, the * operator (as above) will create a list of N references to the same list, which is not likely what you want. Instead, use a list comprehension:
four_lists = [[] for __ in range(4)]
Note: on Python 2, use xrange() instead of range() here.
Create a string from a list
A common idiom for creating strings is to use str.join() on an empty string.
letters = ['s', 'p', 'a', 'm']
word = ''.join(letters)
This will set the value of the variable word to 'spam'. This idiom can be applied to lists and tuples.
Searching for an item in a collection
Sometimes we need to search through a collection of things. Let's look at two options: lists and sets.
Take the following code for example:
s = set(['s', 'p', 'a', 'm'])
l = ['s', 'p', 'a', 'm']
def lookup_set(s):
    return 's' in s
def lookup_list(l):
    return 's' in l
Even though both functions look identical, because lookup_set is utilizing the fact that sets in Python are hashtables, the lookup performance between the two is very different. To determine whether an item is in a list, Python will have to go through each item until it finds a matching item. This is time consuming, especially for long lists. In a set, on the other hand, the hash of the item will tell Python where in the set to look for a matching item. As a result, the search can be done quickly, even if the set is large. Searching in dictionaries works the same way. For more information see this StackOverflow page. For detailed information on the amount of time various common operations take on each of these data structures, see this page.
Because of these differences in performance, it is often a good idea to use sets or dictionaries instead of lists in cases where:
- The collection will contain a large number of items
- You will be repeatedly searching for items in the collection
- You do not have duplicate items.
For small collections, or collections which you will not frequently be searching through, the additional time and memory required to set up the hashtable will often be greater than the time saved by the improved search speed.
Zen of Python
Also known as PEP 20, the guiding principles for Python's design.
>>> import this
The Zen of Python, by Tim Peters
Beautiful is better than ugly.
Explicit is better than implicit.
Simple is better than complex.
Complex is better than complicated.
Flat is better than nested.
Sparse is better than dense.
Readability counts.
Special cases aren't special enough to break the rules.
Although practicality beats purity.
Errors should never pass silently.
Unless explicitly silenced.
In the face of ambiguity, refuse the temptation to guess.
There should be one-- and preferably only one --obvious way to do it.
Although that way may not be obvious at first unless you're Dutch.
Now is better than never.
Although never is often better than *right* now.
If the implementation is hard to explain, it's a bad idea.
If the implementation is easy to explain, it may be a good idea.
Namespaces are one honking great idea -- let's do more of those!
For some examples of good Python style, see these slides from a Python user group.
Here are some conventions you should follow to make your code easier to read.
Check if a variable equals a constant
You don't need to explicitly compare a value to True, or None, or 0: you can just add it to the if statement. See Truth Value Testing for a list of what is considered false.
Bad:
if attr == True:
    print('True!')
if attr == None:
    print('attr is None!')
Good:
# Just check the value
if attr:
    print('attr is truthy!')
# or check for the opposite
if not attr:
    print('attr is falsey!')
# or, since None is considered false, explicitly check for it
if attr is None:
    print('attr is None!')
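One consequence worth making concrete: a bare truthiness test cannot tell None apart from 0 or an empty container, which is why the explicit "is None" check exists. The sketch below assumes a hypothetical helper named describe.

```python
def describe(attr):
    """Classify a value the way the checks above would see it."""
    if attr is None:          # identity test: only None matches
        return "missing"
    if not attr:              # any other falsy value: 0, '', [], {} ...
        return "empty or zero"
    return "truthy"


for value in (None, 0, "", [], [0], "spam"):
    print(repr(value), "->", describe(value))
```

Note that [0] is truthy: a non-empty list is true regardless of what it contains.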
Access a dictionary element
Don't use the dict.has_key() method (it was removed in Python 3). Instead, use x in d syntax, or pass a default argument to dict.get().
Bad:
d = {'hello': 'world'}
if d.has_key('hello'):
    print(d['hello'])    # prints 'world'
else:
    print('default_value')
Good:
d = {'hello': 'world'}
print(d.get('hello', 'default_value'))  # prints 'world'
print(d.get('thingy', 'default_value'))  # prints 'default_value'
# Or:
if 'hello' in d:
    print(d['hello'])
Short ways to manipulate lists
List comprehensions provide a powerful, concise way to work with lists.
Generator expressions follow almost the same syntax as list comprehensions but return a generator instead of a list.
Creating a new list requires more work and uses more memory. If you are just going to loop through the new list, prefer using an iterator instead.
Bad:
# needlessly allocates a list of all (gpa, name) entries in memory
valedictorian = max([(student.gpa, student.name) for student in graduates])
Good:
valedictorian = max((student.gpa, student.name) for student in graduates)
Use list comprehensions when you really need to create a second list, for example if you need to use the result multiple times.
If your logic is too complicated for a short list comprehension or generator expression, consider using a generator function instead of returning a list.
Good:
def make_batches(items, batch_size):
    """
    >>> list(make_batches([1, 2, 3, 4, 5], batch_size=3))
    [[1, 2, 3], [4, 5]]
    """
    current_batch = []
    for item in items:
        current_batch.append(item)
        if len(current_batch) == batch_size:
            yield current_batch
            current_batch = []
    if current_batch:  # skip the empty trailing batch on exact multiples
        yield current_batch
Never use a list comprehension just for its side effects.
Bad:
[print(x) for x in sequence]
Good:
for x in sequence:
    print(x)
Filtering a list
Bad:
Never remove items from a list while you are iterating through it.
# Filter elements greater than 4
a = [3, 4, 5]
for i in a:
    if i > 4:
        a.remove(i)
Don't make multiple passes through the list.
while i in a:
    a.remove(i)
Good:
Use a list comprehension or generator expression.
# comprehensions create a new list object
filtered_values = [value for value in sequence if value != x]
# generators don't create another list
filtered_values = (value for value in sequence if value != x)
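The three-element list in the bad example happens to give the right answer, which hides the problem. A slightly longer list (chosen here purely for illustration) shows how removing during iteration skips elements:

```python
# Removing during iteration: the loop's internal index keeps advancing
# while the list shrinks, so the element after each removal is skipped.
a = [3, 4, 5, 6, 7]
for i in a:
    if i > 4:
        a.remove(i)
print(a)  # 6 survives even though it is greater than 4

# A comprehension builds the filtered list without touching the original.
b = [3, 4, 5, 6, 7]
kept = [i for i in b if i <= 4]
print(kept)
```

After 5 is removed, 6 slides into the slot the loop has already visited, so it is never tested.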
Possible side effects of modifying the original list
Modifying the original list can be risky if there are other variables referencing it. But you can use slice assignment if you really want to do that.
# replace the contents of the original list
sequence[::] = [value for value in sequence if value != x]
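The difference between rebinding a name and assigning to a slice can be checked directly. In this sketch the variable names alias and rebound are illustrative, and sequence[:] is used as the shorter equivalent of sequence[::].

```python
sequence = [1, 2, 3, 2, 4]
alias = sequence            # a second name for the same list object
x = 2

# Rebinding: "rebound" is a brand-new list, the original is untouched.
rebound = [value for value in sequence if value != x]

# Slice assignment: the existing list object is modified in place,
# so every name that refers to it sees the change.
sequence[:] = [value for value in sequence if value != x]

print(alias)              # the alias sees the filtered contents
print(alias is sequence)  # still the very same object
```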
Modifying the values in a list
Bad:
Remember that assignment never creates a new object. If two or more variables refer to the same list, changing one of them changes them all.
# Add three to all list members.
a = [3, 4, 5]
b = a                     # a and b refer to the same list object
for i in range(len(a)):
    a[i] += 3             # b[i] also changes
Good:
It's safer to create a new list object and leave the original alone.
a = [3, 4, 5]
b = a
# assign the variable "a" to a new list without changing "b"
a = [i + 3 for i in a]
Use enumerate() to keep a count of your place in the list.
a = [3, 4, 5]
for i, item in enumerate(a):
    print(i, item)
# prints
# 0 3
# 1 4
# 2 5
The enumerate() function has better readability than handling a counter manually. Moreover, it is better optimized for iterators.
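For comparison, here is a sketch of the hand-maintained counter next to enumerate(). The helper names manual_count and with_enumerate are made up for this example; the start argument is part of the built-in enumerate() signature.

```python
def manual_count(items):
    """Counter maintained by hand: more state to get wrong."""
    pairs = []
    i = 0
    for item in items:
        pairs.append((i, item))
        i += 1
    return pairs


def with_enumerate(items):
    """Same result, with the counter handled by enumerate()."""
    return [(i, item) for i, item in enumerate(items)]


a = [3, 4, 5]
print(with_enumerate(a))

# enumerate() accepts any iterable (here a generator), and the optional
# start argument sets the first index.
squares = (n * n for n in a)
numbered = list(enumerate(squares, start=1))
print(numbered)
```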
Read from a file
Use the with open syntax to read from files. This will automatically close files for you.
Bad:
f = open('file.txt')
a = f.read()
print(a)
f.close()
Good:
with open('file.txt') as f:
    for line in f:
        print(line)
The with statement is better because it ensures you always close the file, even if an exception is raised inside the with block.
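The claim about exceptions can be verified with a throwaway file. This is only a sketch: the temporary file and the simulated RuntimeError are assumptions introduced to make the example self-contained.

```python
import os
import tempfile

# Create a throwaway file so the sketch is self-contained.
fd, path = tempfile.mkstemp(suffix=".txt")
with os.fdopen(fd, "w") as handle:
    handle.write("first line\nsecond line\n")

try:
    with open(path) as f:
        first = f.readline()
        raise RuntimeError("simulated failure inside the with block")
except RuntimeError:
    pass

# The file object was closed on the way out, despite the exception.
print(f.closed)

os.remove(path)
```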
Line continuations
When a logical line of code is longer than the accepted limit, you need to split it over multiple physical lines. The Python interpreter will join consecutive lines if the last character of the line is a backslash. This is helpful in some cases, but should usually be avoided because of its fragility: a white space added to the end of the line, after the backslash, will break the code and may have unexpected results.
A better solution is to use parentheses around your elements. Left with an unclosed parenthesis on an end-of-line, the Python interpreter will join the next line until the parentheses are closed. The same behavior holds for curly and square braces.
Bad:
my_very_big_string = """For a long time I used to go to bed early. Sometimes, \
    when I had put out my candle, my eyes would close so quickly that I had not even \
    time to say “I’m going to sleep.”"""
from some.deep.module.inside.a.module import a_nice_function, another_nice_function, \
    yet_another_nice_function
Good:
my_very_big_string = (
    "For a long time I used to go to bed early. Sometimes, "
    "when I had put out my candle, my eyes would close so quickly "
    "that I had not even time to say “I’m going to sleep.”"
)
from some.deep.module.inside.a.module import (
    a_nice_function, another_nice_function, yet_another_nice_function)
However, more often than not, having to split a long logical line is a sign that you are trying to do too many things at the same time, which may hinder readability.
If you ask Python programmers what they like most about Python, they will
often cite its high readability. Indeed, a high level of readability
is at the heart of the design of the Python language, following the
recognized fact that code is read much more often than it is written.
One reason for the high readability of Python code is its relatively
complete set of Code Style guidelines and “Pythonic” idioms.
When a veteran Python developer (a Pythonista) calls portions of
code not “Pythonic”, they usually mean that these lines
of code do not follow the common guidelines and fail to express their intent in
what is considered the best (that is, the most readable) way.
In some borderline cases, no best way has been agreed upon for expressing
an intent in Python code, but these cases are rare.
General concepts
Explicit code
While any kind of black magic is possible with Python, the
most explicit and straightforward manner is preferred.
Bad
def make_complex(*args):
    x, y = args
    return dict(**locals())
Good
def make_complex(x, y):
    return {'x': x, 'y': y}
In the good code above, x and y are explicitly received from
the caller, and an explicit dictionary is returned. The developer
using this function knows exactly what to do by reading the
first and last lines, which is not the case with the bad example.
One statement per line
While some compound statements such as list comprehensions are
allowed and appreciated for their brevity and their expressiveness,
it is bad practice to have two disjointed statements on the same line of code.
Bad
print('one'); print('two')
if x == 1: print('one')
if <complex comparison> and <other complex comparison>:
    # do something
Good
print('one')
print('two')
if x == 1:
    print('one')
cond1 = <complex comparison>
cond2 = <other complex comparison>
if cond1 and cond2:
    # do something
Function arguments
Arguments can be passed to functions in four different ways.
- Positional arguments are mandatory and have no default values. They are
the simplest form of arguments and they can be used for the few function
arguments that are fully part of the function’s meaning and whose order is
natural. For instance, in send(message, recipient) or point(x, y)
the user of the function has no difficulty remembering that those two
functions require two arguments, and in which order.
In those two cases, it is possible to use argument names when calling the
functions and, doing so, it is possible to switch the order of arguments,
calling for instance send(recipient='World', message='Hello') and
point(y=2, x=1), but this reduces readability and is unnecessarily verbose
compared to the more straightforward calls to send('Hello', 'World') and
point(1, 2).
- Keyword arguments are not mandatory and have default values. They are
often used for optional parameters sent to the function. When a function has
more than two or three positional parameters, its signature is more difficult
to remember and using keyword arguments with default values is helpful. For
instance, a more complete send function could be defined as
send(message, to, cc=None, bcc=None). Here cc and bcc are
optional, and evaluate to None when they are not passed another value.
Calling a function with keyword arguments can be done in multiple ways in
Python; for example, it is possible to follow the order of arguments in the
definition without explicitly naming the arguments, like in
send('Hello', 'World', 'Cthulhu', 'God'), sending a blind carbon copy to
God. It would also be possible to name arguments in another order, like in
send('Hello again', 'World', bcc='God', cc='Cthulhu'). Those two
possibilities are better avoided unless there is a strong reason not to follow
the syntax that is the closest to the function definition:
send('Hello', 'World', cc='Cthulhu', bcc='God').
As a side note, following the YAGNI
principle, it is often harder to remove an optional argument (and its logic
inside the function) that was added “just in case” and is seemingly never used,
than to add a new optional argument and its logic when needed.
- The arbitrary argument list is the third way to pass arguments to a
function. If the function intention is better expressed by a signature with
an extensible number of positional arguments, it can be defined with the
*args construct. In the function body, args will be a tuple of all
the remaining positional arguments. For example, send(message, *args)
can be called with each recipient as an argument:
send('Hello', 'God', 'Mom', 'Cthulhu'), and in the function body
args will be equal to ('God', 'Mom', 'Cthulhu').
However, this construct has some drawbacks and should be used with caution. If a
function receives a list of arguments of the same nature, it is often
clearer to define it as a function of one argument, that argument being a list or
any sequence. Here, if send has multiple recipients, it is better to define
it explicitly: send(message, recipients) and call it with
send('Hello', ['God', 'Mom', 'Cthulhu']). This way, the user of the function
can manipulate the recipient list as a list beforehand, and it opens the
possibility to pass any sequence, including iterators, without having to
unpack it at the call site.
- The arbitrary keyword argument dictionary is the last way to pass
arguments to functions. If the function requires an undetermined series of
named arguments, it is possible to use the **kwargs construct. In the
function body, kwargs will be a dictionary of all the passed named
arguments that have not been caught by other keyword arguments in the
function signature.
The same caution as in the case of the arbitrary argument list is necessary, for
similar reasons: these powerful techniques are to be used when there is a
proven necessity to use them, and they should not be used if a simpler and
clearer construct is sufficient to express the function’s intention.
It is up to the programmer writing the function to determine which arguments
are positional arguments and which are optional keyword arguments, and to
decide whether to use the advanced techniques of arbitrary argument passing. If
the advice above is followed wisely, it is possible and enjoyable to write
Python functions that are:
- easy to read (the name and arguments need no explanations)
- easy to change (adding a new keyword argument does not break other parts of
the code)
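The calling styles discussed above can be tried against a small stand-in for send. This is a sketch under stated assumptions: the function body (which just returns a dict) and the priority option are hypothetical, since the guide only gives the signature send(message, to, cc=None, bcc=None).

```python
def send(message, to, cc=None, bcc=None, **kwargs):
    """Return a dict describing the message (a stand-in for real delivery).

    The body and the 'priority' option below are hypothetical.
    """
    envelope = {"message": message, "to": to, "cc": cc, "bcc": bcc}
    # Named arguments not listed in the signature are collected in kwargs.
    envelope["options"] = kwargs
    return envelope


# Closest to the definition: positional first, optional ones by name.
clear = send("Hello", "World", cc="Cthulhu", bcc="God")

# Legal, but the reader must recall the parameter order to decode it.
terse = send("Hello", "World", "Cthulhu", "God")

# An extra named argument ends up in kwargs.
extra = send("Hello", "World", priority="high")

print(clear == terse)
print(extra["options"])
```

The first two calls are equivalent to the interpreter; only the first tells a reader which recipient gets the blind copy.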
Avoid the magical wand
A powerful tool for hackers, Python comes with a very rich set of hooks and
tools allowing you to do almost any kind of trick. For instance, it is
possible to do each of the following:
- change how objects are created and instantiated
- change how the Python interpreter imports modules
- It is even possible (and recommended if needed) to embed C routines in Python.
However, all these options have many drawbacks and it is always better to use
the most straightforward way to achieve your goal. The main drawback is that
readability suffers greatly when using these constructs. Many code analysis
tools, such as pylint or pyflakes, will be unable to parse this “magic” code.
We consider that a Python developer should know about these nearly infinite
possibilities, because it instills confidence that no impassable problem will
be on the way. However, knowing how and particularly when not to use
them is very important.
Like a kung fu master, a Pythonista knows how to kill with a single finger, and
never to actually do it.
We are all responsible users
As seen above, Python allows many tricks, and some of them are potentially
dangerous. A good example is that any client code can override an object’s
properties and methods: there is no “private” keyword in Python. This
philosophy, very different from highly defensive languages like Java, which
give a lot of mechanisms to prevent any misuse, is expressed by the saying: “We
are all responsible users”.
This doesn’t mean that, for example, no properties are considered private, and
that no proper encapsulation is possible in Python. Rather, instead of relying
on concrete walls erected by the developers between their code and others’, the
Python community prefers to rely on a set of conventions indicating that these
elements should not be accessed directly.
The main convention for private properties and implementation details is to
prefix all “internals” with an underscore. If the client code breaks this rule
and accesses these marked elements, any misbehavior or problems encountered if
the code is modified is the responsibility of the client code.
Using this convention generously is encouraged: any method or property that is
not intended to be used by client code should be prefixed with an underscore.
This will guarantee a better separation of duties and easier modification of
existing code; it will always be possible to publicize a private property,
but making a public property private might be a much harder operation.
Returning values
When a function grows in complexity, it is not uncommon to use multiple return
statements inside the function’s body. However, in order to keep a clear intent
and a sustainable readability level, it is preferable to avoid returning
meaningful values from many output points in the body.
There are two main cases for returning values in a function: the result
returned when the function has completed normally, and the error cases that
indicate a wrong input parameter or any other reason for the function to not be
able to complete its computation or task.
If you do not wish to raise exceptions for the second case, then returning a
value, such as None or False, indicating that the function could not perform
correctly might be needed. In this case, it is better to return as early as the
incorrect context has been detected. It will help to flatten the structure of
the function: all the code after the return-because-of-error statement can
assume the condition is met to further compute the function’s main result.
Having multiple such return statements is often necessary.
However, when a function has multiple main exit points for its normal course,
it becomes difficult to debug the returned result, so it may be preferable to
keep a single exit point. This will also help factoring out some code paths,
and the multiple exit points are a probable indication that such a refactoring
is needed.
def complex_function(a, b, c):
    if not a:
        return None  # Raising an exception might be better
    if not b:
        return None  # Raising an exception might be better
    # Some complex code trying to compute x from a, b and c
    x = None  # placeholder for that computation
    # Resist the temptation to return x here if it succeeded
    if not x:
        x = None  # placeholder for some Plan-B computation of x
    return x  # One single exit point for the returned value x will help
              # when maintaining the code.
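A concrete, runnable variant of the same shape may help. The function parse_port, its default value, and its fallback rule are all hypothetical, chosen only to show early returns for error cases followed by a single exit for the normal result.

```python
def parse_port(text, default=8080):
    """Return a TCP port parsed from text, or None if the input is unusable.

    Hypothetical example; the name and the fallback rule are assumptions.
    """
    # Error cases: return early so the rest can assume valid input.
    if text is None:
        return None
    text = text.strip()
    if text and not text.isdecimal():
        return None

    # Normal course: compute the result step by step...
    port = int(text) if text else 0
    if not 0 < port < 65536:
        port = default  # Plan B: fall back to the default
    return port  # ...and leave through a single exit point


print(parse_port("443"))
print(parse_port(""))
print(parse_port("http"))
```

Raising ValueError in the error cases would arguably be more idiomatic; returning None is kept here to mirror the pattern described in the text.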
Idioms
A programming idiom, put simply, is a way to write code. The notion of
programming idioms is discussed amply at c2
and at Stack Overflow.
Idiomatic Python code is often referred to as being Pythonic.
Although there usually is one (and preferably only one) obvious way to do
it, the way to write idiomatic Python code can be non-obvious to Python
beginners. So, good idioms must be consciously acquired.
Some common Python idioms follow:
Unpacking
If you know the length of a list or tuple, you can assign names to its
elements with unpacking. For example, since enumerate() will provide
a tuple of two elements for each item in the list:
for index, item in enumerate(some_list):
    # do something with index and item
You can use this to swap variables as well:
a, b = b, a
Nested unpacking works too:
a, (b, c) = 1, (2, 3)
In Python 3, a new method of extended unpacking was introduced by
PEP 3132:
a, *rest = [1, 2, 3]
# a = 1, rest = [2, 3]
a, *middle, c = [1, 2, 3, 4]
# a = 1, middle = [2, 3], c = 4
Create an ignored variable
If you need to assign something (for instance, in Unpacking) but
will not need that variable, use __:
filename = 'foobar.txt'
basename, __, ext = filename.rpartition('.')
Note
Many Python style guides recommend the use of a single underscore “_”
for throwaway variables rather than the double underscore “__”
recommended here. The issue is that “_” is commonly used as an alias
for the gettext() function, and is also used at the
interactive prompt to hold the value of the last operation. Using a
double underscore instead is just as clear and almost as convenient,
and eliminates the risk of accidentally interfering with either of
these other use cases.
Create a length-N list of the same thing
Use the Python list * operator:
four_nones = [None] * 4
Create a length-N list of lists
Because lists are mutable, the * operator (as above) will create a list
of N references to the same list, which is not likely what you want.
Instead, use a list comprehension:
four_lists = [[] for __ in range(4)]
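The aliasing problem is easy to see by mutating one element. In this sketch the name shared is illustrative:

```python
# Multiplying a list that contains a list copies the reference, not the list.
shared = [[]] * 4
shared[0].append("x")
print(shared)  # every slot shows the change

# The comprehension evaluates [] once per iteration: four distinct lists.
four_lists = [[] for __ in range(4)]
four_lists[0].append("x")
print(four_lists)

# With immutable items such as None, sharing is harmless.
four_nones = [None] * 4
print(four_nones)
```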
Create a string from a list
A common idiom for creating strings is to use str.join() on an empty
string.
letters = ['s', 'p', 'a', 'm']
word = ''.join(letters)
This will set the value of the variable word to ‘spam’. This idiom can be
applied to lists and tuples.
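A few variations on the same idiom, as a sketch. The variable names csv_row and digits are illustrative; the behavior shown (any iterable of strings is accepted, non-string items must be converted first) is standard str.join() behavior.

```python
letters = ('s', 'p', 'a', 'm')
word = ''.join(letters)          # tuples work as well as lists

# The separator is the string that join() is called on.
csv_row = ', '.join(['spam', 'eggs'])

# Every item must already be a string; convert numbers first.
digits = ''.join(str(n) for n in [1, 2, 3])

print(word, csv_row, digits)
```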
Searching for an item in a collection
Sometimes we need to search through a collection of things. Let’s look at two
options: lists and sets.
Take the following code for example:
s = set(['s', 'p', 'a', 'm'])
l = ['s', 'p', 'a', 'm']
def lookup_set(s):
    return 's' in s
def lookup_list(l):
    return 's' in l
Even though both functions look identical, because lookup_set is utilizing
the fact that sets in Python are hashtables, the lookup performance
between the two is very different. To determine whether an item is in a list,
Python will have to go through each item until it finds a matching item.
This is time consuming, especially for long lists. In a set, on the other
hand, the hash of the item will tell Python where in the set to look for
a matching item. As a result, the search can be done quickly, even if the
set is large. Searching in dictionaries works the same way. For
more information see this StackOverflow page. For detailed information on the
amount of time various common operations take on each of these data
structures, see this page.
Because of these differences in performance, it is often a good idea to use
sets or dictionaries instead of lists in cases where:
- The collection will contain a large number of items
- You will be repeatedly searching for items in the collection
- You do not have duplicate items.
For small collections, or collections which you will not frequently be
searching through, the additional time and memory required to set up the
hashtable will often be greater than the time saved by the improved search
speed.
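The difference can be measured with the standard timeit module. This is a rough sketch: the collection size, the repeat count, and the choice of a missing value as the probe are arbitrary, and absolute timings depend on the machine.

```python
import timeit

n = 100_000
items_list = list(range(n))
items_set = set(items_list)
missing = -1  # worst case for the list: every element is compared

# Time 100 membership tests against each structure.
list_time = timeit.timeit(lambda: missing in items_list, number=100)
set_time = timeit.timeit(lambda: missing in items_set, number=100)

print(f"list: {list_time:.6f}s  set: {set_time:.6f}s")
```

Rerunning with a much smaller n is a quick way to see the point about small collections: the gap narrows as the list gets shorter.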
Zen of Python
Also known as PEP 20, the guiding principles for Python’s design.
>>> import this
The Zen of Python, by Tim Peters
Beautiful is better than ugly.
Explicit is better than implicit.
Simple is better than complex.
Complex is better than complicated.
Flat is better than nested.
Sparse is better than dense.
Readability counts.
Special cases aren't special enough to break the rules.
Although practicality beats purity.
Errors should never pass silently.
Unless explicitly silenced.
In the face of ambiguity, refuse the temptation to guess.
There should be one-- and preferably only one --obvious way to do it.
Although that way may not be obvious at first unless you're Dutch.
Now is better than never.
Although never is often better than *right* now.
If the implementation is hard to explain, it's a bad idea.
If the implementation is easy to explain, it may be a good idea.
Namespaces are one honking great idea -- let's do more of those!
For some examples of good Python style, see these slides from a Python user
group.
PEP 8¶
PEP 8 is the de facto code style guide for Python. A high quality,
easy-to-read version of PEP 8 is also available at pep8.org.
This is highly recommended reading. The Python community generally does its
best to adhere to the guidelines laid out in this document. Some projects
may stray from it from time to time, while others may amend its recommendations.
That being said, conforming your Python code to PEP 8 is generally a good idea
and helps make code more consistent when working on projects with other
developers. There is a command-line program, pycodestyle (previously known as
pep8), that can check your code for conformance.
Install it by running the following command in your terminal:
$ pip install pycodestyle
Then run it on a file or series of files to get a report of any violations.
$ pycodestyle optparse.py
optparse.py:69:11: E401 multiple imports on one line
optparse.py:77:1: E302 expected 2 blank lines, found 1
optparse.py:88:5: E301 expected 1 blank line, found 0
optparse.py:222:34: W602 deprecated form of raising exception
optparse.py:347:31: E211 whitespace before '('
optparse.py:357:17: E201 whitespace after '{'
optparse.py:472:29: E221 multiple spaces before operator
optparse.py:544:21: W601 .has_key() is deprecated, use 'in'
Auto-Formatting¶
There are several auto-formatting tools that can reformat your code,
in order to comply with PEP 8.
autopep8
The program autopep8 can be used to
automatically reformat code in the PEP 8 style. Install the program with:
$ pip install autopep8
Use it to format a file in-place with:
$ autopep8 --in-place optparse.py
Excluding the --in-place flag will cause the program to output the modified
code directly to the console for review. The --aggressive flag will perform
more substantial changes and can be applied multiple times for greater effect.
yapf
While autopep8 focuses on fixing PEP 8 violations, yapf
tries to improve the format of your code beyond PEP 8 compliance.
It aims to produce code that looks as good as code written by a programmer
who follows PEP 8.
Install it with:
$ pip install yapf
Run the auto-formatting of a file with:
$ yapf --in-place optparse.py
Similar to autopep8, running the command without the --in-place flag will
print the reformatted code for review instead of modifying the file. The
--diff flag prints a diff of the changes instead.
black
The auto-formatter black offers an
opinionated and deterministic reformatting of your code base.
Its main focus is to provide a uniform code style without requiring its users
to configure anything. Hence, users of black are able to forget
about formatting altogether. Also, thanks to the deterministic approach, git
diffs stay small and contain only the relevant changes. You can install the
tool as follows:
$ pip install black
A Python file can be formatted with:
$ black optparse.py
Adding the --diff flag shows the proposed changes for review without
applying them.
Conventions¶
Here are some conventions you should follow to make your code easier to read.
Check if a variable equals a constant¶
You don’t need to explicitly compare a value to True, or None, or 0 – you can
just add it to the if statement. See Truth Value Testing for a
list of what is considered false.
Bad:
if attr == True:
    print('True!')

if attr == None:
    print('attr is None!')
Good:
# Just check the value
if attr:
    print('attr is truthy!')

# or check for the opposite
if not attr:
    print('attr is falsey!')

# or, since None is considered false, explicitly check for it
if attr is None:
    print('attr is None!')
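The difference between "if not attr" and "if attr is None" matters because several values besides None are considered false. The sketch below uses a hypothetical helper, describe(), to show which check matches which value; it is an illustration only, not part of any library.

```python
def describe(attr):
    """Return a label showing which of the checks above matches attr."""
    if attr is None:
        return 'None'
    if not attr:
        # Reached by 0, '', [] and other empty or zero values.
        return 'falsey but not None'
    return 'truthy'


for value in (None, 0, '', [], 'text', 42):
    print(repr(value), '->', describe(value))
```

When 0 or an empty container is a legitimate value that must be told apart from "no value", the explicit "is None" check is the safer choice.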
Access a Dictionary Element¶
Don’t use the dict.has_key() method; it was removed in Python 3. Instead, use
the x in d syntax, or pass a default argument to dict.get().
Bad:
d = {'hello': 'world'}
if d.has_key('hello'):
    print(d['hello'])    # prints 'world'
else:
    print('default_value')
Good:
d = {'hello': 'world'}

print(d.get('hello', 'default_value'))  # prints 'world'
print(d.get('thingy', 'default_value'))  # prints 'default_value'

# Or:
if 'hello' in d:
    print(d['hello'])
Short Ways to Manipulate Lists¶
List comprehensions provide a powerful, concise way to work with lists.
Generator expressions follow almost the same syntax as list comprehensions
but return a generator instead of a list.
Creating a new list requires more work and uses more memory. If you are just going
to loop through the new list, prefer using an iterator instead.
Bad:
# needlessly allocates a list of all (gpa, name) entries in memory
valedictorian = max([(student.gpa, student.name) for student in graduates])
Good:
valedictorian = max((student.gpa, student.name) for student in graduates)
Use list comprehensions when you really need to create a second list, for
example if you need to use the result multiple times.
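One reason to prefer a list when the result is used more than once is that a generator can only be consumed a single time. A minimal sketch, with made-up variable names:

```python
numbers = [1, 2, 3, 4]

# A generator expression is exhausted after one pass.
squares_gen = (n * n for n in numbers)
first_total = sum(squares_gen)    # consumes the generator
second_total = sum(squares_gen)   # nothing left to iterate over

# A list can be traversed as often as needed.
squares_list = [n * n for n in numbers]
list_total_1 = sum(squares_list)
list_total_2 = sum(squares_list)

print(first_total, second_total)    # 30 0
print(list_total_1, list_total_2)   # 30 30
```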
If your logic is too complicated for a short list comprehension or generator
expression, consider using a generator function instead of returning a list.
Good:
def make_batches(items, batch_size):
    """
    >>> list(make_batches([1, 2, 3, 4, 5], batch_size=3))
    [[1, 2, 3], [4, 5]]
    """
    current_batch = []
    for item in items:
        current_batch.append(item)
        if len(current_batch) == batch_size:
            yield current_batch
            current_batch = []
    # Yield the final partial batch, but skip it when it is empty.
    if current_batch:
        yield current_batch
Never use a list comprehension just for its side effects.
Bad:
[print(x) for x in sequence]
Good:
for x in sequence:
    print(x)
Filtering a list¶
Bad:
Never remove items from a list while you are iterating through it.
# Filter elements greater than 4
a = [3, 4, 5]
for i in a:
    if i > 4:
        a.remove(i)
Don’t make multiple passes through the list.
while i in a:
    a.remove(i)
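To see why removing items during iteration is unsafe, consider a list with two adjacent values that should both be removed. This is a small sketch with made-up data: after the first removal the remaining items shift left, so the loop steps over the next one.

```python
# Two adjacent values (5 and 6) are greater than 4.
a = [1, 5, 6, 2]
for i in a:
    if i > 4:
        a.remove(i)   # shifts later items left; the loop then skips 6

print(a)   # [1, 6, 2]: 6 was never examined

# Building a new list avoids the problem.
b = [1, 5, 6, 2]
filtered = [i for i in b if i <= 4]
print(filtered)   # [1, 2]
```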
Good:
Use a list comprehension or generator expression.
# comprehensions create a new list object
filtered_values = [value for value in sequence if value != x]

# generators don't create another list
filtered_values = (value for value in sequence if value != x)
Possible side effects of modifying the original list¶
Modifying the original list can be risky if there are other variables referencing it. But you can use slice assignment if you really want to do that.
# replace the contents of the original list
sequence[::] = [value for value in sequence if value != x]
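The following sketch contrasts plain assignment with slice assignment when a second variable refers to the same list. The variable names are illustrative only.

```python
# Rebinding: "original" now names a new list; "alias" still sees the old one.
original = [1, 2, 3, 2]
alias = original
original = [v for v in original if v != 2]
print(original, alias)   # [1, 3] [1, 2, 3, 2]

# Slice assignment: the existing list object is modified in place,
# so every variable referring to it sees the change.
shared = [1, 2, 3, 2]
other = shared
shared[:] = [v for v in shared if v != 2]
print(shared, other)     # [1, 3] [1, 3]
print(other is shared)   # True
```

Which behavior you want depends on whether other parts of the program are expected to see the filtered result.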
Modifying the values in a list¶
Bad:
Remember that assignment never creates a new object. If two or more variables refer to the same list, changing one of them changes them all.
# Add three to all list members.
a = [3, 4, 5]
b = a                     # a and b refer to the same list object

for i in range(len(a)):
    a[i] += 3             # b[i] also changes
Good:
It’s safer to create a new list object and leave the original alone.
a = [3, 4, 5]
b = a

# assign the variable "a" to a new list without changing "b"
a = [i + 3 for i in a]
Use enumerate() to keep a count of your place in the list.
a = [3, 4, 5]
for i, item in enumerate(a):
    print(i, item)
# prints
# 0 3
# 1 4
# 2 5
The enumerate() function has better readability than handling a
counter manually. Moreover, it is better optimized for iterators.
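Because enumerate() accepts any iterable, it also works where indexing with range(len(...)) is not possible, such as with a generator. The sketch below uses a hypothetical generator function, read_lines(), as a stand-in for a stream of lines, and the optional start argument to begin counting at 1.

```python
def read_lines():
    """Hypothetical stand-in for a stream whose length is not known up front."""
    yield 'alpha'
    yield 'beta'
    yield 'gamma'


# A generator has no len(), so range(len(...)) would fail here.
numbered = [(number, line) for number, line in enumerate(read_lines(), start=1)]
print(numbered)   # [(1, 'alpha'), (2, 'beta'), (3, 'gamma')]
```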
Read From a File¶
Use the with open syntax to read from files. This will automatically close
files for you.
Bad:
f = open('file.txt')
a = f.read()
print(a)
f.close()
Good:
with open('file.txt') as f:
    for line in f:
        print(line)
The with statement is better because it will ensure you always close the
file, even if an exception is raised inside the with block.
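A small sketch of that guarantee: the snippet creates a throwaway temporary file (so it does not depend on any existing file), raises an exception inside the with block, and then checks that the file object was closed anyway.

```python
import os
import tempfile

# Create a throwaway file so the sketch is self-contained.
fd, path = tempfile.mkstemp(text=True)
os.close(fd)

try:
    with open(path) as f:
        raise ValueError('simulated failure while reading')
except ValueError:
    pass

# The with statement closed the file even though the block raised.
file_was_closed = f.closed
print(file_was_closed)   # True

os.remove(path)
```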
Line Continuations¶
When a logical line of code is longer than the accepted limit, you need to
split it over multiple physical lines. The Python interpreter will join
consecutive lines if the last character of the line is a backslash. This is
helpful in some cases, but should usually be avoided because of its fragility:
a white space added to the end of the line, after the backslash, will break the
code and may have unexpected results.
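The fragility can be demonstrated by compiling two nearly identical snippets of source text, one of which has a stray space after the backslash. The strings good_source and bad_source are made up for this sketch.

```python
# Backslash immediately followed by the newline: a valid continuation.
good_source = "total = 1 + \\\n    2\n"
# Same code with a single space after the backslash.
bad_source = "total = 1 + \\ \n    2\n"

namespace = {}
exec(compile(good_source, '<demo>', 'exec'), namespace)
print(namespace['total'])   # 3

try:
    compile(bad_source, '<demo>', 'exec')
    bad_compiles = True
except SyntaxError:
    # The invisible trailing space turns the line into a syntax error.
    bad_compiles = False

print(bad_compiles)   # False
```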
A better solution is to use parentheses around your elements. Left with an
unclosed parenthesis at the end of a line, the Python interpreter will join the
following lines until the parenthesis is closed. The same behavior holds for
curly braces and square brackets.
Bad:
my_very_big_string = """For a long time I used to go to bed early. Sometimes, \
    when I had put out my candle, my eyes would close so quickly that I had not even \
    time to say “I’m going to sleep.”"""

from some.deep.module.inside.a.module import a_nice_function, another_nice_function, \
    yet_another_nice_function
Good:
my_very_big_string = (
    "For a long time I used to go to bed early. Sometimes, "
    "when I had put out my candle, my eyes would close so quickly "
    "that I had not even time to say “I’m going to sleep.”"
)

from some.deep.module.inside.a.module import (
    a_nice_function, another_nice_function, yet_another_nice_function)
However, more often than not, having to split a long logical line is a sign that
you are trying to do too many things at the same time, which may hinder
readability.