What are the lesser-known but useful features of the Python programming language?
Chaining comparison operators
>>> x = 5
>>> 1 < x < 10
True
>>> 10 < x < 20
False
>>> x < 10 < x*10 < 100
True
>>> 10 > x <= 9
True
>>> 5 == x > 4
True
In case you're thinking it's doing 1 < x, which comes out as True, and then comparing True < 10, which is also True, then no, that's really not what happens (see the last example). It's really translating into 1 < x and x < 10, and x < 10 and 10 < x*10 and x*10 < 100, but with less typing, and each term is only evaluated once.
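You can verify the single-evaluation claim with a small sketch: a function that records each time it is called only fires once inside a chained comparison.

```python
calls = []

def f():
    # record every invocation so we can count evaluations
    calls.append(1)
    return 5

# 1 < f() < 10 expands to 1 < f() and f() < 10,
# but the middle term is evaluated only once
assert 1 < f() < 10
assert len(calls) == 1
```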
Get the python regex parse tree to debug your regex.
Regular expressions are a great feature of python, but debugging them can be a pain, and it's all too easy to get a regex wrong.
Fortunately, python can print the regex parse tree, by passing the undocumented, experimental, hidden flag re.DEBUG
(actually, 128) to re.compile
.
>>> re.compile("^\[font(?:=(?P<size>[-+][0-9]{1,2}))?\](.*?)[/font]",
re.DEBUG)
at at_beginning
literal 91
literal 102
literal 111
literal 110
literal 116
max_repeat 0 1
subpattern None
literal 61
subpattern 1
in
literal 45
literal 43
max_repeat 1 2
in
range (48, 57)
literal 93
subpattern 2
min_repeat 0 65535
any None
in
literal 47
literal 102
literal 111
literal 110
literal 116
Once you understand the syntax, you can spot your errors. There we can see that I forgot to escape the [] in [/font].
Of course you can combine it with whatever flags you want, like commented regexes:
>>> re.compile("""
^ # start of a line
\[font # the font tag
(?:=(?P<size> # optional [font=+size]
[-+][0-9]{1,2} # size specification
))?
\] # end of tag
(.*?) # text between the tags
\[/font\] # end of the tag
""", re.DEBUG|re.VERBOSE|re.DOTALL)
enumerate
Wrap an iterable with enumerate and it will yield the item along with its index.
For example:
>>> a = ['a', 'b', 'c', 'd', 'e']
>>> for index, item in enumerate(a): print index, item
...
0 a
1 b
2 c
3 d
4 e
>>>
References:
[1] http://docs.python.org/tutorial/datastructures.html#looping-techniques
a = ["a","b","c"] >>> for x in enumerate(a): ... print x
why do you do for index, item in enumerate(a): print index, item
- Trufa
idx, elem in itertools.izip(itertools.count(), iterable):
... - Jeeyoung Kim
for inx, itm in enumerate(a):
can do. - Tomas Aschan
for i in range(len(a))
is still a lot better than for (int i=0;i<a.size();i++)
.... - Gravity
Creating generator objects
If you write
x=(n for n in foo if bar(n))
you can get out the generator and assign it to x. Now it means you can do
for n in x:
The advantage of this is that you don't need intermediate storage, which you would need if you did
x = [n for n in foo if bar(n)]
In some cases this can lead to significant speed up.
You can append many if statements to the end of the generator, basically replicating nested for loops:
>>> n = ((a,b) for a in range(0,2) for b in range(4,6))
>>> for i in n:
... print i
(0, 4)
(0, 5)
(1, 4)
(1, 5)
iter() can take a callable argument
For instance:
def seek_next_line(f):
for c in iter(lambda: f.read(1),'\n'):
pass
The iter(callable, until_value)
function repeatedly calls callable
and yields its result until until_value
is returned.
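A quick runnable sketch of the same pattern; the io.StringIO buffer stands in for a real file object, and the empty string is the sentinel that read() returns at EOF:

```python
import io

# stand-in for an open file; with a real file the pattern is identical
buf = io.StringIO("abcdef")

# call buf.read(2) repeatedly until it returns the sentinel ''
chunks = list(iter(lambda: buf.read(2), ''))
assert chunks == ['ab', 'cd', 'ef']
```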
lambda
keyword is necessary here? - SiegeX
Be careful with mutable default arguments
>>> def foo(x=[]):
... x.append(1)
... print x
...
>>> foo()
[1]
>>> foo()
[1, 1]
>>> foo()
[1, 1, 1]
Instead, you should use a sentinel value denoting "not given" and replace with the mutable you'd like as default:
>>> def foo(x=None):
... if x is None:
... x = []
... x.append(1)
... print x
>>> foo()
[1]
>>> foo()
[1]
foo.func_defaults
. Which, being a tuple, is immutable. - Robert Rossney
Sending values into generator functions [1]. For example having this function:
def mygen():
"""Yield 5 until something else is passed back via send()"""
a = 5
while True:
f = (yield a) #yield a and possibly get f in return
if f is not None:
a = f #store the new value
You can:
>>> g = mygen()
>>> g.next()
5
>>> g.next()
5
>>> g.send(7) #we send this back to the generator
7
>>> g.next() #now it will yield 7 until we send something else
7
[1] http://www.python.org/dev/peps/pep-0342/
If you don't like using whitespace to denote scopes, you can use the C-style {} by issuing:
from __future__ import braces
# }
), and in fact I've read of blind people who use it to allow them to write Python :) - Esteban Küber
The step argument in slice operators. For example:
a = [1,2,3,4,5]
>>> a[::2] # iterate over the whole list in 2-increments
[1,3,5]
The special case x[::-1]
is a useful idiom for 'x reversed'.
>>> a[::-1]
[5,4,3,2,1]
[::-1]
and reversed()
? - Austin Richardson
[1,2,3,4,5][::-2]
also works as expected, which is quite cool - Sam Elliott
someseq == (someseq[::-1])
then it's a palindrome, and this would work with any sequence type (strings, lists, etc). - Adam Parkin
Decorators
Decorators [1] allow you to wrap a function or method in another function that can add functionality, modify arguments or results, etc. You write decorators one line above the function definition, beginning with an "at" sign (@).
The example below shows a print_args
decorator that prints the decorated function's arguments before calling it:
>>> def print_args(function):
...     def wrapper(*args, **kwargs):
...         print 'Arguments:', args, kwargs
...         return function(*args, **kwargs)
...     return wrapper
...
>>> @print_args
... def write(text):
...     print text
...
>>> write('foo')
Arguments: ('foo',) {}
foo
[1] http://docs.python.org/ref/function.html#tok-decorators
The for...else syntax (see http://docs.python.org/ref/for.html )
for i in foo:
if i == 0:
break
else:
print("i was never 0")
The "else" block is normally executed at the end of the for loop, unless break is called.
The above code could be emulated as follows:
found = False
for i in foo:
if i == 0:
found = True
break
if not found:
print("i was never 0")
else
. - jfs
while
loops. - Denilson Sá Maia
for...then...else
construct would be better, where then
is only executed if the for
is successful, else
when the for cannot be entered (eg: for i in []; pass; else; print "empty list"
. But then I'm a novice. :) - Phillip B Oldham
try.. except.. else..
Python's for.. else..
behavior is consistent with how try blocks are executed. If the contents of try or for succeed, jump to else. - ryanshow
if any(i == 0 for i in foo): ...
Would be my choice of phrasing for this kind of code. Maybe it's my Haskell influence. - Theo Belaire
From 2.5 onwards dicts have a special method __missing__
that is invoked for missing items:
>>> class MyDict(dict):
... def __missing__(self, key):
... self[key] = rv = []
... return rv
...
>>> m = MyDict()
>>> m["foo"].append(1)
>>> m["foo"].append(2)
>>> dict(m)
{'foo': [1, 2]}
There is also a dict subclass in collections
called defaultdict
that does pretty much the same, but calls a factory function with no arguments for missing items:
>>> from collections import defaultdict
>>> m = defaultdict(list)
>>> m["foo"].append(1)
>>> m["foo"].append(2)
>>> dict(m)
{'foo': [1, 2]}
I recommend converting such dicts to regular dicts before passing them to functions that don't expect such subclasses. A lot of code uses d[a_key]
and catches KeyErrors to check if an item exists which would add a new item to the dict.
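A small sketch of that caveat: merely indexing a missing key inserts it into a defaultdict, while a converted plain dict raises KeyError as the calling code expects.

```python
from collections import defaultdict

m = defaultdict(list)
m["foo"].append(1)
m["baz"]                 # a bare lookup silently inserts an empty list
assert "baz" in m

plain = dict(m)          # convert before handing the dict to other code
try:
    plain["quux"]        # a plain dict raises instead of inserting
    raised = False
except KeyError:
    raised = True
assert raised and "quux" not in plain
```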
m={}; m.setdefault('foo', []).append(1)
. - Cristian Ciupitu
defaultdict
is also more powerful than the setdefault
method in other cases. For example, for a counter—dd = collections.defaultdict(int) ... dd[k] += 1
vs d.setdefault(k, 0) += 1
. - Mike Graham
In-place value swapping
>>> a = 10
>>> b = 5
>>> a, b
(10, 5)
>>> a, b = b, a
>>> a, b
(5, 10)
The right-hand side of the assignment is an expression that creates a new tuple. The left-hand side of the assignment immediately unpacks that (unreferenced) tuple to the names a
and b
.
After the assignment, the new tuple is unreferenced and marked for garbage collection, and the values bound to a
and b
have been swapped.
As noted in the Python tutorial section on data structures [1],
[1] http://docs.python.org/tutorial/datastructures.html#tuples-and-sequences
Note that multiple assignment is really just a combination of tuple packing and sequence unpacking.
(a, b) = (b, a)
. I don't think it is necessarily clear whether ,
or =
has higher precedence. - Dana the Sane
foo[x:y], bar[x:y] = bar[x:y], foo[x:y]
with foo
and bar
being numpy arrays, because slicing numpy arrays creates views, not copies of the data! - cberzan
Readable regular expressions
In Python you can split a regular expression over multiple lines, name your matches and insert comments.
Example verbose syntax (from Dive into Python [1]):
>>> pattern = """
... ^ # beginning of string
... M{0,4} # thousands - 0 to 4 M's
... (CM|CD|D?C{0,3}) # hundreds - 900 (CM), 400 (CD), 0-300 (0 to 3 C's),
... # or 500-800 (D, followed by 0 to 3 C's)
... (XC|XL|L?X{0,3}) # tens - 90 (XC), 40 (XL), 0-30 (0 to 3 X's),
... # or 50-80 (L, followed by 0 to 3 X's)
... (IX|IV|V?I{0,3}) # ones - 9 (IX), 4 (IV), 0-3 (0 to 3 I's),
... # or 5-8 (V, followed by 0 to 3 I's)
... $ # end of string
... """
>>> re.search(pattern, 'M', re.VERBOSE)
Example naming matches (from Regular Expression HOWTO [2])
>>> p = re.compile(r'(?P<word>\b\w+\b)')
>>> m = p.search( '(((( Lots of punctuation )))' )
>>> m.group('word')
'Lots'
You can also verbosely write a regex without using re.VERBOSE
thanks to string literal concatenation.
>>> pattern = (
... "^" # beginning of string
... "M{0,4}" # thousands - 0 to 4 M's
... "(CM|CD|D?C{0,3})" # hundreds - 900 (CM), 400 (CD), 0-300 (0 to 3 C's),
... # or 500-800 (D, followed by 0 to 3 C's)
... "(XC|XL|L?X{0,3})" # tens - 90 (XC), 40 (XL), 0-30 (0 to 3 X's),
... # or 50-80 (L, followed by 0 to 3 X's)
... "(IX|IV|V?I{0,3})" # ones - 9 (IX), 4 (IV), 0-3 (0 to 3 I's),
... # or 5-8 (V, followed by 0 to 3 I's)
... "$" # end of string
... )
>>> print pattern
"^M{0,4}(CM|CD|D?C{0,3})(XC|XL|L?X{0,3})(IX|IV|V?I{0,3})$"
[1] http://diveintopython.net/regular_expressions/index.html
(?x) # Use verbose mode
, which feels more self-documenting than using re.VERBOSE
at the compile step. These must be the very first characters in the pattern - no leading whitespace. Also, when using a verbose pattern, remember to use \s
or [ ]
to signify spaces (depending on if you want to capture all whitespace or just spaces). It can be easy to forget when converting from standard to verbose patterns. - jwhitlock
re.VERBOSE
flag, which I think leads to terrible-to-read code. - orokusaki
Function argument unpacking
You can unpack a list or a dictionary as function arguments using *
and **
.
For example:
def draw_point(x, y):
# do some magic
point_foo = (3, 4)
point_bar = {'y': 3, 'x': 2}
draw_point(*point_foo)
draw_point(**point_bar)
Very useful shortcut since lists, tuples and dicts are widely used as containers.
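A runnable sketch of the same idea; the return value stands in for the "magic" drawing logic, purely for illustration:

```python
def draw_point(x, y):
    # stand-in for real drawing code: just echo the coordinates
    return (x, y)

point_foo = (3, 4)
point_bar = {'y': 3, 'x': 2}

# * unpacks the tuple positionally, ** unpacks the dict by keyword
assert draw_point(*point_foo) == (3, 4)
assert draw_point(**point_bar) == (2, 3)
```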
ROT13 is a valid encoding for source code, when you use the right coding declaration at the top of the code file:
#!/usr/bin/env python
# -*- coding: rot13 -*-
cevag "Uryyb fgnpxbiresybj!".rapbqr("rot13")
cevag h"Uryyb fgnpxbiresybj!"
- u0b34a0f6ae
Creating new types in a fully dynamic manner
>>> NewType = type("NewType", (object,), {"x": "hello"})
>>> n = NewType()
>>> n.x
'hello'
which is exactly the same as
>>> class NewType(object):
...     x = "hello"
...
>>> n = NewType()
>>> n.x
'hello'
Probably not the most useful thing, but nice to know.
Edit: Fixed name of new type, should be NewType
to be the exact same thing as with class
statement.
Edit: Adjusted the title to more accurately describe the feature.
x = type("X", (object,), {'val':'Hello'})()
- Nick Radford
Context managers and the "with
" Statement
Introduced in PEP 343 [1], a context manager [2] is an object that acts as a run-time context for a suite of statements.
Since the feature makes use of new keywords, it is introduced gradually: it is available in Python 2.5 via the
__future__
[3] directive. Python 2.6 and above (including Python 3) has it available by default.
I have used the "with" statement [4] a lot because I think it's a very useful construct, here is a quick demo:
from __future__ import with_statement
with open('foo.txt', 'w') as f:
f.write('hello!')
What's happening here behind the scenes, is that the
"with" statement
[5] calls the special __enter__
and __exit__
methods on the file object. Exception details are also passed to __exit__
if any exception was raised from the with statement body, allowing for exception handling to happen there.
What this does for you in this particular case is that it guarantees that the file is closed when execution falls out of scope of the with
suite, regardless if that occurs normally or whether an exception was thrown. It is basically a way of abstracting away common exception-handling code.
Other common use cases for this include locking with threads and database transactions.
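To see the protocol in action, here is a minimal, hypothetical context manager that simply records when its __enter__ and __exit__ methods run:

```python
class Tracker(object):
    """Tiny context manager that logs enter/exit events."""
    def __init__(self):
        self.events = []
    def __enter__(self):
        self.events.append("enter")
        return self
    def __exit__(self, exc_type, exc_value, tb):
        # runs even if the body raised; returning False propagates exceptions
        self.events.append("exit")
        return False

t = Tracker()
with t:
    t.events.append("body")
assert t.events == ["enter", "body", "exit"]
```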
[1] http://www.python.org/dev/peps/pep-0343/
withs
:) with open('filea') as filea and open('fileb') as fileb: ...
- Austin Richardson
with open('filea') as filea, open('fileb') as fileb: ...
- wim
with
statement is different from f = open('foo.txt', 'w')
. - gb.
Dictionaries have a 'get()' method. If you do d['key'] and key isn't there, you get an exception. If you do d.get('key'), you get back None if 'key' isn't there. You can add a second argument to get that item back instead of None, eg: d.get('key', 0).
It's great for things like adding up numbers:
sum[value] = sum.get(value, 0) + 1
get(key, None)
. Had no idea that None
was provided by default. - Jordan Reiter
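A runnable sketch of the counting idiom (using a plain dict named counts rather than shadowing the sum builtin):

```python
counts = {}
for word in "the quick the lazy the".split():
    # get() returns 0 the first time a word is seen
    counts[word] = counts.get(word, 0) + 1
assert counts == {"the": 3, "quick": 1, "lazy": 1}
```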
Descriptors
They're the magic behind a whole bunch of core Python features.
When you use dotted access to look up a member (eg, x.y), Python first looks for the member in the instance dictionary. If it's not found, it looks for it in the class dictionary. If it finds it in the class dictionary, and the object implements the descriptor protocol, instead of just returning it, Python executes it. A descriptor is any class that implements the __get__
, __set__
, or __delete__
methods.
Here's how you'd implement your own (read-only) version of property using descriptors:
class Property(object):
def __init__(self, fget):
self.fget = fget
def __get__(self, obj, type):
if obj is None:
return self
return self.fget(obj)
and you'd use it just like the built-in property():
class MyClass(object):
@Property
def foo(self):
return "Foo!"
Descriptors are used in Python to implement properties, bound methods, static methods, class methods and slots, amongst other things. Understanding them makes it easy to see why a lot of things that previously looked like Python 'quirks' are the way they are.
Raymond Hettinger has an excellent tutorial [1] that does a much better job of describing them than I do.
[1] http://users.rcn.com/python/download/Descriptor.htm
foo = property(lambda self: self.__foo)
- Pete Peterson
property
itself is implemented with descriptors, which was the point of my post. - Nick Johnson
Conditional Assignment
x = 3 if (y == 1) else 2
It does exactly what it sounds like: "assign 3 to x if y is 1, otherwise assign 2 to x". Note that the parens are not necessary, but I like them for readability. You can also chain it if you have something more complicated:
x = 3 if (y == 1) else 2 if (y == -1) else 1
Though at a certain point, it goes a little too far.
Note that you can use if ... else in any expression. For example:
(func1 if y == 1 else func2)(arg1, arg2)
Here func1 will be called if y is 1 and func2, otherwise. In both cases the corresponding function will be called with arguments arg1 and arg2.
Analogously, the following is also valid:
x = (class1 if y == 1 else class2)(arg1, arg2)
where class1 and class2 are two classes.
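A runnable sketch of the callable case, with two hypothetical functions standing in for func1 and func2:

```python
def double(n):
    return 2 * n

def triple(n):
    return 3 * n

y = 1
# the conditional expression selects the function, then it is called
assert (double if y == 1 else triple)(5) == 10
y = 2
assert (double if y == 1 else triple)(5) == 15
```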
x = ((y == 1) ? 3 : 2)
makes more sense to me - mpen
'am I drunk' ? 'yes, make out with her' : 'no, dont even think about it'
- adamJLev
x = 3 if (y == 1) else 2
- I find that in many cases, x = (2, 3)[y==1]
is actually more readable (normally with really long statements, so you can keep the results (2, 3) together). - Ponkadoodle
Doctest: documentation and unit-testing at the same time
Example extracted from the Python documentation:
def factorial(n):
"""Return the factorial of n, an exact integer >= 0.
If the result is small enough to fit in an int, return an int.
Else return a long.
>>> [factorial(n) for n in range(6)]
[1, 1, 2, 6, 24, 120]
>>> factorial(-1)
Traceback (most recent call last):
...
ValueError: n must be >= 0
Factorials of floats are OK, but the float must be an exact integer:
"""
import math
if not n >= 0:
raise ValueError("n must be >= 0")
if math.floor(n) != n:
raise ValueError("n must be exact integer")
if n+1 == n: # catch a value like 1e300
raise OverflowError("n too large")
result = 1
factor = 2
while factor <= n:
result *= factor
factor += 1
return result
def _test():
import doctest
doctest.testmod()
if __name__ == "__main__":
_test()
[1] http://docs.python.org/lib/module-doctest.html
locals()
then in your doctest do locals().update(setUp())
=D - Jiaaro
%-formatting takes a dictionary (also applies %i/%s etc. validation).
>>> print "The %(foo)s is %(bar)i." % {'foo': 'answer', 'bar':42}
The answer is 42.
>>> foo, bar = 'question', 123
>>> print "The %(foo)s is %(bar)i." % locals()
The question is 123.
And since locals() is also a dictionary, you can simply pass that as a dict and have %-substitutions from your local variables. I think this is frowned upon, but it simplifies things.
New Style Formatting
>>> print("The {foo} is {bar}".format(foo='answer', bar=42))
locals()
-equivalent for new-style formatting is of course print "The {foo} is {bar}".format(**locals())
. - Ben Blank
locals()
equivalent is print("The {foo} is {bar}".format_map(locals()))
- ncoghlan
>>> timeit('''"a %(b)s" % {"b": "c"}''')
0.2503829002380371
>>> timeit('''"a {b}".format(b="c")''')
0.41667699813842773
- Arne Babenhauserheide
To add more python modules (especially 3rd party ones), most people seem to use PYTHONPATH environment variables or they add symlinks or directories in their site-packages directories. Another way is to use *.pth files. Here's the official python doc's explanation:
"The most convenient way [to modify python's search path] is to add a path configuration file to a directory that's already on Python's path, usually to the .../site-packages/ directory. Path configuration files have an extension of .pth, and each line must contain a single path that will be appended to sys.path. (Because the new paths are appended to sys.path, modules in the added directories will not override standard modules. This means you can't use this mechanism for installing fixed versions of standard modules.)"
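As a sketch, the mechanism can be exercised with site.addsitedir(), which processes any .pth files found in a given directory; here a temporary directory stands in for site-packages:

```python
import os
import site
import sys
import tempfile

d = tempfile.mkdtemp()
extra = os.path.join(d, "extra")
os.mkdir(extra)

# a .pth file contains one path per line
with open(os.path.join(d, "mylibs.pth"), "w") as f:
    f.write(extra + "\n")

site.addsitedir(d)   # scans d and appends each .pth entry to sys.path
assert extra in sys.path
```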
Exception else clause:
try:
put_4000000000_volts_through_it(parrot)
except Voom:
print "'E's pining!"
else:
print "This parrot is no more!"
finally:
end_sketch()
The use of the else clause is better than adding additional code to the try clause because it avoids accidentally catching an exception that wasn’t raised by the code being protected by the try ... except statement.
See http://docs.python.org/tut/node10.html
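A runnable sketch of the same shape, with a hypothetical classify helper and a KeyError standing in for Voom:

```python
def classify(d, key):
    try:
        value = d[key]
    except KeyError:
        return "missing"
    else:
        # only runs when the try block raised nothing
        return "found %d" % value

assert classify({"a": 1}, "a") == "found 1"
assert classify({"a": 1}, "b") == "missing"
```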
Re-raising exceptions:
# Python 2 syntax
try:
some_operation()
except SomeError, e:
if is_fatal(e):
raise
handle_nonfatal(e)
# Python 3 syntax
try:
some_operation()
except SomeError as e:
if is_fatal(e):
raise
handle_nonfatal(e)
The 'raise' statement with no arguments inside an error handler tells Python to re-raise the exception with the original traceback intact, allowing you to say "oh, sorry, sorry, I didn't mean to catch that, sorry, sorry."
If you wish to print, store or fiddle with the original traceback, you can get it with sys.exc_info(), and printing it like Python would is done with the 'traceback' module.
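A small sketch of capturing the original traceback with sys.exc_info() and formatting it with the traceback module:

```python
import sys
import traceback

try:
    1 / 0
except ZeroDivisionError:
    # exc_info() returns (type, value, traceback) for the active exception
    exc_type, exc_value, tb = sys.exc_info()
    # format it the way the interpreter would print it
    text = "".join(traceback.format_exception(exc_type, exc_value, tb))

assert "ZeroDivisionError" in text
assert "Traceback" in text
```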
raise e
instead, which doesn't preserve the original traceback. - habnabit
exc_info = sys.exc_info(); raise exc_info[0], exc_info[1], exc_info[2]
is equivalent to this, but you can change those values around (e.g., change the exception type or message) - ianb
Main messages :)
import this
# btw look at this module's source :)
De-cyphered [1]:
[1] http://svn.python.org/view/python/trunk/Lib/this.py?view=markup
The Zen of Python, by Tim Peters
Beautiful is better than ugly.
Explicit is better than implicit.
Simple is better than complex.
Complex is better than complicated.
Flat is better than nested.
Sparse is better than dense.
Readability counts.
Special cases aren't special enough to break the rules.
Although practicality beats purity.
Errors should never pass silently.
Unless explicitly silenced.
In the face of ambiguity, refuse the temptation to guess.
There should be one-- and preferably only one --obvious way to do it.
Although that way may not be obvious at first unless you're Dutch.
Now is better than never.
Although never is often better than right now.
If the implementation is hard to explain, it's a bad idea.
If the implementation is easy to explain, it may be a good idea.
Namespaces are one honking great idea -- let's do more of those!
print s.translate("".join(chr(64<i<91 and 65+(i-52)%26 or 96<i<123 and 97+(i-84)%26 or i) for i in range(256)))
and it looks much better now!! :-D - fortran
import this
the other day. Rather interesting: wefearchange.org/2010/06/import-this-and-zen-of-python.html - asmeurer
Interactive Interpreter Tab Completion
try:
import readline
except ImportError:
print "Unable to load readline module."
else:
import rlcompleter
readline.parse_and_bind("tab: complete")
>>> class myclass:
... def function(self):
... print "my function"
...
>>> class_instance = myclass()
>>> class_instance.<TAB>
class_instance.__class__ class_instance.__module__
class_instance.__doc__ class_instance.function
>>> class_instance.f<TAB>unction()
You will also have to set a PYTHONSTARTUP environment variable.
easy_install ipdb
-- then you can use import ipdb; ipdb.set_trace()
- Doug Harris
readline.parse_and_bind ("bind ^I rl_complete")
- Foo Bah
Nested list comprehensions and generator expressions:
[(i,j) for i in range(3) for j in range(i) ]
((i,j) for i in range(4) for j in range(i) )
These can replace huge chunks of nested-loop code.
for
statements are to be written in the order you'd expect them to be written in a standard for-loop, from the outside inwards. - sykora
for
s and if
s with yield x
inside. To convert that to a generator expression, move x
first, delete all the colons (and the yield
), and surround the whole thing in parentheses. To make a list comprehension instead, replace the outer parens with square brackets. - Ken Arnold
Operator overloading for the set
builtin:
>>> a = set([1,2,3,4])
>>> b = set([3,4,5,6])
>>> a | b # Union
{1, 2, 3, 4, 5, 6}
>>> a & b # Intersection
{3, 4}
>>> a < b # Subset
False
>>> a - b # Difference
{1, 2}
>>> a ^ b # Symmetric Difference
{1, 2, 5, 6}
More detail from the standard library reference: Set Types [1]
[1] http://docs.python.org/py3k/library/stdtypes.html#set-types-set-frozenset
Negative round
The round()
function rounds a float number to given precision in decimal digits, but precision can be negative:
>>> str(round(1234.5678, -2))
'1200.0'
>>> str(round(1234.5678, 2))
'1234.57'
Note: round()
always returns a float, str()
used in the above example because floating point math is inexact, and under 2.x the second example can print as 1234.5700000000001
. Also see the
decimal
[1] module.
round(num, precision=1) - round "num" to the nearest multiple of "precision"
- Ponkadoodle
roundNearest = lambda n, m: round(float(n)/m)*m
- Ponkadoodle
roundNearest = lambda n, m: (n + (m/2)) / m * m
. It's twice as fast as using round(float)
on my system. - Mikel
Multiplying by a boolean
One thing I'm constantly doing in web development is optionally printing HTML parameters. We've all seen code like this in other languages:
class='<% isSelected ? "selected" : "" %>'
In Python, you can multiply by a boolean and it does exactly what you'd expect:
class='<% "selected" * isSelected %>'
This is because multiplication coerces the boolean to an integer (0 for False, 1 for True), and in python multiplying a string by an int repeats the string N times.
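A quick demonstration of the coercion:

```python
is_selected = True
# True coerces to 1, so the string is repeated once
assert "selected" * is_selected == "selected"
# False coerces to 0, so the string is repeated zero times
assert "selected" * (not is_selected) == ""

attr = "class='%s'" % ("selected" * is_selected)
assert attr == "class='selected'"
```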
bool(isSelected)
both for reliability and readability. - Marian
('not-selected', 'selected')[isSelected]
if you need an option for False value too.. - redShadow
Python's advanced slicing operation has a barely known syntax element, the ellipsis:
>>> class C(object):
... def __getitem__(self, item):
... return item
...
>>> C()[1:2, ..., 3]
(slice(1, 2, None), Ellipsis, 3)
Unfortunately it's barely useful as the ellipsis is only supported if tuples are involved.
numpy
module. - Denilson Sá Maia
...
in a Python 3 interpreter and it will return Ellipsis
) - asmeurer
The fact that you can call a function every time something matches a regular expression is very handy. Here I have a sample of replacing every "Hello" with "Hi," and "there" with "Fred", etc.
import re
def Main(haystack):
# List of from replacements, can be a regex
finds = ('Hello', 'there', 'Bob')
replaces = ('Hi,', 'Fred,', 'how are you?')
def ReplaceFunction(matchobj):
for found, rep in zip(matchobj.groups(), replaces):
if found is not None:
return rep
# log error
return matchobj.group(0)
named_groups = [ '(%s)' % find for find in finds ]
ret = re.sub('|'.join(named_groups), ReplaceFunction, haystack)
print ret
if __name__ == '__main__':
str = 'Hello there Bob'
Main(str)
# Prints 'Hi, Fred, how are you?'
re.sub('[aeiou]', lambda match: match.group().upper()*3, 'abcdefghijklmnopqrstuvwxyz')
- Don Spaulding
tuple unpacking in python 3
in python 3, you can use a syntax identical to optional arguments in function definition for tuple unpacking:
>>> first,second,*rest = (1,2,3,4,5,6,7,8)
>>> first
1
>>> second
2
>>> rest
[3, 4, 5, 6, 7, 8]
but a feature less known and more powerful allows you to have an unknown number of elements in the middle of the list:
>>> first,*rest,last = (1,2,3,4,5,6,7,8)
>>> first
1
>>> rest
[2, 3, 4, 5, 6, 7]
>>> last
8
Multi line strings
One approach is to use backslashes:
>>> sql = "select * from some_table \
where id > 10"
>>> print sql
select * from some_table where id > 10
Another is to use the triple-quote:
>>> sql = """select * from some_table
where id > 10"""
>>> print sql
select * from some_table where id > 10
Problem with those is that they are not indented (look poor in your code). If you try to indent, it'll just print the white-spaces you put.
A third solution, which I found about recently, is to divide your string into lines and surround with parentheses:
>>> sql = ("select * from some_table " # <-- no comma, whitespace at end
"where id > 10 "
"order by name")
>>> print sql
select * from some_table where id > 10 order by name
note how there's no comma between lines (this is not a tuple), and you have to account for any trailing/leading white spaces that your string needs to have. All of these work with placeholders, by the way (such as "my name is %s" % name
).
This answer has been moved into the question itself, as requested by many people.
>>> (a for a in xrange(10000))
<generator object at 0x81a8fcc>
>>> b = 'blah'
>>> _
<generator object at 0x81a8fcc>
>>> import webbrowser
>>> webbrowser.open_new_tab('http://www.stackoverflow.com')
python -m SimpleHTTPServer 8000
>>> import atexit
_
is available only in interactive mode. when running scripts from a file, _
has no special meaning. - SingleNegationElimination
_
to be a variable in a file (just in case you do want to go for obfuscated Python). - asmeurer
_
as a name for variables I do not care about (eg for _, desired_value, _ in my_tuple_with_some_irrelevant_values
). Yes, ike a prologger :) - brandizzi
There is a lesser known third argument of the built-in pow()
function that allows you to calculate xy modulo z more efficiently than simply doing (x ** y) % z
:
>>> x, y, z = 1234567890, 2345678901, 17
>>> pow(x, y, z) # almost instantaneous
6
In comparison, (x ** y) % z
didn't given a result in one minute on my machine for the same values.
pow()
function. This is not the math.pow()
function, which accepts only 2 arguments. - Denilson Sá Maia
pow
had that third parameter then or not. - Omnifarious
pow
had that third parameter at least since Python 2.1. However, according to the documentation, "[i]n Python 2.1 and before, floating 3-argument pow()
returned platform-dependent results depending on floating-point rounding accidents." - Tamás
__pow__
. You just have to define an optional third argument. And for more information on where this would be used, see en.wikipedia.org/wiki/Modular_exponentiation. - asmeurer
You can easily transpose an array with zip.
a = [(1,2), (3,4), (5,6)]
zip(*a)
# [(1, 3, 5), (2, 4, 6)]
zip(*a)
unzips a
. So if b = zip(a)
, then a == zip(*b)
. - asmeurer
enumerate
has partly been covered in
this answer
[1], but recently I've found an even more hidden feature of enumerate
that I think deserves its own post instead of just a comment.
Since Python 2.6, you can specify a starting index to enumerate
in its second argument:
>>> l = ["spam", "ham", "eggs"]
>>> list(enumerate(l))
>>> [(0, "spam"), (1, "ham"), (2, "eggs")]
>>> list(enumerate(l, 1))
>>> [(1, "spam"), (2, "ham"), (3, "eggs")]
One place where I've found it utterly useful is when I am enumerating over entries of a symmetric matrix. Since the matrix is symmetric, I can save time by iterating over the upper triangle only, but in that case, I have to use enumerate
with a different starting index in the inner for
loop to keep track of the row and column indices properly:
for ri, row in enumerate(matrix):
for ci, column in enumerate(matrix[ri:], ri):
# ci now refers to the proper column index
Strangely enough, this behaviour of enumerate
is not documented in help(enumerate)
, only in the
online documentation
[2].
help(enumerate)
has this proper function signature in python2.x, but not in py3k. I guess, a bug needs to be filled. - SilentGhost
help(enumerate)
is definitely wrong in Python 2.6.5. Maybe they have fixed it already in Python 2.7. - Tamás
help(enumerate)
from Python 3.1.2 says class enumerate(object) | enumerate(iterable) -> iterator for index, value of iterable, but the trick from the answer works fine. - Cristian Ciupitu
You can use property [1] to make your class interfaces more strict.
class C(object):
def __init__(self, foo, bar):
self.foo = foo # read-write property
self.bar = bar # simple attribute
def _set_foo(self, value):
self._foo = value
def _get_foo(self):
return self._foo
def _del_foo(self):
del self._foo
# any of fget, fset, fdel and doc are optional,
# so you can make a write-only and/or delete-only property.
foo = property(fget = _get_foo, fset = _set_foo,
fdel = _del_foo, doc = 'Hello, I am foo!')
class D(C):
def _get_foo(self):
return self._foo * 2
def _set_foo(self, value):
self._foo = value / 2
foo = property(fget = _get_foo, fset = _set_foo,
fdel = C.foo.fdel, doc = C.foo.__doc__)
In Python 2.6 and 3.0 [2]:
class C(object):
def __init__(self, foo, bar):
self.foo = foo # read-write property
self.bar = bar # simple attribute
@property
def foo(self):
'''Hello, I am foo!'''
return self._foo
@foo.setter
def foo(self, value):
self._foo = value
@foo.deleter
def foo(self):
del self._foo
class D(C):
@C.foo.getter
def foo(self):
return self._foo * 2
@foo.setter
def foo(self, value):
self._foo = value / 2
To learn more about how property works refer to descriptors [3].
[1] http://docs.python.org/library/functions.html#property
Many people don't know about the "dir" function. It's a great way to figure out what an object can do from the interpreter. For example, if you want to see a list of all the string methods:
>>> dir("foo")
['__add__', '__class__', '__contains__', (snipped a bunch), 'title',
'translate', 'upper', 'zfill']
And then if you want more information about a particular method you can call "help" on it.
>>> help("foo".upper)
Help on built-in function upper:
upper(...)
S.upper() -> string
Return a copy of the string S converted to uppercase.
set/frozenset [1]
Probably an easily overlooked python builtin is "set/frozenset".
Useful when you have a list like this, [1,2,1,1,2,3,4] and only want the uniques like this [1,2,3,4].
Using set() that's exactly what you get:
>>> x = [1,2,1,1,2,3,4]
>>>
>>> set(x)
set([1, 2, 3, 4])
>>>
>>> for i in set(x):
... print i
...
1
2
3
4
And of course to get the number of uniques in a list:
>>> len(set([1,2,1,1,2,3,4]))
4
You can also find if a list is a subset of another list using set().issubset():
>>> set([1,2,3,4]).issubset([0,1,2,3,4,5])
True
As of Python 2.7 and 3.0 you can use curly braces to create a set:
myset = {1,2,3,4}
as well as set comprehensions:
{x for x in stuff}
For more details: http://docs.python.org/library/stdtypes.html#set
[1] http://docs.python.org/library/stdtypes.html#set
issubset, not isasubset. Either way, the subset operator <= is nicer anyway. - wim
Built-in base64, zlib, and rot13 codecs
Strings have encode and decode methods. Usually these are used for converting between str and unicode, e.g. with u = s.decode('utf8'). But there are some other handy builtin codecs. Compression and decompression with zlib (and bz2) is available without an explicit import:
>>> s = 'a' * 100
>>> s.encode('zlib')
'x\x9cKL\xa4=\x00\x00zG%\xe5'
Similarly you can encode and decode base64:
>>> 'Hello world'.encode('base64')
'SGVsbG8gd29ybGQ=\n'
>>> 'SGVsbG8gd29ybGQ=\n'.decode('base64')
'Hello world'
And, of course, you can rot13:
>>> 'Secret message'.encode('rot13')
'Frperg zrffntr'
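The str.encode('zlib') style shown above is Python 2 only; in Python 3 these transforms were removed from str/bytes but remain available through the codecs module. A sketch of the Python 3 equivalents:

```python
import codecs

# zlib and base64 survive as bytes-to-bytes codecs in Python 3
compressed = codecs.encode(b"a" * 100, "zlib_codec")
restored = codecs.decode(compressed, "zlib_codec")

encoded = codecs.encode(b"Hello world", "base64_codec")  # b'SGVsbG8gd29ybGQ=\n'

# rot13 survives as a str-to-str codec
secret = codecs.encode("Secret message", "rot13")  # 'Frperg zrffntr'
```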
An interpreter within the interpreter
The standard library's code [1] module lets you include your own read-eval-print loop inside a program, or run a whole nested interpreter. E.g. (copied my example from here [2])
$ python
Python 2.5.1 (r251:54863, Jan 17 2008, 19:35:17)
[GCC 4.0.1 (Apple Inc. build 5465)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> shared_var = "Set in main console"
>>> import code
>>> ic = code.InteractiveConsole({ 'shared_var': shared_var })
>>> try:
... ic.interact("My custom console banner!")
... except SystemExit, e:
... print "Got SystemExit!"
...
My custom console banner!
>>> shared_var
'Set in main console'
>>> shared_var = "Set in sub-console"
>>> import sys
>>> sys.exit()
Got SystemExit!
>>> shared_var
'Set in main console'
This is extremely useful for situations where you want to accept scripted input from the user, or query the state of the VM in real-time.
TurboGears [3] uses this to great effect by having a WebConsole from which you can query the state of your live web app.
[1] http://docs.python.org/library/code.html
>>> from functools import partial
>>> bound_func = partial(range, 0, 10)
>>> bound_func()
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
>>> bound_func(2)
[0, 2, 4, 6, 8]
Not really a hidden feature, but partial is extremely useful for late evaluation of functions.
You can bind as many or as few parameters in the initial call to partial as you want, and call it with any remaining parameters later (in this example the begin/end args are bound to range, but it is called the second time with a step arg).
See the documentation [1].
[1] http://www.python.org/doc//current/library/functools.html
While debugging complex data structures, the pprint module comes in handy.
Quoting from the docs..
>>> import pprint
>>> stuff = sys.path[:]
>>> stuff.insert(0, stuff)
>>> pprint.pprint(stuff)
[<Recursion on list with id=869440>,
'',
'/usr/local/lib/python1.5',
'/usr/local/lib/python1.5/test',
'/usr/local/lib/python1.5/sunos5',
'/usr/local/lib/python1.5/sharedmodules',
'/usr/local/lib/python1.5/tkinter']
Python has GOTO
...and it's implemented by an external pure-Python module [1] :)
from goto import goto, label
for i in range(1, 10):
for j in range(1, 20):
for k in range(1, 30):
print i, j, k
if k == 3:
goto .end # breaking out from a deeply nested loop
label .end
print "Finished"
[1] http://entrian.com/goto/
dict's constructor accepts keyword arguments:
>>> dict(foo=1, bar=2)
{'foo': 1, 'bar': 2}
Sequence multiplication and reflected operands
>>> 'xyz' * 3
'xyzxyzxyz'
>>> [1, 2] * 3
[1, 2, 1, 2, 1, 2]
>>> (1, 2) * 3
(1, 2, 1, 2, 1, 2)
We get the same result with reflected (swapped) operands
>>> 3 * 'xyz'
'xyzxyzxyz'
It works like this:
>>> s = 'xyz'
>>> num = 3
To evaluate the expression s * num, the interpreter calls s.__mul__(num)
>>> s * num
'xyzxyzxyz'
>>> s.__mul__(num)
'xyzxyzxyz'
To evaluate the expression num * s, the interpreter calls num.__mul__(s)
>>> num * s
'xyzxyzxyz'
>>> num.__mul__(s)
NotImplemented
If the call returns NotImplemented, the interpreter calls the reflected operation s.__rmul__(num), provided the operands have different types
>>> s.__rmul__(num)
'xyzxyzxyz'
See object.__rmul__ [1]
[1] http://docs.python.org/reference/datamodel.html#object.__rmul__
x * y != y * x
, after all :) - badp
x= [] * 20
, i am thinking if it would be better to make shallow copies of the operands by default - warvariuc
Getter functions in module operator
The functions attrgetter()
and itemgetter()
in module operator
can be used to generate fast access functions for use in sorting and search objects and dictionaries
Chapter 6.7 [1] in the Python Library Docs
[1] http://docs.python.org/lib/module-operator.html
map() - Jonathan Livni
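A short sketch of the getter functions described above (Python 3; the sample data is illustrative):

```python
from collections import namedtuple
from operator import attrgetter, itemgetter

# itemgetter builds a fast key function for subscriptable items
people = [{"name": "bob", "age": 23}, {"name": "alice", "age": 36}]
by_age = sorted(people, key=itemgetter("age"))

# attrgetter does the same for object attributes
Person = namedtuple("Person", "name age")
objs = [Person("bob", 23), Person("alice", 36)]
oldest = max(objs, key=attrgetter("age"))
```

Both are faster and arguably clearer than the equivalent lambdas.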
Interleaving if
and for
in list comprehensions
>>> [(x, y) for x in range(4) if x % 2 == 1 for y in range(4)]
[(1, 0), (1, 1), (1, 2), (1, 3), (3, 0), (3, 1), (3, 2), (3, 3)]
I never realized this until I learned Haskell.
[x for i in range(10) if i%2 for j in range(10) if j%2]
, nothing especially cool or interesting. The if in the middle of your example has nothing to do with the second for. - Olivier Verdier
[ a for (a, b) in zip(lista, listb) if a == b else: '-' ]
- Austin Richardson
[ _ for _ in _ if _ ]
the if is a filter for the example above it would need to be [ _ if _ else _ for _ ]
- Dan D.
Tuple unpacking:
>>> (a, (b, c), d) = [(1, 2), (3, 4), (5, 6)]
>>> a
(1, 2)
>>> b
3
>>> c, d
(4, (5, 6))
More obscurely, you can do this in function arguments (in Python 2.x; Python 3.x will not allow this anymore):
>>> def addpoints((x1, y1), (x2, y2)):
... return (x1+x2, y1+y2)
>>> addpoints((5, 0), (3, 5))
(8, 5)
x1, x2 = x; y1, y2 = y
(if you have x,y arguments) - Joschua
The Python Interpreter
>>>
Maybe not lesser known, but certainly one of my favorite features of Python.
Python's sort functions sort tuples correctly, i.e. using the familiar lexicographical order:
a = [(2, "b"), (1, "a"), (2, "a"), (3, "c")]
print sorted(a)
#[(1, 'a'), (2, 'a'), (2, 'b'), (3, 'c')]
Useful if you want to sort a list of persons by age and then by name.
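For instance, building an (age, name) tuple as the sort key gets exactly that "by age, then by name" order (a Python 3 sketch with made-up names):

```python
people = [("carol", 30), ("alice", 25), ("bob", 25)]
# tuples compare lexicographically, so (age, name) sorts by age,
# breaking ties by name
people.sort(key=lambda p: (p[1], p[0]))
```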
You can reference a list comprehension as it is being built via the symbol '_[1]' (a CPython 2 implementation detail; it does not work in Python 3). For example, the following function unique-ifies a list of elements without changing their order by referencing its own list comprehension.
def unique(my_list):
return [x for x in my_list if x not in locals()['_[1]']]
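Since the '_[1]' trick is not portable, a sketch of an equivalent that works on any Python version, tracking seen items in an explicit set:

```python
def unique(my_list):
    # seen.add() returns None (falsy), so the or-expression both
    # tests membership and records the item in one pass
    seen = set()
    return [x for x in my_list if not (x in seen or seen.add(x))]
```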
The unpacking syntax was extended in Python 3, as can be seen in these examples:
>>> a, *b = range(5)
>>> a, b
(0, [1, 2, 3, 4])
>>> *a, b = range(5)
>>> a, b
([0, 1, 2, 3], 4)
>>> a, *b, c = range(5)
>>> a, b, c
(0, [1, 2, 3], 4)
The simplicity of :
>>> 'str' in 'string'
True
>>> 'no' in 'yes'
False
>>>
is something I love about Python; I have often seen unpythonic idioms like this instead:
if 'yes'.find("no") == -1:
pass
in
behavior on other kinds of sequences. 1 in [3, 2, 1]
is True
, but [2, 1] in [3, 2, 1]
is False
, and it could really be a problem if it were True
. But that's what would be needed to make it consistent with the string behavior explained here. So I think the .find()
approach is actually more Pythonic, although of course .find()
ought to have returned None
instead of -1. - Kragen Javier Sitaker
Metaclasses
of course :-) What is a metaclass in Python? [1]
[1] https://stackoverflow.com/questions/100003/what-is-a-metaclass-in-python
I personally love the 3 different quotes
str = "I'm a string 'but still I can use quotes' inside myself!"
str = """ For some messy multi line strings.
Such as
<html>
<head> ... </head>"""
Also cool: not having to escape regular expressions, avoiding horrible backslash salad by using raw strings:
str2 = r"\n"
print str2
>> \n
'''
- user1686
'
and "
do pretty much the same thing in code. My IDE highlights strings from the two in different colors, and it makes it easy to differentiate short strings (with '
) from longer ones (with "
). - asmeurer
Generators
I think that a lot of beginning Python developers pass over generators without really grasping what they're for or getting any sense of their power. It wasn't until I read David M. Beazley's PyCon presentation on generators (it's available here [1]) that I realized how useful (essential, really) they are. That presentation illuminated what was for me an entirely new way of programming, and I recommend it to anyone who doesn't have a deep understanding of generators.
[1] http://www.dabeaz.com/generators/
Implicit concatenation:
>>> print "Hello " "World"
Hello World
Useful when you want to make a long text fit on several lines in a script:
hello = "Greaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa Hello " \
"Word"
or
hello = ("Greaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa Hello "
"Word")
+
at the end of the line (though I still do use the implicit line continuations from parentheses). It just makes things clearer to read. - asmeurer
When using the interactive shell, "_" contains the value of the last printed item:
>>> range(10)
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
>>> _
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
>>>
_
automatic variable is the best feature when using Python shell as a calculator. Very powerful calculator, by the way. - Denilson Sá Maia
__
for second-last and ___
for third-last - wim
The textwrap.dedent [1] utility function comes in quite handy for testing that a multiline string returned is equal to the expected output, without breaking the indentation of your unittests:
import unittest, textwrap
class XMLTests(unittest.TestCase):
def test_returned_xml_value(self):
returned_xml = call_to_function_that_returns_xml()
expected_value = textwrap.dedent("""\
<?xml version="1.0" encoding="utf-8"?>
<root_node>
<my_node>my_content</my_node>
</root_node>
""")
self.assertEqual(expected_value, returned_xml)
[1] http://docs.python.org/library/textwrap.html#textwrap.dedent
Lambda functions are usually used for a quick transformation of one value into another, but they can also be used to wrap a value in a function:
>>> f = lambda: 'foo'
>>> f()
'foo'
They can also accept the usual *args
and **kwargs
syntax:
>>> g = lambda *args, **kwargs: (args[0], kwargs['thing'])
>>> g(1, 2, 3, thing='stuff')
(1, 'stuff')
defaultdict(lambda: 1)
- eswald
Using keyword arguments as assignments
Sometimes one wants to build a range of functions depending on one or more parameters. However this might easily lead to closures all referring to the same object and value:
funcs = []
for k in range(10):
funcs.append( lambda: k)
>>> funcs[0]()
9
>>> funcs[7]()
9
This behaviour can be avoided by turning the lambda expression into a function depending only on its arguments. A default parameter value stores the value bound to it at definition time. The function call doesn't have to be altered:
funcs = []
for k in range(10):
funcs.append( lambda k = k: k)
>>> funcs[0]()
0
>>> funcs[7]()
7
def make_lambda(k): return lambda: k
. - Jason Orendorff
"lambda k=k: k"
and understand immediately that it is "saving" the current value of "k" ( as the lambda is created ), and attaching it to the lambda itself. This works the same with normal "def" functions, too. - Nick Perkins
Mod works correctly with negative numbers
-1 % 5 is 4, as it should be, not -1 as it is in other languages like JavaScript. This makes "wraparound windows" cleaner in Python; you just do this:
index = (index + increment) % WINDOW_SIZE
number = coefficient x quotient + remainder
. In Python (and Ruby), quotient
is different than in JavaScript (or C or Java), because integer division in Python rounds towards negative infinity, but in JavaScript it rounds towards zero (truncates). I agree that %
in Python makes more sense, but I don't know if /
does. See en.wikipedia.org/wiki/Modulo_operation for details on each language. - Mikel
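The invariant behind the discussion above can be checked directly (a Python 3 sketch):

```python
a, b = -1, 5
# Python's // rounds toward negative infinity, so % takes the divisor's sign
assert a // b == -1
assert a % b == 4
assert a == (a // b) * b + a % b  # the identity always holds

WINDOW_SIZE = 10
index = (0 - 3) % WINDOW_SIZE  # wraps to 7 instead of -3
```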
First-class functions
It's not really a hidden feature, but the fact that functions are first class objects is simply great. You can pass them around like any other variable.
>>> def jim(phrase):
... return 'Jim says, "%s".' % phrase
>>> def say_something(person, phrase):
... print person(phrase)
>>> say_something(jim, 'hey guys')
Jim says, "hey guys".
Ternary operator
>>> 'ham' if True else 'spam'
'ham'
>>> 'ham' if False else 'spam'
'spam'
This was added in 2.5, prior to that you could use:
>>> True and 'ham' or 'spam'
'ham'
>>> False and 'ham' or 'spam'
'spam'
However, if the values you want to work with would be considered false, there is a difference:
>>> [] if True else 'spam'
[]
>>> True and [] or 'spam'
'spam'
Assigning and deleting slices:
>>> a = range(10)
>>> a
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
>>> a[:5] = [42]
>>> a
[42, 5, 6, 7, 8, 9]
>>> a[:1] = range(5)
>>> a
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
>>> del a[::2]
>>> a
[1, 3, 5, 7, 9]
>>> a[::2] = a[::-2]
>>> a
[9, 3, 5, 7, 1]
Note: when assigning to extended slices (s[start:stop:step]
), the assigned iterable must have the same length as the slice.
Not very hidden, but functions have attributes:
def doNothing():
pass
doNothing.monkeys = 4
print doNothing.monkeys
4
__call__()
also have a __call__()
function? - user142019
Many Python functions accept tuples, even where it doesn't seem like it. For example, to test whether your variable is a number, you could do:
if isinstance (number, float) or isinstance (number, int):
print "yaay"
But if you pass a tuple, this looks much cleaner:
if isinstance (number, (float, int)):
print "yaay"
except (FooError, BarError)
clauses. - Beni Cherniavsky-Paskin
Nice treatment of infinite recursion in dictionaries:
>>> a = {}
>>> b = {}
>>> a['b'] = b
>>> b['a'] = a
>>> print a
{'b': {'a': {...}}}
str()
and repr()
return the string you posted above. However, the ipython
shell returns something a little different, a little more informative: {'b': {'a': <Recursion on dict with id=17830960>}} - Denilson Sá Maia
reversing an iterable using negative step
>>> s = "Hello World"
>>> s[::-1]
'dlroW olleH'
>>> a = (1,2,3,4,5,6)
>>> a[::-1]
(6, 5, 4, 3, 2, 1)
>>> a = [5,4,3,2,1]
>>> a[::-1]
[1, 2, 3, 4, 5]
(n for n in (1,2,3,4,5))[::-1]
doesn't work. - Don O'Donnell
reversed()
function is better, as it returns a reverse iterator instead of allocating a new sequence. - Denilson Sá Maia
Not "hidden" but quite useful and not commonly used
Creating string joining functions quickly like so
comma_join = ",".join
semi_join = ";".join
print comma_join(["foo","bar","baz"])
foo,bar,baz
and
Ability to create lists of strings more elegantly than the quote, comma mess.
l = ["item1", "item2", "item3"]
replaced by
l = "item1 item2 item3".split()
Arguably, this is not a programming feature per se, but so useful that I'll post it nevertheless.
$ python -m http.server
...followed by $ wget http://<ipnumber>:8000/filename
somewhere else.
If you are still running an older (2.x) version of Python:
$ python -m SimpleHTTPServer
You can also specify a port e.g. python -m http.server 80
(so you can omit the port in the url if you have the root on the server side)
Multiple references to an iterator
You can create multiple references to the same iterator using list multiplication:
>>> i = (1,2,3,4,5,6,7,8,9,10) # or any iterable object
>>> iterators = [iter(i)] * 2
>>> iterators[0].next()
1
>>> iterators[1].next()
2
>>> iterators[0].next()
3
This can be used to group an iterable into chunks, for example, as in this example from the itertools documentation [1]:
def grouper(n, iterable, fillvalue=None):
"grouper(3, 'ABCDEFG', 'x') --> ABC DEF Gxx"
args = [iter(iterable)] * n
return izip_longest(fillvalue=fillvalue, *args)
[1] http://docs.python.org/library/itertools.html
itertools.tee
-- take one iterator and return n
that yield the same but do not share state. - Daenyth
a = iter(i); b = a
does the same thing and I could just as well have written that into the answer instead of [iter(i)] * n
. Either way, there is no "magic" about it. That's no different from any of the other answers to this question - none of them are "magical", they are all in the language. What makes the features "hidden" is that many people don't realize they're possible, or don't realize interesting ways in which they can be used, until they are pointed out explicitly. - David Z
[iter(i)]*n
. Also, it isn't necessarily well known (to many people's peril) that list*int
creates referential, not actual, copies of the elements of the list. It's good to see that that is actually useful somehow. - asmeurer
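The grouper recipe above is Python 2; in Python 3, izip_longest was renamed zip_longest, so a sketch of the same idea looks like:

```python
from itertools import zip_longest

def grouper(n, iterable, fillvalue=None):
    # the n references in args all point at the SAME iterator,
    # so each tuple pulled by zip_longest consumes n fresh items
    args = [iter(iterable)] * n
    return zip_longest(*args, fillvalue=fillvalue)

groups = ["".join(g) for g in grouper(3, "ABCDEFG", "x")]
print(groups)  # ['ABC', 'DEF', 'Gxx']
```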
As of Python 3.1 (and 2.7), dictionary and set comprehensions are supported:
{ a:a for a in range(10) }
{ a for a in range(10) }
dict( (a,a) for a in range(10) )
works too and your error is probably due to remembering this form - Dan D.
Python can understand any kind of unicode digits [1], not just the ASCII kind:
>>> s = u'10585'
>>> s
u'\uff11\uff10\uff15\uff18\uff15'
>>> print s
10585
>>> int(s)
10585
>>> float(s)
10585.0
[1] http://bugs.python.org/issue6595
__slots__
is a nice way to save memory, but it's very hard to get a dict of the values of the object. Imagine the following object:
class Point(object):
__slots__ = ('x', 'y')
Now that object obviously has two attributes. Now we can create an instance of it and build a dict of it this way:
>>> p = Point()
>>> p.x = 3
>>> p.y = 5
>>> dict((k, getattr(p, k)) for k in p.__slots__)
{'y': 5, 'x': 3}
This however won't work if Point is subclassed and new slots are added. However Python automatically implements __reduce_ex__
to help the copy
module. This can be abused to get a dict of values:
>>> p.__reduce_ex__(2)[2][1]
{'y': 5, 'x': 3}
__reduce_ex__
can be overridden in subclasses, and since it's also used for pickling, it often is. (If you're making data containers, you should think of using it too! or it's younger siblings __getstate__
and __setstate__
.) - Ken Arnold
object.__reduce_ex__(p, 2)[2][1]
then. - Armin Ronacher
itertools
This module is often overlooked. The following example uses itertools.chain()
to flatten a list:
>>> from itertools import *
>>> l = [[1, 2], [3, 4]]
>>> list(chain(*l))
[1, 2, 3, 4]
See http://docs.python.org/library/itertools.html#recipes for more applications.
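One of those recipes worth calling out: chain.from_iterable does the same flattening without unpacking the outer list with *, which also works lazily on infinite outer iterables:

```python
from itertools import chain

l = [[1, 2], [3, 4]]
# no need for chain(*l); the outer iterable is consumed lazily
flat = list(chain.from_iterable(l))
print(flat)  # [1, 2, 3, 4]
```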
You can manipulate the modules cache directly, making modules available or unavailable as you wish:
>>> import sys
>>> import ham
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
ImportError: No module named ham
# Make the 'ham' module available -- as a non-module object even!
>>> sys.modules['ham'] = 'ham, eggs, sausages and spam.'
>>> import ham
>>> ham
'ham, eggs, sausages and spam.'
# Now remove it again.
>>> sys.modules['ham'] = None
>>> import ham
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
ImportError: No module named ham
This works even for modules that are available, and to some extent for modules that already are imported:
>>> import os
# Stop future imports of 'os'.
>>> sys.modules['os'] = None
>>> import os
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
ImportError: No module named os
# Our old imported module is still available.
>>> os
<module 'os' from '/usr/lib/python2.5/os.pyc'>
As the last line shows, changing sys.modules only affects future import
statements, not past ones, so if you want to affect other modules it's important to make these changes before you give them a chance to try and import the modules -- so before you import them, typically. None
is a special value in sys.modules
, used for negative caching (indicating the module was not found the first time, so there's no point in looking again.) Any other value will be the result of the import
operation -- even when it is not a module object. You can use this to replace modules with objects that behave exactly like you want. Deleting the entry from sys.modules
entirely causes the next import
to do a normal search for the module, even if it was already imported before.
sys.modules['my_module'] = MyClass()
, to implement read only attributes 'module' if MyClass has the right hooks. - warvariuc
You can ask any object which module it came from by looking at its __module__ property. This is useful, for example, if you're experimenting at the command line and have imported a lot of things.
Along the same lines, you can ask a module where it came from by looking at its __file__ property. This is useful when debugging path issues.
Some of the builtin favorites, map(), reduce(), and filter(). All extremely fast and powerful.
reduce()
- SingleNegationElimination
reduce()
can achieve everything you can do with map()
, filter()
, or list comprehensions." - Kragen Javier Sitaker
One word: IPython [1]
Tab introspection, pretty-printing, %debug
, history management, pylab
, ... well worth the time to learn well.
Guessing integer base
>>> int('10', 0)
10
>>> int('0x10', 0)
16
>>> int('010', 0) # does not work on Python 3.x
8
>>> int('0o10', 0) # Python >=2.6 and Python 3.x
8
>>> int('0b10', 0) # Python >=2.6 and Python 3.x
2
You can build up a dictionary from a set of length-2 sequences. Extremely handy when you have a list of values and a list of arrays.
>>> dict([ ('foo','bar'),('a',1),('b',2) ])
{'a': 1, 'b': 2, 'foo': 'bar'}
>>> names = ['Bob', 'Marie', 'Alice']
>>> ages = [23, 27, 36]
>>> dict(zip(names, ages))
{'Alice': 36, 'Bob': 23, 'Marie': 27}
dict((x, x**2) for x in range(10))
. - Marian
Sometimes it's useful to extend (modify) the value "returned" by a descriptor in a subclass. It can be easily done with super()
:
class A(object):
@property
def prop(self):
return {'a': 1}
class B(A):
@property
def prop(self):
return dict(super(B, self).prop, b=2)
Store this in test.py
and run python -i test.py
(another hidden feature: -i
option executed the script and allow you to continue in interactive mode):
>>> B().prop
{'a': 1, 'b': 2}
A slight misfeature of python. The normal fast way to join a list of strings together is:
''.join(list_of_strings)
You can also write str.join('', list_of_strings), but other pythonistas may scorn you for trying to write Java. - SingleNegationElimination
but other pythonistas may scorn you for trying to write java. - SingleNegationElimination
"".join(iterable)
over str.join("",iterable)
is because it's 4 characters shorter. - SingleNegationElimination
Creating enums
In Python, you can do this to quickly create an enumeration:
>>> FOO, BAR, BAZ = range(3)
>>> FOO
0
But the "enums" don't have to have integer values. You can even do this:
class Colors(object):
RED, GREEN, BLUE, YELLOW = (255,0,0), (0,255,0), (0,0,255), (0,255,255)
#now Colors.RED is a 3-tuple that returns the 24-bit 8bpp RGB
#value for saturated red
The Object Data Model
You can override any operator in the language for your own classes. See this page [1] for a complete list. Some examples:
You can override any operator (* + - / // % ^ == < > <= >= .
etc.). All this is done by overriding __mul__
, __add__
, etc. in your objects. You can even override things like __rmul__
to handle separately your_object*something_else
and something_else*your_object
. .
is attribute access (a.b
), and can be overridden to handle any arbitrary b
by using __getattr__
. Also included here is a(…)
using __call__
.
You can create your own slice syntax (a[stuff]
), which can be very complicated and quite different from the standard syntax used in lists (numpy has a good example of the power of this in their arrays) using any combination of ,
, :
, and …
that you like, using Slice objects.
Handle specially what happens with many keywords in the language. Included are del
, in
, import
, and not
.
Handle what happens when many built in functions are called with your object. The standard __int__
, __str__
, etc. go here, but so do __len__
, __reversed__
, __abs__
, and the three argument __pow__
(for modular exponentiation).
For in you have to override __contains__. - asmeurer
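A minimal sketch of the operator-overriding machinery described above, including the reflected __rmul__ case (Python 3; the Vec class is illustrative):

```python
class Vec:
    def __init__(self, x, y):
        self.x, self.y = x, y

    def __add__(self, other):      # handles vec + vec
        return Vec(self.x + other.x, self.y + other.y)

    def __rmul__(self, k):         # handles number * vec, since
        return Vec(k * self.x, k * self.y)  # int.__mul__ returns NotImplemented

    def __eq__(self, other):
        return (self.x, self.y) == (other.x, other.y)

v = 2 * (Vec(1, 2) + Vec(3, 4))
```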
The re.Scanner
class. http://code.activestate.com/recipes/457664-hidden-scanner-functionality-in-re-module/
"Unpacking" to function parameters
def foo(a, b, c):
print a, b, c
bar = (3, 14, 15)
foo(*bar)
When executed prints:
3 14 15
The reversed()
builtin. It makes iterating much cleaner in many cases.
quick example:
for i in reversed([1, 2, 3]):
print(i)
produces:
3
2
1
However, reversed()
also works with arbitrary iterators, such as lines in a file, or generator expressions.
The Zen of Python
>>> import this
The Zen of Python, by Tim Peters
Beautiful is better than ugly.
Explicit is better than implicit.
Simple is better than complex.
Complex is better than complicated.
Flat is better than nested.
Sparse is better than dense.
Readability counts.
Special cases aren't special enough to break the rules.
Although practicality beats purity.
Errors should never pass silently.
Unless explicitly silenced.
In the face of ambiguity, refuse the temptation to guess.
There should be one-- and preferably only one --obvious way to do it.
Although that way may not be obvious at first unless you're Dutch.
Now is better than never.
Although never is often better than *right* now.
If the implementation is hard to explain, it's a bad idea.
If the implementation is easy to explain, it may be a good idea.
Namespaces are one honking great idea -- let's do more of those!
Dutch
. - asmeurer
Changing function label at run time:
>>> class foo:
... def normal_call(self): print "normal_call"
... def call(self):
... print "first_call"
... self.call = self.normal_call
>>> y = foo()
>>> y.call()
first_call
>>> y.call()
normal_call
>>> y.call()
normal_call
...
string-escape
and unicode-escape
encodings
Let's say you have a string from an outside source that contains \n, \t and so on. How do you transform them into a real newline or tab? Just decode the string using the string-escape encoding!
>>> print s
Hello\nStack\toverflow
>>> print s.decode('string-escape')
Hello
Stack overflow
Another problem: you have a normal string with unicode literals like \u01245 in it. How do you make them work? Just decode the string using the unicode-escape encoding!
>>> s = '\u041f\u0440\u0438\u0432\u0456\u0442, \u0441\u0432\u0456\u0442!'
>>> print s
\u041f\u0440\u0438\u0432\u0456\u0442, \u0441\u0432\u0456\u0442!
>>> print unicode(s)
\u041f\u0440\u0438\u0432\u0456\u0442, \u0441\u0432\u0456\u0442!
>>> print unicode(s, 'unicode-escape')
Привіт, світ!
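These examples are Python 2; string-escape is gone in Python 3. A sketch of the closest Python 3 equivalent uses codecs.decode with unicode_escape (note this only behaves predictably for ASCII input):

```python
import codecs

s = r"Hello\nStack\toverflow"  # the backslashes here are literal characters
print(codecs.decode(s, "unicode_escape"))  # now a real newline and tab

u = r"\u041f\u0440\u0438\u0432\u0456\u0442"
print(codecs.decode(u, "unicode_escape"))  # Привіт
```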
Someone blogged about Python not having an unzip function to go with zip(). unzip is straight-forward to calculate because:
>>> t1 = (0,1,2,3)
>>> t2 = (7,6,5,4)
>>> [t1,t2] == zip(*zip(t1,t2))
True
On reflection though, I'd rather have an explicit unzip().
[1] http://paddy3118.blogspot.com/2007/02/unzip-un-needed-in-python.html
Creating dictionary of two sequences that have related data
In [15]: t1 = (1, 2, 3)
In [16]: t2 = (4, 5, 6)
In [17]: dict (zip(t1,t2))
Out[17]: {1: 4, 2: 5, 3: 6}
Top Secret Attributes
>>> class A(object): pass
>>> a = A()
>>> setattr(a, "can't touch this", 123)
>>> dir(a)
['__class__', '__delattr__', '__dict__', '__doc__', '__format__', '__getattribute__', '__hash__', '__init__', '__module__', '__new__', '__reduce__', '__reduce_ex__', '__repr__', '__setattr__', '__sizeof__', '__str__', '__subclasshook__', '__weakref__', "can't touch this"]
>>> a.can't touch this # duh
File "<stdin>", line 1
a.can't touch this
^
SyntaxError: EOL while scanning string literal
>>> getattr(a, "can't touch this")
123
>>> setattr(a, "__class__.__name__", ":O")
>>> a.__class__.__name__
'A'
>>> getattr(a, "__class__.__name__")
':O'
>>> node = namedtuple('node', "a b")
>>> node(1,2) + node(5,6)
(1, 2, 5, 6)
>>> (node(1,2), node(5,6))
(node(a=1, b=2), node(a=5, b=6))
>>>
Some more experiments to respond to comments:
>>> from collections import namedtuple
>>> import operator
>>> mytuple = namedtuple('A', "a b")
>>> yourtuple = namedtuple('Z', "x y")
>>> mytuple(1,2) + yourtuple(5,6)
(1, 2, 5, 6)
>>> q = [mytuple(1,2), yourtuple(5,6)]
>>> q
[A(a=1, b=2), Z(x=5, y=6)]
>>> reduce(operator.__add__, q)
(1, 2, 5, 6)
So, namedtuple
is an interesting subtype of tuple
.
namedtuple
call directly into a class definition, as in class rectangle(namedtuple("rectangle", "width height")):
in order to add custom methods - Ben Blank
namedtuple
s as the representation for parse trees, and their behavior was useful in merging siblings so they looked more like lists. Imagine the typical grammar productions for a list... - Apalala
Dynamically added attributes
This might be useful if you think about adding some attributes to your classes just by calling them. This can be done by overriding the __getattribute__ [1] member function, which is called when the dot operator is used. So, let's see a dummy class for example:
class Dummy(object):
def __getattribute__(self, name):
f = lambda: 'Hello with %s'%name
return f
When you instantiate a Dummy object and do a method call you’ll get the following:
>>> d = Dummy()
>>> d.b()
'Hello with b'
Finally, you can even set the attribute to your class so it can be dynamically defined. This could be useful if you work with Python web frameworks and want to do queries by parsing the attribute's name.
I have a gist [2] at github with this simple code and its equivalent on Ruby made by a friend.
Take care!
[1] http://docs.python.org/reference/datamodel.html#object.__getattribute__
Flattening a list with sum()
The sum() [1] built-in function can be used to __add__ lists together, providing a handy way to flatten a list of lists:
Python 2.7.1 (r271:86832, May 27 2011, 21:41:45)
[GCC 4.2.1 (Apple Inc. build 5664)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> l = [[1, 2, 3], [4, 5], [6], [7, 8, 9]]
>>> sum(l, [])
[1, 2, 3, 4, 5, 6, 7, 8, 9]
[1] http://www.python.org/doc//current/library/stdtypes.html#sequence-types-str-unicode-list-tuple-buffer-xrange
This is a killer from Alex Martelli [1]. All instances of Borg share state. This removes the need to employ the singleton pattern (instances don't matter when state is shared) and is rather elegant (but is more complicated with new-style classes).
The value of foo can be reassigned in any instance and all will be updated; you can even reassign the entire dict. Borg is the perfect name; read more here [2].
class Borg:
__shared_state = {'foo': 'bar'}
def __init__(self):
self.__dict__ = self.__shared_state
# rest of your class here
This is perfect for sharing an eventlet.GreenPool to control concurrency.
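A minimal runnable sketch of the pattern (Python 3; attribute names are illustrative):

```python
class Borg:
    _shared_state = {"foo": "bar"}

    def __init__(self):
        # every instance swaps in the SAME attribute dict,
        # so setting an attribute on one is visible on all
        self.__dict__ = self._shared_state

a, b = Borg(), Borg()
a.foo = "baz"
print(b.foo)  # baz — distinct instances, shared state
```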
[1] https://stackoverflow.com/users/95810/alex-martelli
pdb — The Python Debugger
As a programmer, one of the first things that you need for serious program development is a debugger. Python has one built-in which is available as a module called pdb (for "Python DeBugger", naturally!).
http://docs.python.org/library/pdb.html
threading.enumerate() gives access to all Thread objects in the system and sys._current_frames() returns the current stack frames of all threads in the system, so combine these two and you get Java style stack dumps:
import sys, threading, traceback
def dumpstacks(signal, frame):
id2name = dict([(th.ident, th.name) for th in threading.enumerate()])
code = []
for threadId, stack in sys._current_frames().items():
code.append("\n# Thread: %s(%d)" % (id2name[threadId], threadId))
for filename, lineno, name, line in traceback.extract_stack(stack):
code.append('File: "%s", line %d, in %s' % (filename, lineno, name))
if line:
code.append(" %s" % (line.strip()))
print "\n".join(code)
import signal
signal.signal(signal.SIGQUIT, dumpstacks)
Do this at the beginning of a multi-threaded python program and you get access to current state of threads at any time by sending a SIGQUIT. You may also choose signal.SIGUSR1 or signal.SIGUSR2.
See [1]
[1] https://stackoverflow.com/questions/132058/getting-stack-trace-from-a-running-python-application/2569696#2569696
...that dict.get() has a default value [1] of None, thereby avoiding KeyErrors:
In [1]: test = { 1 : 'a' }
In [2]: test[2]
---------------------------------------------------------------------------
<type 'exceptions.KeyError'> Traceback (most recent call last)
<ipython console> in <module>()
<type 'exceptions.KeyError'>: 2
In [3]: test.get( 2 )
In [4]: test.get( 1 )
Out[4]: 'a'
In [5]: test.get( 2 ) == None
Out[5]: True
and even to specify this 'at the scene':
In [6]: test.get( 2, 'Some' ) == 'Some'
Out[6]: True
And you can use setdefault(
) to have a value set and returned if it doesn't exist:
>>> a = {}
>>> b = a.setdefault('foo', 'bar')
>>> a
{'foo': 'bar'}
>>> b
'bar'
[1] http://docs.python.org/library/stdtypes.html#dict.get
The inspect [1] module is also a cool feature.
[1] http://docs.python.org/library/inspect.html?highlight=inspect#retrieving-source-code
Reloading modules enables a "live-coding" style. But class instances don't update. Here's why, and how to get around it. Remember: everything, yes, everything is an object.
>>> from a_package import a_module
>>> cls = a_module.SomeClass
>>> obj = cls()
>>> obj.method()
(old method output)
Now you change the method in a_module.py and want to update your object.
>>> reload(a_module)
>>> a_module.SomeClass is cls
False # Because it just got freshly created by reload.
>>> obj.method()
(old method output)
Here's one way to update it (but consider it running with scissors):
>>> obj.__class__ is cls
True # it's the old class object
>>> obj.__class__ = a_module.SomeClass # pick up the new class
>>> obj.method()
(new method output)
This is "running with scissors" because the object's internal state may be different than what the new class expects. This works for really simple cases, but beyond that, pickle
is your friend. It's still helpful to understand why this works, though.
pickle (or cPickle) was really helpful for me, some weeks ago. - Denilson Sá Maia
Backslashes inside raw strings can still escape quotes. See this:
>>> print repr(r"aaa\"bbb")
'aaa\\"bbb'
Note that both the backslash and the double-quote are present in the final string.
As a consequence, you can't end a raw string with a backslash:
>>> print repr(r"C:\")
SyntaxError: EOL while scanning string literal
>>> print repr(r"C:\"")
'C:\\"'
This happens because raw strings were implemented to help writing regular expressions, and not to write Windows paths. Read a long discussion about this at Gotcha — backslashes in Windows filenames [1].
[1] http://pythonconquerstheuniverse.wordpress.com/2008/06/04/gotcha-%E2%80%94-backslashes-in-windows-filenames/
Tip: use a different quote character, e.g. ' for the outer string. - asmeurer
Operators can be called as functions:
from operator import add
print reduce(add, [1,2,3,4,5,6])
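A few more sketches of the operator module in action (Python 3 spelling, where reduce lives in functools):

```python
from operator import add, mul, itemgetter
from functools import reduce  # on Python 3, reduce lives in functools

total = reduce(add, [1, 2, 3, 4, 5, 6])
print(total)      # 21

product = reduce(mul, [1, 2, 3, 4])
print(product)    # 24

# itemgetter makes a handy sort key:
pairs = [('b', 2), ('c', 3), ('a', 1)]
print(sorted(pairs, key=itemgetter(1)))  # [('a', 1), ('b', 2), ('c', 3)]
```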
infinite recursion in list
>>> a = [1,2]
>>> a.append(a)
>>> a
[1, 2, [...]]
>>> a[2]
[1, 2, [...]]
>>> a[2][2][2][2][2][2][2][2][2] == a
True
The ability to substitute even things like file deletion and file opening: direct manipulation of the language's library. This is a huge advantage when testing. You don't have to wrap everything in complicated containers; just substitute a function/method and go. This is also called monkey-patching.
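A hedged sketch of what this looks like in a test: temporarily replacing os.remove with a recording stub (the path below is made up; nothing is actually deleted), then restoring the original:

```python
import os

calls = []

def fake_remove(path):
    # record the call instead of deleting anything
    calls.append(path)

real_remove = os.remove
os.remove = fake_remove          # monkey-patch the library function
try:
    os.remove('/tmp/some_file')  # goes to fake_remove; nothing is deleted
finally:
    os.remove = real_remove      # always restore the real function

print(calls)                     # ['/tmp/some_file']
```

In real test suites, unittest.mock.patch does this substitute-and-restore dance for you.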
Built-in methods and functions don't implement the descriptor protocol, which makes it impossible to do stuff like this:
>>> class C(object):
... id = id
...
>>> C().id()
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
TypeError: id() takes exactly one argument (0 given)
However you can create a small bind descriptor that makes this possible:
>>> from types import MethodType
>>> class bind(object):
...     def __init__(self, callable):
...         self.callable = callable
...     def __get__(self, obj, type=None):
...         if obj is None:
...             return self
...         return MethodType(self.callable, obj, type)
...
>>> class C(object):
...     id = bind(id)
...
>>> C().id()
7414064
A shorter alternative: class C(object): id = lambda s, *a, **kw: id(*a, **kw); and a more general bind: def bind(callable): return lambda s, *a, **kw: callable(*a, **kw) - Lie Ryan
Nested Function Parameter Re-binding
def create_printers(n):
    for i in xrange(n):
        def printer(i=i):  # Doesn't work without the i=i
            print i
        yield printer
Without the i=i, the i in the printer function references the i from the for loop rather than the local i that is created when a new printer function is created with the i=i keyword arg. So it still does work (it yields functions, each with access to a closure), but it doesn't work the way you'd expect without explicitly creating a local variable. - Sean Vieira
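A side-by-side sketch of the two behaviours described above, returning i instead of printing it so the difference is easy to see:

```python
def make_printers_broken(n):
    printers = []
    for i in range(n):
        def printer():
            return i          # i is looked up at call time: late binding
        printers.append(printer)
    return printers

def make_printers_fixed(n):
    printers = []
    for i in range(n):
        def printer(i=i):     # default is evaluated now, capturing this i
            return i
        printers.append(printer)
    return printers

print([p() for p in make_printers_broken(3)])  # [2, 2, 2]
print([p() for p in make_printers_fixed(3)])   # [0, 1, 2]
```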
You can override the mro of a class with a metaclass
>>> class A(object):
...     def a_method(self):
...         print("A")
...
>>> class B(object):
...     def b_method(self):
...         print("B")
...
>>> class MROMagicMeta(type):
...     def mro(cls):
...         return (cls, B, object)
...
>>> class C(A, metaclass=MROMagicMeta):
...     def c_method(self):
...         print("C")
...
>>> cls = C()
>>> cls.c_method()
C
>>> cls.a_method()
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
AttributeError: 'C' object has no attribute 'a_method'
>>> cls.b_method()
B
>>> type(cls).__bases__
(<class '__main__.A'>,)
>>> type(cls).__mro__
(<class '__main__.C'>, <class '__main__.B'>, <class 'object'>)
It's probably hidden for a good reason. :)
In Python 2, use __metaclass__ = MROMagicMeta instead. - Alexander Artemenko
Objects for small integers (-5 .. 256) are never created twice:
>>> a1 = -5; b1 = 256
>>> a2 = -5; b2 = 256
>>> id(a1) == id(a2), id(b1) == id(b2)
(True, True)
>>>
>>> c1 = -6; d1 = 257
>>> c2 = -6; d2 = 257
>>> id(c1) == id(c2), id(d1) == id(d2)
(False, False)
>>>
Edit: list objects are never destroyed (only the objects in lists are). Python keeps an array of up to 80 empty lists; when you destroy a list object, Python puts it in that array, and when you create a new list, Python takes the last list put into that array:
>>> a = [1,2,3]; a_id = id(a)
>>> b = [1,2,3]; b_id = id(b)
>>> del a; del b
>>> c = [1,2,3]; id(c) == b_id
True
>>> d = [1,2,3]; id(d) == a_id
True
>>>
You can decorate functions with classes - replacing the function with a class instance:
class countCalls(object):
    """ decorator replaces a function with a "countCalls" instance
    which behaves like the original function, but keeps track of calls

    >>> @countCalls
    ... def doNothing():
    ...     pass
    >>> doNothing()
    >>> doNothing()
    >>> print doNothing.timesCalled
    2
    """
    def __init__(self, functionToTrack):
        self.functionToTrack = functionToTrack
        self.timesCalled = 0
    def __call__(self, *args, **kwargs):
        self.timesCalled += 1
        return self.functionToTrack(*args, **kwargs)
Manipulating Recursion Limit
Getting or setting the maximum depth of recursion with sys.getrecursionlimit() & sys.setrecursionlimit().
We can limit it to prevent a stack overflow caused by infinite recursion.
Slices & Mutability
Copying lists
>>> x = [1,2,3]
>>> y = x[:]
>>> y.pop()
3
>>> y
[1, 2]
>>> x
[1, 2, 3]
Replacing lists
>>> x = [1,2,3]
>>> y = x
>>> y[:] = [4,5,6]
>>> x
[4, 5, 6]
Python 2.x ignores commas if found after the last element of the sequence:
>>> a_tuple_for_instance = (0,1,2,3,)
>>> another_tuple = (0,1,2,3)
>>> a_tuple_for_instance == another_tuple
True
A trailing comma causes a single parenthesized element to be treated as a sequence:
>>> a_tuple_with_one_element = (8,)
Slices as lvalues. This Sieve of Eratosthenes produces a list that has either the prime number or 0. Elements are 0'd out with the slice assignment in the loop.
def eras(n):
    last = n + 1
    sieve = [0, 0] + list(range(2, last))
    sqn = int(round(n ** 0.5))
    it = (i for i in xrange(2, sqn + 1) if sieve[i])
    for i in it:
        sieve[i*i:last:i] = [0] * (n//i - i + 1)
    return filter(None, sieve)
To work, an extended slice (one with a step, as in the loop above) on the left must be assigned a list of exactly the same length on the right.
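A quick sketch of both kinds of slice assignment:

```python
nums = list(range(10))
nums[::2] = [0] * 5       # extended slice: 5 target positions, 5 values
print(nums)               # [0, 1, 0, 3, 0, 5, 0, 7, 0, 9]

nums[1:4] = ['a']         # plain slice: the list may shrink or grow
print(nums)               # [0, 'a', 0, 5, 0, 7, 0, 9]
```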
Rounding Integers: Python has the function round, which returns floats:
>>> print round(1123.456789, 4)
1123.4568
>>> print round(1123.456789, 2)
1123.46
>>> print round(1123.456789, 0)
1123.0
This function has a wonderful magic property:
>>> print round(1123.456789, -1)
1120.0
>>> print round(1123.456789, -2)
1100.0
If you need an integer as a result use int to convert type:
>>> print int(round(1123.456789, -2))
1100
>>> print int(round(8359980, -2))
8360000
Thank you Gregor [1].
[1] http://mail.python.org/pipermail/tutor/2003-August/024395.html
List comprehensions
Compare the more traditional (without list comprehension):
foo = []
for x in xrange(10):
    if x % 2 == 0:
        foo.append(x)
to:
foo = [x for x in xrange(10) if x % 2 == 0]
[1] http://www.secnetix.de/olli/Python/list_comprehensions.hawk
Too lazy to initialize every field in a dictionary? No problem:
In Python 2.5 and later:
from collections import defaultdict
In earlier versions:
def defaultdict(type_):
    class Dict(dict):
        def __getitem__(self, key):
            return self.setdefault(key, type_())
    return Dict()
In any version:
d = defaultdict(list)
for stuff in lots_of_stuff:
    d[stuff.name].append(stuff)
UPDATE:
Thanks Ken Arnold [1]. I reimplemented a more sophisticated version of defaultdict. It should behave exactly as the one in the standard library [2].
def defaultdict(default_factory, *args, **kw):
    class defaultdict(dict):
        def __missing__(self, key):
            if default_factory is None:
                raise KeyError(key)
            return self.setdefault(key, default_factory())
        def __getitem__(self, key):
            try:
                return dict.__getitem__(self, key)
            except KeyError:
                return self.__missing__(key)
    return defaultdict(*args, **kw)
[1] https://stackoverflow.com/users/69707/ken-arnold
Your defaultdict reimplementation ends up calling type_ on every lookup instead of only when the item is missing. - Ken Arnold
In very old Pythons you can't subclass dict directly, so you'd need to subclass from UserDict.UserDict. Better still would be to upgrade. - SingleNegationElimination
If you are using descriptors on your classes, Python completely bypasses __dict__ for that key, which makes it a nice place to store such values:
>>> class User(object):
...     def _get_username(self):
...         return self.__dict__['username']
...     def _set_username(self, value):
...         print 'username set'
...         self.__dict__['username'] = value
...     username = property(_get_username, _set_username)
...     del _get_username, _set_username
...
>>> u = User()
>>> u.username = "foo"
username set
>>> u.__dict__
{'username': 'foo'}
This helps to keep dir()
clean.
__getattr__()
__getattr__ is a really nice way to make generic classes, which is especially useful if you're writing an API. For example, in the FogBugz Python API [1], __getattr__ is used to pass method calls on to the web service seamlessly:
class FogBugz:
    ...
    def __getattr__(self, name):
        # Let's leave the private stuff to Python
        if name.startswith("__"):
            raise AttributeError("No such attribute '%s'" % name)
        if not self.__handlerCache.has_key(name):
            def handler(**kwargs):
                return self.__makerequest(name, **kwargs)
            self.__handlerCache[name] = handler
        return self.__handlerCache[name]
    ...
When someone calls FogBugz.search(q='bug'), they aren't actually calling a search method. Instead, __getattr__ handles the call by creating a new function that wraps the __makerequest method, which crafts the appropriate HTTP request to the web API. Any errors raised by the web service are passed back to the user.
Exposing Mutable Buffers
Using the Python Buffer Protocol [1] to expose mutable byte-oriented buffers in Python (2.5/2.6).
(Sorry, no code here. Requires use of low-level C API or existing adapter module).
[1] http://docs.python.org/c-api/buffer.html
The pythonic idiom x = ... if ... else ... is far superior to x = ... and ... or ..., and here is why:
Although the statement
x = 3 if (y == 1) else 2
Is equivalent to
x = y == 1 and 3 or 2
if you use the x = ... and ... or ...
idiom, some day you may get bitten by this tricky situation:
x = 0 if True else 1 # sets x equal to 0
which is not equivalent to
x = True and 0 or 1 # sets x equal to 1
For more on the proper way to do this, see Hidden features of Python [1].
[1] https://stackoverflow.com/questions/101268/hidden-features-of-python/116480#116480
Monkeypatching objects
Every object in Python has a __dict__
member, which stores the object's attributes. So, you can do something like this:
class Foo(object):
    def __init__(self, arg1, arg2, **kwargs):
        # do stuff with arg1 and arg2
        self.__dict__.update(kwargs)

f = Foo('arg1', 'arg2', bar=20, baz=10)
# now f is a Foo object with two extra attributes
This can be exploited to add both attributes and functions arbitrarily to objects. This can also be exploited to create a quick-and-dirty struct
type.
class struct(object):
    def __init__(self, **kwargs):
        self.__dict__.update(kwargs)

s = struct(foo=10, bar=11, baz="i'm a string!")
This doesn't work for classes that define __slots__. - John La Rooy
Not every object has a __dict__: after a = 2, there is no a.__dict__. - Denilson Sá Maia
I'm not sure where (or whether) this is in the Python docs, but in Python 2.x (at least 2.5 and 2.6, which I just tried), the print statement can be called with parentheses. This can be useful if you want to be able to easily port some Python 2.x code to Python 3.x.
Example:
print('We want Moshiach Now')
should print We want Moshiach Now and works in Python 2.5, 2.6, and 3.x.
Also, the not operator can be called with parentheses in Python 2 and 3:
not False
and
not(False)
should both return True
.
Parentheses might also work with other statements and operators.
EDIT: It's NOT a good idea to put parentheses around the not operator (and probably any other operator), since it can make for surprising situations, like so (this happens because the parentheses are really just around the 1):
>>> (not 1) == 9
False
>>> not(1) == 9
True
This also works for some values (I think where what follows is not part of a valid identifier name), like this:
not'val'
should return False
, and print'We want Moshiach Now'
should print We want Moshiach Now. (But not552 would raise a NameError, since not552 is a valid identifier name.)
In addition to this mentioned earlier by haridsv [1]:
>>> foo = bar = baz = 1
>>> foo, bar, baz
(1, 1, 1)
it's also possible to do this:
>>> foo, bar, baz = 1, 2, 3
>>> foo, bar, baz
(1, 2, 3)
[1] https://stackoverflow.com/questions/101268/hidden-features-of-python/2582013#2582013
getattr takes a third parameter
getattr(obj, attribute_name, default) is like:
try:
    return obj.attribute
except AttributeError:
    return default
except that attribute_name
can be any string.
This can be really useful for duck typing [1]. Maybe you have something like:
class MyThing:
    pass

class MyOtherThing:
    pass

if isinstance(obj, (MyThing, MyOtherThing)):
    process(obj)
(btw, isinstance(obj, (a,b))
means isinstance(obj, a) or isinstance(obj, b)
.)
When you make a new kind of thing, you'd need to add it to that tuple everywhere it occurs. (That construction also causes problems when reloading modules or importing the same file under two names. It happens more than people like to admit.) But instead you could say:
class MyThing:
    processable = True

class MyOtherThing:
    processable = True

if getattr(obj, 'processable', False):
    process(obj)
Add inheritance and it gets even better: all of your examples of processable objects can inherit from
class Processable:
    processable = True
but you don't have to convince everybody to inherit from your base class, just to set an attribute.
[1] http://en.wikipedia.org/wiki/Duck_typing
Simple built-in benchmarking tool
The Python Standard Library comes with a very easy-to-use benchmarking module called "timeit". You can even use it from the command line to see which of several language constructs is the fastest.
E.g.,
% python -m timeit 'r = range(0, 1000)' 'for i in r: pass'
10000 loops, best of 3: 48.4 usec per loop
% python -m timeit 'r = xrange(0, 1000)' 'for i in r: pass'
10000 loops, best of 3: 37.4 usec per loop
Here are 2 easter eggs:
One in python itself:
>>> import __hello__
Hello world...
And another one in the Werkzeug
module, which is a bit complicated to reveal, here it is:
By looking at Werkzeug
's source code, in werkzeug/__init__.py
, there is a line that should draw your attention:
'werkzeug._internal': ['_easteregg']
If you're a bit curious, this should lead you to have a look at the werkzeug/_internal.py
, there, you'll find an _easteregg()
function which takes a wsgi application in argument, it also contains some base64 encoded data and 2 nested functions, that seem to do something special if an argument named macgybarchakku
is found in the query string.
So, to reveal this easter egg, it seems you need to wrap an application in the _easteregg()
function, let's go:
from werkzeug import Request, Response, run_simple
from werkzeug import _easteregg

@Request.application
def application(request):
    return Response('Hello World!')

run_simple('localhost', 8080, _easteregg(application))
Now, if you run the app and visit http://localhost:8080/?macgybarchakku, you should see the easter egg.
Dict Comprehensions
>>> {i: i**2 for i in range(5)}
{0: 0, 1: 1, 2: 4, 3: 9, 4: 16}
Wikipedia Entry [2]
[1] http://docs.python.org/dev/reference/expressions.html?highlight=comprehensions#dictionary-displays
Set Comprehensions
>>> {i**2 for i in range(5)}
set([0, 1, 4, 16, 9])
Wikipedia Entry [2]
[1] http://docs.python.org/dev/reference/expressions.html?highlight=comprehensions#set-displays
object1.add(object2) should always be object1 + object2. - fncomp
Access dictionary elements as attributes (properties). So if a1 = AttrDict() has a key 'name', then instead of a1['name'] we can easily access the name attribute of a1 as a1.name:
class AttrDict(dict):
    def __getattr__(self, name):
        if name in self:
            return self[name]
        raise AttributeError('%s not found' % name)
    def __setattr__(self, name, value):
        self[name] = value
    def __delattr__(self, name):
        del self[name]
person = AttrDict({'name': 'John Doe', 'age': 66})
print person['name']
print person.name
person.name = 'Frodo G'
print person.name
del person.age
print person
Tuple unpacking in for loops, list comprehensions and generator expressions:
>>> l=[(1,2),(3,4)]
>>> [a+b for a,b in l ]
[3, 7]
Useful in this idiom for iterating over (key,data) pairs in dictionaries:
d = { 'x':'y', 'f':'e'}
for name, value in d.items():  # one can also use iteritems()
    print "name:%s, value:%s" % (name, value)
prints:
name:x, value:y
name:f, value:e
This is especially useful when l is replaced with zip(something). - asmeurer
The first-classness of everything ('everything is an object'), and the mayhem this can cause.
>>> x = 5
>>> y = 10
>>>
>>> def sq(x):
... return x * x
...
>>> def plus(x):
... return x + x
...
>>> (sq,plus)[y>x](y)
20
The last line creates a tuple containing the two functions, then evaluates y>x (True) and uses that as an index to the tuple (by casting it to an int, 1), and then calls that function with parameter y and shows the result.
For further abuse, if you were returning an object with an index (e.g. a list) you could add further square brackets on the end; if the contents were callable, more parentheses, and so on. For extra perversion, use the result of code like this as the expression in another example (i.e. replace y>x with this code):
(sq,plus)[y>x](y)[4](x)
This showcases two facets of Python - the 'everything is an object' philosophy taken to the extreme, and the methods by which improper or poorly-conceived use of the language's syntax can lead to completely unreadable, unmaintainable spaghetti code that fits in a single expression.
Taking advantage of python's dynamic nature to have an apps config files in python syntax. For example if you had the following in a config file:
{
"name1": "value1",
"name2": "value2"
}
Then you could trivially read it like:
config = eval(open("filename").read())
Be careful with eval. - a paid nerd
Consider ConfigParser instead - 10 lines of code give you a full-blown mechanism for creating universally readable configuration files. This approach is really not portable and not extensible. - Escualo
Never use eval() for anything, especially settings. I always wrap Django settings around ConfigParser and save the actual information in a permission-guarded file. As Rasmus Lerdorf said, "If eval() is the answer, you're almost certainly asking the wrong question." - AdmiralNemo
Method replacement for object instance
You can replace methods of already-created object instances. This allows you to create an object instance with different (exceptional) functionality:
>>> class C(object):
...     def fun(self):
...         print "C.a", self
...
>>> inst = C()
>>> inst.fun() # C.a method is executed
C.a <__main__.C object at 0x00AE74D0>
>>> instancemethod = type(C.fun)
>>>
>>> def fun2(self):
...     print "fun2", self
...
>>> inst.fun = instancemethod(fun2, inst, C)  # now we replace fun with fun2
>>> inst.fun() # ... and fun2 is executed
fun2 <__main__.C object at 0x00AE74D0>
As we can see, fun was replaced by fun2() in the inst instance (self didn't change).
Alternatively we may use the new module, but it has been deprecated since Python 2.6:
>>> def fun3(self):
...     print "fun3", self
...
>>> import new
>>> inst.fun = new.instancemethod(fun3, inst, C)
>>> inst.fun()
fun3 <__main__.C object at 0x00AE74D0>
Note: This solution shouldn't be used as a general replacement for the inheritance mechanism! But it may be very handy in some specific situations (debugging, mocking).
Warning: This solution will not work for built-in types and for new style classes using slots.
I try to limit the use of instancemethod to classes, particularly so that the binding behavior foo.method works normally. If I'm binding self explicitly, I'll instead use functools.partial, which achieves the same effect but makes it a bit clearer that the binding is explicit. - SingleNegationElimination
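A sketch of the functools.partial variant mentioned above, which makes the explicit self-binding visible at a glance (the class and function names are illustrative):

```python
import functools

class C(object):
    def fun(self):
        return "original"

def fun2(self):
    return "replacement bound to %r" % self

inst = C()
# instead of instancemethod(fun2, inst, C), bind self explicitly:
inst.fun = functools.partial(fun2, inst)
print(inst.fun())   # replacement bound to <__main__.C object at ...>
print(C().fun())    # other instances are untouched: original
```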
With a minute amount of work, the threading module becomes amazingly easy to use. This decorator changes a function so that it runs in its own thread, returning a placeholder class instance instead of its regular result. You can probe for the answer by checking placeholder.result or wait for it by calling placeholder.awaitResult().
import threading
import time  # used in the docstring example

def threadify(function):
    """
    exceptionally simple threading decorator. Just:

    >>> @threadify
    ... def longOperation(result):
    ...     time.sleep(3)
    ...     return result
    >>> A = longOperation("A has finished")
    >>> B = longOperation("B has finished")

    A doesn't have a result yet:

    >>> print A.result
    None

    until we wait for it:

    >>> print A.awaitResult()
    A has finished

    we could also wait manually - half a second more should be enough for B:

    >>> time.sleep(0.5); print B.result
    B has finished
    """
    class thr(threading.Thread, object):
        def __init__(self, *args, **kwargs):
            threading.Thread.__init__(self)
            self.args, self.kwargs = args, kwargs
            self.result = None
            self.start()
        def awaitResult(self):
            self.join()
            return self.result
        def run(self):
            self.result = function(*self.args, **self.kwargs)
    return thr
There are no secrets in Python ;)
You can assign several variables to the same value
>>> foo = bar = baz = 1
>>> foo, bar, baz
(1, 1, 1)
Useful to initialize several variables to None in a compact way.
You can also compare foo == bar == baz; it's essentially the same thing as (what is right now) the top answer. - asmeurer
It's fine for None, though, since it is a singleton object. - asmeurer
Combine unpacking with the print function:
# in 2.6 <= python < 3.0, 3.0 + the print function is native
from __future__ import print_function
mylist = ['foo', 'bar', 'some other value', 1,2,3,4]
print(*mylist)
I'd rather write print(' '.join([str(x) for x in mylist])). Using unpacking like this is too clever. - Brian
The * means you're unpacking a list or tuple, and you can use the sep keyword. - Wayne Werner
Some people consider * and ** to be too magical, period. - ssokolow
Or print(' '.join(word for word in mylist)) - Paweł Prażak
insert vs append
Not a feature, but it may be interesting.
Suppose you want to insert some data in a list and then reverse it. The easiest thing is:
count = 10 ** 5
nums = []
for x in range(count):
    nums.append(x)
nums.reverse()
Then you think: what about inserting the numbers from the beginning instead? So:
count = 10 ** 5
nums = []
for x in range(count):
    nums.insert(0, x)
But it turns out to be 100 times slower! If we set count = 10 ** 6, it will be 1,000 times slower. This is because the insert loop is O(n^2) overall, while the append loop is O(n): insert has to shift every element already in the list each time it's called, while append just adds the element at the end (sometimes it has to reallocate everything, but appending is still amortized O(1)).
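You can confirm the gap with the timeit module (discussed elsewhere on this page); the exact timings will vary by machine:

```python
import timeit

setup = 'count = 10000'
append_t = timeit.timeit('nums = []\nfor x in range(count): nums.append(x)',
                         setup=setup, number=5)
insert_t = timeit.timeit('nums = []\nfor x in range(count): nums.insert(0, x)',
                         setup=setup, number=5)
# insert(0, ...) should be dramatically slower than append
print('append: %.4fs, insert(0): %.4fs' % (append_t, insert_t))
```

If you really need cheap inserts at the front, collections.deque.appendleft() is O(1).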
A module exports EVERYTHING in its namespace
Including names imported from other modules!
# this is "answer42.py"
from operator import *
from inspect import *
Now test what's importable from the module.
>>> import answer42
>>> answer42.__dict__.keys()
['gt', 'imul', 'ge', 'setslice', 'ArgInfo', 'getfile', 'isCallable', 'getsourcelines', 'CO_OPTIMIZED', 'le', 're', 'isgenerator', 'ArgSpec', 'imp', 'lt', 'delslice', 'BlockFinder', 'getargspec', 'currentframe', 'CO_NOFREE', 'namedtuple', 'rshift', 'string', 'getframeinfo', '__file__', 'strseq', 'iconcat', 'getmro', 'mod', 'getcallargs', 'isub', 'getouterframes', 'isdatadescriptor', 'modulesbyfile', 'setitem', 'truth', 'Attribute', 'div', 'CO_NESTED', 'ixor', 'getargvalues', 'ismemberdescriptor', 'getsource', 'isMappingType', 'eq', 'index', 'xor', 'sub', 'getcomments', 'neg', 'getslice', 'isframe', '__builtins__', 'abs', 'getmembers', 'mul', 'getclasstree', 'irepeat', 'is_', 'getitem', 'indexOf', 'Traceback', 'findsource', 'ModuleInfo', 'ipow', 'TPFLAGS_IS_ABSTRACT', 'or_', 'joinseq', 'is_not', 'itruediv', 'getsourcefile', 'dis', 'os', 'iand', 'countOf', 'getinnerframes', 'pow', 'pos', 'and_', 'lshift', '__name__', 'sequenceIncludes', 'isabstract', 'isbuiltin', 'invert', 'contains', 'add', 'isSequenceType', 'irshift', 'types', 'tokenize', 'isfunction', 'not_', 'istraceback', 'getmoduleinfo', 'isgeneratorfunction', 'getargs', 'CO_GENERATOR', 'cleandoc', 'classify_class_attrs', 'EndOfBlock', 'walktree', '__doc__', 'getmodule', 'isNumberType', 'ilshift', 'ismethod', 'ifloordiv', 'formatargvalues', 'indentsize', 'getmodulename', 'inv', 'Arguments', 'iscode', 'CO_NEWLOCALS', 'formatargspec', 'iadd', 'getlineno', 'imod', 'CO_VARKEYWORDS', 'ne', 'idiv', '__package__', 'CO_VARARGS', 'attrgetter', 'methodcaller', 'truediv', 'repeat', 'trace', 'isclass', 'ior', 'ismethoddescriptor', 'sys', 'isroutine', 'delitem', 'stack', 'concat', 'getdoc', 'getabsfile', 'ismodule', 'linecache', 'floordiv', 'isgetsetdescriptor', 'itemgetter', 'getblock']
>>> from answer42 import getmembers
>>> getmembers
<function getmembers at 0xb74b2924>
>>>
That's a good reason not to use from x import *, and to define __all__.
That's why __all__ exists: to limit what's exported. It's even in the tutorial. - Cat Plus Plus
A module may use math to do its thing, and cope with standard arithmetic in a later version; importing modules should not know about that (traditionally). - Apalala
If what you want is math.sqrt(), then you should import it from math, not from answer42. - Apalala
Why would int or float be different? All of them are objects and all of them reside in the module's namespace. The design is consistent. It lets you do stuff like def f(m, x): return m.sqrt(x); f(math, x), although this is not exactly good coding style. The "feature" is unusual and unexpected only if you compare it with other languages (that are less "dynamic", whatever that means). - Cristian Ciupitu
A practical use of this is the os module, which does stuff like import posixpath as path or import ntpath as path, posixpath and ntpath being other modules, of course. - Cristian Ciupitu
Unicode identifiers in Python 3:
>>> 'Unicode字符_تكوين_Variable'.isidentifier()
True
>>> Unicode字符_تكوين_Variable='Python3 rules!'
>>> Unicode字符_تكوين_Variable
'Python3 rules!'
Python has exceptions for very unexpected things:
Imports
This lets you import an alternative if a lib is missing:
try:
    import json
except ImportError:
    import simplejson as json
Iteration
For loops do this internally, and catch StopIteration:
iter([]).next()
Traceback (most recent call last):
  File "<pyshell#4>", line 1, in <module>
    iter([]).next()
StopIteration
Assertion
>>> try:
...     assert []
... except AssertionError:
...     print "This list should not be empty"
This list should not be empty
While this is more verbose for one check, multiple checks mixing exceptions and boolean operators with the same error message can be shortened this way.
Everything is dynamic
"There is no compile-time". Everything in Python is runtime. A module is 'defined' by executing the module's source top-to-bottom, just like a script, and the resulting namespace is the module's attribute-space. Likewise, a class is 'defined' by executing the class body top-to-bottom, and the resulting namespace is the class's attribute-space. A class body can contain completely arbitrary code -- including import statements, loops and other class statements. Creating a class, function or even module 'dynamically', as is sometimes asked for, isn't hard; in fact, it's impossible to avoid, since everything is 'dynamic'.
Not quite everything: you can't add attributes to some built-in types (try dict.x = 3; Python won't let you). - Denilson Sá Maia
The failure of dict.x = 3 has nothing to do with things being dynamic, but with the dict type not allowing attributes to be assigned. You can make your own classes, in Python, that don't allow that. You can make your own type, in C, that does allow it. It's unrelated. - Thomas Wouters
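As one sketch of this dynamism, a class can be built entirely at runtime with the three-argument type() call (the Greeter class and make_method helper here are made up for illustration):

```python
def make_method(word):
    # hypothetical helper: build a method that greets with `word`
    def method(self):
        return word.capitalize() + '!'
    return method

# type(name, bases, namespace) builds a class at runtime,
# exactly as a `class` statement would
Greeter = type('Greeter', (object,),
               {w: make_method(w) for w in ('hello', 'goodbye')})

g = Greeter()
print(g.hello())     # Hello!
print(g.goodbye())   # Goodbye!
```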
Objects in boolean context
Empty tuples, lists, dicts, strings and many other objects are equivalent to False in a boolean context (and non-empty ones are equivalent to True).
empty_tuple = ()
empty_list = []
empty_dict = {}
empty_string = ''
empty_set = set()
if empty_tuple or empty_list or empty_dict or empty_string or empty_set:
    print 'Never happens!'
This allows logical operations to return one of their operands instead of True/False, which is useful in some situations:
s = t or "Default value" # s will be assigned "Default value"
# if t is false/empty/none
There's a common idiom in Python of denoting methods and other class members that are not intended to be part of the class's external API by giving them names that start with underscores. This is convenient and works very well in practice, but it gives the false impression that Python does not support true encapsulation of private code and/or data. In fact, Python automatically gives you lexical closures [1], which make it very easy to encapsulate data in a much more bulletproof way when the situation really warrants it. Here's a contrived example of a class that makes use of this technique:
class MyClass(object):
    def __init__(self):
        privateData = {}
        self.publicData = 123
        def privateMethod(k):
            print privateData[k] + self.publicData
        def privilegedMethod():
            privateData['foo'] = "hello "
            privateMethod('foo')
        self.privilegedMethod = privilegedMethod
    def publicMethod(self):
        print self.publicData
And here's a contrived example of its use:
>>> obj = MyClass()
>>> obj.publicMethod()
123
>>> obj.publicData = 'World'
>>> obj.publicMethod()
World
>>> obj.privilegedMethod()
hello World
>>> obj.privateMethod()
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
AttributeError: 'MyClass' object has no attribute 'privateMethod'
>>> obj.privateData
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
AttributeError: 'MyClass' object has no attribute 'privateData'
The key is that privateMethod
and privateData
aren't really attributes of obj at all, so they can't be accessed from outside, nor do they show up in dir()
or similar. They're local variables in the constructor, completely inaccessible outside of __init__
. However, because of the magic of closures, they really are per-instance variables with the same lifetime as the object with which they're associated, even though there's no way to access them from outside except (in this example) by invoking privilegedMethod
. Often this sort of very strict encapsulation is overkill, but sometimes it really can be very handy for keeping an API or a namespace squeaky clean.
In Python 2.x, the only way to have mutable private state is with a mutable object (such as the dict in this example). Many people have remarked on how annoying this can be. Python 3.x will remove this restriction by introducing the nonlocal keyword described in PEP 3104 [2].
You could use a __getattribute__ or __getattr__ sentinel and route accepted calls accordingly. Again, secrecy in Python isn't a good idea. - Jeffrey Jose
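In Python 3 the same closure-based privacy works for mutable scalars too, thanks to nonlocal; a minimal sketch:

```python
def make_counter():
    count = 0                # private, closed-over state
    def increment():
        nonlocal count       # Python 3: rebind the enclosing variable
        count += 1
        return count
    return increment

tick = make_counter()
print(tick(), tick(), tick())  # 1 2 3
```

There is no way to read or reset count from outside except through the functions that close over it.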
Functional support.
Generators and generator expressions, specifically.
Ruby made this mainstream again, but Python can do it just as well. Not as ubiquitous in the libraries as in Ruby, which is too bad, but I like the syntax better, it's simpler.
Because they're not as ubiquitous, I don't see as many examples out there on why they're useful, but they've allowed me to write cleaner, more efficient code.
Simulating the ternary operator using and and or.
The and and or operators in Python return one of their operands rather than a Boolean. Thus:
In [18]: a = True
In [19]: a and 3 or 4
Out[19]: 3
In [20]: a = False
In [21]: a and 3 or 4
Out[21]: 4
However, Python 2.5 added an explicit ternary (conditional) operator:
In [22]: a = 5 if True else '6'
In [23]: a
Out[23]: 5
Well, this only works if you are sure that your true clause does not evaluate to something falsy. Example:
>>> def foo():
... print "foo"
... return 0
...
>>> def bar():
... print "bar"
... return 1
...
>>> 1 and foo() or bar()
foo
bar
1
To get it right, you've got to do just a little bit more:
>>> (1 and [foo()] or [bar()])[0]
foo
0
However, this isn't as pretty. If your version of Python supports it, use the conditional expression:
>>> foo() if True else bar()
foo
0
(falseValue, trueValue)[cond]
is a cleaner (IMO) way to simulate a ternary operator. - Ponkadoodle
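A minimal sketch of the (falseValue, trueValue)[cond] idiom; note that, unlike a real conditional expression, both values are evaluated eagerly:

```python
x = 5
# bool(cond) indexes the tuple: False -> index 0, True -> index 1
result = ("small", "big")[x > 3]
print(result)  # big
```

Because both tuple elements are built before the indexing happens, avoid this form when either value is expensive to compute or has side effects.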
If you've renamed a class in your application where you're loading user-saved files via Pickle, and one of the renamed classes are stored in a user's old save, you will not be able to load in that pickled file.
However, simply add in a reference to your class definition and everything's good:
e.g., before:
class Bleh:
pass
now,
class Blah:
pass
so, your user's pickled saved file contains a reference to Bleh, which doesn't exist due to the rename. The fix?
Bleh = Blah
simple!
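Here is a sketch of the fix; the byte replacement only simulates a user's old save that refers to the former class name Bleh (a real old file would contain those bytes already):

```python
import pickle

class Blah:              # the class after the rename
    pass

Bleh = Blah              # alias so old pickles still resolve

# Simulate a user's old save: the pickled bytes refer to "Bleh".
old_save = pickle.dumps(Blah(), protocol=0).replace(b'Blah', b'Bleh')

obj = pickle.loads(old_save)   # lookup of Bleh finds the alias
print(type(obj).__name__)      # Blah
```

Unpickling looks the class up by module and name, so any attribute that resolves to the new class, including a plain alias, is enough.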
The fact that EVERYTHING is an object, and as such is extensible. I can add member variables as metadata to a function that I define:
>>> def addInts(x,y):
... return x + y
>>> addInts.params = ['integer','integer']
>>> addInts.returnType = 'integer'
This can be very useful for writing dynamic unit tests, e.g.
Simple way to test if a key is in a dict:
>>> 'key' in { 'key' : 1 }
True
>>> d = dict(key=1, key2=2)
>>> if 'key' in d:
... print 'Yup'
...
Yup
Using sets to reference contents in sets of frozensets
As you probably know, sets are mutable and thus not hashable, so it's necessary to use frozensets if you want to make a set of sets (or use sets as dictionary keys):
>>> fabc = frozenset('abc')
>>> fxyz = frozenset('xyz')
>>> mset = set((fabc, fxyz))
>>> mset
{frozenset({'a', 'c', 'b'}), frozenset({'y', 'x', 'z'})}
However, it's possible to test for membership and remove/discard members using just ordinary sets:
>>> abc = set('abc')
>>> abc in mset
True
>>> mset.remove(abc)
>>> mset
{frozenset({'y', 'x', 'z'})}
To quote from the Python Standard Library docs:
Note, the
elem
argument to the__contains__()
,remove()
, anddiscard()
methods may be a set. To support searching for an equivalent frozenset, theelem
set is temporarily mutated during the search and then restored. During the search, theelem
set should not be read or mutated since it does not have a meaningful value.
Unfortunately, and perhaps astonishingly, the same is not true of dictionaries:
>>> mdict = {fabc:1, fxyz:2}
>>> fabc in mdict
True
>>> abc in mdict
Traceback (most recent call last):
File "<interactive input>", line 1, in <module>
TypeError: unhashable type: 'set'
Variables that start, but do not end, with a double underscore become private, and not just by convention: __var is actually mangled to _Classname__var, where Classname is the class in which the variable was created. Such names are not inherited and cannot accidentally be overridden by subclasses.
>>> class A:
... def __init__(self):
... self.__var = 5
... def getvar(self):
... return self.__var
...
>>> a = A()
>>> a.__var
Traceback (most recent call last):
File "", line 1, in
AttributeError: A instance has no attribute '__var'
>>> a.getvar()
5
>>> dir(a)
['_A__var', '__doc__', '__init__', '__module__', 'getvar']
>>>
While not very Pythonic, you can write to a file using print [1]:
print>>outFile, 'I am Being Written'
Explanation [1]: This form is sometimes referred to as "print chevron." The first expression after >> must evaluate to a "file-like" object, specifically an object that has a write() method as described above. With this extended form, the subsequent expressions are printed to this file object. If the first expression evaluates to None, then sys.stdout is used as the file for output.
[1] http://docs.python.org/reference/simple_stmts.html#the-print-statement
In Python 3 this becomes print('I am being written', file=outFile). I was just reading about the changes, so now it actually is much more Pythonic. - shadowland
Print multiline strings one screenful at a time
A not-really-useful feature hidden in the site._Printer class, of which the license object is an instance. When called, license prints the Python license. You can create another object of the same type, passing a string (e.g. the contents of a file) as the second argument, and call it:
type(license)(0,open('textfile.txt').read(),0)()
That will print the file contents, split a certain number of lines at a time:
...
file row 21
file row 22
file row 23
Hit Return for more, or q (and Return) to quit:
If you use exec
in a function the variable lookup rules change drastically. Closures are no longer possible but Python allows arbitrary identifiers in the function. This gives you a "modifiable locals()" and can be used to star-import identifiers. On the downside it makes every lookup slower because the variables end up in a dict rather than slots in the frame:
>>> def f():
... exec "a = 42"
... return a
...
>>> def g():
... a = 42
... return a
...
>>> import dis
>>> dis.dis(f)
2 0 LOAD_CONST 1 ('a = 42')
3 LOAD_CONST 0 (None)
6 DUP_TOP
7 EXEC_STMT
3 8 LOAD_NAME 0 (a)
11 RETURN_VALUE
>>> dis.dis(g)
2 0 LOAD_CONST 1 (42)
3 STORE_FAST 0 (a)
3 6 LOAD_FAST 0 (a)
9 RETURN_VALUE
The spam module in standard Python
It is used for testing purposes.
I've picked it from
ctypes
tutorial
[1]. Try it yourself:
>>> import __hello__
Hello world...
>>> type(__hello__)
<type 'module'>
>>> from __phello__ import spam
Hello world...
Hello world...
>>> type(spam)
<type 'module'>
>>> help(spam)
Help on module __phello__.spam in __phello__:
NAME
__phello__.spam
FILE
c:\python26\<frozen>
[1] http://starship.python.net/crew/theller/ctypes/tutorial.html
ctypes tutorial: starship.python.net/crew/theller/ctypes/… - jfs
Memory Management
Python dynamically allocates memory and uses garbage collection to recover unused space. Once an object is out of scope, and no other variables reference it, it will be recovered. I do not have to worry about buffer overruns and slowly growing server processes. Memory management is also a feature of other dynamic languages but Python just does it so well.
Of course, we must watch out for circular references and keeping references to objects which are no longer needed, but weak references help a lot here.
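A small sketch of a weak reference; in CPython, reference counting reclaims the object as soon as the last strong reference goes away, at which point the weak reference goes dead:

```python
import weakref

class Node:
    pass

n = Node()
r = weakref.ref(n)     # does not keep n alive
print(r() is n)        # True: the referent still exists
del n                  # last strong reference gone (CPython collects now)
print(r())             # None: the weak reference is dead
```

This is why weak references are handy for caches and back-pointers: they let you observe an object without extending its lifetime.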
The getattr built-in function:
>>> class C():
def getMontys(self):
self.montys = ['Cleese','Palin','Idle','Gilliam','Jones','Chapman']
return self.montys
>>> c = C()
>>> getattr(c,'getMontys')()
['Cleese', 'Palin', 'Idle', 'Gilliam', 'Jones', 'Chapman']
>>>
Useful if you want to dispatch a function depending on the context. See examples in Dive Into Python (here [1]).
[1] http://diveintopython.net/power_of_introspection/getattr.html
Classes as first-class objects (shown through a dynamic class definition)
Note the use of the closure as well. If this particular example looks like a "right" approach to a problem, carefully reconsider ... several times :)
def makeMeANewClass(parent, value):
class IAmAnObjectToo(parent):
def theValue(self):
return value
return IAmAnObjectToo
Klass = makeMeANewClass(str, "fred")
o = Klass()
print isinstance(o, str) # => True
print o.theValue() # => fred
Regarding Nick Johnson's implementation of a Property class [1] (just a demonstration of descriptors, of course, not a replacement for the built-in), I'd include a setter that raises an AttributeError:
class Property(object):
    def __init__(self, fget):
        self.fget = fget
    def __get__(self, obj, type):
        if obj is None:
            return self
        return self.fget(obj)
    def __set__(self, obj, value):
        raise AttributeError, 'Read-only attribute'
Including the setter makes this a data descriptor as opposed to a method/non-data descriptor. A data descriptor has precedence over instance dictionaries. Now an instance can't have a different object assigned to the property name, and attempts to assign to the property will raise an attribute error.
[1] https://stackoverflow.com/questions/101268/hidden-features-of-python/102062#102062
Not at all a hidden feature but still nice:
import os.path as op
root_dir = op.abspath(op.join(op.dirname(__file__), ".."))
Saves lots of characters when manipulating paths !
Ever used xrange(INT) instead of range(INT)? It uses less memory, and its memory usage doesn't depend on the size of the range. Yey! Isn't that good?
Not really a hidden feature, but something that might come in handy for looping through the items in a list pairwise:
for x, y in zip(s, s[1:]):
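For example, assuming a small list s:

```python
s = [1, 2, 3, 4, 5]
# zip pairs each item with its successor; the shorter slice ends the loop
pairs = [(x, y) for x, y in zip(s, s[1:])]
print(pairs)  # [(1, 2), (2, 3), (3, 4), (4, 5)]
```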
>>> float('infinity')
inf
>>> float('NaN')
nan
More info:
float('inf')
would do nicely, not requiring any special code at all (inf - 1 == inf
). - Chris Morgan
>>> x=[1,1,2,'a','a',3]
>>> y = [ _x for _x in x if not _x in locals()['_[1]'] ]
>>> y
[1, 2, 'a', 3]
"locals()['_[1]']" is the "secret name" of the list being created. Very useful when state of list being built affects subsequent build decisions.
mapreduce using map and reduce functions
create a simple sumproduct this way:
def sumprod(x,y):
return reduce(lambda a,b:a+b, map(lambda a, b: a*b,x,y))
example:
In [2]: sumprod([1,2,3],[4,5,6])
Out[2]: 32
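The same sumproduct can be written with the built-in sum and a generator expression, which many find clearer than nesting reduce and map:

```python
def sumprod(x, y):
    # multiply element-wise, then add up the products
    return sum(a * b for a, b in zip(x, y))

print(sumprod([1, 2, 3], [4, 5, 6]))  # 32
```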
Not a programming feature but is useful when using Python with bash
or shell scripts
.
python -c"import os; print(os.getcwd());"
See the python documentation here [1]. Additional things to note when writing longer Python scripts can be seen in this discussion [2].
[1] http://docs.python.org/using/cmdline.html#cmdoption-unittest-discover-c
Python's positional and keyword expansions can be used on the fly, not just from a stored list.
l=lambda x,y,z:x+y+z
a=1,2,3
print l(*a)
print l(*[a[0],2,3])
It is usually more useful with things like this:
a=[2,3]
l(*(a+[3]))
You can construct a functions kwargs on demand:
kwargs = {}
kwargs[str("%s__icontains" % field)] = some_value
some_function(**kwargs)
The str() call is somehow needed, since Python otherwise complains that it is not a string. I don't know why ;) I use this for dynamic filters within Django's object model:
result = model_class.objects.filter(**kwargs)
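Outside Django, the same trick works with any function that accepts keyword arguments; query, field, and the lookup-style key here are just illustrative names standing in for the ORM call:

```python
def query(**kwargs):
    # stand-in for model_class.objects.filter(**kwargs)
    return kwargs

field = "name"
kwargs = {"%s__icontains" % field: "smith"}
print(query(**kwargs))  # {'name__icontains': 'smith'}
```

The key point is that ** expansion only needs the dict keys to be strings; they can be assembled at runtime from any data you like.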
Multiply a string to get it repeated
print "SO"*5
gives
SOSOSOSOSO
[3]*3 == [3, 3, 3]
- inspectorG4dget
commands.getoutput
If you want to get the output of a function which outputs directly to stdout
or stderr
as is the case with os.system
,
commands.getoutput
[1] comes to the rescue. The whole module is just made of awesome.
>>> print commands.getoutput('ls')
myFile1.txt myFile2.txt myFile3.txt myFile4.txt myFile5.txt
myFile6.txt myFile7.txt myFile8.txt myFile9.txt myFile10.txt
myFile11.txt myFile12.txt myFile13.txt myFile14.txt module.py
[1] http://docs.python.org/library/commands.html
Since commands is deprecated in favor of the subprocess module and has been removed in Python 3.0, shouldn't you be talking about subprocess instead of commands? - ssokolow
subprocess.check_output
- wim
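In Python 3 the commands module is gone; a rough equivalent of commands.getoutput using subprocess (assuming a POSIX echo is available on the system):

```python
import subprocess

# check_output runs the command and returns its captured stdout as bytes
out = subprocess.check_output(["echo", "hello"])
print(out.decode().strip())  # hello
```

Python 3 also provides subprocess.getoutput(cmd) for shell-string commands, which is the closest drop-in replacement.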
Here is a helpful function I use when debugging type errors
def typePrint(object):
print(str(object) + " - (" + str(type(object)) + ")")
It simply prints the input followed by the type, for example
>>> a = 101
>>> typePrint(a)
101 - (<type 'int'>)
Interactive Debugging of Scripts (and doctest strings)
I don't think this is as widely known as it could be, but add this line to any python script:
import pdb; pdb.set_trace()
will cause the PDB debugger to pop up with the run cursor at that point in the code. What's even less known, I think, is that you can use that same line in a doctest:
"""
>>> 1 in (1,2,3)
Becomes
>>> import pdb; pdb.set_trace(); 1 in (1,2,3)
"""
You can then use the debugger to checkout the doctest environment. You can't really step through a doctest because the lines are each run autonomously, but it's a great tool for debugging the doctest globs and environment.
In Python 2 you can generate the repr() string of an expression by enclosing it in backticks:
>>> `sorted`
'<built-in function sorted>'
This is gone in Python 3.x.
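In Python 3, use repr() (or the !r conversion in format strings) instead of backticks:

```python
# repr() is the Python 3 spelling of the backtick syntax
print(repr(sorted))  # <built-in function sorted>
```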
some cool features with reduce and operator.
>>> from operator import add,mul
>>> reduce(add,[1,2,3,4])
10
>>> reduce(mul,[1,2,3,4])
24
>>> reduce(add,[[1,2,3,4],[1,2,3,4]])
[1, 2, 3, 4, 1, 2, 3, 4]
>>> reduce(add,(1,2,3,4))
10
>>> reduce(mul,(1,2,3,4))
24
is_ok() and "Yes" or "No"
True and False or True is True, but False if True else True is False, which is almost certainly what you expected. This is especially important where the operands have side effects; the conditional operator will NEVER evaluate more than one of its branch clauses. - SingleNegationElimination
for line in open('foo'):
print(line)
which is equivalent (but better) to:
f = open('foo', 'r')
for line in f.readlines():
print(line)
f.close()
To activate autocompletion in IDEs that support it (like IDLE, Editra, IEP): instead of typing "hi". and then hitting TAB, you can cheat in the IDE by typing just hi". (note: no opening quote) and hitting TAB, because the completer only looks at the text after the latest punctuation. It's like when you type : and hit Enter and an indentation is added automatically. Don't know if it will make a change, but it's a tip, no more :)
Braces
def g():
print 'hi!'
def f(): (
g()
)
>>> f()
hi!
You can use #{ and #} for braces. Subject to certain lexical constraints. - detly
10 > x <= 9 is the same as x <= 9 (ignoring overloaded operators, that is). - tzot
Lisp also chains comparisons: (< 1 x 10). You can even apply them to single arguments, like (= 10): cs.cmu.edu/Groups/AI/html/hyperspec/HyperSpec/Body/… - Ken
Watch out for chains like A in B == C in D. - Joschua
You can also do a == b == c, and it will return True only if all three items are equal. - asmeurer
is not and not in are similarly surprisingly good too. Apparently is not is one binary operator, not a binary and then a unary. not in is the same too. This makes code like 'foo' is not 'bar' so much more readable. - Y.H Wong