A new release of Python, version 2.0, was released on October 16, 2000. This article covers the exciting new features in 2.0, highlights some other useful changes, and points out a few incompatible changes that may require rewriting code.
Python’s development never completely stops between releases, and a steady flow of bug fixes and improvements is always being submitted. A host of minor fixes, a few optimizations, additional docstrings, and better error messages went into 2.0; to list them all would be impossible, but they’re certainly significant. Consult the publicly available CVS logs if you want to see the full list. This progress is due to the five developers working for PythonLabs now getting paid to spend their days fixing bugs, and also due to the improved communication resulting from moving to SourceForge.
What About Python 1.6?
Python 1.6 can be thought of as the Contractual Obligations Python release. After the core development team left CNRI in May 2000, CNRI requested that a 1.6 release be created, containing all the work on Python that had been performed at CNRI. Python 1.6 therefore represents the state of the CVS tree as of May 2000, with the most significant new feature being Unicode support. Development continued after May, of course, so the 1.6 tree received a few fixes to ensure that it’s forward-compatible with Python 2.0. 1.6 is therefore part of Python’s evolution, and not a side branch.
So, should you take much interest in Python 1.6? Probably not. The 1.6final and 2.0beta1 releases were made on the same day (September 5, 2000), the plan being to finalize Python 2.0 within a month or so. If you have applications to maintain, there seems little point in breaking things by moving to 1.6, fixing them, and then having another round of breakage within a month by moving to 2.0; you’re better off just going straight to 2.0. Most of the really interesting features described in this document are only in 2.0, because a lot of work was done between May and September.
New Development Process
The most important change in Python 2.0 may not be to the code at all, but to how Python is developed: in May 2000 the Python developers began using the tools made available by SourceForge for storing source code, tracking bug reports, and managing the queue of patch submissions. To report bugs or submit patches for Python 2.0, use the bug tracking and patch manager tools available from Python’s project page, located at http://sourceforge.net/projects/python/.
The most important of the services now hosted at SourceForge is the Python CVS tree, the version-controlled repository containing the source code for Python. Previously, there were roughly 7 or so people who had write access to the CVS tree, and all patches had to be inspected and checked in by one of the people on this short list. Obviously, this wasn’t very scalable. By moving the CVS tree to SourceForge, it became possible to grant write access to more people; as of September 2000 there were 27 people able to check in changes, a fourfold increase. This makes possible large-scale changes that wouldn’t be attempted if they had to be filtered through the small group of core developers. For example, one day Peter Schneider-Kamp took it into his head to drop K&R C compatibility and convert the C source for Python to ANSI C. After getting approval on the python-dev mailing list, he launched into a flurry of checkins that lasted about a week; other developers joined in to help, and the job was done. If there were only 5 people with write access, that task would probably have been viewed as “nice, but not worth the time and effort needed” and it would never have gotten done.
The shift to using SourceForge’s services has resulted in a remarkable increase in the speed of development. Patches now get submitted, commented on, revised by people other than the original submitter, and bounced back and forth between people until the patch is deemed worth checking in. Bugs are tracked in one central location and can be assigned to a specific person for fixing, and we can count the number of open bugs to measure progress. This didn’t come without a cost: developers now have more e-mail to deal with, more mailing lists to follow, and special tools had to be written for the new environment. For example, SourceForge sends default patch and bug notification e-mail messages that are completely unhelpful, so Ka-Ping Yee wrote an HTML screen-scraper that sends more useful messages.
The ease of adding code caused a few initial growing pains, such as code being checked in before it was ready or without clear agreement from the developer group. The approval process that has emerged is somewhat similar to that used by the Apache group. Developers can vote +1, +0, -0, or -1 on a patch; +1 and -1 denote acceptance or rejection, while +0 and -0 mean the developer is mostly indifferent to the change, though with a slight positive or negative slant. The most significant change from the Apache model is that the voting is essentially advisory, letting Guido van Rossum, who has Benevolent Dictator For Life status, know what the general opinion is. He can still ignore the result of a vote, and approve or reject a change even if the community disagrees with him.
Producing an actual patch is the last step in adding a new feature, and is usually easy compared to the earlier task of coming up with a good design. Discussions of new features can often explode into lengthy mailing list threads, making the discussion hard to follow, and no one can read every posting to python-dev. Therefore, a relatively formal process has been set up to write Python Enhancement Proposals (PEPs), modelled on the Internet RFC process. PEPs are draft documents that describe a proposed new feature, and are continually revised until the community reaches a consensus, either accepting or rejecting the proposal. Quoting from the introduction to PEP 1, “PEP Purpose and Guidelines”:
PEP stands for Python Enhancement Proposal. A PEP is a design document providing information to the Python community, or describing a new feature for Python. The PEP should provide a concise technical specification of the feature and a rationale for the feature.
We intend PEPs to be the primary mechanisms for proposing new features, for collecting community input on an issue, and for documenting the design decisions that have gone into Python. The PEP author is responsible for building consensus within the community and documenting dissenting opinions.
Read the rest of PEP 1 for the details of the PEP editorial process, style, and format. PEPs are kept in the Python CVS tree on SourceForge, though they’re not part of the Python 2.0 distribution, and are also available in HTML form from http://www.python.org/peps/. As of September 2000, there are 25 PEPs, ranging from PEP 201, “Lockstep Iteration”, to PEP 225, “Elementwise/Objectwise Operators”.
The largest new feature in Python 2.0 is a new fundamental data type: Unicode strings. Unicode uses 16-bit numbers to represent characters instead of the 8-bit number used by ASCII, meaning that 65,536 distinct characters can be supported.
The final interface for Unicode support was arrived at through countless often-stormy discussions on the python-dev mailing list, and mostly implemented by Marc-André Lemburg, based on a Unicode string type implementation by Fredrik Lundh. A detailed explanation of the interface was written up as PEP 100, “Python Unicode Integration”. This article will simply cover the most significant points about the Unicode interfaces.
In Python source code, Unicode strings are written as u"string". Arbitrary Unicode characters can be written using a new escape sequence, \uHHHH, where HHHH is a 4-digit hexadecimal number from 0000 to FFFF. The existing \xHHHH escape sequence can also be used, and octal escapes can be used for characters up to U+01FF, which is represented by \777.
Unicode strings, just like regular strings, are an immutable sequence type. They can be indexed and sliced, but not modified in place. Unicode strings have an encode( [encoding] ) method that returns an 8-bit string in the desired encoding. Encodings are named by strings, such as 'ascii', 'utf-8', 'iso-8859-1', or whatever. A codec API is defined for implementing and registering new encodings that are then available throughout a Python program. If an encoding isn’t specified, the default encoding is usually 7-bit ASCII, though it can be changed for your Python installation by calling the sys.setdefaultencoding(encoding) function in a customised version of site.py.
Combining 8-bit and Unicode strings always coerces to Unicode, using the default ASCII encoding; the result of 'a' + u'bc' is u'abc'.
New built-in functions have been added, and existing built-ins modified to support Unicode:
- unichr(ch) returns a Unicode string 1 character long, containing the character ch.
- ord(u), where u is a 1-character regular or Unicode string, returns the number of the character as an integer.
- unicode(string [, encoding] [, errors] ) creates a Unicode string from an 8-bit string. encoding is a string naming the encoding to use. The errors parameter specifies the treatment of characters that are invalid for the current encoding; passing 'strict' as the value causes an exception to be raised on any encoding error, while 'ignore' causes errors to be silently ignored and 'replace' uses U+FFFD, the official replacement character, in case of any problems.
- The exec statement, and various built-ins such as eval(), getattr(), and setattr() will also accept Unicode strings as well as regular strings. (It’s possible that the process of fixing this missed some built-ins; if you find a built-in function that accepts strings but doesn’t accept Unicode strings at all, please report it as a bug.)
A new module, unicodedata, provides an interface to Unicode character properties. For example, unicodedata.category(u'A') returns the 2-character string 'Lu', the 'L' denoting it’s a letter, and 'u' meaning that it’s uppercase. unicodedata.bidirectional(u'\u0660') returns 'AN', meaning that U+0660 is an Arabic number.
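The two lookups can be tried directly. This is a minimal sketch, assuming a current Python 3 interpreter, where every string literal is already a Unicode string and the u prefix is optional:

```python
import unicodedata

# 'A' is an uppercase letter, so its general category is "Lu".
category_of_a = unicodedata.category("A")

# U+0660 (ARABIC-INDIC DIGIT ZERO) has the bidirectional class "AN".
bidi_of_digit = unicodedata.bidirectional("\u0660")

print(category_of_a, bidi_of_digit)
```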
The codecs module contains functions to look up existing encodings and register new ones. Unless you want to implement a new encoding, you’ll most often use the codecs.lookup(encoding) function, which returns a 4-element tuple: (encode_func, decode_func, stream_reader, stream_writer).
- encode_func is a function that takes a Unicode string, and returns a 2-tuple (string, length). string is an 8-bit string containing a portion (perhaps all) of the Unicode string converted into the given encoding, and length tells you how much of the Unicode string was converted.
- decode_func is the opposite of encode_func, taking an 8-bit string and returning a 2-tuple (ustring, length), consisting of the resulting Unicode string ustring and the integer length telling how much of the 8-bit string was consumed.
- stream_reader is a class that supports decoding input from a stream. stream_reader(file_obj) returns an object that supports the read(), readline(), and readlines() methods. These methods will all translate from the given encoding and return Unicode strings.
- stream_writer, similarly, is a class that supports encoding output to a stream. stream_writer(file_obj) returns an object that supports the write() and writelines() methods. These methods expect Unicode strings, translating them to the given encoding on output.
For example, the following code writes a Unicode string into a file, encoding it as UTF-8:
The following code would then read UTF-8 input from the file:
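A minimal sketch of both steps, assuming a current Python 3 interpreter rather than 2.0. An in-memory io.BytesIO buffer stands in for a real file, and the sample string is only illustrative:

```python
import codecs
import io

# The object returned by codecs.lookup() still unpacks as the 4-tuple described above.
encode_func, decode_func, stream_reader, stream_writer = codecs.lookup("utf-8")

unistr = "\u0660\u2000ab"

# Writing: the stream writer encodes Unicode text to UTF-8 bytes on output.
buffer = io.BytesIO()            # stands in for a file opened in binary mode
output = stream_writer(buffer)
output.write(unistr)
encoded = buffer.getvalue()

# Reading: the stream reader decodes the UTF-8 bytes back to a Unicode string.
reader = stream_reader(io.BytesIO(encoded))
round_tripped = reader.read()
```

With a real file, the buffer would be replaced by a file object opened in binary mode and closed once the writer or reader is finished.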
Unicode-aware regular expressions are available through the re module, which has a new underlying implementation called SRE written by Fredrik Lundh of Secret Labs AB.
A -U command line option was added which causes the Python compiler to interpret all string literals as Unicode string literals. This is intended to be used in testing and future-proofing your Python code, since some future version of Python may drop support for 8-bit strings and provide only Unicode strings.
Lists are a workhorse data type in Python, and many programs manipulate a list at some point. Two common operations on lists are to loop over them, and either pick out the elements that meet a certain criterion, or apply some function to each element. For example, given a list of strings, you might want to pull out all the strings containing a given substring, or strip off trailing whitespace from each line.
The existing map() and filter() functions can be used for this purpose, but they require a function as one of their arguments. This is fine if there’s an existing built-in function that can be passed directly, but if there isn’t, you have to create a little function to do the required work, and Python’s scoping rules make the result ugly if the little function needs additional information. Take the first example in the previous paragraph, finding all the strings in the list containing a given substring. You could write the following to do it:
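A minimal sketch of this approach, written for a current Python 3 interpreter rather than 2.0. The sample list and the names lines and substring are illustrative assumptions:

```python
# Hypothetical sample data.
lines = ["Spam and eggs", "eggs only", "More Spam"]
substring = "Spam"

# The default argument carries `substring` into the lambda, which is what
# the scoping rules of the time required. filter() returns an iterator in
# Python 3, hence the list() call.
matches = list(filter(lambda s, substring=substring: substring in s, lines))
```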
Because of Python’s scoping rules, a default argument is used so that the anonymous function created by the lambda statement knows what substring is being searched for. List comprehensions make this cleaner:
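A minimal sketch of the same search as a list comprehension, with the same hypothetical sample data:

```python
# Hypothetical sample data.
lines = ["Spam and eggs", "eggs only", "More Spam"]
substring = "Spam"

# No helper function and no default-argument trick are needed.
matches = [s for s in lines if substring in s]
```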
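List comprehensions have the form: [ expression for expr1 in sequence1 for expr2 in sequence2 ... for exprN in sequenceN if condition ]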
The for...in clauses contain the sequences to be iterated over. The sequences do not have to be the same length, because they are not iterated over in parallel, but from left to right; this is explained more clearly in the following paragraphs. The elements of the generated list will be the successive values of expression. The final if clause is optional; if present, expression is only evaluated and added to the result if condition is true.
To make the semantics very clear, a list comprehension is equivalent to the following Python code:
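A minimal sketch of that equivalence for two sequences and a condition. The sequences and the odd-sum condition are illustrative assumptions:

```python
seq1 = [1, 2, 3]
seq2 = [10, 20]

# The comprehension...
comprehension = [x + y for x in seq1 for y in seq2 if (x + y) % 2 == 1]

# ...and the nested loops it stands for: the leftmost for clause is the
# outermost loop, and the if clause guards the append.
expanded = []
for x in seq1:
    for y in seq2:
        if (x + y) % 2 == 1:
            expanded.append(x + y)
```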
This means that when there are multiple for...in clauses, the length of the resulting list will be equal to the product of the lengths of all the sequences. If you have two lists of length 3, the output list is 9 elements long:
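A minimal sketch with two sequences of length 3 (the sample values are illustrative):

```python
seq1 = "abc"
seq2 = (1, 2, 3)

# Every element of seq1 is paired with every element of seq2.
pairs = [(x, y) for x in seq1 for y in seq2]

print(len(pairs))   # 3 * 3 combinations
```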
To avoid introducing an ambiguity into Python’s grammar, if expression is creating a tuple, it must be surrounded with parentheses. The first list comprehension below is a syntax error, while the second one is correct:
The idea of list comprehensions originally comes from the functional programming language Haskell (http://www.haskell.org). Greg Ewing argued most effectively for adding them to Python and wrote the initial list comprehension patch, which was then discussed for a seemingly endless time on the python-dev mailing list and kept up-to-date by Skip Montanaro.
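A minimal sketch of the two forms. Because a syntax error would stop the whole example from loading, both are passed to compile() as strings; the helper name compiles is hypothetical:

```python
def compiles(source):
    """Return True if `source` parses as a Python expression."""
    try:
        compile(source, "<example>", "eval")
        return True
    except SyntaxError:
        return False

# Unparenthesized tuple: rejected by the parser.
bad = "[x, y for x in seq1 for y in seq2]"
# Parenthesized tuple: accepted.
good = "[(x, y) for x in seq1 for y in seq2]"

print(compiles(bad), compiles(good))
```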
Augmented assignment operators, another long-requested feature, have been added to Python 2.0. Augmented assignment operators include +=, -=, *=, and so forth. For example, the statement a += 2 increments the value of the variable a by 2, equivalent to the slightly lengthier a = a + 2.
The full list of supported assignment operators is +=, -=, *=, /=, %=, **=, &=, |=, ^=, >>=, and <<=. Python classes can override the augmented assignment operators by defining methods named __iadd__(), __isub__(), etc. For example, the following Number class stores a number and supports using += to create a new instance with an incremented value.
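A minimal sketch of such a Number class, runnable on a current Python 3 interpreter. The attribute name value is an assumption:

```python
class Number:
    def __init__(self, value):
        self.value = value

    def __iadd__(self, increment):
        # Return a new instance instead of modifying self in place.
        return Number(self.value + increment)

n = Number(5)
original = n
n += 3          # calls n.__iadd__(3) and rebinds n to the result
```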
The __iadd__() special method is called with the value of the increment, and should return a new instance with an appropriately modified value; this return value is bound as the new value of the variable on the left-hand side.
Augmented assignment operators were first introduced in the C programming language, and most C-derived languages, such as awk, C++, Java, Perl, and PHP also support them. The augmented assignment patch was implemented by Thomas Wouters.
Until now string-manipulation functionality was in the string module, which was usually a front-end for the strop module written in C. The addition of Unicode posed a difficulty for the strop module, because the functions would all need to be rewritten in order to accept either 8-bit or Unicode strings. For functions such as string.replace(), which takes 3 string arguments, that means eight possible permutations, and correspondingly complicated code.
Instead, Python 2.0 pushes the problem onto the string type, making string manipulation functionality available through methods on both 8-bit strings and Unicode strings.
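A few of these methods in use, as a minimal sketch on a current Python 3 interpreter:

```python
capitalized = "andrew".capitalize()          # first letter uppercased
replaced = "hostname".replace("os", "linux") # every "os" becomes "linux"
position = "moshe".find("sh")                # index of the first match

print(capitalized, replaced, position)
```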
One thing that hasn’t changed, a noteworthy April Fools’ joke notwithstanding, is that Python strings are immutable. Thus, the string methods return new strings, and do not modify the string on which they operate.
The old string module is still around for backwards compatibility, but it mostly acts as a front-end to the new string methods.
Two methods which have no parallel in pre-2.0 versions, although they did exist in JPython for quite some time, are startswith() and endswith(). s.startswith(t) is equivalent to s[:len(t)] == t, while s.endswith(t) is equivalent to s[-len(t):] == t.
One other method which is worth special mention is join(). The join() method of a string receives one parameter, a sequence of strings, and is equivalent to the string.join() function from the old string module, with the arguments reversed. In other words, s.join(seq) is equivalent to the old string.join(seq, s).
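A minimal sketch of these three methods, using illustrative sample strings:

```python
s = "python-dev"

# startswith()/endswith() compared against the slicing idioms they replace.
prefix, suffix = "python", "dev"
starts = s.startswith(prefix)
starts_by_slice = s[:len(prefix)] == prefix
ends = s.endswith(suffix)
ends_by_slice = s[-len(suffix):] == suffix

# join() is called on the separator and receives the sequence.
joined = ", ".join(["a", "b", "c"])
```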
Garbage Collection of Cycles
The C implementation of Python uses reference counting to implement garbage collection. Every Python object maintains a count of the number of references pointing to itself, and adjusts the count as references are created or destroyed. Once the reference count reaches zero, the object is no longer accessible, since you need to have a reference to an object to access it, and if the count is zero, no references exist any longer.
Reference counting has some pleasant properties: it’s easy to understand and implement, and the resulting implementation is portable, fairly fast, and interacts well with other libraries that implement their own memory handling schemes. The major problem with reference counting is that it sometimes doesn’t realise that objects are no longer accessible, resulting in a memory leak. This happens when there are cycles of references.
Consider the simplest possible cycle, a class instance which has a reference to itself:
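A minimal sketch of such a cycle. The class name SomeClass is hypothetical and is defined only so that the example runs; the last two lines are the ones that matter:

```python
class SomeClass:
    pass

instance = SomeClass()
instance.myself = instance   # the instance now refers to itself
```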
After the above two lines of code have been executed, the reference count of instance is 2; one reference is from the variable named 'instance', and the other is from the myself attribute of the instance.
If the next line of code is del instance, what happens? The reference count of instance is decreased by 1, so it has a reference count of 1; the reference in the myself attribute still exists. Yet the instance is no longer accessible through Python code, and it could be deleted. Several objects can participate in a cycle if they have references to each other, causing all of the objects to be leaked.
Python 2.0 fixes this problem by periodically executing a cycle detection algorithm which looks for inaccessible cycles and deletes the objects involved. A new gc module provides functions to perform a garbage collection, obtain debugging statistics, and tune the collector’s parameters.
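A minimal sketch of the collector at work, assuming a current CPython 3 interpreter. The weakref module used to observe the object is a later addition to the standard library and not part of 2.0, and the class name Node is hypothetical:

```python
import gc
import weakref

class Node:
    pass

gc.disable()                    # keep automatic collection out of the way
node = Node()
node.myself = node              # self-referencing cycle
probe = weakref.ref(node)       # lets us observe the object without keeping it alive

del node                        # refcount drops to 1; the cycle keeps it alive
alive_before = probe() is not None

gc.collect()                    # run the cycle detector explicitly
alive_after = probe() is not None
gc.enable()
```

Reference counting alone leaves the object in memory after the del; only the explicit gc.collect() call reclaims it.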
Running the cycle detection algorithm takes some time, and therefore will result in some additional overhead. It is hoped that after we’ve gotten experience with the cycle collection from using 2.0, Python 2.1 will be able to minimize the overhead with careful tuning. It’s not yet obvious how much performance is lost, because benchmarking this is tricky and depends crucially on how often the program creates and destroys objects. The detection of cycles can be disabled when Python is compiled, if you can’t afford even a tiny speed penalty or suspect that the cycle collection is buggy, by specifying the --without-cycle-gc switch when running the configure script.
Several people tackled this problem and contributed to a solution. An early implementation of the cycle detection approach was written by Toby Kelsey. The current algorithm was suggested by Eric Tiedemann during a visit to CNRI, and Guido van Rossum and Neil Schemenauer wrote two different implementations, which were later integrated by Neil. Lots of other people offered suggestions along the way; the March 2000 archives of the python-dev mailing list contain most of the relevant discussion, especially in the threads titled “Reference cycle collection for Python” and “Finalization again”.
Other Core Changes
Various minor changes have been made to Python’s syntax and built-in functions. None of the changes are very far-reaching, but they’re handy conveniences.
Minor Language Changes
A new syntax makes it more convenient to call a given function with a tuple of arguments and/or a dictionary of keyword arguments. In Python 1.5 and earlier, you’d use the apply() built-in function: apply(f, args, kw) calls the function f() with the argument tuple args and the keyword arguments in the dictionary kw. apply() is the same in 2.0, but thanks to a patch from Greg Ewing, f(*args, **kw) is a shorter and clearer way to achieve the same effect. This syntax is symmetrical with the syntax for defining functions:
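A minimal sketch of the symmetry between definition and call. The argument values are illustrative:

```python
def f(*args, **kw):
    # args is a tuple of positional arguments,
    # kw is a dictionary of keyword arguments.
    return args, kw

args = (1, 2)
kw = {"sep": "-"}

# Unpack the tuple and the dictionary at the call site.
result = f(*args, **kw)
```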
The print statement can now have its output directed to a file-like object by following the print with >> file, similar to the redirection operator in Unix shells. Previously you’d either have to use the write() method of the file-like object, which lacks the convenience and simplicity of print, or you could assign a new value to sys.stdout and then restore the old value. For sending output to standard error, it’s much easier to write this:
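The 2.0 statement form does not run on current interpreters, so the sketch below uses the Python 3 counterpart, the file argument of print(), with the 2.0 spelling shown in a comment. The helper name warn and the message text are illustrative assumptions:

```python
import io
import sys

def warn(message, stream=None):
    # Python 2.0 spelling:   print >> sys.stderr, message
    # Python 3 counterpart:  print(message, file=sys.stderr)
    if stream is None:
        stream = sys.stderr
    print(message, file=stream)

# Any file-like object with a write() method can receive the output.
captured = io.StringIO()
warn("Warning: action field not supplied", stream=captured)
```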
Modules can now be renamed on importing them, using the syntax import module as name or from module import name as othername. The patch was submitted by Thomas Wouters.
A new format style is available when using the % operator; '%r' will insert the repr() of its argument. This was also added from symmetry considerations, this time for symmetry with the existing '%s' format style, which inserts the str() of its argument. For example, '%r %s' % ('abc', 'abc') returns a string containing 'abc' abc.
Previously there was no way to implement a class that overrode Python’s built-in in operator and implemented a custom version. obj in seq returns true if obj is present in the sequence seq; Python computes this by simply trying every index of the sequence until either obj is found or an IndexError is encountered. Moshe Zadka contributed a patch which adds a __contains__() magic method for providing a custom implementation for in. Additionally, new built-in objects written in C can define what in means for them via a new slot in the sequence protocol.
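A minimal sketch of a class that answers in without storing any elements. The class name EvenNumbers is hypothetical:

```python
class EvenNumbers:
    """Conceptually contains every even integer."""

    def __contains__(self, obj):
        # Called for `obj in instance`; no indexing is attempted.
        return isinstance(obj, int) and obj % 2 == 0

evens = EvenNumbers()
print(4 in evens, 3 in evens)
```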
Earlier versions of Python used a recursive algorithm for deleting objects. Deeply nested data structures could cause the interpreter to fill up the C stack and crash; Christian Tismer rewrote the deletion logic to fix this problem. On a related note, comparing recursive objects recursed infinitely and crashed; Jeremy Hylton rewrote the code to no longer crash, producing a useful result instead. For example, after this code:
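A minimal sketch of the kind of setup meant here, assuming two lists that each contain themselves. The comparison is deliberately not executed, because its outcome depends on the interpreter version:

```python
a = []
b = []
a.append(a)   # a now contains itself
b.append(b)   # b now contains itself
```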
The comparison a==b returns true, because the two recursive data structures are isomorphic. See the thread “trashcan and PR#7” in the April 2000 archives of the python-dev mailing list for the discussion leading up to this implementation, and some useful relevant links. Note that comparisons can now also raise exceptions. In earlier versions of Python, a comparison operation such as cmp(a,b) would always produce an answer, even if a user-defined __cmp__() method encountered an error, since the resulting exception would simply be silently swallowed.
Work has been done on porting Python to 64-bit Windows on the Itanium processor, mostly by Trent Mick of ActiveState. (Confusingly, sys.platform is still 'win32' on Win64 because it seems that for ease of porting, MS Visual C++ treats code as 32 bit on Itanium.) PythonWin also supports Windows CE; see the Python CE page at http://pythonce.sourceforge.net/ for more information.
Another new platform is Darwin/MacOS X; initial support for it is in Python 2.0. Dynamic loading works, if you specify “configure --with-dyld --with-suffix=.x”. Consult the README in the Python source distribution for more instructions.
An attempt has been made to alleviate one of Python’s warts, the often-confusing NameError exception when code refers to a local variable before the variable has been assigned a value. For example, the following code raises an exception on the print statement in both 1.5.2 and 2.0; in 1.5.2 a NameError exception is raised, while 2.0 raises a new UnboundLocalError exception. UnboundLocalError is a subclass of NameError, so any existing code that expects NameError to be raised should still work.
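A minimal sketch of such code, written with the Python 3 print() function rather than the 2.0 print statement. The function name f and the variable name caught are illustrative:

```python
def f():
    # i is local to f because it is assigned below,
    # but it has no value yet when print() runs.
    print("i =", i)
    i = 1

caught = None
try:
    f()
except NameError as exc:      # also catches the UnboundLocalError subclass
    caught = exc
```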
Two new exceptions, TabError and IndentationError, have been introduced. They’re both subclasses of SyntaxError, and are raised when Python code is found to be improperly indented.
Changes to Built-in Functions
A new built-in, zip(seq1, seq2, ...), has been added. zip() returns a list of tuples where each tuple contains the i-th element from each of the argument sequences. The difference between zip() and map(None, seq1, seq2) is that map() pads the sequences with None if the sequences aren’t all of the same length, while zip() truncates the returned list to the length of the shortest argument sequence.
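A minimal sketch of the truncating behaviour on a current Python 3 interpreter, where zip() returns an iterator and is wrapped in list(). map(None, ...) is no longer accepted there, so itertools.zip_longest is used as a stand-in for the padding behaviour; that substitution is an assumption of this example:

```python
from itertools import zip_longest

seq1 = [1, 2, 3]
seq2 = "ab"

# zip() stops at the shortest sequence.
truncated = list(zip(seq1, seq2))

# zip_longest() pads the shorter sequence with None instead.
padded = list(zip_longest(seq1, seq2))
```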
The int() and long() functions now accept an optional “base” parameter when the first argument is a string. int('123', 10) returns 123, while int('123', 16) returns 291. int(123, 16) raises a TypeError exception with the message “can’t convert non-string with explicit base”.
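A minimal sketch of the three calls, assuming a current Python 3 interpreter (the exact wording of the error message may differ between versions, so it is not checked):

```python
decimal_value = int("123", 10)
hex_value = int("123", 16)

# A non-string first argument cannot be combined with an explicit base.
try:
    int(123, 16)
    rejected = False
except TypeError:
    rejected = True
```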
A new variable holding more detailed version information has been added to the sys module. sys.version_info is a tuple (major, minor, micro, level, serial). For example, in a hypothetical 2.0.1beta1, sys.version_info would be (2, 0, 1, 'beta', 1). level is a string such as "alpha", "beta", or "final" for a final release.
Dictionaries have an odd new method, setdefault(key, default), which behaves similarly to the existing get() method. However, if the key is missing, setdefault() both returns the value of default as get() would do, and also inserts it into the dictionary as the value for key. Thus, the common idiom of testing whether key is present, returning dict[key] if so, and otherwise storing an empty list under key and returning it, can be reduced to a single return dict.setdefault(key, []) statement.
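A minimal sketch comparing the longer idiom with setdefault(). The function names and sample keys are hypothetical:

```python
def get_list_verbose(d, key):
    # The longer idiom: test, then insert an empty list if needed.
    if key in d:
        return d[key]
    else:
        d[key] = []
        return d[key]

def get_list(d, key):
    # The same behaviour in one statement.
    return d.setdefault(key, [])

index = {}
get_list(index, "spam").append(1)
get_list(index, "spam").append(2)
get_list_verbose(index, "eggs").append(3)
```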
The interpreter sets a maximum recursion depth in order to catch runaway recursion before filling the C stack and causing a core dump or GPF. Previously this limit was fixed when you compiled Python, but in 2.0 the maximum recursion depth can be read and modified using sys.getrecursionlimit() and sys.setrecursionlimit(). The default value is 1000, and a rough maximum value for a given platform can be found by running a new script, Misc/find_recursionlimit.py.
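A minimal sketch of reading and adjusting the limit. The increment of 500 is an arbitrary illustrative value, and the original limit is restored afterwards:

```python
import sys

old_limit = sys.getrecursionlimit()

# Raise the limit temporarily, then put it back.
sys.setrecursionlimit(old_limit + 500)
raised_limit = sys.getrecursionlimit()
sys.setrecursionlimit(old_limit)
```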
Porting to 2.0
New Python releases try hard to be compatible with previous releases, and the record has been pretty good. However, some changes are considered useful enough, usually because they fix initial design decisions that turned out to be actively mistaken, that breaking backward compatibility can’t always be avoided. This section lists the changes in Python 2.0 that may cause old Python code to break.
The change which will probably break the most code is tightening up the arguments accepted by some methods. Some methods would take multiple arguments and treat them as a tuple, particularly various list methods such as append() and insert(). In earlier versions of Python, if L is a list, L.append( 1,2 ) appends the tuple (1,2) to the list. In Python 2.0 this causes a TypeError exception to be raised, with the message: 'append requires exactly 1 argument; 2 given'. The fix is to simply add an extra set of parentheses to pass both values as a tuple: L.append( (1,2) ).
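A minimal sketch of the failure and the fix, assuming a current Python 3 interpreter. The error message there is worded differently from the 2.0 one, so only the exception type is relied on:

```python
L = []
message = None

try:
    L.append(1, 2)            # two arguments: rejected
except TypeError as exc:
    message = str(exc)

L.append((1, 2))              # one argument, which happens to be a tuple
```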
The earlier versions of these methods were more forgiving because they used an old function in Python’s C interface to parse their arguments; 2.0 modernizes them to use PyArg_ParseTuple(), the current argument parsing function, which provides more helpful error messages and treats multi-argument calls as errors. If you absolutely must use 2.0 but can’t fix your code, you can edit Objects/listobject.c and define the preprocessor symbol NO_STRICT_LIST_APPEND to preserve the old behaviour; this isn’t recommended.
Some of the functions in the socket module are still forgiving in this way. For example, socket.connect( ('hostname', 25) ) is the correct form, passing a tuple representing an IP address, but socket.connect( 'hostname', 25 ) also works. socket.connect_ex() and socket.bind() are similarly easy-going. 2.0alpha1 tightened these functions up, but because the documentation actually used the erroneous multiple argument form, many people wrote code which would break with the stricter checking. GvR backed out the changes in the face of public reaction, so for the socket module, the documentation was fixed and the multiple argument form is simply marked as deprecated; it will be tightened up again in a future Python version.
The \x escape in string literals now takes exactly 2 hex digits. Previously it would consume all the hex digits following the 'x' and take the lowest 8 bits of the result, so \x123456 was equivalent to \x56.
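A minimal sketch of the two-digit rule, as it still applies on a current Python 3 interpreter:

```python
# Only "12" belongs to the escape; "3456" are ordinary characters.
s = "\x123456"

first_char = s[0]
remainder = s[1:]
```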
The AttributeError and NameError exceptions have a more friendly error message, whose text will be something like 'Spam' instance has no attribute 'eggs' or name 'eggs' is not defined. Previously the error message was just the missing attribute name eggs, and code written to take advantage of this fact will break in 2.0.
Some work has been done to make integers and long integers a bit more interchangeable. In 1.5.2, large-file support was added for Solaris, to allow reading files larger than 2 GiB; this made the tell() method of file objects return a long integer instead of a regular integer. Some code would subtract two file offsets and attempt to use the result to multiply a sequence or slice a string, but this raised a TypeError. In 2.0, long integers can be used to multiply or slice a sequence, and it’ll behave as you’d intuitively expect it to; 3L * 'abc' produces 'abcabcabc', and (0,1,2,3)[2L:4L] produces (2,3). Long integers can also be used in various contexts where previously only integers were accepted, such as in the seek() method of file objects, and in the formats supported by the % operator (%d, %i, %x, etc.). For example, "%d" % 2L**64 will produce the string 18446744073709551616.
The subtlest long integer change of all is that the str() of a long integer no longer has a trailing 'L' character, though repr() still includes it. The 'L' annoyed many people who wanted to print long integers that looked just like regular integers, since they had to go out of their way to chop off the character. This is no longer a problem in 2.0, but code which does str(longval)[:-1] and assumes the 'L' is there, will now lose the final digit.
Taking the repr() of a float now uses a different formatting precision than str(). repr() uses the %.17g format string for C’s sprintf(), while str() uses %.12g as before. The effect is that repr() may occasionally show more decimal places than str(), for certain numbers. For example, the number 8.1 can’t be represented exactly in binary, so repr(8.1) is '8.0999999999999996', while str(8.1) is '8.1'.
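The two precisions can be reproduced with the % operator, which accepts the same format strings. This is a small sketch for a modern Python 3 interpreter; note that current versions of repr() instead pick the shortest string that round-trips, so repr(8.1) there is simply '8.1'.

```python
x = 8.1

# 17 significant digits expose the binary approximation of 8.1.
print("%.17g" % x)   # 8.0999999999999996

# 12 significant digits round it back to the familiar form.
print("%.12g" % x)   # 8.1
```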
The -X command-line option, which turned all standard exceptions into strings instead of classes, has been removed; the standard exceptions will now always be classes. The exceptions module containing the standard exceptions was translated from Python to a built-in C module, written by Barry Warsaw and Fredrik Lundh.
Some of the changes are under the covers, and will only be apparent to people writing C extension modules or embedding a Python interpreter in a larger application. If you aren’t dealing with Python’s C API, you can safely skip this section.
The version number of the Python C API was incremented, so C extensions compiled for 1.5.2 must be recompiled in order to work with 2.0. On Windows, it’s not possible for Python 2.0 to import a third party extension built for Python 1.5.x due to how Windows DLLs work, so Python will raise an exception and the import will fail.
Users of Jim Fulton’s ExtensionClass module will be pleased to find out that hooks have been added so that ExtensionClasses are now supported by isinstance() and issubclass(). This means you no longer have to remember to write code such as if type(obj) == myExtensionClass, but can use the more natural if isinstance(obj, myExtensionClass).
The Python/importdl.c file, which was a mass of #ifdefs to support dynamic loading on many different platforms, was cleaned up and reorganised by Greg Stein. importdl.c is now quite small, and platform-specific code has been moved into a bunch of Python/dynload_*.c files. Another cleanup: there were also a number of my*.h files in the Include/ directory that held various portability hacks; they’ve been merged into a single file, Include/pyport.h.
Vladimir Marangozov’s long-awaited malloc restructuring was completed, to make it easy to have the Python interpreter use a custom allocator instead of C’s standard malloc(). For documentation, read the comments in Include/pymem.h and Include/objimpl.h. For the lengthy discussions during which the interface was hammered out, see the Web archives of the ‘patches’ and ‘python-dev’ lists at python.org.
Recent versions of the GUSI development environment for MacOS support POSIX threads. Therefore, Python’s POSIX threading support now works on the Macintosh. Threading support using the user-space GNU pth library was also contributed.
Threading support on Windows was enhanced, too. Windows supports thread locks that use kernel objects only in case of contention; in the common case when there’s no contention, they use simpler functions which are an order of magnitude faster. A threaded version of Python 1.5.2 on NT is twice as slow as an unthreaded version; with the 2.0 changes, the difference is only 10%. These improvements were contributed by Yakov Markovitch.
Python 2.0’s source now uses only ANSI C prototypes, so compiling Python now requires an ANSI C compiler, and can no longer be done using a compiler that only supports K&R C.
Previously the Python virtual machine used 16-bit numbers in its bytecode, limiting the size of source files. In particular, this affected the maximum size of literal lists and dictionaries in Python source; occasionally people who are generating Python code would run into this limit. A patch by Charles G. Waldman raises the limit from 2^16 to 2^ .
Three new convenience functions intended for adding constants to a module’s dictionary at module initialization time were added: PyModule_AddObject(), PyModule_AddIntConstant(), and PyModule_AddStringConstant(). Each of these functions takes a module object, a null-terminated C string containing the name to be added, and a third argument for the value to be assigned to the name. This third argument is, respectively, a Python object, a C long, or a C string.
A wrapper API was added for Unix-style signal handlers. PyOS_getsig() gets a signal handler and PyOS_setsig() will set a new handler.
Distutils: Making Modules Easy to Install
Before Python 2.0, installing modules was a tedious affair: there was no way to figure out automatically where Python is installed, or what compiler options to use for extension modules. Software authors had to go through an arduous ritual of editing Makefiles and configuration files, which only really work on Unix and leave Windows and MacOS unsupported. Python users faced wildly differing installation instructions which varied between different extension packages, which made administering a Python installation something of a chore.
The SIG for distribution utilities, shepherded by Greg Ward, has created the Distutils, a system to make package installation much easier. They form the distutils package, a new part of Python’s standard library. In the best case, installing a Python module from source will require the same steps: first you simply unpack the tarball or zip archive, and then run “python setup.py install”. The platform will be automatically detected, the compiler will be recognized, C extension modules will be compiled, and the distribution installed into the proper directory. Optional command-line arguments provide more control over the installation process; the distutils package offers many places to override defaults: separating the build from the install, building or installing in non-default directories, and more.
In order to use the Distutils, you need to write a setup.py script. For the simple case, when the software contains only .py files, a minimal setup.py can be just a few lines long:
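What follows is a hedged sketch of such a script rather than a definitive one. The distribution name "foo" and the module names are placeholders, and two details are assumptions made so the sketch behaves sensibly on modern interpreters: distutils was later removed from the standard library, so setuptools (which provides a compatible setup()) is tried as a fallback, and the script does nothing when run without a command.

```python
import sys

# Metadata for a distribution made up of plain .py modules only.
# "foo", "module1" and "module2" are placeholder names.
SETUP_ARGS = {
    "name": "foo",
    "version": "1.0",
    "py_modules": ["module1", "module2"],
}


def main(argv):
    """Hand the metadata and the command line (e.g. ['install']) to setup()."""
    try:
        from distutils.core import setup   # part of the standard library in 2.0
    except ImportError:
        from setuptools import setup       # assumption: fallback on newer Pythons
    setup(script_args=argv, **SETUP_ARGS)


# Only act when a command such as "install" or "sdist" is given.
if __name__ == "__main__" and len(sys.argv) > 1:
    main(sys.argv[1:])
```

Saved as setup.py next to module1.py and module2.py, the sketch would be driven by “python setup.py install”.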
The setup.py file isn’t much more complicated if the software consists of a few packages:
A C extension can be the most complicated case; here’s an example taken from the PyXML package:
The Distutils can also take care of creating source and binary distributions. The “sdist” command, run by “python setup.py sdist”, builds a source distribution such as foo-1.0.tar.gz. Adding new commands isn’t difficult; “bdist_rpm” and “bdist_wininst” commands have already been contributed to create an RPM distribution and a Windows installer for the software, respectively. Commands to create other distribution formats such as Debian packages and Solaris .pkg files are in various stages of development.
All this is documented in a new manual, Distributing Python Modules, that joins the basic set of Python documentation.
Python 1.5.2 included a simple XML parser in the form of the xmllib module, contributed by Sjoerd Mullender. Since 1.5.2’s release, two different interfaces for processing XML have become common: SAX2 (version 2 of the Simple API for XML) provides an event-driven interface with some similarities to xmllib, and the DOM (Document Object Model) provides a tree-based interface, transforming an XML document into a tree of nodes that can be traversed and modified. Python 2.0 includes a SAX2 interface and a stripped-down DOM interface as part of the xml package. Here we will give a brief overview of these new interfaces; consult the Python documentation or the source code for complete details. The Python XML SIG is also working on improved documentation.
SAX defines an event-driven interface for parsing XML. To use SAX, you must write a SAX handler class. Handler classes inherit from various classes provided by SAX, and override various methods that will then be called by the XML parser. For example, the startElement() and endElement() methods are called for every starting and end tag encountered by the parser, the characters() method is called for every chunk of character data, and so forth.
The advantage of the event-driven approach is that the whole document doesn’t have to be resident in memory at any one time, which matters if you are processing really huge documents. However, writing the SAX handler class can get very complicated if you’re trying to modify the document structure in some elaborate way.
For example, this little example program defines a handler that prints a message for every starting and ending tag, and then parses the file hamlet.xml using it:
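A minimal sketch of such a handler is shown here, written for a modern Python 3 interpreter. To keep it self-contained it parses a tiny inline document instead of hamlet.xml; the sample markup is invented for illustration, and the handler collects its messages in a list before printing them.

```python
from xml import sax


class SimpleHandler(sax.ContentHandler):
    """Record a message for every starting and ending tag."""

    def __init__(self):
        super().__init__()
        self.messages = []

    def startElement(self, name, attrs):
        self.messages.append(
            "Start of element: %s %s" % (name, sorted(attrs.keys())))

    def endElement(self, name):
        self.messages.append("End of element: %s" % name)


# Invented stand-in for the hamlet.xml file mentioned in the text.
SAMPLE = b"<PLAY><TITLE lang='en'>Hamlet</TITLE></PLAY>"

handler = SimpleHandler()
sax.parseString(SAMPLE, handler)

for message in handler.messages:
    print(message)
```

To parse a file on disk instead, a parser object from sax.make_parser() can be given the handler with setContentHandler() and then asked to parse() the file name.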
For more information, consult the Python documentation, or the XML HOWTO at http://pyxml.sourceforge.net/topics/howto/xml-howto.html.
The Document Object Model is a tree-based representation for an XML document. A top-level Document instance is the root of the tree, and has a single child which is the top-level Element instance. This Element has children nodes representing character data and any sub-elements, which may have further children of their own, and so forth. Using the DOM you can traverse the resulting tree any way you like, access element and attribute values, insert and delete nodes, and convert the tree back into XML.
The DOM is useful for modifying XML documents, because you can create a DOM tree, modify it by adding new nodes or rearranging subtrees, and then produce a new XML document as output. You can also construct a DOM tree manually and convert it to XML, which can be a more flexible way of producing XML output than simply writing <tag1>...</tag1> to a file.
The DOM implementation included with Python lives in the xml.dom.minidom module. It’s a lightweight implementation of the Level 1 DOM with support for XML namespaces. The parse() and parseString() convenience functions are provided for generating a DOM tree:
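As a small sketch of the second function, the following builds a tree from an inline string on a modern Python 3 interpreter. The markup is an invented stand-in for the Hamlet file; minidom.parse() would be given a file name or file object instead.

```python
from xml.dom import minidom

# Invented sample document standing in for hamlet.xml.
SAMPLE = ("<PLAY>"
          "<PERSONA>CLAUDIUS, king of Denmark.</PERSONA>"
          "<PERSONA>HAMLET, son to the late king.</PERSONA>"
          "</PLAY>")

doc = minidom.parseString(SAMPLE)

print(doc.documentElement.tagName)    # PLAY
print(doc.documentElement.toxml())    # the markup of the root element
```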
doc is a Document instance. Document, like all the other DOM classes such as Element and Text, is a subclass of the Node base class. All the nodes in a DOM tree therefore support certain common methods, such as toxml() which returns a string containing the XML representation of the node and its children. Each class also has special methods of its own; for example, Element and Document instances have a method to find all child elements with a given tag name. Continuing from the previous 2-line example:
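The tag-name lookup mentioned above is getElementsByTagName(). A brief sketch for a modern Python 3 interpreter, again using an invented inline document so that it runs on its own:

```python
from xml.dom import minidom

# Invented sample document; any XML with PERSONA elements would do.
doc = minidom.parseString(
    "<PLAY>"
    "<PERSONA>CLAUDIUS, king of Denmark.</PERSONA>"
    "<PERSONA>HAMLET, son to the late king.</PERSONA>"
    "</PLAY>")

# Collect every PERSONA element in the document, in document order.
perslist = doc.getElementsByTagName("PERSONA")

print(perslist[0].toxml())
print(perslist[1].toxml())
```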
For the Hamlet XML file, the above few lines output:
The root element of the document is available as doc.documentElement, and its children can be easily modified by deleting, adding, or removing nodes:
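Those three kinds of modification can be sketched on a deliberately tiny document. The element names a, b, c and d are invented, and the code targets a modern Python 3 interpreter.

```python
from xml.dom import minidom

doc = minidom.parseString("<root><a/><b/><c/><d/></root>")
root = doc.documentElement

# Remove the first child: the children are now b, c, d.
root.removeChild(root.childNodes[0])

# Appending a node that is already in the tree moves it: c, d, b.
root.appendChild(root.childNodes[0])

# Move the new first child so it sits just before the last child: d, c, b.
root.insertBefore(root.childNodes[0], root.childNodes[2])

print(root.toxml())   # <root><d/><c/><b/></root>
```

Because a node can only have one parent, appendChild() and insertBefore() detach the node from its old position first, which is what makes them usable for rearranging subtrees.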
Again, I will refer you to the Python documentation for a complete listing of the different Node classes and their various methods.
Relationship to PyXML
The XML Special Interest Group has been working on XML-related Python code for a while. Its code distribution, called PyXML, is available from the SIG’s Web pages at http://www.python.org/sigs/xml-sig/. The PyXML distribution also used the package name xml. If you’ve written programs that used PyXML, you’re probably wondering about its compatibility with the 2.0 xml package.
The answer is that Python 2.0’s xml package isn’t compatible with PyXML, but can be made compatible by installing a recent version of PyXML. Many applications can get by with the XML support that is included with Python 2.0, but more complicated applications will require that the full PyXML package be installed. When installed, PyXML versions 0.6.0 or greater will replace the xml package shipped with Python, and will be a strict superset of the standard package, adding a bunch of additional features. Some of the additional features in PyXML include:
- 4DOM, a full DOM implementation from FourThought, Inc.
- The xmlproc validating parser, written by Lars Marius Garshol.
- The sgmlop parser accelerator module, written by Fredrik Lundh.
Brian Gallew contributed OpenSSL support for the socket module. OpenSSL is an implementation of the Secure Socket Layer, which encrypts the data being sent over a socket. When compiling Python, you can edit Modules/Setup to include SSL support, which adds an additional function to the socket module: socket.ssl(socket, keyfile, certfile), which takes a socket object and returns an SSL socket. The httplib and urllib modules were also changed to support https:// URLs, though no one has implemented FTP or SMTP over SSL.
The httplib module has been rewritten by Greg Stein to support HTTP/1.1. Backward compatibility with the 1.5 version of httplib is provided, though using HTTP/1.1 features such as pipelining will require rewriting code to use a different set of interfaces.
The Tkinter module now supports Tcl/Tk version 8.1, 8.2, or 8.3, and support for the older 7.x versions has been dropped. The Tkinter module now supports displaying Unicode strings in Tk widgets. Also, Fredrik Lundh contributed an optimization which makes operations like create_line and create_polygon much faster, especially when using lots of coordinates.
The curses module has been greatly extended, starting from Oliver Andrich’s enhanced version, to provide many additional functions from ncurses and SYSV curses, such as colour, alternative character set support, pads, and mouse support. This means the module is no longer compatible with operating systems that only have BSD curses, but there don’t seem to be any currently maintained OSes that fall into this category.
As mentioned in the earlier discussion of 2.0’s Unicode support, the underlying implementation of the regular expressions provided by the re module has been changed. SRE, a new regular expression engine written by Fredrik Lundh and partially funded by Hewlett Packard, supports matching against both 8-bit strings and Unicode strings.
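The dual support is still visible in the re module today. The sketch below assumes a modern Python 3 interpreter, where str is the Unicode string type and bytes plays the role of the 8-bit string; the sample addresses are invented.

```python
import re

# The same expression compiled once for Unicode text and once for 8-bit data.
text_pattern = re.compile(r"(\w+)@(\w+)")
bytes_pattern = re.compile(rb"(\w+)@(\w+)")

print(text_pattern.match("müller@example").groups())   # ('müller', 'example')
print(bytes_pattern.match(b"guido@python").groups())   # (b'guido', b'python')
```

A pattern and the string it is applied to have to be of the same kind; mixing a text pattern with 8-bit data is an error on current interpreters.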
A number of new modules were added. We’ll simply list them with brief descriptions; consult the 2.0 documentation for the details of a particular module.
- atexit: For registering functions to be called before the Python interpreter exits. Code that currently sets sys.exitfunc directly should be changed to use the atexit module instead, importing atexit and calling atexit.register() with the function to be called on exit. (Contributed by Skip Montanaro.)
- codecs, encodings, unicodedata: Added as part of the new Unicode support.
- filecmp: Supersedes the old cmp, cmpcache and dircmp modules, which have now become deprecated. (Contributed by Gordon MacMillan and Moshe Zadka.)
- gettext: This module provides internationalization (I18N) and localization (L10N) support for Python programs by providing an interface to the GNU gettext message catalog library. (Integrated by Barry Warsaw, from separate contributions by Martin von Löwis, Peter Funk, and James Henstridge.)
- linuxaudiodev: Support for the /dev/audio device on Linux, a twin to the existing sunaudiodev module. (Contributed by Peter Bosch, with fixes by Jeremy Hylton.)
- mmap: An interface to memory-mapped files on both Windows and Unix. A file’s contents can be mapped directly into memory, at which point it behaves like a mutable string, so its contents can be read and modified. They can even be passed to functions that expect ordinary strings, such as the re module. (Contributed by Sam Rushing, with some extensions by A.M. Kuchling.)
- pyexpat: An interface to the Expat XML parser. (Contributed by Paul Prescod.)
- robotparser: Parse a robots.txt file, which is used for writing Web spiders that politely avoid certain areas of a Web site. The parser accepts the contents of a robots.txt file, builds a set of rules from it, and can then answer questions about the fetchability of a given URL. (Contributed by Skip Montanaro.)
- tabnanny: A module/script to check Python source code for ambiguous indentation. (Contributed by Tim Peters.)
- UserString: A base class useful for deriving objects that behave like strings.
- webbrowser: A module that provides a platform-independent way to launch a web browser on a specific URL. For each platform, various browsers are tried in a specific order. The user can alter which browser is launched by setting the BROWSER environment variable. (Originally inspired by Eric S. Raymond’s patch to urllib which added similar functionality, but the final module comes from code originally implemented by Fred Drake as Tools/idle/BrowserControl.py, and adapted for the standard library by Fred.)
- _winreg: An interface to the Windows registry. _winreg is an adaptation of functions that have been part of PythonWin since 1995, but has now been added to the core distribution, and enhanced to support Unicode. _winreg was written by Bill Tutt and Mark Hammond.
- zipfile: A module for reading and writing ZIP-format archives. These are archives produced by PKZIP on DOS/Windows or zip on Unix, not to be confused with gzip-format files (which are supported by the gzip module). (Contributed by James C. Ahlstrom.)
- imputil: A module that provides a simpler way for writing customised import hooks, in comparison to the existing ihooks module. (Implemented by Greg Stein, with much discussion on python-dev along the way.)
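Three of the modules listed above, atexit, zipfile and filecmp, fit together in one short sketch. It is written for a modern Python 3 interpreter (the with statement and several of the helper calls did not exist in 2.0), and the file names are invented for illustration.

```python
import atexit
import filecmp
import os
import shutil
import tempfile
import zipfile

# atexit: arrange for the scratch directory to be removed at interpreter exit.
workdir = tempfile.mkdtemp()
atexit.register(shutil.rmtree, workdir, ignore_errors=True)

original = os.path.join(workdir, "notes.txt")
with open(original, "w") as f:
    f.write("spam and eggs\n")

# zipfile: store the file in a ZIP archive, then extract it elsewhere.
archive = os.path.join(workdir, "notes.zip")
with zipfile.ZipFile(archive, "w") as zf:
    zf.write(original, arcname="notes.txt")

outdir = os.path.join(workdir, "out")
with zipfile.ZipFile(archive) as zf:
    names = zf.namelist()
    zf.extractall(outdir)

# filecmp: compare contents byte for byte to confirm the round trip.
same = filecmp.cmp(original, os.path.join(outdir, "notes.txt"), shallow=False)

print(names, same)   # ['notes.txt'] True
```

Registering the cleanup with atexit.register() rather than assigning to sys.exitfunc lets several independent pieces of code each register their own exit handler.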
IDLE is the official Python cross-platform IDE, written using Tkinter. Python 2.0 includes IDLE 0.6, which adds a number of new features and improvements. A partial list:
- UI improvements and optimizations, especially in the area of syntax highlighting and auto-indentation.
- The class browser now shows more information, such as the top level functions in a module.
- Tab width is now a user settable option. When opening an existing Python file, IDLE automatically detects the indentation conventions, and adapts.
- There is now support for calling browsers on various platforms, used to open the Python documentation in a browser.
- IDLE now has a command line, which is largely similar to the vanilla Python interpreter.
- Call tips were added in many places.
- IDLE can now be installed as a package.
- In the editor window, there is now a line/column bar at the bottom.
- Three new keystroke commands: Check module (Alt-F5), Import module (F5) and Run script (Ctrl-F5).
Deleted and Deprecated Modules
A few modules have been dropped because they’re obsolete, or because there are now better ways to do the same thing. The stdwin module is gone; it was for a platform-independent windowing toolkit that’s no longer developed.
A number of modules have been moved to the lib-old subdirectory: cmp, cmpcache, dircmp, dump, find, grep, packmail, poly, util, whatsound, zmod. If you have code which relies on a module that’s been moved to lib-old, you can simply add that directory to sys.path to get them back, but you’re encouraged to update any code that uses these modules.
The authors would like to thank the following people for offering suggestions on various drafts of this article: David Bolen, Mark Hammond, Gregg Hauser, Jeremy Hylton, Fredrik Lundh, Detlef Lannert, Aahz Maruch, Skip Montanaro, Vladimir Marangozov, Tobias Polzin, Guido van Rossum, Neil Schemenauer, and Russ Schmidt.