Lessons on Python Modules and Packages Walter Didimo [ 60 minutes ]
Programs so far
So far we have written and tested simple programs in one of the following two ways:
- using the Python interactive mode (IDLE shell)
- putting the entire code in a single file and running it (either within the IDLE editor or from an OS shell)
Structuring programs
When a program becomes more complex, you need to structure it in a better way:
- you may want to decompose the program into smaller parts, each written in a separate file
- you may want to reuse some of these parts to build other programs
Every modern programming language makes it possible to structure a program into smaller parts, which can be exchanged among programmers and reused in several programs
Modules
Python defines the concept of module:
- a module is a file that groups a set of definitions (variables, functions, classes, ...)
- modules can be imported by any other module or in the interactive mode
- by default, each module has its own namespace for the elements it defines
This mechanism allows different modules to coexist without collisions, even if they define an element (e.g., a variable) with the same name
A first example
Let us start with a very simple example: create the following file, named util.py

    # util.py
    PI = 3.141593
    minimum = min
    maximum = max

    def avg(*args):
        # Average of the arguments, each converted to float
        total = 0
        for s in args:
            total += float(s)
        return total / len(args)
A first example
In the interactive mode you can write

    >>> import util

This statement imports all elements defined in the util.py module; after that, any of these elements can be used by prefixing its name with the module namespace (i.e., the string util.)

    >>> util.minimum(10, util.PI, 20)
    3.141593
    >>> util.avg(10, util.PI, 20)
    11.047197666666667
The import statement
Besides the interactive mode, the import statement can be used within any module to use another module
As already observed, if A and B are two modules that define an element with the same name, say elem, both modules can be imported at the same time by another module without problems:
- A.elem will refer to the definition of elem given in A
- B.elem will refer to the definition of elem given in B
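The coexistence of two same-named elements can be tried out with a small sketch. The module names mod_a and mod_b and the temporary directory are assumptions made only for this illustration; they stand in for the modules A and B above.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Hypothetical modules, written to a temporary directory for this sketch.
workdir = Path(tempfile.mkdtemp())
(workdir / "mod_a.py").write_text("elem = 'defined in mod_a'\n")
(workdir / "mod_b.py").write_text("elem = 'defined in mod_b'\n")

# Make the temporary directory importable.
sys.path.insert(0, str(workdir))
importlib.invalidate_caches()

import mod_a
import mod_b

# Each module keeps its own namespace, so the two definitions do not collide.
print(mod_a.elem)  # defined in mod_a
print(mod_b.elem)  # defined in mod_b
```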
The Module Search Path
When a module is imported, Python needs to know where to look for it:
- first, it checks whether the name is a built-in module, i.e., a module compiled into the interpreter (the names are listed in sys.builtin_module_names)
- if not, it searches the list of directories stored in the variable sys.path (sys is a standard module); standard library modules written in Python are found this way as well
The sys.path variable
The sys.path variable is initialized with the following directories:
- the directory of the input program (or the current directory if a file is not specified)
- the list of directories in the OS environment variable PYTHONPATH (which can be set with the same syntax as the variable PATH)
- the installation-dependent default
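A quick way to see how sys.path was initialized on a given machine is to print it. The sketch below only reads values; the variable name extra_dirs is an assumption of this example, and the output depends on the installation.

```python
import os
import sys

# Usually the first entry is the script directory ('' in interactive mode).
for index, entry in enumerate(sys.path):
    print(index, repr(entry))

# Directories listed in PYTHONPATH, if the variable is set at all.
pythonpath = os.environ.get("PYTHONPATH", "")
extra_dirs = [d for d in pythonpath.split(os.pathsep) if d]
print("PYTHONPATH entries:", extra_dirs)
```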
The sys.path variable
A program can modify the variable sys.path; for instance, the following code adds the path "C:\my_modules" to it

    >>> import sys
    >>> sys.path.append(r'C:\my_modules')

The r prefix makes a raw string, so the backslash is not read as an escape character
Note that:
- sys is not automatically imported: you need to import it if you want to use its definitions
- only the built-in names (the builtins module, e.g., print and len) are available without an import; other standard modules, such as string, must be imported explicitly
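The effect of extending sys.path can be checked end to end with a sketch like the following. The directory and the module greetings are hypothetical; a temporary directory is used so that the example runs on any OS.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Hypothetical directory standing in for C:\my_modules.
modules_dir = Path(tempfile.mkdtemp()) / "my_modules"
modules_dir.mkdir()
(modules_dir / "greetings.py").write_text(
    "def hello(name):\n"
    "    return 'Hello, ' + name\n"
)

# Append the directory only once, then let the import system notice it.
if str(modules_dir) not in sys.path:
    sys.path.append(str(modules_dir))
importlib.invalidate_caches()

import greetings

print(greetings.hello("world"))  # Hello, world
```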
The from ... import statement
The from ... import statement is a variant of the import statement:
- it can be used to import only some parts of a module
- it does not make the name of the module itself available
- the imported definitions must be used directly with their names, without being preceded by the name of the module

    >>> from util import PI
    >>> print(PI)
    3.141593

Curiosity: What happens if you import from two distinct modules an element with the same name?
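One way to explore the question about same-named elements is the sketch below. The modules shapes_v1 and shapes_v2 are hypothetical and are created in a temporary directory; the behaviour shown is that the later import rebinds the name, and that an alias (as) keeps both values reachable.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Two hypothetical modules that both define PI.
workdir = Path(tempfile.mkdtemp())
(workdir / "shapes_v1.py").write_text("PI = 3.14\n")
(workdir / "shapes_v2.py").write_text("PI = 3.141593\n")
sys.path.insert(0, str(workdir))
importlib.invalidate_caches()

from shapes_v1 import PI
print(PI)  # 3.14

# The second import rebinds the same name in the current namespace.
from shapes_v2 import PI
print(PI)  # 3.141593

# An alias keeps both values reachable.
from shapes_v1 import PI as PI_ROUGH
print(PI_ROUGH, PI)  # 3.14 3.141593
```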
The from ... import statement
The from ... import statement can also be used with the * wildcard

    >>> from util import *

It will import every element of the module, except those whose names start with one or more "_":
- elements whose names start with one or more "_" are considered private in Python and are not imported by default
- of course, you can import them explicitly
Inspecting a module: dir()
The function dir() can be used to get the list of definitions in a module, lexicographically ordered

    >>> import util
    >>> dir(util)
    ['PI', '__builtins__', '__cached__', '__doc__', '__file__', '__loader__', '__name__', '__package__', '__spec__', 'avg', 'maximum', 'minimum']
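Since dir() also lists the names that Python adds to every module, it is often convenient to filter them out. The sketch below does this for the standard module math, chosen here only because it is always available.

```python
import math

all_names = dir(math)

# Keep only the names that do not start with an underscore.
public_names = [name for name in all_names if not name.startswith("_")]

print(all_names == sorted(all_names))  # True: dir() returns a sorted list
print("pi" in public_names)            # True
print("__name__" in public_names)      # False
```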
The variable __name__
The variable __name__ of a module stores the name of the module as a string

    >>> util.__name__
    'util'

If __name__ is accessed without specifying a module, you get the name of the module that is currently running

    >>> __name__
    '__main__'
The __main__ module
The first module that Python runs for a program is called the top-level program:
- if you run the interactive Python interpreter, it will be the top-level program
- if you run the command python x.py, module x will be the top-level program
The top-level program is called __main__
Thus, the value of the __name__ variable within the top-level program is always '__main__'
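The two cases can be compared with a sketch that runs the same file once as the top-level program and once as an imported module. The file name whoami.py is hypothetical, and the example assumes that the current interpreter can be started as a subprocess.

```python
import importlib
import subprocess
import sys
import tempfile
from pathlib import Path

# Hypothetical one-line module that prints its own __name__.
workdir = Path(tempfile.mkdtemp())
script = workdir / "whoami.py"
script.write_text("print(__name__)\n")

# Case 1: run the file as the top-level program.
result = subprocess.run(
    [sys.executable, str(script)],
    capture_output=True, text=True, check=True,
)
name_when_run = result.stdout.strip()
print(name_when_run)  # __main__

# Case 2: import the same file as a module.
sys.path.insert(0, str(workdir))
importlib.invalidate_caches()
import whoami  # prints "whoami" as a side effect of the import
print(whoami.__name__)  # whoami
```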
Executing modules as programs
The value of __name__ can be checked within a module to run some code if and only if that module is the top-level program

    # factorial.py
    def compute(n):
        f = 1
        for i in range(1, n + 1):
            f *= i
        return f

    if __name__ == '__main__':
        import sys
        f = compute(int(sys.argv[1]))
        print(f)
Executing modules as programs
The factorial.py module can be run from an OS shell, passing the number as an argument:

    python factorial.py 4
    24

Here sys.argv[0] is the script name (factorial.py) and sys.argv[1] is the string '4'
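A slightly more defensive variant of factorial.py could validate its arguments before using them. This is only a sketch: the main function, the error messages, and the exit codes are assumptions of this example, not part of the original module.

```python
import sys


def compute(n):
    """Return n! for a non-negative integer n."""
    result = 1
    for i in range(1, n + 1):
        result *= i
    return result


def main(argv):
    """Handle a list shaped like sys.argv and return an exit status."""
    if len(argv) != 2:
        print("usage: python factorial.py N", file=sys.stderr)
        return 2
    try:
        n = int(argv[1])
    except ValueError:
        print("N must be an integer", file=sys.stderr)
        return 2
    if n < 0:
        print("N must not be negative", file=sys.stderr)
        return 2
    print(compute(n))
    return 0


if __name__ == "__main__":
    # Hypothetical arguments stand in for a real command line here;
    # an actual script would call sys.exit(main(sys.argv)).
    main(["factorial.py", "4"])
```

Keeping the argument handling in a function makes it possible to call it from other modules, while the guard at the bottom still runs it only for the top-level program.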
Compiled modules
For efficiency reasons, the first time a module is imported, Python generates a compiled (bytecode) version of that module and caches it in a specific directory for future imports:
- the cache directory is named __pycache__
- compiling the source code takes some time, so reusing the cached compiled code makes later imports faster
- the cached code does not run faster than freshly compiled code: only the loading time is reduced
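The location of the cached file can be computed without importing anything, using the standard function importlib.util.cache_from_source. The file name util.py is used here only as an example, and the exact result depends on the interpreter in use.

```python
import importlib.util
import sys

# The source file does not need to exist: only the path is computed.
cache_path = importlib.util.cache_from_source("util.py")
print(cache_path)

# The tag embedded in the file name identifies the interpreter.
print(sys.implementation.cache_tag)
```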
Compiled modules
For a module named module, the corresponding compiled file is called module.version.pyc, where version identifies the Python implementation and version (for example, cpython-312)
If the source code of the module changes, Python automatically recompiles it and updates the cache
In an interpreter session, a module is imported only once; if the module changes you need to reload it, by writing the following

    >>> import importlib
    >>> importlib.reload(module)
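The difference between importing again and reloading can be seen in a sketch like this one. The module settings_demo and its VALUE variable are hypothetical, and the file is rewritten by the program itself to simulate an edit.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Hypothetical module created in a temporary directory.
workdir = Path(tempfile.mkdtemp())
source = workdir / "settings_demo.py"
source.write_text("VALUE = 1\n")
sys.path.insert(0, str(workdir))
importlib.invalidate_caches()

import settings_demo
first = settings_demo.VALUE
print(first)  # 1

# Simulate an edit. The new text has a different length, so the cached
# bytecode is recognised as stale even if the timestamp is unchanged.
source.write_text("VALUE = 1000\n")

# A second import statement does nothing: the module is already loaded.
import settings_demo
after_reimport = settings_demo.VALUE
print(after_reimport)  # 1

# reload() executes the module's source again in the same module object.
importlib.reload(settings_demo)
print(settings_demo.VALUE)  # 1000
```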
Packages
Similarly to other programming languages (e.g., Java), modules can be grouped into packages
From a logical point of view, a package should group a set of related modules, i.e., modules used for some specific purpose (graphics, text processing, sounds, ...)
From a technical point of view, a package corresponds to a directory: it must contain all its modules and a special file called __init__.py
Packages
The __init__.py file can be empty, or can contain some initialization code that is executed when the package is imported for the first time
To import a module mod.py that is in a package pack, use one of the following syntaxes:

    import pack.mod
    from pack import mod

A package can contain other sub-packages; if mod.py is inside subpack, and subpack is inside pack, you can write:

    import pack.subpack.mod
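The directory layout and the three import forms can be tried with the following sketch. The package name demo_pack, the INITIALIZED flag, and the where functions are assumptions of this example; the package is built in a temporary directory.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Build a hypothetical package:
#   demo_pack/__init__.py
#   demo_pack/mod.py
#   demo_pack/subpack/__init__.py
#   demo_pack/subpack/mod.py
root = Path(tempfile.mkdtemp())
pack = root / "demo_pack"
subpack = pack / "subpack"
subpack.mkdir(parents=True)

(pack / "__init__.py").write_text("INITIALIZED = True\n")
(pack / "mod.py").write_text("def where():\n    return 'demo_pack.mod'\n")
(subpack / "__init__.py").write_text("")
(subpack / "mod.py").write_text(
    "def where():\n    return 'demo_pack.subpack.mod'\n"
)

sys.path.insert(0, str(root))
importlib.invalidate_caches()

import demo_pack.mod           # the module is reached as demo_pack.mod
from demo_pack import mod      # the module is reached as mod
import demo_pack.subpack.mod   # a module inside a sub-package

print(demo_pack.INITIALIZED)           # True: __init__.py was executed
print(demo_pack.mod.where())           # demo_pack.mod
print(mod.where())                     # demo_pack.mod
print(demo_pack.subpack.mod.where())   # demo_pack.subpack.mod
```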
The * wildcard
The * symbol can be used as a wildcard when importing from a package, with the following syntax

    from pack import *

In fact, this instruction will NOT import all modules of the package pack:
- it will execute __init__.py
- it will import those modules of pack whose names are listed in the list variable __all__, defined in __init__.py
Example: __all__ = ["mod1", "mod2"]
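The role of __all__ can be checked with a sketch in which a package has three modules but lists only two of them. The package name all_demo and the NAME variables are hypothetical; the wildcard import runs inside a separate dictionary only so that the imported names are easy to list.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Hypothetical package with three modules, two of them listed in __all__.
root = Path(tempfile.mkdtemp())
pack = root / "all_demo"
pack.mkdir()
(pack / "__init__.py").write_text('__all__ = ["mod1", "mod2"]\n')
for module_name in ("mod1", "mod2", "mod3"):
    (pack / f"{module_name}.py").write_text(f"NAME = {module_name!r}\n")

sys.path.insert(0, str(root))
importlib.invalidate_caches()

# Run the wildcard import in its own namespace and list what arrived.
namespace = {}
exec("from all_demo import *", namespace)
imported = sorted(name for name in namespace if name != "__builtins__")
print(imported)  # ['mod1', 'mod2']

# mod3 was not imported by the wildcard, but is still importable by name.
from all_demo import mod3
print(mod3.NAME)  # mod3
```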