This is the Qpy package.

Qpy provides a convenient mechanism for generating safely-quoted xml
text from python code.  It does this by implementing a quote-no-more
string data type and a slight modification of the python compiler.
(The main idea comes from Quixote's htmltext/PTL.)

Quoting
-------

XML reserves 5 characters ('<', '>', '&', quote and apostrophe) so
that they can be used as markup delimiters.  When a document needs to
use these characters for some other purpose, they must be escaped,
that is, replaced by an equivalent entity or character reference.
This package defines an xml_quote() function that, for a string
argument, returns a string with these 5 characters replaced by their
equivalents: for example, '<' becomes '&lt;'.

When assembling an XML (or similar markup such as HTML) document, it
is important to remember to quote everything that should be quoted,
such as text that comes from a database or some (untrusted) outside
source.  In the case of web pages, underquoting is dangerous, as it
leaves the door open for cross-site scripting and other attacks.

It would be nice if you could assemble your document as a string and
then call xml_quote() on it at the end, just to make sure that
everything was quoted, but this generally results in over-quoting,
where you lose the intended markup structure.  For web pages,
over-quoting produces a result that is ugly, but much safer than the
underquoted alternative.

Programs that produce XML documents must keep track of just what has
been quoted already and what has not, and mistakes are common.  Our
objective is to make quoting errors rare, especially underquoting
errors.

The Quoted-No-More Class: xml
-----------------------------

Our xml_quote() function always returns an xml instance.  The class
named "xml" is a subclass of Python's unicode string class.  An
instance of xml is a string that is known to need no more XML quoting.
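
The over-quoting problem described under "Quoting" is what motivates a
quote-no-more type.  It can be seen in a small sketch that does not use
Qpy at all: the standard library's html.escape() stands in for
xml_quote() here, which is an assumption made only for illustration
(the exact replacement Qpy uses for each character may differ).

```python
from html import escape  # standard library; used here in place of xml_quote()

untrusted = '<script>alert("x")</script>'

# Quoting once replaces the reserved characters with entity references.
quoted_once = escape(untrusted, quote=True)
print(quoted_once)

# Quoting the already-quoted text again escapes the '&' of each entity,
# so the entities themselves would be displayed literally (over-quoting).
quoted_twice = escape(quoted_once, quote=True)
print(quoted_twice)
```

A plain string cannot tell these two cases apart, which is why the
"already quoted" fact is carried by the type of the value.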

When the xml_quote() function gets an xml instance as an argument, it
just returns the instance immediately, without any changes.  When the
xml_quote() function gets None as an argument, it always returns an
empty xml instance.  All other arguments to xml_quote() are converted
to unicode strings and then the reserved characters are escaped to
produce the resulting xml instance.

The xml class defines some methods that make it easy to build quoted
documents.  When an xml instance is combined with another object using
the '+' operator, the result is the xml instance formed by
concatenating the quoted operands.  The value of the expression

    xml('') + '<'

is equal to the value of

    xml('&lt;')

When an xml instance is used as a format string with the '%' operator,
the (non-number) arguments to the format string are quoted as they are
used.

The xml class includes a join() method that quotes the items in the
sequence before joining them.  The common case of using an empty xml
instance to join a sequence is implemented in the join_xml() function.
The join_str() function acts the same way, except that it does not
escape any characters.

The Qpy Compiler
----------------

The Qpy compiler is a Python compiler with an added preprocessor that
can best be understood as a source-code transformation.  The
transformation is limited to the definitions of certain functions we
call "templates".  An xml template is designated in qpy source code by
":xml" just after the function name in the function's definition.

For example, this is an xml template:

    def f:xml(x):
        "<div>"
        x
        "</div>"

The Qpy preprocessor essentially replaces this by:

    from qpy import xml as _qpy_xml, join_xml as _qpy_join_xml
    def f(x):
        qpy_accumulation = []
        qpy_append = qpy_accumulation.append
        qpy_append(_qpy_xml("<div>"))
        qpy_append(x)
        qpy_append(_qpy_xml("</div>"))
        return _qpy_join_xml(qpy_accumulation)

There are two main things going on here.  One is that every string
literal in the body of the function is wrapped by the xml constructor.
The assumption is that a literal string, provided by the programmer,
does not need any more quoting.  The other part of the conversion is
that expression values are accumulated on a local list, and the
default return value is the xml instance formed by concatenating these
values, after quoting them.

The values returned by f are xml instances, and here are some samples:

    f(None)           -> "<div></div>"              # None becomes ''.
    f("<hr />")       -> "<div>&lt;hr /&gt;</div>"  # Quoting happens.
    f(1)              -> "<div>1</div>"             # Converted.
    f(xml("<hr />"))  -> "<div><hr /></div>"        # Already quoted.

The nice thing about this is that the expressions appearing in a
template, possibly including values provided from outside sources,
will always be quoted unless they are already instances of the xml
class.  If the programmer makes a mistake with respect to quoting, it
will very likely appear as over-quoting instead of lurking as a
security problem.

Templates can't have normal python docstrings after the arguments: we
just use comments.

A template may also be designated by ":str", instead of ":xml", after
the function name.  The difference is that a str template will
accumulate the values of expression statements and return the
join_str() of the list, and there is no XML-quoting.

Templates can be nested arbitrarily along with other functions.  A
template's code transformation does not apply inside ordinary
functions that are defined inside the template body.

Using Qpy
---------

Source code files that include templates should be named with a ".qpy"
suffix and placed in a python package directory.  The package
__init__.py should contain the following lines to make sure that the
compiled versions of the qpy modules are up-to-date:

    from qpy.compile import compile_qpy_files
    compile_qpy_files(__path__[0])

The qpcheck.py Utility
----------------------

This package also includes qpcheck.py, a script that looks for unknown
names and unused imports in directories containing python and qpy
source code.

Installation
------------

If you have been using a previous version of QPY, remove all ".pyc"
files that have been compiled from ".qpy" files.

    python setup.py install

or

    python setup.py build_ext -i # build extension in place.

and put this directory on your python path.

Example
-------

An "example" package is included.  To try it, install as described
above, start a python interpreter, and try importing the
"qpy.example.example1" module.  The real purpose of the example is to
provide an example package, __init__.py, and a Qpy module.

Content-in-code instead of code-in-content.
-------------------------------------------

Most "template" systems are designed to embed program-like
value-substitution and control flow into what would otherwise be
static content.  Qpy (like Quixote's PTL templates) uses the opposite
pattern, embedding static content in what would otherwise be an
ordinary program.  This program-centric pattern is especially
attractive when the content maintenance team is the same as the
programming team.

Notes for Quixote Users
-----------------------

The basic idea for Qpy comes from Quixote.

In Qpy, quoting also quotes the apostrophe character.

The xml class is like Quixote's htmltext class.  Unlike htmltext
instances, xml instances can be pickled.

Most .ptl files work without changes with Qpy.  Qpy treats PTL's html
templates as xml templates, and PTL's plain templates as str
templates.

Qpy doesn't use ihooks or any other kind of import hook.

Notes for Users of Previous Versions of QPY
-------------------------------------------

Use xml in place of h8.  The name "h8" is deprecated.

The older syntax for templates still works, but it is deprecated.

The html_escape_string function is not present in this release.

The u8 class is not really present in this release because Python 3
makes it unnecessary.  The name "u8" is deprecated.

There is no class method for quoting.  Use xml_quote() instead.

The stringify() function is still present, but deprecated.

Copyright
---------

Copyright (c) Corporation for National Research Initiatives 2009.