Metadata-Version: 1.0
Name: repoze.postoffice
Version: 0.17
Summary: Provides central depot for incoming mail for use by applications.
Home-page: http://www.repoze.org
Author: Chris Rossi
Author-email: repoze-dev@lists.repoze.org
License: BSD-derived (http://www.repoze.org/LICENSE.txt)
Description: =================
        repoze.postoffice
        =================
        
        `repoze.postoffice` provides a centralized depot for collecting incoming email
        for consumption by multiple applications.  Incoming mail is sorted into queues
        according to rules with the expectation that each application will then consume
        its own queue.  Each queue is a first-in-first-out (FIFO) queue, so messages
        are processed in the order received.
        
        ZODB is used for storage and is also used to provide the client interface.
        `repoze.postoffice` clients create a ZODB connection and manipulate models.
        This makes it relatively simple to consume the message queue in the context
        of a transaction.
        
        Setting up the depot
        ====================
        
        `repoze.postoffice` assumes that a message transport agent (MTA), such as
        Postfix, has been configured to deliver messages to a folder using the Maildir
        format. Configuring the MTA is outside of the scope of this document.
        
        Configuration File
        ++++++++++++++++++
        
        The depot is configured via a configuration file in ini format.  The ini file
        consists of a single 'post office' section followed by one or more named
        queue sections.  The 'post office' section contains information about the ZODB
        setup as well as the location of the incoming Maildir::
        
            [post office]
            # Required parameters
            zodb_uri = zconfig://%(here)s/zodb.conf#main
            maildir = %(here)s/incoming/Maildir
        
            # Optional parameters
            zodb_path = /postoffice
            ooo_loop_frequency = 60 # 1 Hertz
            ooo_loop_headers = To,Subject
            ooo_throttle_period = 300 # 5 minutes
            max_message_size = 500m
        
        `zodb_uri` is interpreted using `repoze.zodbconn` and follows the format laid
        out there.  See: http://docs.repoze.org/zodbconn/narr.html
        
        `zodb_path` is the path in the db to the postoffice queues.  This parameter
        is optional and defaults to '/postoffice'.
        
        `maildir` is the path to the incoming Maildir format folder from which messages
        are pulled.
        
        `ooo_loop_frequency` specifies the threshold frequency of incoming messages
        from the same user to the same queue, in messages per minute. When the
        threshold is reached by a particular user, messages from that user will be
        marked as rejected for a period of time in an attempt to break a possible out
        of office auto-reply loop. If not specified, no check is performed on
        frequency of incoming messages.
        
        `ooo_loop_headers` optionally causes loop detection to use the specified email
        headers as discriminators.  If specified, these headers must match for incoming
        messages to trigger the ooo throttle.  If not specified, no header matching is
        done, and messages need only be sent from the same user to the same queue to
        trigger the throttle.
        
        `ooo_throttle_period` specifies the amount of time, in minutes, for which a
        user's incoming mail will be marked as rejected if loop detection is in use
        and the user reaches the `ooo_loop_frequency` threshold. Defaults to 5
        minutes. If `ooo_loop_frequency` is not set, this setting has no effect.
        
        `max_message_size` sets the maximum size, in bytes, of incoming messages.
        Messages which exceed this limit will have their payloads discarded and will
        be marked as rejected. The suffixes 'k', 'm' or 'g' may be used to specify
        that the number of bytes is expressed in kilobytes, megabytes or gigabytes,
        respectively. A number without suffix will be interpreted as bytes. If not
        set, no limit will be imposed on incoming message size.
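        
        As a rough illustration of the suffix rule above, here is a minimal Python
        sketch.  The helper name `parse_size` is hypothetical, and treating a kilobyte
        as 1024 bytes is an assumption; this is not the library's own parser::
        
            def parse_size(value):
                """Convert a string such as '500m' into a number of bytes."""
                # Assumption: binary multiples (1k == 1024 bytes).
                multipliers = {'k': 1024, 'm': 1024 ** 2, 'g': 1024 ** 3}
                value = value.strip().lower()
                if value and value[-1] in multipliers:
                    return int(value[:-1]) * multipliers[value[-1]]
                # No suffix: the number is already expressed in bytes.
                return int(value)
        
            limit = parse_size('500m')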
        
        Each message queue is configured in a section with the prefix 'queue:'::
        
            [queue:Customer A]
            filters =
                to_hostname: app.customera.com app.aliasa.com
        
            [queue:Customer B]
            filters =
                to_hostname: .customerb.com
        
        Filters
        +++++++
        
        Filters are used to determine which messages land in which queues. When a new
        message enters the system each queue is tried in the order specified in the
        ini file until a match is found or until all of the queues have been tried.
        For each queue each filter for that queue is processed. In order to match for
        a queue a message must match all filters for that queue.
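        
        The matching rule described above (queues tried in order, every filter of a
        queue must match) can be sketched in Python as follows.  The function `route`
        and the representation of filters as callables are assumptions made for
        illustration, not the library's internal API::
        
            def route(message, queues):
                """Return the name of the first queue whose filters all match.
        
                `queues` is an ordered list of (name, filters) pairs, in the same
                order as the ini file.  Each filter is a callable that takes the
                message and returns True or False.
                """
                for name, filters in queues:
                    if all(matches(message) for matches in filters):
                        return name
                # No queue matched the message.
                return None
        
            queues = [
                ('Customer A', [lambda m: m['to'].endswith('@app.customera.com')]),
                ('Customer B', [lambda m: m['to'].endswith('.customerb.com')]),
            ]
            chosen = route({'to': 'bob@mail.customerb.com'}, queues)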
        
        At the time of writing, the following filters are implemented:
        
        + `to_hostname`: This filter matches the hostname of the email address in the
          'To' or 'CC' headers of the message. Hostnames which begin with a period will
          match any hostname that ends with the specified name, i.e. '.example.com'
          matches 'example.com' and 'app.example.com'. If the hostname does not begin
          with a period it must match exactly. Multiple hostnames, delimited by
          whitespace, may be listed. If multiple hostnames are used, an incoming message
          need match only one.
        
        + `header_regexp`: This filter allows the matching of arbitrary regular
          expressions against the headers of a message.  Only a single regular
          expression can be specified.  An example::
        
            [queue:Parties]
            filters =
                header_regexp: Subject:.+[Pp]arty.+
        
        + `header_regexp_file`: This filter is the same as `header_regexp` except that
          multiple regular expressions can be written in a file. Regular expressions are
          newline delimited in the file. The argument to this filter is the path to the
          file::
        
            [queue:Weddings]
            filters =
                header_regexp_file: %(here)s/wedding_invitation_header_checks.txt
        
        + `body_regexp`: Like `header_regexp` except the regular expression must match
          some text in one of the message part bodies.
        
        + `body_regexp_file`: Like `header_regexp_file` except the regular expressions
          must match some text in one of the message part bodies.
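        
        To make the `to_hostname` rule concrete, here is a minimal Python sketch of
        the leading-period behaviour.  The function names are hypothetical and the
        case-insensitive comparison is an assumption; the real filter may differ in
        detail::
        
            def hostname_matches(pattern, hostname):
                """Apply the leading-period rule to one pattern and one hostname."""
                pattern = pattern.lower()    # assumption: case-insensitive match
                hostname = hostname.lower()
                if pattern.startswith('.'):
                    # '.example.com' matches 'example.com' and any subdomain.
                    return hostname == pattern[1:] or hostname.endswith(pattern)
                # Without a leading period the match must be exact.
                return hostname == pattern
        
            def to_hostname(patterns, hostnames):
                """Return True if any recipient hostname matches any pattern.
        
                `patterns` is the whitespace-delimited filter argument and
                `hostnames` are the hostnames from the 'To' and 'CC' headers.
                """
                return any(hostname_matches(pattern, hostname)
                           for pattern in patterns.split()
                           for hostname in hostnames)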
        
        Global Reject Filters
        +++++++++++++++++++++
        
        In addition to defining filters for queues, filters can be defined globally
        for rejection of messages before they can be assigned to queues. Any filter
        that can be used for a queue can be used here. The basic difference, though,
        is that for a queue, if a filter matches, the message goes into the queue.
        Here, though, if a filter matches the message is rejected.  ::
        
            [post office]
            reject_filters =
                header_regexp_file: reject_headers.txt
                body_regexp_file: reject_body.txt
                to_hostname: .partycentral.com  # We need to get them to change their MX
        
        Populating Queues
        =================
        
        Queues are populated using the `postoffice` console script that is provided
        when the `repoze.postoffice` egg is installed.  This script reads messages from
        the incoming maildir and imports them into the ZODB-based depot.  Messages are
        matched and placed in appropriate queues.  Messages which do not match any
        queues are erased.  There are no required arguments to the script--if it can
        find its .ini file, it will work::
        
            $ bin/postoffice
        
        The `postoffice` script will search for an ini file named 'postoffice.ini'
        first in the current directory, then in an 'etc' folder in the current
        directory, then an 'etc' folder that is a sibling of the 'bin' folder which
        contains the `postoffice` script and then, finally, in '/etc'.  You can also
        use a non-standard location for the ini file by passing the path as an
        argument to the script::
        
            $ bin/postoffice -C path/to/config.ini
        
        Use the '-h' or '--help' switch to see all of the options available.
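        
        The search order described above can be sketched in Python as follows.  The
        function names are hypothetical and this is only an illustration of the
        documented order, not the script's actual implementation::
        
            import os
        
            def candidate_ini_paths(script_path, cwd):
                """List the locations searched for 'postoffice.ini', in order."""
                bin_dir = os.path.dirname(os.path.abspath(script_path))
                return [
                    os.path.join(cwd, 'postoffice.ini'),
                    os.path.join(cwd, 'etc', 'postoffice.ini'),
                    # The 'etc' folder that is a sibling of the 'bin' folder.
                    os.path.join(os.path.dirname(bin_dir), 'etc', 'postoffice.ini'),
                    os.path.join('/etc', 'postoffice.ini'),
                ]
        
            def find_ini(script_path, cwd):
                """Return the first candidate that exists, or None."""
                for path in candidate_ini_paths(script_path, cwd):
                    if os.path.exists(path):
                        return path
                return None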
        
        Out of Office Loop Detection
        ============================
        
        `repoze.postoffice` attempts to address out of office loops. An out of
        office loop can occur when `repoze.postoffice` is used to populate content in
        an application which generates an email to alert users of the new content.
        Essentially, a poorly behaved email client will respond to the new content
        alert email with an out of office reply which in turn causes more content to
        be created and another alert email to be sent. Without some form of loop
        detection, this can lead to a large amount of junk content being generated
        very quickly.
        
        When a new email enters the system, `repoze.postoffice` first checks for some
        headers that could be set by well behaved MTAs to indicate automated
        responses and marks as rejected messages which match these known heuristics.
        First, the non-standard, but widely supported, 'Precedence' header is checked
        and messages with a precedence of 'bulk', 'junk', or 'list' are marked as
        rejected. Next, `repoze.postoffice` checks for the presence of the
        'Auto-Submitted' header, which is described in RFC 3834 and is standard but
        not yet widely supported. Messages containing this header are also marked as
        rejected. In either of these two cases, the incoming message is marked by
        adding the header::
        
          X-Postoffice-Rejected: Auto-response
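        
        The two header checks just described can be sketched with the standard
        library `email` package.  The function name `is_auto_response` is
        hypothetical, and the case-insensitive comparison of the 'Precedence' value
        is an assumption::
        
            from email import message_from_string
        
            def is_auto_response(message):
                """Return True if the headers suggest an automated response."""
                precedence = (message.get('Precedence') or '').strip().lower()
                if precedence in ('bulk', 'junk', 'list'):
                    return True
                # Any 'Auto-Submitted' header counts, as described above.
                return message.get('Auto-Submitted') is not None
        
            reply = message_from_string('Precedence: bulk\n\nI am out of office.')
            flagged = is_auto_response(reply)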
        
        Out of office messages sent by certain clients (Microsoft) will typically not
        use either of the above standards to indicate an automated reply. As a last
        line of defense, `repoze.postoffice` also tracks the frequency of incoming
        mail by email address and, optionally, other headers specified by the
        'ooo_loop_headers' configuration option. When the number of messages arriving
        from the same user surpasses a particular, assumedly inhuman, threshold, a
        temporary block is placed on messages from that user, such that all messages
        from that user are marked as rejected for a certain period of time, hopefully
        breaking the auto reply feedback loop. Messages which trigger or fall under a
        throttle are marked with the header::
        
          X-Postoffice-Rejected: Throttled
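        
        One simple way to picture the throttle is a per-sender sliding window.  The
        sketch below is only an illustration under that assumption; the class name
        `Throttle` is hypothetical and this is not the algorithm the library
        actually uses to compute message frequency::
        
            import collections
        
            class Throttle(object):
                def __init__(self, max_per_minute, throttle_minutes):
                    self.max_per_minute = max_per_minute
                    self.throttle_seconds = throttle_minutes * 60.0
                    self.arrivals = collections.defaultdict(collections.deque)
                    self.blocked_until = {}
        
                def is_throttled(self, key, now):
                    """Record an arrival at time `now` (in seconds).
        
                    `key` identifies the sender, for example a tuple of
                    (from address, queue name, chosen header values).
                    Returns True if the message should be marked as rejected.
                    """
                    if now < self.blocked_until.get(key, 0.0):
                        return True
                    window = self.arrivals[key]
                    window.append(now)
                    # Forget arrivals older than one minute.
                    while window and window[0] <= now - 60.0:
                        window.popleft()
                    if len(window) >= self.max_per_minute:
                        # Threshold reached: block this key for a while.
                        self.blocked_until[key] = now + self.throttle_seconds
                        window.clear()
                        return True
                    return False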
        
        Messages marked with the 'X-Postoffice-Rejected' header are still conveyed to
        the client.  It is up to the client to check for this header and take
        appropriate action.  This allows the client to choose and take appropriate
        action, such as bouncing with a particular bounce message, etc.
        
        Message Size Limit
        ==================
        
        If 'max_message_size' is specified in the configuration, messages which exceed
        this size will have their payloads (body and any attachments) discarded and
        will be marked with the header::
        
          X-Postoffice-Rejected: Maximum Message Size Exceeded
        
        The trimmed message is still conveyed to the client, which should check for
        the 'X-Postoffice-Rejected' header and take appropriate action, possibly
        including bouncing the message with an appropriate bounce message.
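        
        A client might dispatch on the 'X-Postoffice-Rejected' header along these
        lines.  The action names and the policy table are hypothetical; each
        application decides for itself what to do with a rejected message::
        
            from email import message_from_string
        
            # Hypothetical client policy for each documented rejection reason.
            ACTIONS = {
                'Auto-response': 'drop',
                'Throttled': 'drop',
                'Maximum Message Size Exceeded': 'bounce',
            }
        
            def choose_action(message):
                """Return 'process' for normal mail, else the configured action."""
                reason = message.get('X-Postoffice-Rejected')
                if reason is None:
                    return 'process'
                # Unknown reasons fall back to a bounce in this sketch.
                return ACTIONS.get(reason.strip(), 'bounce')
        
            trimmed = message_from_string(
                'X-Postoffice-Rejected: Maximum Message Size Exceeded\n\n')
            action = choose_action(trimmed)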
        
        Consuming Queues
        ================
        
        Client applications consume message queues by establishing a connection to the
        ZODB which houses the depot and interacting with queue and message objects.
        `repoze.postoffice.queue` contains a helper function, `open_queue`, which,
        given connection information, opens the connection for you and returns a
        Queue instance::
        
          from my.example import process_message
          from my.example import validate_message
          from repoze.postoffice.queue import open_queue
          import sys
          import transaction
        
          # Placeholder URI: point this at your own ZODB configuration file.
          ZODB_URI = 'zconfig:///path/to/zodb.conf#main'
          queue_name = 'my queue'
          queue = open_queue(ZODB_URI, queue_name, path='/postoffice')
          while queue:
              message = queue.pop_next()
              if not validate_message(message):
                  queue.bounce(message, 'Message is invalid.')
                  transaction.commit()
                  continue  # a bounced message must not be processed
              try:
                  process_message(message)
                  transaction.commit()
              except Exception:
                  transaction.abort()
                  queue.quarantine(message, sys.exc_info())
                  transaction.commit()
        
        
        0.17 (2011-09-26)
        -----------------
        
        - Added a header, 'X-Postoffice-Date', to queued messages.  It records
          the time each message was received (as seconds since the epoch).  It
          is set to the modified time of the maildir message file.
        
        0.16 (2011-06-30)
        -----------------
        
        - Added better fault tolerance for insane date headers generated by spambots.
          (LP #697033)
        
        0.15 (2011-06-15)
        -----------------
        
        - Body checks are now multiline regexp checks. (LP #787573)
        
        0.14 (2011-05-17)
        -----------------
        
        - Fixed problem where the zodb_uri could be unicode, which eventually breaks
          the ZEO client for ZEO uris.  zodb_uri is now converted to a UTF-8 string.
        
        0.13 (2011-05-05)
        -----------------
        
        - Fixed problem with header filters not working properly with non-ASCII
          characters in headers.  (LP #777455)
        
        - Fixed bug in regular expression body filter which improperly parsed the
          'Content-Type' header in order to extract the character set of a Mime part.
        
        0.12 (2011-04-25)
        -----------------
        
        - Respect leading and trailing whitespace in rules files.
        
        0.11 (2011-04-25)
        -----------------
        
        - When a message is rejected by a filter, a message is logged showing which
          filter triggered the rejection.
        
        0.10 (2011-04-20)
        -----------------
        
        - Improved logging output now includes a timestamp.
        
        - Worked around (probable) bug in stdlib email parser where a message part
          might have a charset set in the 'Content-Type' header, but
          message.get_charset() returns None.
        
        0.9 (2011-04-15)
        ----------------
        
        - Added greater fault tolerance for malformed email addresses.  (Shakes fist at
          spammers.)
        
        - Added four new filter types based on regular expression matching:
          `header_regexp`, `header_regexp_file`, `body_regexp`, `body_regexp_file`.
          See README.txt for information on how to use these new filters.
        
        - Added a new option to the global configuration: `reject_filters`. This allows
          you to set up filters at a global level for rejecting certain messages.  See
          README.txt for more information.
        
        0.8 (2011-01-14)
        ----------------
        
        - The 'to_hostname' filter now parses multiple email addresses and checks the
          'Cc' header as well as the 'To' header.  (LP #659243)
        
        - If multiple incoming messages in a 24 hour period have the same Message-Id,
          they are presumed to be duplicates and all but the first are discarded.
          (LP #659243)
        
        0.7 (2010-09-15)
        ----------------
        
        - Fixed another case where non-RFC 2047 compliant headers could cause an
          exception to be raised.  (LP #637484)
        
        0.6 (2010-09-13)
        ----------------
        
        - Added Queue.requeue_quarantined_messages() convenience method to API.
        
        - Allow for multiple hosts in 'to_hostname' filter. (LP #614528)
        
        - Added graceful degradation for non-RFC 2047 compliant headers, in order to
          avoid crashing when spambots send us malformed messages. (LP #637484)
        
        0.5 (2010-08-03)
        ----------------
        
        - Added 'X-Postoffice: Bounced' header to outgoing bounce and quarantine
          messages. The presence of this header is checked when importing messages and
          any messages which contain it are discarded. This is to prevent possible
          ricochets of bounce messages back into the system. (LP #612587)
        
        - Incoming messages with a 'From' header which matches exactly its 'To' header
          are now discarded as probable spam. (LP #612588)
        
        0.4 (2010-07-30)
        ----------------
        
        - Fixed bug in processing body of bounce messages when non-ascii unicode
          characters are present.
        
        0.3 (2010-07-20)
        ----------------
        
        - Fixed divide by zero error when calculating instantaneous message frequency.
        
        - Fixed bug in repoze.postoffice.queue.open_queue where a ZEO connection would
          be left open if there was a KeyError on the queue name.
        
        0.2 (2010-06-29)
        ----------------
        
        - Fixed bug in parsing headers with no values.
        
        - Added ability to use arbitrary message headers as discriminator values in
          out of office loop detection.
        
        - When messages exceed maximum message size, are throttled or are found to be
          an auto-response, they are no longer discarded.  Instead these messages get
          an 'X-Postoffice-Rejected' header added where the value gives the reason for
          rejection.  These messages are then consumable by clients in the normal way.
          It is up to the client to detect the 'X-Postoffice-Rejected' header and take
          appropriate action.  This change was made to allow the client to determine
          what, if any, sort of bounce message should be generated if any of these
          conditions are true.
        
        0.1 (2010-06-03)
        ----------------
        
        - Initial Release.
        
Keywords: e-mail zope repoze
Platform: UNKNOWN
Classifier: Development Status :: 4 - Beta
Classifier: Intended Audience :: Developers
Classifier: Programming Language :: Python
