Parser
- class headerparser.HeaderParser(normalizer: Callable[[str], Any] | None = None, body: bool | None = None, **kwargs: Any)
A parser for RFC 822-style header sections. Define the fields the parser should recognize with the add_field() method, configure handling of unrecognized fields with add_additional(), and then parse input with parse() or another parse_*() method.
- Parameters:
normalizer (callable) – By default, the parser will consider two field names to be equal iff their lowercased forms are equal. This can be overridden by setting normalizer to a custom callable that takes a field name and returns a “normalized” name for use in equality testing. The normalizer will also be used when looking up keys in the NormalizedDict instances returned by the parser’s parse_*() methods.
body (bool) – whether the parser should allow or forbid a body after the header section; True means a body is required, False means a body is prohibited, and None (the default) means a body is optional
kwargs – Passed to the Scanner constructor
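The normalizer rule described above can be pictured with a small standalone sketch. It does not import headerparser: default_normalizer models the documented lowercasing default, while dash_insensitive and same_field are hypothetical names invented for this illustration, and the library's internals may differ.

```python
from typing import Any, Callable


def default_normalizer(name: str) -> str:
    # Models the documented default: field names are compared lowercased.
    return name.lower()


def dash_insensitive(name: str) -> str:
    # Hypothetical custom normalizer: additionally treat "_" and "-" alike.
    return name.lower().replace("_", "-")


def same_field(
    a: str, b: str, normalizer: Callable[[str], Any] = default_normalizer
) -> bool:
    # Two names denote the same field iff their normalized forms are equal.
    return normalizer(a) == normalizer(b)


print(same_field("Content-Type", "content-type"))                    # True
print(same_field("Content_Type", "content-type"))                    # False
print(same_field("Content_Type", "content-type", dash_insensitive))  # True
```

A callable shaped like dash_insensitive (one field name in, one comparable value out) is the kind of object the normalizer parameter expects.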
- add_additional(enable: bool = True, **kwargs: Any) -> None
Specify how the parser should handle fields in the input that were not previously registered with add_field. By default, unknown fields will cause the parse_* methods to raise an UnknownFieldError, but calling this method with enable=True (the default) will change the parser’s behavior so that all unregistered fields are processed according to the options in **kwargs. (If no options are specified, the additional values will just be stored in the result dictionary.)
If this method is called more than once, only the settings from the last call will be used.
Note that additional field values are always stored in the result dictionary using their field name as the key, and two fields are considered the same (for the purposes of multiple) iff their names are the same after normalization. Customization of the dictionary key and field name can only be done through add_field.
Changed in version 0.2.0: action argument added
- Parameters:
enable (bool) – whether the parser should accept input fields that were not registered with add_field; setting this to False disables additional fields and restores the parser’s default behavior
multiple (bool) – If True, each additional header field will be allowed to occur more than once in the input, and each field’s values will be stored in a list. If False (the default), a DuplicateFieldError will be raised if an additional field occurs more than once in the input.
unfold (bool) – If True (default False), additional field values will be “unfolded” (i.e., line breaks will be removed and whitespace around line breaks will be converted to a single space) before applying type
type (callable) – a callable to apply to additional field values before storing them in the result dictionary
choices (iterable) – A sequence of values which additional fields are allowed to have. If choices is defined, all additional field values in the input must have one of the given values (after applying type) or else an InvalidChoiceError is raised.
action (callable) – A callable to invoke whenever the field is encountered in the input. The callable will be passed the current dictionary of header fields, the field’s name, and the field’s value (after processing with type and unfold and checking against choices). The callable replaces the default behavior of storing the field’s values in the result dictionary, and so the callable must explicitly store the values if desired.
- Returns: None
- Raises:
if enable is true and a previous call to add_field used a custom dest
if choices is an empty sequence
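The effect of multiple on repeated additional fields can be sketched without the library. This is an illustration only: collect_additional and DuplicateSketchError are hypothetical names, a plain dict keyed by the lowercased name stands in for the NormalizedDict the parser returns, and the default lowercasing normalizer is assumed.

```python
from typing import Dict, Iterable, List, Tuple, Union


class DuplicateSketchError(ValueError):
    """Stand-in for the library's DuplicateFieldError (illustration only)."""


def collect_additional(
    pairs: Iterable[Tuple[str, str]], multiple: bool = False
) -> Dict[str, Union[str, List[str]]]:
    result: Dict[str, Union[str, List[str]]] = {}
    for name, value in pairs:
        key = name.lower()  # assumed default normalization
        if multiple:
            # Every occurrence is kept; values accumulate in a list.
            result.setdefault(key, []).append(value)
        elif key in result:
            # Same field (after normalization) seen twice: rejected.
            raise DuplicateSketchError(name)
        else:
            result[key] = value
    return result


print(collect_additional([("Tag", "a"), ("TAG", "b")], multiple=True))
# {'tag': ['a', 'b']}
```

With multiple left at False, the same input raises the stand-in error, because "Tag" and "TAG" normalize to the same name.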
- add_field(name: str, *altnames: str, **kwargs: Any) -> None
Define a header field for the parser to parse. During parsing, if a field is encountered whose name (modulo normalization) equals either name or one of the altnames, the field’s value will be processed according to the options in **kwargs. (If no options are specified, the value will just be stored in the result dictionary.)
Changed in version 0.2.0: action argument added
- Parameters:
name (string) – the primary name for the field, used in error messages and as the default value of dest
altnames (strings) – field name synonyms
dest – The key in the result dictionary in which the field’s value(s) will be stored; defaults to name. When additional headers are enabled (see add_additional), dest must equal (after normalization) one of the field’s names.
required (bool) – If True (default False), the parse_* methods will raise a MissingFieldError if the field is not present in the input
default – The value to associate with the field if it is not present in the input. If no default value is specified, the field will be omitted from the result dictionary if it is not present in the input. default cannot be set when the field is required. type, unfold, and action will not be applied to the default value, and the default value need not belong to choices.
multiple (bool) – If True, the header field will be allowed to occur more than once in the input, and all of the field’s values will be stored in a list. If False (the default), a DuplicateFieldError will be raised if the field occurs more than once in the input.
unfold (bool) – If True (default False), the field value will be “unfolded” (i.e., line breaks will be removed and whitespace around line breaks will be converted to a single space) before applying type
type (callable) – a callable to apply to the field value before storing it in the result dictionary
choices (iterable) – A sequence of values which the field is allowed to have. If choices is defined, all occurrences of the field in the input must have one of the given values (after applying type) or else an InvalidChoiceError is raised.
action (callable) – A callable to invoke whenever the field is encountered in the input. The callable will be passed the current dictionary of header fields, the field’s name, and the field’s value (after processing with type and unfold and checking against choices). The callable replaces the default behavior of storing the field’s values in the result dictionary, and so the callable must explicitly store the values if desired. When action is defined for a field, dest cannot be.
- Returns: None
- Raises:
if another field with the same name or dest was already defined
if dest is not one of the field’s names and add_additional is enabled
if default is defined and required is true
if choices is an empty sequence
if both dest and action are defined
TypeError – if name or one of the altnames is not a string
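The order in which the options above are applied (unfold, then type, then the choices check, then either action or storage under dest) can be sketched as follows. This is a standalone approximation, not the library's code: unfold_value and process_field are hypothetical names, the regular expression is only one plausible reading of "unfolding", and a plain ValueError stands in for InvalidChoiceError.

```python
import re
from typing import Any, Callable, Dict, Iterable, Optional


def unfold_value(value: str) -> str:
    # Approximation: replace each line break, together with the spaces and
    # tabs around it, by a single space.
    return re.sub(r"[ \t]*[\r\n]+[ \t]*", " ", value.strip())


def process_field(
    result: Dict[str, Any],
    name: str,
    raw: str,
    *,
    dest: Optional[str] = None,
    unfold: bool = False,
    type_: Optional[Callable[[str], Any]] = None,
    choices: Optional[Iterable[Any]] = None,
    action: Optional[Callable[[Dict[str, Any], str, Any], None]] = None,
) -> None:
    value: Any = unfold_value(raw) if unfold else raw
    if type_ is not None:
        value = type_(value)  # conversion happens after unfolding
    if choices is not None and value not in list(choices):
        raise ValueError(f"{value!r} is not a valid choice for {name}")
    if action is not None:
        # The action replaces storage; it must store the value itself.
        action(result, name, value)
    else:
        result[dest if dest is not None else name] = value


fields: Dict[str, Any] = {}
process_field(fields, "Tags", "doctest, examples,\n  whatever", unfold=True)
process_field(fields, "Count", "3", type_=int, choices=[1, 2, 3])
print(fields)  # {'Tags': 'doctest, examples, whatever', 'Count': 3}
```

Because the choices check runs after type_, the allowed values in this sketch are compared against converted values (the integer 3, not the string "3").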
- parse(data: str | Iterable[str]) -> NormalizedDict
New in version 0.4.0.
Parse an RFC 822-style header field section (possibly followed by a message body) from the contents of the given string, filehandle, or sequence of lines and return a dictionary of the header fields (possibly with body attached). If data is an iterable of str, newlines will be appended to lines in multiline header fields where not already present but will not be inserted where missing inside the body.
Changed in version 0.5.0: data can now be a string.
- Parameters:
data – a string, text-file-like object, or iterable of lines to parse
- Return type: NormalizedDict
- Raises:
ParserError – if the input fields do not conform to the field definitions declared with add_field and add_additional
ScannerError – if the header section is malformed
- parse_next_stanza(iterator: Iterator[str]) -> NormalizedDict
New in version 0.4.0.
Parse an RFC 822-style header field section from the contents of the given filehandle or iterator of lines and return a dictionary of the header fields. Input processing stops at the end of the header section, leaving the rest of the iterator unconsumed. As a message body is not consumed, calling this method when body is true will produce a MissingBodyError.
Deprecated since version 0.5.0: Instead combine Scanner.scan_next_stanza() with parse_stream()
- Parameters:
iterator – a text-file-like object or iterator of lines to parse
- Return type: NormalizedDict
- Raises:
ParserError – if the input fields do not conform to the field definitions declared with add_field and add_additional
ScannerError – if a header section is malformed
- parse_next_stanza_string(s: str) -> tuple[NormalizedDict, str]
New in version 0.4.0.
Parse an RFC 822-style header field section from the given string and return a pair of a dictionary of the header fields and the rest of the string. As a message body is not consumed, calling this method when body is true will produce a MissingBodyError.
Deprecated since version 0.5.0: Instead combine Scanner.scan_next_stanza() with parse_stream()
- Parameters:
s (string) – the text to parse
- Return type: pair of NormalizedDict and a string
- Raises:
ParserError – if the input fields do not conform to the field definitions declared with add_field and add_additional
ScannerError – if a header section is malformed
- parse_stanzas(data: str | Iterable[str]) -> Iterator[NormalizedDict]
New in version 0.4.0.
Parse zero or more stanzas of RFC 822-style header fields from the given string, filehandle, or sequence of lines and return a generator of dictionaries of header fields.
All of the input is treated as header sections, not message bodies; as a result, calling this method when body is true will produce a MissingBodyError.
Changed in version 0.5.0: data can now be a string.
- Parameters:
data – a string, text-file-like object, or iterable of lines to parse
- Return type: generator of NormalizedDict
- Raises:
ParserError – if the input fields do not conform to the field definitions declared with add_field and add_additional
ScannerError – if a header section is malformed
- parse_stanzas_stream(fields: Iterable[Iterable[tuple[str, str]]]) -> Iterator[NormalizedDict]
New in version 0.4.0.
Parse an iterable of iterables of (name, value) pairs as returned by scan_stanzas() and return a generator of dictionaries of header fields. This is a low-level method that you will usually not need to call.
- Parameters:
fields – an iterable of iterables of pairs of strings
- Return type: generator of NormalizedDict
- Raises:
ParserError – if the input fields do not conform to the field definitions declared with add_field and add_additional
ScannerError – if a header section is malformed
- parse_stanzas_string(s: str) -> Iterator[NormalizedDict]
New in version 0.4.0.
Parse zero or more stanzas of RFC 822-style header fields from the given string and return a generator of dictionaries of header fields.
All of the input is treated as header sections, not message bodies; as a result, calling this method when body is true will produce a MissingBodyError.
Deprecated since version 0.5.0: Use parse_stanzas() instead.
- Parameters:
s (string) – the text to parse
- Return type: generator of NormalizedDict
- Raises:
ParserError – if the input fields do not conform to the field definitions declared with add_field and add_additional
ScannerError – if a header section is malformed
- parse_stream(fields: Iterable[tuple[str | None, str]]) -> NormalizedDict
Process a sequence of (name, value) pairs as returned by scan() and return a dictionary of header fields (possibly with body attached). This is a low-level method that you will usually not need to call.
- Parameters:
fields (iterable of pairs of strings) – a sequence of (name, value) pairs representing the input fields
- Return type: NormalizedDict
- Raises:
ParserError – if the input fields do not conform to the field definitions declared with add_field and add_additional
ValueError – if the input contains more than one body pair
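The shape of the input to parse_stream can be illustrated with a standalone sketch. It assumes, based on the str | None type hint and the "more than one body pair" error, that a pair whose name is None carries the message body; split_fields_and_body is a hypothetical helper and not part of the library.

```python
from typing import Iterable, List, Optional, Tuple


def split_fields_and_body(
    pairs: Iterable[Tuple[Optional[str], str]]
) -> Tuple[List[Tuple[str, str]], Optional[str]]:
    headers: List[Tuple[str, str]] = []
    body: Optional[str] = None
    for name, value in pairs:
        if name is None:
            # Assumption: a None name marks the body pair.
            if body is not None:
                raise ValueError("more than one body pair")
            body = value
        else:
            headers.append((name, value))
    return headers, body


headers, body = split_fields_and_body(
    [("Name", "Sample"), ("Tag", "demo"), (None, "Body text\n")]
)
print(headers)  # [('Name', 'Sample'), ('Tag', 'demo')]
print(body)
```

Supplying a second (None, ...) pair makes this sketch raise ValueError, mirroring the documented error condition.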
- parse_string(s: str) -> NormalizedDict
Parse an RFC 822-style header field section (possibly followed by a message body) from the given string and return a dictionary of the header fields (possibly with body attached).
Deprecated since version 0.5.0: Use parse() instead.
- Parameters:
s (string) – the text to parse
- Return type: NormalizedDict
- Raises:
ParserError – if the input fields do not conform to the field definitions declared with add_field and add_additional
ScannerError – if the header section is malformed