pdf2docx.common.share module
Common methods.
- class pdf2docx.common.share.BlockType(*values)
Bases: Enum
Block types.
- FLOAT_IMAGE = 4
- IMAGE = 1
- LATTICE_TABLE = 2
- STREAM_TABLE = 3
- TEXT = 0
- UNDEFINED = -1
- class pdf2docx.common.share.IText
Bases: object
Text-related interface that takes text direction into account.
- property is_horizontal_text
Check whether text direction is from left to right.
- property is_mix_text
Check whether text direction is a mixture of left to right and bottom to top.
- property is_vertical_text
Check whether text direction is from bottom to top.
- property text_direction
Text direction is from left to right by default.
- class pdf2docx.common.share.RectType(*values)
Bases: Enum
Shape type in context.
- BORDER = 16
- HIGHLIGHT = 1
- HYPERLINK = 8
- SHADING = 32
- STRIKE = 4
- UNDERLINE = 2
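The member values above are distinct powers of two, which suggests they can be combined into a single integer bit mask. That reading is an inference from the values, not something this page states. A minimal sketch under that assumption, using a stand-in copy of the enum and two hypothetical helpers (`combine`, `has_type`) that are not part of pdf2docx:

```python
from enum import Enum


class RectType(Enum):
    """Stand-in copy of the documented members (not imported from pdf2docx)."""
    HIGHLIGHT = 1
    UNDERLINE = 2
    STRIKE = 4
    HYPERLINK = 8
    BORDER = 16
    SHADING = 32


def combine(*types):
    """Hypothetical helper: OR the member values into one integer mask."""
    mask = 0
    for t in types:
        mask |= t.value
    return mask


def has_type(mask, t):
    """Hypothetical helper: test whether a member's bit is set in the mask."""
    return bool(mask & t.value)


mask = combine(RectType.UNDERLINE, RectType.STRIKE)
print(mask)                               # 6
print(has_type(mask, RectType.STRIKE))    # True
print(has_type(mask, RectType.BORDER))    # False
```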
- class pdf2docx.common.share.TextAlignment(*values)
Bases: Enum
Text alignment.
Note
The difference between NONE and UNKNOWN:
NONE: none of left/right/center alignment applies, so a TAB stop is needed.
UNKNOWN: the alignment can’t be decided, e.g. there is only a single line.
- CENTER = 2
- JUSTIFY = 4
- LEFT = 1
- NONE = -1
- RIGHT = 3
- UNKNOWN = 0
- class pdf2docx.common.share.TextDirection(*values)
Bases: Enum
Text direction.
LEFT_RIGHT: from left to right within a line, and lines go from top to bottom.
BOTTOM_TOP: from bottom to top within a line, and lines go from left to right.
MIX: a mixture of LEFT_RIGHT and BOTTOM_TOP.
IGNORE: neither LEFT_RIGHT nor BOTTOM_TOP.
- BOTTOM_TOP = 1
- IGNORE = -1
- LEFT_RIGHT = 0
- MIX = 2
- pdf2docx.common.share.cmyk_to_rgb(c: float, m: float, y: float, k: float, cmyk_scale: float = 100)
CMYK components to RGB value.
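The conversion can be illustrated with the common CMYK-to-RGB formula, where each channel is `(1 - ink) * (1 - k)` after dividing by `cmyk_scale`. This is a sketch, not the library's source: the function name `cmyk_to_rgb_sketch` is hypothetical, and packing the result into a single 24-bit integer is an assumption based on the word "value" in the description.

```python
def cmyk_to_rgb_sketch(c, m, y, k, cmyk_scale=100.0):
    """Convert CMYK components to a packed 24-bit RGB integer (assumed form)."""
    # Normalise each component to [0, 1] and apply the standard formula.
    r = (1.0 - c / cmyk_scale) * (1.0 - k / cmyk_scale)
    g = (1.0 - m / cmyk_scale) * (1.0 - k / cmyk_scale)
    b = (1.0 - y / cmyk_scale) * (1.0 - k / cmyk_scale)

    # Scale to 0..255 and pack as 0xRRGGBB.
    red, green, blue = (int(255 * x) for x in (r, g, b))
    return (red << 16) | (green << 8) | blue


print(cmyk_to_rgb_sketch(0, 100, 100, 0))  # 16711680, i.e. pure red
print(cmyk_to_rgb_sketch(0, 0, 0, 100))    # 0, i.e. black
```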
- pdf2docx.common.share.debug_plot(title: str, show=True)
Plot the objects returned by the inner function.
- Args:
title (str): Page title.
show (bool, optional): Don’t plot if show==False. Defaults to True.
Note
- Prerequisite of the inner function:
the first argument is a BasePage instance.
the last argument is configuration parameters of dict type.
- pdf2docx.common.share.decode(s: str)
Try to decode a unicode string.
- pdf2docx.common.share.flatten(items, klass)
Yield items from any nested iterable.
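A recursive generator is one way to get this behaviour. The sketch below assumes that `klass` names the container type to descend into, so anything that is not an instance of `klass` is yielded as a leaf; this page does not spell out the role of `klass`, and `flatten_sketch` is a hypothetical name.

```python
def flatten_sketch(items, klass):
    """Yield leaf items, descending into any element that is a klass instance."""
    for item in items:
        if isinstance(item, klass):
            # Nested container of the given type: recurse into it.
            yield from flatten_sketch(item, klass)
        else:
            yield item


print(list(flatten_sketch([1, [2, [3, 4]], 5], list)))  # [1, 2, 3, 4, 5]
```

Restricting the recursion to one container type keeps strings and other iterables intact instead of splitting them into characters.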
- pdf2docx.common.share.is_list_item(text, bullets=True, numbers=True)
Returns text if bullets is true and text is a bullet character, or numbers is true and text is not empty and consists entirely of digits 0-9. Otherwise returns None.
If bullets is True we use an internal list of bullet characters; otherwise it should be a list of integer Unicode values.
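The rules above can be sketched as follows. The library's internal bullet list is not shown on this page, so `DEFAULT_BULLETS` is a hypothetical stand-in with three common bullet code points, and `is_list_item_sketch` is an illustrative name rather than the real function.

```python
# Hypothetical stand-in for the internal bullet list (Unicode code points).
DEFAULT_BULLETS = (0x2022, 0x25E6, 0x25AA)  # bullet, white bullet, small square


def is_list_item_sketch(text, bullets=True, numbers=True):
    """Return text if it looks like a bullet or an all-digit number, else None."""
    if bullets is True:
        bullets = DEFAULT_BULLETS
    # Bullet check: a single character whose code point is in the list.
    if bullets and len(text) == 1 and ord(text) in bullets:
        return text
    # Number check: non-empty and made only of the digits 0-9.
    if numbers and text and all(ch in "0123456789" for ch in text):
        return text
    return None


print(is_list_item_sketch("12"))                  # 12
print(is_list_item_sketch("12a"))                 # None
print(is_list_item_sketch("-", bullets=[0x2D]))   # -
```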
- pdf2docx.common.share.is_number(str_number)
Check whether str_number can be converted to a float.
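A check like this is usually written by attempting the conversion and catching the failure. A minimal sketch, with the hypothetical name `is_number_sketch`; whether the library also guards against non-string input such as `None` is an assumption here.

```python
def is_number_sketch(str_number):
    """Return True if float() accepts the input, False otherwise."""
    try:
        float(str_number)
    except (TypeError, ValueError):
        # ValueError: malformed string; TypeError: unsupported type such as None.
        return False
    return True


print(is_number_sketch("3.14"))   # True
print(is_number_sketch("1e-3"))   # True
print(is_number_sketch("abc"))    # False
```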
- class pdf2docx.common.share.lazyproperty(func)
Bases: object
Calculate only once and cache property value.
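One common way to build such a decorator is a non-data descriptor that stores the computed value in the instance's `__dict__` under the same name, so later lookups bypass the descriptor entirely. The sketch below shows that general technique; it is not taken from the library's source, and `lazyproperty_sketch` and `Circle` are hypothetical names.

```python
class lazyproperty_sketch:
    """Non-data descriptor: compute on first access, then cache on the instance."""

    def __init__(self, func):
        self.func = func
        self.name = func.__name__

    def __get__(self, instance, owner):
        if instance is None:
            # Accessed on the class itself: return the descriptor.
            return self
        value = self.func(instance)
        # The instance attribute shadows this descriptor on later lookups.
        instance.__dict__[self.name] = value
        return value


class Circle:
    def __init__(self, radius):
        self.radius = radius
        self.calls = 0  # counts how often area is actually computed

    @lazyproperty_sketch
    def area(self):
        self.calls += 1
        return 3.14159 * self.radius ** 2


c = Circle(2)
c.area
c.area
print(c.calls)  # 1
```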
- pdf2docx.common.share.lower_round(number: float, ndigits: int = 0)
Round number down to the specified number of digits, e.g. lower_round(1.26, 1) = 1.2.
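Rounding down to a given number of digits can be sketched by scaling, flooring, and scaling back. The name `lower_round_sketch` is hypothetical, and the use of `math.floor` (which rounds negative numbers toward negative infinity) is an assumption; the library may truncate toward zero instead.

```python
import math


def lower_round_sketch(number, ndigits=0):
    """Round number down to ndigits decimal places."""
    scale = 10.0 ** ndigits
    return math.floor(number * scale) / scale


print(lower_round_sketch(1.26, 1))  # 1.2
print(lower_round_sketch(3.99))     # 3.0
```

Because the scaling happens in binary floating point, a value that sits just below a decimal boundary after multiplication can land one step lower than expected.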
- pdf2docx.common.share.new_page(doc, width: float, height: float, title: str)
Insert a new page with given title.
- Args:
doc (fitz.Document): pdf document object.
width (float): Page width.
height (float): Page height.
title (str): Page title shown in page.
- pdf2docx.common.share.rgb_component(srgb: int)
Convert an sRGB integer value to R, G, B components, e.g. 16711680 -> (255, 0, 0).
Equivalent to the PyMuPDF built-in method:
[int(255*x) for x in fitz.sRGB_to_pdf(srgb)]
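The same split can be done without PyMuPDF by treating the integer as `0xRRGGBB` and extracting each byte with shifts and masks. A minimal sketch with the hypothetical name `rgb_component_sketch`; returning a tuple follows the example above.

```python
def rgb_component_sketch(srgb):
    """Split a 24-bit 0xRRGGBB integer into (R, G, B) components in 0..255."""
    r = (srgb >> 16) & 0xFF
    g = (srgb >> 8) & 0xFF
    b = srgb & 0xFF
    return r, g, b


print(rgb_component_sketch(16711680))  # (255, 0, 0)
print(rgb_component_sketch(0x336699))  # (51, 102, 153)
```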
- pdf2docx.common.share.rgb_component_from_name(name: str = '')
Get a named RGB color (or random color) from fitz predefined colors, e.g. ‘red’ -> (1.0,0.0,0.0).
- pdf2docx.common.share.rgb_to_value(rgb: list)
RGB components to decimal value, e.g. (1,0,0) -> 16711680.
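Going the other way packs three components into one integer. The example above implies components on a 0 to 1 scale, so the sketch below assumes that range and scales each component to 0..255 before packing; `rgb_to_value_sketch` is a hypothetical name.

```python
def rgb_to_value_sketch(rgb):
    """Pack (r, g, b) components in [0, 1] into a 24-bit 0xRRGGBB integer."""
    r, g, b = (int(255 * x) for x in rgb)
    return (r << 16) | (g << 8) | b


print(rgb_to_value_sketch((1, 0, 0)))  # 16711680
print(rgb_to_value_sketch((0, 1, 0)))  # 65280
```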
- pdf2docx.common.share.rgb_value(components: list)
Gray/RGB/CMYK mode components to color value.