pdf2docx.image.Image module

Image object.

The raw data follows the image block structure described at https://pymupdf.readthedocs.io/en/latest/textpage.html:

{
    'type': 1,
    'bbox': (x0,y0,x1,y1),
    'width': w,
    'height': h,
    'image': b'',

    # --- discarded properties ---
    'ext': 'png',
    'colorspace': n,
    'xres': xres, 'yres': yres, 'bpc': bpc
}
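
As a minimal sketch of how the layout above might be consumed, the following separates the kept properties from the discarded ones. The helper name `split_raw_image` and all sample values are hypothetical and not part of pdf2docx; only the key names are taken from the structure above.

```python
# Keys listed before the "discard properties" comment in the structure above.
KEPT_KEYS = ('type', 'bbox', 'width', 'height', 'image')


def split_raw_image(raw: dict):
    """Hypothetical helper: split a raw image block into kept and discarded keys."""
    kept = {k: v for k, v in raw.items() if k in KEPT_KEYS}
    discarded = {k: v for k, v in raw.items() if k not in KEPT_KEYS}
    return kept, discarded


# Illustrative values only; a real block would come from PyMuPDF.
raw = {
    'type': 1,                           # 1 marks an image block
    'bbox': (72.0, 72.0, 172.0, 122.0),  # (x0, y0, x1, y1)
    'width': 200,
    'height': 100,
    'image': b'\x89PNG',
    'ext': 'png',
    'colorspace': 3,
    'bpc': 8,
}

kept, discarded = split_raw_image(raw)
```
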
class pdf2docx.image.Image.Image(raw: dict = None)

Bases: Element

Base image object.

from_image(image)

Update this object with the properties of an image block/span.

Args:

image (Image): Target image block/span.

make_docx(paragraph)

Add image span to a docx paragraph.

plot(page, color: tuple)

Plot the image bbox with diagonal lines (for debugging purposes).

Args:

page (fitz.Page): Plotting page.
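
To illustrate the geometry only, the sketch below computes the two diagonals of a bbox given as (x0, y0, x1, y1). The function name `bbox_diagonals` is hypothetical. The drawing calls on the fitz.Page are deliberately left out, since this page does not say which ones the method uses.

```python
def bbox_diagonals(bbox):
    """Hypothetical helper: return the two diagonal segments of a bbox.

    Each segment is a pair of (x, y) points.
    """
    x0, y0, x1, y1 = bbox
    return [
        ((x0, y0), (x1, y1)),  # from the (x0, y0) corner to the (x1, y1) corner
        ((x0, y1), (x1, y0)),  # from the (x0, y1) corner to the (x1, y0) corner
    ]


segments = bbox_diagonals((0, 0, 10, 5))
```
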

store()

Store the image as a base64-encoded string.

  • Encode the image bytes with base64, giving base64 bytes.

  • Decode the base64 bytes to str so that the result can be serialized as JSON.
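
The two steps can be sketched with the standard library alone. The function names `encode_image` and `decode_image` and the sample bytes are assumptions made for illustration; they are not the names used inside pdf2docx.

```python
import base64
import json


def encode_image(image_bytes: bytes) -> str:
    """Encode raw image bytes as a JSON-safe string."""
    b64_bytes = base64.b64encode(image_bytes)  # bytes -> base64 bytes
    return b64_bytes.decode()                  # base64 bytes -> str


def decode_image(text: str) -> bytes:
    """Reverse of encode_image: recover the raw image bytes."""
    return base64.b64decode(text)


# Raw bytes cannot be passed to json.dumps directly; the str form can.
payload = json.dumps({'image': encode_image(b'\x89PNG\r\n')})
restored = decode_image(json.loads(payload)['image'])
```

The base64 string is larger than the raw bytes, which is the price paid for being able to keep the image inside a JSON document.
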

property text

Get an image placeholder <image>.