Command Line Interface¶
$ pdf2docx --help
NAME
pdf2docx - Command line interface for pdf2docx.
SYNOPSIS
pdf2docx COMMAND | -
DESCRIPTION
Command line interface for pdf2docx.
COMMANDS
COMMAND is one of the following:
convert
Convert pdf file to docx file.
debug
Convert one PDF page and plot layout information for debugging.
table
Extract table content from pdf pages.
By range of pages¶
Specify pages range by --start (from the first page if omitted) and
--end (to the last page if omitted).
Note
The page index is zero-based by default, but can turn it off by
--zero_based_index=False, i.e. the first page index starts from 1.
Convert all pages:
$ pdf2docx convert test.pdf test.docx
Convert pages from the second to the end:
$ pdf2docx convert test.pdf test.docx --start=1
Convert pages from the first to the third (index=2):
$ pdf2docx convert test.pdf test.docx --end=3
Convert second and third pages:
$ pdf2docx convert test.pdf test.docx --start=1 --end=3
Convert the first and second pages with zero-based index turn off:
$ pdf2docx convert test.pdf test.docx --start=1 --end=3 --zero_based_index=False
By page numbers¶
Convert the first, third and 5th pages:
$ pdf2docx convert test.pdf test.docx --pages=0,2,4
Multi-Processing¶
Turn on multi-processing with default count of CPU:
$ pdf2docx convert test.pdf test.docx --multi_processing=True
Specify the count of CPUs:
$ pdf2docx convert test.pdf test.docx --multi_processing=True --cpu_count=4