v 20200118

Python pdf extraction package

PDFMiner is a suite of programs that help extracting and analyzing text data of PDF documents. Unlike other PDF-related tools, it allows to obtain the exact location of texts in a page, as well as other extra information such as font information or ruled lines. It includes a PDF converter that can transform PDF files into other text formats (such as HTML). It has an extensible PDF parser that can be used for other purposes instead of text analysis. (Currently tracking pdfminer.six fork.)

Add to my watchlist

Installations 0
Requested Installations 0