1 year ago

#296184


Code Guy

Get PDF coordinates for text extraction

I am using pdftotext to extract text from specific regions of a PDF, but finding the x, y, W and H coordinates of those regions is very difficult. I am unsure whether there are tools for this.

I have tried importing the PDF into Inkscape and GIMP, but the coordinate values they report do not correspond to the values pdftotext expects.

Please suggest a good open-source program or utility for finding the coordinates/layout. This is the kind of command I am running:

pdftotext -f 3 -l 3 -x 205 -y 40 -W 180 -H 75 -layout input.pdf - | sed 's/\(.*\)/\"\1\"/g' | tr '\n' ',' | sed 's/.$//'
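One possible way to find such values, offered as a sketch and not a confirmed solution: the Poppler build of pdftotext has a -bbox option that writes an XHTML listing with one <word xMin=... yMin=... xMax=... yMax=...> element per word. By default the -x, -y, -W and -H values are pixels at 72 DPI (the -r default), which should line up with those point-based boxes measured from the top-left corner of the page. The helper name bbox_to_crop below is hypothetical, and it assumes each <word> element sits on its own line with the attributes in the order shown above.

```shell
# bbox_to_crop PATTERN
# Reads `pdftotext -bbox` XHTML on stdin, keeps the <word> lines whose text
# matches PATTERN (an awk regular expression), and prints crop flags for the
# smallest rectangle covering all of them. Hypothetical helper, not part of
# pdftotext; it relies on the assumed attribute order xMin, yMin, xMax, yMax.
bbox_to_crop() {
  awk -F'"' -v pat="$1" '
    # Round up to the next whole unit so the box never cuts off a glyph.
    function ceil(v) { return (v == int(v)) ? v : int(v) + 1 }
    /<word / {
      # With a double quote as the field separator, field 9 is ">text</word>".
      text = $9
      sub(/^>/, "", text)
      sub(/<\/word>.*$/, "", text)
      if (text !~ pat) next
      x0 = $2 + 0; y0 = $4 + 0; x1 = $6 + 0; y1 = $8 + 0
      if (n == 0 || x0 < minx) minx = x0
      if (n == 0 || y0 < miny) miny = y0
      if (n == 0 || x1 > maxx) maxx = x1
      if (n == 0 || y1 > maxy) maxy = y1
      n++
    }
    END {
      if (n == 0) exit 1          # no word matched PATTERN
      x = int(minx); y = int(miny)
      printf "-x %d -y %d -W %d -H %d\n", x, y, ceil(maxx - x), ceil(maxy - y)
    }
  '
}

# Example use (requires Poppler's pdftotext; the pattern is made up):
#   pdftotext -bbox -f 3 -l 3 input.pdf - | bbox_to_crop 'Invoice|Total'
```

The printed flags can be pasted into the pdftotext command above. If a non-default -r is used, the values would need scaling by r/72.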

pdf

ocr

pdftotext

0 Answers
