1 year ago
#296184
Code Guy
Get PDF coordinates for text extraction
I am using pdftotext to extract text from specific regions of a PDF, but finding the x, y, W and H coordinates of those regions is very difficult. I am not sure whether there are tools to do that.
I have tried importing the PDF into Inkscape and GIMP, but the coordinate values they show do not correspond to the values pdftotext expects.
Please suggest a good open source program/utility for finding the coordinates / layout.
pdftotext -f 3 -l 3 -x 205 -y 40 -W 180 -H 75 -layout input.pdf - | sed 's/\(.*\)/\"\1\"/g' | tr '\n' ',' | sed 's/.$//'
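One possible way to find these values without a separate tool, sketched here under assumptions: poppler's pdftotext has a `-bbox` option that writes an XHTML file with `xMin`, `yMin`, `xMax` and `yMax` attributes for every word. I am assuming those values use the same top-left origin and units as `-x`/`-y` at the default 72 DPI, which should be checked against a known region first. The helper name `bbox_to_crop` and the file name `words.html` are made up for illustration.

```shell
# Step 1 (run manually): dump per-word bounding boxes for page 3.
#   pdftotext -f 3 -l 3 -bbox input.pdf words.html
# Step 2 (run manually): look up the words that bound the region of interest.
#   grep -i 'some word' words.html

# Hypothetical helper: turn a bounding box (xMin yMin xMax yMax) into
# integer crop flags. The top-left corner is rounded down and the
# bottom-right corner is rounded up so the box is fully covered.
# Assumes non-negative coordinates.
bbox_to_crop() {
  awk -v x0="$1" -v y0="$2" -v x1="$3" -v y1="$4" 'BEGIN {
    x = int(x0); y = int(y0)              # floor of the top-left corner
    r = int(x1); if (r < x1) r++          # ceiling of the right edge
    b = int(y1); if (b < y1) b++          # ceiling of the bottom edge
    printf "-x %d -y %d -W %d -H %d\n", x, y, r - x, b - y
  }'
}
```

For example, `bbox_to_crop 205.2 40.7 384.9 114.3` prints `-x 205 -y 40 -W 180 -H 75`, which can be pasted into the pdftotext command above.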
ocr
pdftotext
0 Answers