1 year ago
#309416

enzo
PHP PdfToText coordinates not recognized
I need help; this problem is driving me crazy.
I'm using the PdfToText library in PHP to extract one specific portion of text from a PDF, but I can't get it to work. The documentation here:
https://github.com/christian-vigh-phpclasses/PdfToText/
explains how to get the coordinates of the text in a PDF document, so I ran this code:
$pdf = new PdfToText('myfile.pdf', PdfToText::PDFOPT_DEBUG_SHOW_COORDINATES );
print_r($pdf->Text);
This returns a series of text lines with their coordinates, such as:
[Page : 1, width = 595, height = 850]
[x:31.41, y:726.38, w: 76.295, h:5.46, font:/Fo70]TEST 1
[x:117.81, y:726.38, w: 47.305, h:5.46, font:/Fo70] TEST 2
[x:319.41, y:726.38, w: 111.146, h:5.46, font:/Fo70] TEST 3
[x:511.41, y:726.38, w: 91.366, h:5.46, font:/Fo70] TEST 4
[x:31.41, y:711.88, w: 708.51, h:7.02, font:/Fo58]TEST 5
Now I would like to capture the line containing "TEST 5".
To do this I created the following XML file (test.xml):
<?xml version="1.0" encoding="utf-8" ?>
<captures>
<rectangle name="Test5">
<page number="1" left="31" right="600" top="711" bottom="690" />
</rectangle>
</captures>
What is not clear to me is how to determine the value of the "right" attribute. In any case, when I run the script below with the XML file above, I get no results:
$pdf = new PdfToText('myfile.pdf', PdfToText::PDFOPT_CAPTURE);
$pdf->SetCaptures('test.xml');
$captures = $pdf->GetCaptures();
var_dump($captures);
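As a sanity check, the comparison between a debug line and the capture rectangle can be sketched in plain PHP, independent of PdfToText. This is only a sketch under assumptions that the library may not share: that x and y mark the lower-left corner of the text, that the origin is at the bottom-left of the page (so right = x + w and top = y + h), and that a capture only matches text lying fully inside the rectangle. The helper names parseDebugLine and isInside are hypothetical and not part of the library.

```php
<?php
// Standalone sketch; it does not call PdfToText at all.

/**
 * Parse one PDFOPT_DEBUG_SHOW_COORDINATES line into a bounding box.
 * ASSUMPTION: (x, y) is the lower-left corner of the text and the origin is
 * the bottom-left of the page, so right = x + w and top = y + h.
 * Returns null for lines that carry no coordinates (e.g. the "[Page ...]" line).
 */
function parseDebugLine(string $line): ?array
{
    $pattern = '/^\[x:\s*([\d.]+),\s*y:\s*([\d.]+),\s*w:\s*([\d.]+),\s*h:\s*([\d.]+),\s*font:[^\]]*\]\s*(.*)$/';
    if (!preg_match($pattern, $line, $m)) {
        return null;
    }
    [$x, $y, $w, $h] = array_map('floatval', array_slice($m, 1, 4));

    return [
        'text'   => $m[5],
        'left'   => $x,
        'right'  => $x + $w,
        'bottom' => $y,
        'top'    => $y + $h,
    ];
}

/**
 * True when $box lies entirely inside $area.
 * ASSUMPTION: full containment is required; the library's rule may differ.
 */
function isInside(array $box, array $area): bool
{
    return $box['left'] >= $area['left']
        && $box['right'] <= $area['right']
        && $box['bottom'] >= $area['bottom']
        && $box['top'] <= $area['top'];
}

// The "TEST 5" line from the debug output.
$box = parseDebugLine('[x:31.41, y:711.88, w: 708.51, h:7.02, font:/Fo58]TEST 5');

// The rectangle declared in test.xml.
$area = ['left' => 31.0, 'right' => 600.0, 'top' => 711.0, 'bottom' => 690.0];

printf(
    "left=%.2f right=%.2f bottom=%.2f top=%.2f\n",
    $box['left'], $box['right'], $box['bottom'], $box['top']
);
var_dump(isInside($box, $area));
```

Under these assumptions the "TEST 5" line spans x from 31.41 to 739.92 and y from 711.88 to 718.90, so it is not contained in a rectangle with right="600" and top="711". Whether that explains the empty result depends on how PdfToText actually matches text against capture areas.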
I can't figure out what I'm doing wrong. Thanks in advance for any help. Greetings
php
pdftotext
0 Answers