Hi. When you try the following query, replace word with some word that could only appear in the PDF file:
Code:
select keyword from keywords where keyword like '%word%';
The file in the text_content directory contains the following:
Index of /pdf Name Last modified Size Description Parent Directory
28-Apr-2004 16:35 - 01123SOC2004013.PDF 28-Apr-2004 18:30 69k pdf.html
28-Apr-2004 18:18 1k test.doc 28-Apr-2004 17:24 19k zyz.xls 28-Apr-2004
17:24 14k Apache1.3.29 - ProXad [Apr 1 2004 16:04:22] Server at
monsiteweb.fr Port 80 Index of /pdf Index of /pdf Index of /pdf
That looks like the text of a directory listing rather than the text of the actual PDF file. The $result array contains the following:
Result contains: Array ( [0] => Hébergement [1] => Facture [2] => partners -- 5 Sq de tuile_ 78000 Versailles -- Tél. / Fax : 0666666666 -- Email : contact@partners.com [3] => SARL au capital de 3000# -- Siret545454445RCS Versailles -- APE 222Z -- Web : www.partners.com [4] => [5] => FACTURE [6] => partners CLIENT [7] => 5 Sq de tuile Adzd MAdzNdzAS [8] => 78000 Versailles [9] => Tél./fax. : 01 3226222626 [10] => Prestation : Hébergement [11] => Facture du: 01/04/2004 au 31/06/2004 [12] => N° de Facture: 12122/66 [13] => Article Objet Quantité [14] => / [15] => Slots [16] => Prix [17] => unitaire / [18] => Trimestre [19] => Montant TVA [20] => Hébergement Serveur [21] => Total HT 122.36 [22] => Total TVA 23.61 [23] => Total TTC 122.00 [24] => A payer 122.00 EUROS [25] => Mode de paiement : A réception de facture [26] => )
And with $retval being zero, the following code should make a temp file containing the stuff from the $result array:
PHP Code:
if (!$retval) {
    // the replacement if š is for unbreaking spaces
    // returned by catdoc parsing msword files
    // and '0xAD' "tiret quadratin" returned by pstotext
    // in iso-8859-1
    // Adjust with your encoding and/or your tools
    if ((is_array($result)) && (count($result) > 0)) {
        $f_handler = fopen($tempfile1,'wb');
        fwrite($f_handler,str_replace('š',' ',str_replace(chr(0xad),'-',implode(' ',$result))));
        fclose($f_handler);
    }
}
else {
    return array('tempfile'=>0,'tempfilesize'=>0);
}
Also, what do you get with the following query:
Code:
select file,first_words from spider where file like '%01123SOC2004013%';
And are the admin/temp and text_content directories set to 777 permissions?
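If you are not sure about the permissions, a small standalone script along these lines might help. It is only a sketch: check_dir() is a hypothetical helper (not part of the spider code), and the two paths are assumed to be relative to the directory you run the script from, so adjust them to your install.

```php
<?php
// Hypothetical helper, not part of the spider: reports whether a directory
// exists, whether the current PHP process can write to it, and its mode.
function check_dir($path) {
    if (!is_dir($path)) {
        return array('exists' => false, 'writable' => false, 'perms' => null);
    }
    return array(
        'exists'   => true,
        'writable' => is_writable($path),
        // Last four octal digits of the mode, e.g. "0777"
        'perms'    => substr(sprintf('%o', fileperms($path)), -4),
    );
}

// Assumption: paths are relative to the current working directory.
foreach (array('admin/temp', 'text_content') as $dir) {
    $info = check_dir($dir);
    if (!$info['exists']) {
        echo "$dir: not found\n";
    } else {
        echo "$dir: perms " . $info['perms'] . ', '
            . ($info['writable'] ? 'writable' : 'NOT writable') . "\n";
    }
}
```

Run it as the same user the web server (or the command-line spider) runs as, because is_writable() answers for the current process only.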