01-15-2004, 07:58 AM | #1 |
Green Mole
Join Date: Jan 2004
Location: West Yorkshire
Posts: 3
|
PowerPoint
How easy is it to add another parser? I want to add ppthtml, so I inserted the following in config.php:
define('PHPDIG_INDEX_MSPOWERPOINT',true);
define('PHPDIG_PARSE_MSPOWERPOINT','/usr/bin/ppthtml');
define('PHPDIG_OPTION_MSPOWERPOINT','-s 8859-1');
Is there anything else I need to do?
__________________
YF Helpdesk |
01-18-2004, 09:31 AM | #2 |
Head Mole
Join Date: May 2003
Posts: 2,539
|
Hi. Several places in robot_functions.php would need editing, and another line would need to be added to the config file. Just search the PhpDig PHP files for "PDF" (case-insensitive) and you'll find all the places where a matching PowerPoint entry is needed.
__________________
Responses are offered on a voluntary if/as time is available basis, no guarantees. Double posting or bumping threads will not get your question answered any faster. No support via PM or email, responses not guaranteed. Thank you for your comprehension. |
01-19-2004, 03:16 AM | #3 |
Green Mole
|
Cheers, that worked in getting the PowerPoint extension recognised. I have two questions, if I may.
Q1. When PhpDig executes the binary, does it automatically dump the contents into the temp files located in /admin/temp? From what I can see, ppthtml does not produce an output file on its own. Running the program from the command line, i.e. # ppthtml filename.ppt, throws the output to the screen. If I add > filename.html, then it outputs to a file.
Q2. PhpDig can't seem to parse the PowerPoint file. Is it because ppthtml is not outputting a file? If so, how do I get around it?
01-19-2004, 07:21 AM | #4 |
Head Mole
|
Hi. If the output goes to STDOUT, use define('PHPDIG_MSPOWERPOINT_EXTENSION',''); in the config file. If the output goes to a file, use define('PHPDIG_MSPOWERPOINT_EXTENSION','.html'); instead.
PhpDig assigns the output filename itself, so '> filename.html' should not go in PHPDIG_OPTION_MSPOWERPOINT in the config file. For example, with pdftotext and no PHPDIG_OPTION_PDF set, the output is written to filename_set_by_PhpDig.txt, so only '.txt' should go in PHPDIG_PDF_EXTENSION.
The admin/temp directory is only a temporary holding place for processing. Once processing is done, the files are deleted from admin/temp, and a text file containing the output, whether from a web page or a PPT file, is kept in the text_content directory.
As 'ppthtml filename.ppt' throws its output to the screen, it is going to STDOUT, so the following should suffice: PHP Code:
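The code block from this post did not survive in this copy of the thread. Going by the settings discussed above, the config.php entries would plausibly look like the sketch below. The parser path and the '-s 8859-1' option are taken from post #1 and the empty extension from this reply, so treat it as a reconstruction under those assumptions, not the original code.

```php
// Reconstructed sketch for config.php, not the code originally posted.
// Enable PowerPoint indexing via ppthtml (path and option as given in post #1).
define('PHPDIG_INDEX_MSPOWERPOINT', true);
define('PHPDIG_PARSE_MSPOWERPOINT', '/usr/bin/ppthtml');
define('PHPDIG_OPTION_MSPOWERPOINT', '-s 8859-1');
// ppthtml writes to STDOUT, so the output-file extension is left empty.
define('PHPDIG_MSPOWERPOINT_EXTENSION', '');
```

If a different converter wrote its result to a file instead, only the last line would change, to the extension that converter appends (for example '.html').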
01-19-2004, 08:12 AM | #5 |
Green Mole
|
Thanks for your help. That worked a treat! As this is GNU/GPL, should I submit script updates in the Mod Submissions forum?
01-19-2004, 08:20 AM | #6 |
Head Mole
|
Hi. Glad it's working. The GNU/GPL does not currently require you to publish your modifications unless you release (distribute) them, but any script updates are welcome in the Mod Submissions forum.