
configuration problems with windows paths with blanks/spaces
Closed, Invalid (Public)

Description

Using a customised configuration as described in [1] does not work on some Windows systems. I suspect the cause is the blanks in the standard installation paths of Windows 2003, "Program Files" and "Program Files (x86)", where gs, IM and xpdf install by default.

For example, PdfHandler does not work with the following configuration. The extension shows none of the behaviour described in [2].

<pre>

// PdfHandler configuration

$wgPdfProcessor = 'C:\Program Files\gs\gs9.07\bin\gswin64.exe';
$wgPdfPostProcessor = $wgImageMagickConvertCommand; // if defined via ImageMagick
$wgPdfPostProcessor = 'C:\Program Files (x86)\ImageMagick-6.6.9-Q16\convert.exe'; // if not defined via ImageMagick
$wgPdfInfo = 'C:\Program Files\xpdf\bin64\pdfinfo.exe';
$wgPdftoText = 'C:\Program Files\xpdf\bin64\pdftotext.exe';
</pre>

Running /maintenance/runjobs.php always produces the following error:

<pre>
Der Befehl "C:\Program" ist entweder falsch geschrieben oder konnte nicht gefunden werden.
(English: The command "C:\Program" is either misspelled or could not be found.)
</pre>

[1] https://www.mediawiki.org/wiki/Extension:PdfHandler

[2] https://www.mediawiki.org/wiki/Extension:PdfHandler#Usage


Version: REL1_23-branch
Severity: major
OS: Windows Server 2003
Platform: PC
See Also:
https://bugzilla.wikimedia.org/show_bug.cgi?id=72044

Details

Reference
bz70083

Related Objects

Duplicates Merged Here
T77757: Check bug 70083

Event Timeline

bzimport raised the priority of this task to Medium. Nov 22 2014, 3:30 AM
bzimport set Reference to bz70083.
bzimport added a subscriber: Unknown Object (MLST).

I created a test case at

https://aquanautweb.de/testwiki/index.php?title=Datei:0003_1110_Technische_Dokumentation.pdf

Bug: There is no preview of the PDF file within the page.

Expectation: There is a preview of the PDF file within the page.

Does wrapping the path with double quotes instead of single quotes help?

You mean like this:

require_once "$IP/extensions/PdfHandler/PdfHandler.php";

// PdfHandler configuration

$wgPdfProcessor = "C:\Program Files\gs9.07\bin\gswin64.exe";
$wgPdfPostProcessor = $wgImageMagickConvertCommand; // if defined via ImageMagick
$wgPdfPostProcessor = "C:\Program Files (x86)\ImageMagick-6.6.9-Q16\convert.exe"; // if not defined via ImageMagick
$wgPdfInfo = "C:\Program Files\xpdf\bin64\pdfinfo.exe";
$wgPdftoText = "C:\Program Files\xpdf\bin64\pdftotext.exe";

No, that does not help.

Backslashes should be doubled inside a quoted string (whether single- or double-quoted). However, in this specific case I don't think it matters: PHP preserves the backslash if the following character does not form a valid escape sequence.
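As a small illustration of the point about escape sequences, the following sketch compares the single- and double-quoted forms of one of the paths from this report. The `C:\temp\new\tool.exe` path is a made-up example (it does not appear in this report) and is only there to show a case where double quotes would change the string.

```php
<?php
// Single quotes: only \\ and \' are escape sequences, so the path is kept literally.
$single = 'C:\Program Files\xpdf\bin64\pdfinfo.exe';

// Double quotes: \P, \x (not followed by a hex digit), \b and \p are not
// valid escape sequences, so PHP keeps the backslashes here as well.
$double = "C:\Program Files\xpdf\bin64\pdfinfo.exe";

var_dump( $single === $double ); // bool(true)

// Hypothetical path, not from this report: \t and \n ARE escape sequences
// in double quotes, so this string silently contains tabs and a newline.
$risky = "C:\temp\new\tool.exe";
$safe  = 'C:\\temp\\new\\tool.exe';

var_dump( $risky === $safe ); // bool(false)
```

Doubling the backslashes, as in `$safe`, works the same way in both quoting styles, which is why it is the safer habit for Windows paths in LocalSettings.php.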

All the variables mentioned in comment 0 are passed to wfEscapeShellArg, which should wrap them in double quotes. Can you enable debug logging ([[mw:Manual:How_to_debug]]) and look for a line starting with either "doTransform" or "retrieveMetaData" (without the quotes)? That should tell what the exact command was; you can then try to execute it manually.
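To make the quoting step more concrete, here is a minimal sketch of how an argument can be wrapped in double quotes for Windows. `quoteWindowsArg()` is a hypothetical helper written for this illustration; it is not the code of `wfEscapeShellArg`, and it assumes the usual Windows convention that backslashes directly before a double quote are doubled and the quote itself is escaped with a backslash.

```php
<?php
/**
 * Hypothetical helper for illustration only, NOT MediaWiki's wfEscapeShellArg().
 * Wraps one argument in double quotes following the usual Windows rules.
 */
function quoteWindowsArg( $arg ) {
	$out = '';
	$pendingBackslashes = 0;

	foreach ( str_split( $arg ) as $char ) {
		if ( $char === '\\' ) {
			// Hold backslashes back until we know what follows them.
			$pendingBackslashes++;
			continue;
		}
		if ( $char === '"' ) {
			// Backslashes directly before a quote are doubled, then the quote is escaped.
			$out .= str_repeat( '\\', $pendingBackslashes * 2 ) . '\\"';
		} else {
			// Backslashes before any other character are kept as they are.
			$out .= str_repeat( '\\', $pendingBackslashes ) . $char;
		}
		$pendingBackslashes = 0;
	}

	// Trailing backslashes would otherwise escape the closing quote, so double them.
	$out .= str_repeat( '\\', $pendingBackslashes * 2 );

	return '"' . $out . '"';
}

// A plain path with a space is simply wrapped in quotes:
echo quoteWindowsArg( 'C:\Program Files\xpdf\bin64\pdfinfo.exe' ), "\n";
// "C:\Program Files\xpdf\bin64\pdfinfo.exe"

// A path that already contains quotes gets those quotes escaped:
echo quoteWindowsArg( 'C:\"Program Files"\xpdf\bin64\pdfinfo.exe' ), "\n";
// "C:\\\"Program Files\"\xpdf\bin64\pdfinfo.exe"
```

The practical consequence is that a path containing spaces should not need any manual quoting in LocalSettings.php; quotes added by hand end up escaped inside the outer pair.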

Here are some lines extracted from the debug log.
I used the following in LocalSettings.php:

error_reporting( -1 );
ini_set( 'display_errors', 1 );
$wgShowExceptionDetails = true;
$wgDebugToolbar = true;
$wgDebugLogFile = 'MYPATH/MYDEBUGLOGTEXTFILENAME';

wfShellExec: "C:\Program Files\xpdf\bin64\pdfinfo.exe" -enc UTF-8 -l 9999999 -meta "C:\PHP\temp\phpDC.tmp"
PdfImage::retrieveMetaData: "C:\Program Files\xpdf\bin64\pdftotext.exe" "C:\PHP\temp\phpDC.tmp" -
wfShellExec: "C:\Program Files\xpdf\bin64\pdftotext.exe" "C:\PHP\temp\phpDC.tmp" -
wfShellExec: "C:\Program Files\xpdf\bin64\pdfinfo.exe" -enc UTF-8 -l 9999999 -meta "C:\PHP\temp\phpDC.tmp"
PdfImage::retrieveMetaData: "C:\Program Files\xpdf\bin64\pdftotext.exe" "C:\PHP\temp\phpDC.tmp" -
wfShellExec: "C:\Program Files\xpdf\bin64\pdftotext.exe" "C:\PHP\temp\phpDC.tmp" -

There is no "doTransform" within the logfile.

I tried to execute "C:\Program Files\xpdf\bin64\pdftotext.exe" (without the quotes) in the command prompt, and the result is odd: Windows cannot execute it and raises the same error as in Comment 1.

Der Befehl "C:\Program" ist entweder falsch geschrieben oder konnte nicht gefunden werden
(English: The command "C:\Program" is either misspelled or could not be found)

Putting quotes around "Program Files" in the command prompt works, so I tried this in LocalSettings.php:

C:\"Program Files"\xpdf\bin64\pdftotext.exe

This is what I get in my log file:

wfShellExec: "C:\\\"Program Files\"\xpdf\bin64\pdfinfo.exe" -enc UTF-8 -l 9999999 -meta "C:\PHP\temp\phpDD.tmp"
PdfImage::retrieveMetaData: "C:\\\"Program Files\"\xpdf\bin64\pdftotext.exe" "C:\PHP\temp\phpDD.tmp" -
wfShellExec: "C:\\\"Program Files\"\xpdf\bin64\pdftotext.exe" "C:\PHP\temp\phpDD.tmp" -
wfShellExec: "C:\\\"Program Files\"\xpdf\bin64\pdfinfo.exe" -enc UTF-8 -l 9999999 -meta "C:\PHP\temp\phpDD.tmp"
PdfImage::retrieveMetaData: "C:\\\"Program Files\"\xpdf\bin64\pdftotext.exe" "C:\PHP\temp\phpDD.tmp" -
wfShellExec: "C:\\\"Program Files\"\xpdf\bin64\pdftotext.exe" "C:\PHP\temp\phpDD.tmp" -

Using this kind of path in command prompt works.

C:\\\"Program Files\"\xpdf\bin64\pdftotext.exe

With this, runjobs.php no longer shows errors, but unfortunately my test page still does not show a preview, same as before.

https://aquanautweb.de/testwiki/index.php?title=Datei:0003_1110_Technische_Dokumentation.pdf

Linking to the PDF file on the page

https://aquanautweb.de/testwiki/index.php?title=Test

results in this entry in my log file:

Linker::makeImageLink: Datei:0003_1110_Technische_Dokumentation.pdf does not allow inline display

The wfShellExec debug output is correct, with quotes at the right places, so maybe the runjobs.php error message mentioned in comment 0 is not related to this issue? You can use [[mw:Manual:ShowJobs.php]] to see what jobs are stuck in the queue, and runjobs.php --type createPdfThumbnailsJob to make sure only PDF thumbnail jobs are executed.

At any rate, the Linker:: error message sounds like ImageHandler::canRender (and ultimately, PdfHandler:retrieveMetadata) failing. Can you manually run the pdfinfo debug line (without the wfShellExec: prefix, but including the quotes) and see what output it produces? If the tempfile in the last parameter does not exist anymore, you can just replace it with the path to the PDF file.

...\Testwiki\maintenance>C:\"Program Files"\xpdf\bin64\pdfinfo.exe -enc UTF-8 -l 9999999 -meta D:\MediawikiTestwiki_Images\f\f0\0003_1110_Technische_Dokumentation.pdf
Title: 1RITZ-pumpdata-...4RITZ-dimension-
Subject:
Keywords:
Author:
Creator: wPDF by WPCubed GmbH
Producer: C:\Programme\VSX\Spaix 2\Spaix2Aw
CreationDate: 02/21/11 00:00:00
ModDate: 02/21/11 00:00:00
Tagged: no
Form: none
Pages: 4
Encrypted: no
Page 1 size: 595 x 842 pts (A4)
Page 2 size: 595 x 842 pts (A4)
Page 3 size: 595 x 842 pts (A4)
Page 4 size: 595 x 842 pts (A4)
File size: 57638 bytes
Optimized: no
PDF version: 1.4

or the other way round with double backslashes

...\Testwiki\maintenance>C:\\\"Program Files\"\xpdf\bin64\pdfinfo.exe -enc UTF-8 -l 9999999 -meta D:\MediawikiTestwiki_Images\f\f0\0003_1110_Technische_Dokumentation.pdf
Title: 1RITZ-pumpdata-...4RITZ-dimension-
Subject:
Keywords:
Author:
Creator: wPDF by WPCubed GmbH
Producer: C:\Programme\VSX\Spaix 2\Spaix2Aw
CreationDate: 02/21/11 00:00:00
ModDate: 02/21/11 00:00:00
Tagged: no
Form: none
Pages: 4
Encrypted: no
Page 1 size: 595 x 842 pts (A4)
Page 2 size: 595 x 842 pts (A4)
Page 3 size: 595 x 842 pts (A4)
Page 4 size: 595 x 842 pts (A4)
File size: 57638 bytes
Optimized: no
PDF version: 1.4

Looks good or not?

This looks fine. Can you try running

maintenance/runJobs.php --type createPdfThumbnailsJob

and see if you still get the error about not finding "C:\Program"?

I reuploaded the file.

Here is the result of runjobs.php

C:\Inetpub\Www_root\Testwiki\maintenance>php showjobs.php
1

C:\Inetpub\Www_root\Testwiki\maintenance>php runjobs.php --type createPdfThumbnailsJob

C:\Inetpub\Www_root\Testwiki\maintenance>

No job of this type?

The job in queue was htmlCacheUpdate.

carchaias, can you clarify that

runjobs.php --type htmlCacheUpdate

gives the "C:\Program" error message? That would be an error of its own.

As for the "does not allow inline display" part, can you check what's stored in the image.img_metadata database field for this file?

There are no error messages on doing runjobs.php --type htmlCacheUpdate:

C:\Inetpub\Www_root\Testwiki\maintenance>php runjobs.php --type htmlCacheUpdate

2014-09-12 06:59:59 htmlCacheUpdate Datei:0003_1110_Technische_Dokumentation.pdf pages=array(2) rootJobSignature=479d9c18d68e2a2ec2176268810ef88b37ab0de6 rootJobTimestamp=20140911084944 STARTING
2014-09-12 06:59:59 htmlCacheUpdate Datei:0003_1110_Technische_Dokumentation.pdf pages=array(2) rootJobSignature=479d9c18d68e2a2ec2176268810ef88b37ab0de6 rootJobTimestamp=20140911084944 t=16 good

C:\Inetpub\Www_root\Testwiki\maintenance>

Content of image.img_metadata database field is

b:0;
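For readers unfamiliar with the format: `b:0;` is PHP's serialized representation of boolean false. The short check below shows this. Reading it as "metadata extraction returned false for this file" is an assumption that this thread does not confirm.

```php
<?php
// 'b:0;' is exactly what serialize() produces for boolean false.
var_dump( serialize( false ) );    // string(4) "b:0;"

// Unserializing the stored value yields false again.
var_dump( unserialize( 'b:0;' ) ); // bool(false)

// unserialize() also returns false on failure, so compare the raw
// string with serialize( false ) to tell the two cases apart.
$stored = 'b:0;';
var_dump( $stored === serialize( false ) ); // bool(true)
```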

Adding Chris since error messages about "C:\Program" are, in general, not a good sign. I have been unable to find any problematic code in PdfHandler, though.

carchaias, can you test the job with no extra " in the paths? That is, using

$wgPdftoText = 'C:\Program Files\xpdf\bin64\pdftotext.exe';

and not

$wgPdftoText = 'C:\"Program Files"\xpdf\bin64\pdftotext.exe';

and same for the other settings. These parameters should work without manual escaping; if they don't, that's a bug, and I would really like to pinpoint the cause.

Hi, I changed LocalSettings.php to:

$wgPdfProcessor = 'C:\Program Files\gs9.07\bin\gswin64.exe';
$wgPdfPostProcessor = $wgImageMagickConvertCommand; // if defined via ImageMagick
$wgPdfPostProcessor = "C:\Program Files (x86)\ImageMagick-6.6.9-Q16\convert.exe"; // if not defined via ImageMagick
$wgPdfInfo = 'C:\Program Files\xpdf\bin64\pdfinfo.exe';
$wgPdftoText = 'C:\Program Files\xpdf\bin64\pdftotext.exe';

I then made a change to the page Datei:0003_1110_Technische_Dokumentation.pdf.

As far as I can see, there is no difference.

Results:

No PDF preview on the page
No error message while running runjobs.php

C:\Inetpub\Www_root\Testwiki\maintenance>php showjobs.php
1

C:\Inetpub\Www_root\Testwiki\maintenance>php runjobs.php
2014-09-15 13:05:28 refreshLinks Datei:0003_1110_Technische_Dokumentation.pdf table=imagelinks recursive=1 rootJobSignature=052088007c075a9829b1876e7fd6770eba8951a1 rootJobTimestamp=20140915130451 STARTING
2014-09-15 13:05:28 refreshLinks Datei:0003_1110_Technische_Dokumentation.pdf table=imagelinks recursive=1 rootJobSignature=052088007c075a9829b1876e7fd6770eba8951a1 rootJobTimestamp=20140915130451 t=10 good
2014-09-15 13:05:28 refreshLinks Datei:0003_1110_Technische_Dokumentation.pdf pages=array(1) rootJobSignature=052088007c075a9829b1876e7fd6770eba8951a1 rootJobTimestamp=20140915130451 masterPos= STARTING
2014-09-15 13:05:29 refreshLinks Datei:0003_1110_Technische_Dokumentation.pdf pages=array(1) rootJobSignature=052088007c075a9829b1876e7fd6770eba8951a1 rootJobTimestamp=20140915130451 masterPos= t=309 good
2014-09-15 13:05:29 refreshLinks Datei:0003_1110_Technische_Dokumentation.pdf pages=array(1) rootJobSignature=052088007c075a9829b1876e7fd6770eba8951a1 rootJobTimestamp=20140915130451 masterPos= STARTING
2014-09-15 13:05:29 refreshLinks Datei:0003_1110_Technische_Dokumentation.pdf pages=array(1) rootJobSignature=052088007c075a9829b1876e7fd6770eba8951a1 rootJobTimestamp=20140915130451 masterPos= t=34 good

I can't find any real "Error" in the debug_log.txt file. There is only one message, which occurs on reuploads (it was there before the changes as well):

Start request POST /testwiki/index.php?title=Spezial:Hochladen
...
FSFileBackend::getFileListInternal() given directory does not exist: 'd:\MediawikiTestwiki_Images/thumb/f/f0/0003_1110_Technische_Dokumentation.pdf'
...

Which is right: The directory does not exist.

If there is a problem with a media handler, it will only occur when the file is changed or reuploaded (or purged, although not 100% sure of that), not when the file description page is edited.

(In reply to carchaias from comment #16)

FSFileBackend::getFileListInternal() given directory does not exist: 'd:\MediawikiTestwiki_Images/thumb/f/f0/0003_1110_Technische_Dokumentation.pdf'
...

Which is right: The directory does not exist.

Is it possible that the thumbnail directory is misconfigured? Does thumbnailing work for other file types?

As you can see on

https://aquanautweb.de/testwiki/index.php?title=Spezial:Dateien

thumbnailing works in general. There is one issue with uppercase file extensions like .JPG, which is related to Bug 66667.

..it will only occur when the file is changed or reuploaded (or purged, although not 100% sure of that), not when the file description page is edited.

That's right.

I checked if thumbnailing still works and reuploaded a changed picture

https://aquanautweb.de/testwiki/index.php?title=Datei:Appointment.png

It does.

The message appears for those files too.

Start request POST /testwiki/index.php?title=Spezial:Hochladen
...
FSFileBackend::getFileListInternal() given directory does not exist: 'd:\MediawikiTestwiki_Images/thumb/f/f8/Appointment.png'

Admittedly, this file is pretty small and it is a PNG. The message does not appear for larger JPG files such as

https://aquanautweb.de/testwiki/index.php?title=Datei:DSCN1499.jpg

(In reply to carchaias from comment #16)

Start request POST /testwiki/index.php?title=Spezial:Hochladen
...
FSFileBackend::getFileListInternal() given directory does not exist: 'd:\MediawikiTestwiki_Images/thumb/f/f0/0003_1110_Technische_Dokumentation.pdf'
...

Which is right: The directory does not exist.

What's the correct path? d:\MediawikiTestwiki_Images/f/f0/0003_1110_Technische_Dokumentation.pdf ? Do you have any custom configuration for file backends, thumbnail directories etc?

The path to the file (d:\MediawikiTestwiki_Images/f/f0/0003_1110_Technische_Dokumentation.pdf) is correct, but for some files there is no entry in the thumb directory (d:\MediawikiTestwiki_Images/thumb).

I think files that are small in height and width don't get a thumbnail file. My 0003_1110_Technische_Dokumentation.pdf is recognised as 0x0 px, as you can see on its page, so it does not have a thumbnail file either (but it should have one). Thumbnails are only created when "necessary" [2].

LocalSettings section:

$wgUploadPath = "https://aquanautweb.de/Testwiki/MediawikiTestwiki_Images";

$wgUploadDirectory = "d:\MediawikiTestwiki_Images";

and we use img_auth as described in [1], which generally works.

[1] http://www.mediawiki.org/wiki/Manual:Image_Authorization#IIS%20Instructions
[2] http://www.mediawiki.org/wiki/Manual:Image_administration#Data%20storage

TheDJ subscribed.

The wrappers used to call out to binaries have changed twice since this report, so this report is no longer reproducible. If it still does not work on Windows, please open a new ticket against the current version of MediaWiki.