When creating PDF files, many files are missing; see https://en.wikipedia.org/w/index.php?title=Wikipedia:Village_pump_%28technical%29&oldid=563383020#Infobox_images_missing_from_PDFs
Version: unspecified
Severity: enhancement
From that VPT section:
I think you'll find it has left out the "fair use" images. That's probably a deliberate design choice, though I can't immediately find it documented anywhere. -- John of Reading (talk) 15:50, 7 July 2013 (UTC)
Betacommand: is that the case? Do you only see this happening when the images are used on WP under a fair use claim? My guess is the vast majority (99%) of images in the category referenced (marvel comics related) are used under a fair use claim. Can you reproduce on another article that doesn't have any fair use images?
Betacommand tells me on IRC that yes, this only occurs when the images are marked as Fair Use. Retitling bug as such. This may be a WONTFIX issue. I'll let the extension authors/those interested weigh in.
volker.haas wrote:
It is not possible to include fair use images in PDFs due to potential copyright issues. If you want to generate a PDF for personal use, you could install the PDF rendering software (mwlib and mwlib.rl) on your local machine and disable filtering of images.
How is it any more of a copyright issue than serving the existing article? If a user decides they want to include the files, there should be an option to override the filtering. It wouldn't affect the default process but would enable better offline access.
I'm going to pre-emptively nip this copyright conversation in the bud, at least the aspect of having it in the bug tracker.
!! Please do not debate the relative merits of any interpretation of copyright law in the bug tracker. !!
The local wiki is the correct place to have this discussion as that is where the various lists of excluded categories/templates live (per wiki).
This "Feature" isn't controlled at the local level. This was a <s>feature</s> Bug that was created by developers, implemented by developers, without ever asking local users.
What would be great would be a parameter that could be passed to this process that overrode the image filters.
Volker: can you comment on the history of this decision (to add that functionality)? Maybe pointing to any public discussion?
(In reply to comment #9)
Volker: can you comment on the history of this decision (to add that functionality)? Maybe pointing to any public discussion?
Greg: have you tried looking up the relevant code? It would be helpful if someone could paste the code here, perhaps with the accompanying SVN revision or Git commit. :-)
volker.haas wrote:
@Betacommand:
There is no bug, the software works as intended. The software is just configured in a way that does not suit your current need. At the moment it is not possible to pass any user-configuration for specific PDFs/collections to the rendering software. Therefore it is not possible to include fair-use images for specific collections.
@greg:
I can't point you to any public discussion regarding that issue. There has been lots of talk regarding image copyrights related to the Collection extension, but I don't remember anything specific about fair use images.
@MZMcBride:
The code is not really the problem, it's the configuration which explicitly removes fair use images.
All this happens in mwlib's licensechecker https://github.com/pediapress/mwlib/blob/master/mwlib/writer/licensechecker.py
The licensechecker supports three modes: blacklist, whitelist, and nofilter.
The license information is imported from a csv file ( https://github.com/pediapress/mwlib/blob/master/mwlib/writer/wplicenses.csv ). This file contains the info about fair use images:
"Fairuse",,"nonfree",,"- Copyrighted content that may be used as ''fair-use'', but since the commons does not accept ''fair use'' content this image will need to be deleted. [[Commons:Licensing#Material under the fair use clause is not allowed on the Commons|See here for details why]]."
The PDF writer is currently configured to use blacklisting for all Wikipedia projects except the German Wikipedia. On the German Wikipedia we use whitelisting, following a community uproar about images with an unknown license being included in the PDFs.
( https://github.com/pediapress/mwlib.rl/blob/master/mwlib/rl/rlwriter.py#L190 )
So, that's pretty much all I can say about that topic.
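To make the filtering behaviour discussed in this thread concrete, here is a minimal sketch of how a blacklist, whitelist, or nofilter decision could be made from a license table. It is not mwlib's actual code: the function names, the sample data, the "free" license type, and the assumption that the first CSV column is the template name and the third is the license type are all hypothetical and only modelled on the "Fairuse" row quoted above.

```python
import csv
import io

# Hypothetical sample rows, laid out like the "Fairuse" row quoted above.
# The second row and its "free" type are invented for illustration.
SAMPLE_CSV = '''"Fairuse",,"nonfree",,"Copyrighted content used under a fair-use claim"
"Cc-by-sa-3.0",,"free",,"Creative Commons Attribution-ShareAlike 3.0"
'''


def load_license_types(csv_text):
    """Map lower-cased template name -> license type.

    Assumes column 0 is the template name and column 2 the license type.
    """
    types = {}
    for row in csv.reader(io.StringIO(csv_text)):
        if len(row) >= 3 and row[0]:
            types[row[0].lower()] = row[2]
    return types


def image_allowed(templates, license_types, mode="blacklist"):
    """Decide whether an image may be rendered, given its license templates."""
    if mode == "nofilter":
        # No filtering at all: every image is kept.
        return True
    known = [license_types[t.lower()] for t in templates
             if t.lower() in license_types]
    if mode == "blacklist":
        # Drop only images that carry a known non-free license.
        return "nonfree" not in known
    if mode == "whitelist":
        # Keep only images with a known free license and no non-free one.
        return "free" in known and "nonfree" not in known
    raise ValueError(f"unknown mode: {mode!r}")


LICENSE_TYPES = load_license_types(SAMPLE_CSV)
```

Under these assumptions, an image tagged only with "Fairuse" is dropped in blacklist and whitelist mode and kept in nofilter mode, while an image with no recognised license template is kept by the blacklist but dropped by the whitelist. That difference matches the German Wikipedia's concern about images with an unknown license.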
It should be trivial to create a method for setting nofilter mode upon command and per PDF generation. Thus allowing all images if a user wants them.
volker.haas wrote:
(In reply to comment #12)
It should be trivial to create a method for setting nofilter mode upon command and per PDF generation. Thus allowing all images if a user wants them.
Unfortunately it is not trivial:
And there are probably more things that I forgot.
But of course patches are always welcome!
In general I'd highly recommend never starting sentences with "It should be trivial" in bug reports if you don't know the codebase by heart. ;)
Closing as obsolete - PDF generation via collection has been dead for years. The current PDF generation system does include fair-use images.