
Whitelist uploading of DjVu files
Closed, Resolved (Public)

Description

Author: kontakt

Description:
For use on de:Wikisource, we would like to upload DjVu files (for our
scanned images, e.g. all scans of
http://commons.wikimedia.org/wiki/Rechenbuch_des_Andreas_Reinhard) to Commons.
DjVu is currently not among the accepted file types on Commons. Nevertheless,
it is a popular file type in the academic world, especially for compressed scans
of historical books in digital libraries.

DjVu is an open file format. The file format specification is published, as is
the source code for the reference library. In 2002 the DjVu file format was
chosen by the Internet Archive as the format in which its Million Book Project
provides scanned public domain books online.

We propose to add DjVu (.djvu) to the list of accepted file formats for all
MediaWiki projects.


Version: unspecified
Severity: enhancement
URL: http://commons.wikimedia.org/wiki/Special:Upload

Details

Reference
bz6131

Event Timeline

bzimport raised the priority of this task to Medium. Nov 21 2014, 9:17 PM
bzimport set Reference to bz6131.

This is not terribly likely to happen unless we can provide useful
software support for it.

kontakt wrote:

We only want to store the files on commons. DjVu is just an alternative to PDF.

kontakt wrote:

How does Wikimedia provide software support for PDF files?

Hm, the question appears to be whether "archive" files (as opposed to things that
can be shown "inline") are useful to Wikimedia. We already support several such
formats (PDF and several OpenOffice formats), so why not DjVu, if it is an open
and accepted format? I agree that such files are not immediately useful to
Wikipedia, but they may be handy for Wikisource and Wikibooks (and, in
consequence, Commons).

kontakt wrote:

DjVu has several advantages to offer.

  • it has a high compression ratio (very small files)
  • it loads very fast (in contrast to PDF, you can view the first pages while
    the others are loading)
  • each page can be addressed individually
  • the compression and encoding algorithms were published under the GNU licence

Therefore a good number of research libraries already offer DjVu files. Some
examples:

http://www.jnul.huji.ac.il/eng/digibook.html (Jewish National and University
Library)
http://kpbc.umk.pl/dlibra (Kujawsko-Pomorska Digital Library)
http://www.numdam.org (Numérisation de documents anciens mathématiques)
http://www.archive.org/details/millionbooks (Million Book Project)

It would be a great step towards a greater acceptance of Wikisource in the
academic world.

Can you provide some references for supporting software?
Being able to automatically generate thumbnail and medium-size
inline JPEGs from these would, I suspect, be very helpful.

kontakt wrote:

The Open Source DjVu library "DjVuLibre" is an implementation of DjVu, including
viewers, browser plugins, decoders, simple encoders, and utilities.

Among other things it includes a set of decoders to convert DjVu to other formats.

You can download it via
http://djvulibre.djvuzone.org/

I just found an entry on the ImageMagick mailing list that does not sound so
good:
http://studio.imagemagick.org/pipermail/magick-users/2003-April/008701.html.
Quote:

"The open source DjVu is crippled by offering less efficient compression
algorithms. You have to buy commercial product in order to use the really cool
DjVu compression algorithms."

Maybe this has changed over the last three years?

klausgraf wrote:

I support Frank Schulenburg. The Internet Archive is also using DjVu for the
thousands of books it is digitizing.

ezyang wrote:

Marking enhancement.

Support for uploading DjVu files with file type validation and
width/height detection added in r14994. Will enable shortly.
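
For readers wondering what such a check involves, here is a minimal Python sketch. It is not the code from r14994; it only assumes the published DjVu container layout ("AT&T" magic, an IFF "FORM" chunk typed "DJVU" or "DJVM", and for single pages an "INFO" chunk that begins with big-endian 16-bit width and height). The function name inspect_djvu is hypothetical.

```python
import struct

DJVU_MAGIC = b"AT&TFORM"
# FORM types handled by this sketch; other types are treated as "not DjVu".
DJVU_FORM_TYPES = {b"DJVU": "single-page", b"DJVM": "multi-page"}


def inspect_djvu(data: bytes):
    """Return (kind, width, height) for DjVu data, or None if it is not DjVu.

    width and height are None when they cannot be read from the header.
    """
    if len(data) < 16 or data[:8] != DJVU_MAGIC:
        return None
    # Bytes 8..12 hold the FORM length; bytes 12..16 hold the FORM type.
    kind = DJVU_FORM_TYPES.get(data[12:16])
    if kind is None:
        return None
    # Assumption: in a single-page file the first chunk is INFO, whose
    # payload starts with width and height as big-endian 16-bit integers.
    if kind == "single-page" and len(data) >= 28 and data[16:20] == b"INFO":
        width, height = struct.unpack(">HH", data[24:28])
        return kind, width, height
    # Multi-page documents keep per-page INFO chunks further into the file;
    # this sketch does not walk that far.
    return kind, None, None


# Synthetic header for illustration only (not a complete DjVu file).
sample = (b"AT&TFORM" + struct.pack(">I", 22) + b"DJVU"
          + b"INFO" + struct.pack(">I", 10)
          + struct.pack(">HH", 2550, 3300) + bytes(6))
print(inspect_djvu(sample))
```

A real upload check would read only the first few dozen bytes of the file and would also need to locate the first page inside a multi-page document to report dimensions.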

Thumbnailing support not yet present, but can be added later.
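
One possible route for thumbnailing, sketched in Python under the assumption that DjVuLibre's ddjvu decoder is installed and on the PATH. The helper names, the default size, and the choice of PPM output are illustrative assumptions, not a description of what MediaWiki does; turning the PPM into a JPEG would need a further conversion step that is not shown.

```python
import subprocess


def ddjvu_thumbnail_command(src, dest, page=1, max_width=180, max_height=180):
    """Build a ddjvu command line that renders one page as a PPM image.

    -size gives a maximum bounding box; ddjvu keeps the aspect ratio.
    """
    return [
        "ddjvu",
        "-format=ppm",
        f"-page={page}",
        f"-size={max_width}x{max_height}",
        src,
        dest,
    ]


def render_thumbnail(src, dest, **options):
    """Run ddjvu; raises CalledProcessError if the decoder reports failure."""
    subprocess.run(ddjvu_thumbnail_command(src, dest, **options), check=True)


# Hypothetical file names, shown only to illustrate the resulting command.
print(ddjvu_thumbnail_command("book.djvu", "page1.ppm"))
```

Passing the arguments as a list rather than a shell string avoids quoting problems with uploaded file names.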

.djvu files may now be uploaded.

kontakt wrote:

GREAT! Thank you very much! Greetings, Frank