general editing

2015-03-20 11:30:19 -07:00 · be21c4bbf3
commit be21c4bbf3
parent 592cdc1cdb
1 changed file with 44 additions and 46 deletions
--- a/README.md
+++ b/README.md
@@ -1,15 +1,16 @@
 img2pdf
 =======
-Lossless conversion of images to PDF without unnecessarily re-encoding JPEG and
+Losslessly convert images to PDF without unnecessarily re-encoding JPEG and
-JPEG2000 files. Thus, no loss of quality and no unnecessary large output file.
+JPEG2000 files.  Image quality is retained without unnecessarily increasing
 file size.
 Background
 ----------
-PDF is able to embed JPEG and JPEG2000 images as they are without re-encoding
+Quality loss can be avoided when converting JPEG and JPEG2000 images to
-them (and hence losing quality) but I was missing a tool to do this
+PDF by embedding them without re-encoding.  I wrote this piece of Python code
-automatically, thus I wrote this piece of python code.
+because I was missing a tool to do this automatically.
 If you know how to embed JPEG and JPEG2000 images into a PDF container without
 recompression, using existing tools, please contact me so that I can put this
@@ -18,43 +19,41 @@ code into the garbage bin :D
 Functionality
 -------------
-The program will take image filenames from commandline arguments and output a
+This program will take a list of images and produce a PDF file with the
-PDF file with them embedded into it. If the input image is a JPEG or JPEG2000
+images embedded in it.  JPEG and JPEG2000 images will be included without
-file, it will be included as-is without any processing. If it is in any other
+recompression.  Images in other formats will be included with zip/flate
-format, the image will be included as zip-encoded RGB. As a result, this tool
+encoding.  As a result, this tool is able to losslessly wrap any image
-will be able to lossless wrap any image into a PDF container while performing
+into a PDF container with a quality-filesize ratio that is typically better
-better (in terms of quality/filesize ratio) than existing tools in case the
+than that of existing tools.
 input image is a JPEG or JPEG2000 file.
-For example, imagemagick will re-encode the input JPEG image and thus change
+For example, imagemagick will re-encode the input JPEG image (thus changing
-its content:
+its content):
 	$ convert img.jpg img.pdf
 	$ pdfimages img.pdf img.extr # not using -j to be extra sure there is no recompression
 	$ compare -metric AE img.jpg img.extr-000.ppm null:
 	1.6301e+06
-If one wants to do a lossless conversion from any format to PDF with
+If one wants to losslessly convert from any format to PDF with
-imagemagick then one has to use zip-encoding:
+imagemagick, one has to use zip compression:
 	$ convert img.jpg -compress Zip img.pdf
 	$ pdfimages img.pdf img.extr # not using -j to be extra sure there is no recompression
 	$ compare -metric AE img.jpg img.extr-000.ppm null:
 	0
-The downside is, that using imagemagick like this will make the resulting PDF
+However, this approach will result in PDF files that are a few times larger
-files a few times bigger than the input JPEG or JPEG2000 file and can also not
+than the input JPEG or JPEG2000 file.
 output a multipage PDF.
-img2pdf is able to output a PDF with multiple pages if more than one input
+img2pdf is able to losslessly embed JPEG and JPEG2000 files into a PDF
-image is given, losslessly embed JPEG and JPEG2000 files into a PDF container
+container without additional overhead (aside from the PDF structure itself),
-without adding more overhead than the PDF structure itself and will save all
+save other graphics formats using lossless zip compression,
-other graphics formats using lossless zip-compression.
+and produce multi-page PDF files when more than one input image is given.
-Another nifty advantage: Since no re-encoding is done in case of JPEG images,
+Also, since JPEG and JPEG2000 images are not re-encoded, conversion with
-the conversion is many (ten to hundred) times faster with img2pdf compared to
+img2pdf is many (ten to a hundred) times faster than with imagemagick.
-imagemagick. While a run of above convert command with a 2.8MB JPEG takes 27
+While the above convert command with a 2.8MB JPEG took 27 seconds
-seconds (on average) on my machine, conversion using img2pdf takes just a
+(on average) on my machine, conversion using img2pdf took just a
 fraction of a second.
 Commandline Arguments
@@ -81,27 +80,26 @@ More help is available with the -h or --help option.
 Bugs
 ----
-If you find a JPEG or JPEG2000 file that, when embedded can not be read by the
+If you find a JPEG or JPEG2000 file that, when embedded, cannot be read
-Adobe Acrobat Reader, please contact me.
+by the Adobe Acrobat Reader, please contact me.
-For lossless conversion of other formats than JPEG or JPEG2000 files, zip/flate
+For lossless conversion of formats other than JPEG or JPEG2000, zip/flate
-encoding is used.  This choice is based on a number of tests I did on images.
+encoding is used.  This choice is based on tests I did with a number of images.
-I converted them into PDF using imagemagick and all compressions it has to
+I converted them into PDF using the lossless variants of the compression
-offer and then compared the output size of the lossless variants. In all my
+formats offered by imagemagick.  In all my tests, zip/flate encoding performed
-tests, zip/flate encoding performed best. You can verify my findings using the
+best.  You can verify my findings using the test_comp.sh script with any input
-test_comp.sh script with any input image given as a commandline argument. If
+image given as a commandline argument.  If you find an input file that is
-you find an input file that is outperformed by another lossless compression,
+outperformed by another lossless compression method, contact me.
 contact me.
-I have not yet figured out how to read the colorspace from jpeg2000 files.
+I have not yet figured out how to determine the colorspace of JPEG2000 files.
-Therefor jpeg2000 files use DeviceRGB per default. If your jpeg2000 files are
+Therefore JPEG2000 files use DeviceRGB by default. For JPEG2000 files with
-of any other colorspace you must force it using the --colorspace option.
+other colorspaces, you must force it using the `--colorspace` option.
 Like -C L for DeviceGray.
 Installation
 ------------
-On a Debian/Ubuntu based OS, the following dependencies are needed:
+On Debian- and Ubuntu-based systems, dependencies may be installed
 with the following command:
 	apt-get install python python-pil python-setuptools
@@ -109,17 +107,17 @@ Or for Python 3:
 	apt-get install python3 python3-pil python3-setuptools
-You can install the package using:
+You can then install the package using:
 	$ pip install img2pdf
-If you want to install from source code simply use:
+If you prefer to install from source code, use:
 	$ cd img2pdf/
 	$ pip install .
 To test the console script without installing the package on your system,
-simply use virtualenv:
+use virtualenv:
 	$ cd img2pdf/
 	$ virtualenv ve
@@ -129,7 +127,7 @@ You can then test the converter using:
 	$ ve/bin/img2pdf -o test.pdf src/tests/test.jpg
-Note that the package can also be used as a library as follows:
+The package can also be used as a library:
 	import img2pdf
 	pdf_bytes = img2pdf.convert(['test.jpg'])