josch/img2pdf

version 0.2.1 doesn't allow passing of lists #31

Closed

opened 2021-04-25 19:57:51 +00:00 by josch · 0 comments

By Steven McKay on 2017-01-18T15:36:15.043Z

I apologize if I'm putting this in the wrong place. My question has to do with using img2pdf as a module.
In img2pdf 0.1.5, convert expects to receive a list of file names. This has worked well for me, as I pass it a list of filenames, and it gives me a pdf.
However, when I try to do it in 0.2.1 it gives me the error: "TypeError: 'list' does not support the buffer interface". The offending line is

            outputpdf = img2pdf.convert(filelist[start:end])

where filelist is just a list of filenames. In looking at the code, it appears that convert has been changed to allow more types of input than just lists of filenames, but it doesn't work as before. Do I need to change the code, or is this a bug?

By josch on 2017-01-18T15:39:11.941Z

Can you supply more detail?

* what is the content of your filelist variable
* where exactly is your TypeError thrown?

By Steven McKay on 2017-01-18T16:53:08.358Z

My filelist variable is a list of strings. Each string is a separate filename.

Here is the error I get:
    Traceback (most recent call last):
      File "/Volumes/mckay/Google Drive/work/code/python/Grading/gradepages3.py", line 255, in <module>
        ES.main()
      File "/Volumes/mckay/Google Drive/work/code/python/Grading/gradepages3.py", line 135, in main
        self.creategradepage(key, self.filenames[key])
      File "/Volumes/mckay/Google Drive/work/code/python/Grading/gradepages3.py", line 244, in creategradepage
        outputpdf = img2pdf.convert(filelist[start:end])
      File "/opt/local/Library/Frameworks/Python.framework/Versions/3.4/lib/python3.4/site-packages/img2pdf.py", line 948, in convert
        in read_images(rawdata, colorspace, first_frame_only):
      File "/opt/local/Library/Frameworks/Python.framework/Versions/3.4/lib/python3.4/site-packages/img2pdf.py", line 598, in read_images
        im = BytesIO(rawdata)
    TypeError: 'list' does not support the buffer interface

Line 244 is the line given in my original message. As I said before, this worked fine with img2pdf 0.1.5 and python 2.7, but I am trying to upgrade to img2pdf 0.2.1 and python 3.4. I have attached the original code, if that would help, but you won't be able to run it, as this uses a database in auto-multiple-choice.

gradePages3.py

Thanks for taking time to look into this.

By josch on 2017-01-18T19:46:18.211Z

Are you sure that all files in your filelist exist?

img2pdf supports three different types of input:

* anything that comes with a read() function like file-like objects
* filenames that can be opened with open()
* raw in-memory data

img2pdf will try to interpret what you give it as one of these three variants one after another. If it doesn't have a read() function it will try to open() it and if that fails it will treat it like a string of raw data. Maybe you are passing paths to it that open() cannot open? In that case, it would try to interpret your path as raw data and fail.
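The fallback order described above can be sketched roughly as follows. This is only an illustration of the three-step logic, not img2pdf's actual code; `load_rawdata` is a hypothetical helper name, and the set of exceptions caught is an assumption.

```python
import os
import tempfile
from io import BytesIO


def load_rawdata(img):
    """Hypothetical helper mirroring the fallback order described above."""
    # 1. Anything with a read() method is treated as a file-like object.
    if hasattr(img, "read"):
        return img.read()
    # 2. Otherwise, try to open it as a filename.
    try:
        with open(img, "rb") as f:
            return f.read()
    except (OSError, TypeError, ValueError):
        # 3. If open() fails, treat the argument itself as raw in-memory data.
        return img


# A list is neither file-like nor openable, so it falls through to step 3
# unchanged. Handing it to BytesIO then raises a TypeError on Python 3.
leftover = load_rawdata(["a.jpg", "b.jpg"])
try:
    BytesIO(leftover)
except TypeError as exc:
    print("TypeError:", exc)
```

Under this sketch, a path that open() cannot open and a whole list passed by mistake both end up in step 3, which is why the two causes produce a similar-looking failure.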

Can you confirm that all the files that you pass to it actually exist?

Maybe add a print() function above line 598 to see what actually gets passed to BytesIO.

By Steven McKay on 2017-01-19T06:49:53.535Z

It looks like BytesIO is getting passed the entire list (at least that's what it looks like in PyCharm when I debug it there).

I have created a shorter test case. I have uploaded a zip file that contains two python files and some jpg files. The only difference between the python files is whether python (2.7 - standard install in osx) or python3 (installed via macports - version 3.4) is used. I installed img2pdf 0.1.5 using python a year or more ago. I recently installed img2pdf 0.2.1 using pip3. My assumption is that if I run python, it will pick up img2pdf 0.1.5 and if I run python3 it will pick up img2pdf 0.2.1. (I hope that assumption is correct).

Anyway, the code, including the use of img2pdf.convert(), is identical between the two, yet the python 2 code runs fine, and the python 3 code fails with the same message as before. Either I'm not passing the list correctly to img2pdf, or I'm somehow using the wrong library with python3, or there's a bug :-)

Thanks for looking at this.
test.zip

By josch on 2017-01-19T07:13:59.211Z

Wonderful! I am now able to reproduce your problem. Thanks for providing your test case!

By josch on 2017-01-19T07:22:11.256Z

Okay, I now know your problem. It seems that img2pdf 0.2.1 introduced a backwards incompatible change. Before, the function definition looked like this:

def convert(images, ...)

and now it's this:

def convert(*images, ...)

The change was motivated by the images argument being the only positional argument to the function. So it didn't feel very pythonic to pass lists to the function even if one had only one image to convert.
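To make the effect of that signature change concrete, here is a small stand-alone sketch. `convert_old` and `convert_new` are hypothetical stand-ins that only return what they received; they are not the real img2pdf functions.

```python
def convert_old(images, **kwargs):
    # 0.1.5-style signature: one positional argument that holds a list.
    return list(images)


def convert_new(*images, **kwargs):
    # 0.2.1-style signature: every positional argument is one image.
    return list(images)


files = ["a.jpg", "b.jpg"]

print(convert_old(files))   # ['a.jpg', 'b.jpg']
print(convert_new(files))   # [['a.jpg', 'b.jpg']]  (the list itself is the single "image")
print(convert_new(*files))  # ['a.jpg', 'b.jpg']
```

With the new signature, an un-unpacked list arrives as one element, which matches the observation that BytesIO was handed the entire list.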

Now there are multiple ways we can fix this. Either you change your code to pass the list packed or unpacked depending on the img2pdf version, or I add some code that, in case only one argument was passed, checks whether that argument is a list and then unpacks it.

What do you think?

By Steven McKay on 2017-01-19T16:39:13.675Z

I changed the line

imgout = img2pdf.convert(files)

to

imgout = img2pdf.convert(*files)

and it seems to work. I did not try to pack the list, as I don't exactly know what that means. I've never done that, though I could look it up.

If you are asking for my opinion, I think it would be better to be backwards compatible. I appear to be the only person affected by this change (or others are smarter about figuring out the problem), but it might affect people in the future. However, you shouldn't check to see if only one argument is passed, because they may be using other options. That is your call, however. I'm perfectly ok (now that I know what to do) to leave it as is. I do think it would be better to document this, however. In your readme, you only pass one file name. The fact that you can pass lists, file-like objects, or raw data, and how to do it, would be a good addition to the readme.

By the way, this package is a lifesaver. I convert hundreds of pages at a time into a pdf file for my graders to use, and if I had to use ghostscript, the file sizes would be much too big. Thanks for providing it.

By josch on 2017-01-20T06:08:46.394Z

Python differentiates between two types of function arguments: positional arguments and keyword arguments. The convert function has always accepted only the images as positional arguments, so the number of positional arguments is what I'll be checking. This is completely unrelated to how many keyword arguments are passed.
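A minimal sketch of such a check, assuming it only looks at the positional arguments: the `convert` below is a hypothetical stand-in that returns the normalized images rather than a PDF, and it is not the code from the actual release.

```python
def convert(*images, **kwargs):
    # Positional arguments land in `images`, keyword arguments in `kwargs`,
    # so this length check is independent of any options that were passed.
    if len(images) == 1 and isinstance(images[0], list):
        # Backwards compatibility: a single list argument gets unpacked.
        images = tuple(images[0])
    return images  # stand-in for the real conversion


files = ["a.jpg", "b.jpg"]
print(convert(files, dpi=300))   # ('a.jpg', 'b.jpg')
print(convert(*files, dpi=300))  # ('a.jpg', 'b.jpg')
print(convert("a.jpg"))          # ('a.jpg',)
```

Both calling styles then produce the same result, no matter how many keyword options accompany them (`dpi` here is just an example keyword for this sketch).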

I made a new release which should fix your problems. Thanks!

By josch on 2017-01-20T08:28:47.693Z

I now also updated the README with more examples. Did I forget anything?

By Steven McKay on 2017-01-20T14:47:55.550Z

I think the README update looks good. I'll check out the new release. Thanks for your help.

By Steven McKay on 2017-01-20T15:12:22.586Z

I verified the new version which allows both *files and files (as a list) to be passed. I think this issue can be closed. I'll hit the close issue button, but I tried it before, so maybe I don't have enough permissions :-)

By josch on 2017-01-20T15:13:53.132Z

Nope, that seems to be a bug with my gitlab setup... sigh... :(

Thanks for your feedback!

By josch on 2017-01-21T07:49:39.807Z

Status changed to closed
