Wildcard support in filenames for Windows #25

Closed
opened 3 years ago by josch · 0 comments
josch commented 3 years ago
Owner

By ComFreek on 2015-11-17T11:14:13.068Z

Currently, the following command fails on Windows because cmd.exe and PowerShell do not perform file name expansion, as opposed to most *NIX shells.

python.exe -m img2pdf myFolder/*.jpg -o output.pdf

PS: You may be interested in the post and its comments on StackExchange where this feature request was initially proposed: http://softwarerecs.stackexchange.com/a/26102/583


By josch on 2015-12-09T13:17:11.312Z


Thanks! Unfortunately I do not have access to any Windows machine to test any such feature.

A pull request from anybody who can test a patch enabling this would be most welcome.


By josch on 2017-01-21T07:54:46.128Z


Closing due to lack of activity from original submitter.


By josch on 2017-01-21T07:54:46.492Z


Status changed to closed


By Matlock42 on 2017-04-18T22:26:52.651Z


I am willing to help troubleshoot this issue; however, I have only limited knowledge of Python.

The following is the error I get when trying to run with a wildcard on Windows:

...\img2pdf-master\src>python img2pdf.py --output out.pdf -S Letter --border 2cm:2.5cm --fit shrink *.jpg
Traceback (most recent call last):
  File "img2pdf.py", line 1730, in <module>
    main()
  File "img2pdf.py", line 1673, in main
    args = parser.parse_args()
  File "C:\Users\username\AppData\Local\Programs\Python\Python35-32\lib\argparse.py", line 1726, in parse_args
    args, argv = self.parse_known_args(args, namespace)
  File "C:\Users\username\AppData\Local\Programs\Python\Python35-32\lib\argparse.py", line 1758, in parse_known_args
    namespace, args = self._parse_known_args(args, namespace)
  File "C:\Users\username\AppData\Local\Programs\Python\Python35-32\lib\argparse.py", line 1967, in _parse_known_args
    stop_index = consume_positionals(start_index)
  File "C:\Users\username\AppData\Local\Programs\Python\Python35-32\lib\argparse.py", line 1923, in consume_positionals
    take_action(action, args)
  File "C:\Users\username\AppData\Local\Programs\Python\Python35-32\lib\argparse.py", line 1816, in take_action
    argument_values = self._get_values(action, argument_strings)
  File "C:\Users\username\AppData\Local\Programs\Python\Python35-32\lib\argparse.py", line 2271, in _get_values
    value = [self._get_value(action, v) for v in arg_strings]
  File "C:\Users\username\AppData\Local\Programs\Python\Python35-32\lib\argparse.py", line 2271, in <listcomp>
    value = [self._get_value(action, v) for v in arg_strings]
  File "C:\Users\username\AppData\Local\Programs\Python\Python35-32\lib\argparse.py", line 2286, in _get_value
    result = type_func(arg_string)
  File "img2pdf.py", line 1217, in input_images
    if os.path.getsize(path) == 0:
  File "C:\Users\username\AppData\Local\Programs\Python\Python35-32\lib\genericpath.py", line 50, in getsize
    return os.stat(filename).st_size
OSError: [WinError 123] The filename, directory name, or volume label syntax is incorrect: '*.jpg'

A little bit of searching found this Stack Overflow post that looks like it might address the issue: http://stackoverflow.com/questions/12501761/passing-multple-files-with-asterisk-to-python-shell-in-windows

Specifically, the following snippet looks promising:

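import glob
import sys
# Expand the last argument only if it contains a wildcard
if '*' in sys.argv[-1]:
    sys.argv[-1:] = glob.glob(sys.argv[-1])
# ... the rest of the program continues here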

However, I don't really know how to specifically implement this into your python script.
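
One way the snippet above could be wired in is sketched below. This is an illustration only, not code from img2pdf: the helper name `expand_wildcards` is hypothetical, and it assumes the expansion should apply to every positional argument rather than only the last one.

```python
import glob


def expand_wildcards(args):
    """Expand glob patterns in every argument, not only the last one."""
    expanded = []
    for arg in args:
        # Options such as --output are passed through untouched; only
        # arguments containing glob metacharacters are treated as patterns.
        if not arg.startswith("-") and any(ch in arg for ch in "*?["):
            matches = sorted(glob.glob(arg))
            # Keep the pattern unchanged when nothing matches, so the
            # program can still report the offending argument.
            expanded.extend(matches or [arg])
        else:
            expanded.append(arg)
    return expanded
```

Under these assumptions, the call `sys.argv[1:] = expand_wildcards(sys.argv[1:])` would have to run before `parser.parse_args()`, because the traceback above shows the failure happening inside argument parsing.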


By Matlock42 on 2017-04-18T22:37:39.612Z


As a workaround, I created a batch script that allows you to drag and drop images onto the batch file and it will run the python script with each file called out in quotes. You can change the script call at the end based on your needs.

@echo off
rem Start with an empty list. No spaces around "=", otherwise the variable would be named "FILES " (with a trailing space)
set FILES=
:LOOP
rem If the first argument is empty, every file has been handled, so leave the loop
if "%~1"=="" goto :END
rem The argument is appended to the list with quotes around it. The extra space at the end when concatenating is IMPORTANT
set FILE="%~1"
set FILES=%FILES%%FILE% 
rem `shift` makes the second argument the first, etc.
shift
goto :LOOP
:END
rem Ask the user to name the output file
set /p FNAME=Enter Output Filename: 
@echo on
rem The output name is quoted so that names containing spaces work
python C:\Users\username\Documents\04_Scripts\python\img2pdf-master\src\img2pdf.py --output "%FNAME%.pdf" -S Letter --border 1cm:1cm --fit shrink %FILES%

By josch on 2017-04-20T05:05:27.588Z


My argument here is the following:

When I type program *.ext on a typical Linux shell then the shell will expand *.ext into all the files matching the glob expression.

When I type the same in the cmd.exe that Windows ships, then it will try to find a file named exactly *.ext (including the asterisk) which doesn't exist and isn't even a valid filename on Windows (the asterisk character is not allowed).

So expanding a glob into the files it matches is the task of the shell and not the task of the program you execute. My first thought, therefore, is that if you want a glob expression to be expanded into the files it matches, you should start using a more powerful shell than cmd.exe.

Think about it this way as well: suppose I now integrated the code snippet you proposed:

if '*' in sys.argv[-1]:
    sys.argv[-1:] = glob.glob(sys.argv[-1])

Then that would mean that even under more powerful shells like the ones you typically find on Linux systems, you could not pass files to img2pdf anymore that contain an asterisk because then filenames including an asterisk would always be treated as if they were meant as globs. So adding code like the above would actually remove functionality for users.
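
The loss of functionality described above can be made concrete with a small sketch. It is not part of img2pdf. The function name `expand_preferring_literal` is hypothetical, and the idea of checking for a literal file first is a possible mitigation that nobody in this thread proposed.

```python
import glob
import os


def expand_preferring_literal(arg):
    """Return [arg] if a file with exactly that name exists, else its glob matches."""
    if os.path.exists(arg):
        # A file whose name really contains "*" (legal on POSIX) is kept
        # as a single argument instead of being reinterpreted as a pattern.
        return [arg]
    # Otherwise treat the argument as a pattern; keep it as-is if nothing matches.
    return sorted(glob.glob(arg)) or [arg]
```

On a POSIX system with both `a*.jpg` and `ab.jpg` in a directory, a plain `glob.glob("a*.jpg")` would return both files, whereas this sketch would return only the literally named one. It does not remove the ambiguity entirely: once the literal file is absent, the same argument is silently treated as a pattern again.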

Admittedly, the code snippet above could be made specific to Windows and not be executed for users on Linux. But since this is a problem with the shell you run img2pdf from, making the behavior depend on the operating system is not the right approach. For example, imagine a user on Windows who is actually using a powerful shell that allows them to use globbing. That user would have functionality removed.

So instead of moving functionality that your shell should provide into img2pdf, it would be far easier and less painful for everybody else if those users who need globbing to work just used a shell that supports globbing.

If you disagree with my arguments I am open to hearing yours.


By Matlock42 on 2017-04-21T22:10:45.995Z


I have been contemplating your response and doing a bit of research. From what I can tell, Windows has always relied on programs to work out globbed arguments when they are passed (source: https://superuser.com/questions/460598/is-there-any-way-to-get-the-windows-cmd-shell-to-expand-wildcard-paths). This is true for both Windows shells, cmd and PowerShell (ps). I tested img2pdf with both shells and got the same error. That is not to say there are no ways around this, such as a shell script that does the work and passes the result to the program. But that adds another layer outside the Python program.

Conversely, your argument is valid in that it makes the program more complicated while only a portion of the potential user base sees an advantage. However, img2pdf --help does list a globbing argument as valid syntax, and using that syntax on Windows under either cmd or ps results in an error. That is why I commented on this issue.

My opinion is to have the snippet run on Windows only and not be executed for Linux users.
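
A minimal sketch of that Windows-only variant follows. The function name `prepare_args` and its `platform` parameter are hypothetical and exist only so that the behavior can be exercised on any system. The sketch keys on `sys.platform`, so it does not answer the objection that the deciding factor is the shell rather than the operating system.

```python
import glob
import sys


def prepare_args(args, platform=sys.platform):
    """Expand wildcards only on Windows, where cmd.exe leaves them untouched."""
    if platform != "win32":
        # POSIX shells have already expanded globs; leave the arguments
        # alone so that literal asterisks in filenames keep working.
        return list(args)
    result = []
    for arg in args:
        matches = sorted(glob.glob(arg)) if "*" in arg else []
        # Fall back to the original argument when it is not a pattern
        # or when the pattern matches nothing.
        result.extend(matches or [arg])
    return result
```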


By josch on 2017-04-23T07:06:27.410Z


I checked the superuser source you cite, but people there say that globbing works with PowerShell. Still, you claim that it doesn't. Can you explain which information is correct?

The img2pdf --help text shows *.jpg in the example section. But each line there also starts with a dollar sign, which indicates a POSIX shell. Otherwise you could also argue that it's wrong to list just img2pdf as the command in the example section, because under Windows you'd have to call python img2pdf instead. So if this is your complaint, then what you actually want to report is that the img2pdf --help output should list examples for different shells, for bash as well as for cmd.exe. If that's what you are arguing, please open a separate bug about it. This issue is about wildcard support for img2pdf.


By Oliver on 2019-02-09T21:53:47.997Z


The behavior of the shell on Unix is rather special. It takes care of all the expansion of shell globs. For Windows programs this can be sort of emulated by linking against a special object file provided by Visual Studio (setargv.obj). You can find detailed instructions here: https://docs.microsoft.com/en-us/cpp/c-language/expanding-wildcard-arguments?view=vs-2017. This allows for easier porting of POSIX/SUS programs. Hope this helps.

josch closed this issue 3 years ago
Reference: josch/img2pdf#25