Solving Path Validation in Python CLI Tools
argparse --help
I feel that the argparse module in the Python standard library is badly
neglected on the internet, both in blog posts and social media and in the
source code I've been reading lately. Recently I built quite a few CLI
tools with argparse, and I want to record here one thing I found along
the way.
argparse --validate "my arguments"
I think anyone who has ever built a CLI tool in Python with argparse, or
has at least gone through the tutorials, knows about the type parameter
of add_argument. What seems to be less common knowledge is that type
accepts not only standard Python types: it can be any callable that takes
a single string, or the name of a registered type.
from argparse import ArgumentParser, ArgumentTypeError


def validator(raw: str) -> list[str]:
    stoplist = {"not", "skip", "off"}
    # Keep letters and whitespace so the input can still be split into words
    cleaned = "".join(c for c in raw if c.isalpha() or c.isspace())
    words = cleaned.split()
    if any(word in stoplist for word in words):
        raise ArgumentTypeError(
            "In Soviet Union you don't turn off validation - "
            "validation turns off you"
        )
    return words


parser = ArgumentParser()
parser.add_argument(
    "--validate",
    type=validator,
    help="Validated parameter! Do not try to skip it"
)
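The other option mentioned above, passing the name of a registered type, can be sketched roughly like this. It relies on the parser's register method, which exists on ArgumentParser but is not prominently documented in older Python versions; the name "hexadecimal integer" and the --mask option are made up for illustration.

```python
from argparse import ArgumentParser

parser = ArgumentParser()
# Register a converter under a string name in the parser's "type" registry.
parser.register("type", "hexadecimal integer", lambda s: int(s, 16))
# Refer to the converter by its registered name instead of passing a callable.
parser.add_argument("--mask", type="hexadecimal integer")

args = parser.parse_args(["--mask", "ff"])
print(args.mask)  # 255
```

The name has to be registered before add_argument is called, because argparse checks at that point that the type resolves to something callable.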
I love this parameter; it has saved me from reinventing a lot of wheels in every CLI project I've had my hands on. But today I also found an interesting addition to parameter validation in the docs:
argparse --file path/to/file
Yes! One of my most frequent use cases for type safety is already implemented in the library: FileType!
import argparse

parser = argparse.ArgumentParser()
parser.add_argument('infile', type=argparse.FileType('r'))
# "-" makes FileType('r') return sys.stdin instead of opening a file
parser.parse_args(['-'])
And it is great and it works, as long as you only need files. No directories, no symlinks. After some extensive and exhausting googling for 30 seconds I found this great patch from more than 10 years ago. I want to thank Dan Lenski for the idea, but it definitely deserves a modern touch-up with another great stdlib member: pathlib.
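The file-only limitation is easy to see in a small sketch. FileType opens whatever it is given, so a directory fails at open() time and argparse turns that into a usage error and exits. The accepts helper and the temporary file names below are hypothetical, used only for this demonstration.

```python
import argparse
import os
import tempfile

parser = argparse.ArgumentParser()
parser.add_argument("infile", type=argparse.FileType("r"))


def accepts(argument: str) -> bool:
    """Return True if argparse accepts the argument, closing the opened file."""
    try:
        args = parser.parse_args([argument])
    except SystemExit:
        # argparse prints the failed open() as a usage error and exits
        return False
    args.infile.close()
    return True


with tempfile.TemporaryDirectory() as tmp:
    file_path = os.path.join(tmp, "data.txt")
    with open(file_path, "w", encoding="utf-8") as handle:
        handle.write("hello")
    file_ok = accepts(file_path)  # a regular file is opened successfully
    dir_ok = accepts(tmp)         # a directory cannot be opened for reading

print(file_ok, dir_ok)
```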
argparse --path path/to/anything
Without further introduction, here it is:
from argparse import ArgumentTypeError
from pathlib import Path
from typing import Literal

STDIO_PATH = Path("-")


class PathType:
    """
    Custom argparse type for validating and processing file system paths.

    Validates paths based on their existence, type (file, directory, symlink),
    and whether they are absolute or relative. Supports using "-" for stdin.

    Args:
        exists (bool):
            - True: path must exist.
            - False: path must not exist, but must have a parent directory.
            - None: no validation.
        path_type (Literal["file", "dir", "symlink"] | None):
            Type of path to validate ("file", "dir", "symlink", or None).
        dash_ok (bool): Allow "-" for stdin/stdout (default is True).
        resolve (bool): Resolve to an absolute path (default is False).

    Example:
        parser.add_argument(
            '--input',
            type=PathType(exists=True, path_type="file")
        )

    Inspired by:
        https://mail.python.org/pipermail/stdlib-sig/2015-July/000990.html
    """

    def __init__(
        self,
        exists: bool | None = True,
        path_type: Literal["file", "dir", "symlink"] | None = "file",
        dash_ok: bool = True,
        resolve: bool = False,
    ):
        self.exists = exists
        self.path_type = path_type
        self.dash_ok = dash_ok
        self.resolve = resolve

    def __call__(self, string: str) -> Path:
        if string == "-":
            if not self.dash_ok:
                raise ArgumentTypeError(
                    'The "-" symbol is not allowed for this argument.'
                )
            return STDIO_PATH

        path = Path(string)

        if self.exists:
            if not path.exists():
                raise ArgumentTypeError(f"Path does not exist: {path}")
            if self.path_type == "file" and not path.is_file():
                raise ArgumentTypeError(f"Path is not a file: {path}")
            if self.path_type == "dir" and not path.is_dir():
                raise ArgumentTypeError(f"Path is not a dir: {path}")
            if self.path_type == "symlink" and not path.is_symlink():
                raise ArgumentTypeError(f"Path is not a symlink: {path}")
        elif self.exists is False:
            if path.exists():
                raise ArgumentTypeError(f"Path exists: {path}")
            parent = path.parent
            if not parent.exists() or not parent.is_dir():
                raise ArgumentTypeError(
                    f"Parent directory does not exist: {parent}"
                )

        if self.resolve:
            path = path.resolve()

        return path
Using it is pretty simple, assuming the class above is saved in a module named pathtype.py:

import argparse

from pathtype import PathType

parser = argparse.ArgumentParser()
parser.add_argument(
    "-p",
    "--path",
    type=PathType(exists=True, path_type=None, dash_ok=True),
    help="Path or Dir or Symlink, I really don't care",
)
args = parser.parse_args()
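One thing to keep in mind: unlike FileType, PathType only validates and returns a Path. For "-" it returns the Path("-") sentinel, so the calling code has to decide what that means. A minimal sketch of one way to handle it on the reading side follows; open_input is a hypothetical helper that is not part of the class above, and the temporary file only stands in for a validated args.path.

```python
import sys
import tempfile
from contextlib import contextmanager
from pathlib import Path
from typing import Iterator, TextIO

STDIO_PATH = Path("-")  # same sentinel value that PathType returns for "-"


@contextmanager
def open_input(path: Path) -> Iterator[TextIO]:
    """Yield sys.stdin for the "-" sentinel, otherwise an opened text file."""
    if path == STDIO_PATH:
        # Leave stdin open; this helper did not open it
        yield sys.stdin
        return
    with path.open("r", encoding="utf-8") as handle:
        yield handle


# Demo with a temporary file standing in for a validated args.path
with tempfile.TemporaryDirectory() as tmp:
    sample = Path(tmp) / "sample.txt"
    sample.write_text("hello\n", encoding="utf-8")
    with open_input(sample) as handle:
        first_line = handle.readline().strip()

print(first_line)  # hello
```

The same idea works in the other direction with sys.stdout for arguments that name an output.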
I hope this small custom class saves you a ton of time otherwise spent reinventing wheels, because this one seems to be rolling fine now!