argparse --help
At some point in a project's lifecycle, scripts tend to be reused across different contexts. In my experience, without a consistent interface, people often start to directly modify the source code for each new context, which makes synchronization and updates painfully slow and error-prone.
Also, I love a good CLI: one that knows what it wants, how it wants it, and communicates it clearly. So I want to record a few tricks I picked up after spending some time "in the field".
argparse --validate
I think it is common knowledge that argparse is a Python library for convenient handling of CLI arguments, and that each argument supports custom validation through the type parameter. What is less commonly emphasized is that type accepts any callable that takes a single string and returns a converted value.
This opens an opportunity to implement custom validation logic at the CLI boundary, instead of manually post-processing arguments after parsing.
For example, we can define a simple validator that rejects certain input patterns:
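from argparse import ArgumentTypeError


def validator(value: str) -> str:
    stoplist = {"not", "skip", "off", "don't", 'don"t', "no", "false"}
    # Keep letters and quote characters and turn everything else into spaces,
    # so that the input still splits into separate words.
    cleaned = "".join(c if c.isalpha() or c in "'\"" else " " for c in value)
    words = cleaned.lower().split()
    if any(word in stoplist for word in words):
        raise ArgumentTypeError(
            "In Soviet Union you don't turn off validation, "
            "but validation turns you off."
        )
    return value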
This function can then be attached directly to an argument definition:
from argparse import ArgumentParser

parser = ArgumentParser()
parser.add_argument(
    "--validate",
    type=validator,
    help="Caution! Validated parameter! Do not skip (wink)",
)

The idea is that validation becomes part of the interface itself rather than an afterthought inside the application, which keeps the application code cleaner and simpler.
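To see what "validation at the CLI boundary" looks like from the user's side, here is a minimal sketch. The positive_int validator, the --count option, and the "demo" program name are hypothetical and exist only for illustration. When the callable raises ArgumentTypeError, argparse prints the usage line with the message and exits with status 2, so the application code never sees the bad value.

```python
import argparse


def positive_int(value: str) -> int:
    """Hypothetical validator: accept only integers greater than zero."""
    try:
        number = int(value)
    except ValueError:
        raise argparse.ArgumentTypeError(f"not an integer: {value!r}")
    if number <= 0:
        raise argparse.ArgumentTypeError(f"must be positive, got {number}")
    return number


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(prog="demo")
    parser.add_argument("--count", type=positive_int, default=1)
    return parser


parser = build_parser()

# Valid input arrives already converted to int.
print(parser.parse_args(["--count", "3"]).count)

# Invalid input: argparse reports the error on stderr and raises SystemExit.
try:
    parser.parse_args(["--count", "0"])
except SystemExit as exc:
    print("exit status:", exc.code)
```

The try/except is only there to keep the demo running; a real script would simply let argparse exit.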
argparse --input
Another feature of argparse that often slips by unnoticed is the built-in FileType.
It looks trivial: it converts a string path into an already-open file object during argument parsing. But it removes so much boilerplate in everyday CLI tools, and is so easy to add, that I am genuinely surprised it is so unpopular.
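A minimal sketch of FileType in use, assuming a hypothetical --input option and a temporary sample file created just for the demo. One caveat worth checking against the documentation for your Python version: as far as I know, recent releases mark FileType as deprecated and recommend opening files after parsing instead.

```python
import argparse
import tempfile
from pathlib import Path

parser = argparse.ArgumentParser(prog="demo")
# FileType("r") opens the named file for reading during parsing;
# the special value "-" maps to sys.stdin.
parser.add_argument(
    "--input",
    type=argparse.FileType("r", encoding="utf-8"),
    default="-",
)

with tempfile.TemporaryDirectory() as tmp:
    sample = Path(tmp) / "sample.txt"  # hypothetical input file
    sample.write_text("hello\n", encoding="utf-8")

    args = parser.parse_args(["--input", str(sample)])
    # args.input is already an open text stream; closing it is still our job.
    with args.input as stream:
        content = stream.read()
```

Note that the file is opened as soon as the arguments are parsed, so it stays open until the program closes it explicitly.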
While it is perfect for simple cases, I started to wonder: why is there no equivalent for directories?
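For the simplest case, a plain function passed as type already gets you most of the way. This is a minimal sketch, not a standard-library feature: existing_dir and the --out-dir option are hypothetical names chosen for illustration.

```python
import argparse
import tempfile
from pathlib import Path


def existing_dir(value: str) -> Path:
    """Hypothetical helper: accept only paths that point to an existing directory."""
    path = Path(value)
    if not path.is_dir():
        raise argparse.ArgumentTypeError(f"not a directory: {value}")
    return path


parser = argparse.ArgumentParser(prog="demo")
parser.add_argument("--out-dir", type=existing_dir, default=Path.cwd())

with tempfile.TemporaryDirectory() as tmp:
    args = parser.parse_args(["--out-dir", tmp])
    parsed_dir = args.out_dir  # a pathlib.Path, validated during parsing
    dir_was_valid = parsed_dir.is_dir()
```

It works, but every new requirement (must not exist yet, may be a file or a directory, and so on) means another one-off function.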
argparse --path
Once you start thinking about file inputs, it becomes clear that FileType solves only a very limited set of problems. And I am not the only one who noticed this. In fact, there was already a proposal to extend the idea further in the standard library:
stdlib-sig discussion: PathType proposal
The idea behind the patch is very simple: extend the functionality to directories, symlinks, and even standard input. I find the idea very interesting, although the implementation has aged a little; pathlib would look much better here than os.
# Note: the "X | None" annotations below require Python 3.10 or newer.
from argparse import ArgumentTypeError
from pathlib import Path
from typing import Literal
import argparse

Kind = Literal["file", "directory", "symlink"]


class PathType:
    def __init__(
        self,
        must_exist: bool | None = True,
        kind: Kind | tuple[Kind, ...] | None = None,
        allow_stdio: bool = False,
        resolve: bool = False,
    ):
        """
        Argparse type for validating filesystem path arguments.

        Converts a string input into a pathlib.Path object or stdin sentinel.
        Supports existence checks, path kind filtering, and path resolution.

        Args:
            must_exist (bool | None): Controls existence validation.
                True requires the path to exist.
                False requires the path not to exist.
                None disables existence checking.
            kind (Kind | tuple[Kind, ...] | None): Allowed path types. Valid
                values are "file", "directory", and "symlink". If None, no
                type filtering is applied.
            allow_stdio (bool): If True, "-" is treated as stdin and returned
                unchanged.
            resolve (bool): If True, resolves the path before validation.

        Returns:
            Path | str: Validated filesystem path or stdin sentinel "-".

        Raises:
            ArgumentTypeError: If validation fails.
        """
        self.must_exist = must_exist
        self.allow_stdio = allow_stdio
        self.resolve = resolve
        # Accept a single kind as well as a tuple of kinds.
        if isinstance(kind, str):
            kind = (kind,)
        self.kind = frozenset(kind) if kind is not None else None

    def __call__(self, value: str) -> Path | str:
        if value == "-" and self.allow_stdio:
            return value
        path = Path(value)
        # Check the link status before resolving: resolve() follows symlinks,
        # so a resolved path never reports is_symlink().
        is_symlink = path.is_symlink()
        if self.resolve:
            path = path.resolve()
        if self.must_exist is True and not path.exists():
            raise ArgumentTypeError(f"Path does not exist: {path}")
        if self.must_exist is False and path.exists():
            raise ArgumentTypeError(f"Path already exists: {path}")
        if self.kind is not None and path.exists():
            # is_file() and is_dir() follow symlinks, so a link to a file
            # also satisfies the "file" kind.
            if not (
                ("file" in self.kind and path.is_file())
                or ("directory" in self.kind and path.is_dir())
                or ("symlink" in self.kind and is_symlink)
            ):
                allowed = ", ".join(sorted(self.kind))
                raise ArgumentTypeError(
                    f"Path does not match allowed kinds ({allowed}): {path}"
                )
        return path
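A few pathlib behaviors are worth keeping in mind when combining kind filtering with resolve. The sketch below uses a temporary directory with hypothetical file names to show them: is_file() and exists() follow symlinks, while resolve() replaces a link with its target. It assumes a platform where creating symlinks is permitted.

```python
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    target = Path(tmp) / "target.txt"
    target.write_text("data", encoding="utf-8")
    link = Path(tmp) / "link.txt"
    link.symlink_to(target)  # may need extra privileges on Windows

    # is_file() follows the link, so a symlink to a file also counts as a file.
    link_is_file = link.is_file()
    link_is_symlink = link.is_symlink()

    # resolve() replaces the link with its target, so the link status is lost.
    resolved_is_symlink = link.resolve().is_symlink()

    # exists() follows links too: a dangling symlink reports False.
    target.unlink()
    dangling_exists = link.exists()
    dangling_is_symlink = link.is_symlink()
```

In practice this means that any symlink check has to happen on the unresolved path, and that a dangling link is treated as a path that does not exist.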
A little copying is better than a little dependency. (Rob Pike)