Ivan Tikhonov

A collection of post-its I couldn't have found on the internet

Solving Path Validation in Python CLI Tools

argparse --help

I feel that the argparse module from the Python standard library is very much neglected on the internet, both in blog posts and other social media and in the source code I've been reading lately. Recently I built quite a few CLI tools with argparse, and I want to save one thing I found here.

argparse --validate "my arguments"

I think anyone who has ever built a CLI tool in Python with argparse, or has at least gone through the tutorials, knows about the type argument of add_argument. I don't know why, but I am pretty sure it is not common knowledge that type accepts more than the standard Python types: it can be any callable that takes a single string, or the name of a registered type. If the callable raises ArgumentTypeError, TypeError, or ValueError, argparse reports it to the user as a usage error.

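from argparse import ArgumentParser, ArgumentTypeError


def validator(raw: str) -> list[str]:
    stoplist = {"not", "skip", "off"}
    # Keep letters and whitespace only; dropping the spaces as well
    # would glue all words together and the split would never split.
    cleaned = "".join(c for c in raw if c.isalpha() or c.isspace())
    words = cleaned.split()
    if any(word in stoplist for word in words):
        raise ArgumentTypeError(
            "In Soviet Union you don't turn off validation - "
            "validation turns off you"
        )
    # The parsed value is the list of cleaned words, not the raw string.
    return words


parser = ArgumentParser()
parser.add_argument(
    "--validate",
    type=validator,
    help="Validated parameter! Do not try to skip it",
)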
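The "name of a registered type" variant mentioned above is less well known, so here is a minimal sketch of it. It relies on the parser's register method and its "type" registry, which has been part of argparse for a long time but was only lightly documented in older Python versions. The hex_int function, the "hex" name, and the --mask option are made up for illustration.

```python
import argparse


def hex_int(raw: str) -> int:
    """Parse a hexadecimal string such as 'ff' or '0xff'."""
    try:
        return int(raw, 16)
    except ValueError:
        raise argparse.ArgumentTypeError(
            f"not a hexadecimal integer: {raw!r}"
        ) from None


parser = argparse.ArgumentParser()
# Register the callable under a name in the parser's "type" registry...
parser.register("type", "hex", hex_int)
# ...and refer to it by that name instead of passing the callable itself.
parser.add_argument("--mask", type="hex")

args = parser.parse_args(["--mask", "0xff"])
print(args.mask)  # prints 255
```

The registration lives on that one parser instance, so a different parser would need its own register call.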

I love this argument: it has saved me from reinventing far too many wheels in every CLI project I've had my hands on. But today I also found an interesting addition to parameter validation in the docs.

argparse --file path/to/file

Yes! One of my most frequent use cases for type safety is already implemented in the library: FileType!

import argparse

parser = argparse.ArgumentParser()
parser.add_argument('infile', type=argparse.FileType('r'))
# "-" is the conventional placeholder for standard input
parser.parse_args(['-'])
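To make it concrete what FileType hands back, here is a small sketch that calls the type object directly, the same way argparse does internally. The temporary file name and the variable names are illustrative only.

```python
import argparse
import sys
import tempfile
from pathlib import Path

file_type = argparse.FileType("r")

with tempfile.TemporaryDirectory() as tmp:
    sample = Path(tmp) / "sample.txt"
    sample.write_text("hello\n")

    # A regular file comes back as an open file object, not as a path.
    with file_type(str(sample)) as handle:
        content = handle.read()

    # A directory cannot be opened, so the type raises ArgumentTypeError,
    # which argparse would normally turn into a usage error.
    try:
        file_type(tmp)
        directory_error = None
    except argparse.ArgumentTypeError as exc:
        directory_error = str(exc)

# "-" in read mode is mapped to the already-open standard input stream.
stdin_handle = file_type("-")
```

Note that the file is opened during argument parsing, so closing it is up to the caller.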

And it is great and it works! Except that you are limited to files only: no directories, no symlinks. After some extensive and exhausting googling for 30 seconds I found this great patch from more than 10 years ago. I just want to thank Dan Lenski for the idea, but it definitely deserves a modern touch-up with another great stdlib member: pathlib.

argparse --path path/to/anything

Without any further introduction, please see:

from argparse import ArgumentTypeError
from pathlib import Path
from typing import Literal

STDIO_PATH = Path("-")


class PathType:
    """
    Custom argparse type for validating and processing file system paths.

    Validates paths based on their existence, type (file, directory, symlink),
    and can optionally resolve them to absolute paths. Supports using "-"
    for stdin/stdout.

    Args:
        exists (bool | None):
            - True: path must exist.
            - False: path must not exist, but must have a parent directory.
            - None: no validation.
        path_type (Literal["file", "dir", "symlink"] | None):
            Type of path to validate ("file", "dir", "symlink", or None).
        dash_ok (bool): Allow "-" for stdin/stdout (default is True).
        resolve (bool): Resolve to an absolute path (default is False).

    Example:
        parser.add_argument(
            '--input',
            type=PathType(exists=True, path_type="file")
        )

    Inspired by:
        https://mail.python.org/pipermail/stdlib-sig/2015-July/000990.html
    """

    def __init__(
        self,
        exists: bool | None = True,
        path_type: Literal["file", "dir", "symlink"] | None = "file",
        dash_ok: bool = True,
        resolve: bool = False,
    ):
        self.exists = exists
        self.path_type = path_type
        self.dash_ok = dash_ok
        self.resolve = resolve

    def __call__(self, string: str) -> Path:
        if string == "-":
            if not self.dash_ok:
                raise ArgumentTypeError(
                    'The "-" symbol is not allowed for this argument.'
                )
            return STDIO_PATH

        path = Path(string)

        if self.exists:
            if not path.exists():
                raise ArgumentTypeError(f"Path does not exist: {path}")
            if self.path_type == "file" and not path.is_file():
                raise ArgumentTypeError(f"Path is not a file: {path}")
            if self.path_type == "dir" and not path.is_dir():
                raise ArgumentTypeError(f"Path is not a dir: {path}")
            if self.path_type == "symlink" and not path.is_symlink():
                raise ArgumentTypeError(f"Path is not a symlink: {path}")
        elif self.exists is False:
            if path.exists():
                raise ArgumentTypeError(f"Path exists: {path}")
            parent = path.parent
            if not parent.exists() or not parent.is_dir():
                raise ArgumentTypeError(
                    f"Parent directory does not exist: {parent}"
                )

        if self.resolve:
            path = path.resolve()

        return path
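One behaviour worth knowing when combining exists=True with path_type="symlink": Path.exists() follows symbolic links, so a dangling link counts as non-existent and the class above rejects it with "Path does not exist" before is_symlink() is ever reached. A minimal sketch of that pathlib behaviour, assuming a platform where the current user may create symlinks (the file names are illustrative):

```python
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    target = Path(tmp) / "target.txt"
    link = Path(tmp) / "link"

    target.write_text("data")
    link.symlink_to(target)
    target.unlink()  # the link now points at nothing

    link_exists = link.exists()          # follows the link to its target
    link_is_symlink = link.is_symlink()  # inspects the link itself

print(link_exists, link_is_symlink)  # prints: False True
```

If dangling links should be accepted, one option would be to test is_symlink() before exists(); whether that is desirable depends on the tool.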

Usage is pretty simple, assuming the class is saved as pathtype.py:

import argparse
from pathtype import PathType

parser = argparse.ArgumentParser()
parser.add_argument(
    "-p",
    "--path",
    type=PathType(exists=True, path_type=None, dash_ok=True),
    help="Path or Dir or Symlink, I really don't care",
)
args = parser.parse_args()
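Unlike FileType, PathType returns a Path and never opens anything, so the "-" case comes back as the Path("-") sentinel and the calling code has to decide what it means. A minimal sketch of one way to handle it for a file argument; read_text is a hypothetical helper, not part of the class, and its stream parameter exists only to make the example easy to try without real piped input.

```python
import io
import sys
from pathlib import Path
from typing import TextIO

STDIO_PATH = Path("-")  # same sentinel value PathType returns for "-"


def read_text(path: Path, stream: TextIO = sys.stdin) -> str:
    """Hypothetical helper: read from the stream for "-", else from the file."""
    if path == STDIO_PATH:
        return stream.read()
    return path.read_text()


# Stand-in for piped input; in a real tool this would be
# read_text(args.path) with the default sys.stdin.
piped = read_text(Path("-"), stream=io.StringIO("piped input\n"))
print(piped, end="")  # prints: piped input
```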

I hope this small custom class saves you a ton of time otherwise spent reinventing wheels, because this one seems to be rolling fine now!