The part where I glaze Python
This time, I'll confess: like any other website owner, I am obsessed with my own creation. Another habit of mine, at least when it comes to programming, is a very weird fixation on everything being minimal and tight.
"minimal and tight" is especially funny, I guess, since my language of choice is Python.
And don't get me wrong: Python is a terrible language in terms of computer resource efficiency. But every time I come across a problem in my life that I need to solve, this is exactly the language that comes and saves the day in a few hundred lines of code.
Especially surprising to me is the size of Python's standard library: it is huge! And while many people criticize it for many fair reasons, I still cannot find another language I could safely call offline-ready without having to reinvent every single thing.
And the more time I spend reading the documentation, the more I realize how many boring parts are already solved, leaving room for the fun stuff.
The part where I am inspired
I hope the previous part wasn't that annoying.
A year or so ago I came across this post in my RSS feed: Minimum viable blog by Carl Öst Wilkens. It planted the thought that I could build a thing that I completely own from the ground up. No dependencies, 100% compatibility with my use cases, and absolute freedom of expression. Then I started seeing little DSLs for the personal small web all around. The most notable examples are:
The part where I am generating HTML
Since this is an already solved problem, I would just love to share some code snippets... If only there were a simple way to just dynamically import them, as if the post itself were a Python module...
Well, jokes aside, this post is indeed a Python module using a small DSL translation module I modestly called dom.py.
The main API is exposed via this small dunder accessor in the Tag class:
class Tag:
    def __getattr__(self, name: str):
        if name not in TAGS:
            raise AttributeError(f"Unknown tag: {name}")

        def wrapper(*children, **attrs):
            if name in VOID and children:
                raise TypeError(f"<{name}> is void and cannot have children")
            return Element(name, list(children), attrs)

        return wrapper
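To see the accessor in action, here is a minimal, self-contained sketch. The tiny TAGS and VOID sets and the simplified Element class are assumptions made for illustration only; the real dom.py is not shown in full in this post and does more than this.

```python
import html
from dataclasses import dataclass, field

# Assumption: tiny illustrative subsets; a real module would know many more tags.
TAGS = {"p", "a", "br", "div"}
VOID = {"br"}


@dataclass
class Element:
    """Simplified stand-in for the real Element class."""
    name: str
    children: list = field(default_factory=list)
    attrs: dict = field(default_factory=dict)

    def render(self) -> str:
        # Escape attribute values so quotes cannot break out of the attribute
        attrs = "".join(
            f' {key}="{html.escape(str(value))}"'
            for key, value in self.attrs.items()
        )
        if self.name in VOID:
            return f"<{self.name}{attrs}>"
        inner = "".join(
            child.render() if isinstance(child, Element) else html.escape(str(child))
            for child in self.children
        )
        return f"<{self.name}{attrs}>{inner}</{self.name}>"


class Tag:
    def __getattr__(self, name: str):
        # Only called for attributes that normal lookup cannot find: t.p, t.a, ...
        if name not in TAGS:
            raise AttributeError(f"Unknown tag: {name}")

        def wrapper(*children, **attrs):
            if name in VOID and children:
                raise TypeError(f"<{name}> is void and cannot have children")
            return Element(name, list(children), attrs)

        return wrapper


t = Tag()
snippet = t.p(
    "Read ", t.a("the docs", href="https://docs.python.org"), " & enjoy"
).render()
print(snippet)
```

Positional arguments become children and keyword arguments become attributes, so nesting calls mirrors nesting tags.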
The rest of the module is just escaping (HTML and otherwise), attribute population and support for iterables:
def render_child(self, node: Renderable) -> str:
    match node:
        case None:
            return ""
        case Raw(text=text):
            return text
        case Element():
            return node.render()
        case list() | tuple() | set():
            return " ".join(self.render_child(x) for x in node)
        case _:
            return html.escape(str(node))
That said, the module itself is very easy to use, and can simply be dropped into any project and used for HTML generation. In my case, I started by implementing these snippets:
from datetime import date
from dom import t
import inspect

def header():
    links = [
        item
        for k, v in {
            "home": "index.html",
            "git": "https://codeberg.org/itikhonov",
            "rss": "rss.xml",
        }.items()
        for item in (t.a(k, href=v), " | ")
    ][:-1]
    return t.header(
        t.h1("Ivan Tikhonov's Blog"),
        t.p("A collection of post-its I couldn't have found on the internet"),
        t.nav(links),
        t.hr()
    )
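
def footer(year: int | date | None = None):
    # Accept a plain year or a date (only its year is shown); default to today
    if isinstance(year, date):
        year = year.year
    return t.footer(
        t.hr(),
        t.p(f"Copyright {year or date.today().year}"),
    )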

def page(title, content):
    return t.html(
        t.head(
            t.title(title),
            t.link(rel="icon", href="data:,"),
            t.link(rel="stylesheet", href="/assets/style.css")
        ),
        t.body(
            header(),
            t.h1(title),
            t.article(content),
            footer()
        )
    )

def source(callable):
    return t.pre(t.code(inspect.getsource(callable)))
The source function above is used to present all the internals of this project, so the code shown here is going to be updated with each new iteration and build cycle :)
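The separator trick in header() may look odd at first glance, so here is the same pattern in isolation, using plain strings instead of elements (the labels and the arrow format are only for illustration):

```python
links = {"home": "index.html", "rss": "rss.xml"}

# Emit (item, separator) pairs as one flat list, then drop the trailing separator
flat = [
    piece
    for label, href in links.items()
    for piece in (f"{label} -> {href}", " | ")
][:-1]

print(flat)
```

The nested comprehension flattens each pair, and the final slice removes the one separator that has nothing after it.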
The part where I love pathlib
But blogs aren't just HTML converters: they also automate file and resource management, orchestration and indexing of the whole thing! This is where pathlib comes in super handy for me.
Sure, it's slow and way overkill for my purposes, but it is just so pleasant to use, and when it comes to hobby projects, I tend to use whatever brings the most joy while solving the problem.
So modules... I mean, blog posts! Yes, blog posts have metadata, and I often include code too... Very often it is Python, so I need to take that into account...
This is exactly why I've decided to just use importlib:
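import importlib.util
import sys
from pathlib import Path

def load_post(path: Path):
    # Build a module spec straight from the file path and execute it
    spec = importlib.util.spec_from_file_location(path.stem, path)
    mod = importlib.util.module_from_spec(spec)
    sys.modules[spec.name] = mod
    spec.loader.exec_module(mod)
    # date and content are required; title and tags fall back to defaults
    return {
        "title": getattr(mod, "title", "Untitled"),
        "date": getattr(mod, "date"),
        "tags": getattr(mod, "tags", []),
        "content": getattr(mod, "content")(),
        "path": path
    }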
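For anyone who wants to try the loading step on its own, here is a small sketch that writes a hypothetical post file into a temporary directory and imports it by path. The post's contents (title, date, and the content function) are made up for the demo; only the importlib calls mirror load_post above.

```python
import importlib.util
import sys
import tempfile
from datetime import date
from pathlib import Path

# Hypothetical post module, written to a temp dir just for this demo
POST_SOURCE = '''
from datetime import date

title = "Hello, stdlib"
date = date(2024, 1, 1)

def content():
    return "post body"
'''


def load_module(path: Path):
    spec = importlib.util.spec_from_file_location(path.stem, path)
    mod = importlib.util.module_from_spec(spec)
    sys.modules[spec.name] = mod  # register the module under its file stem
    spec.loader.exec_module(mod)
    return mod


with tempfile.TemporaryDirectory() as tmp:
    post_path = Path(tmp) / "hello_stdlib.py"
    post_path.write_text(POST_SOURCE, encoding="utf-8")
    mod = load_module(post_path)

meta = {
    "title": getattr(mod, "title", "Untitled"),
    "date": mod.date,
    "tags": getattr(mod, "tags", []),  # a missing attribute falls back to a default
    "content": mod.content(),
}
print(meta)
```

Since a post is ordinary Python, its metadata is just module-level names, and optional fields cost nothing more than a getattr default.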
And so you see how simple HTML generation, content inclusion and, basically, anything else now is. We just need a little pathlib magic in the build step:
def build():
    # Make sure the output directory exists even when there are no assets
    BUILD_DIR.mkdir(parents=True, exist_ok=True)

    if ASSETS_DIR.exists():
        shutil.copytree(
            ASSETS_DIR,
            BUILD_DIR / ASSETS_DIR.name,
            dirs_exist_ok=True
        )

    # Render every post module in the content directory to its own page
    posts = []
    for post_path in CONTENT_DIR.glob("*.py"):
        if post_path.name == "__init__.py":
            continue
        post = load_post(post_path)
        posts.append(post)
        post_html = page(post["title"], post["content"]).render()
        out_file = BUILD_DIR / post_path.with_suffix(".html").name
        out_file.write_text(post_html, encoding="utf-8")

    # Newest posts first on the index page
    posts.sort(key=lambda p: p["date"], reverse=True)
    index_html = page(
        None,
        t.ul([
            t.li((
                t.a(
                    post["title"],
                    href=post["path"].with_suffix(".html").name
                ),
                f"[{post['date']}]"
            ))
            for post in posts
        ])
    ).render()
    (BUILD_DIR / "index.html").write_text(index_html, encoding="utf-8")

    rss_file = BUILD_DIR / "rss.xml"
    feed(posts, site_name=SITE_NAME, site_url=BASE_URL, rss_file=rss_file)
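The pathlib part of the build step boils down to three moves: glob the sources, swap the suffix, and re-root the file name under the build directory. A minimal sketch, using a temporary directory and empty placeholder posts (the directory and file names are assumptions for the demo):

```python
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    content_dir = Path(tmp) / "content"
    build_dir = Path(tmp) / "build"
    content_dir.mkdir()
    build_dir.mkdir()

    # Placeholder sources; real posts would be full Python modules
    for name in ("__init__.py", "first_post.py", "second_post.py"):
        (content_dir / name).write_text("", encoding="utf-8")

    written = []
    # sorted() because glob makes no promise about ordering
    for post_path in sorted(content_dir.glob("*.py")):
        if post_path.name == "__init__.py":
            continue
        # Swap the suffix, keep only the file name, and re-root it under build_dir
        out_file = build_dir / post_path.with_suffix(".html").name
        out_file.write_text(f"<h1>{post_path.stem}</h1>", encoding="utf-8")
        written.append(out_file.name)

print(written)
```

The / operator and with_suffix keep the whole mapping from source file to output file on a single readable line.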
Overall, my "engine" is 3 module files of ~100 lines of code, while remaining stdlib-only. Isn't it nice?
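The feed function called at the end of build is not shown in this post. As a rough idea of how little it takes to stay stdlib-only, here is a hypothetical sketch of an RSS 2.0 generator built on xml.etree.ElementTree. The function name feed_xml, the choice of fields, and the example URL are all assumptions, and it returns a string instead of writing rss_file.

```python
import xml.etree.ElementTree as ET
from datetime import date, datetime, time, timezone
from email.utils import format_datetime
from pathlib import Path


def feed_xml(posts, site_name: str, site_url: str) -> str:
    """Hypothetical stand-in for feed(): build an RSS 2.0 document as text."""
    rss = ET.Element("rss", version="2.0")
    channel = ET.SubElement(rss, "channel")
    ET.SubElement(channel, "title").text = site_name
    ET.SubElement(channel, "link").text = site_url
    ET.SubElement(channel, "description").text = site_name

    for post in posts:
        item = ET.SubElement(channel, "item")
        ET.SubElement(item, "title").text = post["title"]
        page_name = post["path"].with_suffix(".html").name
        ET.SubElement(item, "link").text = f"{site_url}/{page_name}"
        # RSS expects RFC 822 style dates; treat the post date as midnight UTC
        stamp = datetime.combine(post["date"], time.min, tzinfo=timezone.utc)
        ET.SubElement(item, "pubDate").text = format_datetime(stamp)

    # ElementTree escapes text content (e.g. "&") while serializing
    return ET.tostring(rss, encoding="unicode")


xml_text = feed_xml(
    [{"title": "Hello & welcome", "path": Path("hello.py"), "date": date(2024, 1, 1)}],
    site_name="Example Blog",
    site_url="https://example.com",
)
print(xml_text)
```

Building the tree with ElementTree instead of string formatting means escaping is handled by the serializer.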
The part where I reflect
Why Python modules and not Markdown + YAML?
I think that Markdown is not the best fit for me. It is not consistent, and having yet another intermediate representation felt bad to me. Also, to properly support Markdown we would need to introduce even more complexity, since it allows embedded HTML, and YAML itself is just harmful.
Could this project exist as comfortably in another language?
Yes, I know the answer! I think I even have a few of them! Raku and some kind of Lisp would work. Honestly, I can't wait to finish writing this post and actually try rewriting this whole thing in it. It is wild, and everyone should at least read about it.
Is this engineering, or just procrastination with taste?
In the age of Hugo and just plain HTML, which is what I had been writing this whole time, I would consider calling it plain procrastination. Of course, I've learnt a lot about Python's stdlib, and I will apply these lessons in any project my hands get on in the future. Like I said, pathlib is now one of my favorite modules in the standard library, and I already use it extensively in my pandas file ops.
The part where I sum things up
Python is fun! So is its standard library. Also, inventing your own wheels that roll exactly on your own weird roads is great.