EPUB Multirenamer for Mac
brought to you by dzhome.de
Last updated: 8 April 2026
0. Introduction
This script lets you edit multiple EPUB files in one run. In a YAML file, you define which terms should be searched for and how they should be replaced. For example, this makes it possible to quickly adapt an EPUB file, at least in part, to the newer German spelling rules. I also provide my own YAML file here as a download, and I update and expand it on a regular basis. There is also a German version of this guide.
I used ChatGPT on 28.03.2026 to create this Python script.
1. Project structure
The workflow is always the same:
- Put your EPUB files into EPUB-INPUT
- Edit replacements.yaml if needed
- Run the script in Terminal
- Check the results in EPUB-OUTPUT
2. Requirements on Mac
2.1 Open Terminal
Open Terminal using Spotlight: Cmd + Space, then type Terminal.
2.2 Check Python
python3 --version
If a Python version is displayed, you can continue with step 2.4.
If you see command not found, install Python as described in step 2.3 below.
2.3 Install Python on Mac
2.3.1 Install the Command Line Tools
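xcode-select --install
If the installation fails or the tools appear to be damaged, remove the existing installation and install it again:
sudo rm -rf /Library/Developer/CommandLineTools
xcode-select --install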
2.3.2 Install Homebrew
Homebrew is the easiest way to install Python on a Mac.
/bin/bash -c "$(curl -fsSL https://raw.githubusercontent.com/Homebrew/install/HEAD/install.sh)"
2.3.3 Install Python
brew install python
2.3.4 Verify the installation
python3 --version
If the output shows a version such as Python 3.12.x or Python 3.13.x, Python has been installed correctly.
2.4 Install PyYAML
python3 -m pip install pyyaml
The PyYAML package is required so that the script can read the replacements.yaml file.
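To confirm that PyYAML is visible to the same interpreter that will run the script, a short check along the following lines may help. This is only a sketch; the helper name module_available is made up for this example and is not part of the script.

```python
import importlib.util


def module_available(name: str) -> bool:
    # Hypothetical helper: True if the top-level module can be found
    # by the interpreter that runs this code.
    return importlib.util.find_spec(name) is not None


if __name__ == "__main__":
    # PyYAML is imported under the module name "yaml".
    if module_available("yaml"):
        print("PyYAML is installed.")
    else:
        print("PyYAML is missing. Run: python3 -m pip install pyyaml")
```

Running it with python3 uses the same interpreter as the commands in this guide, which matters if several Python versions are installed.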
2.5 Which files inside the EPUB are processed?
An EPUB file is similar to a ZIP archive. Inside it, there are several files.
By default, files with the following extensions are processed:
- .xhtml
- .html
- .htm
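To get a feeling for which files this selects, the archive can be inspected with Python's zipfile module. The following is a minimal sketch, not part of the script: the function names and the in-memory demo archive are assumptions made for illustration.

```python
import io
import zipfile
from pathlib import PurePosixPath

TEXT_EXTENSIONS = {".xhtml", ".html", ".htm"}


def list_text_members(epub_file) -> list[str]:
    """Return the archive members whose extension is in TEXT_EXTENSIONS."""
    with zipfile.ZipFile(epub_file) as zf:
        return [
            name
            for name in zf.namelist()
            if PurePosixPath(name).suffix.lower() in TEXT_EXTENSIONS
        ]


def build_demo_epub() -> io.BytesIO:
    """Build a tiny in-memory archive that mimics an EPUB layout (demo data only)."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as zf:
        zf.writestr("mimetype", "application/epub+zip")
        zf.writestr("META-INF/container.xml", "<container/>")
        zf.writestr("OEBPS/chapter1.xhtml", "<html/>")
        zf.writestr("OEBPS/style.css", "body {}")
    buffer.seek(0)
    return buffer


if __name__ == "__main__":
    # Only the .xhtml member is selected; CSS and metadata are left alone.
    print(list_text_members(build_demo_epub()))
```

To inspect a real book, a path such as "EPUB-INPUT/book.epub" (a placeholder name) can be passed instead of the demo archive.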
3. Create the project folder
For example, create a folder named EPUB-Multirenamer on your desktop.
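If you prefer to create the folders from Python rather than in Finder, a sketch like the following could be used. The function name is hypothetical, and the demo runs in a temporary directory so that nothing on the desktop is touched.

```python
import tempfile
from pathlib import Path


def create_project_folders(base: Path) -> Path:
    """Create EPUB-Multirenamer with its input and output folders below base."""
    project = base / "EPUB-Multirenamer"
    for sub in ("EPUB-INPUT", "EPUB-OUTPUT"):
        # exist_ok=True makes the call safe to repeat.
        (project / sub).mkdir(parents=True, exist_ok=True)
    return project


if __name__ == "__main__":
    # Demo in a temporary directory; pass Path.home() / "Desktop"
    # to create the folders on the desktop instead.
    with tempfile.TemporaryDirectory() as tmp:
        project = create_project_folders(Path(tmp))
        print(sorted(p.name for p in project.iterdir()))
```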
4. The Python script
Save the following script as epub_multi_replace.py.
#!/usr/bin/env python3
# -*- coding: utf-8 -*-
from __future__ import annotations
import argparse
import re
import sys
import tempfile
import zipfile
from dataclasses import dataclass
from pathlib import Path
import xml.etree.ElementTree as ET
import yaml
TEXT_EXTENSIONS = {".xhtml", ".html", ".htm"}
INPUT_DIR_NAME = "EPUB-INPUT"
OUTPUT_DIR_NAME = "EPUB-OUTPUT"


@dataclass
class ReplacementRule:
    pattern: str
    replacement: str
    regex: bool = True
    flags: int = 0
    description: str = ""


def log(msg: str) -> None:
    print(msg)


def is_text_file(path: Path) -> bool:
    return path.is_file() and path.suffix.lower() in TEXT_EXTENSIONS


def parse_flags(flag_names: list[str] | None) -> int:
    # Translate flag names from the YAML file into a combined re flag value.
    if not flag_names:
        return 0
    flag_map = {
        "IGNORECASE": re.IGNORECASE,
        "MULTILINE": re.MULTILINE,
        "DOTALL": re.DOTALL,
    }
    value = 0
    for name in flag_names:
        if name not in flag_map:
            raise ValueError(f"Unknown regex flag: {name}")
        value |= flag_map[name]
    return value


def load_replacements(yaml_path: Path) -> list[ReplacementRule]:
    try:
        data = yaml.safe_load(yaml_path.read_text(encoding="utf-8"))
    except FileNotFoundError as exc:
        raise FileNotFoundError(f"YAML file not found: {yaml_path}") from exc
    except Exception as exc:
        raise ValueError(f"Failed to read YAML file: {exc}") from exc
    if not isinstance(data, dict):
        raise ValueError("YAML file does not have a valid root object.")
    items = data.get("replacements")
    if not isinstance(items, list):
        raise ValueError("YAML file must contain a key 'replacements' with a list.")
    rules: list[ReplacementRule] = []
    for idx, item in enumerate(items, start=1):
        if not isinstance(item, dict):
            raise ValueError(f"Entry #{idx} in 'replacements' is not an object.")
        pattern = item.get("pattern")
        replacement = item.get("replacement", "")
        regex = item.get("regex", True)
        description = item.get("description", "")
        flags = parse_flags(item.get("flags"))
        if not isinstance(pattern, str):
            raise ValueError(f"Entry #{idx}: 'pattern' is missing or is not a string.")
        if not isinstance(replacement, str):
            raise ValueError(f"Entry #{idx}: 'replacement' is not a string.")
        if not isinstance(regex, bool):
            raise ValueError(f"Entry #{idx}: 'regex' must be true or false.")
        if not isinstance(description, str):
            raise ValueError(f"Entry #{idx}: 'description' is not a string.")
        rules.append(
            ReplacementRule(
                pattern=pattern,
                replacement=replacement,
                regex=regex,
                flags=flags,
                description=description,
            )
        )
    return rules


def extract_epub(epub_path: Path, target_dir: Path) -> None:
    with zipfile.ZipFile(epub_path, "r") as zf:
        zf.extractall(target_dir)


def apply_replacements(text: str, rules: list[ReplacementRule]) -> tuple[str, int]:
    total_changes = 0
    updated = text
    for rule in rules:
        if rule.regex:
            def repl(match: re.Match[str]) -> str:
                result = rule.replacement

                def upper_first(s: str) -> str:
                    return s[:1].upper() + s[1:] if s else s

                def lower_first(s: str) -> str:
                    return s[:1].lower() + s[1:] if s else s

                def repl_u(m: re.Match[str]) -> str:
                    return upper_first(match.group(int(m.group(1))))

                def repl_l(m: re.Match[str]) -> str:
                    return lower_first(match.group(int(m.group(1))))

                def repl_U(m: re.Match[str]) -> str:
                    return match.group(int(m.group(1))).upper()

                def repl_L(m: re.Match[str]) -> str:
                    return match.group(int(m.group(1))).lower()

                # Expand the case markers (\U\1, \L\1, \u\1, \l\1) first,
                # then let match.expand() resolve ordinary backreferences.
                result = re.sub(r"\\U\\(\d+)", repl_U, result)
                result = re.sub(r"\\L\\(\d+)", repl_L, result)
                result = re.sub(r"\\u\\(\d+)", repl_u, result)
                result = re.sub(r"\\l\\(\d+)", repl_l, result)
                result = match.expand(result)
                return result

            updated, count = re.subn(
                rule.pattern,
                repl,
                updated,
                flags=rule.flags,
            )
        else:
            # Plain-text rule: count the occurrences, then replace them all.
            count = updated.count(rule.pattern)
            if count:
                updated = updated.replace(rule.pattern, rule.replacement)
        total_changes += count
    return updated, total_changes


def should_skip_element_text(tag_name: str) -> bool:
    tag_name = tag_name.lower()
    return tag_name in {"script", "style"}


def local_name(tag: str) -> str:
    # Strip the "{namespace}" prefix that ElementTree puts in front of tag names.
    if "}" in tag:
        return tag.split("}", 1)[1]
    return tag


def apply_replacements_to_xhtml(
    path: Path,
    rules: list[ReplacementRule],
    dry_run: bool = False,
) -> int:
    parser = ET.XMLParser()
    tree = ET.parse(path, parser=parser)
    root = tree.getroot()
    total_replacements = 0
    for elem in root.iter():
        tag = local_name(elem.tag) if isinstance(elem.tag, str) else ""
        if not should_skip_element_text(tag):
            if elem.text:
                new_text, count = apply_replacements(elem.text, rules)
                if count:
                    elem.text = new_text
                    total_replacements += count
        # elem.tail is the text that follows this element's closing tag.
        # It is skipped when the element itself is <script> or <style>.
        if elem.tail:
            parent_tag = local_name(elem.tag) if isinstance(elem.tag, str) else ""
            if not should_skip_element_text(parent_tag):
                new_tail, count = apply_replacements(elem.tail, rules)
                if count:
                    elem.tail = new_tail
                    total_replacements += count
    if total_replacements and not dry_run:
        ET.register_namespace("", "http://www.w3.org/1999/xhtml")
        tree.write(path, encoding="utf-8", xml_declaration=True)
    return total_replacements


def process_extracted_files(
    root_dir: Path,
    rules: list[ReplacementRule],
    dry_run: bool = False,
) -> tuple[int, int]:
    changed_files = 0
    total_replacements = 0
    for path in sorted(root_dir.rglob("*")):
        if not is_text_file(path):
            continue
        try:
            replacements_in_file = apply_replacements_to_xhtml(path, rules, dry_run=dry_run)
        except ET.ParseError as exc:
            log(f"Skipped (invalid XHTML/XML): {path.relative_to(root_dir)} -> {exc}")
            continue
        except Exception as exc:
            log(f"Error while processing: {path.relative_to(root_dir)} -> {exc}")
            continue
        if replacements_in_file > 0:
            changed_files += 1
            total_replacements += replacements_in_file
            log(f" Modified: {path.relative_to(root_dir)} ({replacements_in_file} matches)")
    return changed_files, total_replacements


def create_epub(source_dir: Path, output_epub: Path) -> None:
    mimetype_path = source_dir / "mimetype"
    with zipfile.ZipFile(output_epub, "w") as zf:
        # The 'mimetype' entry is written first and uncompressed.
        if mimetype_path.exists():
            zf.write(
                mimetype_path,
                arcname="mimetype",
                compress_type=zipfile.ZIP_STORED,
            )
        for path in sorted(source_dir.rglob("*")):
            if not path.is_file():
                continue
            if path == mimetype_path:
                continue
            arcname = path.relative_to(source_dir)
            zf.write(path, arcname=str(arcname), compress_type=zipfile.ZIP_DEFLATED)


def validate_extracted_epub_structure(work_dir: Path) -> None:
    mimetype_path = work_dir / "mimetype"
    if not mimetype_path.exists():
        raise ValueError("Invalid EPUB: file 'mimetype' is missing.")
    try:
        mimetype = mimetype_path.read_text(encoding="utf-8").strip()
    except Exception as exc:
        raise ValueError(f"Invalid EPUB: could not read 'mimetype': {exc}") from exc
    if mimetype != "application/epub+zip":
        raise ValueError(
            f"Invalid EPUB: 'mimetype' has unexpected content: {mimetype!r}"
        )
    container_xml = work_dir / "META-INF" / "container.xml"
    if not container_xml.exists():
        raise ValueError("Invalid EPUB: 'META-INF/container.xml' is missing.")


def process_single_epub(
    input_epub: Path,
    output_epub: Path,
    rules: list[ReplacementRule],
    dry_run: bool = False,
) -> tuple[int, int]:
    # Work on an extracted copy in a temporary directory; the input file is never modified.
    with tempfile.TemporaryDirectory(prefix="epub_replace_") as tmp:
        work_dir = Path(tmp)
        try:
            extract_epub(input_epub, work_dir)
        except zipfile.BadZipFile:
            raise ValueError(f"Invalid EPUB/ZIP archive: {input_epub.name}")
        validate_extracted_epub_structure(work_dir)
        changed_files, total_replacements = process_extracted_files(
            work_dir,
            rules,
            dry_run=dry_run,
        )
        if not dry_run:
            output_epub.parent.mkdir(parents=True, exist_ok=True)
            create_epub(work_dir, output_epub)
        return changed_files, total_replacements


def main() -> int:
    parser = argparse.ArgumentParser(
        description=(
            "Processes all EPUB files from 'EPUB-INPUT' and writes "
            "the results to 'EPUB-OUTPUT'. Only visible text "
            "in XHTML/HTML files is modified."
        )
    )
    # The default matches the file name used throughout this guide.
    parser.add_argument(
        "--rules",
        default="replacements.yaml",
        help="Path to the YAML file with replacements (default: replacements.yaml)",
    )
    parser.add_argument(
        "--dry-run",
        action="store_true",
        help="Only show what would be changed without writing new EPUB files",
    )
    args = parser.parse_args()
    # Input and output folders are located next to the script;
    # a relative --rules path is resolved against the current folder.
    script_dir = Path(__file__).resolve().parent
    input_dir = script_dir / INPUT_DIR_NAME
    output_dir = script_dir / OUTPUT_DIR_NAME
    rules_path = Path(args.rules).expanduser().resolve()
    dry_run = args.dry_run
    try:
        rules = load_replacements(rules_path)
    except Exception as exc:
        log(f"Error loading YAML file: {exc}")
        return 1
    if not rules:
        log("Error: no replacements found in the YAML file.")
        return 1
    if not input_dir.exists():
        log(f"Error: input directory not found: {input_dir}")
        log(f"Please create a folder next to the script named '{INPUT_DIR_NAME}'.")
        return 1
    if not input_dir.is_dir():
        log(f"Error: {input_dir} is not a directory.")
        return 1
    epub_files = sorted(input_dir.glob("*.epub"))
    if not epub_files:
        log(f"No EPUB files found in {input_dir}.")
        return 0
    if not dry_run:
        output_dir.mkdir(parents=True, exist_ok=True)
    log(f"Loaded rules: {len(rules)}")
    log(f"Input directory: {input_dir}")
    log(f"Output directory: {output_dir}")
    log(f"Found EPUB files: {len(epub_files)}")
    log("")
    processed_count = 0
    failed_count = 0
    total_changed_files = 0
    total_replacements = 0
    for input_epub in epub_files:
        output_epub = output_dir / input_epub.name
        log(f"Processing: {input_epub.name}")
        try:
            changed_files, replacements = process_single_epub(
                input_epub=input_epub,
                output_epub=output_epub,
                rules=rules,
                dry_run=dry_run,
            )
            total_changed_files += changed_files
            total_replacements += replacements
            processed_count += 1
            if dry_run:
                log(
                    f" Dry run finished: {changed_files} files with changes, "
                    f"{replacements} replacements"
                )
            else:
                log(
                    f" Done: {output_epub.name} "
                    f"({changed_files} files with changes, {replacements} replacements)"
                )
        except Exception as exc:
            # One broken EPUB should not stop the whole batch.
            failed_count += 1
            log(f" Error in {input_epub.name}: {exc}")
        log("")
    log("Summary")
    log("--------------")
    log(f"Successfully processed: {processed_count}")
    log(f"Failed: {failed_count}")
    log(f"Modified files across all EPUBs: {total_changed_files}")
    log(f"Total replacements: {total_replacements}")
    return 0 if failed_count == 0 else 1


if __name__ == "__main__":
    sys.exit(main())
5. The YAML file
Save the rules as replacements.yaml.
replacements:
  # ---------------------- EXAMPLES ------------------#
  # ----------------------
  # Photo -> Foto / photo -> foto
  # ----------------------
  - pattern: "Photo"
    replacement: "Foto"
    regex: true
  - pattern: "photo"
    replacement: "foto"
    regex: true
  # ----------------------
  # Sulphat -> Sulfat / (Natrium)sulphat -> (Natrium)sulfat
  # ----------------------
  - pattern: "([Ss])ulphat"
    replacement: "\\1ulfat"
    regex: true
  # ----------------------
  # Schußlig -> Schusslig / schußlig -> schusslig
  # ----------------------
  - pattern: "\\b(S|s)chußlig"
    replacement: "\\1chusslig"
    regex: true
  # ----------------------
  # Capitalize forms of address
  # ----------------------
  - pattern: "\\b(du|dich|dir|euch|euer)\\b"
    replacement: "\\u\\1"
    regex: true
    flags:
      - IGNORECASE
    description: "Capitalize forms of address"
  - pattern: "\\bdein(e|er|es|em|en)?\\b"
    replacement: "Dein\\1"
    regex: true
    flags:
      - IGNORECASE
    description: "Capitalize Dein forms"
  # ----------------------
  # Typography, spaces, and line breaks
  # ----------------------
  # 0) Convert ". . ." into an ellipsis
  - pattern: '\.\s*\.\s*\.'
    replacement: '…'
    regex: true
    description: 'Convert three dots into an ellipsis'
  # 1) Remove spaces before punctuation
  - pattern: '\s+([:,.;!?])'
    replacement: '\1'
    regex: true
    description: 'Remove spaces before punctuation'
  # 2) Add spaces after punctuation,
  #    but not before a closing quotation mark
  - pattern: '([:,.;!?])([^\s"“”‚‘])'
    replacement: '\1 \2'
    regex: true
    description: 'Add spaces after punctuation'
  # 3) A single word in "..." -> upper double quotation marks
  - pattern: '"([A-Za-zÄÖÜäöüß-]+)"'
    replacement: '“\1”'
    regex: true
    description: 'Single word in upper double quotation marks'
  # 4) General double quotes -> German typographic quotes
  - pattern: '"([^"\n]+)"'
    replacement: '„\1“'
    regex: true
    description: 'Straight double quotes -> German typographic quotes'
  # 5) A single word in '...' -> upper single quotation marks
  - pattern: "'([A-Za-zÄÖÜäöüß-]+)'"
    replacement: '‘\1’'
    regex: true
    description: 'Single word in upper single quotation marks'
  # 6) General single quotes
  - pattern: "'([^'\n]+)'"
    replacement: '‚\1‘'
    regex: true
    description: 'Straight single quotes -> typographic quotes'
  # 7) Typographic apostrophes
  - pattern: "'"
    replacement: "’"
    regex: false
    description: "Apostrophe"
  # 8) Optional: smooth out hard line breaks
  - pattern: "\\n{3,}"
    replacement: "\\n\\n"
    regex: true
    description: "Maximum of two blank lines"
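Note: in double-quoted YAML strings, every backslash must be doubled, for example \\b, \\1, or \\u\\1. In single-quoted strings, a single backslash is enough, for example '\1'.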
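To see what such a rule does once the YAML file has been loaded, the same pattern and replacement can be tried directly with Python's re module. This is a minimal sketch; the helper name apply_rule and the sample sentences are invented for illustration.

```python
import re


def apply_rule(pattern: str, replacement: str, text: str) -> str:
    # Hypothetical helper: applies one regex rule to a piece of text.
    return re.sub(pattern, replacement, text)


if __name__ == "__main__":
    # After YAML parsing, "\\1ulfat" is the string \1ulfat:
    # a backreference to the first capture group, followed by "ulfat".
    print(apply_rule(r"([Ss])ulphat", r"\1ulfat", "Sulphat und Natriumsulphat"))

    # \b marks a word boundary, so the pattern only matches at the start of a word.
    print(apply_rule(r"\b(S|s)chußlig", r"\1chusslig", "Er ist schußlig."))
```

Trying a new rule this way on a sentence or two is a quick sanity check before adding it to replacements.yaml.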
6. What the script does
- It processes all .epub files in EPUB-INPUT.
- It writes the edited files to EPUB-OUTPUT.
- The originals in EPUB-INPUT remain unchanged.
- It supports regex rules and simple replacements.
- It also understands \u\1, \l\1, \U\1, and \L\1 for upper/lowercase transformations.
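The case markers are not part of Python's own regex syntax, so they have to be expanded before the normal backreferences are resolved. The following standalone sketch shows the idea in compact form. It is a simplified illustration with hypothetical function names, not a copy of the script's code.

```python
import re


def expand_case_markers(template: str, match: re.Match) -> str:
    # Hypothetical helper: expands \U\N, \L\N, \u\N and \l\N (N = group number),
    # then resolves ordinary backreferences such as \1 via match.expand().
    def first_upper(s: str) -> str:
        return s[:1].upper() + s[1:]

    def first_lower(s: str) -> str:
        return s[:1].lower() + s[1:]

    transforms = {"U": str.upper, "L": str.lower, "u": first_upper, "l": first_lower}

    def apply(marker: re.Match) -> str:
        # A group that did not take part in the match is treated as empty.
        group_text = match.group(int(marker.group(2))) or ""
        return transforms[marker.group(1)](group_text)

    return match.expand(re.sub(r"\\([ULul])\\(\d+)", apply, template))


def capitalize_address(text: str) -> str:
    # Mirrors the "Capitalize forms of address" rule from the YAML examples.
    return re.sub(
        r"\b(du|dich|dir|euch|euer)\b",
        lambda m: expand_case_markers(r"\u\1", m),
        text,
        flags=re.IGNORECASE,
    )


if __name__ == "__main__":
    print(capitalize_address("Ich danke dir und euch."))
```

Here \u and \l change only the first letter of the group, while \U and \L change the whole group.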
7. Put EPUB files into the input folder
8. Change into the project folder
cd ~/Desktop/EPUB-Multirenamer
9. Run a test
python3 epub_multi_replace.py --rules replacements.yaml --dry-run
10. Process the EPUB files
python3 epub_multi_replace.py --rules replacements.yaml
11. Typical workflow
- Copy new EPUB files into EPUB-INPUT
- Open replacements.yaml in a code editor, for example Visual Studio Code, and adjust it if needed
- Change into the project folder in Terminal
- Run the dry run
- Then start the real processing
- Check the results in EPUB-OUTPUT
- Test the processed EPUBs in Apple Books, Calibre, or Sigil
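Besides opening the result in a reader, one structural property can be checked quickly: the script writes the mimetype entry first and uncompressed, and the sketch below verifies exactly that. The function name and the in-memory demo archives are assumptions for illustration; for a real file, pass a path inside EPUB-OUTPUT instead.

```python
import io
import zipfile


def mimetype_entry_ok(epub_file) -> bool:
    """Check that 'mimetype' is the first entry, stored uncompressed, with the EPUB media type."""
    with zipfile.ZipFile(epub_file) as zf:
        entries = zf.infolist()
        if not entries:
            return False
        first = entries[0]
        content = zf.read(first).decode("ascii", "replace").strip()
        return (
            first.filename == "mimetype"
            and first.compress_type == zipfile.ZIP_STORED
            and content == "application/epub+zip"
        )


def build_demo_archive(mimetype_first: bool) -> io.BytesIO:
    """Build a small in-memory archive for the demo (not a complete EPUB)."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as zf:
        if mimetype_first:
            zf.writestr("mimetype", "application/epub+zip", compress_type=zipfile.ZIP_STORED)
        zf.writestr("OEBPS/chapter1.xhtml", "<html/>", compress_type=zipfile.ZIP_DEFLATED)
        if not mimetype_first:
            zf.writestr("mimetype", "application/epub+zip", compress_type=zipfile.ZIP_STORED)
    buffer.seek(0)
    return buffer


if __name__ == "__main__":
    print(mimetype_entry_ok(build_demo_archive(mimetype_first=True)))
    print(mimetype_entry_ok(build_demo_archive(mimetype_first=False)))
```

This check does not replace a full validation; it only covers the one property described above.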
12. Useful Terminal commands
Show the current folder
pwd
List the files in the folder
ls
List files in subfolders as well
ls -R
Check the Python path
which python3
13. Common errors
Python cannot be found
python3 --version
PyYAML is missing
python3 -m pip install pyyaml
No EPUB files found
Make sure the files are actually in the EPUB-INPUT folder and that they have the .epub extension.
The YAML file contains errors
Make sure the indentation is correct. YAML is very sensitive to spaces. Using two spaces per level is a good habit.
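A YAML file can be correctly indented and still contain a regex pattern that Python rejects, for example because of an unbalanced bracket. A small pre-check like the following can catch this before a run. It is only a sketch: the function name is hypothetical, and it expects the pattern strings as they look after the YAML file has been loaded.

```python
import re


def find_invalid_patterns(patterns: list[str]) -> list[tuple[str, str]]:
    """Return (pattern, error message) pairs for patterns that do not compile."""
    problems = []
    for pattern in patterns:
        try:
            re.compile(pattern)
        except re.error as exc:
            problems.append((pattern, str(exc)))
    return problems


if __name__ == "__main__":
    candidates = [
        r"\b(du|dich|dir|euch|euer)\b",  # valid
        "([Ss]ulphat",                   # invalid: the closing parenthesis is missing
    ]
    for pattern, message in find_invalid_patterns(candidates):
        print(f"Invalid pattern {pattern!r}: {message}")
```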
The script was started in the wrong folder
Use pwd to check whether you are really in EPUB-Multirenamer.
14. Summary
Once your project folder has been set up, the daily workflow is simple: put EPUB files in, check the rules, run a dry run, then process the files. If you have any questions or suggestions, feel free to leave a comment on mobileread.com.