EPUB Multirenamer

EPUB Multirenamer for Mac

brought to you by dzhome.de

Last updated: 8 April 2026

0. Introduction

This script lets you edit multiple EPUB files in one run. In a YAML file, you define which terms to search for and what to replace them with. This makes it possible, for example, to quickly adapt an EPUB file, at least in part, to the newer German spelling rules. I also provide my own YAML file as a download here, and I update and expand it regularly. There is also a German version of this guide.

I used ChatGPT on 28.03.2026 to create this Python script.


1. Project structure

EPUB-Multirenamer/
├── epub_multi_replace.py
├── replacements.yaml
├── EPUB-INPUT/
└── EPUB-OUTPUT/

The workflow is always the same:

  1. Put your EPUB files into EPUB-INPUT
  2. Edit replacements.yaml if needed
  3. Run the script in Terminal
  4. Check the results in EPUB-OUTPUT

2. Requirements on Mac

2.1 Open Terminal

Open Terminal using Spotlight: Cmd + Space, then type Terminal.

2.2 Check Python

python3 --version

If a Python version is displayed, you can continue with step 2.4. If you see command not found, install Python as described in step 2.3 below.

2.3 Install Python on Mac

2.3.1 Install the Command Line Tools

xcode-select --install
If Terminal only shows a message and no window opens, a reset may help:
sudo rm -rf /Library/Developer/CommandLineTools
xcode-select --install

2.3.2 Install Homebrew

Homebrew is the easiest way to install Python on a Mac.

/bin/bash -c "$(curl -fsSL https://raw.githubusercontent.com/Homebrew/install/HEAD/install.sh)"

2.3.3 Install Python

brew install python

2.3.4 Verify the installation

python3 --version
If you see something like Python 3.12.x or Python 3.13.x, Python has been installed correctly.

2.4 Install PyYAML

python3 -m pip install pyyaml
The PyYAML package is required so that the script can read the replacements.yaml file.
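
If you prefer to check both requirements from Python itself, a minimal sketch could look like this. The function name check_environment is hypothetical and not part of the script; it only reports the running Python version and whether a module named yaml can be found.

```python
from __future__ import annotations

import importlib.util
import sys


def check_environment() -> dict[str, object]:
    """Report the Python version and whether PyYAML can be imported."""
    return {
        "python_version": ".".join(str(part) for part in sys.version_info[:3]),
        # find_spec returns None when no top-level module named "yaml" is installed.
        "pyyaml_installed": importlib.util.find_spec("yaml") is not None,
    }
```

Calling print(check_environment()) in the same interpreter that runs the script shows at a glance whether PyYAML still needs to be installed.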

2.5 Which files inside the EPUB are processed?

An EPUB file is similar to a ZIP archive. Inside it, there are several files.

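By default, only files with the extensions .xhtml, .html, and .htm are processed. All other files in the archive, such as stylesheets, images, and metadata files, are left unchanged.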
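
To see which files of a particular EPUB fall into this group, a small sketch like the following may help. It mirrors the TEXT_EXTENSIONS set from the script; the function name list_text_members is hypothetical.

```python
from __future__ import annotations

import zipfile
from pathlib import PurePosixPath

# Same extensions as TEXT_EXTENSIONS in epub_multi_replace.py.
TEXT_EXTENSIONS = {".xhtml", ".html", ".htm"}


def list_text_members(epub_path: str) -> list[str]:
    """Return the archive members whose extension is in TEXT_EXTENSIONS."""
    with zipfile.ZipFile(epub_path) as zf:
        return sorted(
            name
            for name in zf.namelist()
            # Member names inside a ZIP archive always use forward slashes.
            if PurePosixPath(name).suffix.lower() in TEXT_EXTENSIONS
        )
```

For example, list_text_members("EPUB-INPUT/book1.epub") would return the chapter files the script is going to look at, assuming a file with that name exists.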

3. Create the project folder

For example, create a folder named EPUB-Multirenamer on your desktop.

EPUB-Multirenamer/
├── epub_multi_replace.py
├── replacements.yaml
├── EPUB-INPUT/
└── EPUB-OUTPUT/
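
If you would rather create the folders with a few lines of Python than in Finder, the following sketch shows one way to do it. The function name create_project is an assumption for illustration; the folder names are the ones used in this guide. The script and the YAML file still have to be saved into the project folder by hand.

```python
from __future__ import annotations

from pathlib import Path


def create_project(base: Path) -> Path:
    """Create EPUB-Multirenamer with its input and output folders below base."""
    project = base / "EPUB-Multirenamer"
    for folder in ("EPUB-INPUT", "EPUB-OUTPUT"):
        # exist_ok=True makes it safe to run this more than once.
        (project / folder).mkdir(parents=True, exist_ok=True)
    return project
```

A call such as create_project(Path.home() / "Desktop") would place the project folder on the desktop.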

4. The Python script

Save the following script as epub_multi_replace.py.

#!/usr/bin/env python3
# -*- coding: utf-8 -*-

from __future__ import annotations

import argparse
import re
import sys
import tempfile
import zipfile
from dataclasses import dataclass
from pathlib import Path
import xml.etree.ElementTree as ET

import yaml


TEXT_EXTENSIONS = {".xhtml", ".html", ".htm"}
INPUT_DIR_NAME = "EPUB-INPUT"
OUTPUT_DIR_NAME = "EPUB-OUTPUT"


@dataclass
class ReplacementRule:
    pattern: str
    replacement: str
    regex: bool = True
    flags: int = 0
    description: str = ""


def log(msg: str) -> None:
    print(msg)


def is_text_file(path: Path) -> bool:
    return path.is_file() and path.suffix.lower() in TEXT_EXTENSIONS


def parse_flags(flag_names: list[str] | None) -> int:
    if not flag_names:
        return 0

    flag_map = {
        "IGNORECASE": re.IGNORECASE,
        "MULTILINE": re.MULTILINE,
        "DOTALL": re.DOTALL,
    }

    value = 0
    for name in flag_names:
        if name not in flag_map:
            raise ValueError(f"Unknown regex flag: {name}")
        value |= flag_map[name]
    return value


def load_replacements(yaml_path: Path) -> list[ReplacementRule]:
    try:
        data = yaml.safe_load(yaml_path.read_text(encoding="utf-8"))
    except FileNotFoundError as exc:
        raise FileNotFoundError(f"YAML file not found: {yaml_path}") from exc
    except Exception as exc:
        raise ValueError(f"Failed to read YAML file: {exc}") from exc

    if not isinstance(data, dict):
        raise ValueError("YAML file does not have a valid root object.")

    items = data.get("replacements")
    if not isinstance(items, list):
        raise ValueError("YAML file must contain a key 'replacements' with a list.")

    rules: list[ReplacementRule] = []

    for idx, item in enumerate(items, start=1):
        if not isinstance(item, dict):
            raise ValueError(f"Entry #{idx} in 'replacements' is not an object.")

        pattern = item.get("pattern")
        replacement = item.get("replacement", "")
        regex = item.get("regex", True)
        description = item.get("description", "")
        flags = parse_flags(item.get("flags"))

        if not isinstance(pattern, str):
            raise ValueError(f"Entry #{idx}: 'pattern' is missing or is not a string.")
        if not isinstance(replacement, str):
            raise ValueError(f"Entry #{idx}: 'replacement' is not a string.")
        if not isinstance(regex, bool):
            raise ValueError(f"Entry #{idx}: 'regex' must be true or false.")
        if not isinstance(description, str):
            raise ValueError(f"Entry #{idx}: 'description' is not a string.")

        rules.append(
            ReplacementRule(
                pattern=pattern,
                replacement=replacement,
                regex=regex,
                flags=flags,
                description=description,
            )
        )

    return rules


def extract_epub(epub_path: Path, target_dir: Path) -> None:
    with zipfile.ZipFile(epub_path, "r") as zf:
        zf.extractall(target_dir)


def apply_replacements(text: str, rules: list[ReplacementRule]) -> tuple[str, int]:
    total_changes = 0
    updated = text

    for rule in rules:
        if rule.regex:

            def repl(match: re.Match[str]) -> str:
                result = rule.replacement

                def upper_first(s: str) -> str:
                    return s[:1].upper() + s[1:] if s else s

                def lower_first(s: str) -> str:
                    return s[:1].lower() + s[1:] if s else s

                def repl_u(m: re.Match[str]) -> str:
                    return upper_first(match.group(int(m.group(1))))

                def repl_l(m: re.Match[str]) -> str:
                    return lower_first(match.group(int(m.group(1))))

                def repl_U(m: re.Match[str]) -> str:
                    return match.group(int(m.group(1))).upper()

                def repl_L(m: re.Match[str]) -> str:
                    return match.group(int(m.group(1))).lower()

                result = re.sub(r"\\U\\(\d+)", repl_U, result)
                result = re.sub(r"\\L\\(\d+)", repl_L, result)
                result = re.sub(r"\\u\\(\d+)", repl_u, result)
                result = re.sub(r"\\l\\(\d+)", repl_l, result)

                result = match.expand(result)
                return result

            updated, count = re.subn(
                rule.pattern,
                repl,
                updated,
                flags=rule.flags,
            )
        else:
            count = updated.count(rule.pattern)
            if count:
                updated = updated.replace(rule.pattern, rule.replacement)

        total_changes += count

    return updated, total_changes


def should_skip_element_text(tag_name: str) -> bool:
    tag_name = tag_name.lower()
    return tag_name in {"script", "style"}


def local_name(tag: str) -> str:
    if "}" in tag:
        return tag.split("}", 1)[1]
    return tag


def apply_replacements_to_xhtml(
    path: Path,
    rules: list[ReplacementRule],
    dry_run: bool = False,
) -> int:
    parser = ET.XMLParser()
    tree = ET.parse(path, parser=parser)
    root = tree.getroot()

    total_replacements = 0

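    # ElementTree keeps no parent pointers, so map each child to its parent.
    # Tail text sits inside the parent, so the parent's tag decides whether to skip it.
    parent_of = {child: parent for parent in root.iter() for child in parent}

    for elem in root.iter():
        tag = local_name(elem.tag) if isinstance(elem.tag, str) else ""

        if not should_skip_element_text(tag):
            if elem.text:
                new_text, count = apply_replacements(elem.text, rules)
                if count:
                    elem.text = new_text
                    total_replacements += count

        if elem.tail:
            parent = parent_of.get(elem)
            parent_tag = (
                local_name(parent.tag)
                if parent is not None and isinstance(parent.tag, str)
                else ""
            )
            if not should_skip_element_text(parent_tag):
                new_tail, count = apply_replacements(elem.tail, rules)
                if count:
                    elem.tail = new_tail
                    total_replacements += count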

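    if total_replacements and not dry_run:
        # Keep XHTML as the default namespace and preserve the common "epub" prefix
        # (used for attributes such as epub:type) instead of generated ns0/ns1 prefixes.
        ET.register_namespace("", "http://www.w3.org/1999/xhtml")
        ET.register_namespace("epub", "http://www.idpf.org/2007/ops")
        tree.write(path, encoding="utf-8", xml_declaration=True)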

    return total_replacements


def process_extracted_files(
    root_dir: Path,
    rules: list[ReplacementRule],
    dry_run: bool = False,
) -> tuple[int, int]:
    changed_files = 0
    total_replacements = 0

    for path in sorted(root_dir.rglob("*")):
        if not is_text_file(path):
            continue

        try:
            replacements_in_file = apply_replacements_to_xhtml(path, rules, dry_run=dry_run)
        except ET.ParseError as exc:
            log(f"Skipped (invalid XHTML/XML): {path.relative_to(root_dir)} -> {exc}")
            continue
        except Exception as exc:
            log(f"Error while processing: {path.relative_to(root_dir)} -> {exc}")
            continue

        if replacements_in_file > 0:
            changed_files += 1
            total_replacements += replacements_in_file
            log(f"  Modified: {path.relative_to(root_dir)} ({replacements_in_file} matches)")

    return changed_files, total_replacements


def create_epub(source_dir: Path, output_epub: Path) -> None:
    mimetype_path = source_dir / "mimetype"

    with zipfile.ZipFile(output_epub, "w") as zf:
        if mimetype_path.exists():
            zf.write(
                mimetype_path,
                arcname="mimetype",
                compress_type=zipfile.ZIP_STORED,
            )

        for path in sorted(source_dir.rglob("*")):
            if not path.is_file():
                continue
            if path == mimetype_path:
                continue

            arcname = path.relative_to(source_dir)
            zf.write(path, arcname=str(arcname), compress_type=zipfile.ZIP_DEFLATED)


def validate_extracted_epub_structure(work_dir: Path) -> None:
    mimetype_path = work_dir / "mimetype"
    if not mimetype_path.exists():
        raise ValueError("Invalid EPUB: file 'mimetype' is missing.")

    try:
        mimetype = mimetype_path.read_text(encoding="utf-8").strip()
    except Exception as exc:
        raise ValueError(f"Invalid EPUB: could not read 'mimetype': {exc}") from exc

    if mimetype != "application/epub+zip":
        raise ValueError(
            f"Invalid EPUB: 'mimetype' has unexpected content: {mimetype!r}"
        )

    container_xml = work_dir / "META-INF" / "container.xml"
    if not container_xml.exists():
        raise ValueError("Invalid EPUB: 'META-INF/container.xml' is missing.")


def process_single_epub(
    input_epub: Path,
    output_epub: Path,
    rules: list[ReplacementRule],
    dry_run: bool = False,
) -> tuple[int, int]:
    with tempfile.TemporaryDirectory(prefix="epub_replace_") as tmp:
        work_dir = Path(tmp)

        try:
            extract_epub(input_epub, work_dir)
        except zipfile.BadZipFile:
            raise ValueError(f"Invalid EPUB/ZIP archive: {input_epub.name}")

        validate_extracted_epub_structure(work_dir)

        changed_files, total_replacements = process_extracted_files(
            work_dir,
            rules,
            dry_run=dry_run,
        )

        if not dry_run:
            output_epub.parent.mkdir(parents=True, exist_ok=True)
            create_epub(work_dir, output_epub)

    return changed_files, total_replacements


def main() -> int:
    parser = argparse.ArgumentParser(
        description=(
            "Processes all EPUB files from 'EPUB-INPUT' and writes "
            "the results to 'EPUB-OUTPUT'. Only visible text "
            "in XHTML/HTML files is modified."
        )
    )
    parser.add_argument(
        "--rules",
        default="replacements.yaml",
        help="Path to the YAML file with replacements (default: replacements.yaml)",
    )
    parser.add_argument(
        "--dry-run",
        action="store_true",
        help="Only show what would be changed without writing new EPUB files",
    )

    args = parser.parse_args()

    script_dir = Path(__file__).resolve().parent
    input_dir = script_dir / INPUT_DIR_NAME
    output_dir = script_dir / OUTPUT_DIR_NAME
    rules_path = Path(args.rules).expanduser().resolve()
    dry_run = args.dry_run

    try:
        rules = load_replacements(rules_path)
    except Exception as exc:
        log(f"Error loading YAML file: {exc}")
        return 1

    if not rules:
        log("Error: no replacements found in the YAML file.")
        return 1

    if not input_dir.exists():
        log(f"Error: input directory not found: {input_dir}")
        log(f"Please create a folder next to the script named '{INPUT_DIR_NAME}'.")
        return 1

    if not input_dir.is_dir():
        log(f"Error: {input_dir} is not a directory.")
        return 1

    epub_files = sorted(input_dir.glob("*.epub"))

    if not epub_files:
        log(f"No EPUB files found in {input_dir}.")
        return 0

    if not dry_run:
        output_dir.mkdir(parents=True, exist_ok=True)

    log(f"Loaded rules: {len(rules)}")
    log(f"Input directory: {input_dir}")
    log(f"Output directory: {output_dir}")
    log(f"Found EPUB files: {len(epub_files)}")
    log("")

    processed_count = 0
    failed_count = 0
    total_changed_files = 0
    total_replacements = 0

    for input_epub in epub_files:
        output_epub = output_dir / input_epub.name
        log(f"Processing: {input_epub.name}")

        try:
            changed_files, replacements = process_single_epub(
                input_epub=input_epub,
                output_epub=output_epub,
                rules=rules,
                dry_run=dry_run,
            )

            total_changed_files += changed_files
            total_replacements += replacements
            processed_count += 1

            if dry_run:
                log(
                    f"  Dry run finished: {changed_files} files with changes, "
                    f"{replacements} replacements"
                )
            else:
                log(
                    f"  Done: {output_epub.name} "
                    f"({changed_files} files with changes, {replacements} replacements)"
                )

        except Exception as exc:
            failed_count += 1
            log(f"  Error in {input_epub.name}: {exc}")

        log("")

    log("Summary")
    log("--------------")
    log(f"Successfully processed: {processed_count}")
    log(f"Failed: {failed_count}")
    log(f"Modified files across all EPUBs: {total_changed_files}")
    log(f"Total replacements: {total_replacements}")

    return 0 if failed_count == 0 else 1


if __name__ == "__main__":
    sys.exit(main())
  
  

5. The YAML file

Save the rules as replacements.yaml.

replacements:

  # ---------------------- EXAMPLES ------------------#
  
  # ----------------------
  # Photo -> Foto / photo -> foto
  # ----------------------
  - pattern: "Photo"
    replacement: "Foto"
    regex: true

  - pattern: "photo"
    replacement: "foto"
    regex: true

  # ----------------------
  # Sulphat -> Sulfat / (Natrium)sulphat -> (Natrium)sulfat
  # ----------------------
  - pattern: "([Ss])ulphat"
    replacement: "\\1ulfat"
    regex: true

  # ----------------------
  # Schußlig -> Schusslig / schußlig -> schusslig
  # ----------------------
  - pattern: "\\b(S|s)chußlig"
    replacement: "\\1chusslig"
    regex: true

  # ----------------------
  # Capitalize forms of address
  # ----------------------
  - pattern: "\\b(du|dich|dir|euch|euer)\\b"
    replacement: "\\u\\1"
    regex: true
    flags:
      - IGNORECASE
    description: "Capitalize forms of address"

  - pattern: "\\bdein(e|er|es|em|en)?\\b"
    replacement: "Dein\\1"
    regex: true
    flags:
      - IGNORECASE
    description: "Capitalize Dein forms"

  # ----------------------
  # Typography, spaces, and line breaks
  # ----------------------

  # 0) Convert ". . ." into an ellipsis
  - pattern: '\.\s*\.\s*\.'
    replacement: '…'
    regex: true
    description: 'Convert three dots into an ellipsis'

  # 1) Remove spaces before punctuation
  - pattern: '\s+([:,.;!?])'
    replacement: '\1'
    regex: true
    description: 'Remove spaces before punctuation'

  # 2) Add spaces after punctuation,
  # but not before a closing quotation mark
  - pattern: '([:,.;!?])([^\s"“”‚‘])'
    replacement: '\1 \2'
    regex: true
    description: 'Add spaces after punctuation'

  # 3) A single word in "..." -> upper double quotation marks
  - pattern: '"([A-Za-zÄÖÜäöüß-]+)"'
    replacement: '“\1”'
    regex: true
    description: 'Single word in upper double quotation marks'

  # 4) General double quotes -> German typographic quotes
  - pattern: '"([^"\n]+)"'
    replacement: '„\1“'
    regex: true
    description: 'Straight double quotes -> German typographic quotes'

  # 5) A single word in '...' -> upper single quotation marks
  - pattern: "'([A-Za-zÄÖÜäöüß-]+)'"
    replacement: '‘\1’'
    regex: true
    description: 'Single word in upper single quotation marks'

  # 6) General single quotes
  - pattern: "'([^'\n]+)'"
    replacement: '‚\1‘'
    regex: true
    description: 'Straight single quotes -> typographic quotes'

  # 7) Typographic apostrophes
  - pattern: "'"
    replacement: "’"
    regex: false
    description: "Apostrophe"

  # 8) Optional: smooth out hard line breaks
  - pattern: "\\n{3,}"
    replacement: "\\n\\n"
    regex: true
    description: "Maximum of two blank lines"
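
In double-quoted YAML strings, backslashes must be doubled, for example "\\b", "\\1", or "\\u\\1". In single-quoted YAML strings, a single backslash is enough, for example '\1'.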
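
Once YAML has been parsed, a replacement written as "\\u\\1" reaches the script as \u\1. The following is a simplified sketch of how such a replacement can be evaluated: the case markers \u and \U are resolved first, then ordinary backreferences such as \1 are expanded. It is not the script's exact code; the function name substitute is hypothetical, and the \l and \L markers are left out for brevity.

```python
from __future__ import annotations

import re


def substitute(pattern: str, template: str, text: str, flags: int = 0) -> str:
    """Apply one rule, supporting the \\uN and \\UN case markers plus backreferences."""

    def repl(match: re.Match) -> str:
        def group_text(marker: re.Match) -> str:
            # Groups that did not take part in the match count as empty text.
            return match.group(int(marker.group(1))) or ""

        def first_upper(marker: re.Match) -> str:
            s = group_text(marker)
            # Double any backslashes so match.expand() below treats them literally.
            return (s[:1].upper() + s[1:]).replace("\\", "\\\\")

        def all_upper(marker: re.Match) -> str:
            return group_text(marker).upper().replace("\\", "\\\\")

        result = re.sub(r"\\u\\(\d+)", first_upper, template)
        result = re.sub(r"\\U\\(\d+)", all_upper, result)
        # Resolve the remaining ordinary backreferences such as \1.
        return match.expand(result)

    return re.sub(pattern, repl, text, flags=flags)
```

With the rule for forms of address from the YAML file, substitute(r"\b(du|dich|dir|euch|euer)\b", r"\u\1", "kannst du mir helfen", re.IGNORECASE) turns "du" into "Du" and leaves the rest of the sentence alone.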

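6. What the script does

For every .epub file in EPUB-INPUT, the script performs these steps:

  1. It unpacks the EPUB into a temporary folder and checks that mimetype and META-INF/container.xml are present.
  2. It parses every .xhtml, .html, and .htm file as XML and applies the rules, in the order given in the YAML file, to the text between the tags. Tag names, attributes, and the contents of script and style elements are left untouched.
  3. Files that cannot be parsed as XML are skipped and reported in Terminal.
  4. Unless --dry-run is set, it packs everything into a new EPUB with the same file name in EPUB-OUTPUT. The mimetype file is written first and uncompressed.

The original files in EPUB-INPUT are never modified.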
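
The central idea, changing only the text between tags while leaving the markup alone, can be sketched in a few lines. This is a simplified illustration using a plain string replacement, not the script's actual function: the name replace_visible_text is hypothetical, and the handling of script and style elements and of XML namespaces is omitted.

```python
import xml.etree.ElementTree as ET


def replace_visible_text(xhtml: str, old: str, new: str) -> str:
    """Replace old with new in element text and tail text only."""
    root = ET.fromstring(xhtml)
    for elem in root.iter():
        # .text is the text directly after the opening tag.
        if elem.text:
            elem.text = elem.text.replace(old, new)
        # .tail is the text directly after the closing tag.
        if elem.tail:
            elem.tail = elem.tail.replace(old, new)
    # Attributes are never touched, so class="Photo" would stay as it is.
    return ET.tostring(root, encoding="unicode")
```

Applied to a paragraph such as <p class="Photo">Ein Photo<br/>und noch ein Photo</p>, both occurrences in the running text are changed, while the class attribute keeps its value.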

7. Put EPUB files into the input folder

EPUB-Multirenamer/
├── epub_multi_replace.py
├── replacements.yaml
├── EPUB-INPUT/
│   ├── book1.epub
│   ├── book2.epub
│   └── book3.epub
└── EPUB-OUTPUT/

8. Change into the project folder

cd ~/Desktop/EPUB-Multirenamer

9. Run a test

python3 epub_multi_replace.py --rules replacements.yaml --dry-run
This is the recommended first step before you generate the edited files.

10. Process the EPUB files

python3 epub_multi_replace.py --rules replacements.yaml

11. Typical workflow

  1. Copy new EPUB files into EPUB-INPUT
  2. Open replacements.yaml in a code editor, for example Visual Studio Code, and adjust it if needed
  3. Change into the project folder in Terminal
  4. Run the dry run
  5. Then start the real processing
  6. Check the results in EPUB-OUTPUT
  7. Test the processed EPUBs in Apple Books, Calibre, or Sigil
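
Before opening a processed file in a reader, a quick structural check can catch obvious packaging problems. The sketch below is only a basic sanity check under the assumption that the container rules the script itself relies on are what matters (mimetype first and uncompressed, META-INF/container.xml present); it does not replace a full validation tool. The function name check_epub_container is hypothetical.

```python
from __future__ import annotations

import zipfile


def check_epub_container(epub_path: str) -> list[str]:
    """Return a list of basic container problems; an empty list means none found."""
    problems: list[str] = []
    with zipfile.ZipFile(epub_path) as zf:
        infos = zf.infolist()
        if not infos or infos[0].filename != "mimetype":
            problems.append("'mimetype' is not the first entry in the archive")
        elif infos[0].compress_type != zipfile.ZIP_STORED:
            problems.append("'mimetype' is compressed")
        elif zf.read("mimetype").strip() != b"application/epub+zip":
            problems.append("'mimetype' has unexpected content")
        if "META-INF/container.xml" not in zf.namelist():
            problems.append("'META-INF/container.xml' is missing")
    return problems
```

Running it over every file in EPUB-OUTPUT and printing any non-empty result is usually enough to spot a broken archive early.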

12. Useful Terminal commands

Show the current folder

pwd

List the files in the folder

ls

List files in subfolders as well

ls -R

Check the Python path

which python3

13. Common errors

Python cannot be found

python3 --version

PyYAML is missing

python3 -m pip install pyyaml

No EPUB files found

Make sure the files are actually in the EPUB-INPUT folder and that they have the .epub extension.
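
The script collects its input with the pattern *.epub. Whether that pattern also matches names such as BOOK.EPUB can depend on the Python version and platform, so it may be worth looking for files whose extension differs only in case. A minimal sketch, with the hypothetical function name find_overlooked_epubs:

```python
from __future__ import annotations

from pathlib import Path


def find_overlooked_epubs(input_dir: Path) -> list[Path]:
    """Return files whose extension is '.epub' in some other spelling, e.g. '.EPUB'."""
    return sorted(
        path
        for path in input_dir.iterdir()
        if path.is_file()
        and path.suffix.lower() == ".epub"
        and path.suffix != ".epub"
    )
```

Renaming any file this returns so that it ends in lowercase .epub removes the ambiguity.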

The YAML file contains errors

Make sure the indentation is correct. YAML is very sensitive to spaces. Using two spaces per level is a good habit.
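
Indentation is not the only possible problem: a rule can also contain a regular expression that does not compile, which would otherwise only surface while an EPUB is being processed. The following sketch checks already parsed YAML data, for example the result of yaml.safe_load, which the script also uses. The function name find_rule_problems is hypothetical, and the checks cover only the 'replacements' list and the 'pattern' and 'regex' keys.

```python
from __future__ import annotations

import re


def find_rule_problems(data: object) -> list[str]:
    """Return human-readable problems found in parsed rule data."""
    if not isinstance(data, dict) or not isinstance(data.get("replacements"), list):
        return ["top level must be a mapping with a 'replacements' list"]

    problems: list[str] = []
    for idx, item in enumerate(data["replacements"], start=1):
        if not isinstance(item, dict) or not isinstance(item.get("pattern"), str):
            problems.append(f"entry #{idx}: 'pattern' is missing or is not a string")
            continue
        # Rules default to regex mode, as in the script.
        if item.get("regex", True):
            try:
                re.compile(item["pattern"])
            except re.error as exc:
                problems.append(f"entry #{idx}: invalid regular expression: {exc}")
    return problems
```

An empty list means that none of these particular checks failed; it does not guarantee that every rule does what you intended, so the dry run remains the more important test.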

The script was started in the wrong folder

Use pwd to check whether you are really in EPUB-Multirenamer.

14. Summary

Once your project folder has been set up, the daily workflow is simple: put EPUB files in, check the rules, run a dry run, then process the files. If you have any questions or suggestions, feel free to leave a comment on mobileread.com.