# BookImport Module Overview

`BookImport` contains the format-specific logic for extracting metadata and covers from incoming files. It supports EPUB and PDF sources out of the box and returns a lightweight `ImportResult` that the rest of the pipeline uses to build `Book` instances.

## Responsibilities

- Ensure the covers cache directory exists (`~/.local/share/bibliotheca/covers`).
- Extract title/author metadata from EPUB and PDF files.
- Render or extract cover art, saving a PNG (or the original asset) into the covers directory.
- Expose a single `import_book_assets(path, bookIdHex)` function that dispatches to the correct handler based on file extension.

## Structure

```
BookImport.cpp
├── ensure_covers_dir()        // filesystem helper
├── import_epub(...)           // libzip + tinyxml2
├── import_pdf(...)            // poppler-glib + cairo
└── import_book_assets(...)    // public dispatcher
```

## ImportResult

```cpp
#include <string>

struct ImportResult {
    std::string title;
    std::string author;
    std::string coverPngPath; // path to the saved cover (PNG or original asset); empty if no cover extracted
};
```

If a format fails to provide metadata or a cover, the corresponding fields are left empty; the caller (usually `BibliothecaWindow`) merges these with defaults (e.g., falls back to the filename for the title).

## EPUB pipeline

1. Open the `.epub` as a ZIP archive via libzip.
2. Parse `META-INF/container.xml` to locate the OPF package.
3. Read the OPF document with TinyXML2, extracting `<dc:title>`, `<dc:creator>`, and the cover manifest entry (`cover-image` property or meta `name="cover"`).
4. If a cover asset exists, copy it into the covers directory (preserving the original extension); otherwise leave `coverPngPath` empty.

## PDF pipeline

1. Open the document with Poppler (`poppler_document_new_from_file`).
2. Pull title/author metadata using Poppler's getters.
3. Render the first page to an ARGB32 Cairo surface scaled to ~1000px tall.
4. Write the rendered surface to `${coversDir}/${bookId}.png`.

Errors (e.g., corrupt files) throw `std::runtime_error`.
Callers typically catch these during batch imports, log the failure, and skip the problematic file.

## Extension points

- **Additional formats**: add new `is_xyz()` helpers and `import_xyz()` methods, then extend `import_book_assets()` to dispatch accordingly.
- **Metadata enrichment**: augment the result with series information, tags, or the table of contents if formats expose them.
- **Cover sizing**: adjust the Cairo scale if you want smaller thumbnails.

## Expected usage

`BibliothecaWindow` computes a SHA-256 id for each selected file, calls `import_book_assets()` in a worker thread, and combines the `ImportResult` with fallback metadata before enqueuing `BookList::upsertMany()`. Because cover files live in a shared directory addressed by book id, repeated imports overwrite previous covers, ensuring consistency across sessions.
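
The dispatch-and-fallback flow described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the actual contents of `BookImport.cpp`: the exact signature of `import_book_assets()`, the `lower_ext()` helper, the caller-side `import_with_fallback()` wrapper, and the behaviour for unsupported extensions (throwing `std::runtime_error`) are all hypothetical. The two handlers are stubs that return canned data, so the sketch compiles without libzip, TinyXML2, Poppler, or Cairo, and the worker thread is left out.

```cpp
#include <algorithm>
#include <cctype>
#include <filesystem>
#include <iostream>
#include <optional>
#include <stdexcept>
#include <string>

namespace fs = std::filesystem;

struct ImportResult {
    std::string title;
    std::string author;
    std::string coverPngPath; // empty if no cover extracted
};

// Hypothetical helper: lower-cased extension including the leading dot.
static std::string lower_ext(const fs::path& path) {
    std::string ext = path.extension().string();
    std::transform(ext.begin(), ext.end(), ext.begin(),
                   [](unsigned char c) { return static_cast<char>(std::tolower(c)); });
    return ext;
}

static bool is_epub(const fs::path& p) { return lower_ext(p) == ".epub"; }
static bool is_pdf(const fs::path& p)  { return lower_ext(p) == ".pdf"; }

// Stubs standing in for the real handlers; they return canned data only.
static ImportResult import_epub(const fs::path&, const std::string& bookIdHex) {
    return {"", "Stub Author", "covers/" + bookIdHex + ".jpg"};
}
static ImportResult import_pdf(const fs::path&, const std::string& bookIdHex) {
    return {"Stub PDF Title", "", "covers/" + bookIdHex + ".png"};
}

// Assumed shape of the public dispatcher: choose a handler by extension.
// Throwing on an unknown extension is an assumption of this sketch.
ImportResult import_book_assets(const fs::path& path, const std::string& bookIdHex) {
    if (is_epub(path)) return import_epub(path, bookIdHex);
    if (is_pdf(path))  return import_pdf(path, bookIdHex);
    throw std::runtime_error("unsupported format: " + path.string());
}

// Hypothetical caller-side wrapper: log and skip on failure, and fall back
// to the file name (without extension) when no title was extracted.
std::optional<ImportResult> import_with_fallback(const fs::path& path,
                                                 const std::string& bookIdHex) {
    try {
        ImportResult result = import_book_assets(path, bookIdHex);
        if (result.title.empty()) result.title = path.stem().string();
        return result;
    } catch (const std::runtime_error& e) {
        std::cerr << "skipping " << path << ": " << e.what() << '\n';
        return std::nullopt;
    }
}
```

A batch import would call `import_with_fallback()` once per selected file and collect the non-empty results before handing them on. Returning `std::optional` keeps the log-and-skip policy in one place, so a single bad file does not abort the rest of the batch.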