kepub

package
v4.0.4
Published: Mar 9, 2022 | License: MIT | Imports: 23 | Imported by: 1

Documentation

Overview

Package kepub converts EPUBs to KEPUBs.

Constants

This section is empty.

Variables

This section is empty.

Functions

This section is empty.

Types

type Converter

type Converter struct {
	// contains filtered or unexported fields
}

Converter converts EPUB2/EPUB3 books to Kobo's KEPUB format.

func NewConverter

func NewConverter() *Converter

NewConverter creates a new Converter. By default, no options are applied.

func NewConverterWithOptions

func NewConverterWithOptions(opts ...ConverterOption) *Converter

NewConverterWithOptions is like NewConverter, with options.

func (*Converter) Convert

func (c *Converter) Convert(ctx context.Context, w io.Writer, r fs.FS) error

Convert converts the EPUB root r into a new EPUB written to w. If r is a (*zip.Reader), the original zip metadata is preserved where possible, and unchanged data is not re-compressed where possible. The (*zip.Reader) type is the one from archive/zip by default, or the one from github.com/pgaskin/kepubify/_/go116-zip.go117/archive/zip if the zip117 build tag is used (even on Go 1.17). When processing untrusted EPUBs, r should not point to an unrestricted on-disk filesystem, since paths are not sanitized; it should point to a (*zip.Reader) or another in-memory or synthetic filesystem.
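
As a rough illustration of the advice above, the sketch below builds a tiny in-memory archive and walks it through the fs.FS interface using only the standard library. It does not call kepubify: buildZip and listFiles are hypothetical helper names, and the archive contents are placeholders rather than a valid EPUB. A (*zip.Reader) opened this way is the kind of value that could be passed as r.

```go
package main

import (
	"archive/zip"
	"bytes"
	"fmt"
	"io/fs"
	"sort"
)

// buildZip (hypothetical helper) writes the given files into an in-memory zip.
func buildZip(files map[string]string) ([]byte, error) {
	var buf bytes.Buffer
	zw := zip.NewWriter(&buf)
	names := make([]string, 0, len(files))
	for name := range files {
		names = append(names, name)
	}
	sort.Strings(names) // deterministic entry order
	for _, name := range names {
		w, err := zw.Create(name)
		if err != nil {
			return nil, err
		}
		if _, err := w.Write([]byte(files[name])); err != nil {
			return nil, err
		}
	}
	if err := zw.Close(); err != nil {
		return nil, err
	}
	return buf.Bytes(), nil
}

// listFiles (hypothetical helper) opens data as a zip and lists its regular
// files through fs.FS, without touching the on-disk filesystem.
func listFiles(data []byte) ([]string, error) {
	zr, err := zip.NewReader(bytes.NewReader(data), int64(len(data)))
	if err != nil {
		return nil, err
	}
	var fsys fs.FS = zr // *zip.Reader implements fs.FS
	var out []string
	err = fs.WalkDir(fsys, ".", func(p string, d fs.DirEntry, err error) error {
		if err != nil {
			return err
		}
		if !d.IsDir() {
			out = append(out, p)
		}
		return nil
	})
	sort.Strings(out)
	return out, err
}

func main() {
	data, err := buildZip(map[string]string{
		"mimetype":          "application/epub+zip",
		"OEBPS/content.opf": "<package/>",
	})
	if err != nil {
		panic(err)
	}
	names, err := listFiles(data)
	if err != nil {
		panic(err)
	}
	for _, name := range names {
		fmt.Println(name)
	}
}
```

Because the reader is backed by a byte slice, a malicious path inside the archive cannot reach files outside it, which is the concern with an unrestricted on-disk filesystem.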

func (*Converter) TransformContent

func (c *Converter) TransformContent(w io.Writer, r io.Reader) error

TransformContent transforms an HTML4/HTML5/XHTML1.1 document for a KEPUB.

  • [important] parses the XHTML with XHTML/XML/HTML4/HTML5-compatible rules

    Quite a few books have invalid XHTML, and this prevents the markup from being mangled any more than absolutely necessary. This also has the side effect of fixing bad markup when combined with the render step at the end. The intention is to match or exceed the kepub renderer's leniency. This lenient parsing is also why kepubify often works better with badly-formed HTML than Calibre. See the documentation in the x/net/html fork for more information about how this works.

    The most important changes to default HTML5 parsing rules are to allow more tags to be self-closing, to ignore UTF-8 byte order marks, and to preserve XML instructions.

  • [mandatory] add Kobo style tweaks

    To match official KEPUBs.

  • [mandatory] add Kobo div wrappers

    To match official KEPUBs. Kobo wraps the body with two div tags, `div#book-columns > div#book-inner`, to provide a target for applying pagination styles.

  • [mandatory] add Kobo spans

    To match official KEPUBs. Kobo adds spans surrounding each fragment (see the regexp and matching logic) to provide better references to chunks of text. Highlighting, bookmarking, and other related features don't work without this.

  • [optional] add extra CSS

    For customization or to fix common issues.

  • [optional] smarten punctuation

    A common tweak to improve badly-formatted books.

  • [extra] content cleanup

    Removes Adept tags, extraneous MS Office tags, Unicode replacement chars, etc.

  • [important] renders the HTML as polyglot XHTML/HTML4/HTML5

    The HTML is rendered for maximum compatibility and to be as close to the original HTML as possible. See the documentation in the x/net/html fork for more information about how this works.

    The most important aspects are: the use of &#160; for non-breaking spaces, always specifying xmlns on html/math/svg, always specifying a type on script and style, always specifying a value for boolean attributes, always adding a closing slash to void elements, never self-closing non-void elements, only using XML-defined named escapes `<>&`, only using HTML-style comments, ensuring table contents are well-formed, and preserving the XML declaration if in the original code.

  • [optional] find/replace

    To allow users to apply quick one-off fixes to the generated HTML.

  • [important] ensure charset is UTF-8

    EPUBs (and KEPUBs by extension) must be UTF-8/UTF-16.
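
To make the div wrapper step more concrete, here is a deliberately naive sketch of the `div#book-columns > div#book-inner` structure. It is not how kepubify does it: the steps above describe a lenient HTML parser and renderer, whereas this sketch uses plain string matching and assumes a single, well-formed body element. The name wrapBody is hypothetical.

```go
package main

import (
	"fmt"
	"regexp"
)

// These patterns are a simplification for the sketch; a real implementation
// would work on a parsed document tree instead of raw text.
var (
	bodyOpen  = regexp.MustCompile(`(?i)<body(\s[^>]*)?>`)
	bodyClose = regexp.MustCompile(`(?i)</body\s*>`)
)

// wrapBody (hypothetical) wraps the contents of the body element in the two
// Kobo divs. It returns the input unchanged and false if no body is found.
func wrapBody(doc string) (string, bool) {
	open := bodyOpen.FindStringIndex(doc)
	if open == nil {
		return doc, false
	}
	rest := bodyClose.FindStringIndex(doc[open[1]:])
	if rest == nil {
		return doc, false
	}
	end := open[1] + rest[0] // offset of the closing body tag in doc
	return doc[:open[1]] +
		`<div id="book-columns"><div id="book-inner">` +
		doc[open[1]:end] +
		`</div></div>` +
		doc[end:], true
}

func main() {
	out, ok := wrapBody(`<html><body class="x"><p>Hi</p></body></html>`)
	fmt.Println(ok)
	fmt.Println(out)
}
```

The outer div gives pagination styles a stable target regardless of what attributes the original body carried, which is why the body tag itself is left untouched.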

func (*Converter) TransformDummyTitlepage

func (c *Converter) TransformDummyTitlepage(epub fs.FS, opfF string, opf *bytes.Buffer) (string, io.Reader, bool, error)

TransformDummyTitlepage adds a dummy titlepage if forced or if the heuristic determines that it is necessary. If there was an error determining whether the titlepage is required, false and an error are returned. If it is not required, false is returned. If it is required but there was an error adding it, true and an error are returned. If it was added successfully, the filename and contents of the content document to add to the EPUB, and true, are returned.

The heuristic determines whether the first content file in the spine is a title page without other content. If it isn't (or if the fix is force-enabled, see ConverterOptionDummyTitlepage), a blank content document is added, and the OPF is modified to add it to the manifest and to the start of the spine. This is required because Kobo treats the first spine entry specially (e.g. no margins) for full-screen book covers. See #33.

Note that the heuristic is subject to change between kepubify versions.

func (*Converter) TransformFileFilter

func (c *Converter) TransformFileFilter(fn string) bool

TransformFileFilter returns true if a file should be filtered from the EPUB.

  • [extra] remove calibre_bookmarks.txt

  • [extra] remove iBooks metadata

  • [extra] remove macOS metadata

  • [extra] remove Windows metadata
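
A filter of this kind can be sketched as a simple predicate over archive paths. The code below is only an assumption about what such a check might look like: the function name shouldFilter and every file name other than calibre_bookmarks.txt are illustrative guesses at typical iBooks, macOS, and Windows metadata, not the list kepubify actually uses.

```go
package main

import (
	"fmt"
	"path"
	"strings"
)

// shouldFilter (hypothetical) reports whether an archive path looks like
// tool or OS metadata rather than book content.
func shouldFilter(name string) bool {
	base := strings.ToLower(path.Base(name))
	switch base {
	case "calibre_bookmarks.txt": // named in the list above
		return true
	case "itunesmetadata.plist", "itunesartwork": // assumed iBooks metadata
		return true
	case ".ds_store": // assumed macOS metadata
		return true
	case "thumbs.db": // assumed Windows metadata
		return true
	}
	// Assumed macOS archive residue: a __MACOSX directory at any depth.
	if strings.HasPrefix(name, "__MACOSX/") || strings.Contains(name, "/__MACOSX/") {
		return true
	}
	return false
}

func main() {
	for _, name := range []string{
		"META-INF/calibre_bookmarks.txt",
		"OEBPS/.DS_Store",
		"OEBPS/ch1.xhtml",
	} {
		fmt.Println(name, shouldFilter(name))
	}
}
```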

func (*Converter) TransformOPF

func (c *Converter) TransformOPF(w io.Writer, r io.Reader) error

TransformOPF transforms the OPF document for a KEPUB.

  • [mandatory] add the cover-image property to the cover. Kobo only supports the standardized EPUB3-style method of specifying the cover (`manifest>item[properties="cover-image"]`), but most older EPUBs reference the manifest item with a meta element like `meta[name="cover"][content="{manifest-item-id}"]`, or just set the manifest item ID to `cover` instead of using `properties`.

  • [extra] remove unnecessary Calibre metadata. Removes extraneous metadata elements commonly added by Calibre.
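
The cover lookup described in the first item can be sketched with encoding/xml. This is an illustration of the three conventions mentioned there, not kepubify's implementation; opfPackage, coverItem, and sampleOPF are hypothetical names, and the sketch only reports which manifest item would need the property rather than rewriting the OPF.

```go
package main

import (
	"encoding/xml"
	"fmt"
	"strings"
)

// opfPackage captures only the parts of an OPF document the sketch needs.
type opfPackage struct {
	Metas []struct {
		Name    string `xml:"name,attr"`
		Content string `xml:"content,attr"`
	} `xml:"metadata>meta"`
	Items []struct {
		ID         string `xml:"id,attr"`
		Properties string `xml:"properties,attr"`
	} `xml:"manifest>item"`
}

// sampleOPF is a made-up EPUB2-style package using the meta[name="cover"] form.
const sampleOPF = `<package xmlns="http://www.idpf.org/2007/opf" version="2.0">
  <metadata><meta name="cover" content="img1"/></metadata>
  <manifest>
    <item id="img1" href="cover.jpg" media-type="image/jpeg"/>
    <item id="ch1" href="ch1.xhtml" media-type="application/xhtml+xml"/>
  </manifest>
</package>`

// coverItem (hypothetical) returns the manifest ID of the cover and whether
// the cover-image property still needs to be added to it.
func coverItem(opf []byte) (string, bool, error) {
	var pkg opfPackage
	if err := xml.Unmarshal(opf, &pkg); err != nil {
		return "", false, err
	}
	// EPUB3 style: an item already carries the cover-image property.
	for _, it := range pkg.Items {
		for _, p := range strings.Fields(it.Properties) {
			if p == "cover-image" {
				return it.ID, false, nil
			}
		}
	}
	// Older styles: the ID named by meta[name="cover"], then the literal ID "cover".
	candidates := []string{}
	for _, m := range pkg.Metas {
		if m.Name == "cover" && m.Content != "" {
			candidates = append(candidates, m.Content)
		}
	}
	candidates = append(candidates, "cover")
	for _, want := range candidates {
		for _, it := range pkg.Items {
			if it.ID == want {
				return it.ID, true, nil
			}
		}
	}
	return "", false, nil
}

func main() {
	id, needs, err := coverItem([]byte(sampleOPF))
	if err != nil {
		panic(err)
	}
	fmt.Println(id, needs)
}
```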

type ConverterOption

type ConverterOption func(*Converter)

ConverterOption configures a Converter.
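
ConverterOption follows the functional options pattern: each option is a function that mutates the Converter being built. The standalone sketch below mirrors that shape with hypothetical lowercase names (converter, option, withCSS, withHyphenate, newConverter) so that it runs without importing kepubify. The fields are assumptions, since the real Converter's fields are unexported.

```go
package main

import "fmt"

// converter stands in for the real Converter; its fields are assumptions.
type converter struct {
	extraCSS  []string
	hyphenate *bool // nil means "not set": no state is enforced
}

// option mirrors the shape of ConverterOption: func(*Converter).
type option func(*converter)

// withCSS appends a block of CSS, so repeated use accumulates.
func withCSS(css string) option {
	return func(c *converter) { c.extraCSS = append(c.extraCSS, css) }
}

// withHyphenate force-enables or force-disables hyphenation.
func withHyphenate(on bool) option {
	return func(c *converter) { c.hyphenate = &on }
}

// newConverter applies the options in order to a zero-value converter.
func newConverter(opts ...option) *converter {
	c := &converter{}
	for _, opt := range opts {
		opt(c)
	}
	return c
}

func main() {
	c := newConverter(withCSS("p { margin: 0 }"), withHyphenate(false))
	fmt.Println(len(c.extraCSS), c.hyphenate != nil && !*c.hyphenate)
}
```

Using a pointer for the hyphenation setting is one way to distinguish "force-disabled" from "not set", which matches the three states described for ConverterOptionHyphenate and ConverterOptionDummyTitlepage.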

func ConverterOptionAddCSS

func ConverterOptionAddCSS(css string) ConverterOption

ConverterOptionAddCSS adds CSS code to a book.

func ConverterOptionCharset added in v4.0.4

func ConverterOptionCharset(charset string) ConverterOption

ConverterOptionCharset overrides the charset for all content documents. Use "auto" to automatically detect the charset.

func ConverterOptionDummyTitlepage

func ConverterOptionDummyTitlepage(add bool) ConverterOption

ConverterOptionDummyTitlepage force-enables or force-disables the fix which adds a dummy titlepage to the start of the book to fix layout issues on certain books. If not set, a heuristic is used to determine whether it should be added.

func ConverterOptionFindReplace

func ConverterOptionFindReplace(find, replace string) ConverterOption

ConverterOptionFindReplace replaces a raw string in the transformed HTML.

func ConverterOptionFullScreenFixes

func ConverterOptionFullScreenFixes() ConverterOption

ConverterOptionFullScreenFixes applies fullscreen fixes for firmware versions older than 4.19.11911.

func ConverterOptionHyphenate

func ConverterOptionHyphenate(hyphenate bool) ConverterOption

ConverterOptionHyphenate force-enables or force-disables hyphenation. If not set, no specific state is enforced by kepubify.

func ConverterOptionSmartypants

func ConverterOptionSmartypants() ConverterOption

ConverterOptionSmartypants enables smart punctuation.

Directories

Path Synopsis
Command kobotest tests kepub span logic (only, not divs or other kepub stuff, which is pretty straightforward anyways) against other kepubs.
