fb2text

package module
v0.0.0-...-9be8135 Latest
Published: Jan 31, 2019 License: MIT Imports: 8 Imported by: 0

README

fb2text

A library to convert an FB2 book to a text file. The demo includes a very simple, ready-to-use converter that takes a raw or zipped FB2 file and produces a text file with a defined text width. The conversion result can be saved to a file or printed to the terminal.

Library Functions

IsZipFile(filePath string) bool

Returns whether the file is a zipped FB2 or a raw XML one. The function does not check that the file is valid FB2, so if filePath points to a file that is neither FB2 nor an archive, the function returns false.

Justify(s string, maxWidth int) string

Expands a string to the width maxWidth by adding extra spaces between words. If the string is longer than maxWidth or does not contain a space, the function returns the original string.

Examples:

  • Justify("a b c", 7) ==> "a  b  c"
  • Justify("a b c d", 8) ==> "a b  c d"
  • Justify("abcde", 10) ==> "abcde"

ParseBook(fileName string, parseBody bool) (BookInfo, []string)

Reads an FB2 file (zipped FB2 is unpacked automatically) and converts it into the internal format. The internal format is explained in the description of the function.

  • parseBody - defines whether the caller wants only the information about the book, or the book information together with the whole converted text. Setting parseBody to false can speed up parsing when only the book information is needed, since that information is always at the beginning of an FB2 file

Returns:

  • BookInfo - information about the book (only the most important fields, such as title, author, and sequence)
  • []string - parsed book text in internal format. Please see the description of function ParseBook in the source file for details

FormatBook(parsed []string, maxWidth int, justify bool) []string

The default formatter. The function takes a parsed book in internal format and returns regular text with each line limited to a width of maxWidth.

  • parsed - text in internal format. Please see the function ParseBook for details
  • maxWidth - no line of text exceeds this limit
  • justify - add extra spaces between words to make all lines, except the last line of each paragraph, the same width

Compare the same text with justify = false:

[Image: formatted text without justification]

and with justify = true:

[Image: justified formatted text]

A Demo Application

The demo application is a simple FB2 to txt converter. It saves the result to a file, or prints the text to the terminal if no output file is set.

Usage:

   fb2text [-w N] [-j=0/1] inputFile [outputFile]

Program arguments:

  • -w N - limit the maximum width of a text line to N. N is a number between 30 and 400. The default value is 70
  • -j=0/1 - disable or enable adding extra spaces to lines to make all of them the same width. Please see the description of function FormatBook for examples. The default value is 0 (justification disabled)
  • inputFile - the only required parameter: the FB2 file to convert. It can be either a raw FB2 or a zipped one; zipped FB2 is detected and unpacked automatically
  • outputFile - file name to save the result to. If the output file name is omitted, the converter just prints the converted text to the terminal (stdout)
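
The argument rules above can be sketched with the standard `flag` package. This is only an illustration, not the demo's actual source: the `options` type and the `parseArgs` function are hypothetical names, and rejecting an out-of-range width with an error is an assumption (the real converter may handle it differently).

```go
package main

import (
	"flag"
	"fmt"
	"os"
)

// options is a hypothetical container for the documented arguments.
type options struct {
	width   int    // -w N, 30..400, default 70
	justify bool   // -j=0/1, default 0
	input   string // required
	output  string // optional, empty means "print to stdout"
}

// parseArgs parses a command line shaped like
//   fb2text [-w N] [-j=0/1] inputFile [outputFile]
// Assumption: invalid values are reported as errors.
func parseArgs(args []string) (options, error) {
	var o options
	var j int
	fs := flag.NewFlagSet("fb2text", flag.ContinueOnError)
	fs.IntVar(&o.width, "w", 70, "maximum line width (30..400)")
	fs.IntVar(&j, "j", 0, "justify lines: 0 or 1")
	if err := fs.Parse(args); err != nil {
		return o, err
	}
	if o.width < 30 || o.width > 400 {
		return o, fmt.Errorf("width %d is outside 30..400", o.width)
	}
	if j != 0 && j != 1 {
		return o, fmt.Errorf("-j must be 0 or 1, got %d", j)
	}
	o.justify = j == 1
	if fs.NArg() < 1 {
		return o, fmt.Errorf("input file is required")
	}
	o.input = fs.Arg(0)
	o.output = fs.Arg(1) // Arg returns "" when the argument is absent
	return o, nil
}

func main() {
	o, err := parseArgs(os.Args[1:])
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(2)
	}
	fmt.Printf("%+v\n", o)
}
```

Using a `flag.FlagSet` instead of the global flag set keeps the parsing logic callable from tests with an arbitrary argument slice.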

Documentation

Functions

func FormatBook

func FormatBook(parsed []string, maxWidth int, justify bool) []string

FormatBook is the default converter of a book from the internal format to plain text. The rules of the internal representation, and how the default formatter handles them, are given in the description of function 'ParseBook'.

parsed - a book in internal format

maxWidth - the 'screen width': the maximum width of a text line in the resulting text

justify - whether to add extra spaces between words so that every line, except the last line of each paragraph, has the same width

Returns text ready to display for reading.

func IsZipFile

func IsZipFile(filePath string) bool

IsZipFile checks if the file is a ZIP archive. Returns true if the file is a ZIP or GZIP archive and false otherwise.

func Justify

func Justify(s string, maxWidth int) string

Justify expands a string up to maxWidth length by adding spaces between words. Words are separated by spaces. If the string is longer than maxWidth or contains no spaces, the original string is returned. Examples:

Justify("a b c", 7)  ==> "a  b  c"
Justify("a b c d", 8) ==> "a b  c d"
Justify("abcde", 10) ==> "abcde"

Types

type BookInfo

type BookInfo struct {
	FirstName string
	LastName  string
	Title     string
	Sequence  string
	Language  string
	Genre     string
}

BookInfo is short information about an FB2 book. It supports only a few tags: book title, author's first and last names, sequence, genre, and text language (not the original book language).

func ParseBook

func ParseBook(fileName string, parseBody bool) (BookInfo, []string)

ParseBook converts an FB2 file to a simple list of strings with some extra information needed to display the text correctly. The parsed text is therefore not meant for immediate display: it should be formatted before it is shown to a user.

fileName - path to a file that contains FB2 formatted text. It can be a ZIP archive; the function unpacks zip files automatically.

parseBody - if parseBody is false, the function stops as soon as it reaches the first 'body' tag. By that point all book information has been read. The parameter can be used to read book properties quickly without parsing the entire file.

Returns information about the book (see the BookInfo structure) and, if parseBody is true, the parsed FB2 text in internal format. The format is described below.

All tags are enclosed in double curly brackets, like "{{section}}". Since a terminal is not rich in GUI features, only a few FB2 tags are passed through to the output text. Existing internal tags:

The following tags are always at the very beginning of the line:

{{section}} - marks the start of a section. The default formatter adds an extra empty line.

{{title}} - marks a title line. There can be several title lines in a row. The default formatter centers the title on the screen if the title is shorter than the screen width. Otherwise the title is displayed as a regular paragraph.

{{epi}} - marks the start of an epigraph. The default formatter takes all consecutive epigraph lines, calculates their maximal width, and then right-justifies them so that the longest line ends at the right edge of the screen.

{{epiauth}} - marks the start of the epigraph author text. The default formatter treats this tag the same way as {{epi}}.

The following tags can appear anywhere in a string, which is why they have starting and ending markers:

{{emon}} and {{emoff}} - mark the start and the end of emphasized text. The default formatter skips these tags and does nothing. Two tags of the original FB2 are mapped to {{emon}}: <strong> and <emphasis>.

If a parsed string does not start with "{{", the string is a regular paragraph of text. The default formatter splits the paragraph into lines no longer than the screen width. If a string is longer than the screen width and has no spaces, it is simply cut at the screen width position. If the 'justify' option is set, all lines of the paragraph (except the last one) are expanded with extra spaces to make them the same width.
