formatter

package
v0.1.1
Published: Nov 18, 2021 License: BSD-3-Clause Imports: 17 Imported by: 0

README

Formatting Styled Text on Monospaced Output Devices

Package formatter formats styled text on output devices with fixed-width fonts. It is intended for situations where the application is responsible for the visual representation (as opposed to output to a browser, which usually addresses the complications of text by itself, transparently for applications). Think of this package in terms of fmt.Println for styled, bi-directional text.

Documentation

Overview

Output of styled text differs in many aspects from simple string output. Not only do we need an output device which is capable of displaying text styles, but we need to consider line-breaking and the handling of bi-directional (Bidi) text as well. This package helps with the following tasks:

▪︎ Select a formatter for a given (monospaced) output device

▪︎ Create a suitable formatting configuration

▪︎ Format a styled paragraph of possibly bi-directional text and output it to the device

Formatting and output need to perform a couple of steps to produce a correct visual representation. These steps are in large part covered by various Unicode Annexes, and in general it is non-trivial to get them right (https://raphlinus.github.io/text/2020/10/26/text-layout.html). Package formatter will apply rules from UAX#9 (bidi), UAX#14 (line breaking), UAX#29 (graphemes) and UAX#11 (character width), as well as some heuristics depending on the output device.

This package does not constitute a typesetter. We will not deal with fonts, glyphing, variable text widths, elaborate line-breaking algorithms, etc. In particular we will not handle issues having to do with fonts or with locale-specific glyphs missing for an output device.

The Problems it Solves

As an application developer most of the time you do not have a need to consider the fine points of styled and bidirectional text. Most applications deal with strings, not text (https://mortoray.com/2014/03/17/strings-and-text-are-not-the-same/).

However, if you happen to really need it, support for text as a data structure is sparse in system development languages like Go (Rust is about to prove me wrong on this), and dealing with bidi text is sometimes complicated. What's more, libraries for text are peculiarly hard to test, as there is no easy output target except browsers and terminals. Browsers are, of all applications, among the best at dealing with text styles and bidi. That sometimes makes it hard to test your own bidi or styling algorithms, as the browser's logic will interfere with them. And terminals have their own kinds of challenges with bidi, often making it difficult to pinpoint an error.

API

Clients select an instance of type formatter.Format and possibly configure it to their needs. Before a piece of styled text can be output, it has to be broken up into paragraphs, because the Unicode Bidi Algorithm works on paragraphs. Breaking up into paragraphs may be done by the client explicitly, or a formatter may be able to do the paragraph-splitting itself.

text := styled.TextFromString("The quick brown fox jumps over the כלב עצלן!")
text.Style(inline.BoldStyle, 4, 9)  // want 'quick' in boldface
para, _ := styled.ParagraphFromText(text, 0, text.Raw().Len(), bidi.LeftToRight, nil)

console := NewLocalConsoleFormat()
console.Print(para, nil)
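
Explicit paragraph splitting by the client might look like the sketch below. It uses only the standard library. `splitParagraphs` is a hypothetical helper and not part of this package, and it treats only LF, CR and U+2029 as separators, which is a subset of the paragraph separators defined by UAX#9.

```go
package main

import (
	"fmt"
	"strings"
)

// splitParagraphs is a hypothetical helper (not part of package formatter).
// It cuts text at line feeds, carriage returns and U+2029 (PARAGRAPH
// SEPARATOR) and drops empty pieces.
func splitParagraphs(text string) []string {
	isSep := func(r rune) bool {
		return r == '\n' || r == '\r' || r == '\u2029'
	}
	return strings.FieldsFunc(text, isSep)
}

func main() {
	for i, p := range splitParagraphs("First paragraph.\n\nSecond one.\u2029Third.") {
		fmt.Printf("%d: %q\n", i, p)
	}
}
```

Each resulting string would then be turned into a styled text and a paragraph as in the example above.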

formatter.Format is an interface type and this package offers two implementations, one for console output (like in the example above) and one for HTML output.

Status

Work in progress; in particular, the HTML formatter is in its infancy. Needs a lot more testing. API not stable.

_________________________________________________________________________

BSD 3-Clause License

All rights reserved.

Redistribution and use in source and binary forms, with or without modification, are permitted provided that the following conditions are met:

1. Redistributions of source code must retain the above copyright notice, this list of conditions and the following disclaimer.

2. Redistributions in binary form must reproduce the above copyright notice, this list of conditions and the following disclaimer in the documentation and/or other materials provided with the distribution.

3. Neither the name of the copyright holder nor the names of its contributors may be used to endorse or promote products derived from this software without specific prior written permission.

THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.

Constants

This section is empty.

Variables

var EmptyCodes = ControlCodes{
	Preamble:  []byte{},
	Postamble: []byte{},
	LTR:       []byte{},
	RTL:       []byte{},
	Newline:   []byte{'\n'},
}

EmptyCodes is the default set of control codes.

var StandardCodes = ControlCodes{
	Preamble:  []byte{27, '[', '8', 'l'},
	Postamble: []byte{},
	LTR:       []byte{27, '[', '1', ' ', 'k'},
	RTL:       []byte{27, '[', '2', ' ', 'k'},
	Newline:   []byte{'\n'},
}

StandardCodes is the set of control codes for standards-conforming terminals. See https://terminal-wg.pages.freedesktop.org/bidi/recommendation/escape-sequences.html
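
To make the byte values easier to read, the following sketch prints them as quoted strings. The struct is redeclared locally (as `controlCodes`) only so that the snippet compiles without this package. Byte 27 is ESC, so the sequences have the form `ESC [ 8 l`, `ESC [ 1 SP k` and `ESC [ 2 SP k`. As far as I read the linked recommendation, the first selects explicit bidi mode and the other two select the paragraph direction, but please check the recommendation itself for the authoritative meaning.

```go
package main

import "fmt"

// controlCodes mirrors the ControlCodes struct documented in this package;
// it is redeclared here only to keep the sketch self-contained.
type controlCodes struct {
	Preamble, Postamble []byte
	LTR, RTL            []byte
	Newline             []byte
}

// standard holds the same byte values as StandardCodes.
var standard = controlCodes{
	Preamble:  []byte{27, '[', '8', 'l'},
	Postamble: []byte{},
	LTR:       []byte{27, '[', '1', ' ', 'k'},
	RTL:       []byte{27, '[', '2', ' ', 'k'},
	Newline:   []byte{'\n'},
}

func main() {
	// %q renders the ESC byte (27) as \x1b, which makes the sequences readable.
	fmt.Printf("Preamble: %q\n", standard.Preamble)
	fmt.Printf("LTR:      %q\n", standard.LTR)
	fmt.Printf("RTL:      %q\n", standard.RTL)
}
```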

var WindowsCodes = ControlCodes{
	Preamble:  []byte{},
	Postamble: []byte{},
	LTR:       []byte{},
	RTL:       []byte{},
	Newline:   []byte{'\r', '\n'},
}

WindowsCodes is the default set of control codes for Windows.

Functions

func Output

func Output(para *styled.Paragraph, out io.Writer, config *Config, format Format) error

Output formats a paragraph of styled text using a given formatter.

None of the arguments may be nil. However, it is safe to have config.Context set to nil. In this case, uax11.LatinContext is used.

TODO do not consume para

func T

func T() tracing.Trace

T traces to a global core-tracer.

Types

type Config

type Config struct {
	LineWidth int            // line width in terms of ‘en’s, i.e. fixed character width
	Justify   bool           // require output lines to be fully justified
	Debug     bool           // output additional information for debugging
	Context   *uax11.Context // language context
}

Config represents a set of configuration parameters for formatting.

func ConfigFromTerminal

func ConfigFromTerminal() *Config

ConfigFromTerminal is a simple helper for creating a formatting Config. It checks whether stdout is a terminal, and if so it reads the terminal's width and sets the Config.LineWidth parameter accordingly.
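
The general idea can be sketched with the standard library alone. This is not how ConfigFromTerminal is implemented. It is a rough approximation which tests for a character device and reads the `COLUMNS` environment variable, which many shells do not export. The names `isTerminal` and `widthFromColumns` and the default of 80 are assumptions made for illustration.

```go
package main

import (
	"fmt"
	"os"
	"strconv"
)

// isTerminal reports whether f looks like a character device, which is
// the case for an interactive terminal (but also for e.g. /dev/null).
func isTerminal(f *os.File) bool {
	info, err := f.Stat()
	if err != nil {
		return false
	}
	return info.Mode()&os.ModeCharDevice != 0
}

// widthFromColumns parses a COLUMNS-style value and falls back to def
// if the value is missing, malformed or not positive.
func widthFromColumns(columns string, def int) int {
	n, err := strconv.Atoi(columns)
	if err != nil || n <= 0 {
		return def
	}
	return n
}

func main() {
	width := 80 // hypothetical default for non-interactive output
	if isTerminal(os.Stdout) {
		width = widthFromColumns(os.Getenv("COLUMNS"), width)
	}
	fmt.Println("line width:", width)
}
```

A more robust program would query the terminal size from the operating system instead of relying on the environment.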

type ConsoleFixedWidth

type ConsoleFixedWidth struct {
	Codes        *ControlCodes // escape sequences, usually set by constructor
	NeedsReorder ReorderFlag   // re-ordering hint, usually set by constructor
	// contains filtered or unexported fields
}

ConsoleFixedWidth is a type for outputting formatted text to a console with a fixed width font.

Console/Terminal output is notoriously tricky for bi-directional text and for scripts other than Latin. To fully appreciate the difficulties behind this, refer for example to https://terminal-wg.pages.freedesktop.org/bidi/bidi-intro/why-terminals-are-special.html

As long as there is no widely accepted standard for Bidi handling in terminals, we have to rely on heuristics and explicitly set device-dependent configuration. This is unfortunate for applications which are supposed to run in multi-platform and multi-regional environments. However, it is no longer acceptable for applications to be content with handling Latin text only.

Example

Example for bi-directional text and line-breaking according to the Unicode Bidi algorithm. We set up an unusual console format to make newlines visible in the Godoc documentation. Then we configure a line length of 40 'en's, which ensures a line-break between the two words in Hebrew script.

Please note that this is in a sense a contrived example, as it has to work from godoc in the browser. The browser will do the right thing with Bidi anyway. However, the example shows a typical use case and has a chance to work on different terminals with varying support for bidi text.

console := NewLocalConsoleFormat()
console.Codes.Newline = []byte("<nl>\n") // just to please godoc
config := &Config{LineWidth: 40}         // format into narrow lines
//
text := styled.TextFromString("The quick brown fox jumps over the כלב עצלן!")
para, _ := styled.ParagraphFromText(text, 0, text.Raw().Len(), bidi.LeftToRight, nil)
console.Print(para, config)
Output:

The quick brown fox jumps over the כלב <nl>
עצלן!<nl>
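
Why the break falls between the two Hebrew words can be checked with a simple greedy wrap. The sketch below is not the package's line breaker (which follows UAX#14 and UAX#11). It splits at spaces and counts one column per rune, an assumption that holds for the narrow Latin and Hebrew letters of this example but not for wide or combining characters. `wrapGreedy` is a hypothetical name.

```go
package main

import (
	"fmt"
	"strings"
	"unicode/utf8"
)

// wrapGreedy breaks text at spaces so that no line is wider than width
// columns, counting one column per rune. Words longer than width are
// placed on a line of their own and not split.
func wrapGreedy(text string, width int) []string {
	var lines []string
	var line strings.Builder
	col := 0
	for _, word := range strings.Fields(text) {
		w := utf8.RuneCountInString(word)
		if col > 0 && col+1+w > width {
			// The next word does not fit: close the current line.
			lines = append(lines, line.String())
			line.Reset()
			col = 0
		}
		if col > 0 {
			line.WriteByte(' ')
			col++
		}
		line.WriteString(word)
		col += w
	}
	if col > 0 {
		lines = append(lines, line.String())
	}
	return lines
}

func main() {
	text := "The quick brown fox jumps over the כלב עצלן!"
	for _, line := range wrapGreedy(text, 40) {
		fmt.Printf("%s<nl>\n", line)
	}
}
```

The text up to and including the first Hebrew word occupies 38 columns. Adding a space and the five characters of the second word would need 44, so the second word moves to the next line. Unlike the package's output above, this sketch drops the space at the break.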

func NewConsoleFixedWidthFormat

func NewConsoleFixedWidthFormat(codes *ControlCodes, colors map[styled.Style]*color.Color,
	reorder ReorderFlag) *ConsoleFixedWidth

NewConsoleFixedWidthFormat creates a new formatter. It is to be used for consoles with a fixed width font.

codes is a table of escape sequences to control Bidi behaviour of the console and may be nil. colors is a map from the styled.Styles to colors, used for display. It may contain just a subset of the styles used in the texts which will be handled by this formatter.

This API is for clients having a need for fine control over the formatter. Often it is enough to call `NewLocalConsoleFormat`.

func NewLocalConsoleFormat

func NewLocalConsoleFormat() *ConsoleFixedWidth

NewLocalConsoleFormat creates a formatter for the terminal stdout is connected to. It uses various heuristics to identify the correct settings.

func (*ConsoleFixedWidth) LTR

func (fw *ConsoleFixedWidth) LTR(w io.Writer)

LTR signals to w that a bidi.LeftToRight sequence is to be output. (Part of interface Format)

func (*ConsoleFixedWidth) Line

func (fw *ConsoleFixedWidth) Line(length int, linelength int, w io.Writer)

Line is a signal from the output driver that a new line is to be output. length is the total width of the characters that will be formatted, measured in “en”s, i.e. fixed width positions. linelength is the target line length to wrap long lines. (Part of interface Format)

func (*ConsoleFixedWidth) NeedsReordering

func (fw *ConsoleFixedWidth) NeedsReordering() ReorderFlag

NeedsReordering signals to the formatting driver what kind of support the console needs with Bidi text.

func (*ConsoleFixedWidth) Newline

func (fw *ConsoleFixedWidth) Newline(w io.Writer)

Newline will be called at the end of every formatted line of text. It outputs the `Newline` escape sequence from fw.Codes. (Part of interface Format)

func (*ConsoleFixedWidth) Postamble

func (fw *ConsoleFixedWidth) Postamble(w io.Writer)

Postamble will be called after a paragraph of text has been formatted. It outputs the `Postamble` escape sequence from fw.Codes. (Part of interface Format)

func (*ConsoleFixedWidth) Preamble

func (fw *ConsoleFixedWidth) Preamble(w io.Writer)

Preamble is called by the output driver before a paragraph of text will be formatted. It outputs the `Preamble` escape sequence from fw.Codes. (Part of interface Format)

func (*ConsoleFixedWidth) Print

func (fw *ConsoleFixedWidth) Print(para *styled.Paragraph, config *Config) error

Print outputs a styled paragraph to stdout.

If parameter config is nil, a heuristic will create a config from the current terminal's properties (if stdout is interactive). Config.Context will also be created based on heuristics from the user environment.

func (*ConsoleFixedWidth) RTL

func (fw *ConsoleFixedWidth) RTL(w io.Writer)

RTL signals to w that a bidi.RightToLeft sequence is to be output. (Part of interface Format)

func (*ConsoleFixedWidth) StyledText

func (fw *ConsoleFixedWidth) StyledText(s string, style styled.Style, w io.Writer)

StyledText is called by the formatting driver to output a sequence of uniformly styled text (item). It uses colors to visualize styles. (Part of interface Format)

type ControlCodes

type ControlCodes struct {
	Preamble, Postamble []byte
	LTR, RTL            []byte
	Newline             []byte
}

ControlCodes holds certain escape sequences which a terminal uses to control Bidi behaviour.

type Format

type Format interface {
	Preamble(io.Writer)                         // output a preamble before a styled paragraph
	Postamble(io.Writer)                        // output a postamble
	StyledText(string, styled.Style, io.Writer) // output uniformly styled text run (item)
	LTR(io.Writer)                              // signal the start of a left-to-right run of text
	RTL(io.Writer)                              // signal the start of a right-to-left run of text
	Line(int, int, io.Writer)                   // signal for the start of a new line
	Newline(io.Writer)                          // output an end-of-line delimiter
	NeedsReordering() ReorderFlag               // what kind of re-ordering support does the formatter need?
}

Format is an interface for formatting drivers, given an io.Writer.
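
A custom formatter only has to provide these eight methods. The sketch below redeclares the interface with stand-in types (`Style` and `ReorderFlag` here are local placeholders, not the real types) so that it compiles without this package. `traceFormat`, `drive` and `trace` are hypothetical names, and the order of calls in `drive` is an assumption derived from the method comments, not taken from the implementation of Output.

```go
package main

import (
	"fmt"
	"io"
	"strings"
)

// Stand-ins for types of this package and of package styled, redeclared
// so that the sketch compiles on its own.
type (
	Style       string // placeholder for styled.Style
	ReorderFlag int    // placeholder for formatter.ReorderFlag
)

// Format repeats the method set documented above.
type Format interface {
	Preamble(io.Writer)
	Postamble(io.Writer)
	StyledText(string, Style, io.Writer)
	LTR(io.Writer)
	RTL(io.Writer)
	Line(int, int, io.Writer)
	Newline(io.Writer)
	NeedsReordering() ReorderFlag
}

// traceFormat is a hypothetical formatter which writes one marker per call.
type traceFormat struct{}

func (traceFormat) Preamble(w io.Writer)  { fmt.Fprint(w, "[para]") }
func (traceFormat) Postamble(w io.Writer) { fmt.Fprint(w, "[/para]") }
func (traceFormat) LTR(w io.Writer)       { fmt.Fprint(w, "[ltr]") }
func (traceFormat) RTL(w io.Writer)       { fmt.Fprint(w, "[rtl]") }
func (traceFormat) Newline(w io.Writer)   { fmt.Fprint(w, "[nl]\n") }
func (traceFormat) StyledText(s string, sty Style, w io.Writer) {
	fmt.Fprintf(w, "{%s:%s}", sty, s)
}
func (traceFormat) Line(length, linelength int, w io.Writer) {
	fmt.Fprintf(w, "[line %d/%d]", length, linelength)
}
func (traceFormat) NeedsReordering() ReorderFlag { return 0 }

// drive calls the methods in an order suggested by their documentation;
// the exact order used by the real output driver is an assumption.
func drive(f Format, w io.Writer) {
	f.Preamble(w)
	f.Line(9, 40, w)
	f.LTR(w)
	f.StyledText("The ", "plain", w)
	f.StyledText("quick", "bold", w)
	f.Newline(w)
	f.Postamble(w)
}

// trace runs the driver against a traceFormat and returns what was written.
func trace() string {
	var sb strings.Builder
	drive(traceFormat{}, &sb)
	return sb.String()
}

func main() {
	fmt.Println(trace())
}
```

A tracing formatter like this can be handy for tests, as it makes the sequence of driver signals visible without involving a terminal or a browser.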

type HTML

type HTML struct {
	// contains filtered or unexported fields
}

HTML is a format for simple HTML output.

func NewHTML

func NewHTML(reorder ReorderFlag) *HTML

NewHTML creates an HTML formatter.

func (*HTML) LTR

func (html *HTML) LTR(w io.Writer)

LTR signals to w that a bidi.LeftToRight sequence is to be output. It outputs a closing `</span>` if necessary, and a `<span dir="ltr">` tag. (Part of interface Format)

func (*HTML) Line

func (html *HTML) Line(length int, linelength int, w io.Writer)

Line is a signal from the output driver that a new line is to be output. length is the total width of the characters that will be formatted, measured in “en”s, i.e. fixed width positions. linelength is the target line length to wrap long lines.

Currently does nothing. (Part of interface Format)

func (*HTML) NeedsReordering

func (html *HTML) NeedsReordering() ReorderFlag

NeedsReordering signals to the formatting driver what kind of support the formatter needs with Bidi text.

func (*HTML) Newline

func (html *HTML) Newline(w io.Writer)

Newline will be called at the end of every formatted line of text. It outputs a `<br>` tag. (Part of interface Format)

func (*HTML) Postamble

func (html *HTML) Postamble(w io.Writer)

Postamble will be called after a paragraph of text has been formatted. It outputs a closing `</span>` if necessary, and a closing `</pre>` tag. (Part of interface Format)

func (*HTML) Preamble

func (html *HTML) Preamble(w io.Writer)

Preamble is called by the output driver before a paragraph of text will be formatted. It outputs an opening `<pre>` tag. (Part of interface Format)

func (*HTML) Print

func (html *HTML) Print(para *styled.Paragraph, w io.Writer, config *Config) error

Print outputs a styled paragraph as HTML.

If parameter config is nil, a default configuration will be used. Config.Context will also be created based on heuristics from the user environment.

func (*HTML) RTL

func (html *HTML) RTL(w io.Writer)

RTL signals to w that a bidi.RightToLeft sequence is to be output. It outputs a closing `</span>` if necessary, and a `<span dir="rtl">` tag. (Part of interface Format)

func (*HTML) StyledText

func (html *HTML) StyledText(s string, style styled.Style, w io.Writer)

StyledText is called by the formatting driver to output a sequence of uniformly styled text (item). (Part of interface Format)
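
The tag handling described for Preamble, LTR, RTL, Newline and Postamble can be pictured with a small state machine that remembers whether a direction span is open. The sketch below is a stand-in written for illustration and not the code of type HTML. `htmlSketch`, `Text` and `render` are hypothetical names, and escaping the text with `html.EscapeString` is an assumption.

```go
package main

import (
	"fmt"
	"html"
	"io"
	"strings"
)

// htmlSketch is a hypothetical stand-in for the HTML formatter. It only
// tracks whether a direction <span> is currently open.
type htmlSketch struct {
	spanOpen bool
}

// closeSpan writes a closing </span> if a span is open.
func (h *htmlSketch) closeSpan(w io.Writer) {
	if h.spanOpen {
		io.WriteString(w, "</span>")
		h.spanOpen = false
	}
}

// openSpan closes a pending span and opens a new one with direction dir.
func (h *htmlSketch) openSpan(dir string, w io.Writer) {
	h.closeSpan(w)
	fmt.Fprintf(w, `<span dir="%s">`, dir)
	h.spanOpen = true
}

func (h *htmlSketch) Preamble(w io.Writer) { io.WriteString(w, "<pre>") }
func (h *htmlSketch) LTR(w io.Writer)      { h.openSpan("ltr", w) }
func (h *htmlSketch) RTL(w io.Writer)      { h.openSpan("rtl", w) }
func (h *htmlSketch) Newline(w io.Writer)  { io.WriteString(w, "<br>") }
func (h *htmlSketch) Text(s string, w io.Writer) {
	io.WriteString(w, html.EscapeString(s)) // escaping is an assumption
}
func (h *htmlSketch) Postamble(w io.Writer) {
	h.closeSpan(w)
	io.WriteString(w, "</pre>")
}

// render formats one line consisting of an LTR run followed by an RTL run.
func render(ltr, rtl string) string {
	var sb strings.Builder
	h := &htmlSketch{}
	h.Preamble(&sb)
	h.LTR(&sb)
	h.Text(ltr, &sb)
	h.RTL(&sb)
	h.Text(rtl, &sb)
	h.Newline(&sb)
	h.Postamble(&sb)
	return sb.String()
}

func main() {
	fmt.Println(render("jumps over the ", "כלב"))
}
```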

type HTMLStyle

type HTMLStyle inline.Style

HTMLStyle is a style equivalent to inline.Style, which offers some convenience functions.

func (HTMLStyle) Add

func (s HTMLStyle) Add(sty HTMLStyle) HTMLStyle

Add combines a style with another style.

func (HTMLStyle) Equals

func (s HTMLStyle) Equals(other styled.Style) bool

Equals is part of interface styled.Style.

func (HTMLStyle) String

func (s HTMLStyle) String() string

type ReorderFlag

type ReorderFlag int

ReorderFlag is a hint from a Format whether it needs strings handed over reordered in some fashion.

const (
	ReorderNone      ReorderFlag = iota // formatter does reordering on its own (e.g., browser)
	ReorderWords                        // formatter will handle RTL words, but not phrases
	ReorderGraphemes                    // formatter relies on application for reordering
)

Different formatters have different capabilities regarding bidirectional text.
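
What the three hints could mean for a single right-to-left run is sketched below. This is my reading of the constant comments and not the package's reordering code: for ReorderWords the application reverses the order of words but leaves each word intact, and for ReorderGraphemes it reverses the run character by character. The type and constants are redeclared locally, the helper names are hypothetical, and runes are used where a real implementation would work on grapheme clusters (UAX#29) and full UAX#9 levels.

```go
package main

import (
	"fmt"
	"strings"
)

// ReorderFlag repeats the type and constants documented above so that
// the sketch compiles on its own.
type ReorderFlag int

const (
	ReorderNone ReorderFlag = iota
	ReorderWords
	ReorderGraphemes
)

// reverseRunes reverses s rune by rune. A real implementation would
// reverse grapheme clusters to keep combining marks attached.
func reverseRunes(s string) string {
	r := []rune(s)
	for i, j := 0, len(r)-1; i < j; i, j = i+1, j-1 {
		r[i], r[j] = r[j], r[i]
	}
	return string(r)
}

// reverseWords reverses the order of space-separated words and leaves
// the characters of each word untouched.
func reverseWords(s string) string {
	words := strings.Split(s, " ")
	for i, j := 0, len(words)-1; i < j; i, j = i+1, j-1 {
		words[i], words[j] = words[j], words[i]
	}
	return strings.Join(words, " ")
}

// visualOrder prepares a right-to-left run, given in logical order, for
// a device with the given reordering capability.
func visualOrder(rtlRun string, flag ReorderFlag) string {
	switch flag {
	case ReorderWords:
		return reverseWords(rtlRun)
	case ReorderGraphemes:
		return reverseRunes(rtlRun)
	default: // ReorderNone: the device reorders on its own
		return rtlRun
	}
}

func main() {
	// Latin letters stand in for an RTL run to keep the output readable.
	for _, flag := range []ReorderFlag{ReorderNone, ReorderWords, ReorderGraphemes} {
		fmt.Printf("%d: %q\n", flag, visualOrder("abc def", flag))
	}
}
```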
