Documentation ¶
Overview ¶
Package strinterp provides a demonstration of morally correct string interpolation.
This package was created in support of a blog post about why we are still writing insecure software in 2015: http://www.jerf.org/iri/post/2942
It's the result of about 20 hours of screwing around. I meant to keep it shorter, but I started to have too much fun.
"Morally" correct means that I intend this to demonstrate a point about API and language design, and that any actual utility is a bit coincidental.
That said, as this developed it became potentially more useful than I had initially intended, because instead of expressing all the interpolations in terms of strings, they are all expressed in terms of io.Writers. Since this library also permits inputting the strings to be interpolated in the form of io.Readers, this means that this entire library is fully capable of string interpolation in the middle of streams, not just strings. Or, if you prefer, this is a *stream* interpolator. The "str" in "strinterp" is pleasingly ambiguous.
This documentation focuses on usage; for the reasoning behind the design, consult the blog post.
Using String Interpolators ¶
To use this package, create an interpolator object:
i := strinterp.NewInterpolator()
You can then use it to interpolate strings. The simplest case is concatenation:
concated, err := i.InterpStr("concatenated: %RAW;%RAW;", str1, str2)
See the blog post for a discussion of why this is deliberately a bit heavyweight and *designed* to call attention to the use of "RAW", rather than making such usage a simple and quiet default behavior.
Within the "format string", the first element of the call, each interpolation has the following syntax:
- Begins with %, ends with an unescaped ;
- After the %, starts with the formatter/encoder name
- Which may be followed by a colon, then args for that formatter
- Which may then be followed by a pipe, and further specifications of encoders with optional arguments
You may backslash-escape any of the pipe, colon, or semicolon to pass them through as arguments to the formatter/encoder, or backslash itself to pass it through. (The formatter/encoder will of course receive the decoded bytes without the escaping backslash.) To emit a raw %, use "%%;".
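The escaping rules above can be sketched as a small parser. This is only an illustration of the syntax, not the library's actual parser: parseSpec and stage are hypothetical names, the input is assumed to be the text between the % and the terminating ;, and treating a second unescaped colon as literal argument text is my assumption.

```go
package main

import "fmt"

// stage is one element of a pipeline: a formatter/encoder name plus the
// optional arguments that followed its colon. (Hypothetical type.)
type stage struct {
	Name string
	Args string
}

// parseSpec splits the text between "%" and ";" into stages. A backslash
// makes the following byte literal, an unescaped "|" starts a new stage,
// and the first unescaped ":" in a stage switches from name to arguments.
func parseSpec(spec string) []stage {
	var stages []stage
	var name, args []byte
	inArgs := false

	// flush records the stage collected so far and resets the state.
	flush := func() {
		stages = append(stages, stage{Name: string(name), Args: string(args)})
		name, args, inArgs = nil, nil, false
	}
	// add appends a byte to whichever field is currently being read.
	add := func(c byte) {
		if inArgs {
			args = append(args, c)
		} else {
			name = append(name, c)
		}
	}

	for i := 0; i < len(spec); i++ {
		c := spec[i]
		switch {
		case c == '\\' && i+1 < len(spec):
			i++ // drop the backslash, keep the escaped byte as-is
			add(spec[i])
		case c == '|':
			flush()
		case c == ':' && !inArgs:
			inArgs = true
		default:
			add(c)
		}
	}
	flush()
	return stages
}

func main() {
	for _, s := range parseSpec(`json|base64:url`) {
		fmt.Printf("name=%q args=%q\n", s.Name, s.Args)
	}
	// Escaped delimiters reach the arguments without their backslashes.
	fmt.Printf("%q\n", parseSpec(`wrap:a\|b\:c`)[0].Args)
}
```

Note that the arguments come out already decoded, which matches the statement that the formatter/encoder receives the bytes without the escaping backslash.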
Here is an example of a format string that uses all these features:
result, err := i.InterpStr("copy and paste: %json|base64:url;", obj)
This will result in the standard encoding/json encoding being used on the obj, then it will be converted to base64, which will use the encoding/base64 URLEncoding due to the "url" argument being passed. You can continue piping to further encoders indefinitely.
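For comparison, here is roughly what that pipeline does, written by hand with only the standard library. It is a sketch under assumptions: jsonThenBase64URL is a hypothetical helper, and it uses json.Marshal, so details such as a trailing newline may differ from what the library's json formatter emits.

```go
package main

import (
	"encoding/base64"
	"encoding/json"
	"fmt"
)

// jsonThenBase64URL mirrors the "%json|base64:url;" pipeline by hand:
// JSON-encode the value, then base64-encode those bytes with URLEncoding.
func jsonThenBase64URL(v interface{}) (string, error) {
	raw, err := json.Marshal(v)
	if err != nil {
		return "", err
	}
	return base64.URLEncoding.EncodeToString(raw), nil
}

func main() {
	out, err := jsonThenBase64URL(map[string]int{"a": 1})
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println("copy and paste: " + out)
}
```

The hand-written version buffers the whole JSON document in memory before encoding it, which is exactly what the io.Writer-based pipeline avoids.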
There are two different kinds of interpolators you can write, formatters and encoders.
Formatters ¶
A "formatter" is a routine that takes a Go value of some sort and converts it to some bytes to be written out via a provided io.Writer. A formatter has the function signature defined by the Formatter type, which is:
func (w io.Writer, arg interface{}, params []byte) error
When called, the function should first examine the parameters. If it doesn't like the parameters, it should return ErrUnknownArguments, properly filled out. (Note: It is important to be strict on the parameters; if they don't make perfect sense, this is your only chance to warn a user about that.) It should then take the arg and write it out to the io.Writer in whatever manner makes sense, then return either the error obtained during writing or nil if it was fully successful.
You want to write a Formatter when you are trying to convert something that isn't already a string, []byte, or io.Reader into output. Therefore it only makes sense in the first element of a formatter's pipeline (the "json" in the previous example), because only a formatter can handle arbitrary objects.
See the Formatter documentation below for more gritty details.
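As a minimal sketch of the steps described above (check the parameters strictly, then write the value), here is a function with the Formatter signature. The names intFormatter, errBadParams, and formatToString are made up for this example, and errBadParams is only a local stand-in for ErrUnknownArguments, whose fields are not shown in this documentation.

```go
package main

import (
	"bytes"
	"fmt"
	"io"
	"strconv"
)

// errBadParams is a local stand-in for the library's ErrUnknownArguments.
type errBadParams struct{ Params string }

func (e errBadParams) Error() string {
	return "unknown arguments: " + strconv.Quote(e.Params)
}

// intFormatter has the Formatter signature. It accepts no parameter or
// "dec" for decimal output, "hex" for hexadecimal, and rejects the rest.
func intFormatter(w io.Writer, arg interface{}, params []byte) error {
	// Examine the parameters first and be strict about them.
	base := 10
	switch string(params) {
	case "", "dec":
	case "hex":
		base = 16
	default:
		return errBadParams{Params: string(params)}
	}

	// Then convert the value and write it out.
	n, ok := arg.(int)
	if !ok {
		return fmt.Errorf("intFormatter: unsupported type %T", arg)
	}
	_, err := io.WriteString(w, strconv.FormatInt(int64(n), base))
	return err
}

// formatToString runs intFormatter against an in-memory buffer.
func formatToString(arg interface{}, params string) (string, error) {
	var buf bytes.Buffer
	err := intFormatter(&buf, arg, []byte(params))
	return buf.String(), err
}

func main() {
	fmt.Println(formatToString(255, "hex"))
	fmt.Println(formatToString(255, "octal"))
}
```

With the real package, a function like this would presumably be registered through AddFormatter under whatever name you want to use after the %.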
Encoders ¶
An "encoder" is a routine that receives incoming io.Writer requests, modifies them in a suitable manner, and passes them down to the next io.Writer in the chain. In other words it takes []byte and generates further []byte from them.
You want to write an Encoder when either you want to transform input going through it (like escaping), or when you know the only valid input coming in will be in the form of a string, []byte, or io.Reader, which strinterp will automatically handle feeding down the encoder pipeline.
See the Encoder documentation below for more gritty details.
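A minimal sketch of an encoder in this style follows. It uses the signature shown in the index for Base64 and CDATA; writerFunc is a local stand-in for the library's WriterFunc (whose exact definition is not shown here), and angleEscape and encodeToString are hypothetical names.

```go
package main

import (
	"bytes"
	"fmt"
	"io"
)

// writerFunc lets a plain function act as an io.Writer.
type writerFunc func(p []byte) (int, error)

func (f writerFunc) Write(p []byte) (int, error) { return f(p) }

// angleEscape wraps inner so that <, > and & are written as entities and
// every other byte is passed through unchanged. It takes no arguments.
func angleEscape(inner io.Writer, args []byte) (io.Writer, error) {
	if len(args) != 0 {
		return nil, fmt.Errorf("angleEscape: unknown arguments %q", args)
	}
	return writerFunc(func(p []byte) (int, error) {
		for i, c := range p {
			var err error
			switch c {
			case '<':
				_, err = io.WriteString(inner, "&lt;")
			case '>':
				_, err = io.WriteString(inner, "&gt;")
			case '&':
				_, err = io.WriteString(inner, "&amp;")
			default:
				_, err = inner.Write(p[i : i+1])
			}
			if err != nil {
				// Report how many input bytes were consumed so far.
				return i, err
			}
		}
		// io.Writer expects the count of input bytes consumed, not the
		// (possibly larger) number of bytes sent to inner.
		return len(p), nil
	}), nil
}

// encodeToString pushes s through angleEscape into a buffer.
func encodeToString(s string) (string, error) {
	var buf bytes.Buffer
	w, err := angleEscape(&buf, nil)
	if err != nil {
		return "", err
	}
	if _, err := io.WriteString(w, s); err != nil {
		return "", err
	}
	return buf.String(), nil
}

func main() {
	fmt.Println(encodeToString("a<b & c>d"))
}
```

Writing one byte at a time keeps the sketch short; a real encoder would likely batch the unescaped runs into fewer writes.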
Configuring Your Interpolators ¶
To configure your interpolator, add the formatters and encoders you need so it is aware of them. NewInterpolator returns a bare *Interpolator with only the "%" and "RAW" primitives. NewDefaultInterpolator returns one that comes preconfigured for some HTML- and JSON-type tasks. The "examples.go" file in the godoc file listing below contains these formatters and encoders for your cribbing convenience.
Use the AddFormatter and AddEncoder methods to add these to your interpolator to configure it.
(Since I find people often get a sort of mental block around this, remember that, for instance, even though I provide you a default JSON streamer based on the standard encoding/json library, if you have something else you prefer, you can always specify a *different* json formatter for your own usage.)
Once configured, for maximum utility I recommend putting string interpolation into your environment object. See http://www.jerf.org/iri/post/2929 .
Direct Encoder Usage ¶
It is also possible to use the Encoders directly, as their type signature tends to imply (note how you don't have to pass them any *Interpolator or any other context). Ideally you instantiate a WriterStack around your target io.Writer and .Push encoders on top of that, since WriterStack handles some corner cases around Encoders that want to be "Close"d. When done, call .Finish() on the WriterStack, which DOES NOT close the underlying io.Writer. This is probably the maximally-performing way to do this sort of encoding in a stream.
Security Note ¶
You MUST NOT feed user input as the interpolation source string. This is true of all string interpolators, but even more so of strinterp, since it can be hooked up to arbitrary formatters and encoders. In fact I'd suggest that one could make a good case that the first parameter to strinterp should always be a constant string in the source code base, and if I were going to write a local validation routine to plug into go vet or something I'd probably add that as a rule.
Again, let me emphasize, this is NOT special to strinterp. You shouldn't let users feed into the first parameter of fmt.Sprintf, or any other such string, in any language for that matter. It's possible some are "safe" to do that in, but given the wide range of havoc done over the years by letting users control interpolation strings, I would just recommend against it unconditionally. Even when "safe" it probably isn't what you mean.
Care should also be taken in the construction of filters. If they get much "smarter" than a for loop iterating over bytes/runes and doing "something" with them, you're starting to ask for trouble if user input passes through them. Generally the entire point of strinterp is to handle potentially untrusted input in a safe manner, so if you start "interpreting" user input you could be creating openings for attackers.
Contributing ¶
I'm interested in pull requests for more Formatters and Encoders for the "default Interpolator", though ideally only for things in the standard library.
Index ¶
- Variables
- func Base64(w io.Writer, args []byte) (io.Writer, error)
- func CDATA(inner io.Writer, args []byte) (io.Writer, error)
- func JSON(w io.Writer, val interface{}, params []byte) error
- type Encoder
- type ErrUnknownArguments
- type Formatter
- type Interpolator
- func (i *Interpolator) AddEncoder(format string, handler Encoder) error
- func (i *Interpolator) AddFormatter(format string, handler Formatter) error
- func (i *Interpolator) InterpStr(format string, args ...interface{}) (string, error)
- func (i *Interpolator) InterpWriter(w io.Writer, formatBytes []byte, args ...interface{}) error
- type NotGivenType
- type WriterFunc
- type WriterStack
Variables ¶
var ErrNotGiven = errors.New("value not given")
ErrNotGiven will be passed to a Formatter as the value it is encoding, if the caller did not give enough arguments to the InterpStr or InterpWriter calls.
This is public so your formatter can check for it.
var NotGiven = NotGivenType{}
NotGiven is the token passed to the formatters to indicate the value was not given. This distinguishes the value from "nil", which may well be perfectly legitimate.
Functions ¶
func Base64 ¶
Base64 defines an Encoder that implements base64 encoding.
It takes as a parameter either "std" or "url", to select between Standard or URL base64 encoding. If no parameter is given, Standard is chosen. Any other parameter results in ErrUnknownArguments.
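The parameter handling described above might look something like the following. This is a sketch using the standard encoding/base64 package, not the library's source; pickEncoding and encodeWith are hypothetical names, and a plain error stands in for ErrUnknownArguments.

```go
package main

import (
	"encoding/base64"
	"fmt"
)

// pickEncoding maps the encoder argument to a base64 alphabet. No
// argument and "std" select StdEncoding, "url" selects URLEncoding, and
// anything else is rejected.
func pickEncoding(args []byte) (*base64.Encoding, error) {
	switch string(args) {
	case "", "std":
		return base64.StdEncoding, nil
	case "url":
		return base64.URLEncoding, nil
	default:
		return nil, fmt.Errorf("base64: unknown arguments %q", args)
	}
}

// encodeWith encodes data using the alphabet selected by args.
func encodeWith(args string, data []byte) (string, error) {
	enc, err := pickEncoding([]byte(args))
	if err != nil {
		return "", err
	}
	return enc.EncodeToString(data), nil
}

func main() {
	// These two bytes are chosen so the alphabets visibly differ.
	data := []byte{0xfb, 0xff}
	fmt.Println(encodeWith("std", data))
	fmt.Println(encodeWith("url", data))
	fmt.Println(encodeWith("hex", data))
}
```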
func CDATA ¶
CDATA defines an HTML CDATA escaper, which is to say, the type of data that appears as "text" within HTML.
There's a lot of history and browser variations here. By default this is a very aggressive encoding function suitable for use in all the parts of HTML that permit "CDATA" that I know of, including attribute values. (Some browsers do not like literal newlines in attributes, considering it to terminate the tag.) However, this aggression may result in difficult-to-read HTML. If you are outputting HTML text as text (as opposed to attribute values), you can pass the argument "nocrlf" to avoid encoding CR and LF as entities.
Types ¶
type Encoder ¶
An Encoder is a function that takes an "inner" io.Writer and returns an io.Writer that wraps that writer, such that calls to the returned Writer will produce the desired encoding behavior. See examples.go.
In addition to conforming to the io.Writer interface, Encoders must also never cut up Unicode characters between calls. This technically means that existing io.Writer transformers *may* not conform to this interface, though most if not all probably do by accident. Encoders thus may also count on the fact that they will not receive partial Unicode characters, which may permit stateless Encoders to be written. This is facilitated with the provided WriterFunc type as well.
type ErrUnknownArguments ¶
ErrUnknownArguments is the error that is returned when you pass arguments to a formatter/encoder that it doesn't understand. This is public so your formatters and encoders can reuse it.
func (ErrUnknownArguments) Error ¶
func (ua ErrUnknownArguments) Error() string
type Formatter ¶
A Formatter is a function that takes the argument interface{} and writes the corresponding bytes to the io.Writer, based on the arguments. This is generally useful for doing non-trivial transforms on arbitrary objects, such as JSON-encoding them. If your argument is anything other than a string, []byte, or io.Reader, you'll need a Formatter.
The []byte is any additional parameters passed via the colon mechanism, containing only those extra parameters (i.e., no colon or semicolon). Interpreting them is entirely up to the function. This is nil if no colon was used. (Note this can be distinguished from blank, though that seems like a bad idea. Note also the len of a nil slice is 0, which makes that the easiest thing to check.)
The interface{} is the value. If the value was not given to the interpolator at all (i.e., the format string contains more format specifiers than values were passed), the value will be == NotGiven, a singleton value used for this case.
If the formatting could be completed successfully, the bytes should all be written to the io.Writer by the time the formatter returns. If the formatting could not be completed successfully, an error should be returned. In that case there are no guarantees about how much of the stream may have been written, which is fundamental to a stream-style library.
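To illustrate the difference between a missing value and a legitimate nil, here is a small sketch. The notGiven sentinel is a local stand-in that mirrors the documented NotGiven and NotGivenType, and plainFormatter and render are hypothetical names; writing "null" for nil is just a choice made for this example.

```go
package main

import (
	"bytes"
	"errors"
	"fmt"
	"io"
)

// notGivenType and notGiven mirror the documented NotGivenType/NotGiven.
type notGivenType struct{}

var notGiven = notGivenType{}

// plainFormatter has the Formatter signature. It refuses a missing
// value, writes "null" for nil, and prints anything else with fmt.
func plainFormatter(w io.Writer, arg interface{}, params []byte) error {
	// len of a nil slice is 0, so this covers "no colon" and "blank".
	if len(params) != 0 {
		return fmt.Errorf("plainFormatter: unknown arguments %q", params)
	}
	switch {
	case arg == notGiven:
		// The caller supplied fewer values than the format string needs.
		return errors.New("plainFormatter: value not given")
	case arg == nil:
		// nil was passed explicitly, which is a perfectly valid value.
		_, err := io.WriteString(w, "null")
		return err
	}
	_, err := fmt.Fprint(w, arg)
	return err
}

// render runs plainFormatter against an in-memory buffer.
func render(arg interface{}) (string, error) {
	var buf bytes.Buffer
	err := plainFormatter(&buf, arg, nil)
	return buf.String(), err
}

func main() {
	fmt.Println(render(42))
	fmt.Println(render(nil))
	fmt.Println(render(notGiven))
}
```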
type Interpolator ¶
type Interpolator struct {
// contains filtered or unexported fields
}
An Interpolator represents an object that can perform string interpolation.
Interpolators are created via NewInterpolator.
Interpolators are designed to be used via being initialized with all desired format string handlers in a single goroutine. Once initialized, the interpolator can be freely used in any number of goroutines.
func NewDefaultInterpolator ¶
func NewDefaultInterpolator() *Interpolator
NewDefaultInterpolator returns a new Interpolator set up with some more format strings available:
- json: the JSON formatter
- base64: the Base64 encoder
- cdata: the HTML CDATA encoder
More things may be added in future versions of this library. The safest long-term thing to do is to use NewInterpolator and configure it yourself. But this is convenient for demos and such.
func NewInterpolator ¶
func NewInterpolator() *Interpolator
NewInterpolator returns a new Interpolator, with only the default load of interpolation primitives.
These are:
- "%": Yields a literal % without consuming an arg
- "RAW": interpolates the given string, []byte, or io.Reader directly (if an io.Reader, io.Copy is used)
func (*Interpolator) AddEncoder ¶
func (i *Interpolator) AddEncoder(format string, handler Encoder) error
AddEncoder adds an encoder type to the interpolator.
If the format string is already registered, an error will be returned.
func (*Interpolator) AddFormatter ¶
func (i *Interpolator) AddFormatter(format string, handler Formatter) error
AddFormatter adds an interpolation format to the interpolator.
If the format string is already registered, an error will be returned.
func (*Interpolator) InterpStr ¶
func (i *Interpolator) InterpStr(format string, args ...interface{}) (string, error)
InterpStr is a convenience function that does interpolation on a format string and returns the resulting string.
func (*Interpolator) InterpWriter ¶
func (i *Interpolator) InterpWriter(w io.Writer, formatBytes []byte, args ...interface{}) error
InterpWriter interpolates the format []byte into the passed io.Writer.
type NotGivenType ¶
type NotGivenType struct{}
NotGivenType uniquely identifies the token passed to formatters when an argument is not given for the formatter.
type WriterFunc ¶
WriterFunc is a type that wraps a function so that it implements the io.Writer interface: calling .Write calls the wrapped function. This allows Encoders to easily return stateless functions as their implementation. See several examples in examples.go.
type WriterStack ¶
A WriterStack allows us to wrap Encoders around a given io.Writer.
WriterStack solves the problem of some of the Encoders potentially wanting to be .Close()d, even if the underlying io.Writer is not closable, or you do not wish to close the underlying writer so it can be reused later. This can, for instance, be seen in the base64 encoder shipped by this library. By calling .Finish() on this object you can safely use these Encoders. .Finish() should always be called to end a WriterStack's output.
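The Close problem is easy to reproduce with the standard library alone. The sketch below does not use WriterStack at all; it only shows, using encoding/base64's streaming encoder, why something has to call Close on the encoder while leaving the underlying writer alone. encodeStream is a hypothetical helper.

```go
package main

import (
	"bytes"
	"encoding/base64"
	"fmt"
)

// encodeStream writes data through a streaming base64 encoder and
// returns the buffer contents before and after the encoder is closed.
func encodeStream(data string) (before, after string, err error) {
	var buf bytes.Buffer
	enc := base64.NewEncoder(base64.StdEncoding, &buf)

	if _, err = enc.Write([]byte(data)); err != nil {
		return "", "", err
	}
	// Input that is not a multiple of 3 bytes leaves a partial block
	// held back inside the encoder at this point.
	before = buf.String()

	// Close flushes the partial block. It closes only the encoder; buf
	// is untouched and could still be written to afterwards.
	if err = enc.Close(); err != nil {
		return "", "", err
	}
	return before, buf.String(), nil
}

func main() {
	before, after, err := encodeStream("hello")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("before Close: %q\nafter Close:  %q\n", before, after)
}
```

Forgetting the Close silently truncates the output, which is the corner case .Finish() is there to take care of.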
WriterStack can be used without any other strinterp functionality.
func NewWriterStack ¶
func NewWriterStack(w io.Writer) *WriterStack
NewWriterStack returns a new *WriterStack with the argument being used as the lowest-level writer.
func (*WriterStack) Finish ¶
func (ws *WriterStack) Finish() error
Finish will finish the WriterStack's work, which may flush intermediate encoders by calling .Close() on them. This will not close the base io.Writer.