gostrutils

package module
v0.0.20 Latest Latest
Warning

This package is not in the latest version of its module.

Go to latest
Published: Mar 25, 2023 License: MIT Imports: 14 Imported by: 3

README

Go String Utilities

The following repo is a collection of string functions I have created over the years, and slowly moving them to a single package, helping me and others to enjoy them, and stop inventing the wheel every project.

Go Report Card GolangCI Build Status Coverage Status GoDoc License

Package gostrutils contains string utilities that are missing from the main strings package, as well as other encoding packages arrives by go.

The implementation of the package is set by files based on a subject that belong to.

Basics

The basic logic of Go, is to use bytes as a non numeric holder. The basics of the following package is to gain the ability to hold more support for string based on functions that are missing, while remembering that Go's strings are UTF-8.

The project itself built with files that holds the subject of what they are doing.

The helpers.go file, holds functions that are not string related functions, but help creating that string support.

Test coverage

The aim of the following library is to have close to 100% of unit test coverage, and also examples for all existed functions.

Dependencies

The package built to support golang standard library, for minimizing dependencies. There is no use of any 3rd party packages when using this package, and every contribution to the project must also take that in consideration.

License

The Following project released under MIT.

Documentation

Overview

Package gostrutils contains string utilities that are missing from the main strings package, as well as other encoding packages arrives by go.

The implementation of the package is set by files based on a subject that belong to.

Basics

The basic logic of Go, is to use bytes as a non numeric holder. The basics of the following package is to gain the ability to hold more support for string based on functions that are missing, while remembering that Go's strings are UTF-8.

The project itself built with files that holds the subject of what they are doing.

The helpers.go file, holds functions that are not string related functions, but help creating that string support.

Test coverage

The aim of the following library is to have close to 100% of unit test coverage, and also examples for all existed functions.

GSM 03.38

GSM 03.38 is a text encoding that is used with sending latin based languages with SMS.

There are several types of encoding that SMS can support, where one of them is GSM 03.38.

UTF16

UTF16 is 2 byte encoding mechanism. When a character is a single byte, it is padded with NULL.

UTF16 contains a representation of characters either as Big or Little endian. It means that bytes are either appears as `'A', 0x00"`, or `0x00, 'A'`.

In order to understand what type of encoding is in use, and it's endian, there is a BOM -> Byte Order Mark header that provides such information.

Partial content of UTF16 does not a BOM, but a starting of such, should have one. It must be the first (two) bytes of the 'string'.

At the past there was a UCS2 encoding that provided the same encoding as UTF16 Big Endian, and because of that, it was deprecated in favor of of UTF16 Big Endian.

The current support for UTF16, provides three functions:

  1. Encoding a go string that is UTF8 into UTF16.
  2. Decoding a UTF16 to go's string (UTF8).
  3. UTF16 BOM detection.

Index

Examples

Constants

View Source
const (
	BOMNone = iota
	BOMBE
	BOMLE
)

BOM Types

View Source
const DefaultEllipse = "…"

DefaultEllipse is the default char for marking abbreviation

Variables

This section is empty.

Functions

func Abbreviate

func Abbreviate(str string, startAt, maxLen int, abbrvChar string) string

Abbreviate takes 'str' and based on length returns an abbreviate string with marks that represents middle of words.

str - The original string startAt - Where to start to abbreviate inside a string maxLen - The maximum length for a string. It must be at least 4 chatrs

func ByteToStr

func ByteToStr(b []byte) string

ByteToStr converts slice of bytes to string

Example
package main

import (
	"fmt"

	"github.com/ik5/gostrutils"
)

func main() {
	b := []byte{'H', 'e', 'l', 'l', 'o', ' ', 'W', 'o', 'r', 'l', 'd'}
	s := gostrutils.ByteToStr(b)
	fmt.Println(s)
}
Output:

func CamelCaseToJavascriptCase

func CamelCaseToJavascriptCase(str string) string

CamelCaseToJavascriptCase convert CamelCase to camelCase

func CamelCaseToUnderscore

func CamelCaseToUnderscore(str string) string

CamelCaseToUnderscore converts CamelCase to camel_case

func CleanPhoneNumber

func CleanPhoneNumber(number string) string

CleanPhoneNumber Remove non numeric values from a phone number

func ClearByteChars added in v0.0.17

func ClearByteChars(buf []byte, chars []byte) []byte

ClearByteChars removes chars from buffer

func ClearHebrewPunctuationPoints

func ClearHebrewPunctuationPoints(s string) string

ClearHebrewPunctuationPoints removes Hebrew based punctuation points (Nikud) from a string. If something went wrong, the original string is returned.

func CopyRange

func CopyRange(src string, from, to int) (string, error)

CopyRange returns a copy of a string based on start and end of a rune for multi-byte chars instead of byte based chars (ASCII). 'to' is the amount of chars interested plus one. for example, for a string 'foo' extracting 'oo' by from: 1 and to 3.

func CountBidiChars

func CountBidiChars(s string) int

CountBidiChars counts how many chars belong to a bi-directional language: Arabic (and sub language), Hebrew, Urdu and Yiddish

func CountStartBidiLanguage

func CountStartBidiLanguage(s string) int

CountStartBidiLanguage returns number of chars from the beginning. If chars are not A bidi range, and not up to 0x40 char, then 0 will be returned. If only chars up to (including) char 0x40 (@) -1 will be returned

Important: result is both bidi chars and up to 0x40 count from the beginning

func DecToHex added in v0.0.19

func DecToHex(dec byte) byte

DecToHex convert decimal value to hexadecimal as long as the range is between 0 to 15. When value is bigger than 15, 0 will be returned.

func DecodeUTF16

func DecodeUTF16(b []byte) (string, error)

DecodeUTF16 get a slice of bytes and decode it to UTF-8

Example
package main

import (
	"fmt"

	"github.com/ik5/gostrutils"
)

func main() {
	// Big endian text of TEST
	text := []byte{
		0xfe, // BOM
		0xff, // BOM
		0x00,
		'T',
		0x00,
		'E',
		0x00,
		'S',
		0x00,
		'T',
	}
	test, err := gostrutils.DecodeUTF16(text)
	if err != nil {
		panic(err)
	}
	fmt.Println("we have 'TEST'? ", test)
}
Output:

func DetectMimeTypeFromContent

func DetectMimeTypeFromContent(content []byte) (contentType string)

DetectMimeTypeFromContent takes a slice of bytes, and try to detect the content type that is in use.

Example
package main

import (
	"fmt"

	"github.com/ik5/gostrutils"
)

func main() {
	hello := []byte{0x68, 0x65, 0x6c, 0x6c, 0x6f}

	fmt.Printf("Mime type of %s is: %s", hello, gostrutils.DetectMimeTypeFromContent(hello))
}
Output:

func EncodeUTF16

func EncodeUTF16(s string, bigEndian, addBom bool) []byte

EncodeUTF16 get a utf8 string and translate it into a slice of bytes

Example
package main

import (
	"fmt"

	"github.com/ik5/gostrutils"
)

func main() {
	str := gostrutils.EncodeUTF16("Windows Unicode String", true, true)
	fmt.Println(gostrutils.ByteToStr(str))
}
Output:

func GSM0338ToUTF8

func GSM0338ToUTF8(text string) string

GSM0338ToUTF8 convert a GSM0338 string to a UTF8 equivalent chars

The encoding is driven from "Extended ASCII" charsets, and as such compatible with UTF8, and does not require a slice of bytes

Example
package main

import (
	"fmt"

	"github.com/ik5/gostrutils"
)

func main() {
	msg := gostrutils.GSM0338ToUTF8("Please email me to user\x00example.com")
	fmt.Println("Got sms message: ", msg)
}
Output:

func GetBytesRuneIndexInSlice added in v0.0.17

func GetBytesRuneIndexInSlice(list []byte, needle rune) int

GetBytesRuneIndexInSlice returns the index of the slice if rune was found or -1 if not. Important, the index returned is of the rune, not of the byte!

func GetStringIndexInSlice

func GetStringIndexInSlice(list []string, needle string) int

GetStringIndexInSlice returns the index of the slice if string was found or -1 if not

func HaveNULL added in v0.0.18

func HaveNULL(str string) bool

HaveNULL look for null char, and return true if found at first place. Empty str returns false.

func HexToDec added in v0.0.19

func HexToDec(ch byte) byte

HexToDec convert hex decimal ASCII value. If ch is not in the range, the function will return 0

func HexToUTF16Runes

func HexToUTF16Runes(s string, bigEndian bool) []rune

HexToUTF16Runes takes a hex based string and converts it into UTF16 runes

Such string looks like:

"\x00H\x00e\x00l\x00l\x00o\x00 \x00W\x00o\x00r\x00l\x00d"
"\x05\xe9\x05\xdc\x05\xd5\x05\xdd\x00 \x05\xe2\x05\xd5\x05\xdc\x05\xdd"

Result of the first string is (big endian):

[U+0048 'H' U+0065 'e' U+006C 'l' U+006C 'l' U+006F 'o' U+0020 ' ' U+0057 'W' U+006F 'o' U+0072 'r' U+006C 'l' U+0064 'd'] Hello World

Result of the second string is (big endian):

[U+05E9 'ש' U+05DC 'ל' U+05D5 'ו' U+05DD 'ם' U+0020 ' ' U+05E2 'ע' U+05D5 'ו' U+05DC 'ל' U+05DD 'ם'] שלום עולם

func In64Split

func In64Split(data, sep string) []int64

In64Split get a string with separator and convert it to a slice of int64

func Int64Join

func Int64Join(list []int64, sep string) string

Int64Join takes a list of int64 and join them with a separator based on https://play.golang.org/p/KpdkVS1B4s

func IsANSI added in v0.0.18

func IsANSI(str string) bool

IsANSI returns true if the entire string contains only ANSI characters. On empty string, the function returns false.

func IsASCII added in v0.0.18

func IsASCII(str string) bool

IsASCII returns true if the entire string contains only ASCII characters. On empty string, the function returns false.

func IsE164

func IsE164(number string) bool

IsE164 validates if a given number is on e164 standard

func IsEmpty

func IsEmpty(s string) bool

IsEmpty returns true if a string with whitespace only was provided or an empty string

func IsEmptyChars

func IsEmptyChars(s string, chars []rune) bool

IsEmptyChars return true if after cleaning given chars the string is empty

func IsFloat

func IsFloat(txt string) bool

IsFloat returns true if a given text is a floating point

func IsHex added in v0.0.19

func IsHex(ch byte) bool

IsHex returns true if ch contains a hexadecimal character as a value

func IsInRange added in v0.0.15

func IsInRange(min, max int64, src string) bool

IsInRange takes an integer range and look at the string that contains only numbers tom make sure it is inside the range

func IsInteger

func IsInteger(txt string) bool

IsInteger returns true if a string is an integer

func IsIsraeliPhoneNumber

func IsIsraeliPhoneNumber(number string) bool

IsIsraeliPhoneNumber check to see if a pettern that looks like Israeli phone number is actually an Israeli phone number

func IsNumber

func IsNumber(txt string) bool

IsNumber returns true if a given string is integer or float

func IsRuneInByteSlice added in v0.0.17

func IsRuneInByteSlice(list []byte, needle rune) bool

IsRuneInByteSlice return true if needle exists in list

func IsStringInSlice

func IsStringInSlice(list []string, needle string) bool

IsStringInSlice looks for a string inside a slice and return true if it exists

func IsUFloat

func IsUFloat(txt string) bool

IsUFloat returns true if a string is unsigned floating point

func IsUInteger

func IsUInteger(txt string) bool

IsUInteger returns true if a string is unsigned integer

func IsUNumber

func IsUNumber(txt string) bool

IsUNumber returns true if a given string is unsigned integer or float

func IsValidUUID

func IsValidUUID(uuid string) bool

IsValidUUID validate if a given string is a uuid. On error it will return false.

The code is not mine, and was taken from: https://stackoverflow.com/questions/25051675/how-to-validate-uuid-v4-in-go#25051754

func KeepByteChars added in v0.0.17

func KeepByteChars(buf []byte, toKeep []byte) []byte

KeepByteChars removes all chars that are not part of toKeep and return a new slice of bytes

func PointerToStr

func PointerToStr(s *string) string

PointerToStr converts a pointer string to a normal string

If there is a need for a string to be nil, then usually there is a pointer for a string involved. The current function converts nil pointer to empty string, or return the string itself

Example
package main

import (
	"fmt"

	"github.com/ik5/gostrutils"
)

func main() {
	empty := gostrutils.PointerToStr(nil)
	hello := "hello"
	pHello := &hello
	newHello := gostrutils.PointerToStr(pHello)
	fmt.Printf("empty: %s, pHello: %s\n", empty, newHello)
}
Output:

func SliceIndex

func SliceIndex(limit int, predicate func(i int) bool) int

SliceIndex returns a slice index based on a callback value

func SmsCalculateSmsFragments

func SmsCalculateSmsFragments(message string) uint16

SmsCalculateSmsFragments calculate based on a string how many SMS messages will it take to deliver a massage

func StrToInt64

func StrToInt64(str string, def int64) int64

StrToInt64 convert a string to int64

Example
package main

import (
	"fmt"

	"github.com/ik5/gostrutils"
)

func main() {
	meaning := gostrutils.StrToInt64("42", 0)
	fmt.Println("Meaning of life universe and everything: ", meaning)
}
Output:

func StrToPointer

func StrToPointer(str string) *string

StrToPointer returns the address of string.

The function can help when there is a need to convert literal string yo be have a pointer for that content.

Example
package main

import (
	"fmt"

	"github.com/ik5/gostrutils"
)

func main() {
	hello := "hello"
	addr := gostrutils.StrToPointer(hello)
	fmt.Printf("String: %s, Address: %p\n", hello, addr)
}
Output:

func StrToUInt64

func StrToUInt64(str string, def uint64) uint64

StrToUInt64 convert string to uint64

func TagIsBidi

func TagIsBidi(tag language.Tag) bool

TagIsBidi return true if a given tag is a bidi language

func ToFloat32

func ToFloat32(field string) float32

ToFloat32 convert string to float32 without errors!

func ToFloat32Default

func ToFloat32Default(field string, defaultValue float32) float32

ToFloat32Default convert string to float32 without errors. If error returns, defaultValue is set instead.

Example
package main

import (
	"fmt"

	"github.com/ik5/gostrutils"
)

func main() {
	pi := gostrutils.ToFloat32Default("3.141592653", 3.14)
	fmt.Printf("Pi: %f\n", pi)
}
Output:

func ToFloat64

func ToFloat64(field string) float64

ToFloat64 convert string to float64 without errors!

func ToFloat6Default

func ToFloat6Default(field string, defaultValue float64) float64

ToFloat6Default convert string to float64 without errors. If error returns, default is set instead.

func ToSQLString

func ToSQLString(str string) sql.NullString

ToSQLString convert string to sql.NullString

func ToSQLStringNotNullable

func ToSQLStringNotNullable(str string) sql.NullString

ToSQLStringNotNullable convert string to sql.NullString not nullable

func Truncate

func Truncate(s string, length int) string

Truncate a string to smaller parts, UTF8 multi-byte wise.

When using s[:3] to truncate to 3 chars, it will not take runes but rather bytes. A multi-byte char (rune) can contain between one to 4 bytes, but not all of them will be returned.

The following function will return the number of runes, and not the number of bytes inside a string.

func UInt64Join

func UInt64Join(list []uint64, sep string) string

UInt64Join takes a list of uint64 and join them with a separator based on https://play.golang.org/p/KpdkVS1B4s

Example
package main

import (
	"fmt"

	"github.com/ik5/gostrutils"
)

func main() {
	list := []uint64{0, 1, 1, 2, 3, 5, 8, 13, 21, 34}
	fibonacci := gostrutils.UInt64Join(list, ", ")
	fmt.Printf("First 10 fibonacci sequence: %s\n", fibonacci)
}
Output:

func UTF16Bom

func UTF16Bom(b []byte) int8

UTF16Bom returns 0 for no BOM, 1 for Big Endian and 2 for little endian it will return -1 if b is too small for having BOM

func UTF8ToGsm0338

func UTF8ToGsm0338(text string) string

UTF8ToGsm0338 convert a UTF8 string to a gsm0338 equivalent chars

The encoding is driven from "Extended ASCII" charsets, and as such compatible with UTF8, and does not require a slice of bytes

Example
package main

import (
	"github.com/ik5/gostrutils"
)

func sendSmsMessage(msg string) bool {
	return true
}

func main() {
	msg := gostrutils.UTF8ToGsm0338("Please email me to user@example.com")
	// send the SMS message here
	if !sendSmsMessage(msg) {
		panic("Cannot send a message")
	}
}
Output:

func Uin64Split

func Uin64Split(data, sep string) []uint64

Uin64Split get a string with separator and convert it to a slice of uint64

Example
package main

import (
	"fmt"

	"github.com/ik5/gostrutils"
)

func main() {
	fibonacci := "0, 1, 1, 2, 3, 5, 8, 13, 21, 34"
	list := gostrutils.Uin64Split(fibonacci, ", ")
	fmt.Printf("List: %+v\n", list)
}
Output:

Types

This section is empty.

Jump to

Keyboard shortcuts

? : This menu
/ : Search site
f or F : Jump to
y or Y : Canonical URL