xlsxreader

v1.2.5 (latest) | Published: Jan 25, 2023 | License: MIT | Imports: 11 | Imported by: 9

Repository

github.com/thedatashed/xlsxreader

README ¶

xlsxreader: A Go Package for reading data from an xlsx file

Overview

A low-memory, high-performance library for reading data from an xlsx file.

Suitable for reading .xlsx data and designed to aid with the bulk uploading of data where the key requirement is to parse and read raw data.

The reader will read data out row by row (1->n) and has no concept of headers or data types (this is to be managed by the consumer).

The reader is currently not concerned with handling some of the more advanced cell data that can be stored in an xlsx file.

Further reading on how this came to be is available on our blog

Install

go get github.com/thedatashed/xlsxreader

Example Usage

Reading from the file system:

package main

import (
    "log"

    "github.com/thedatashed/xlsxreader"
)

func main() {
    // Create an instance of the reader by opening a target file
    xl, err := xlsxreader.OpenFile("./test.xlsx")
    if err != nil {
        log.Fatal(err)
    }

    // Ensure the file reader is closed once utilised
    defer xl.Close()

    // Iterate on the rows of data in the first sheet
    for row := range xl.ReadRows(xl.Sheets[0]) {
        ...
    }
}

Reading from an already in-memory source:

package main

import (
    "io/ioutil"
    "log"
    "os"

    "github.com/thedatashed/xlsxreader"
)

func main() {
    // Preprocessing of file data
    file, err := os.Open("./test/test-small.xlsx")
    if err != nil {
        log.Fatal(err)
    }
    defer file.Close()
    data, err := ioutil.ReadAll(file)
    if err != nil {
        log.Fatal(err)
    }

    // Create an instance of the reader by providing the file's bytes
    xl, err := xlsxreader.NewReader(data)
    if err != nil {
        log.Fatal(err)
    }

    // Iterate on the rows of data in the first sheet
    for row := range xl.ReadRows(xl.Sheets[0]) {
        ...
    }
}

Key Concepts

Files

The reader operates on a single file and will read data from the specified file using the OpenFile function.

Data

The reader can also be instantiated from a byte slice ([]byte) holding the file's contents by using the NewReader function.

Sheets

An xlsx workbook can contain many worksheets, so the name of the target sheet must be passed when reading data. To process multiple sheets, either iterate over the slice of sheet names identified by the reader (the Sheets field) or make multiple calls to the ReadRows function with the desired sheet names.

Rows

A sheet contains n rows of data. The reader returns a channel that can be ranged over to cycle through each row of data in a worksheet. Each row holds an index and contains n cells that contain column data.
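Because the reader has no concept of headers, a consumer often treats the first row as the header row. The following is a minimal sketch of that convention, not part of the package: `rowsToRecords` is a hypothetical helper, and `Cell` and `Row` are local stand-ins that mirror the documented struct fields so the example runs on its own.

```go
package main

import "fmt"

// Cell and Row are local stand-ins mirroring the fields documented for
// xlsxreader.Cell and xlsxreader.Row, so this sketch compiles on its own.
type Cell struct {
	Column string
	Row    int
	Value  string
}

type Row struct {
	Error error
	Index int
	Cells []Cell
}

// rowsToRecords (hypothetical helper) treats the first row received as the
// header row and keys every later row's values by the header found under the
// same column letter. Cells in columns without a header are skipped.
func rowsToRecords(rows <-chan Row) ([]map[string]string, error) {
	var headers map[string]string // column letter -> header text
	var records []map[string]string
	var firstErr error
	for row := range rows {
		if firstErr != nil {
			continue // keep reading so the sender can finish
		}
		if row.Error != nil {
			firstErr = row.Error
			continue
		}
		if headers == nil {
			headers = make(map[string]string, len(row.Cells))
			for _, c := range row.Cells {
				headers[c.Column] = c.Value
			}
			continue
		}
		rec := make(map[string]string, len(row.Cells))
		for _, c := range row.Cells {
			if h, ok := headers[c.Column]; ok {
				rec[h] = c.Value
			}
		}
		records = append(records, rec)
	}
	return records, firstErr
}

func main() {
	// A hand-built channel stands in for the one returned by ReadRows.
	rows := make(chan Row)
	go func() {
		defer close(rows)
		rows <- Row{Index: 1, Cells: []Cell{{Column: "A", Row: 1, Value: "name"}, {Column: "B", Row: 1, Value: "age"}}}
		rows <- Row{Index: 2, Cells: []Cell{{Column: "A", Row: 2, Value: "Ada"}, {Column: "B", Row: 2, Value: "36"}}}
	}()
	records, err := rowsToRecords(rows)
	fmt.Println(records, err)
}
```

Keying by column letter rather than by position in the Cells slice keeps values aligned with their headers even when a row omits empty cells.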

Cells

A cell represents a row/column value and contains a string representation of that data. Currently numeric data is parsed as found, with dates parsed to ISO 8601 / RFC3339 format.

Documentation ¶

Index ¶

type Cell
- func (c Cell) ColumnIndex() int
type CellType
type Row
type XlsxFile
- func NewReader(xlsxBytes []byte) (*XlsxFile, error)
- func NewReaderZip(r *zip.Reader) (*XlsxFile, error)
- func (x *XlsxFile) ReadRows(sheet string) chan Row
type XlsxFileCloser
- func OpenFile(filename string) (*XlsxFileCloser, error)
- func OpenReaderZip(rc *zip.ReadCloser) (*XlsxFileCloser, error)
- func (xl *XlsxFileCloser) Close() error
- func (xl *XlsxFileCloser) GetSheetFileForSheetName(sheetName string) *zip.File

Types ¶

type Cell ¶

type Cell struct {
	Column string // e.g. A, B, C
	Row    int
	Value  string
	Type   CellType
}

Cell represents the data in a single cell as a consumable format.

func (Cell) ColumnIndex ¶ added in v1.1.6

func (c Cell) ColumnIndex() int

ColumnIndex gives a number, representing the column the cell lies beneath.

type CellType ¶ added in v1.1.7

type CellType string

CellType defines the data type of an Excel cell.

const (
	// TypeString is for text cells
	TypeString CellType = "string"
	// TypeNumerical is for numerical values
	TypeNumerical CellType = "numerical"
	// TypeDateTime is for date values
	TypeDateTime CellType = "datetime"
	// TypeBoolean is for true/false values
	TypeBoolean CellType = "boolean"
)
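Since every cell value arrives as a string, a consumer that needs typed values can switch on the cell type. The sketch below is one possible approach under stated assumptions: `convert` is a hypothetical helper, the `CellType` constants are redeclared locally so the example is self-contained, datetime values are assumed to parse with `time.RFC3339`, and boolean values are assumed to be in a form `strconv.ParseBool` accepts.

```go
package main

import (
	"fmt"
	"strconv"
	"time"
)

// CellType and its constants are redeclared locally, mirroring the package's
// documented values, so this sketch compiles on its own.
type CellType string

const (
	TypeString    CellType = "string"
	TypeNumerical CellType = "numerical"
	TypeDateTime  CellType = "datetime"
	TypeBoolean   CellType = "boolean"
)

// convert (hypothetical helper) turns a cell's string value into a Go value
// chosen by its type: float64, time.Time, bool, or the string unchanged.
// Assumptions: datetime values parse with time.RFC3339, and boolean values
// are in a form strconv.ParseBool accepts (for example "1", "0", "true").
func convert(value string, t CellType) (interface{}, error) {
	switch t {
	case TypeNumerical:
		return strconv.ParseFloat(value, 64)
	case TypeDateTime:
		return time.Parse(time.RFC3339, value)
	case TypeBoolean:
		return strconv.ParseBool(value)
	default:
		return value, nil
	}
}

func main() {
	for _, c := range []struct {
		value string
		typ   CellType
	}{
		{"3.5", TypeNumerical},
		{"2023-01-25T00:00:00Z", TypeDateTime},
		{"1", TypeBoolean},
		{"hello", TypeString},
	} {
		v, err := convert(c.value, c.typ)
		fmt.Printf("%T %v %v\n", v, v, err)
	}
}
```

Parsing numerics as float64 is a simplification; a consumer handling large integers or exact decimals may prefer to keep the string and parse it with a more suitable type.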

type Row ¶

type Row struct {
	Error error
	Index int
	Cells []Cell
}

Row represents a row of data read from an Xlsx file, in a consumable format

type XlsxFile ¶

type XlsxFile struct {
	Sheets []string
	// contains filtered or unexported fields
}

XlsxFile defines a populated XLSX file struct.

func NewReader ¶

func NewReader(xlsxBytes []byte) (*XlsxFile, error)

NewReader takes the bytes of an xlsx file and returns a populated XlsxFile struct for it. If the file cannot be found, or key parts of the file's contents are missing, an error is returned.

func NewReaderZip ¶ added in v1.2.0

func NewReaderZip(r *zip.Reader) (*XlsxFile, error)

NewReaderZip takes a zip reader for an xlsx file and returns a populated XlsxFile struct for it. If the file cannot be found, or key parts of the file's contents are missing, an error is returned.
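A `*zip.Reader` can be built from any in-memory archive with the standard library. The sketch below shows only that step: `zipReaderFromBytes` and `demoArchive` are hypothetical helpers, and the demo archive is a placeholder zip rather than a valid workbook.

```go
package main

import (
	"archive/zip"
	"bytes"
	"fmt"
	"log"
)

// zipReaderFromBytes (hypothetical helper) wraps in-memory archive bytes in
// a *zip.Reader, the argument type NewReaderZip accepts.
func zipReaderFromBytes(data []byte) (*zip.Reader, error) {
	return zip.NewReader(bytes.NewReader(data), int64(len(data)))
}

// demoArchive builds a tiny in-memory zip so the sketch can run without a
// real .xlsx file. It is a placeholder, not a valid workbook.
func demoArchive() ([]byte, error) {
	var buf bytes.Buffer
	w := zip.NewWriter(&buf)
	f, err := w.Create("xl/workbook.xml")
	if err != nil {
		return nil, err
	}
	if _, err := f.Write([]byte("<workbook/>")); err != nil {
		return nil, err
	}
	if err := w.Close(); err != nil {
		return nil, err
	}
	return buf.Bytes(), nil
}

func main() {
	data, err := demoArchive()
	if err != nil {
		log.Fatal(err)
	}
	zr, err := zipReaderFromBytes(data)
	if err != nil {
		log.Fatal(err)
	}
	// With a real workbook, zr could now be passed to xlsxreader.NewReaderZip.
	for _, f := range zr.File {
		fmt.Println(f.Name)
	}
}
```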

func (*XlsxFile) ReadRows ¶

func (x *XlsxFile) ReadRows(sheet string) chan Row

ReadRows provides an interface allowing rows from a specific worksheet to be streamed from an xlsx file. To keep the interface simple, this method returns a channel that can be range-d over.

This method has one notable drawback, however: the entire file must be consumed before the channel will be closed. Reading only some of the values will leave an orphaned goroutine and channel behind.
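One way to stop early without leaving the sending goroutine blocked is to drain whatever remains on the channel. The sketch below assumes a channel that, like the one described here, is closed once every row has been sent; `firstN` is a hypothetical helper and `Row` is a local stand-in for the package's type.

```go
package main

import "fmt"

// Row is a local stand-in with fields documented for xlsxreader.Row.
type Row struct {
	Error error
	Index int
}

// firstN (hypothetical helper) collects at most n rows, then drains the rest
// of the channel in a background goroutine so the sending goroutine is not
// left blocked on a channel that nobody reads.
func firstN(rows <-chan Row, n int) []Row {
	out := make([]Row, 0, n)
	for len(out) < n {
		row, ok := <-rows
		if !ok {
			return out // channel closed: nothing left to drain
		}
		out = append(out, row)
	}
	go func() {
		for range rows {
			// Discard the remaining rows until the channel is closed.
		}
	}()
	return out
}

func main() {
	// A hand-built channel stands in for the one returned by ReadRows.
	rows := make(chan Row)
	go func() {
		defer close(rows)
		for i := 1; i <= 5; i++ {
			rows <- Row{Index: i}
		}
	}()
	fmt.Println(len(firstN(rows, 2))) // prints 2
}
```

Draining still makes the sender produce every remaining row, so this trades some wasted work for not leaking a goroutine.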

Notes: Xlsx sheets may omit cells which are empty, meaning a row may not have continuous cell references. This function makes no attempt to fill/pad the missing cells.
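A consumer that needs a dense row can pad the gaps itself. The following sketch is one way to do so, not the package's behaviour: `columnToIndex` and `padCells` are hypothetical helpers, `Cell` is a local stand-in, and column references are assumed to be upper-case ASCII letters.

```go
package main

import "fmt"

// Cell is a local stand-in with fields documented for xlsxreader.Cell.
type Cell struct {
	Column string
	Row    int
	Value  string
}

// columnToIndex (hypothetical helper) converts a column reference such as
// "A", "Z" or "AA" to a zero-based index (A=0, Z=25, AA=26). It assumes
// upper-case ASCII letters only and returns -1 for an empty string.
func columnToIndex(col string) int {
	idx := 0
	for i := 0; i < len(col); i++ {
		idx = idx*26 + int(col[i]-'A') + 1
	}
	return idx - 1
}

// padCells (hypothetical helper) returns a slice of width values in which
// position i holds the value of the cell in column i. Positions with no cell
// stay as empty strings, and cells at or beyond width are ignored.
func padCells(cells []Cell, width int) []string {
	out := make([]string, width)
	for _, c := range cells {
		if i := columnToIndex(c.Column); i >= 0 && i < width {
			out[i] = c.Value
		}
	}
	return out
}

func main() {
	// Column B is missing, as it would be for an empty cell.
	cells := []Cell{{Column: "A", Value: "x"}, {Column: "C", Value: "z"}}
	fmt.Printf("%q\n", padCells(cells, 4)) // prints ["x" "" "z" ""]
}
```

The package's own Cell.ColumnIndex method could replace `columnToIndex`; this page does not state whether its result is zero-based or one-based, so check before relying on it.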

type XlsxFileCloser ¶

type XlsxFileCloser struct {
	XlsxFile
	// contains filtered or unexported fields
}

XlsxFileCloser wraps XlsxFile to be able to close an open file

func OpenFile ¶

func OpenFile(filename string) (*XlsxFileCloser, error)

OpenFile takes the name of an XLSX file and returns a populated XlsxFileCloser struct for it. If the file cannot be found, or key parts of the file's contents are missing, an error is returned. Note that the file must be Close()-d when you are finished with it.

func OpenReaderZip ¶ added in v1.2.0

func OpenReaderZip(rc *zip.ReadCloser) (*XlsxFileCloser, error)

OpenReaderZip takes the zip ReadCloser of an XLSX file and returns a populated XlsxFileCloser struct for it. If the file cannot be found, or key parts of the file's contents are missing, an error is returned. Note that the file must be Close()-d when you are finished with it.

func (*XlsxFileCloser) Close ¶

func (xl *XlsxFileCloser) Close() error

Close closes the XlsxFile, rendering it unusable for I/O.

func (*XlsxFileCloser) GetSheetFileForSheetName ¶ added in v1.2.3

func (xl *XlsxFileCloser) GetSheetFileForSheetName(sheetName string) *zip.File

GetSheetFileForSheetName returns the sheet file associated with the sheet name. This is useful when you want to further process something out of the sheet, that this library does not handle. For example this is useful when trying to read the hyperlinks section of a sheet file; getting the sheet file enables you to read the XML directly.
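As an illustration of reading the sheet XML directly, the sketch below decodes the hyperlinks section of a worksheet part with encoding/xml. The names `hyperlink`, `worksheet`, `readHyperlinks` and `demoSheetFile` are hypothetical, and the demo builds a minimal in-memory sheet part instead of using a real workbook; with the library, the `*zip.File` would come from GetSheetFileForSheetName.

```go
package main

import (
	"archive/zip"
	"bytes"
	"encoding/xml"
	"fmt"
	"log"
)

// hyperlink holds a few attributes of a <hyperlink> element in a sheet part.
type hyperlink struct {
	Ref      string `xml:"ref,attr"`      // cell reference, e.g. "A1"
	RelID    string `xml:"id,attr"`       // relationship id (r:id) for external targets
	Location string `xml:"location,attr"` // target inside the workbook, if any
}

// worksheet captures only the <hyperlinks> section of the sheet XML.
type worksheet struct {
	Hyperlinks []hyperlink `xml:"hyperlinks>hyperlink"`
}

// readHyperlinks (hypothetical helper) decodes the hyperlinks section from a
// sheet file inside the xlsx archive.
func readHyperlinks(f *zip.File) ([]hyperlink, error) {
	rc, err := f.Open()
	if err != nil {
		return nil, err
	}
	defer rc.Close()

	var ws worksheet
	if err := xml.NewDecoder(rc).Decode(&ws); err != nil {
		return nil, err
	}
	return ws.Hyperlinks, nil
}

// demoSheetFile builds an in-memory archive holding one minimal sheet part so
// the sketch runs without a real workbook.
func demoSheetFile() (*zip.File, error) {
	const sheetXML = `<worksheet xmlns="http://schemas.openxmlformats.org/spreadsheetml/2006/main" ` +
		`xmlns:r="http://schemas.openxmlformats.org/officeDocument/2006/relationships">` +
		`<sheetData/><hyperlinks><hyperlink ref="A1" r:id="rId1"/>` +
		`<hyperlink ref="B2" location="Sheet2!A1"/></hyperlinks></worksheet>`
	var buf bytes.Buffer
	w := zip.NewWriter(&buf)
	part, err := w.Create("xl/worksheets/sheet1.xml")
	if err != nil {
		return nil, err
	}
	if _, err := part.Write([]byte(sheetXML)); err != nil {
		return nil, err
	}
	if err := w.Close(); err != nil {
		return nil, err
	}
	zr, err := zip.NewReader(bytes.NewReader(buf.Bytes()), int64(buf.Len()))
	if err != nil {
		return nil, err
	}
	return zr.File[0], nil
}

func main() {
	f, err := demoSheetFile()
	if err != nil {
		log.Fatal(err)
	}
	links, err := readHyperlinks(f)
	if err != nil {
		log.Fatal(err)
	}
	for _, l := range links {
		fmt.Printf("%s id=%q location=%q\n", l.Ref, l.RelID, l.Location)
	}
}
```

A relationship id such as rId1 is only a reference; resolving it to a URL requires reading the sheet's relationships part in the same archive, which this sketch does not attempt.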
