gopherparse

v0.0.0-...-b14ae2a Latest
Warning

This package is not in the latest version of its module.

Published: Aug 11, 2023 License: MIT Imports: 4 Imported by: 0

README

GopherParse

GopherParse is a Go library for parsing, manipulating, and formatting HTML and XML documents. Inspired by the popular JavaScript library Cheerio, it aims to offer a fast, flexible, and elegant way to work with HTML and XML content in Go.

Installation

GopherParse requires a working Go installation. Install the package with the go get command:

go get github.com/theifedayo/gopherparse

Usage

Import the GopherParse package in your Go code:

import "github.com/theifedayo/gopherparse"

Parsing HTML and XML

You can parse HTML and XML documents with the LoadHTML and LoadXML functions, respectively. Each takes the document as a string and returns a *GopherParse object and an error:

htmlContent := `
	<!DOCTYPE html>
	<html>
	<head>
		<title>Sample HTML Document</title>
	</head>
	<body>
		<h1>Hello, World!</h1>
		<p>This is a sample HTML document.</p>
	</body>
	</html>
`

gpObj, err := gopherparse.LoadHTML(htmlContent)
if err != nil {
    // Handle error
}

// Now you can work with the GopherParse object.

XML content is parsed the same way with LoadXML:

xmlContent := `
	<root>
		<item>Item 1</item>
		<item>Item 2</item>
	</root>
`

gpObj, err := gopherparse.LoadXML(xmlContent)
if err != nil {
    // Handle error
}

// Now you can work with the GopherParse object.

Reading from Files

GopherParse also supports reading and parsing HTML and XML content from files:

htmlFilePath := "path/to/your/html/file.html"
gpObj, err := gopherparse.LoadHTMLFile(htmlFilePath)
if err != nil {
    // Handle error
}

// Now you can work with the GopherParse object.

XML files are loaded the same way with LoadXMLFile:

xmlFilePath := "path/to/your/xml/file.xml"
gpObj, err := gopherparse.LoadXMLFile(xmlFilePath)
if err != nil {
    // Handle error
}

// Now you can work with the GopherParse object.

Finding Elements

You can find elements in the parsed document using the FindByTag and FindByClass methods. Both return a slice of *html.Node:

// Get all h1 elements in the document.
h1Elements := gpObj.FindByTag("h1")

// Get all elements with class "paragraph" in the document.
pElements := gpObj.FindByClass("paragraph")

Manipulating Elements

You can manipulate elements in the document. For example, SetText sets the text content of every element with a given tag name:

// Set the text content of all h1 elements to "Updated Heading".
gpObj.SetText("h1", "Updated Heading")

Rendering the Document

To get the HTML representation of the document, you can use the Render method:

// Get the HTML string representation of the document.
htmlString := gpObj.Render()

Contributing

Feel free to contribute to GopherParse by opening issues, submitting pull requests, or suggesting new features. Your contributions are highly appreciated!

License

This project is licensed under the MIT License - see the LICENSE file for details.

Roadmap
  • Parsing HTML and XML documents.
  • Reading and parsing content from HTML and XML files.
  • Finding elements by tag name within the document.
  • Finding elements by class name within the document.
  • Setting the text content of elements with the specified tag name.
  • Rendering the GopherParse object into an HTML string.
  • Support for Selectors: Implement a selector engine for complex queries.
  • Element Manipulation: Add, remove, and modify attributes of elements.
  • Traversing: Methods to traverse the document tree.
  • Filtering: Filter selected elements based on specific conditions.
  • Element Creation: Programmatically create and insert new elements.
  • Text Searching and Highlighting: Search for specific text and highlight matches.
  • Error Handling: Improve error messages and context for better debugging.
  • Advanced Rendering Options: Control over rendering, indentation, and minification.
  • Event Handling: Attach event listeners to elements and respond to interactions.
  • Performance Optimization: Optimize parsing and manipulation for efficiency.
  • Data Extraction: Extract data from structured documents automatically.
  • XPath Support: Add support for XPath expressions as an alternative querying method.

Please note that the roadmap is subject to change based on feedback, contributions, and evolving requirements. Contributions and ideas from the Go community are highly encouraged and will help shape the future development of GopherParse.

Documentation

Index

Constants

This section is empty.

Variables

This section is empty.

Functions

This section is empty.

Types

type GopherParse

type GopherParse struct {
	// contains filtered or unexported fields
}

GopherParse represents a parsed HTML or XML document, similar to a document loaded with Cheerio.

func LoadHTML

func LoadHTML(htmlContent string) (*GopherParse, error)

LoadHTML parses an HTML document and returns a GopherParse object.

func LoadHTMLFile

func LoadHTMLFile(filePath string) (*GopherParse, error)

LoadHTMLFile reads an HTML file from the given path and returns a GopherParse object.

func LoadXML

func LoadXML(xmlContent string) (*GopherParse, error)

LoadXML parses an XML document and returns a GopherParse object.
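
The documentation does not say how LoadXML treats malformed input. If that matters for your use case, a pre-check with the standard library's encoding/xml decoder is one possible approach. The sketch below is an assumption-laden illustration, not GopherParse code: checkWellFormed is a hypothetical helper that walks the token stream and returns the first syntax error it meets.

```go
package main

import (
	"encoding/xml"
	"errors"
	"fmt"
	"io"
	"strings"
)

// checkWellFormed is a hypothetical helper (not part of GopherParse).
// It reads every token of content with encoding/xml and returns the
// first syntax error, or nil once the decoder reaches the end of input.
func checkWellFormed(content string) error {
	dec := xml.NewDecoder(strings.NewReader(content))
	for {
		_, err := dec.Token()
		if errors.Is(err, io.EOF) {
			return nil // reached the end without a syntax error
		}
		if err != nil {
			return err
		}
	}
}

func main() {
	good := "<root><item>Item 1</item><item>Item 2</item></root>"
	bad := "<root><item>Item 1</root>" // <item> is never closed

	fmt.Println("good:", checkWellFormed(good))
	fmt.Println("bad: ", checkWellFormed(bad))
}
```

The decoder runs in its default strict mode here, so mismatched or unclosed elements are reported as errors.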

func LoadXMLFile

func LoadXMLFile(filePath string) (*GopherParse, error)

LoadXMLFile reads an XML file from the given path and returns a GopherParse object.

func (*GopherParse) AddAttr

func (gp *GopherParse) AddAttr(tagName, key, value string)

AddAttr adds an attribute with the specified key-value pair to all elements with the given tag name within the GopherParse object.

func (*GopherParse) FindByClass

func (gp *GopherParse) FindByClass(className string) []*html.Node

FindByClass finds all elements with the specified class name within the GopherParse object.
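
In HTML, the class attribute holds a whitespace-separated list of class names, so an element with class="paragraph lead" carries both classes. The sketch below shows what token-based class matching looks like. hasClass is a hypothetical helper, and whether FindByClass matches tokens this way or compares the whole attribute value is not stated in the documentation.

```go
package main

import (
	"fmt"
	"strings"
)

// hasClass is a hypothetical helper (not part of GopherParse).
// It reports whether classAttr, the raw value of a class attribute,
// contains name as one of its whitespace-separated tokens.
func hasClass(classAttr, name string) bool {
	for _, token := range strings.Fields(classAttr) {
		if token == name {
			return true
		}
	}
	return false
}

func main() {
	fmt.Println(hasClass("paragraph lead", "paragraph"))  // true
	fmt.Println(hasClass("paragraph-intro", "paragraph")) // false
}
```

Matching on tokens avoids the false positive a substring search would give for "paragraph-intro".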

func (*GopherParse) FindByTag

func (gp *GopherParse) FindByTag(tagName string) []*html.Node

FindByTag finds all elements with the specified tag name within the GopherParse object.

func (*GopherParse) ModifyAttr

func (gp *GopherParse) ModifyAttr(tagName, key, newValue string)

ModifyAttr modifies the value of the attribute with the specified key for all elements with the given tag name within the GopherParse object.

func (*GopherParse) RemoveAttr

func (gp *GopherParse) RemoveAttr(tagName, key string)

RemoveAttr removes the attribute with the specified key from all elements with the given tag name within the GopherParse object.
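
To make the difference between AddAttr, ModifyAttr, and RemoveAttr concrete, the sketch below models one element's attributes as a slice of key-value pairs and applies the three operations to it. This is an illustration of the documented behavior only. The attr type and the three functions are hypothetical, and edge cases such as duplicate or missing keys are assumptions of the sketch, not statements about GopherParse.

```go
package main

import "fmt"

// attr models a single element attribute as a key-value pair.
type attr struct {
	Key, Val string
}

// addAttr appends a new key-value pair. Duplicate keys are not checked,
// which is an assumption of this sketch.
func addAttr(attrs []attr, key, val string) []attr {
	return append(attrs, attr{Key: key, Val: val})
}

// modifyAttr replaces the value of every attribute whose key matches.
// The list is left unchanged if the key is absent.
func modifyAttr(attrs []attr, key, newVal string) {
	for i := range attrs {
		if attrs[i].Key == key {
			attrs[i].Val = newVal
		}
	}
}

// removeAttr returns a new slice without the attributes whose key matches.
func removeAttr(attrs []attr, key string) []attr {
	kept := make([]attr, 0, len(attrs))
	for _, a := range attrs {
		if a.Key != key {
			kept = append(kept, a)
		}
	}
	return kept
}

func main() {
	var attrs []attr
	attrs = addAttr(attrs, "class", "title")
	attrs = addAttr(attrs, "id", "main")
	modifyAttr(attrs, "class", "heading")
	attrs = removeAttr(attrs, "id")
	fmt.Println(attrs) // [{class heading}]
}
```

In GopherParse itself, each method takes a tag name and applies the change to all elements with that tag.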

func (*GopherParse) Render

func (gp *GopherParse) Render() string

Render renders the GopherParse object into an HTML string.

func (*GopherParse) SetText

func (gp *GopherParse) SetText(tagName, text string)

SetText sets the text content of all elements with the specified tag name within the GopherParse object.
