godensity

Version: v0.0.0-...-823ec5a
Published: Apr 18, 2022 License: MIT Imports: 8 Imported by: 0

README

godensity

This repository is a Go implementation of the paper "DOM based content extraction via text density". So far it has only been tested on Korean web pages.
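
To make the idea behind the package concrete, here is a minimal sketch of the basic text density measure as it is commonly described: the number of characters in a subtree divided by the number of tags it contains, with a tag count of zero treated as one. The `elem` type and the `counts` and `textDensity` functions are hypothetical names invented for this illustration. They are not part of godensity, which works on `*goquery.Selection` values and may compute its scores differently.

```go
package main

import (
	"fmt"
	"unicode/utf8"
)

// elem is a hypothetical, simplified DOM element used only in this sketch.
type elem struct {
	tag      string
	text     string // text placed directly inside this element
	children []*elem
}

// counts returns the number of characters and the number of descendant
// tags in the subtree rooted at e.
func counts(e *elem) (chars, tags int) {
	chars = utf8.RuneCountInString(e.text)
	for _, c := range e.children {
		cc, ct := counts(c)
		chars += cc
		tags += ct + 1 // the child itself plus its own descendants
	}
	return chars, tags
}

// textDensity divides characters by tags. A subtree without descendant
// tags is treated as having one tag to avoid dividing by zero
// (an assumption of this sketch).
func textDensity(e *elem) float32 {
	chars, tags := counts(e)
	if tags == 0 {
		tags = 1
	}
	return float32(chars) / float32(tags)
}

func main() {
	article := &elem{tag: "div", children: []*elem{
		{tag: "p", text: "A long paragraph of article text goes here."},
		{tag: "p", text: "Another paragraph with plenty of words."},
	}}
	nav := &elem{tag: "ul", children: []*elem{
		{tag: "li", children: []*elem{{tag: "a", text: "Home"}}},
		{tag: "li", children: []*elem{{tag: "a", text: "News"}}},
	}}
	// Content blocks tend to score higher than link-heavy navigation.
	fmt.Println(textDensity(article) > textDensity(nav))
}
```

In this toy tree the article block has many characters per tag while the navigation list has few, which is the signal a density-based extractor relies on to separate main content from boilerplate.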

How to run?

gh repo clone minarc/godensity
cd godensity
go test

Documentation

Index

Constants

This section is empty.

Variables

This section is empty.

Functions

func CTD

func CTD() float32

func Density

func Density()

func DensitySum

func DensitySum()

func Filtering

func Filtering(body *goquery.Selection)

func IsGIF

func IsGIF(src *url.URL, currentURL *url.URL) bool

func TD

func TD() float32

Types

type Heap

type Heap struct {
	// contains filtered or unexported fields
}

type Node

type Node struct {
	T float32
	// contains filtered or unexported fields
}

func DiveIntoDOM

func DiveIntoDOM(me *goquery.Selection, domain string) *Node

type Result

type Result struct {
	// contains filtered or unexported fields
}
