timenlp

package module
v0.0.0-...-fb4ecb0 Latest
Published: Nov 12, 2023 License: Apache-2.0 Imports: 10 Imported by: 0

README

TimeNLP: a Go implementation of time-expression recognition for Chinese sentences.

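Fixed: an infinite loop caused by lunar-to-solar calendar conversion. Fixed: 下周五 (next Friday) being computed as the Friday of the week after next.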

Usage

go get -u github.com/bububa/TimeNLP

Features

Extracts time expressions from a sentence and converts them to normalized times.

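package main

import (
    "log"
    "time"

    timenlp "github.com/bububa/TimeNLP"
)

func main() {
    // "下周一下午三点开会" means "meeting next Monday at 3 p.m."
    target := "Hi,all.下周一下午三点开会"
    preferFuture := true
    // NewTimeNormalizer is the constructor documented below.
    tn := timenlp.NewTimeNormalizer(preferFuture)
    // Parse also takes the base time used for the extraction.
    ret, err := tn.Parse(target, time.Now())
    if err != nil {
        log.Fatalln(err)
    }
    log.Printf("%+v\n", ret)
}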

Reference

Python version: https://github.com/sunfiyes/Time-NLPY

Python 3 version: https://github.com/zhanzecheng/Time_NLP

Java version: https://github.com/shinyke/Time-NLP

PHP version: https://github.com/crazywhalecc/Time-NLP-PHP

JavaScript version: https://github.com/JohnnieFucker/ChiTimeNLP

Documentation

Overview

Package timenlp is the Go version of Time-NLP. Other versions: Python (https://github.com/sunfiyes/Time-NLPY), Python 3 (https://github.com/zhanzecheng/Time_NLP), Java (https://github.com/shinyke/Time-NLP), PHP (https://github.com/crazywhalecc/Time-NLP-PHP), JavaScript (https://github.com/JohnnieFucker/ChiTimeNLP).

Index

Constants

This section is empty.

Variables

This section is empty.

Functions

This section is empty.

Types

type RangeTimeEnum

type RangeTimeEnum int

RangeTimeEnum is the default time point (hour of day) for a time-of-day range.

const (
	// DAY_BREAK dawn (黎明)
	DAY_BREAK RangeTimeEnum = 3
	// EARLY_MORNING early morning (早)
	EARLY_MORNING RangeTimeEnum = 8
	// MORNING morning (上午)
	MORNING RangeTimeEnum = 10
	// NOON noon, midday (中午、午间)
	NOON RangeTimeEnum = 12
	// AFTERNOON afternoon (下午、午后)
	AFTERNOON RangeTimeEnum = 15
	// NIGHT evening, dusk (晚上、傍晚)
	NIGHT RangeTimeEnum = 18
	// LATE_NIGHT late evening (晚、晚间)
	LATE_NIGHT RangeTimeEnum = 20
	// MID_NIGHT late at night (深夜)
	MID_NIGHT RangeTimeEnum = 23
)

type Result

type Result struct {
	// NormalizedString is the string after normalization
	NormalizedString string `json:"normalized_string,omitempty"`
	// Type is the result type
	Type ResultType `json:"type,omitempty"`
	// Points are the time points
	Points []ResultPoint `json:"points,omitempty"`
}

Result is the return value.

type ResultPoint

type ResultPoint struct {
	// Time is the time
	Time time.Time
	// Pos is the position in the text
	Pos int `json:"pos,omitempty"`
	// Length is the length of the text
	Length int `json:"length,omitempty"`
}

ResultPoint is a time point contained in the return value.
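
Pos and Length can be used to cut the matched expression back out of the input. The sketch below assumes that both values count characters (runes) rather than bytes; this page does not say which, so check against real output before relying on it. The span helper and the offsets in main are hypothetical.

```go
package main

import "fmt"

// span cuts the matched expression out of text.
// ASSUMPTION: pos and length count runes, not bytes.
func span(text string, pos, length int) (string, bool) {
	r := []rune(text)
	if pos < 0 || length < 0 || pos+length > len(r) {
		return "", false
	}
	return string(r[pos : pos+length]), true
}

func main() {
	text := "Hi,all.下周一下午三点开会"
	// Hypothetical offsets; real ones would come from a ResultPoint.
	if s, ok := span(text, 7, 7); ok {
		fmt.Println(s) // 下周一下午三点
	}
}
```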

type ResultType

type ResultType string

ResultType is the type of the return value.

const (
	// DELTA relative time
	DELTA ResultType = "delta"
	// SPAN time span
	SPAN ResultType = "span"
	// TIMESTAMP point in time
	TIMESTAMP ResultType = "timestamp"
)
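
A caller would typically branch on the result type. The following minimal sketch redeclares the type and constants locally so it runs without the module; the describe helper is a hypothetical name, and how many entries Points holds for each type is not stated on this page.

```go
package main

import "fmt"

// Local mirror of the documented type and constants.
type ResultType string

const (
	DELTA     ResultType = "delta"
	SPAN      ResultType = "span"
	TIMESTAMP ResultType = "timestamp"
)

// describe returns a short label for each documented result type.
func describe(rt ResultType) string {
	switch rt {
	case DELTA:
		return "relative time"
	case SPAN:
		return "time span"
	case TIMESTAMP:
		return "point in time"
	default:
		return "unknown"
	}
}

func main() {
	for _, rt := range []ResultType{DELTA, SPAN, TIMESTAMP, "other"} {
		fmt.Printf("%s: %s\n", rt, describe(rt))
	}
}
```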

type SolarTermData

type SolarTermData struct {
	// Key is the index value
	Key float64
	// Month is the month
	Month int
	// Years are the years
	Years [][]int
}

SolarTermData holds the data for a solar-calendar time point.

type StringPreHandler

type StringPreHandler struct{}

StringPreHandler preprocesses strings.

func (StringPreHandler) DelKeyword

func (s StringPreHandler) DelKeyword(target string, rules string) string

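DelKeyword deletes every substring of target that matches the given rule. It can be used to strip whitespace and modal particles from a string. target is the string to process; rules is the deletion rule. It returns the cleaned string.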
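
The behavior described above can be approximated with the standard regexp package. This is a sketch, not the package's implementation: it assumes that rules is a regular expression, which this page does not confirm, and the lowercase delKeyword is a stand-in written for illustration.

```go
package main

import (
	"fmt"
	"regexp"
)

// delKeyword removes every match of the pattern rules from target.
// ASSUMPTION: rules is a regular expression. An invalid pattern
// leaves target unchanged.
func delKeyword(target, rules string) string {
	re, err := regexp.Compile(rules)
	if err != nil {
		return target
	}
	return re.ReplaceAllString(target, "")
}

func main() {
	// Strip ASCII whitespace between the words.
	fmt.Println(delKeyword("下周一 下午 三点 开会", `\s+`)) // 下周一下午三点开会
}
```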

func (StringPreHandler) NumberTranslator

func (s StringPreHandler) NumberTranslator(target string) string

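NumberTranslator converts every number written in Chinese numerals in a string to Arabic numerals. For example, "这里有一千两百个人,六百零五个来自中国" becomes "这里有1200个人,605个来自中国". Some irregular forms are also supported: 两万零六百五 becomes 20650, both 两百一十四 and 两百十四 become 214, and 一六零加一五八 becomes 160+158. The range currently converted correctly is 0-99999999. The function is designed to be reusable. target is the string to convert. It returns the converted string.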
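
To show the basic idea behind this kind of conversion, here is a simplified sketch that turns one regular Chinese numeral into an integer by accumulating digits and units. It is not the package's algorithm. It handles only a single numeral (not a whole sentence), and it does not cover the irregular forms listed above that drop the final unit (两万零六百五) or spell digits positionally (一六零). The name chineseToInt is hypothetical.

```go
package main

import "fmt"

var digits = map[rune]int64{
	'零': 0, '一': 1, '二': 2, '两': 2, '三': 3, '四': 4,
	'五': 5, '六': 6, '七': 7, '八': 8, '九': 9,
}

var units = map[rune]int64{'十': 10, '百': 100, '千': 1000}

// chineseToInt converts one regular Chinese numeral below 一亿 to an integer.
// It returns false if s is empty or contains a character it does not know.
func chineseToInt(s string) (int64, bool) {
	if s == "" {
		return 0, false
	}
	var total, section, current int64
	for _, r := range s {
		if d, ok := digits[r]; ok {
			current = d
		} else if u, ok := units[r]; ok {
			if current == 0 {
				current = 1 // a bare unit, such as 十 in 十四, counts once
			}
			section += current * u
			current = 0
		} else if r == '万' {
			// Close the section below ten thousand and scale it.
			total += (section + current) * 10000
			section, current = 0, 0
		} else {
			return 0, false
		}
	}
	return total + section + current, true
}

func main() {
	for _, s := range []string{"一千两百", "六百零五", "两百十四"} {
		n, _ := chineseToInt(s)
		fmt.Println(s, n)
	}
}
```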

func (StringPreHandler) WordToNum

func (s StringPreHandler) WordToNum(str string) int64

WordToNum is a helper for NumberTranslator that translates [零-九] to [0-9]. str is the Chinese numeral. It returns the corresponding integer, or -1 if str is not a number.

type TimeNormalizer

type TimeNormalizer struct {
	// contains filtered or unexported fields
}

TimeNormalizer is the main worker type for recognizing time expressions.

func NewTimeNormalizer

func NewTimeNormalizer(isPreferFuture bool, timeout ...time.Duration) *TimeNormalizer

NewTimeNormalizer creates a TimeNormalizer. isPreferFuture sets whether future times are preferred.

func (*TimeNormalizer) Parse

func (n *TimeNormalizer) Parse(target string, timeBase time.Time) (*Result, error)

Parse is the main entry point of TimeNormalizer. It extracts time expressions from the given string, using timeBase as the base time.

type TimePoint

type TimePoint [6]int

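TimePoint is the internal type for a normalized time expression unit. Its six fields correspond to the fields of the normalized expression: year, month, day, hour, minute, second. Each field is initialized to -1.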

var DefaultTimePoint TimePoint = [6]int{-1, -1, -1, -1, -1, -1}

DefaultTimePoint is the default time expression unit.
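
Because every field starts at -1, a field that is still -1 can be read as "not specified by the expression". The sketch below mirrors the type locally and lists which fields have been filled in. The fieldNames table and the setFields helper are hypothetical and not part of the package.

```go
package main

import "fmt"

// Local mirror of the documented type and default value.
type TimePoint [6]int

var DefaultTimePoint TimePoint = [6]int{-1, -1, -1, -1, -1, -1}

// fieldNames labels the six positions in order.
var fieldNames = [6]string{"year", "month", "day", "hour", "minute", "second"}

// setFields returns the names of the fields that are no longer -1.
func setFields(tp TimePoint) []string {
	var out []string
	for i, v := range tp {
		if v != -1 {
			out = append(out, fieldNames[i])
		}
	}
	return out
}

func main() {
	tp := DefaultTimePoint // arrays are copied, so the default stays untouched
	tp[3], tp[4] = 15, 0   // 15:00, date unspecified
	fmt.Println(setFields(tp))               // [hour minute]
	fmt.Println(len(setFields(DefaultTimePoint))) // 0
}
```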

func NewTimePointFromTime

func NewTimePointFromTime(t time.Time) TimePoint

NewTimePointFromTime creates a TimePoint from a time.Time.

func (TimePoint) ToTime

func (t TimePoint) ToTime(loc *time.Location) time.Time

ToTime converts the TimePoint to a time.Time.

type TimeUnit

type TimeUnit struct {
	// contains filtered or unexported fields
}

TimeUnit analyzes a time expression.

func NewTimeUnit

func NewTimeUnit(expTime string, pos int, length int, normalizer *TimeNormalizer, tpCtx TimePoint) *TimeUnit

NewTimeUnit creates a TimeUnit.

func (TimeUnit) Time

func (t TimeUnit) Time() time.Time

Time converts the TimeUnit to a time.Time.

func (TimeUnit) ToResultPoint

func (t TimeUnit) ToResultPoint() ResultPoint

ToResultPoint converts the TimeUnit to a ResultPoint.
