Documentation ¶
Constants ¶
This section is empty.
Variables ¶
var Error404 = errors.New("Doc not found. ")
Error404 is the error returned when no document was found.
var ErrorNoLatestVersion = errors.New("Not latest revision.")
ErrorNoLatestVersion is the error returned when trying to save an outdated revision of a CouchDB document.
Functions ¶
func IsItParsed ¶
IsItParsed checks whether the given URL has already been parsed.
func ShouldURLBeFetched ¶
ShouldURLBeFetched checks whether the given URL is already stored in the database.
Types ¶
type CouchDoc ¶
type CouchDoc struct {
	ID           string              `json:"_id"`
	Rev          string              `json:"_rev"`
	URL          string              `json:"url"`
	HTML         string              `json:"html"`
	Text         parse.PageStructure `json:"text"`
	Links        []string            `json:"links,omitempty"`
	LinksToQueue []string            `json:"-"`
	ParsedOn     time.Time           `json:"parsed_on,omitempty"`
	FetchedOn    time.Time           `json:"fetched_on,omitempty"`
}
CouchDoc represents a response from CouchDB.
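The struct tags decide which fields end up in the JSON sent to CouchDB. The sketch below copies the struct into a standalone program to show their effect. `PageStructure` here is a placeholder for `parse.PageStructure`, whose fields are not shown on this page, and `encode` is a hypothetical helper.

```go
package main

import (
	"encoding/json"
	"fmt"
	"time"
)

// PageStructure is a placeholder for parse.PageStructure; its real fields
// are not documented here, so Title is an assumption.
type PageStructure struct {
	Title string `json:"title"`
}

// CouchDoc copies the documented struct, using the placeholder type above.
type CouchDoc struct {
	ID           string        `json:"_id"`
	Rev          string        `json:"_rev"`
	URL          string        `json:"url"`
	HTML         string        `json:"html"`
	Text         PageStructure `json:"text"`
	Links        []string      `json:"links,omitempty"`
	LinksToQueue []string      `json:"-"`
	ParsedOn     time.Time     `json:"parsed_on,omitempty"`
	FetchedOn    time.Time     `json:"fetched_on,omitempty"`
}

// encode marshals doc and returns the top-level JSON keys it produced.
func encode(doc CouchDoc) (map[string]json.RawMessage, error) {
	raw, err := json.Marshal(doc)
	if err != nil {
		return nil, err
	}
	var fields map[string]json.RawMessage
	err = json.Unmarshal(raw, &fields)
	return fields, err
}

func main() {
	doc := CouchDoc{
		ID:           "abc",
		URL:          "https://example.com",
		LinksToQueue: []string{"https://example.com/next"},
	}
	fields, err := encode(doc)
	if err != nil {
		panic(err)
	}
	for _, key := range []string{"_id", "links", "parsed_on"} {
		_, ok := fields[key]
		fmt.Printf("%s present: %t\n", key, ok)
	}
}
```

`LinksToQueue` is tagged `json:"-"` and is never serialized, and an empty `Links` slice is dropped by `omitempty`. With `encoding/json`, `omitempty` does not drop a zero `time.Time`, because struct values are never treated as empty, so `parsed_on` and `fetched_on` are still emitted for documents where they were never set.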
func GetURLData ¶
GetURLData retrieves the data stored in CouchDB, looking it up by document ID.
type CouchDocCreated ¶
CouchDocCreated represents a full document
func AddURLData ¶
func AddURLData(url string, data []byte, mainURL bool) (CouchDocCreated, error)
AddURLData adds the URL and its data to the database. data is JSON-encoded.
func SaveExtractedTextAndLinks ¶
func SaveExtractedTextAndLinks(id string, data []byte) (CouchDocCreated, error)
SaveExtractedTextAndLinks updates the document with extracted information such as text and links.
type NewSite ¶
type NewSite struct {
	Site string `json:"site"`
}
NewSite is used to add a newly submitted URL.
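As an illustration of how a submitted URL maps onto this type, the sketch below decodes a JSON body into `NewSite`. `decodeNewSite` is a hypothetical helper, and rejecting an empty `site` is an assumption rather than documented behaviour of this package.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
)

// NewSite mirrors the documented type.
type NewSite struct {
	Site string `json:"site"`
}

// decodeNewSite parses a JSON body into a NewSite. Treating a blank site
// as an error is an assumption made for this sketch.
func decodeNewSite(body string) (NewSite, error) {
	var s NewSite
	if err := json.NewDecoder(strings.NewReader(body)).Decode(&s); err != nil {
		return NewSite{}, fmt.Errorf("decode new site: %w", err)
	}
	if strings.TrimSpace(s.Site) == "" {
		return NewSite{}, fmt.Errorf("decode new site: site is required")
	}
	return s, nil
}

func main() {
	s, err := decodeNewSite(`{"site": "https://example.com"}`)
	if err != nil {
		panic(err)
	}
	fmt.Println(s.Site)
}
```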
type StatsIndex ¶
StatsIndex holds stats related to the search engine index.
func IndexStats ¶
func IndexStats() *StatsIndex
IndexStats returns stats related to the index, such as counts of parsed and fetched documents.