crawlptt

package module
v0.0.0-...-b994a3c Latest
Warning

This package is not in the latest version of its module.

Published: Oct 31, 2018 License: MIT Imports: 4 Imported by: 1

README

crawlptt

Package crawlptt provides a simple crawler for PTT articles.

Installation

To install, simply run in a terminal:

go get github.com/iGene/crawlptt

Usage

The following example crawls the post index of the Gossiping board (passing 5 as the pages argument), then fetches the content of the first post in that index. It prints every index entry and the fetched post content.

package main

import (
	"fmt"

	"github.com/iGene/crawlptt"
)

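func main() {
	// Crawl the post index of the Gossiping board.
	postList, err := crawlptt.GetPostInfo("Gossiping", 5)
	if err != nil {
		fmt.Printf("Error : %v\n", err)
		return
	}
	// Print every index entry (author, title, link).
	for _, p := range postList {
		fmt.Printf("%v\n", p)
	}
	// Guard against an empty index before reading the first entry.
	if len(postList) == 0 {
		fmt.Println("No posts found")
		return
	}
	// Fetch and print the content of the first post in the index.
	post, err := crawlptt.GetPost(postList[0].Link)
	if err != nil {
		fmt.Printf("Error : %v\n", err)
		return
	}
	fmt.Printf("%v\n", post.Content)
}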

Documentation

Types

type PttPost

type PttPost struct {
	Content string
}

PttPost holds the content of a post.

func GetPost

func GetPost(url string) (post *PttPost, err error)

GetPost parses the content of a post from the post's URL.

type PttPostInfo

type PttPostInfo struct {
	Author string
	Title  string
	Link   string
}

PttPostInfo holds the index information of a post: its author, title, and link.
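
The fields of PttPostInfo can be used to narrow a crawled list before fetching any post content. The sketch below is illustrative only: filterByAuthor is a hypothetical helper, the struct is redeclared locally so the example compiles without the module, and the sample authors, titles, and links are made up.

```go
package main

import "fmt"

// PttPostInfo mirrors the documented crawlptt.PttPostInfo so that this
// sketch compiles on its own. In real code, use crawlptt.PttPostInfo.
type PttPostInfo struct {
	Author string
	Title  string
	Link   string
}

// filterByAuthor is a hypothetical helper (not part of crawlptt). It returns
// the entries whose Author matches exactly, skipping nil entries.
func filterByAuthor(posts []*PttPostInfo, author string) []*PttPostInfo {
	var out []*PttPostInfo
	for _, p := range posts {
		if p != nil && p.Author == author {
			out = append(out, p)
		}
	}
	return out
}

func main() {
	// Made-up sample data in the shape returned by GetPostInfo.
	posts := []*PttPostInfo{
		{Author: "alice", Title: "first", Link: "link-1"},
		{Author: "bob", Title: "second", Link: "link-2"},
		nil,
		{Author: "alice", Title: "third", Link: "link-3"},
	}
	for _, p := range filterByAuthor(posts, "alice") {
		fmt.Printf("%s: %s (%s)\n", p.Author, p.Title, p.Link)
	}
}
```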

func GetPostInfo

func GetPostInfo(board string, pages int) (post []*PttPostInfo, err error)

GetPostInfo parses the list of posts from the given board for the given number of pages.

func GetPostInfoURL

func GetPostInfoURL(url string, pages int) (post []*PttPostInfo, err error)

GetPostInfoURL parses the list of posts starting from the given post index URL.
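
The documentation does not state which URL form GetPostInfoURL expects. As a sketch under that caveat, the helper below builds an index URL using the layout of the public PTT web front end, https://www.ptt.cc/bbs/<board>/index.html. boardIndexURL is a hypothetical name, and whether crawlptt uses this layout internally is an assumption.

```go
package main

import (
	"errors"
	"fmt"
	"net/url"
)

// boardIndexURL is a hypothetical helper (not part of crawlptt). It builds
// the index URL of a board, assuming the PTT web layout
// https://www.ptt.cc/bbs/<board>/index.html.
func boardIndexURL(board string) (string, error) {
	if board == "" {
		return "", errors.New("board name is empty")
	}
	// PathEscape keeps unexpected characters from altering the URL path.
	return "https://www.ptt.cc/bbs/" + url.PathEscape(board) + "/index.html", nil
}

func main() {
	u, err := boardIndexURL("Gossiping")
	if err != nil {
		fmt.Println("Error :", err)
		return
	}
	fmt.Println(u)
	// With the real package, the result could then be passed on, e.g.:
	//   postList, err := crawlptt.GetPostInfoURL(u, 2)
}
```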
