scraperlite

command module
v0.0.0-...-d0d04af
Published: Feb 20, 2022 | License: BSD-3-Clause | Imports: 16 | Imported by: 0

README

Scrape text and HTML based on CSS selectors and save contents to a SQLite database.

On repeated runs, content that has changed is saved, together with the timestamp of each observation.
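One plausible reading of this behavior, sketched in Go with an in-memory stand-in for the database: each run appends an observation, and the content is stored only when it has not been seen before. The `Store` type, its `Observe` method, and the use of a SHA-256 hash as the content ID are assumptions made for illustration; they are not taken from scraperlite's source.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"encoding/json"
	"fmt"
	"time"
)

// Observation loosely mirrors a row of the observations table:
// a timestamp plus a reference to a stored content document.
type Observation struct {
	T         time.Time
	ContentID string
}

// Store is a hypothetical in-memory stand-in for the SQLite database.
type Store struct {
	Contents     map[string]string // content ID -> JSON document
	Observations []Observation
	Now          func() time.Time // clock, replaceable in tests
}

// NewStore returns an empty store that uses the system clock.
func NewStore() *Store {
	return &Store{Contents: make(map[string]string), Now: time.Now}
}

// Observe records one scraper run. It reports whether the scraped
// content was new, that is, not already present in Contents.
func (s *Store) Observe(scraped map[string]string) (bool, error) {
	// json.Marshal sorts map keys, so equal maps yield equal bytes.
	doc, err := json.Marshal(scraped)
	if err != nil {
		return false, err
	}
	sum := sha256.Sum256(doc)
	id := hex.EncodeToString(sum[:]) // assumed ID scheme: hash of the JSON
	_, seen := s.Contents[id]
	if !seen {
		s.Contents[id] = string(doc)
	}
	// Every run is recorded, whether or not the content changed.
	s.Observations = append(s.Observations, Observation{T: s.Now(), ContentID: id})
	return !seen, nil
}

func main() {
	s := NewStore()
	sample := map[string]string{"firstEventWhenWhere.txt": "Feb 21, 2022 | Graz, Austria"}
	for i := 0; i < 2; i++ {
		isNew, err := s.Observe(sample)
		if err != nil {
			panic(err)
		}
		fmt.Println("new content:", isNew)
	}
	fmt.Println(len(s.Contents), "content rows,", len(s.Observations), "observations")
}
```

Keying content by a hash of its JSON form keeps unchanged pages from growing the contents table, while the observations list still shows when each run happened.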

Example

scraperlite https://go.dev \
    whyGo.html 'body > header > div > nav > div > ul > li:nth-child(1)' \
    firstEventWhenWhere.txt '#event_slide0 > div.GoCarousel-eventBody > div > div.GoCarousel-eventDate'
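
The command line above is a URL followed by name/selector pairs. A minimal Go sketch of parsing that shape follows. It assumes that a name ending in `.html` requests HTML and any other name requests text; that rule is inferred from the example names and is not confirmed by the README. `Target` and `ParseArgs` are hypothetical names, not part of scraperlite.

```go
package main

import (
	"errors"
	"fmt"
	"path"
)

// Target is one name/selector pair from the command line.
type Target struct {
	Name     string // key under which the result is stored
	Selector string // CSS selector to evaluate against the page
	HTML     bool   // assumed: true when Name ends in ".html"
}

// ParseArgs splits arguments of the form
//   URL NAME SELECTOR [NAME SELECTOR]...
// into the URL and its targets.
func ParseArgs(args []string) (string, []Target, error) {
	// A valid argument list has a URL plus at least one complete pair,
	// so its length is odd and at least 3.
	if len(args) < 3 || len(args)%2 == 0 {
		return "", nil, errors.New("usage: URL NAME SELECTOR [NAME SELECTOR]...")
	}
	targets := make([]Target, 0, len(args)/2)
	for i := 1; i < len(args); i += 2 {
		targets = append(targets, Target{
			Name:     args[i],
			Selector: args[i+1],
			HTML:     path.Ext(args[i]) == ".html",
		})
	}
	return args[0], targets, nil
}

func main() {
	pageURL, targets, err := ParseArgs([]string{
		"https://go.dev",
		"whyGo.html", "body > header > div > nav > div > ul > li:nth-child(1)",
		"firstEventWhenWhere.txt", "#event_slide0 > div.GoCarousel-eventBody > div > div.GoCarousel-eventDate",
	})
	if err != nil {
		panic(err)
	}
	fmt.Println("url:", pageURL)
	for _, t := range targets {
		fmt.Printf("%s (html=%v): %s\n", t.Name, t.HTML, t.Selector)
	}
}
```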

In a sqlite3 shell:

sqlite> select t, json_extract(content, '$.firstEventWhenWhere.txt') as when_where,
  substr(json_extract(content, '$.whyGo.html'), 1, 20) || '...' as why_go_html
  from observations join contents on (id=content_id)
  order by t;
+----------------------------------+-------------------------------+-------------------------+
|                t                 |          when_where           |       why_go_html       |
+----------------------------------+-------------------------------+-------------------------+
| 2022-02-20 14:19:34.115801-04:00 | Feb 21, 2022 | Graz,  Austria | <li class="Header-me... |
+----------------------------------+-------------------------------+-------------------------+
