googlesearch

package
v0.15.0-beta
Warning

This package is not in the latest version of its module.
Published: Apr 25, 2024 License: MIT Imports: 14 Imported by: 0

README

---
title: "Google Search"
lang: "en-US"
draft: false
description: "Learn how to set up a VDP Google Search connector https://github.com/instill-ai/instill-core"
---

The Google Search component is a data connector that allows users to leverage the Google Search engine.
It can carry out the following tasks:

- [Search](#search)

## Release Stage

`Alpha`

## Configuration

The component configuration is defined and maintained [here](https://github.com/instill-ai/component/blob/main/pkg/connector/googlesearch/v0/config/definition.json).

## Connection

| Field | Field ID | Type | Note |
| :--- | :--- | :--- | :--- |
| API Key (required) | `api_key` | string | API Key for the Google Custom Search API. You can create one here: https://developers.google.com/custom-search/v1/overview#api_key |
| Search Engine ID (required) | `cse_id` | string | ID of the Programmable Search Engine to use. The Custom Search JSON API requires a configured Programmable Search Engine; if you have not created one yet, start at the Programmable Search Engine control panel: https://programmablesearchengine.google.com/controlpanel/all. The ID is the `cx` value in the URL of your search engine. For example, if the URL is https://cse.google.com/cse.js?cx=012345678910, the ID is `012345678910` |

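To illustrate how the two connection fields are used, here is a minimal sketch that builds a raw Custom Search JSON API request URL, where `api_key` maps to the `key` parameter and `cse_id` maps to `cx`. The `connection` type and `searchURL` function are hypothetical names for illustration only; the connector itself goes through the `customsearch` client (see `NewService`), not hand-built URLs.

```go
package main

import (
	"fmt"
	"net/url"
)

// connection mirrors the two connection fields from the table above.
// The type is hypothetical; only the field IDs come from the README.
type connection struct {
	APIKey string // api_key
	CSEID  string // cse_id
}

// searchURL builds a Custom Search JSON API request URL for one query.
// Query parameters are encoded in alphabetical order by url.Values.Encode.
func searchURL(c connection, query string) string {
	q := url.Values{}
	q.Set("key", c.APIKey) // API key
	q.Set("cx", c.CSEID)   // Programmable Search Engine ID
	q.Set("q", query)      // search query
	u := url.URL{
		Scheme:   "https",
		Host:     "www.googleapis.com",
		Path:     "/customsearch/v1",
		RawQuery: q.Encode(),
	}
	return u.String()
}

func main() {
	// Placeholder credentials; substitute your own values.
	c := connection{APIKey: "YOUR_API_KEY", CSEID: "012345678910"}
	fmt.Println(searchURL(c, "instill ai"))
	// prints https://www.googleapis.com/customsearch/v1?cx=012345678910&key=YOUR_API_KEY&q=instill+ai
}
```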
## Supported Tasks

### Search

Search the web via the Google Search engine.

| Input | ID | Type | Description |
| :--- | :--- | :--- | :--- |
| Task ID (required) | `task` | string | `TASK_SEARCH` |
| Query (required) | `query` | string | The search query for Google |
| Top K | `top_k` | integer | The number of results to return for each query |
| Include Link Text | `include_link_text` | boolean | Whether to scrape each result's link and include the page text in the `link_text` field |
| Include Link HTML | `include_link_html` | boolean | Whether to scrape each result's link and include the page's raw HTML in the `link_html` field |

| Output | ID | Type | Description |
| :--- | :--- | :--- | :--- |
| Results | `results` | array[object] | The returned search results from Google |

Documentation

Constants

const (
	// MaxResultsPerPage is the default max number of search results per page
	MaxResultsPerPage = 10
	// MaxResults is the maximum number of search results
	MaxResults = 100
)
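The two constants suggest that a `top_k` larger than one page is served by several paged requests. The following is a sketch of how such a plan could be computed, not the connector's actual code: `page` and `planPages` are hypothetical names, and clamping `top_k` to `MaxResults` (rather than returning an error) is an assumption.

```go
package main

import "fmt"

const (
	// Values copied from the package constants above.
	MaxResultsPerPage = 10
	MaxResults        = 100
)

// page describes one request: the 1-based index of its first result
// and the number of results to fetch. Hypothetical type.
type page struct {
	Start int
	Num   int
}

// planPages splits a requested top_k into per-page requests.
// Assumption: values above MaxResults are clamped, not rejected.
func planPages(topK int) []page {
	if topK <= 0 {
		return nil
	}
	if topK > MaxResults {
		topK = MaxResults
	}
	var pages []page
	for start := 1; start <= topK; start += MaxResultsPerPage {
		num := topK - start + 1 // results still needed
		if num > MaxResultsPerPage {
			num = MaxResultsPerPage
		}
		pages = append(pages, page{Start: start, Num: num})
	}
	return pages
}

func main() {
	fmt.Println(planPages(25)) // prints [{1 10} {11 10} {21 5}]
}
```

With these values, a request for 25 results becomes three calls, and any request above 100 results is planned as exactly ten full pages.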

Variables

This section is empty.

Functions

func Init

func Init(l *zap.Logger, u base.UsageHandler) *connector

func Min

func Min(x, y int) int

Min returns the smaller of x or y.

func NewService

func NewService(apiKey string) (*customsearch.Service, error)

NewService creates a Google Custom Search service.

Types

type Result

type Result struct {
	// Title: The title of the search result, in plain text.
	Title string `json:"title"`

	// Link: The full URL to which the search result is pointing, e.g.
	// http://www.example.com/foo/bar.
	Link string `json:"link"`

	// Snippet: The snippet of the search result, in plain text.
	Snippet string `json:"snippet"`

	// LinkText: The scraped text of the search web page result, in plain text.
	LinkText string `json:"link_text"`

	// LinkHTML: The full raw HTML of the search web page result.
	LinkHTML string `json:"link_html"`
}

type SearchInput

type SearchInput struct {
	// Query: The search query.
	Query string `json:"query"`

	// TopK: The number of search results to return.
	TopK *int `json:"top_k,omitempty"`

	// IncludeLinkText: Whether to include the scraped text of the search web page result.
	IncludeLinkText *bool `json:"include_link_text,omitempty"`

	// IncludeLinkHTML: Whether to include the scraped HTML of the search web page result.
	IncludeLinkHTML *bool `json:"include_link_html,omitempty"`
}

SearchInput defines the input of the search task
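Because the optional fields are pointers tagged with `omitempty`, an unset field is left out of the JSON entirely instead of being sent as a zero value. A minimal sketch of that behavior follows; the struct is copied from the definition above, while `ptr` and `encode` are hypothetical helpers.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// SearchInput is copied from the type definition above.
type SearchInput struct {
	Query           string `json:"query"`
	TopK            *int   `json:"top_k,omitempty"`
	IncludeLinkText *bool  `json:"include_link_text,omitempty"`
	IncludeLinkHTML *bool  `json:"include_link_html,omitempty"`
}

// ptr returns a pointer to v. Hypothetical helper for setting optional fields.
func ptr[T any](v T) *T { return &v }

// encode marshals the input to its JSON form. Hypothetical helper.
func encode(in SearchInput) string {
	b, err := json.Marshal(in)
	if err != nil {
		panic(err)
	}
	return string(b)
}

func main() {
	// Only the required field is set: optional fields are omitted.
	fmt.Println(encode(SearchInput{Query: "instill ai"}))
	// prints {"query":"instill ai"}

	// Optional fields appear only when their pointers are non-nil.
	fmt.Println(encode(SearchInput{
		Query:           "instill ai",
		TopK:            ptr(5),
		IncludeLinkText: ptr(true),
	}))
	// prints {"query":"instill ai","top_k":5,"include_link_text":true}
}
```

The pointer types also let a consumer tell "not provided" (nil) apart from an explicit `false` or `0`.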

type SearchOutput

type SearchOutput struct {
	// Results: The search results.
	Results []*Result `json:"results"`
}

SearchOutput defines the output of the search task
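As a sketch of consuming the task output, the snippet below decodes a JSON payload into `SearchOutput` and walks the results. The types are copied from the definitions above; `decodeOutput` is a hypothetical helper and the sample payload is invented for illustration, not real API output.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Result and SearchOutput are copied from the type definitions above.
type Result struct {
	Title    string `json:"title"`
	Link     string `json:"link"`
	Snippet  string `json:"snippet"`
	LinkText string `json:"link_text"`
	LinkHTML string `json:"link_html"`
}

type SearchOutput struct {
	Results []*Result `json:"results"`
}

// decodeOutput parses a JSON-encoded search output. Hypothetical helper.
func decodeOutput(raw []byte) (*SearchOutput, error) {
	var out SearchOutput
	if err := json.Unmarshal(raw, &out); err != nil {
		return nil, err
	}
	return &out, nil
}

func main() {
	// Illustrative payload only; field names follow the JSON tags above.
	raw := []byte(`{"results":[{"title":"Example Domain","link":"http://www.example.com/foo/bar","snippet":"An illustrative snippet.","link_text":"","link_html":""}]}`)

	out, err := decodeOutput(raw)
	if err != nil {
		panic(err)
	}
	for _, r := range out.Results {
		fmt.Printf("%s -> %s\n", r.Title, r.Link)
	}
	// prints Example Domain -> http://www.example.com/foo/bar
}
```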