Documentation ¶
Index ¶
- func CloneRepository(domain Domain, hostname, name, gitURL, index string) error
- func GetAllBlackListedRepos() map[string]string
- func GetClients() map[string]ClientAPI
- func IsRepoInBlackList(repoURL string) bool
- func RegisterClientAPIs()
- func SaveToFile(domain Domain, hostname string, name string, data []byte, index string) error
- func WalkMatch(root, pattern string) ([]string, error)
- type Bitbucket
- type BitbucketRepo
- type Blacklist
- type ClientAPI
- type Crawler
- func (c *Crawler) CrawlOrg(orgURL string, domain *Domain, pa PA)
- func (c *Crawler) CrawlPublisher(pa PA)
- func (c *Crawler) CrawlPublishers(publishers []PA) ([]string, error)
- func (c *Crawler) CrawlRepo(repoURL string, pa PA) error
- func (c *Crawler) DeleteByQueryFromES(search string) error
- func (c *Crawler) ExportForJekyll() error
- func (c *Crawler) KnownHost(link string) (*Domain, error)
- func (c *Crawler) ProcessRepo(repository Repository)
- func (c *Crawler) ProcessRepositories(repos chan Repository)
- type Domain
- type GeneratorAPIURL
- type GithubFiles
- type GithubOrgs
- type GithubRepo
- type Links
- type OrganizationHandler
- type Owner
- type PA
- type Range
- type Ranges
- type RangesData
- type Repo
- type Repository
- type SingleRepoHandler
- type Whitelist
Constants ¶
This section is empty.
Variables ¶
This section is empty.
Functions ¶
func CloneRepository ¶
CloneRepository clones the repository into DATADIR/repos/<hostname>/<vendor>/<repo>/gitClone.
func GetAllBlackListedRepos ¶
GetAllBlackListedRepos returns all blacklisted repositories.
func GetClients ¶
GetClients returns a map of all registered ClientAPI values.
func IsRepoInBlackList ¶
IsRepoInBlackList reports whether the repository at repoURL is in the blacklist.
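The lookup can be pictured as a map keyed by repository URL, which matches the map[string]string returned by GetAllBlackListedRepos. The following is only a minimal sketch: the helper names (normalizeURL, blacklistIndex, isBlacklisted), the URL normalization, and the choice of the reason as the map value are assumptions made for illustration, not the package's actual implementation.

```go
package main

import (
	"fmt"
	"strings"
)

// Repo mirrors the documented Repo type (YAML tags omitted for brevity).
type Repo struct {
	URL         string
	Reason      string
	Description string
}

// normalizeURL is a hypothetical helper: it lowercases the URL and drops a
// trailing slash or ".git" suffix so that equivalent URLs compare equal.
func normalizeURL(u string) string {
	u = strings.ToLower(strings.TrimSpace(u))
	u = strings.TrimSuffix(u, "/")
	return strings.TrimSuffix(u, ".git")
}

// blacklistIndex builds a URL -> reason table from a list of blocked repos.
// Using the reason as the value is an assumption.
func blacklistIndex(repos []Repo) map[string]string {
	idx := make(map[string]string, len(repos))
	for _, r := range repos {
		idx[normalizeURL(r.URL)] = r.Reason
	}
	return idx
}

// isBlacklisted reports whether repoURL is present in the index.
func isBlacklisted(idx map[string]string, repoURL string) bool {
	_, ok := idx[normalizeURL(repoURL)]
	return ok
}

func main() {
	idx := blacklistIndex([]Repo{
		{URL: "https://github.com/example/blocked", Reason: "test fixture"},
	})
	fmt.Println(isBlacklisted(idx, "https://github.com/example/blocked.git"))
	fmt.Println(isBlacklisted(idx, "https://github.com/example/allowed"))
}
```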
func RegisterClientAPIs ¶
func RegisterClientAPIs()
RegisterClientAPIs registers the client APIs for all the clients.
func SaveToFile ¶
SaveToFile saves the chosen <file_name> in DATADIR/repos/<source>/<vendor>/<repo>/<crawler_timestamp>_<file_name>.
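The directory layouts described for CloneRepository and SaveToFile can be sketched with path/filepath. This is an illustration under stated assumptions: the helper names are hypothetical, name is assumed to have the form "<vendor>/<repo>", and index is assumed to carry the crawler timestamp.

```go
package main

import (
	"fmt"
	"path/filepath"
)

// repoDir returns DATADIR/repos/<hostname>/<vendor>/<repo>.
// name is assumed to be "<vendor>/<repo>".
func repoDir(dataDir, hostname, name string) string {
	return filepath.Join(dataDir, "repos", hostname, filepath.FromSlash(name))
}

// clonePath returns the directory that would hold the git clone.
func clonePath(dataDir, hostname, name string) string {
	return filepath.Join(repoDir(dataDir, hostname, name), "gitClone")
}

// savedFilePath returns the path of a saved file, prefixed with the
// crawler timestamp (assumed to be passed as index).
func savedFilePath(dataDir, hostname, name, index, fileName string) string {
	return filepath.Join(repoDir(dataDir, hostname, name), index+"_"+fileName)
}

func main() {
	fmt.Println(clonePath("data", "github.com", "italia/example"))
	fmt.Println(savedFilePath("data", "github.com", "italia/example", "20180601", "publiccode.yml"))
}
```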
Types ¶
type Bitbucket ¶
type Bitbucket struct {
	Pagelen int `json:"pagelen"`
	Values []struct {
		Scm string `json:"scm"`
		Website string `json:"website"`
		HasWiki bool `json:"has_wiki"`
		Name string `json:"name"`
		Links Links `json:"links"`
		ForkPolicy string `json:"fork_policy"`
		UUID string `json:"uuid"`
		Language string `json:"language"`
		CreatedOn string `json:"created_on"`
		Mainbranch struct {
			Type string `json:"type"`
			Name string `json:"name"`
		} `json:"mainbranch"`
		FullName string `json:"full_name"`
		HasIssues bool `json:"has_issues"`
		Owner struct {
			Username string `json:"username"`
			DisplayName string `json:"display_name"`
			Type string `json:"type"`
			UUID string `json:"uuid"`
			Links struct {
				Self struct {
					Href string `json:"href"`
				} `json:"self"`
				HTML struct {
					Href string `json:"href"`
				} `json:"html"`
				Avatar struct {
					Href string `json:"href"`
				} `json:"avatar"`
			} `json:"links"`
		} `json:"owner"`
		UpdatedOn string `json:"updated_on"`
		Size int `json:"size"`
		Type string `json:"type"`
		Slug string `json:"slug"`
		IsPrivate bool `json:"is_private"`
		Description string `json:"description"`
		Project struct {
			Key string `json:"key"`
			Type string `json:"type"`
			UUID string `json:"uuid"`
			Links struct {
				Self struct {
					Href string `json:"href"`
				} `json:"self"`
				HTML struct {
					Href string `json:"href"`
				} `json:"html"`
				Avatar struct {
					Href string `json:"href"`
				} `json:"avatar"`
			} `json:"links"`
			Name string `json:"name"`
		} `json:"project,omitempty"`
		Parent struct {
			Links struct {
				Self struct {
					Href string `json:"href"`
				} `json:"self"`
				HTML struct {
					Href string `json:"href"`
				} `json:"html"`
				Avatar struct {
					Href string `json:"href"`
				} `json:"avatar"`
			} `json:"links"`
			Type string `json:"type"`
			Name string `json:"name"`
			FullName string `json:"full_name"`
			UUID string `json:"uuid"`
		} `json:"parent,omitempty"`
	} `json:"values"`
	Next string `json:"next"`
}
Bitbucket is the complete response from the Bitbucket API when listing all repositories.
type BitbucketRepo ¶
type BitbucketRepo struct {
	Scm string `json:"scm"`
	Website string `json:"website"`
	HasWiki bool `json:"has_wiki"`
	Name string `json:"name"`
	Links Links `json:"links"`
	ForkPolicy string `json:"fork_policy"`
	UUID string `json:"uuid"`
	Language string `json:"language"`
	CreatedOn time.Time `json:"created_on"`
	Mainbranch struct {
		Type string `json:"type"`
		Name string `json:"name"`
	} `json:"mainbranch"`
	FullName string `json:"full_name"`
	HasIssues bool `json:"has_issues"`
	Owner struct {
		Username string `json:"username"`
		DisplayName string `json:"display_name"`
		Type string `json:"type"`
		UUID string `json:"uuid"`
		Links struct {
			Self struct {
				Href string `json:"href"`
			} `json:"self"`
			HTML struct {
				Href string `json:"href"`
			} `json:"html"`
			Avatar struct {
				Href string `json:"href"`
			} `json:"avatar"`
		} `json:"links"`
	} `json:"owner"`
	UpdatedOn time.Time `json:"updated_on"`
	Size int `json:"size"`
	Type string `json:"type"`
	Slug string `json:"slug"`
	IsPrivate bool `json:"is_private"`
	Description string `json:"description"`
}
BitbucketRepo is the complete response from the Bitbucket API for a single repository.
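As an illustration of how a struct with these JSON tags is filled, the sketch below decodes a hand-written payload into a small subset of the BitbucketRepo fields using encoding/json. The subset type, the decodeRepo helper, and the sample payload are assumptions made for this example; real API responses carry many more fields.

```go
package main

import (
	"encoding/json"
	"fmt"
	"time"
)

// bitbucketRepoSubset keeps only a few of the documented BitbucketRepo
// fields; unknown JSON keys are silently ignored by encoding/json.
type bitbucketRepoSubset struct {
	Slug       string    `json:"slug"`
	FullName   string    `json:"full_name"`
	IsPrivate  bool      `json:"is_private"`
	CreatedOn  time.Time `json:"created_on"`
	Mainbranch struct {
		Type string `json:"type"`
		Name string `json:"name"`
	} `json:"mainbranch"`
}

// decodeRepo parses a single-repository response body.
func decodeRepo(body []byte) (bitbucketRepoSubset, error) {
	var r bitbucketRepoSubset
	err := json.Unmarshal(body, &r)
	return r, err
}

// sample is an illustrative payload, not a captured API response.
const sample = `{
  "slug": "demo",
  "full_name": "team/demo",
  "is_private": false,
  "created_on": "2018-06-01T10:20:30.123456+00:00",
  "mainbranch": {"type": "branch", "name": "master"}
}`

func main() {
	repo, err := decodeRepo([]byte(sample))
	if err != nil {
		fmt.Println("decode error:", err)
		return
	}
	fmt.Println(repo.FullName, repo.Mainbranch.Name, repo.CreatedOn.Year())
}
```

Because CreatedOn and UpdatedOn are time.Time here, a timestamp that is not valid RFC 3339 makes the whole decode fail, which may be why the list type Bitbucket keeps the same fields as plain strings.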
type Blacklist ¶
type Blacklist struct {
Repos []Repo `yaml:"repos"`
}
Blacklist contains a list of blocked repositories.
type ClientAPI ¶
type ClientAPI struct {
	Organization OrganizationHandler
	Single SingleRepoHandler
	APIURL GeneratorAPIURL
}
ClientAPI groups all the API functions of a single client.
type Crawler ¶
type Crawler struct {
	DryRun bool
	// contains filtered or unexported fields
}
Crawler is a helper type representing a crawler.
func NewCrawler ¶
NewCrawler initializes a new Crawler object, updates the IPA list and connects to Elasticsearch (if dryRun == false).
func (*Crawler) CrawlOrg ¶
CrawlOrg fetches all the repositories belonging to an org and crawls them.
func (*Crawler) CrawlPublisher ¶
CrawlPublisher delegates the work to single PA crawlers.
func (*Crawler) CrawlPublishers ¶
CrawlPublishers processes a list of publishers.
func (*Crawler) DeleteByQueryFromES ¶
DeleteByQueryFromES deletes the records from Elasticsearch whose publiccode.url field matches the search string.
func (*Crawler) ExportForJekyll ¶
ExportForJekyll exports YAML data files for the Jekyll website.
func (*Crawler) KnownHost ¶
KnownHost detects the right Domain API from the given URL and returns it. If no API is recognized, it returns an empty Domain and an error.
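One simple way to picture this detection is to compare the host of the link with the Host field of each configured Domain. The sketch below does only that; matching purely on the host name, the knownHost helper, and the explicit domains parameter are assumptions, and the real method may use additional heuristics.

```go
package main

import (
	"errors"
	"fmt"
	"net/url"
)

// Domain mirrors only the Host field of the documented Domain type.
type Domain struct {
	Host string
}

var errUnknownHost = errors.New("no known domain for this host")

// knownHost returns the configured domain whose Host equals the host of
// link. When nothing matches it returns an empty Domain and an error.
func knownHost(link string, domains []Domain) (*Domain, error) {
	u, err := url.Parse(link)
	if err != nil {
		return &Domain{}, err
	}
	for i := range domains {
		if domains[i].Host == u.Hostname() {
			return &domains[i], nil
		}
	}
	return &Domain{}, errUnknownHost
}

func main() {
	domains := []Domain{{Host: "github.com"}, {Host: "bitbucket.org"}}

	d, err := knownHost("https://github.com/italia", domains)
	fmt.Println(d.Host, err)

	_, err = knownHost("https://example.org/some/repo", domains)
	fmt.Println(err)
}
```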
func (*Crawler) ProcessRepo ¶
func (c *Crawler) ProcessRepo(repository Repository)
ProcessRepo looks for a publiccode.yml file in a repository, and if found it processes it.
func (*Crawler) ProcessRepositories ¶
func (c *Crawler) ProcessRepositories(repos chan Repository)
ProcessRepositories processes the repositories channel and checks the availability of the file.
type Domain ¶
type Domain struct {
	// Domains.yml data
	Host string `yaml:"host"`
	UseTokenFor []string `yaml:"use-token-for"`
	BasicAuth []string `yaml:"basic-auth"`
}
Domain is a single code hosting service.
func ReadAndParseDomains ¶
ReadAndParseDomains reads domainsFile and returns the parsed content in a Domain slice.
type GeneratorAPIURL ¶
GeneratorAPIURL returns the API URL for the corresponding ecosystem.
func GenerateBitbucketAPIURL ¶
func GenerateBitbucketAPIURL() GeneratorAPIURL
GenerateBitbucketAPIURL returns the API URL for the given Bitbucket organization link.
IN: https://bitbucket.org/Soft
OUT: https://api.bitbucket.org/2.0/repositories/Soft?pagelen=100
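The IN/OUT mapping above can be reproduced with a few lines of URL handling. This is a minimal sketch, not the package's code: bitbucketAPIURL is a hypothetical helper, and rejecting links that do not contain exactly one path segment is an assumption.

```go
package main

import (
	"fmt"
	"net/url"
	"strings"
)

// bitbucketAPIURL maps an organization link such as
// https://bitbucket.org/Soft to the repositories endpoint of the
// Bitbucket 2.0 API, asking for 100 items per page.
func bitbucketAPIURL(link string) (string, error) {
	u, err := url.Parse(link)
	if err != nil {
		return "", err
	}
	org := strings.Trim(u.Path, "/")
	if org == "" || strings.Contains(org, "/") {
		return "", fmt.Errorf("expected an organization link, got %q", link)
	}
	return "https://api.bitbucket.org/2.0/repositories/" + org + "?pagelen=100", nil
}

func main() {
	out, err := bitbucketAPIURL("https://bitbucket.org/Soft")
	fmt.Println(out, err)
}
```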
func GenerateGithubAPIURL ¶
func GenerateGithubAPIURL() GeneratorAPIURL
GenerateGithubAPIURL returns the API URLs for the given GitHub organization link.
IN: https://github.com/italia
OUT: https://api.github.com/orgs/italia/repos, https://api.github.com/users/italia/repos
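Unlike the Bitbucket case, the GitHub example lists two output URLs, presumably because the same name may belong to either an organization or a user account. A sketch of that mapping follows; githubAPIURLs is a hypothetical helper, and returning the candidates as a slice in this order is an assumption.

```go
package main

import (
	"fmt"
	"net/url"
	"strings"
)

// githubAPIURLs maps a link such as https://github.com/italia to the
// candidate API endpoints listing the repositories of that account:
// first the organization endpoint, then the user endpoint.
func githubAPIURLs(link string) ([]string, error) {
	u, err := url.Parse(link)
	if err != nil {
		return nil, err
	}
	name := strings.Trim(u.Path, "/")
	if name == "" || strings.Contains(name, "/") {
		return nil, fmt.Errorf("expected an organization link, got %q", link)
	}
	return []string{
		"https://api.github.com/orgs/" + name + "/repos",
		"https://api.github.com/users/" + name + "/repos",
	}, nil
}

func main() {
	urls, err := githubAPIURLs("https://github.com/italia")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	for _, u := range urls {
		fmt.Println(u)
	}
}
```

A caller could try the candidates in order and keep the first one that answers successfully; that fallback strategy is likewise an assumption.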
func GenerateGitlabAPIURL ¶
func GenerateGitlabAPIURL() GeneratorAPIURL
GenerateGitlabAPIURL returns the API URL for the given GitLab organization link.
IN: https://gitlab.org/blockninja
OUT: https://gitlab.com/api/v4/groups/blockninja
func GetAPIURL ¶
func GetAPIURL(clientAPI string) (GeneratorAPIURL, error)
GetAPIURL checks whether the API client for the requested API URL exists and returns its handler.
type GithubFiles ¶
type GithubFiles []struct {
	Name string `json:"name"`
	Path string `json:"path"`
	Sha string `json:"sha"`
	Size int `json:"size"`
	URL string `json:"url"`
	HTMLURL string `json:"html_url"`
	GitURL string `json:"git_url"`
	DownloadURL string `json:"download_url"`
	Type string `json:"type"`
	Links struct {
		Self string `json:"self"`
		Git string `json:"git"`
		HTML string `json:"html"`
	} `json:"_links"`
}
GithubFiles is a list of files in a repository.
type GithubOrgs ¶
type GithubOrgs []struct {
	ID int `json:"id"`
	Name string `json:"name"`
	FullName string `json:"full_name"`
	Owner Owner `json:"owner"`
	Private bool `json:"private"`
	HTMLURL string `json:"html_url"`
	Description string `json:"description"`
	Fork bool `json:"fork"`
	URL string `json:"url"`
	ForksURL string `json:"forks_url"`
	KeysURL string `json:"keys_url"`
	CollaboratorsURL string `json:"collaborators_url"`
	TeamsURL string `json:"teams_url"`
	HooksURL string `json:"hooks_url"`
	IssueEventsURL string `json:"issue_events_url"`
	EventsURL string `json:"events_url"`
	AssigneesURL string `json:"assignees_url"`
	BranchesURL string `json:"branches_url"`
	TagsURL string `json:"tags_url"`
	BlobsURL string `json:"blobs_url"`
	GitTagsURL string `json:"git_tags_url"`
	GitRefsURL string `json:"git_refs_url"`
	TreesURL string `json:"trees_url"`
	StatusesURL string `json:"statuses_url"`
	LanguagesURL string `json:"languages_url"`
	StargazersURL string `json:"stargazers_url"`
	ContributorsURL string `json:"contributors_url"`
	SubscribersURL string `json:"subscribers_url"`
	SubscriptionURL string `json:"subscription_url"`
	CommitsURL string `json:"commits_url"`
	GitCommitsURL string `json:"git_commits_url"`
	CommentsURL string `json:"comments_url"`
	IssueCommentURL string `json:"issue_comment_url"`
	ContentsURL string `json:"contents_url"`
	CompareURL string `json:"compare_url"`
	MergesURL string `json:"merges_url"`
	ArchiveURL string `json:"archive_url"`
	DownloadsURL string `json:"downloads_url"`
	IssuesURL string `json:"issues_url"`
	PullsURL string `json:"pulls_url"`
	MilestonesURL string `json:"milestones_url"`
	NotificationsURL string `json:"notifications_url"`
	LabelsURL string `json:"labels_url"`
	ReleasesURL string `json:"releases_url"`
	DeploymentsURL string `json:"deployments_url"`
	CreatedAt time.Time `json:"created_at"`
	UpdatedAt time.Time `json:"updated_at"`
	PushedAt time.Time `json:"pushed_at"`
	GitURL string `json:"git_url"`
	SSHURL string `json:"ssh_url"`
	CloneURL string `json:"clone_url"`
	SvnURL string `json:"svn_url"`
	Homepage string `json:"homepage"`
	Size int `json:"size"`
	StargazersCount int `json:"stargazers_count"`
	WatchersCount int `json:"watchers_count"`
	Language string `json:"language"`
	HasIssues bool `json:"has_issues"`
	HasProjects bool `json:"has_projects"`
	HasDownloads bool `json:"has_downloads"`
	HasWiki bool `json:"has_wiki"`
	HasPages bool `json:"has_pages"`
	ForksCount int `json:"forks_count"`
	MirrorURL string `json:"mirror_url"`
	Archived bool `json:"archived"`
	OpenIssuesCount int `json:"open_issues_count"`
	License struct {
		Key string `json:"key"`
		Name string `json:"name"`
		SpdxID string `json:"spdx_id"`
		URL string `json:"url"`
	} `json:"license"`
	Forks int `json:"forks"`
	OpenIssues int `json:"open_issues"`
	Watchers int `json:"watchers"`
	DefaultBranch string `json:"default_branch"`
	Permissions struct {
		Admin bool `json:"admin"`
		Push bool `json:"push"`
		Pull bool `json:"pull"`
	} `json:"permissions"`
}
GithubOrgs is the complete result of the GitHub API response for /orgs/<Name>/repos.
type GithubRepo ¶
type GithubRepo struct {
	ID int `json:"id"`
	Name string `json:"name"`
	FullName string `json:"full_name"`
	Owner Owner `json:"owner"`
	Private bool `json:"private"`
	HTMLURL string `json:"html_url"`
	Description string `json:"description"`
	Fork bool `json:"fork"`
	URL string `json:"url"`
	ForksURL string `json:"forks_url"`
	KeysURL string `json:"keys_url"`
	CollaboratorsURL string `json:"collaborators_url"`
	TeamsURL string `json:"teams_url"`
	HooksURL string `json:"hooks_url"`
	IssueEventsURL string `json:"issue_events_url"`
	EventsURL string `json:"events_url"`
	AssigneesURL string `json:"assignees_url"`
	BranchesURL string `json:"branches_url"`
	TagsURL string `json:"tags_url"`
	BlobsURL string `json:"blobs_url"`
	GitTagsURL string `json:"git_tags_url"`
	GitRefsURL string `json:"git_refs_url"`
	TreesURL string `json:"trees_url"`
	StatusesURL string `json:"statuses_url"`
	LanguagesURL string `json:"languages_url"`
	StargazersURL string `json:"stargazers_url"`
	ContributorsURL string `json:"contributors_url"`
	SubscribersURL string `json:"subscribers_url"`
	SubscriptionURL string `json:"subscription_url"`
	CommitsURL string `json:"commits_url"`
	GitCommitsURL string `json:"git_commits_url"`
	CommentsURL string `json:"comments_url"`
	IssueCommentURL string `json:"issue_comment_url"`
	ContentsURL string `json:"contents_url"`
	CompareURL string `json:"compare_url"`
	MergesURL string `json:"merges_url"`
	ArchiveURL string `json:"archive_url"`
	DownloadsURL string `json:"downloads_url"`
	IssuesURL string `json:"issues_url"`
	PullsURL string `json:"pulls_url"`
	MilestonesURL string `json:"milestones_url"`
	NotificationsURL string `json:"notifications_url"`
	LabelsURL string `json:"labels_url"`
	ReleasesURL string `json:"releases_url"`
	DeploymentsURL string `json:"deployments_url"`
	CreatedAt time.Time `json:"created_at"`
	UpdatedAt time.Time `json:"updated_at"`
	PushedAt time.Time `json:"pushed_at"`
	GitURL string `json:"git_url"`
	SSHURL string `json:"ssh_url"`
	CloneURL string `json:"clone_url"`
	SvnURL string `json:"svn_url"`
	Homepage string `json:"homepage"`
	Size int `json:"size"`
	StargazersCount int `json:"stargazers_count"`
	WatchersCount int `json:"watchers_count"`
	Language string `json:"language"`
	HasIssues bool `json:"has_issues"`
	HasProjects bool `json:"has_projects"`
	HasDownloads bool `json:"has_downloads"`
	HasWiki bool `json:"has_wiki"`
	HasPages bool `json:"has_pages"`
	ForksCount int `json:"forks_count"`
	MirrorURL interface{} `json:"mirror_url"`
	Archived bool `json:"archived"`
	OpenIssuesCount int `json:"open_issues_count"`
	License interface{} `json:"license"`
	Forks int `json:"forks"`
	OpenIssues int `json:"open_issues"`
	Watchers int `json:"watchers"`
	DefaultBranch string `json:"default_branch"`
	NetworkCount int `json:"network_count"`
	SubscribersCount int `json:"subscribers_count"`
}
GithubRepo is the complete result of the GitHub API response for a single repository.
type Links ¶
type Links struct {
	Watchers struct {
		Href string `json:"href"`
	} `json:"watchers"`
	Branches struct {
		Href string `json:"href"`
	} `json:"branches"`
	Tags struct {
		Href string `json:"href"`
	} `json:"tags"`
	Commits struct {
		Href string `json:"href"`
	} `json:"commits"`
	Clone []struct {
		Href string `json:"href"`
		Name string `json:"name"`
	} `json:"clone"`
	Self struct {
		Href string `json:"href"`
	} `json:"self"`
	Source struct {
		Href string `json:"href"`
	} `json:"source"`
	HTML struct {
		Href string `json:"href"`
	} `json:"html"`
	Avatar struct {
		Href string `json:"href"`
	} `json:"avatar"`
	Hooks struct {
		Href string `json:"href"`
	} `json:"hooks"`
	Forks struct {
		Href string `json:"href"`
	} `json:"forks"`
	Downloads struct {
		Href string `json:"href"`
	} `json:"downloads"`
	Pullrequests struct {
		Href string `json:"href"`
	} `json:"pullrequests"`
}
Links is the list of links associated with the repository.
type OrganizationHandler ¶
type OrganizationHandler func(domain Domain, url string, repositories chan Repository, pa PA) (string, error)
OrganizationHandler returns the client handler for an organization/team/group page (every domain has a different handler implementation).
func GetClientAPICrawler ¶
func GetClientAPICrawler(clientAPI string) (OrganizationHandler, error)
GetClientAPICrawler checks whether the API client for the requested organization clientAPI exists and returns its handler.
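A lookup of this kind is commonly backed by a map from client name to handler, in line with the map[string]ClientAPI returned by GetClients. The sketch below shows only the lookup-or-error pattern; the handler and registry types, the key "github", and the error text are hypothetical.

```go
package main

import "fmt"

// handler is a simplified stand-in for OrganizationHandler.
type handler func(url string) (next string, err error)

// registry maps a client name to its handler.
type registry map[string]handler

// get returns the handler registered under clientAPI, or an error when
// no such client has been registered.
func (r registry) get(clientAPI string) (handler, error) {
	h, ok := r[clientAPI]
	if !ok {
		return nil, fmt.Errorf("no client registered for %q", clientAPI)
	}
	return h, nil
}

func main() {
	clients := registry{
		"github": func(url string) (string, error) { return "", nil },
	}

	if _, err := clients.get("github"); err == nil {
		fmt.Println("github handler found")
	}
	if _, err := clients.get("svn"); err != nil {
		fmt.Println(err)
	}
}
```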
func RegisterBitbucketAPI ¶
func RegisterBitbucketAPI() OrganizationHandler
RegisterBitbucketAPI registers the crawler function for the Bitbucket API.
func RegisterGithubAPI ¶
func RegisterGithubAPI() OrganizationHandler
RegisterGithubAPI registers the crawler function for the GitHub API. The handler gets the list of repositories at the "link" URL. If a next page is available, it returns that page's URL; otherwise it returns an empty string ("").
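The next-page convention (a URL while more pages exist, an empty string at the end) lends itself to a simple driver loop. The following sketch uses a simplified handler signature and a fake in-memory handler; pageHandler, crawlAllPages, and fakeHandler are hypothetical names, and the real OrganizationHandler also receives a Domain and a PA.

```go
package main

import "fmt"

// Repository is reduced to a single field for this sketch.
type Repository struct{ Name string }

// pageHandler is a simplified stand-in for OrganizationHandler: it
// processes one page and returns the URL of the next one, or "".
type pageHandler func(url string, repositories chan Repository) (string, error)

// crawlAllPages calls h on start and keeps following the returned URL
// until the handler reports no further page. It returns the page count.
func crawlAllPages(start string, h pageHandler, repositories chan Repository) (int, error) {
	pages := 0
	for url := start; url != ""; {
		next, err := h(url, repositories)
		if err != nil {
			return pages, err
		}
		pages++
		url = next
	}
	return pages, nil
}

// fakeHandler serves pages from a map of page URL -> next page URL and
// emits one Repository per page.
func fakeHandler(pages map[string]string) pageHandler {
	return func(url string, repositories chan Repository) (string, error) {
		next, ok := pages[url]
		if !ok {
			return "", fmt.Errorf("unknown page %q", url)
		}
		repositories <- Repository{Name: url}
		return next, nil
	}
}

func main() {
	// Buffered so the fake handler can send without a concurrent reader.
	repos := make(chan Repository, 8)
	h := fakeHandler(map[string]string{"p1": "p2", "p2": "p3", "p3": ""})

	n, err := crawlAllPages("p1", h, repos)
	fmt.Println(n, len(repos), err)
}
```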
func RegisterGitlabAPI ¶
func RegisterGitlabAPI() OrganizationHandler
RegisterGitlabAPI registers the crawler function for the GitLab API.
type Owner ¶
type Owner struct {
	Login string `json:"login"`
	ID int `json:"id"`
	AvatarURL string `json:"avatar_url"`
	GravatarID string `json:"gravatar_id"`
	URL string `json:"url"`
	HTMLURL string `json:"html_url"`
	FollowersURL string `json:"followers_url"`
	FollowingURL string `json:"following_url"`
	GistsURL string `json:"gists_url"`
	StarredURL string `json:"starred_url"`
	SubscriptionsURL string `json:"subscriptions_url"`
	OrganizationsURL string `json:"organizations_url"`
	ReposURL string `json:"repos_url"`
	EventsURL string `json:"events_url"`
	ReceivedEventsURL string `json:"received_events_url"`
	Type string `json:"type"`
	SiteAdmin bool `json:"site_admin"`
}
Owner of the repository.
type PA ¶
type PA struct {
	Name string `yaml:"name"`
	CodiceIPA string `yaml:"codice-iPA"`
	Organizations []string `yaml:"orgs"`
	Repositories []string `yaml:"repos"`
	UnknownIPA bool `yaml:"unknown-iPA"`
}
PA is a Public Administration.
func ReadAndParseWhitelist ¶
ReadAndParseWhitelist reads the whitelist and returns the parsed content in a slice of PA.
type Ranges ¶
Ranges are the ranges for a specific parameter (userCommunity, codeActivity, releaseHistory, longevity).
type RangesData ¶
type RangesData []Ranges
RangesData contains the data loaded from vitality-ranges.yml
type Repo ¶
type Repo struct {
	URL string `yaml:"url"`
	Reason string `yaml:"reason"`
	Description string `yaml:"description"`
}
Repo matches a single repository.
func ReadAndParseBlacklist ¶
ReadAndParseBlacklist reads the blacklist and returns the parsed content in a slice of Repo.
type Repository ¶
type Repository struct {
	Name string
	Hostname string
	FileRawURL string
	GitCloneURL string
	GitBranch string
	Domain Domain
	Pa PA
	Headers map[string]string
	Metadata []byte
}
Repository is a single code repository. FileRawURL contains the direct url to the raw file.
func (*Repository) CalculateRepoActivity ¶
CalculateRepoActivity returns the repository activity index and the vitality slice calculated on the git clone. It follows section 2.5.2 (Fase 2.2: Valutazione soluzioni riusabili per la PA) of the document at https://lg-acquisizione-e-riuso-software-per-la-pa.readthedocs.io/
type SingleRepoHandler ¶
type SingleRepoHandler func(domain Domain, url string, repositories chan Repository, pa PA) error
SingleRepoHandler returns the client handler for a single repository (every domain has a different handler implementation).
func GetSingleClientAPICrawler ¶
func GetSingleClientAPICrawler(clientAPI string) (SingleRepoHandler, error)
GetSingleClientAPICrawler checks whether the API client for the requested single repository clientAPI exists and returns its handler.
func RegisterSingleBitbucketAPI ¶
func RegisterSingleBitbucketAPI() SingleRepoHandler
RegisterSingleBitbucketAPI registers the crawler function for a single Bitbucket repository.
func RegisterSingleGithubAPI ¶
func RegisterSingleGithubAPI() SingleRepoHandler
RegisterSingleGithubAPI registers the crawler function for a single GitHub repository. The handler returns nil if the repository was successfully added to the repositories channel; otherwise it returns the generated error.
func RegisterSingleGitlabAPI ¶
func RegisterSingleGitlabAPI() SingleRepoHandler
RegisterSingleGitlabAPI registers the crawler function for a single GitLab repository.