stabilityai

package
v0.15.0-beta
Published: Apr 25, 2024 License: MIT Imports: 11 Imported by: 0

README

---
title: "Stability AI"
lang: "en-US"
draft: false
description: "Learn about how to set up a VDP Stability AI connector https://github.com/instill-ai/instill-core"
---

The Stability AI component is an AI connector that lets users connect to the AI models served on the Stability AI Platform.
It supports the following tasks:

- [Text To Image](#text-to-image)
- [Image To Image](#image-to-image)

## Release Stage

`Alpha`

## Configuration

The component configuration is defined and maintained [here](https://github.com/instill-ai/component/blob/main/pkg/connector/stabilityai/v0/config/definition.json).

## Connection

| Field | Field ID | Type | Note |
| :--- | :--- | :--- | :--- |
| API Key (required) | `api_key` | string | Fill in your Stability AI API key. To find your keys, visit https://platform.stability.ai/account/keys |

## Supported Tasks

### Text To Image

Generate a new image from a text prompt.

| Input | ID | Type | Description |
| :--- | :--- | :--- | :--- |
| Task ID (required) | `task` | string | `TASK_TEXT_TO_IMAGE` |
| Engine (required) | `engine` | string | Stability AI Engine (model) to be used. |
| Prompts (required) | `prompts` | array[string] | An array of prompts to use for generation. |
| Weights | `weights` | array[number] | An array of weights to use for generation. |
| CFG Scale | `cfg_scale` | number | How strictly the diffusion process adheres to the prompt text (higher values keep your image closer to your prompt) |
| Clip Guidance Preset | `clip_guidance_preset` | string | Clip guidance preset |
| Height | `height` | integer | The image height |
| Width | `width` | integer | The image width |
| Sampler | `sampler` | string | Which sampler to use for the diffusion process. If this value is omitted we'll automatically select an appropriate sampler for you. |
| Samples | `samples` | integer | Number of images to generate |
| Seed | `seed` | integer | Random noise seed (omit this option or use `0` for a random seed) |
| Steps | `steps` | integer | Number of diffusion steps to run. |
| Style Preset | `style_preset` | string | Pass in a style preset to guide the image model towards a particular style. This list of style presets is subject to change. |

| Output | ID | Type | Description |
| :--- | :--- | :--- | :--- |
| Images | `images` | array[string] | Generated images |
| Seeds | `seeds` | array[number] | Seeds of generated images |
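
As an illustration of the input table above, the following minimal sketch assembles a Text To Image input as JSON. The `textToImageInput` type and the `marshalInput` helper are hypothetical local definitions that mirror only a subset of the documented fields, and `example-engine` is a placeholder rather than a real engine ID.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
)

// textToImageInput is a local, illustrative type covering a subset of the
// documented input fields. It is not the package's own struct.
type textToImageInput struct {
	Task     string    `json:"task"`
	Engine   string    `json:"engine"`
	Prompts  []string  `json:"prompts"`
	Weights  []float64 `json:"weights,omitempty"`
	CfgScale *float64  `json:"cfg_scale,omitempty"`
	Steps    *uint32   `json:"steps,omitempty"`
}

// marshalInput checks the fields marked as required in the table, sets the
// task ID, and encodes the input. Unset optional fields are omitted.
func marshalInput(in textToImageInput) ([]byte, error) {
	if in.Engine == "" || len(in.Prompts) == 0 {
		return nil, errors.New("engine and at least one prompt are required")
	}
	in.Task = "TASK_TEXT_TO_IMAGE"
	return json.Marshal(in)
}

func main() {
	steps := uint32(30)
	payload, err := marshalInput(textToImageInput{
		Engine:  "example-engine", // placeholder, not a real engine ID
		Prompts: []string{"a lighthouse at dusk"},
		Steps:   &steps,
	})
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(string(payload))
}
```

Optional fields are pointers or slices tagged with `omitempty`, so anything left unset is dropped from the payload.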

### Image To Image

Modify an image based on a text prompt.

| Input | ID | Type | Description |
| :--- | :--- | :--- | :--- |
| Task ID (required) | `task` | string | `TASK_IMAGE_TO_IMAGE` |
| Engine (required) | `engine` | string | Stability AI Engine (model) to be used. |
| Prompts (required) | `prompts` | array[string] | An array of prompts to use for generation. |
| Init Image | `init_image` | string | Image used to initialize the diffusion process, in lieu of random noise. |
| Weights | `weights` | array[number] | An array of weights to use for generation. If unspecified, the model will automatically assign a default weight of 1.0 to each prompt. |
| Clip Guidance Preset | `clip_guidance_preset` | string | Clip guidance preset |
| Image Strength | `image_strength` | number | How much influence the `init_image` has on the diffusion process. Values close to `1` yield images very similar to the `init_image`, while values close to `0` yield images very different from the `init_image`. This behavior is meant to mirror DreamStudio's "Image Strength" slider. <br/> <br/> This parameter is an alternate way to set `step_schedule_start`, which is computed as `1 - image_strength`. For example, an Image Strength of 35% (`0.35`) results in a `step_schedule_start` of `0.65`. |
| CFG Scale | `cfg_scale` | number | How strictly the diffusion process adheres to the prompt text (higher values keep your image closer to your prompt). |
| Init Image Mode | `init_image_mode` | string | Whether to use `image_strength` or `step_schedule_*` to control how much influence the `init_image` has on the result. |
| Sampler | `sampler` | string | Which sampler to use for the diffusion process. If this value is omitted we'll automatically select an appropriate sampler for you. |
| Samples | `samples` | integer | Number of images to generate |
| Seed | `seed` | integer | Random noise seed (omit this option or use `0` for a random seed) |
| Step Schedule Start | `step_schedule_start` | number | Skips a proportion of the start of the diffusion steps, allowing the `init_image` to influence the final generated image. Lower values result in more influence from the `init_image`, while higher values result in more influence from the diffusion steps. For example, a value of `0` would simply return the `init_image`, whereas a value of `1` would return a completely different image. |
| Step Schedule End | `step_schedule_end` | number | Skips a proportion of the end of the diffusion steps, allowing the `init_image` to influence the final generated image. Lower values result in more influence from the `init_image`, while higher values result in more influence from the diffusion steps. |
| Steps | `steps` | integer | Number of diffusion steps to run. |
| Style Preset | `style_preset` | string | Pass in a style preset to guide the image model towards a particular style. This list of style presets is subject to change. |

| Output | ID | Type | Description |
| :--- | :--- | :--- | :--- |
| Images | `images` | array[string] | Generated images |
| Seeds | `seeds` | array[number] | Seeds of generated images |
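
The relation between `image_strength` and `step_schedule_start` described in the input table can be sketched as a small helper. The function name `stepScheduleStart` is hypothetical, and rejecting values outside the 0 to 1 range is an assumption of this sketch rather than documented behavior.

```go
package main

import (
	"errors"
	"fmt"
)

// stepScheduleStart converts an image strength into the equivalent
// step_schedule_start using the documented relation 1 - image_strength.
// The 0..1 range check is an assumption made for this sketch.
func stepScheduleStart(imageStrength float64) (float64, error) {
	if imageStrength < 0 || imageStrength > 1 {
		return 0, errors.New("image strength must be between 0 and 1")
	}
	return 1 - imageStrength, nil
}

func main() {
	start, err := stepScheduleStart(0.35)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	// Rounded to two decimals to hide floating-point noise.
	fmt.Printf("step_schedule_start = %.2f\n", start)
}
```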

Documentation

Index

Constants

This section is empty.

Variables

This section is empty.

Functions

func Init

func Init(l *zap.Logger, u base.UsageHandler) *connector

Types

type Engine

type Engine struct {
	Description string `json:"description"`
	ID          string `json:"id"`
	Name        string `json:"name"`
	Type        string `json:"type"`
}

Engine represents a Stability AI Engine.

type Image

type Image struct {
	Base64       string `json:"base64"`
	Seed         uint32 `json:"seed"`
	FinishReason string `json:"finishReason"`
}

Image represents a single image.

type ImageTaskRes

type ImageTaskRes struct {
	Images []Image `json:"artifacts"`
}

ImageTaskRes represents the response body for the text-to-image API.
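
A minimal sketch of how a response shaped like `ImageTaskRes` could be decoded and split into the `images` and `seeds` outputs. The types are copied locally so the example is self-contained, the `splitArtifacts` helper is hypothetical, and the sample body is made up for illustration. The connector's real post-processing may differ.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Local copies of the documented types, so the example compiles on its own.
type Image struct {
	Base64       string `json:"base64"`
	Seed         uint32 `json:"seed"`
	FinishReason string `json:"finishReason"`
}

type ImageTaskRes struct {
	Images []Image `json:"artifacts"`
}

// splitArtifacts decodes a response body and returns the base64 images and
// their seeds as two parallel slices.
func splitArtifacts(body []byte) ([]string, []uint32, error) {
	var res ImageTaskRes
	if err := json.Unmarshal(body, &res); err != nil {
		return nil, nil, fmt.Errorf("decoding response: %w", err)
	}
	images := make([]string, 0, len(res.Images))
	seeds := make([]uint32, 0, len(res.Images))
	for _, img := range res.Images {
		images = append(images, img.Base64)
		seeds = append(seeds, img.Seed)
	}
	return images, seeds, nil
}

func main() {
	// Illustrative body only; the field values are invented for this example.
	body := []byte(`{"artifacts":[{"base64":"aGVsbG8=","seed":42,"finishReason":"SUCCESS"}]}`)
	images, seeds, err := splitArtifacts(body)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(len(images), "image(s), seeds:", seeds)
}
```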

type ImageToImageInput

type ImageToImageInput struct {
	Task               string     `json:"task"`
	Engine             string     `json:"engine"`
	Prompts            []string   `json:"prompts"`
	InitImage          string     `json:"init_image"`
	Weights            *[]float64 `json:"weights,omitempty"`
	InitImageMode      *string    `json:"init_image_mode,omitempty"`
	ImageStrength      *float64   `json:"image_strength,omitempty"`
	StepScheduleStart  *float64   `json:"step_schedule_start,omitempty"`
	StepScheduleEnd    *float64   `json:"step_schedule_end,omitempty"`
	CfgScale           *float64   `json:"cfg_scale,omitempty"`
	ClipGuidancePreset *string    `json:"clip_guidance_preset,omitempty"`
	Sampler            *string    `json:"sampler,omitempty"`
	Samples            *uint32    `json:"samples,omitempty"`
	Seed               *uint32    `json:"seed,omitempty"`
	Steps              *uint32    `json:"steps,omitempty"`
	StylePreset        *string    `json:"style_preset,omitempty"`
}

type ImageToImageOutput

type ImageToImageOutput struct {
	Images []string `json:"images"`
	Seeds  []uint32 `json:"seeds"`
}

type ImageToImageReq

type ImageToImageReq struct {
	TextPrompts        []TextPrompt `json:"text_prompts" om:"texts[:]"`
	InitImage          string       `json:"init_image" om:"images[0]"`
	CFGScale           *float64     `json:"cfg_scale,omitempty" om:"metadata.cfg_scale"`
	ClipGuidancePreset *string      `json:"clip_guidance_preset,omitempty" om:"metadata.clip_guidance_preset"`
	Sampler            *string      `json:"sampler,omitempty" om:"metadata.sampler"`
	Samples            *uint32      `json:"samples,omitempty" om:"metadata.samples"`
	Seed               *uint32      `json:"seed,omitempty" om:"metadata.seed"`
	Steps              *uint32      `json:"steps,omitempty" om:"metadata.steps"`
	StylePreset        *string      `json:"style_preset,omitempty" om:"metadata.style_preset"`
	InitImageMode      *string      `json:"init_image_mode,omitempty" om:"metadata.init_image_mode"`
	ImageStrength      *float64     `json:"image_strength,omitempty" om:"metadata.image_strength"`
	StepScheduleStart  *float64     `json:"step_schedule_start,omitempty" om:"metadata.step_schedule_start"`
	StepScheduleEnd    *float64     `json:"step_schedule_end,omitempty" om:"metadata.step_schedule_end"`
	// contains filtered or unexported fields
}

ImageToImageReq represents the request body for the image-to-image API.

type TextPrompt

type TextPrompt struct {
	Text   string   `json:"text" om:"."`
	Weight *float64 `json:"weight"`
}

TextPrompt holds a prompt's text and its weight.
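
One way the `prompts` and `weights` inputs might be combined into a `[]TextPrompt` is sketched below. The `toTextPrompts` helper is hypothetical, the local `TextPrompt` copy drops the `om` struct tags, and leaving `Weight` nil when no weight is supplied at that index is an assumption of this sketch.

```go
package main

import "fmt"

// TextPrompt is a local copy of the documented type (without the om tags).
type TextPrompt struct {
	Text   string   `json:"text"`
	Weight *float64 `json:"weight"`
}

// toTextPrompts pairs each prompt with the weight at the same index.
// Prompts without a matching weight keep a nil Weight.
func toTextPrompts(prompts []string, weights []float64) []TextPrompt {
	out := make([]TextPrompt, 0, len(prompts))
	for i, p := range prompts {
		tp := TextPrompt{Text: p}
		if i < len(weights) {
			w := weights[i] // copy so each prompt gets its own pointer
			tp.Weight = &w
		}
		out = append(out, tp)
	}
	return out
}

func main() {
	prompts := []string{"a lighthouse at dusk", "blurry"}
	for _, tp := range toTextPrompts(prompts, []float64{1.0}) {
		if tp.Weight == nil {
			fmt.Printf("%q: weight left unset\n", tp.Text)
			continue
		}
		fmt.Printf("%q: weight %.1f\n", tp.Text, *tp.Weight)
	}
}
```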

type TextToImageInput

type TextToImageInput struct {
	Task               string     `json:"task"`
	Prompts            []string   `json:"prompts"`
	Engine             string     `json:"engine"`
	Weights            *[]float64 `json:"weights,omitempty"`
	Height             *uint32    `json:"height,omitempty"`
	Width              *uint32    `json:"width,omitempty"`
	CfgScale           *float64   `json:"cfg_scale,omitempty"`
	ClipGuidancePreset *string    `json:"clip_guidance_preset,omitempty"`
	Sampler            *string    `json:"sampler,omitempty"`
	Samples            *uint32    `json:"samples,omitempty"`
	Seed               *uint32    `json:"seed,omitempty"`
	Steps              *uint32    `json:"steps,omitempty"`
	StylePreset        *string    `json:"style_preset,omitempty"`
}

type TextToImageOutput

type TextToImageOutput struct {
	Images []string `json:"images"`
	Seeds  []uint32 `json:"seeds"`
}

type TextToImageReq

type TextToImageReq struct {
	TextPrompts        []TextPrompt `json:"text_prompts" om:"texts[:]"`
	CFGScale           *float64     `json:"cfg_scale,omitempty" om:"metadata.cfg_scale"`
	ClipGuidancePreset *string      `json:"clip_guidance_preset,omitempty" om:"metadata.clip_guidance_preset"`
	Sampler            *string      `json:"sampler,omitempty" om:"metadata.sampler"`
	Samples            *uint32      `json:"samples,omitempty" om:"metadata.samples"`
	Seed               *uint32      `json:"seed,omitempty" om:"metadata.seed"`
	Steps              *uint32      `json:"steps,omitempty" om:"metadata.steps"`
	StylePreset        *string      `json:"style_preset,omitempty" om:"metadata.style_preset"`
	Height             *uint32      `json:"height,omitempty" om:"metadata.height"`
	Width              *uint32      `json:"width,omitempty" om:"metadata.width"`
	// contains filtered or unexported fields
}

TextToImageReq represents the request body for the text-to-image API.
