thanosql.embed()
The `embed` function in ThanoSQL generates embeddings for input data using a pre-trained model. It supports several engines and can produce embeddings for text, images, and other types of data.
Parameter | Type | Default | Description | Options |
---|---|---|---|---|
engine | string | 'huggingface' | The engine to use for generating embeddings. | 'huggingface': uses models from Hugging Face; 'thanosql': uses ThanoSQL's native models; 'openai': uses models from OpenAI. |
input | string | | The input data from which the embeddings are generated. It can be text, a URL, an S3 URI, or a path to a local file. | N/A |
model | string | | The name or path of the pre-trained embedding model. | Example: 'openai/clip-vit-base-patch32' |
model_args | json | None | JSON string representing additional arguments for the model. | N/A |
tokenizer_args | json | None | JSON string representing additional arguments for the tokenizer. | N/A |
token | string | None | Token for authentication, if required by the model. | N/A |
base_url | string | None | Base URL that points the client to an endpoint other than the default OpenAI API endpoint. Only applicable when the engine is 'openai'. | N/A |
The `embed` function can be used with a Hugging Face model (for text, or to generate embeddings for an image), with an OpenAI model, with an OpenAI model and a custom base URL through the OpenAI client, or as a standalone query.
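The usage scenarios above can be sketched with a small Python helper that assembles the `thanosql.embed(...)` expression from the parameters in the table. This is a minimal sketch, not ThanoSQL's client API: the helper names `sql_literal` and `build_embed_call`, the PostgreSQL-style named-argument syntax (`name := value`), the table and column names, the model names other than `openai/clip-vit-base-patch32`, the use of `token` for an OpenAI key, and the base URL are all assumptions made for illustration.

```python
import json


def sql_literal(value):
    """Render a value as a single-quoted SQL string literal.

    Dicts are serialized to JSON first, since model_args and
    tokenizer_args are documented as JSON strings.
    """
    if isinstance(value, dict):
        value = json.dumps(value)
    return "'" + str(value).replace("'", "''") + "'"


def build_embed_call(input_expr, model, engine="huggingface", **options):
    """Build a thanosql.embed(...) expression (hypothetical helper).

    input_expr is inserted verbatim, so it can be a column name or an
    already-quoted literal. The `name := value` syntax is an assumption.
    """
    allowed = {"model_args", "tokenizer_args", "token", "base_url"}
    unknown = set(options) - allowed
    if unknown:
        raise ValueError(f"unknown parameters: {sorted(unknown)}")
    # The parameter table states base_url only applies to the openai engine.
    if "base_url" in options and engine != "openai":
        raise ValueError("base_url only applies when engine is 'openai'")
    args = [
        f"engine := {sql_literal(engine)}",
        f"input := {input_expr}",
        f"model := {sql_literal(model)}",
    ]
    args += [f"{name} := {sql_literal(value)}" for name, value in options.items()]
    return "thanosql.embed(" + ", ".join(args) + ")"


# Text column with a Hugging Face model (column and model names assumed).
hf_text = build_embed_call("content", "sentence-transformers/all-MiniLM-L6-v2")

# Image paths with the CLIP model named in the parameter table.
hf_image = build_embed_call("image_path", "openai/clip-vit-base-patch32")

# OpenAI engine; passing the API key via `token` is an assumption.
openai_call = build_embed_call(
    "content", "text-embedding-3-small", engine="openai", token="YOUR_API_KEY"
)

# OpenAI-compatible endpoint at a hypothetical base URL.
openai_custom = build_embed_call(
    "content",
    "text-embedding-3-small",
    engine="openai",
    base_url="http://localhost:8000/v1",
)

# Standalone query: a literal input and no FROM clause.
standalone = (
    "SELECT "
    + build_embed_call(sql_literal("hello world"), "sentence-transformers/all-MiniLM-L6-v2")
    + " AS embedding;"
)

if __name__ == "__main__":
    print(f"SELECT {hf_text} AS embedding FROM documents;")
    print(standalone)
```

The helper escapes values only by doubling single quotes, so for untrusted input a parameterized query through the database driver would be the safer choice.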
When using the `embed` function with the `huggingface` engine, ensure that only models compatible with the Hugging Face `AutoModel` and `AutoTokenizer` classes are used. Verify that the selected model is supported by the Hugging Face library to avoid compatibility issues. Even compatible models might still not work; compatibility and functionality are being actively improved. For more information, refer to the official Hugging Face documentation.
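One way to check the `AutoModel` and `AutoTokenizer` requirement before calling `embed` is to try loading the model locally with the Transformers library. The sketch below is an assumption about how such a pre-flight check could look, not part of ThanoSQL: the function name `can_load_with_auto_classes` and its injectable loader arguments are hypothetical, and a successful load downloads the model files and still does not guarantee that the model works inside ThanoSQL.

```python
def can_load_with_auto_classes(model_name, load_model=None, load_tokenizer=None):
    """Return (ok, reason) after trying to load a model and its tokenizer.

    By default this uses AutoModel.from_pretrained and
    AutoTokenizer.from_pretrained from the `transformers` package.
    Custom loaders can be passed in, which also makes the check easy
    to exercise without network access.
    """
    if load_model is None or load_tokenizer is None:
        # Imported lazily so the function can be used with custom loaders
        # even when `transformers` is not installed.
        from transformers import AutoModel, AutoTokenizer

        load_model = load_model or AutoModel.from_pretrained
        load_tokenizer = load_tokenizer or AutoTokenizer.from_pretrained
    try:
        load_tokenizer(model_name)
        load_model(model_name)
    except Exception as exc:  # loading can fail with OSError, ValueError, etc.
        return False, f"{type(exc).__name__}: {exc}"
    return True, "loaded with AutoModel and AutoTokenizer"


if __name__ == "__main__":
    # Model name taken from the parameter table; this call downloads files.
    ok, reason = can_load_with_auto_classes("openai/clip-vit-base-patch32")
    print(ok, reason)
```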