Using S3
Learn how to use objects from S3 in ThanoSQL functions
Objects from your S3 buckets can be used directly without having to be exported first. You can simply pass their URI as an input to ThanoSQL functions.
Prerequisites
Before you begin, ensure you have the following:
- A basic understanding of SQL and AWS.
- A valid AWS access key ID and secret access key.
- Access to a ThanoSQL Workspace and an S3 object.
Using a Credentials File
Overview
You have to keep a credentials file in the root folder of your workspace. You only need to set this up once, and then you can freely use S3 URIs in ThanoSQL functions just like any other paths or links. As long as you have this file, no modifications are needed in your code and queries.
How to Set Up
Step 1: Prepare Your Environment
- Sign In: Log in to your ThanoSQL account.
- Choose a Workspace: Navigate to the dashboard and select a workspace you want to work in. For more information on workspaces, check out the workspace page.
- Start the Lab: Go to the Lab tab and wait for your Lab to load.
Step 2: Prepare Your Credentials
- Create File: In the root folder of your Lab, create a .aws folder if it does not exist already. Go to the .aws folder and create a file called credentials (without any extension). You can either do this directly from a terminal using a Linux text editor or from the Lab UI by creating a file and then renaming it.
- Provide Credentials: Fill in the credentials file using the following format:
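As a rough sketch of this step, the file can also be generated from Python. This assumes ThanoSQL reads the standard AWS shared credentials layout with a [default] profile, which this page does not confirm. The helper name write_credentials and the key values are hypothetical placeholders.

```python
import configparser
import tempfile
from pathlib import Path


def write_credentials(root: Path, access_key_id: str, secret_access_key: str) -> Path:
    """Write <root>/.aws/credentials in the standard AWS INI layout."""
    aws_dir = root / ".aws"
    aws_dir.mkdir(parents=True, exist_ok=True)  # create .aws if it does not exist

    # interpolation=None keeps any '%' characters in the values literal.
    config = configparser.ConfigParser(interpolation=None)
    # Assumption: the "default" profile is the one that gets read.
    config["default"] = {
        "aws_access_key_id": access_key_id,
        "aws_secret_access_key": secret_access_key,
    }

    path = aws_dir / "credentials"  # note: no file extension
    with path.open("w") as handle:
        config.write(handle)
    return path


# Demo in a throwaway directory; in the Lab, root would be the workspace root folder.
demo_root = Path(tempfile.mkdtemp())
credentials_path = write_credentials(
    demo_root, "YOUR_ACCESS_KEY_ID", "YOUR_SECRET_ACCESS_KEY"
)
print(credentials_path.read_text())
```

The printed file contains a [default] section with one aws_access_key_id line and one aws_secret_access_key line. Replace the placeholders with your own keys and keep the file out of version control.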
Step 3: Test Run
You are all set! To check that the connection between your ThanoSQL Workspace and AWS S3 works, try running one of the examples provided below.
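Before running an example, a local sanity check can rule out a malformed credentials file. The sketch below only parses the file and does not contact AWS. It assumes the standard AWS INI layout with a default profile, and the function name missing_credential_keys is hypothetical.

```python
import configparser
import tempfile
from pathlib import Path

REQUIRED_KEYS = ("aws_access_key_id", "aws_secret_access_key")


def missing_credential_keys(path: Path, profile: str = "default") -> list:
    """Return the required keys that are absent or empty in the given profile."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.read(path)  # a missing file simply yields no sections
    if not parser.has_section(profile):
        return list(REQUIRED_KEYS)
    return [
        key
        for key in REQUIRED_KEYS
        if not parser.get(profile, key, fallback="").strip()
    ]


# Demo on a throwaway file that deliberately lacks the secret key.
demo_file = Path(tempfile.mkdtemp()) / "credentials"
demo_file.write_text("[default]\naws_access_key_id = YOUR_ACCESS_KEY_ID\n")
print(missing_credential_keys(demo_file))  # ['aws_secret_access_key']
```

An empty list means both keys are present. It does not prove that the keys are valid or that they grant access to your bucket, so running one of the examples is still the real test.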
Example Usage
Example 1: Classifying Audio
Here is an example of how to classify an audio file stored in S3, after credentials have been set:
On execution, we get:
Example 2: Embedding a Table of Image Links
S3 URIs can also be stored in tables and used just like other objects and URLs. Suppose we have the following table, which contains a column with a mix of web URLs and S3 URIs:
| ImageId | ResourceLink |
|---|---|
| 1 | https://unsplash.com/photos/bygTaBey1Xk |
| 2 | s3://your-bucket/your-first-image-file.jpg |
| 3 | https://unsplash.com/photos/gXSFnk2a9V4 |
| 4 | s3://your-bucket/your-second-image-file.png |
| 5 | https://unsplash.com/photos/grg6-DNJuaU |
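To make the mix of link types concrete, here is a small sketch that separates the S3 URIs in this column from the web URLs by looking at the URI scheme. ThanoSQL resolves the links itself, so none of this is required. The helper name split_s3_uri is hypothetical, and the bucket and file names are the placeholders from the table.

```python
from urllib.parse import urlparse

# The ResourceLink column from the table above.
links = [
    "https://unsplash.com/photos/bygTaBey1Xk",
    "s3://your-bucket/your-first-image-file.jpg",
    "https://unsplash.com/photos/gXSFnk2a9V4",
    "s3://your-bucket/your-second-image-file.png",
    "https://unsplash.com/photos/grg6-DNJuaU",
]


def split_s3_uri(uri: str):
    """Return (bucket, key) for an s3:// URI, or None for any other link."""
    parsed = urlparse(uri)
    if parsed.scheme != "s3":
        return None
    # In s3://bucket/key, the bucket is the authority and the key is the path.
    return parsed.netloc, parsed.path.lstrip("/")


for link in links:
    location = split_s3_uri(link)
    if location is None:
        print(f"web URL: {link}")
    else:
        bucket, key = location
        print(f"S3 object: bucket={bucket} key={key}")
```

Rows 2 and 4 are reported as S3 objects in your-bucket, and the remaining rows as web URLs.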
If we want to calculate the embeddings of the images like we do in Use Case 2, we can use the following query:
On execution, we get: