Attach cloud storage

Instructions on how to attach cloud storage using the UI

In CVAT, you can use Amazon S3, Azure Blob Storage, Backblaze B2, and Google Cloud Storage to import and export image datasets for your tasks.

Amazon S3

Create a bucket

To create a bucket, do the following:

  1. Create an AWS account.

  2. Go to the Amazon S3 console, and select Create bucket.

    Amazon S3 interface with highlighted “Create bucket” button

  3. Specify the name and region of the bucket. You can also copy the settings of another bucket by selecting Choose bucket.

  4. Enable Block all public access. For access, you will use an access key ID and a secret access key.

  5. Select Create bucket.

A new bucket will appear on the list of buckets.

Upload data

You need to upload data for annotation and the manifest.jsonl file.

  1. Prepare data. For more information, refer to Prepare the dataset.

  2. Open the bucket and select Upload.

    Amazon S3 interface with highlighted “Upload”

  3. Drag the manifest file and image folder onto the page and select Upload:

Uploading data to Amazon S3

Access permissions

Authenticated access

To add access permissions, do the following:

  1. Go to IAM and select Add users.

  2. Set User name and enable Access key - programmatic access.

    Amazon S3 interface with highlighted “User name” and “Access key - programmatic access” parameters

  3. Select Next: Permissions.

  4. Select Create group and enter the group name.

  5. Use search to find and select:

    • For read-only access: AmazonS3ReadOnlyAccess.
    • For full access: AmazonS3FullAccess.

    Amazon S3 interface with creating user group step

  6. (Optional) Add tags for the user and go to the next page.

  7. Save Access key ID and Secret access key.

Amazon S3 interface with saving access credentials step

For more information, consult Creating an IAM user in your AWS account.

Anonymous access

To grant public access to the bucket, consult Configuring block public access settings for your S3 buckets.

Attach Amazon S3 storage

To attach storage, do the following:

  1. Log in to CVAT and open your bucket page in a separate tab.
  2. In CVAT, select Cloud storages on the top menu, and then select + on the page that opens.

Fill in the following fields:

CVAT field: Amazon S3 value
Display name: Preferred display name for your storage.
Description: (Optional) Add a description of the storage.
Provider: From the drop-down list, select Amazon S3.
Bucket name: Name of the bucket.
Authentication type: Depends on the bucket setup:
  • Key ID and secret access key pair: available in IAM.
  • Anonymous access: for anonymous access. Public access to the bucket must be enabled.
Region: (Optional) Choose a region from the list or add a new one. For more information, consult Available locations.
Prefix: (Optional) Prefix is used to filter bucket content. By setting a default prefix, you ensure that only data from a specific folder in the cloud is used in CVAT. This will affect which files you see when creating a task with cloud data.
Manifests: (Optional) Select + Add manifest and enter the name of the manifest file with an extension. For example: manifest.jsonl.

    After filling in all the fields, select Submit.
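To illustrate how a prefix narrows the files you see, the following minimal sketch filters a list of object keys by prefix. It only demonstrates the idea and is not CVAT's implementation; the `filter_by_prefix` helper and the key names are hypothetical.

```python
def filter_by_prefix(keys, prefix=""):
    """Return only the object keys that start with the given prefix."""
    return [key for key in keys if key.startswith(prefix)]


# Hypothetical bucket content.
bucket_keys = [
    "manifest.jsonl",
    "cats/manifest.jsonl",
    "cats/image_001.jpg",
    "dogs/image_002.jpg",
]

# With the prefix "cats/", only objects from that folder remain.
visible = filter_by_prefix(bucket_keys, prefix="cats/")
print(visible)
```

An empty prefix leaves the list unchanged, which corresponds to leaving the optional Prefix field blank.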

    Amazon S3 manifest file

    To prepare the manifest file, do the following:

    1. Set up the AWS CLI. You will use it together with the script that prepares the manifest file.

    2. Perform the installation by following the aws-shell manual.
      To configure credentials, run aws configure.
      You will need to enter the Access Key ID and Secret Access Key as well as the region:

      aws configure
      Access Key ID: <your Access Key ID>
      Secret Access Key: <your Secret Access Key>
      
    3. Copy the content of the bucket to a folder on your computer:

      aws s3 cp <s3://bucket-name> <yourfolder> --recursive
      
    4. After copying the files, create a manifest file as described in the prepare manifest file section:

      python <cvat repository>/utils/dataset_manifest/create.py --output-dir <yourfolder> <yourfolder>
      
    5. When the manifest file is ready, upload it to the S3 bucket:

      • If the user was created with read and write permissions, run:

        aws s3 cp <yourfolder>/manifest.jsonl <s3://bucket-name>
        
      • If the user has read-only permissions, upload the file through the browser: select Upload, drag the manifest file onto the page, and select Upload.

        Amazon S3 interface with highlighted “Upload”
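Steps 3 to 5 above can be scripted. The following is a minimal sketch that only builds the three commands as argument lists and prints them. The `manifest_workflow` and `run_workflow` helpers and the placeholder values (`my-bucket`, `data`, `cvat`) are assumptions for illustration; the commands themselves are the ones shown in the steps.

```python
import shlex
import subprocess


def manifest_workflow(bucket, folder, cvat_repo):
    """Build the commands from steps 3 to 5 as argument lists."""
    create_script = f"{cvat_repo}/utils/dataset_manifest/create.py"
    return [
        # Step 3: copy the bucket content to a local folder.
        ["aws", "s3", "cp", f"s3://{bucket}", folder, "--recursive"],
        # Step 4: create manifest.jsonl in the same folder.
        ["python", create_script, "--output-dir", folder, folder],
        # Step 5: upload the manifest (needs write permissions).
        ["aws", "s3", "cp", f"{folder}/manifest.jsonl", f"s3://{bucket}"],
    ]


def run_workflow(commands):
    """Run each command in order and stop at the first failure."""
    for command in commands:
        subprocess.run(command, check=True)


# Placeholder values; replace them with your bucket, folder, and repository path.
commands = manifest_workflow("my-bucket", "data", "cvat")
for command in commands:
    print(shlex.join(command))
```

Calling `run_workflow(commands)` would execute the commands, provided the AWS CLI is installed and configured as described in step 2.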

    Video tutorial: Add Amazon S3 as Cloud Storage in CVAT

    Backblaze B2

    Backblaze B2 is an S3-compatible cloud storage service. It can be used in CVAT by selecting Amazon S3 as the provider and specifying the Backblaze B2 endpoint (for example, https://s3.us-west-004.backblazeb2.com).

    Create a bucket

    To create a B2 bucket, do the following:

    1. Create a Backblaze account or log into an existing one.
    2. In the Backblaze console, go to B2 Cloud Storage > Buckets.
    3. Select Create a Bucket.
    4. Configure your bucket:
      • Bucket Unique Name: Enter a globally unique name for your bucket.
      • Files in Bucket: Select Private for secure access or Public for anonymous access.
      • Default Encryption: (Optional) Enable server-side encryption for added security.
      • Object Lock: (Optional) Enable if you need compliance features.
    5. Select Create a Bucket.

    The new bucket will appear in your buckets list.

    Upload data

    You need to upload data for annotation and optionally the manifest.jsonl file.

    1. Prepare data. For more information, refer to Prepare the dataset.
    2. Open the bucket and select Upload/Download.
    3. Drag and drop files or folders, or select Browse files to upload your data.

    Alternatively, you can use the Backblaze CLI or any S3-compatible tool like the AWS CLI with B2 endpoints.
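As a sketch of the AWS CLI route, the snippet below builds an upload command that points the CLI at a B2 endpoint through the `--endpoint-url` option. It assumes the AWS CLI is already configured with a B2 application key (keyID as the access key ID, applicationKey as the secret access key). The `b2_upload_command` helper, the folder, and the bucket name are hypothetical.

```python
import shlex


def b2_upload_command(folder, bucket, endpoint_url):
    """Build an AWS CLI command that uploads a folder to a B2 bucket."""
    return [
        "aws", "s3", "cp", folder, f"s3://{bucket}",
        "--recursive",
        # Point the AWS CLI at the B2 S3-compatible endpoint.
        "--endpoint-url", endpoint_url,
    ]


# Placeholder folder and bucket; the endpoint is the example used in this guide.
command = b2_upload_command(
    "data", "my-b2-bucket", "https://s3.us-west-004.backblazeb2.com"
)
print(shlex.join(command))
```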

    Access permissions

    To access your B2 bucket from CVAT, you need to create Application Keys:

    1. In the Backblaze console, go to App Keys.
    2. Select Add a New Application Key.
    3. Configure the key:
      • Name of Key: Enter a descriptive name (e.g., “CVAT Access”).
      • Allow access to Bucket(s): Select the specific bucket you created, or choose All for access to all buckets.
      • Type of Access: Select Read and Write for full access, or Read Only if you only need to import data.
      • Allow List All Bucket Names: Enable if you want to list all buckets.
      • File name prefix: (Optional) Restrict access to specific file prefixes.
      • Duration: (Optional) Set an expiration time for the key.
    4. Select Create New Key.
    5. Important: Save the keyID and applicationKey immediately. The applicationKey is only shown once and cannot be retrieved later.

    For more information, consult B2 Application Keys.

    Attach Backblaze B2 storage

    To attach B2 storage to CVAT, do the following:

    1. Log into CVAT.
    2. In CVAT, on the top menu select Cloud storages > on the opened page select +.

    Fill in the following fields:

    CVAT field: Backblaze B2 value
    Display name: Preferred display name for your storage.
    Description: (Optional) Add a description of the storage.
    Provider: From the drop-down list, select Amazon S3 (Backblaze B2 is S3-compatible).
    Bucket name: Name of your B2 bucket.
    Authentication type: Select Key ID and secret access key pair.
    Access key ID: Enter the keyID from your B2 Application Key.
    Secret access key: Enter the applicationKey from your B2 Application Key.
    Endpoint URL: Required for B2. Enter your B2 S3 endpoint URL (for example, https://s3.us-west-004.backblazeb2.com). You can find the endpoint on your bucket details page.
    Prefix: (Optional) Use to limit CVAT to a specific folder within the bucket.
    Manifests: (Optional) Select + Add manifest and specify a manifest file name such as manifest.jsonl.

    After filling in all the fields, select Submit.
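Before submitting the form, it can help to check the values you are about to enter. The following is a minimal sketch of such a check. The dictionary keys and the `check_b2_settings` helper are hypothetical names, not part of CVAT, and the HTTPS check is an assumption based on the example endpoint above.

```python
from urllib.parse import urlparse

# Fields that the B2 setup in this guide treats as required.
REQUIRED_FIELDS = ("bucket_name", "access_key_id", "secret_access_key", "endpoint_url")


def check_b2_settings(settings):
    """Return a list of problems found in the B2 form values."""
    problems = []
    for field in REQUIRED_FIELDS:
        if not settings.get(field):
            problems.append(f"{field} is required")
    endpoint_url = settings.get("endpoint_url")
    # Assumption: the endpoint is expected to use HTTPS, like the example URL.
    if endpoint_url and urlparse(endpoint_url).scheme != "https":
        problems.append("endpoint_url should start with https://")
    return problems


settings = {
    "bucket_name": "my-b2-bucket",
    "access_key_id": "<keyID>",
    "secret_access_key": "<applicationKey>",
    "endpoint_url": "https://s3.us-west-004.backblazeb2.com",
}
print(check_b2_settings(settings))
```

An empty list means that no problems were found in the placeholder values.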

    Google Cloud Storage

    Create a bucket

    To create a bucket, do the following:

    1. Create a Google account and log in to it.

    2. On the Google Cloud page, select Start Free, then enter the required data and accept the terms of service.

    3. Create a Bucket with the following parameters:

      • Name your bucket: Unique name.
      • Choose where to store your data: Set up a location nearest to you.
      • Choose a storage class for your data: Set a default class > Standard.
      • Choose how to control access to objects: Enforce public access prevention on this bucket > Uniform (default).
      • How to protect data: None

    You will be forwarded to the bucket.

    Upload data

    You need to upload data for annotation and the manifest.jsonl file.

    1. Prepare data. For more information, consult prepare the dataset.
    2. Open the bucket and from the top menu select Upload files or Upload folder (depends on how your files are organized).

    Access permissions

    To access Google Cloud Storage, get a Project ID from the cloud resource manager page.

    Project ID shown in Google Cloud Storage interface

    Then follow the instructions below for your preferred type of access.

    Authenticated access

    For authenticated access you need to create a service account and key file.

    To create a service account:

    1. On the Google Cloud platform, go to IAM & Admin > Service Accounts and select +Create Service Account.
    2. Enter your account name and select Create And Continue.
    3. Select a role, for example Basic > Viewer, and select Continue.
    4. (Optional) Give access rights to the service account.
    5. Select Done.

    Creating service account shown in Google Cloud Storage interface

    To create a key:

    1. Go to IAM & Admin > Service Accounts, select the account name, and open the Keys tab.
    2. Select Add key, and then select Create new key > JSON.
    3. Select Create. The key file will be downloaded automatically.

    Google Cloud Storage interface with highlighted steps to create a key

    For more information about keys, consult Learn more about creating keys.
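The downloaded key is a JSON file. As a quick sanity check before uploading it to CVAT, the sketch below loads a key file, verifies a few fields, and returns the project ID. The field names (`type`, `project_id`, `private_key`, `client_email`) are what a service account key file is expected to contain, to the best of my knowledge. The `read_key_file` helper and the fake key used in the demo are hypothetical.

```python
import json
import os
import tempfile

# Fields that a service account key file is expected to contain.
EXPECTED_FIELDS = ("type", "project_id", "private_key", "client_email")


def read_key_file(path):
    """Load a service account key file and return its project ID."""
    with open(path, encoding="utf-8") as key_file:
        key = json.load(key_file)
    missing = [field for field in EXPECTED_FIELDS if field not in key]
    if missing:
        raise ValueError("key file is missing fields: " + ", ".join(missing))
    if key["type"] != "service_account":
        raise ValueError("unexpected key type: " + str(key["type"]))
    return key["project_id"]


# Demo with a fake key; a real file is downloaded from the Google Cloud console.
fake_key = {
    "type": "service_account",
    "project_id": "my-project",
    "private_key": "<private key>",
    "client_email": "cvat@my-project.iam.gserviceaccount.com",
}
with tempfile.TemporaryDirectory() as folder:
    key_path = os.path.join(folder, "key.json")
    with open(key_path, "w", encoding="utf-8") as key_file:
        json.dump(fake_key, key_file)
    project_id = read_key_file(key_path)
print(project_id)
```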

    Anonymous access

    To configure anonymous access:

    1. Open the bucket and go to the Permissions tab.
    2. Click + Grant access to add new principals.
    3. In the New principals field, specify allUsers, and then select the role Cloud Storage Legacy > Storage Legacy Bucket Reader.
    4. Select Save.

    Google Cloud Storage interface with anonymous access configuration

    Now you can attach the Google Cloud Storage bucket to CVAT.

    Attach Google Cloud Storage

    To attach storage, do the following:

    1. Log in to CVAT and open your bucket page in a separate tab.
    2. In CVAT, select Cloud storages on the top menu, and then select + on the page that opens.

    Fill in the following fields:

    CVAT field: Google Cloud Storage value
    Display name: Preferred display name for your storage.
    Description: (Optional) Add a description of the storage.
    Provider: From the drop-down list, select Google Cloud Storage.
    Bucket name: Name of the bucket. You can find it on the storage browser page.
    Authentication type: Depends on the bucket setup:
      • Authenticated access: Select Key file and upload the key file from your computer.
        Advanced: For a self-hosted solution, if no key file is attached, the GOOGLE_APPLICATION_CREDENTIALS environment variable specified for the environment is used. For more information, consult Authenticate to Cloud services using client libraries.
      • Anonymous access: for anonymous access. Public access to the bucket must be enabled.
    Prefix: (Optional) Used to filter data from the bucket. By setting a default prefix, you ensure that only data from a specific folder in the cloud is used in CVAT. This will affect which files you see when creating a task with cloud data.
    Project ID: Project ID. For more information, consult projects page and cloud resource manager page.
      Note: The project name does not match the project ID.
    Location: (Optional) Choose a region from the list or add a new one. For more information, consult Available locations.
    Manifests: (Optional) Select + Add manifest and enter the name of the manifest file with an extension. For example: manifest.jsonl.

    After filling in all the fields, select Submit.
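The key file fallback described for self-hosted solutions can be pictured as a simple order of precedence: an attached key file first, then the GOOGLE_APPLICATION_CREDENTIALS environment variable. The sketch below only illustrates that order and is not CVAT's code; the `resolve_key_file` helper and the file paths are hypothetical.

```python
import os


def resolve_key_file(uploaded_key_file=None, environ=None):
    """Pick the key file: the attached one first, then the environment variable."""
    environ = os.environ if environ is None else environ
    if uploaded_key_file:
        return uploaded_key_file
    # Returns None when the variable is not set.
    return environ.get("GOOGLE_APPLICATION_CREDENTIALS")


# Hypothetical environment of a self-hosted deployment.
env = {"GOOGLE_APPLICATION_CREDENTIALS": "/secrets/key.json"}

attached = resolve_key_file("uploaded.json", env)
fallback = resolve_key_file(None, env)
nothing = resolve_key_file(None, {})
print(attached, fallback, nothing)
```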

    Video tutorial: Add Google Cloud Storage as Cloud Storage in CVAT

    Microsoft Azure Blob Storage

    Create a bucket

    To create a bucket, do the following:

    1. Create a Microsoft Azure account and log in to it.

    2. Go to Azure portal, hover over the resource, and in the pop-up window select Create.

      Microsoft Azure interface with highlighted “Create” button

    3. Enter a name for the group and select Review + create, check the entered data and select Create.

    4. Go to the resource groups page, navigate to the group that you created and select Create resources.

    5. On the marketplace page, use search to find Storage account.

      Microsoft Azure interface with highlighted “Storage account” button

    6. Select Storage account and on the next page select Create.

    7. On the Basics tab, fill in the following fields:

      • Storage account name: Enter a name. You will use it to access the container from CVAT.
      • Select a region closest to you.
      • Select Performance > Standard.
      • Select Locally-redundant storage (LRS).
      • Select Next: Advanced >.

      Microsoft Azure interface with basic settings for storage account

    8. On the Advanced page, fill in the following fields:

      • (Optional) Disable Allow enabling public access on containers to prohibit anonymous access to the container.
      • Select Next > Networking.

      Microsoft Azure interface with advanced settings for storage account

    9. On the Networking tab, fill in the following fields:

      • If you want to allow public access, enable Public access from all networks.

      • Select Next > Data protection.

        You do not need to change anything in the other tabs unless you need a specific setup.

    10. Select Review and wait for the data to load.

    11. Select Create. Deployment will start.

    12. After deployment is over, select Go to resource.

    Microsoft Azure interface with highlighted “Go to resource” button

    Create a container

    To create a container, do the following:

    1. Go to the containers section and on the top menu select +Container.

      Microsoft Azure interface with highlighted “+Container” button

    2. Enter the name of the container.

    3. (Optional) In the Public access level drop-down, select the type of access.
      Note: this field will be inactive if you disabled Allow enabling public access on containers.

    4. Select Create.

    Upload data

    You need to upload data for annotation and the manifest.jsonl file.

    1. Prepare data. For more information, refer on how to prepare the dataset.

    2. Go to the container and select Upload.

    3. Select Browse for files and select images.

    4. Select Upload.

    Microsoft Azure interface with highlighted “Upload” button and upload settings

    SAS token and connection string

    Use the SAS token or connection string to grant secure access to the container.

    To configure the credentials:

    1. Go to Home > Resource groups > Your resource name > Your storage account.
    2. On the left menu, select Shared access signature.
    3. Change the following fields:
      • Allowed services: Enable Blob. Disable all other fields.
      • Allowed resource types: Enable Container and Object. Disable all other fields.
      • Allowed permissions: Enable Read, Write, and List. Disable all other fields.
      • Start and expiry date: Set up start and expiry dates.
      • Allowed protocols: Select HTTPS and HTTP.
      • Leave all other fields with default parameters.
    4. Select Generate SAS and connection string and copy SAS token or Connection string.

    Microsoft Azure interface with highlighted “SAS token” field
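A SAS token is a query string, so you can inspect it to confirm that it matches the settings chosen above. The following minimal sketch parses a token with the standard library. The parameter names (`ss`, `srt`, `sp`, `st`, `se`, `spr`) follow the account SAS format as I understand it; the `describe_sas_token` helper and the sample token with its placeholder signature are hypothetical.

```python
from urllib.parse import parse_qs


def describe_sas_token(token):
    """Return the main fields of an account SAS token as a dictionary."""
    params = parse_qs(token.lstrip("?"))

    def first(name):
        # parse_qs returns a list of values for each parameter.
        return params.get(name, [""])[0]

    return {
        "services": first("ss"),
        "resource_types": first("srt"),
        "permissions": first("sp"),
        "start": first("st"),
        "expiry": first("se"),
        "protocols": first("spr"),
    }


# Made-up token: Blob service, Container and Object, Read/Write/List.
sample_token = (
    "?sv=2022-11-02&ss=b&srt=co&sp=rwl"
    "&se=2030-01-01T00:00:00Z&st=2029-01-01T00:00:00Z"
    "&spr=https,http&sig=<signature>"
)
sas_fields = describe_sas_token(sample_token)
print(sas_fields)
```

In this sample, `b` stands for the Blob service, `co` for the Container and Object resource types, and `rwl` for the Read, Write, and List permissions.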

    Personal use

    For personal use, you can use the Access Key from your storage account in the CVAT SAS Token field.

    To get the Access Key:

    1. In the Azure Portal, go to Security + networking > Access Keys.
    2. Select Show and copy the key.

    Microsoft Azure interface with highlighted elements to get an access key

    Attach Azure Blob Storage

    To attach storage, do the following:

    1. Log in to CVAT and open your container page in a separate tab.
    2. In CVAT, select Cloud storages on the top menu, and then select + on the page that opens.

    Fill in the following fields:

    CVAT field: Azure value
    Display name: Preferred display name for your storage.
    Description: (Optional) Add a description of the storage.
    Provider: From the drop-down list, select Azure Blob Storage.
    Container name: Name of the cloud storage container.
    Authentication type: Depends on the container setup:
      • Account name and SAS token:
        • Account name: Enter the storage account name.
        • SAS token: Located in the Shared access signature section of your storage account.
      • Anonymous access: for anonymous access. Allow enabling public access on containers must be enabled.
    Prefix: (Optional) Used to filter data from the container. By setting a default prefix, you ensure that only data from a specific folder in the cloud is used in CVAT. This will affect which files you see when creating a task with cloud data.
    Manifests: (Optional) Select + Add manifest and enter the name of the manifest file with an extension. For example: manifest.jsonl.

    After filling in all the fields, select Submit.

    Video tutorial: Add Microsoft Azure Blob Storage as Cloud Storage in CVAT

    Prepare the dataset

    This example uses The Oxford-IIIT Pet Dataset:

    1. Download the archive with images.
    2. Unpack the archive into the prepared folder.
    3. Create a manifest. For more information, consult Dataset manifest:
    python <cvat repository>/utils/dataset_manifest/create.py --output-dir <your_folder> <your_folder>
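Before uploading, you may want to confirm that every image listed in the manifest is present in the folder. The sketch below does that under one assumption: each image line of manifest.jsonl is a JSON object with `name` and `extension` keys, and header lines (such as the version and type lines) lack them. The helper names and the demo files are hypothetical.

```python
import json
import os
import tempfile


def manifest_image_names(manifest_path):
    """Return the file names listed in a manifest, skipping header lines."""
    names = []
    with open(manifest_path, encoding="utf-8") as manifest:
        for line in manifest:
            line = line.strip()
            if not line:
                continue
            entry = json.loads(line)
            # Assumption: image entries carry "name" and "extension" keys.
            if "name" in entry and "extension" in entry:
                names.append(entry["name"] + entry["extension"])
    return names


def missing_files(folder, manifest_name="manifest.jsonl"):
    """List manifest entries that have no matching file in the folder."""
    names = manifest_image_names(os.path.join(folder, manifest_name))
    return [name for name in names if not os.path.isfile(os.path.join(folder, name))]


# Demo: a temporary folder with one image and a manifest that lists two.
with tempfile.TemporaryDirectory() as folder:
    open(os.path.join(folder, "cat_1.jpg"), "wb").close()
    lines = [
        {"version": "1.1"},
        {"type": "images"},
        {"name": "cat_1", "extension": ".jpg", "width": 10, "height": 10},
        {"name": "cat_2", "extension": ".jpg", "width": 10, "height": 10},
    ]
    with open(os.path.join(folder, "manifest.jsonl"), "w", encoding="utf-8") as manifest:
        manifest.write("\n".join(json.dumps(line) for line in lines))
    missing = missing_files(folder)
print(missing)
```

In the demo, `cat_2.jpg` is reported because the manifest lists it but the file does not exist.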