Amazon S3: How to Connect and Collect
  • 26 Jun 2024

In this article:

  • Amazon S3 Overview

  • Amazon S3 Requirements

  • How to Connect and Collect Using Amazon S3

Amazon S3 Overview

Amazon S3 is a cloud storage service provided by Amazon Web Services. Companies can store data in S3 from multiple services. Onna connects directly with S3's API. All files available through the S3 API and linked to the specific bucket are synced, including data stored in the bucket from other sources, historical information, and related metadata. You can use an existing bucket from within the organization's S3 or create a new one specifically for Onna.

Because customers can upload any type of data into an S3 bucket, Onna's S3 integration is meant to work as a bridge to applications that Onna does not directly integrate with.

Connector Features

Authorized Connection Required? No

Is identity mapping supported? No

Audit logs available? Yes

Admin Access? No

Supports a full archive? No

Custodian based collections? No

Preserve in place with ILH? No

Resumable sync supported? No

Supports Onna preservation? No

Syncs future users automatically? No

Sync modes supported:

  • One-time sync

  • Auto-sync

Is file versioning supported? No

Types of Data Collected

Metadata Collected

  • File Title

  • File creation

  • File last modified

  • Extension

  • Size

  • MD5 hash

  • Creators

  • File URL in source

  • S3 Bucket name
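
As an illustration of the fields above, the following sketch models one collected file as a record. The class name, field names, and the `s3://` form of the source URL are assumptions made for this example, not Onna's actual schema; only the list of fields comes from this article.

```python
import hashlib
from dataclasses import dataclass
from datetime import datetime, timezone
from pathlib import PurePosixPath
from typing import List


@dataclass
class S3FileMetadata:
    """Hypothetical record mirroring the metadata fields listed above."""
    title: str
    created: datetime
    last_modified: datetime
    extension: str
    size: int
    md5: str
    creators: List[str]
    source_url: str
    bucket: str


def describe_object(bucket: str, key: str, body: bytes, created: datetime,
                    last_modified: datetime, creators: List[str]) -> S3FileMetadata:
    """Derive the listed metadata from an object's key and content."""
    path = PurePosixPath(key)
    return S3FileMetadata(
        title=path.name,                      # file name without the key prefix
        created=created,
        last_modified=last_modified,
        extension=path.suffix.lstrip("."),    # "pdf", not ".pdf"
        size=len(body),                       # size in bytes
        md5=hashlib.md5(body).hexdigest(),    # hex digest of the content
        creators=list(creators),
        source_url=f"s3://{bucket}/{key}",    # assumed URL form
        bucket=bucket,
    )


if __name__ == "__main__":
    now = datetime(2024, 6, 26, tzinfo=timezone.utc)
    print(describe_object("example-bucket", "reports/summary.pdf",
                          b"hello", now, now, ["example-user"]))
```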

Amazon S3 Considerations

  • You must select a sync start date when setting up an Amazon S3 sync.

Amazon S3 Requirements

To create a bucket or access information on existing buckets in Amazon S3, you must have access to the AWS Management Console. Learn more about creating a bucket by visiting Amazon's bucket documentation.

To integrate Amazon S3 in Onna, you will need:

  • Bucket name

  • Access Key ID

  • Secret Access Key

  • AWS region

For security purposes, we recommend working with your AWS admin and creating a role with access to specific buckets. For more information on how to do that, visit Amazon's IAM Role documentation.

Note: The permissions below are required for each Amazon S3 bucket you want to collect from.

  • s3:ListAllMyBuckets

  • s3:ListBucket

  • s3:GetBucketLocation

  • s3:GetObject
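
The four permissions above can also be written as an IAM policy document. The sketch below builds one in Python for a list of bucket names. The function name and the example bucket names are hypothetical, and the ARNs assume the standard `aws` partition. `s3:ListAllMyBuckets` applies to the whole account, so it is given a wildcard resource. The bucket-level actions are scoped to the bucket ARNs, and `s3:GetObject` is scoped to the objects inside them.

```python
import json
from typing import Dict, List


def build_read_only_policy(bucket_names: List[str]) -> Dict:
    """Return an IAM policy granting the four permissions listed above."""
    bucket_arns = [f"arn:aws:s3:::{name}" for name in bucket_names]
    object_arns = [f"{arn}/*" for arn in bucket_arns]  # every object in each bucket
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                # Listing buckets is account-wide and cannot be scoped to one bucket.
                "Sid": "ListAllBuckets",
                "Effect": "Allow",
                "Action": ["s3:ListAllMyBuckets"],
                "Resource": "*",
            },
            {
                # Bucket-level actions target the bucket ARN itself.
                "Sid": "InspectBuckets",
                "Effect": "Allow",
                "Action": ["s3:ListBucket", "s3:GetBucketLocation"],
                "Resource": bucket_arns,
            },
            {
                # Object-level actions target the objects under the bucket ARN.
                "Sid": "ReadObjects",
                "Effect": "Allow",
                "Action": ["s3:GetObject"],
                "Resource": object_arns,
            },
        ],
    }


if __name__ == "__main__":
    # "example-bucket" is a placeholder; use your own bucket names.
    print(json.dumps(build_read_only_policy(["example-bucket"]), indent=2))
```

The printed JSON can be pasted into the JSON tab of the IAM policy editor as an alternative to selecting each action by hand.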

Create New AWS User and Grant Necessary Permissions

To create a new IAM user in AWS and grant the necessary permissions, follow the steps below:

Step 1

Sign into the AWS Management Console and open the Amazon IAM console at https://console.aws.amazon.com/iam/.

Then, navigate to the ‘Add user’ page. On this page, enter the user name (a), grant Programmatic access (b), and click the ‘Next: permissions’ button in the bottom right corner (c).

Step 2

On the ‘Set permissions’ screen, click the ‘Create policy’ button in the middle of the page.

Step 3

Select S3 as the service, then grant the following access levels under ‘List’:

  • ListAllMyBuckets (a)

  • ListBucket (b)

Next, grant the following access levels under ‘Read’:

  • GetBucketLocation (c)

  • GetObject (d)

Step 4

Under ‘Resources’, select ‘Specific’ (a) and click ‘Add ARN’ for ‘Bucket’ (b). Here you can manually provide the list of ARNs for each bucket.

Then, click ‘Add ARN’ under ‘Object’ (c).

Step 5

Enter the ARN of each bucket you want to collect from. In the object field, enter an asterisk (*) to grant access to all objects in the bucket.

Then, provide a name and description for the policy, and click ‘Create policy’.

Step 6

Navigate back to the ‘Users’ tab in the Amazon IAM console and click the name of the user that should receive the policy. Attach the policy to that user by clicking ‘Add Permissions’. Once the policy has been attached, it is listed under ‘Permissions policies’ for the user.
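
Before entering the new user's credentials in Onna, it can help to confirm that they can actually use each permission. The following is a minimal sketch of such a check, assuming the third-party boto3 library is installed and that credentials are supplied through boto3's standard environment variables or config files. The function name and the bucket and region values are placeholders. If a permission is missing, the corresponding call raises an error rather than returning.

```python
from typing import Dict


def check_bucket_access(s3_client, bucket: str, expected_region: str) -> Dict[str, bool]:
    """Exercise the List and Read permissions against one bucket.

    `s3_client` is expected to behave like a boto3 S3 client.
    """
    results = {}

    # s3:ListAllMyBuckets
    names = [b["Name"] for b in s3_client.list_buckets().get("Buckets", [])]
    results["bucket_visible"] = bucket in names

    # s3:GetBucketLocation (S3 reports us-east-1 as an empty LocationConstraint)
    location = s3_client.get_bucket_location(Bucket=bucket).get("LocationConstraint")
    results["region_matches"] = (location or "us-east-1") == expected_region

    # s3:ListBucket
    listing = s3_client.list_objects_v2(Bucket=bucket, MaxKeys=1)
    results["can_list"] = True

    # s3:GetObject, only testable when the bucket holds at least one object
    contents = listing.get("Contents", [])
    if contents:
        response = s3_client.get_object(Bucket=bucket, Key=contents[0]["Key"])
        response["Body"].close()  # close the stream without downloading the body
        results["can_read"] = True

    return results


if __name__ == "__main__":
    import boto3  # third-party dependency, imported here so the helper stays testable

    client = boto3.client("s3", region_name="us-east-1")  # placeholder region
    print(check_bucket_access(client, "example-bucket", "us-east-1"))
```

A `region_matches` value of False suggests the region you plan to enter in Onna differs from the region the bucket is actually in.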

How to Connect and Collect Using Amazon S3

To set up a new Amazon S3 collection, follow the steps below:

Step 1

Click on ‘Workspaces’ in the main menu (a), then click on the workspace where you’d like to add a new sync (b).

Step 2

Click on the ‘+’ icon in the upper right corner to add a new source.

Step 3

Select the Amazon S3 connector from your list of available connectors.

Step 4

On the next screen you’ll configure your sync.

  1. In the name field, add a name for your new sync.

  2. In the Access ID field, enter the Access Key ID provided by your AWS admin.

  3. In the Access secret field, enter the Secret Access Key provided by your AWS admin.

  4. In the Bucket name field, add the names of the buckets you’d like to collect from. You can add multiple buckets (one per line). Please note: The names must exactly match the names found in S3 in order to collect data from those buckets.

  5. In the Region field, enter the exact region the S3 bucket is located in.

  6. In the Synchronization mode field, select either one-time sync or auto-sync. Learn more about Onna Sync Modes.

  7. Select your sync start date (required).

  8. Click the blue ‘Sync’ button.
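
The form fields above can be checked before you click ‘Sync’. The sketch below is a hypothetical pre-flight check, not part of Onna: the class, function names, and mode strings are assumptions made for this example. It splits the Bucket name field into one name per line and flags empty required fields, a missing start date, and bucket names that do not fit a simplified form of the S3 naming rules (3 to 63 characters; lowercase letters, digits, dots, and hyphens; starting and ending with a letter or digit).

```python
import re
from dataclasses import dataclass
from datetime import date
from typing import List, Optional

# Simplified S3 bucket naming rule; legacy buckets may not follow it.
BUCKET_NAME = re.compile(r"^[a-z0-9][a-z0-9.-]{1,61}[a-z0-9]$")
SYNC_MODES = {"one-time", "auto"}  # assumed labels for the two sync modes


@dataclass
class S3SyncSettings:
    """Hypothetical container for the values entered in the sync form."""
    name: str
    access_id: str
    access_secret: str
    bucket_field: str          # raw text of the Bucket name field
    region: str
    mode: str
    start_date: Optional[date]


def parse_bucket_field(text: str) -> List[str]:
    """Split the multi-line bucket field into names, one per line."""
    return [line.strip() for line in text.splitlines() if line.strip()]


def validate(settings: S3SyncSettings) -> List[str]:
    """Return a list of problems; an empty list means the form looks complete."""
    problems = []
    required = [("name", settings.name), ("Access ID", settings.access_id),
                ("Access secret", settings.access_secret), ("region", settings.region)]
    for label, value in required:
        if not value.strip():
            problems.append(f"{label} is required")
    buckets = parse_bucket_field(settings.bucket_field)
    if not buckets:
        problems.append("at least one bucket name is required")
    for bucket in buckets:
        if not BUCKET_NAME.match(bucket):
            problems.append(f"bucket name looks invalid: {bucket!r}")
    if settings.mode not in SYNC_MODES:
        problems.append(f"unknown sync mode: {settings.mode!r}")
    if settings.start_date is None:
        problems.append("sync start date is required")
    return problems


if __name__ == "__main__":
    settings = S3SyncSettings("Finance archive", "EXAMPLE_ID", "EXAMPLE_SECRET",
                              "logs-bucket\nexports.bucket", "us-east-1",
                              "auto", date(2024, 1, 1))
    print(validate(settings))
```

Names are trimmed of surrounding whitespace but otherwise left unchanged, since they must match the names in S3 exactly.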

Step 5

You’ll now see your new source appear alphabetically in the list of ‘Connected sources’ in your workspace.

