Connect and Collect FAQs
  • 26 Jun 2024


What if I need to collect from a different source that is not among the list you currently support?

Let us know! Your data source may already be on our roadmap for adding integrations.

In addition, with our new Platform API, you can build on top of Onna to seamlessly integrate with existing and new ecosystem tools.

Interested in learning more about our Platform API? Explore our Developer Hub or ask the community.

How long does it take to collect a source?

Collection times for any given source can differ, from minutes to a few hours, based on the number of files to be synced, the types of files to be synced, and the integration’s specific connection characteristics such as rate limits and throttling. Read our best practice articles on collecting sources.

When does Onna begin to process the files that are collected?

We begin processing files as soon as they are collected so that data is available as soon as possible.

Does Onna identify duplicate documents?

Yes, Onna provides duplicate identification by calculating MD5 hash values.

Onna calculates and assigns MD5 hash values at processing. An MD5 hash value can be thought of as a fingerprint of a document's content. Any change to the file, other than to its filename or path, changes the bytes of the file contents and therefore its hash value. This includes adding a title to the metadata, changing any text or font, and other changes to metadata.

Opening and closing a document, or even saving it with no changes, does not change the document's fingerprint.
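
As an illustration of why the filename does not matter but the content does, here is a minimal Python sketch using the standard library's hashlib module. It is not Onna's implementation; the file names and contents are made up for the example.

```python
import hashlib
import tempfile
from pathlib import Path


def md5_of_file(path: Path, chunk_size: int = 65536) -> str:
    """Return the hex MD5 digest of a file's bytes, reading it in chunks."""
    digest = hashlib.md5()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def demo() -> dict:
    """Hash three hypothetical files: an original, a renamed copy, an edited copy."""
    with tempfile.TemporaryDirectory() as tmp:
        original = Path(tmp) / "report.txt"
        renamed = Path(tmp) / "report_copy.txt"
        edited = Path(tmp) / "report_edited.txt"
        original.write_bytes(b"Quarterly results")
        renamed.write_bytes(b"Quarterly results")   # same bytes, different name
        edited.write_bytes(b"Quarterly results!")   # one byte added
        return {
            "original": md5_of_file(original),
            "renamed": md5_of_file(renamed),
            "edited": md5_of_file(edited),
        }


hashes = demo()
print(hashes["original"] == hashes["renamed"])  # True: the name is not hashed
print(hashes["original"] == hashes["edited"])   # False: the content changed
```

Only the bytes of the file are fed to the hash, which is why two copies with different names are reported as duplicates.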

Does Onna OCR all image-based files?

Yes, Onna OCRs files as part of its processing pipeline. Currently, OCR is performed only for English-language text.

Will syncs attempt until there is a result or it is manually stopped?

The source will keep retrying the sync indefinitely for as long as the error persists. Read “What is happening to constitute a failed sync?” to learn more about why failed syncs may occur.

If you need help stopping a sync immediately, please reach out to the Onna Support team.

Is there an option to view all unprocessed files for a workspace?

Yes. Within a workspace, run a blank search by clicking the magnifying glass in the basic search bar. Once the search results have returned, enable filters and make sure the ‘EXTENSION TYPES’ filter is enabled. If the workspace contains any unprocessed files, the value unprocessed will be available under the file type filter. The value may not appear if all files were processed into Onna.

Why did a resource fail to process into Onna?

Please visit our Help Center to learn about Onna's processing exceptions. Unprocessed files are automatically reviewed by the Onna platform every six hours. Please contact support if you require additional information on why a resource could not be processed into Onna.

Does Onna capture processing exceptions?

Yes, Onna captures processing exceptions for individual files. Please visit our Help Center to learn more about Onna's processing exceptions.

If the password of an account is changed will I have to update it in Authorized connections?

Onna connections that authorize through OAuth do not need to change when a password is updated. The OAuth flow creates a token that allows access to the account. You will not need to update the password unless you go through the OAuth flow again.

Is there an Expiration Date for an Authorized Connection?

No, currently an Authorized Connection will be in place until you manually revoke the permission in Onna.

What do we do if the source owner who created the authorized connection leaves the organization? What will happen to the data sources using the authorized connection?

If the source owner leaves the organization and the authorized connection becomes invalid as a result, you can create a new authorized connection for your enterprise source. Afterward, please contact Onna Support with the Enterprise source and the creator email address for the new authorized connection. Onna Support can then move the affected data sources from the invalid authorized connection to the new one.

To minimize this issue, we recommend that you avoid using personal accounts to set up your authorized connections in Onna.

Can Onna collect from the Recoverable Items folder in M365?

Currently, Onna does not collect the contents of the "Recoverable Items" folder. If this is a feature that you would like to see added in the future, please submit a Product idea through the community.

Can you pause a sync?

Once a source sync starts, it cannot be paused or modified. However, you can delete a source during the initial sync if you need to start it over.

After the initial sync completes, you can stop further syncs from starting if a source is set to auto-sync. You will not be able to re-sync the source once this option is selected. To stop auto-syncing, click on the source's options icon and select stop further syncs.

What is an MD5 value?

MD5 values are similar to digital fingerprints for data and are created using a formula called the MD5 hashing algorithm. Running any data, such as a file or a message, through the MD5 algorithm produces a fixed-size code of 32 hexadecimal characters (the digits 0-9 and the letters a-f) that is, for practical purposes, unique to that data. Here’s an example MD5 value: 3e25960a79dbc69b674cd4ec67a72c62

MD5 values are used to make sure data hasn't been tampered with. For example, if you have a file and want to check if it's the same as the original one, you can calculate its MD5 value. If the MD5 value matches the one provided for the original file, it means the file is the same and hasn't been changed. But if the MD5 values don't match, the file might have been altered or corrupted somehow. It helps to confirm if the data is reliable and unchanged.

Any change to the file, other than the filename/path, will result in the byte value of the file contents changing. This includes adding a title to the metadata, changing any text or font, and other changes to metadata. Opening and closing a document, or even saving it with no changes, does not change the document's fingerprint.
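
The integrity check described above can be sketched in a few lines of Python with the standard hashlib module. This is a generic illustration, not Onna code; the sample data and the helper names md5_hex and matches_expected are assumptions made for the example.

```python
import hashlib


def md5_hex(data: bytes) -> str:
    """Return the MD5 digest of data as 32 lowercase hexadecimal characters."""
    return hashlib.md5(data).hexdigest()


def matches_expected(data: bytes, expected_md5: str) -> bool:
    """Compare the MD5 of data with a previously recorded value."""
    return md5_hex(data) == expected_md5.strip().lower()


# The value recorded for the original data (sample data, for illustration only).
recorded = md5_hex(b"hello world")

print(len(recorded))                                # 32
print(matches_expected(b"hello world", recorded))   # True: unchanged
print(matches_expected(b"hello world!", recorded))  # False: altered
```

A mismatch only says that the bytes differ; it does not say what changed or why.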

When are MD5 values created in Onna?

In Onna, MD5 values are calculated and stored during the “processing” stage of the resource’s life cycle and are generated at the file level. For emails, MD5 values are generated with a combination of headers, attachment binaries and the body of the email.

  • Native files: Unless an API supplies Onna with an MD5 value, Onna calculates and assigns MD5 hash values across the native file when data is processed.

  • Near-native files: Onna generates a JSON representation of the object for near-native files (such as Slack conversations) and then calculates an MD5 value of that JSON string.
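
To make the near-native case concrete, the sketch below hashes a JSON string built from a message-like object. The exact JSON layout Onna generates is not documented here, so the field names and the choice to sort keys are assumptions for illustration only.

```python
import hashlib
import json


def md5_of_json(obj: dict) -> str:
    """Serialize obj to a deterministic JSON string and return its MD5 digest.

    Sorting keys and fixing separators is an assumption of this sketch: it
    makes the same object always produce the same string, and so the same hash.
    """
    canonical = json.dumps(
        obj, sort_keys=True, separators=(",", ":"), ensure_ascii=False
    )
    return hashlib.md5(canonical.encode("utf-8")).hexdigest()


# Hypothetical message objects; the field names are not Onna's schema.
message_a = {"channel": "general", "user": "U123", "text": "Hello"}
message_b = {"text": "Hello", "user": "U123", "channel": "general"}
message_c = {"channel": "general", "user": "U123", "text": "Hello!"}

print(md5_of_json(message_a) == md5_of_json(message_b))  # True: same content
print(md5_of_json(message_a) == md5_of_json(message_c))  # False: text differs
```

Because the hash is taken over the JSON string rather than a native file, a different tool that serializes the same conversation differently will produce a different MD5 value.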

MD5 values can be found in the file details of each file processed within Onna.

How are MD5 values used in Onna?

MD5 values are used to identify duplicates in search and export. You can choose to display documents that have matching MD5 hash values in your search results. Learn more in our Help Center. If you choose to exclude duplicates during export, an entry will be added to the "skipped files" log accompanying the export to indicate which files were skipped.
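
The idea of excluding duplicates by MD5 value can be sketched as follows. This is a simplified illustration, not Onna's export logic: the file names and hash values are invented, and keeping the first file seen for each MD5 value is an assumption of the sketch.

```python
def split_duplicates(files: list[tuple[str, str]]) -> tuple[list[str], list[str]]:
    """Split (name, md5) pairs into kept files and skipped duplicates.

    The first file seen for each MD5 value is kept; later files with the
    same MD5 value are listed as skipped.
    """
    seen: set[str] = set()
    kept: list[str] = []
    skipped: list[str] = []
    for name, md5 in files:
        if md5 in seen:
            skipped.append(name)
        else:
            seen.add(md5)
            kept.append(name)
    return kept, skipped


# Hypothetical search results; the MD5 values are placeholders.
results = [
    ("contract_v1.pdf", "aaaa1111"),
    ("contract_copy.pdf", "aaaa1111"),
    ("notes.txt", "bbbb2222"),
]

kept, skipped = split_duplicates(results)
print(kept)     # ['contract_v1.pdf', 'notes.txt']
print(skipped)  # ['contract_copy.pdf']
```

The skipped list plays the role of the "skipped files" log: it records which files were left out because another file with the same MD5 value was already included.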

Onna also compares the MD5 value of resources against the NIST NSRL database. A true/false value indicating whether the resource was found in this database is recorded in the resource metadata, and it can be included in your export.

When mapping MD5 values to another tool, please note that some LSPs create their own MD5 value, which will differ from the values stored in Onna.

Can Onna collect data from an unlicensed enterprise M365 user account?

Yes, Onna can collect data from unlicensed users for a Microsoft Outlook Enterprise collection.

Onna connects directly with the Microsoft Graph API to extract all data and metadata found on Outlook accounts in O365 organizations. This integration is offered to enterprise accounts through the admin dashboard as part of the enterprise license.

  • Any business license with access to Microsoft Graph API

  • This includes E3 and E5

Learn more in the article “Microsoft Outlook Enterprise: How to Connect and Collect.”

Why am I unable to delete a workspace or a source?

If you cannot delete a workspace or a source, please verify your permissions. Workspace "Manage" permissions are required to delete a workspace.

In addition, the workspace and the data sources within it can't be associated with any active legal holds, or Onna will prevent you from deleting the workspace. You can remove a source from a legal hold by editing the preservation. Once the source is removed from the preservation(s), it can be deleted.

We have a full sync of our Slack workspace in Onna. After I create a preservation, when should I expect messages to appear in the preservation? Will the data in the preservation differ from what I see in the full Slack Archive workspace?

When you navigate to the Preservation, you will see the data sources selected to be preserved. You can perform searches against both sources or drill down to an individual source to execute a search. Once the Preservation has finished retaining the required resources, the status will update to 'Synced.'

Please note, if the Preservation does not have an end date, the status of the Preservation may change back to 'Processing' due to additional resources syncing into Onna that match the preservation criteria.

Any resource found in a Preservation will also be found in the original source that it was synced under.

If I remove the mirroring of Slack, but keep syncing and archiving it in Onna, will Slack now sync and keep everything going forward?

Yes, once the option to "mirror Slack Enterprise retention settings" is turned off, there is no additional action required from the user.

How does Onna handle PST files stored in OneDrive?

Onna processes and indexes PST files like any other file. Depending on the size of the file, all text available within the pieces of the PST is made searchable.

What are the differences in the status of 'error' versus 'failed'? What is happening to constitute a failed sync?

At the moment, failed and error sync statuses are the same. We will return failed and error sync status if these events occur during a sync run:

  • Sync run completed with error. In this status, we finished the current sync run; however, there were several resources we could not sync into Onna. We will retry the resources we could not sync into Onna until they successfully sync.

  • The sync run failed to complete. During the collection process, the sync suddenly stopped. This could be due to several reasons (invalid credentials, an error connecting to the API, etc.). The source will attempt to re-sync automatically.

If you feel there is an issue with a failed or error sync retrying, please get in touch with Support, and we will investigate the matter further.

