aji Environment and Data Requirements

Requirements

Before aji can review your documents, several requirements must be met, both in your Reveal environment and in your specific dataset.

Reveal Environment

You must have a Reveal version that is 2025.9 or later with aji enabled.
- aji can be enabled by contacting your CSM.
You must create a new, active project after the 2025.9 GA date.

Dataset Criteria

Your dataset must be uploaded into Reveal.
Your dataset must be in English. Currently, this is the only supported language for aji.
Your dataset must be in a US environment, or you must consent to securely transfer content to a US-based AWS Bedrock region.
Your data must contain OCRed or extracted text.
- aji only searches text. It does not search metadata, pictures, audio or video content, or anything else that is not text.
For each individual document, the text size must be no less than 0.05KB and no more than 150KB.
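
As a rough illustration of the text size criterion above, the following Python sketch checks whether a document's extracted text falls inside the 0.05KB to 150KB range. It is not part of Reveal. The function names are hypothetical, and it assumes that text size is measured as UTF-8 bytes with 1KB equal to 1024 bytes, which may differ from how Reveal reports text size.

```python
# Assumptions (not confirmed by Reveal): text size is counted in UTF-8 bytes
# and 1KB = 1024 bytes. Adjust both if your export measures size differently.
MIN_TEXT_KB = 0.05
MAX_TEXT_KB = 150.0


def text_size_kb(text: str) -> float:
    """Return the size of the extracted text in KB (UTF-8 bytes / 1024)."""
    return len(text.encode("utf-8")) / 1024


def within_size_range(text: str) -> bool:
    """True if the text size is inside the documented 0.05KB-150KB range."""
    return MIN_TEXT_KB <= text_size_kb(text) <= MAX_TEXT_KB
```

Under these assumptions, a two-character text is too small to qualify, while a typical one-page email body falls comfortably inside the range.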

Note
Reveal reserves the right to parse documents because LLMs limit the size of what can be sent for analysis.

Optional Preparation

The following additional prep work can improve your overall experience with aji.

Reveal Environment Prep

Enable Ask, Reveal’s GenAI query tool for asking a GenAI model questions about your documents. With Ask enabled, you can select “Suggest documents automatically” for Calibration Reviews, which uses Ask to auto-generate a suggested document set to Calibrate.
- You can learn more about Ask by reading our About Ask article.

Dataset Criteria Prep

Before aji uses your dataset, you can filter it in one or more of the following ways:

Remove documents that have already been manually coded.
Remove out-of-scope documents.
Remove documents that are too small or too large (text size less than 0.05KB or greater than 150KB). aji cannot search documents outside that text size range.
Remove structured data (logs, number-only spreadsheet files, etc.).
Remove exact duplicates.
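
If you prepare your data outside Reveal before uploading, two of the filters above (number-only content and exact duplicates) can be approximated with a short script. The Python sketch below is only an illustration. The function names and the `docs` mapping of document ID to extracted text are hypothetical, and it treats two documents as exact duplicates when their extracted text is identical, which may differ from the hash-based deduplication your processing tool applies.

```python
import hashlib


def is_number_only(text: str) -> bool:
    """True if the text contains no alphabetic characters at all."""
    return not any(ch.isalpha() for ch in text)


def prefilter(docs: dict[str, str]) -> list[str]:
    """Return the IDs of documents to keep, in their original order.

    Assumption: `docs` maps a document ID to its extracted text.
    Duplicates are detected on the text only, not on the native file.
    """
    seen_hashes: set[str] = set()
    kept: list[str] = []
    for doc_id, text in docs.items():
        if is_number_only(text):
            continue  # likely structured data, e.g. a number-only spreadsheet
        digest = hashlib.sha256(text.encode("utf-8")).hexdigest()
        if digest in seen_hashes:
            continue  # same text as a document already kept
        seen_hashes.add(digest)
        kept.append(doc_id)
    return kept
```

For example, given two documents with identical text, one number-only document, and one unique document, only the first of the identical pair and the unique document are kept.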

Once your data is prepped, place it in its own work folder for later use. This folder is your target data population: it is used to create random sample sets for Validation and serves as the work folder source for GenAI Review.