Rapid Pilot Project Creation
- Updated on 19 Nov 2024
Document Purpose
The purpose of this document is to provide the instructions to create the "Rapid Pilot Project". This project consists of an Enron subset, and a handful of documents to showcase our Review Accelerators (Translation, Image Labeling, and Transcription) totaling ~40K documents. This allows us to have a project for sandbox or training purposes.
Technical Documentation
To access all technical documentation within the Reveal Discovery Platform, please go to our Online Help Portal.
High Level Process Overview
The illustration below gives a high-level overview of the entire process.
Access to Load Machine
All Load Machine access is administered through Perimeter 81. To access a Load Machine, you can follow the high-level steps below.
Use your Reveal Universal ID to log in.
Find/click on the icon for your Load Machine by searching for the MSA.
If you cannot find or access your Load Machine, please email [email protected] and ask for access.
Note
Please take note of the MSA or 8-digit number in the name of the machine. This is the MSA number for the pilot.
Stage Test Data
There are two steps to stage the test data.
Download the test data to the Load Machine.
Copy/Unzip the data to the root of the Z:\ drive on the Load Machine.
Download Test Data
Follow the steps below to download the Test Data.
On the Pilot’s Load Machine, use the following link in Chrome to download the Test Data,
Copy/Unzip the Data
After downloading the zip file, copy it to the Z:\ drive of the Load Machine.
Unzip this file to the root of the Z:\ drive so it looks like the below.
Note
It is possible to receive a permissions error when trying to copy the zip file to the root of the Z:\ drive. If this happens, create a folder named '_RapidPilot' at the root of the Z:\ drive, copy the zip file into this folder, and extract the contents of the zip file into this folder. Do not repeat the '_RapidPilot' folder underneath itself.
This data must be on the root of the Z:\ drive as illustrated above. If the data is not staged in this manner, the importing of the natives/text will fail. If other files and folders exist on the Z:\ drive that is fine as well.
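Because a mis-staged folder only surfaces later as a failed native/text import, it can help to confirm the layout before moving on. The following is a minimal sketch, not part of the official procedure: it assumes Python is available on the Load Machine, and the helper name `missing_staged_files` is hypothetical. The three relative paths are the load files referenced later in this document.

```python
from pathlib import Path, PureWindowsPath

# Load files referenced later in this document, relative to the staging root.
EXPECTED_FILES = [
    r"_RapidPilot\PilotDataProcessingFolder\Exports\0001_EXPORT01\EXPORT01_loadfile.dat",
    r"_RapidPilot\PilotDataProcessingFolder\Exports\0002_EXPORT02\EXPORT02_loadfile.dat",
    r"_RapidPilot\PilotDataProcessingFolder\Exports\0003_BRS_OVERLAY\BRS_OVERLAY.CSV",
]


def missing_staged_files(root):
    """Return the expected files that are not present under root (hypothetical helper)."""
    root = Path(root)
    missing = []
    for rel in EXPECTED_FILES:
        # Split on backslashes explicitly so the path is rebuilt part by part.
        candidate = root.joinpath(*PureWindowsPath(rel).parts)
        if not candidate.is_file():
            missing.append(rel)
    return missing
```

On the Load Machine, `missing_staged_files("Z:\\")` should return an empty list once the zip has been extracted correctly. A non-empty result often points to a nested '_RapidPilot' folder or an extraction into the wrong location.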
Project Creation
Reveal Review Manager is used to administer your instance of Reveal Review. In Review Manager, you can do any number of functions associated with the administration of Review such as create databases, manage users, load large amounts of data, set analytics and more. To see more information about this, please see Reveal Review Manager Overview.
Create Company
The Reveal Company may already exist. If it does, skip this step. If it does not exist, or you want to create a new Company, follow the steps below.
A Company is the highest level of organization in our permission structure. Please create a Company named Reveal.
Open Review Manager on the Load Machine
Expand Instance Setup
Click Companies
Click the Add New tab
Enter Reveal (or preferred name) as the Company name, and click Insert
Create Client
The Reveal Client may already exist. If it does, skip this step. If it does not exist, or you want to create a new Client, follow the steps below.
A Client is only associated with one Company. A Client can have multiple projects associated with it and is a way to keep a logical structure to the projects.
Open Review Manager on the Load Machine
Expand Instance Setup
Click Clients
Click the Add New tab, enter Reveal (or preferred name) as the Client name, enter a Client Number (typically a number, recommended to use date created in YYYYMMDD format), associate the Client with a Company, and Click Insert
Create Rapid Pilot (RP) Enron Project
After creating or using a previously created Company and Client, follow the steps below to create a new Rapid Pilot.
Open Review Manager on the Load Machine
Expand Instance Setup, select Projects
Click the New tab
Enter the Project Name using ‘RP_’ as the prefix, followed by the client's MSA and ‘_Enron’ (e.g., RP_81010001_Enron), enter a Project ID (typically a number; the creation date in YYYYMMDD format is recommended), associate the Company/Client, choose the Reveal InControl Standard Template, and select UTC for Time Zone.
Click Create Case
Note
You will always choose Reveal as the Client and Company, but the MSA in the project you create will be different. The MSA of 135176 shown in this document is for instructional purposes only.
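The naming conventions above (the RP_ prefix, the MSA, the _Enron suffix, and a date-based ID) can be sketched as two small helpers. This is only an illustration: the function names are hypothetical, and treating the MSA as purely numeric is an assumption based on the examples in this document.

```python
from datetime import date


def rapid_pilot_project_name(msa):
    """Build the project name as RP_<MSA>_Enron (hypothetical helper)."""
    msa = str(msa).strip()
    # Assumption: the MSA is numeric, as in the examples in this document.
    if not msa.isdigit():
        raise ValueError(f"MSA should be numeric, got {msa!r}")
    return f"RP_{msa}_Enron"


def date_based_id(created=None):
    """Return the creation date in YYYYMMDD format (today if not given)."""
    created = created or date.today()
    return created.strftime("%Y%m%d")


print(rapid_pilot_project_name("81010001"))  # RP_81010001_Enron
print(date_based_id(date(2024, 11, 19)))     # 20241119
```

The same `date_based_id` value can be used for the Client Number and the Project ID, since both are recommended to follow the YYYYMMDD format.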
User Setup
After creating the project, there are a few things that must be done prior to importing the data.
Create Review Account and Add to Rapid Pilot Project
Create a Review account for yourself and make your user a Company Admin as detailed below.
Open Review Manager on the Load Machine
Expand Project Setup
Click Users
Click Add
Click Create New, fill in the Create a new User ID form, and click Create
Choose the Review Account, and click Add
Change Group to Administrators
Company Admin Rights
A user must have Company Admin rights to create users or a project within Review. To make a user a Company Admin follow the steps below.
Open Review Manager on the Load Machine
Expand Instance Setup, and select Companies
Click the Users By Company tab
Click Add User, and choose your user
After adding the user select the checkbox Is Admin.
Checking for Company Admin Rights in Review
If a user does not have access to the Load Machine, but wants to check for Company Admin rights, the user can check within Review.
Log into Review
You can find the Review URL for any environment in Review Manager: expand Advanced Options, click System Settings, and type 'indexserverserviceclient' in the Value column. When copying the resulting URL, exclude '/review_indexserver'.
Click Admin
Click the Menu button in the upper left-hand corner and look for Company Admin.
Note
If Company Admin is not visible, a user with access to the Load Machine will need to grant Company Admin rights via the steps above.
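The URL clean-up described above, removing '/review_indexserver' from the System Settings value, can be sketched as follows. The function name is hypothetical and the example host is a placeholder, not a real environment. The sketch also assumes the value ends with that segment.

```python
def review_base_url(setting_value):
    """Drop a trailing '/review_indexserver' segment from the setting value."""
    suffix = "/review_indexserver"
    url = setting_value.strip().rstrip("/")
    # Assumption: the segment appears at the end of the value; anything else is left as-is.
    if url.lower().endswith(suffix):
        url = url[: -len(suffix)]
    return url


# Placeholder host used purely for illustration.
print(review_base_url("https://review.example.com/review_indexserver"))
# https://review.example.com
```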
Import Data
We will import data into the project. This requires importing the DAT file into the system and creating indexes. To do this, follow the steps below.
Note
The project in the screenshot is for illustration purposes only; choose the project you created.
Field Creation
To load the data set, change the settings below.
Open Review Manager on the Load Machine
Expand Project Setup
Click Fields
Change DOMAINS from 4000 to 0, which sets the field to the maximum number of characters allowed.
Note
An error may appear stating that the system cannot update a field named File Display Name. If it does, disregard it.
Import Mappings
A new field mapping will also need to be created to load the data.
Open Review Manager on the Load Machine
Expand Project Setup
Click Import Mappings
Browse to the following DAT file
Z:\_RapidPilot\PilotDataProcessingFolder\Exports\0001_EXPORT01\EXPORT01_loadfile.dat
Click Match All Fields
Sort by the field Table Name, click in one of the fields that say -- Select --, and find/remap only the fields that show here, as they did not map properly (there will not be many). When this is complete, there will be no more fields remaining in the drop-down.
After mapping the unmapped fields, save this mapping as the Rapid Pilot Mapping
Note
Not all fields will map. Map only the fields that show via step 6 above.
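To get a rough idea of which fields may need manual remapping before opening Import Mappings, the header row of the DAT file can be inspected. The sketch below assumes Concordance-style delimiters (ASCII 20 between columns, þ around values), which is common for DAT load files but is not confirmed by this document. The helper names are hypothetical, and the case-insensitive name comparison is only an approximation of what Match All Fields does.

```python
FIELD_SEP = "\x14"  # assumed column delimiter (ASCII 20)
QUOTE = "\xfe"      # assumed text qualifier (the character þ)


def parse_dat_header(header_line):
    """Split a DAT header line into field names (hypothetical helper)."""
    line = header_line.lstrip("\ufeff").rstrip("\r\n")
    return [name.strip(QUOTE) for name in line.split(FIELD_SEP)]


def unmapped_fields(dat_fields, project_fields):
    """Return DAT fields with no case-insensitive name match in the project."""
    known = {name.lower() for name in project_fields}
    return [name for name in dat_fields if name.lower() not in known]


# Made-up header for illustration; CUSTOM_FIELD is a hypothetical field name.
header = "\xfeDOMAINS\xfe\x14\xfeRELATIVEPATHPARENT\xfe\x14\xfeCUSTOM_FIELD\xfe\r\n"
fields = parse_dat_header(header)
print(unmapped_fields(fields, ["Domains", "RelativePathParent"]))
# ['CUSTOM_FIELD']
```

To apply this to the real load file, read only its first line and pass it to `parse_dat_header`. The file encoding is not stated in this document, so it would need to be checked first.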
Documents Import
To import the DAT file, follow the steps below.
Open Review Manager on the Load Machine
Expand Import
Click Documents
Choose the Rapid Pilot project
Choose the Rapid Pilot Mapping
Import 1
Import Data File: Z:\_RapidPilot\PilotDataProcessingFolder\Exports\0001_EXPORT01\EXPORT01_loadfile.dat
Click Import Data
This will load 40,656 files.
1. You can ignore this error and click OK.
Import 2
Repeat the same steps from Import 1
Use Import Data File: Z:\_RapidPilot\PilotDataProcessingFolder\Exports\0002_EXPORT02\EXPORT02_loadfile.dat
This will load 86 files.
Note
As shown above, it is possible to receive an Invalid column data error when importing. It is ok to click OK and disregard this error.
Documents Overlay
When this data set was created, there were a few fields that were not within the DAT file that are essential for visual analytics and AI. To get these fields populated, we will need to do an overlay or Update Data for the project in Review Manager. To overlay these fields, follow the steps below.
Open Review Manager on the Load Machine
Expand Import
Click Documents
Choose the Rapid Pilot project
Choose Update Data
Choose CSV
Overlay
Import Data File: Z:\_RapidPilot\PilotDataProcessingFolder\Exports\0003_BRS_OVERLAY\BRS_OVERLAY.CSV
Please note that this is a CSV file and not a DAT file.
Click Update Data
Click Match All
We are only overlaying 3 fields. You will need to find the following two fields and deselect them from the overlay, as Match All will return 6 fields.
Find the field SENT_DATE and deselect it (remove the checkbox from Import for the row) as we should be overlaying the field SENT_DATETIME and not SENT_DATE
Find the field SUBJECT and deselect it (remove the checkbox from Import for the row) as we should be overlaying the field SUBJECT_OTHER and not SUBJECT
Click OK
This will update all files (40,742)
Note
As shown above, it is possible to receive an Invalid column data error when importing. It is ok to click OK and disregard this error.
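The deselection rule in the overlay steps (drop SENT_DATE and SUBJECT, keep everything else that Match All selected) can be expressed as a small filter. This is a sketch only: the helper names are hypothetical, and the sample header uses placeholder column names (KEY_FIELD, OTHER_FIELD) because the full column list of BRS_OVERLAY.CSV is not given in this document.

```python
import csv
import io

# The two fields the steps above say to deselect.
DESELECT = {"SENT_DATE", "SUBJECT"}


def csv_header(text_stream):
    """Return the column names from the first row of a CSV stream."""
    return next(csv.reader(text_stream))


def fields_to_import(matched_fields):
    """Keep every matched field except the two that must be deselected."""
    return [name for name in matched_fields if name.upper() not in DESELECT]


# Placeholder header for illustration; KEY_FIELD and OTHER_FIELD are hypothetical.
sample = io.StringIO("KEY_FIELD,SENT_DATE,SENT_DATETIME,SUBJECT,SUBJECT_OTHER,OTHER_FIELD\n")
print(fields_to_import(csv_header(sample)))
# ['KEY_FIELD', 'SENT_DATETIME', 'SUBJECT_OTHER', 'OTHER_FIELD']
```

Exact name comparison matters here: SENT_DATETIME and SUBJECT_OTHER start with the deselected names, so a prefix or substring match would wrongly drop them.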
Create Indexes
After importing the data, you need to index the data. Please only do this after Store Pending = 0 as seen below. As part of this process, we grab the native and text files from the load file, and move them to S3 in AWS, or the “store”. Also note there is 1 file that is a Store Error. This was a virus that was removed by AV, so this is expected. This process will take some time, so this may be a good time to take a break for about an hour.
Please note the project in the screenshot is for illustration purposes only; choose the project you created.
Open Review Manager on the Load Machine
Expand Create
Click Indexes
Choose the Rapid Pilot project
Select both indexes
Click Index/Re-Index
Click Native / HTML, Extracted, OCR / Loaded, Document_Metadata, and click OK
Create Document Folders
After kicking off indexing, run the following process in the Create section of Review Manager.
Document Folders
Choose the Rapid Pilot project
Select both data sets
Change Document Folder Field to RELATIVEPATHPARENT
Click Run
Monitor Progress
You can monitor the indexing progress in the Review Manager, or you can go to Menu -> Jobs -> Index, select View All Jobs, and you will see the indexing process of the import. Make sure the counts in your project match the counts in the screen shot below except for Total Errors.
Run the term depart* to make sure 5,260 documents are returned.
Make sure the data loaded properly into AI by going to Menu -> Jobs -> AI Document Sync and confirming that the job shows as complete.
Everything is complete when the Candy Bar displays counts on the Dashboard and the index Brainspace connector overlay has also completed.
Note
It can take upwards of 20 minutes for the Candy Bar and/or the Brainspace connector overlay to display after the AI Document Sync has completed.
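The counts quoted in this document can be collected into a simple checklist for comparison against what the project actually shows. This is a minimal sketch: the dictionary keys and the function name are hypothetical labels, and the observed values have to be read manually from Review Manager and Review.

```python
# Counts stated in this document.
EXPECTED = {
    "import_1_documents": 40_656,
    "import_2_documents": 86,
    "overlay_documents": 40_742,
    "depart_star_hits": 5_260,
}


def count_mismatches(observed):
    """Return {name: (expected, observed)} for each count that differs or is missing."""
    return {
        name: (expected, observed.get(name))
        for name, expected in EXPECTED.items()
        if observed.get(name) != expected
    }


# The overlay updates every document, so it should equal the sum of both imports.
assert (
    EXPECTED["import_1_documents"] + EXPECTED["import_2_documents"]
    == EXPECTED["overlay_documents"]
)
```

An empty result from `count_mismatches` means every recorded count matches. Total Errors is left out on purpose, since the document notes it may differ.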
Run Final Processes
There are 3 processes to run if you need to showcase the Review Accelerators (i.e., Image Label, Translation, and Transcription). Please note the project in the screen shots is for illustration purposes only, and you should be running these processes in the project you created. When the Label, Translation, and Transcription jobs complete, you will receive an email for each job.
Field Profile Management
By default, field profiles are not assigned to roles out of the box. This makes the Default Field Profile inaccessible for all users. To fix this, follow the steps below; you will then be able to use the Default field profile.
Log in to your project via your browser in Review
Select Admin
Click Fields -> select Default -> click Edit
Select Original Administrators -> click UPDATE
Switch to Grid, and choose Default if not selected
Label
Run Image Label on the following images for demo purposes.
Log in to your project via your browser in Review
Select the Documents Folder Documents -> Dropbox -> Group Share -> Images, and this should return 49 files
Note
If the top level folder does not return 49 files, click the dropdown arrow and initiate the process outlined below on each sub-folder.
Click the Label button
Click LABEL
You can monitor the progress of this job in the Jobs section (click Menu and choose Jobs). When the job is complete there will be a green check mark.
Translate
There are two sets of documents for translation purposes of Japanese and Korean. Below illustrates how to translate the Japanese files. The process should be repeated for the Korean files as well.
Log in to your project via your browser in Review.
Select the Documents Folder Documents -> Dropbox -> Group Share -> Foreign Languages -> Japanese, and this should return 30 files
Click the Translate button
Next to Destination Text Set click the + Create new button, enter the following items, and click Submit.
Back on the Translate Documents settings, choose the following options.
If the Destination Text Set that was just created does not show up in the list of text sets to choose from, log out and log back in, and the text set will be available.
Repeat this process for the Korean files (Documents Folder Documents -> Dropbox -> Group Share -> Foreign Languages -> Korean), but translate from Korean to English. Please add the Korean files to the same Text Set of Translation, and do not create a new Text Set.
You can monitor the progress of this job in the Jobs section (click Menu and choose Jobs). When the job is complete there will be a green check mark.
Transcribe
There is 1 deposition video to transcribe.
Log in to your project via your browser in Review.
Select the Documents Folder Documents -> Dropbox -> Group Share -> Videos, and this should return 1 file
Click the Transcribe button, then click TRANSCRIBE (leaving all default settings)
You can monitor the progress of this job in the Jobs section (click Menu and choose Jobs). When the job is complete there will be a green check mark.
Show Transcription
After the deposition video completes, turn on the transcription to show it during a demo.
Log in to your project via your browser in Review.
Select the Documents Folder Documents -> Dropbox -> Group Share -> Videos, and this should return 1 file
You can monitor the progress of this job in the Jobs section (click Menu and choose Jobs). When the job is complete there will be a green check mark.
Launch this file in Review by clicking on the row.
Click the CC Show Transcript button, and this will show the text next to the video when it is played. This setting will stick for all future demos of this file.
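The folder checks in the Label, Translate, and Transcribe steps above each state how many files a folder should return. As a quick reference, they can be kept in one small table. This is a sketch with hypothetical names; the Korean folder is omitted because this document does not state its file count.

```python
BASE = ("Documents", "Dropbox", "Group Share")

# Folder path -> file count stated in this document.
EXPECTED_FOLDER_COUNTS = {
    BASE + ("Images",): 49,
    BASE + ("Foreign Languages", "Japanese"): 30,
    BASE + ("Videos",): 1,
}


def describe(path_parts, count):
    """Format one checklist line in the document's 'A -> B' folder notation."""
    noun = "file" if count == 1 else "files"
    return f"{' -> '.join(path_parts)}: expect {count} {noun}"


for parts, count in EXPECTED_FOLDER_COUNTS.items():
    print(describe(parts, count))
```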
Setting up Rapid Pilot for Demonstration Purposes
While the Review Enhancers are executing, several additional tasks can be done to get the project ready for demo purposes which are outlined below.
Add AI Model
Apply an AI Model from the Model Library for demo purposes.
Log in to your project via your browser in Review
Click into Project Admin -> Tags
Click Add Tags and Choices
Enter and select the following information for the tag:
Return to your main Project screen -> Click Supervised Learning
Click the gear icon by the AI Tag created (Junk)
Scroll through the AI Library Models and add the "Out of Office" AI Model
Click Run Full Process -> Click Save at bottom
Click the Review Icon on left panel, return to dashboard.
Note
The project's documents will now be scored with this AI model as part of the current Classifier which will take some time to run the process before it displays on the dashboard.
You can replicate these steps to create another AI Model for demo purposes or even combine two similar models from the library to "pack and stack" into one uber AI Model to further leverage the AI Model Library.
For more information on applying or leveraging AI Models, please see our help documentation or blog posts.
Creating Work Folder for Key Documents
For demo purposes, select documents from the Rapid Pilot dataset that highlight key differentiators and place them into a Work Folder.
Log in to your project via your browser in Review
Next to Work Folders on top left, Click the ellipsis
Click Add Folder -> Name Folder "Key Docs" -> Click to Grid View
Click into Search Toolbar at top right, Click the ellipsis
Click Add Condition -> Type Begin Number -> Click Begin Number metadata field
Copy and paste the following Begin Numbers into the pop out window:
DEMO-037138
DEMO-040676
DEMO-040720
DEMO-040742
Click Add to Search -> Click Search -> Results should look like documents below:
Click Update -> Action to Take -> Check Folders -> Click dropdown to Work Folders -> Check off Key Docs -> Click Submit -> Confirm
Note
The work folder and its documents can be chosen based on preference for demo purposes. The Begin Numbers above are just a suggested set of documents to highlight Review Enhancers.