How to Set Up a Relativity® Connector
- Updated on 29 Oct 2024
Overview of Connectors and Overlays
Relativity® Overlay
When using a Relativity®* connector for a dataset, you can overlay a group of analytics fields from Brainspace into Relativity® after creating a new dataset or after rebuilding an existing dataset. These fields can be used to organize and to accelerate linear document review in Relativity®.
You can choose to run overlay to Relativity® automatically every time you build a dataset, or you can choose to run overlay to Relativity® manually as needed.
Multiple Relativity® Overlays
When overlaying multiple datasets or classifiers to a single Relativity® Workspace, Brainspace will display duplicate fields appended with additional characters to identify that a particular field in Relativity® has more than one corresponding field in Brainspace. This also applies to multiple Brainspace datasets that use the Relativity® Plus connector.
Relativity® Plus Connector
Brainspace’s Relativity® Plus connector is compatible with Relativity® v9.7 and newer versions of Relativity®. Relativity® v9.7 and v10 work with the legacy Relativity® connector and the Relativity® Plus connector in Brainspace.
Note
The Relativity® Plus connector only works with Relativity® v9.7 and v10.x (including RelativityOne). Brainspace strongly recommends that customers upgrade to Brainspace v6.2 or newer to use the most recent API.
Brainspace-Relativity® Document Links
By default, documents are linked between Relativity® and Brainspace. Clicking the document link in Brainspace opens the source document in Relativity® if network access (HTTP or HTTPS) is permitted and the user is logged in to Relativity®. Document links can be disabled using the Advanced Settings feature in Relativity® (see third-party Relativity® documentation for more information).
Multiple Relativity® Web Servers
Relativity® 9.7.229.5 does not support database-backed authorization codes with load-balanced web servers. Using multiple web servers will result in the Relativity® Plus connector failing to authenticate. This can be resolved by configuring the Relativity® Plus connector to explicitly communicate with a single Relativity® web server.
Overlay Process
The Relativity® Plus connector overlays Analytics field data in batches after a build. The Relativity® connector overlays data as a single action. The Relativity® Plus connector no longer causes the Relativity® Workspace to hold a full-table lock on the documents table while overlay is occurring. In the case of an overlay failure, the documents will have field values partially written to the Analytics field.
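The batch behavior described above can be illustrated with a short Python sketch. This is not Brainspace code: `overlay_in_batches`, `FlakyTarget`, the document IDs, and the batch size of 3 are all hypothetical names and values chosen for illustration. The sketch only shows why a failure partway through a batched overlay leaves some documents with values written and others without.

```python
from typing import Callable, Dict, Iterator, List


def batches(doc_ids: List[str], size: int) -> Iterator[List[str]]:
    """Yield consecutive slices of doc_ids, each holding at most `size` IDs."""
    for start in range(0, len(doc_ids), size):
        yield doc_ids[start:start + size]


def overlay_in_batches(
    doc_ids: List[str],
    values: Dict[str, str],
    send_batch: Callable[[Dict[str, str]], None],
    size: int,
) -> List[str]:
    """Send values batch by batch and return the IDs written before any failure."""
    written: List[str] = []
    for batch in batches(doc_ids, size):
        try:
            send_batch({doc_id: values[doc_id] for doc_id in batch})
        except RuntimeError:
            break  # earlier batches stay written; later ones are never sent
        written.extend(batch)
    return written


class FlakyTarget:
    """Hypothetical overlay target that fails on one chosen call."""

    def __init__(self, fail_on_call: int) -> None:
        self.calls = 0
        self.fail_on_call = fail_on_call
        self.stored: Dict[str, str] = {}

    def send(self, payload: Dict[str, str]) -> None:
        self.calls += 1
        if self.calls == self.fail_on_call:
            raise RuntimeError("simulated overlay failure")
        self.stored.update(payload)


doc_ids = [f"DOC{i}" for i in range(1, 8)]
values = {doc_id: "thread-1" for doc_id in doc_ids}
target = FlakyTarget(fail_on_call=2)
written = overlay_in_batches(doc_ids, values, target.send, size=3)
print(written)  # ['DOC1', 'DOC2', 'DOC3']
```

In this sketch, the second batch fails, so only the first three documents carry the field value. A single-action overlay would not have this intermediate state.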
Pause and Resume
The Relativity® Plus connector does not support pause and resume. Because of the concurrent nature of the implementation, Brainspace cannot guarantee that a document would not be missed during a resume. The pause button works from the UI, but when the ingest is resumed, it starts from the beginning of the entire saved search or Workspace.
Predictive Coding
The Brainspace Addons Relativity® (*.rap) application is still required for predictive coding (PC). This creates the choice fields (BDPC Is Responsive) that Brainspace is not able to create via the API. It also creates the views and saved searches that are useful for the PC workflow.
Note
CMML with ACS provides a control set solution, so PC is no longer required. The CMML solution does not require the *.rap file.
Ingest and Overlay Performance
Ingest performance with Relativity® Plus should be significantly faster than with the Relativity® connector; however, Relativity® Plus performance is highly dependent on the values chosen for the connector configuration, the number of CPUs on the Brainspace servers, and the network bandwidth between the Brainspace host and the Relativity® host.
Relativity® Server Maintenance
Based on testing results and interaction with the kCura team, temporary resources are created on the Relativity® server side that correspond to each export initiated during the dataset ingestion process. The Relativity® services run a weekly cron job to clean up these temporary resources. These resources consume large amounts of disk space, so it is important to monitor disk space in environments where many or large ingestion processes are run. If more frequent cleanup jobs are required, contact the kCura team for assistance.
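Disk monitoring can be scripted in many ways. As one minimal sketch, the Python standard library function `shutil.disk_usage` reports total and free space for a path. The path `"."` and the 20 percent threshold below are placeholders chosen for illustration, not values recommended by Brainspace or Relativity®, and the script would need to run on the host whose disk holds the temporary resources.

```python
import shutil


def free_fraction(path: str) -> float:
    """Return the fraction of the disk holding `path` that is currently free."""
    usage = shutil.disk_usage(path)
    return usage.free / usage.total


def is_low_on_space(path: str, min_free_fraction: float = 0.20) -> bool:
    """Return True when the free fraction falls below the chosen threshold."""
    return free_fraction(path) < min_free_fraction


# "." is a placeholder; point this at the volume you want to watch.
print(f"free: {free_fraction('.'):.1%}, low: {is_low_on_space('.')}")
```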
Run Overlay Automatically after a Dataset Build
The overlay to Relativity® feature can be activated to run automatically each time you build an existing dataset with a Relativity® connector.
Note
To run overlay automatically when creating a new dataset with a Relativity® connector, see Create a Dataset with a Relativity® Connector.
To run overlay automatically after a dataset build:
In the user dropdown menu, click Administration:
The Datasets screen will open.
In the Datasets screen, locate the dataset with the Relativity® connector, and then click the Settings icon:
The Dataset Settings dialog will open.
In the Dataset Configuration pane of the Dataset Settings dialog, toggle the Overlay switch to the On position:
The Overlay switch will become green.
Click the Save button, and then do one of the following:
To close the Dataset Settings dialog without running overlay to Relativity® now, click the Close icon.
To overlay to Relativity® now, click the Build button.
If you close the Dataset Settings dialog without overlaying to Relativity®, overlay to Relativity® will run automatically every time you build the dataset in the future. If you instead run overlay to Relativity® immediately without enabling the auto-overlay feature, overlay to Relativity® will run only when manually initiated.
Run Overlay Manually on an Existing Dataset
After creating a dataset with a Relativity® connector, you can use the overlay to Relativity® feature at any time, whether or not the automatic overlay to Relativity® feature has been enabled.
To run overlay manually on an existing dataset:
In the user dropdown menu, click Administration:
The Datasets screen will open.
In the Datasets screen, locate the dataset with the Relativity® connector, and then click the Settings icon:
The Dataset Settings dialog will open.
In the Dataset Configuration pane of the Dataset Settings dialog, click Run Now:
Note
If the Run Now option is not visible or is greyed out, confirm that the dataset has a connector to Relativity® and that you have fields selected for overlaying.
Click the Save button.
Click the Close (X) icon.
After running the overlay to Relativity®, you can set up automatic overlays or manually run the overlay feature at any time.
Enable Multiple Relativity® Overlays on an Existing Relativity® Plus Connector
After or while creating a Relativity® Plus connector, you can enable the multiple Relativity® overlay feature to overlay Relativity® field sets in multiple Brainspace datasets to a single Relativity® Workspace.
Note
This feature is only available for the Relativity® Plus connector.
Note
When a dataset build completes with this feature enabled on the Relativity® Plus connector, Brainspace creates a unique field in Relativity® to map each of the BD fields with the datasets in Brainspace. For more information on Brainspace fields, see Relativity® Overlay Fields.
To enable multiple Relativity® overlay field sets:
In the user dropdown menu, click Administration:
The Datasets screen will open.
In the Datasets screen, click the Connectors button.
Locate the Relativity® Plus connector, and then click the Update Connector icon.
The Relativity® Plus connector configuration dialog will open.
In the Overlay pane, toggle the Enable Multiple Overlay Field Sets switch to the On position:
Click the Test connector button.
After verifying that the connector configuration is valid, click the Update Connector button.
The connector configuration dialog will close automatically.
Every dataset in Brainspace that employs this connector will produce unique fields in the Relativity® Workspace.
Relativity® Overlay Fields
When configuring a Relativity® or Relativity® Plus connector, you will decide which fields to overlay (see Create a Relativity® Connector and Create a Relativity® Plus Connector).
brs_strict_dup_set_id
If the document is in an SDG, this is the SDG ID of the SDG. Otherwise it is NULL.
brs_strict_dup_pivot is
If the document is in an SDG, this is the document ID of the pivot member of the SDG. Otherwise it is NULL.
BD EMT Duplicate ID
The document identifier of the duplicate email message or attachment. Unique group identifier used to group all documents within each of the exact text duplicate sets.
This field can be removed from overlay. Removing this field from the overlay will prevent Relativity® users from knowing which control number or document IDs an email message or attachment is a duplicate of within an email thread.
BD EMT EmailAction
Identifies the specific action for each message within an email thread (send, forward, or reply).
This field can be removed from overlay. Removing this field from the overlay will prevent Relativity® users from knowing whether an email was the send (the original message in an email thread), a forward, or a reply within an email thread.
BD EMT FamilyID
The control number or document identifier of the parent email message within a document family within an email thread.
This field can be removed from overlay. Relativity® users will not know the Relativity® control number or document identifier of the parent email message when reviewing a document family (message and attachments) within an email thread if this field is removed from the overlay.
BD EMT
Intelligent sort field that allows you to sort email threads hierarchically in descending order so that the most inclusive messages for each branch within an email thread are sorted to the top along with any attachments to those inclusive messages.
This field can be removed from overlay. Removing this field from the overlay will not allow Relativity® users to sort the Brainspace Email Threads hierarchically in Relativity®.
BD EMT IsDuplicate
Identifies whether an email message is a duplicate within the email thread.
This field can be removed from overlay. This field is “Yes” if the email message or attachment is a duplicate of another message or attachment within the email thread. Removing this field from the overlay will prevent Relativity® users from knowing which email messages or attachments are duplicates within an email thread.
BD EMT IsMessage
Identifies which documents within an email thread are actual email messages. Documents are considered emails if they have a populated From field and are not identified as attachments.
This field can be removed from overlay. Removing this field from the overlay will prevent Relativity® users from knowing which documents within an email thread are actual email messages.
BD EMT IsUnique
Identifies which messages within an email thread are inclusive messages.
This field can be removed from overlay. Removing this field from the overlay will prevent Relativity® users from knowing which email messages are inclusive within an email thread. Relativity® users are only required to review the inclusive messages within an email thread as they contain the content of all the non-inclusive messages within the email thread.
BD EMT MessageCt
The total number of messages within an email thread.
This field can be removed from overlay. Removing this field from the overlay will prevent Relativity® users from knowing how many email messages are within an email thread.
BD EMT ThreadID
Unique identifier assigned to a group of messages within a single email thread.
This field can be removed from overlay. The same BD EMT ThreadID is assigned to all emails and attachments that belong to the same email thread. If this field is not overlaid into Relativity®, users will not be able to take advantage of Brainspace’s email threading for batching and review in Relativity®. There are several custom Relativity® Views created by Brainspace that require this field to be populated in order for the Views feature to function properly.
BD EMT ThreadIndent
This field is used for displaying the Email Thread view in Relativity® where messages are properly indented in the view based on the order in which the messages were created within the email thread. For example, a reply to a message will have one greater thread indent than the message it replies to.
This field can be removed from overlay. Relativity® users will not be able to use the custom Brainspace Email Thread Views if this field is not included in the overlay.
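The indent rule described above (a reply has an indent one greater than the message it replies to) can be sketched in Python. This is an illustration only: the `parent_of` mapping, the message IDs, and the choice of 0 as the indent of the first message are assumptions, and the sketch does not describe how Brainspace derives the field internally.

```python
from typing import Dict, Optional


def thread_indents(parent_of: Dict[str, Optional[str]]) -> Dict[str, int]:
    """Map each message ID to its indent: 0 for a root, parent indent + 1 otherwise."""
    indents: Dict[str, int] = {}

    def indent(msg_id: str) -> int:
        if msg_id not in indents:
            parent = parent_of[msg_id]
            indents[msg_id] = 0 if parent is None else indent(parent) + 1
        return indents[msg_id]

    for msg_id in parent_of:
        indent(msg_id)
    return indents


# Hypothetical thread: M2 and M4 reply to M1, and M3 replies to M2.
parent_of = {"M1": None, "M2": "M1", "M3": "M2", "M4": "M1"}
print(thread_indents(parent_of))  # {'M1': 0, 'M2': 1, 'M3': 2, 'M4': 1}
```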
BD EMT ThreadPath Full
Contains a semicolon-delimited list of the document IDs (control numbers) for all the messages that are included within each inclusive message.
This field can be removed from overlay. Relativity® users will not know which non-inclusive email messages are contained in each inclusive message in the email thread if this field is not included in the overlay. This will make inclusive-only reviews in Relativity® difficult to manage.
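Because the field is a semicolon-delimited list, it is straightforward to split it when working with exported values. The following Python sketch assumes hypothetical control numbers and a hypothetical `paths` mapping from each inclusive message to its field value; it shows one way to find which inclusive messages contain a given message.

```python
from typing import Dict, List


def parse_thread_path(value: str) -> List[str]:
    """Split a semicolon-delimited value into document IDs, dropping blanks."""
    return [part.strip() for part in (value or "").split(";") if part.strip()]


def containing_messages(paths: Dict[str, str], doc_id: str) -> List[str]:
    """Return the inclusive messages whose path value lists `doc_id`."""
    return [
        inclusive
        for inclusive, raw in paths.items()
        if doc_id in parse_thread_path(raw)
    ]


# Hypothetical values keyed by the control number of each inclusive message.
paths = {"CTRL0003": "CTRL0001;CTRL0002", "CTRL0005": "CTRL0001; CTRL0004"}
print(parse_thread_path(paths["CTRL0005"]))    # ['CTRL0001', 'CTRL0004']
print(containing_messages(paths, "CTRL0001"))  # ['CTRL0003', 'CTRL0005']
```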
BD EMT ThreadSort
Field that sorts email threads by ThreadIndent first and then by chronology (the order in which the messages were generated within each email thread).
This field can be removed from overlay. Relativity® users will not be able to sort the Brainspace email threads in Relativity® chronologically if this field is not included in the overlay.
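The ordering described above, indent first and then chronology, corresponds to a two-part sort key. The Python sketch below uses a hypothetical `Message` record and made-up timestamps. It illustrates the ordering only, not the actual value Brainspace stores in the field.

```python
from datetime import datetime
from typing import List, NamedTuple


class Message(NamedTuple):
    doc_id: str
    thread_indent: int
    sent: datetime


def thread_sort(messages: List[Message]) -> List[Message]:
    """Order messages by thread indent, then by sent time within each indent."""
    return sorted(messages, key=lambda m: (m.thread_indent, m.sent))


messages = [
    Message("B", 1, datetime(2024, 1, 1, 11, 0)),
    Message("D", 2, datetime(2024, 1, 1, 10, 30)),
    Message("A", 0, datetime(2024, 1, 1, 9, 0)),
    Message("C", 1, datetime(2024, 1, 1, 10, 0)),
]
print([m.doc_id for m in thread_sort(messages)])  # ['A', 'C', 'B', 'D']
```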
BD EMT UniqueReason
Indicates why the message is inclusive. “Attach” means the message had an attachment that is not present in the previous messages or is different from the attachment in the previous messages within an email thread. “Message” means the content of the message is not included in another email in the same email thread. “Message” and “Attach” both contain unique information.
This field can be removed from overlay. Relativity® users will not know why a message has been marked IsUnique if this field is not included in the overlay.
BD EMT ThreadHasMissingMessage
Indicates that parsing the ConversationIndex has revealed that a document in the thread has not been included in the Brainspace dataset.
This field can be removed from overlay. Users will not be able to see that the document was missing from the thread if this field is not included in the overlay.
BD EMT WasUnique
Indicates that this document was considered to contain unique content. However, a new document introduced in a subsequent build has all of this document’s content and more. This status will be preserved across all subsequent builds.
This field can be removed from overlay. Users will not be able to see that this document was previously considered to have unique content if this field is not included in the overlay.
BD EMT WasUniqueReason
Indicates why the message was unique. “Attach” means the message had an attachment that was not present in the previous message or was different from the attachment in the previous message in an email thread. “Message” means the content of the message was not included in another email within the same email thread. “Message” and “Attach” both contain unique information.
This field can be removed from overlay. Relativity® users will not know why a message has been marked WasUnique if this field is not included in the overlay.
BD EMT Intelligent Sort
Alternative sorting algorithm that presents the most complete document in an email thread first.
This field can be removed from overlay. Users will not be able to see the most complete version of the email thread if this field is not included in the overlay.
BD EMT AttachmentCount
The number of attachments included with this email.
This field can be removed from overlay. Users will not be able to see how many attachments are included with this email if this field is not included in the overlay.
BD StrictDupStatus
This field identifies the status of a document with regard to its strict exact-duplicate state. With the option to include metadata in CMML classifiers, it becomes necessary to consider that two documents may be textual exact duplicates but have differences in metadata; therefore, this field represents the strict exact-duplicate state (see note below).
It will be populated with one of three values:
unique: This document may not be considered a strict exact-duplicate of any other document.
duplicate: This document is considered a strict exact-duplicate of another document.
pivot: This document is the original document of which other documents are listed as duplicates.
This field can be removed from overlay. Relativity® users will not know whether this document is considered a strict exact duplicate if this field is not included in the overlay.
Note
Two documents are considered to be strict exact-duplicates if the analyzed text fields are identical (except for normalized whitespace), if all fields flagged as usedForExactDup in the schema.xml are identical, and if all fields flagged as “analyzed = true” in the schema.xml are identical.
Brainspace supplies a default schema that makes certain choices for which fields are marked as usedForExactDup and/or analyzed. The user can override those choices.
BD ExactDupSetID
Unique identifier for each Exact Duplicate group. Documents that are exact duplicates of one another are grouped together using this ID.
This field can be removed from overlay. This group identifier is used in Relativity® to understand which documents are part of the same exact text duplicate grouping. Documents that are exact text duplicates of one another will all get the same BD ExactDupSetID.
BD ExactDupStatus
This field identifies the status of a document with regard to its exact-duplicate state.
It will be populated with one of three values:
unique - This document may not be considered an exact duplicate of any other document.
duplicate - This document is considered an exact duplicate of another document.
pivot - This document is the original document of which other documents are listed as duplicates.
This field can be removed from overlay. Removing this field from the overlay will prevent Relativity® users from knowing whether this document is considered an exact duplicate.
Note
Two documents are considered exact duplicates if the analyzed text fields are identical (except for normalized whitespace) and all fields flagged as usedForExactDup in the schema.xml are identical.
BD IsExactPivot
Identifies the original document against which all exact duplicates were compared.
This field can be removed from overlay. Relativity® users will not know which document is considered to be the original against which all documents are compared to identify exact text duplicates if this field is removed from the overlay.
BD IsNearDupPivot
Identifies the original document against which all near duplicates were compared.
This field can be removed from overlay. Relativity® users will not know which document is considered to be the original against which all documents are compared to identify near duplicates if this field is removed from the overlay.
BD NearDupSimilarityScore
Contains the near duplicate similarity score for near-duplicate documents.
The score is a number between the near duplicate threshold (by default 0.8) and 1.0. It is calculated based upon all fields in the schema marked as `analyzed=true`. Note that the true/false setting is not controlled through the UI and should not be altered without consulting Reveal/Brainspace support.
This field can be removed from overlay to your third-party review platform. If that is done, users will not know how similar a near-duplicate document is to its original document.
BD Languages
A semicolon-delimited list of the languages potentially within a document.
This field can be removed from overlay. Relativity® users will not know what mix of languages are contained within a document if this field is removed from the overlay.
BD NearDupSetID
Unique identifier for each near-duplicate group. Documents that are near duplicates of one another are grouped together using this ID.
This field can be removed from overlay. Relativity® users will not know which documents belong to the same near duplicate set if this field is removed from the overlay. Users will also not be able to propagate coding decisions to near-duplicate documents in Relativity®.
BD NearDupStatus
This field identifies the status of a document with regard to its near-duplicate state.
It will be populated with one of three values:
unique - This document may not be considered a near duplicate of any other document.
duplicate - This document is considered a near duplicate of another document.
pivot - This document is the original document of which other documents are listed as duplicates.
This field can be removed from overlay. Relativity® users will not know whether this document is considered to be a near duplicate if this field is removed from overlay.
Note
By default, two documents are considered to be near duplicates if they share 80 percent of their text shingles in common.
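The idea of comparing documents by shared text shingles can be sketched in Python. The details here are assumptions made for illustration: the sketch uses three-word shingles and Jaccard similarity (shared shingles divided by all distinct shingles), neither of which is confirmed by this documentation as the measure Brainspace uses. Only the default threshold of 0.8 comes from the text above.

```python
from typing import Set, Tuple


def shingles(text: str, k: int = 3) -> Set[Tuple[str, ...]]:
    """Return the set of k-word shingles in `text` (lowercased, whitespace split)."""
    words = text.lower().split()
    if len(words) < k:
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + k]) for i in range(len(words) - k + 1)}


def shingle_similarity(a: str, b: str, k: int = 3) -> float:
    """Jaccard similarity of the two shingle sets (an assumed measure)."""
    sa, sb = shingles(a, k), shingles(b, k)
    if not sa and not sb:
        return 1.0
    return len(sa & sb) / len(sa | sb)


def is_near_duplicate(a: str, b: str, threshold: float = 0.8, k: int = 3) -> bool:
    """Return True when the similarity meets or exceeds the threshold."""
    return shingle_similarity(a, b, k) >= threshold


doc_a = "the quick brown fox jumps over the lazy dog near the river bank today"
doc_b = doc_a + " again"
doc_c = "completely different text about quarterly revenue forecasts"
print(is_near_duplicate(doc_a, doc_b))  # True
print(is_near_duplicate(doc_a, doc_c))  # False
```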
BD Primary Language
The primary (or dominant) language identified within a document.
This field can be removed from overlay. Relativity® users will not know the primary language identified within a document if this field is removed from the overlay.
BD RelatedSetID
Identifies the first parent cluster that is normal (not an exact duplicate or near duplicate). Directly correlates to ClusterID in Brainspace. This field identifies which documents are highly similar in terms of content but not similar enough to be considered near duplicates. Documents that are highly similar but not quite near duplicates are assigned the same BD RelatedSetID.
This field can be removed from overlay. Relativity® users will not be able to organize batches and perform review on documents that are highly similar if this field is removed from overlay.
BD StrictDupStatus
This field identifies the status of a document with regard to its strict exact-duplicate state. With the option to include metadata in CMML classifiers, it becomes necessary to consider that two documents may be textual exact duplicates but have differences in metadata; therefore, this field represents the strict exact-duplicate state.
It will be populated with one of three values:
unique: This document may not be considered a strict exact duplicate of any other document.
duplicate: This document is considered to be a strict exact-duplicate of another document.
pivot: This document is the original document of which other documents are listed as duplicates.
This field can be removed from overlay. Relativity® users will not know whether this document is considered to be a strict exact duplicate if this field is removed from overlay.
Note
Two documents are considered strict exact duplicates if the analyzed text fields are identical (except for normalized white space), all fields flagged as usedForExactDup in the schema.xml are identical, and all fields flagged as “analyzed = true” in the schema.xml are identical. Brainspace supplies a default schema that makes certain choices for which fields are marked as usedForExactDup and/or analyzed, but the user may override those choices.
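The difference between exact and strict exact duplicates can be sketched as two comparison keys in Python. Everything named here is hypothetical: the field names (`body`, `attachment_hash`, `subject`) and the way the flags are passed in are assumptions for illustration, and the sketch does not read a real schema.xml.

```python
import re
from typing import Dict, Sequence, Tuple


def normalize_whitespace(text: str) -> str:
    """Collapse runs of whitespace to single spaces and trim the ends."""
    return re.sub(r"\s+", " ", text).strip()


def exact_dup_key(
    doc: Dict[str, str],
    text_fields: Sequence[str],
    exact_dup_fields: Sequence[str],
) -> Tuple:
    """Key on normalized text fields plus fields flagged usedForExactDup."""
    return (
        tuple(normalize_whitespace(doc.get(f, "")) for f in text_fields),
        tuple(doc.get(f) for f in exact_dup_fields),
    )


def strict_dup_key(
    doc: Dict[str, str],
    text_fields: Sequence[str],
    exact_dup_fields: Sequence[str],
    analyzed_fields: Sequence[str],
) -> Tuple:
    """Extend the exact key with every field flagged as analyzed."""
    return exact_dup_key(doc, text_fields, exact_dup_fields) + (
        tuple(doc.get(f) for f in analyzed_fields),
    )


# Hypothetical documents: same text and hash, different subject metadata.
a = {"body": "Please  review\nthe draft.", "attachment_hash": "h1", "subject": "Draft"}
b = {"body": "Please review the draft.", "attachment_hash": "h1", "subject": "FW: Draft"}
text, exact, analyzed = ["body"], ["attachment_hash"], ["subject"]

print(exact_dup_key(a, text, exact) == exact_dup_key(b, text, exact))  # True
print(strict_dup_key(a, text, exact, analyzed)
      == strict_dup_key(b, text, exact, analyzed))  # False
```

In this sketch, the two documents are exact duplicates because their text matches after whitespace normalization, but they are not strict exact duplicates because an analyzed metadata field differs.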
BD Summary
A summary of the document using six words or phrases. For near duplicates, this field will have the six terms or phrases that best distinguish this document from the pivot. For pivots, this field will have the six terms or phrases that best represent this document.
This field can be removed from overlay. Relativity® users will not have a high-level summary of every document if this field is removed from overlay.
BDID
Brainspace unique identifier for every document ingested. This is an ID that BD gives every document that, when used sequentially, will show an evolution of documents. Every BDID is adjacent to its most similar document (e.g., BD_000000001, BD_000000002 with enough zeros for 999 million docs). Zeros are needed to maintain string sort order. Sorting documents by BDID will result in neighbor documents being highly related to each other, which expedites the review process.
This field allows Relativity® users to sort documents when batching for review so that the documents within Relativity® review batches are highly similar to one another in terms of content and vocabulary. Sorting by this field when creating batches will force “like” documents to be included in the same review batch. This has been proven to accelerate document review by as much as 90 percent.
This field can be removed from overlay. Relativity® users will not be able to take advantage of this field and sorting feature if this field is removed from overlay.
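The reason the zeros matter for string sort order can be shown with a short Python sketch. The `format_bdid` helper is hypothetical; the nine-digit width is taken from the example value BD_000000001 above.

```python
def format_bdid(n: int, width: int = 9) -> str:
    """Format a sequence number as a zero-padded BDID-style string."""
    if n < 0 or n >= 10 ** width:
        raise ValueError(f"{n} does not fit in {width} digits")
    return f"BD_{n:0{width}d}"


print(format_bdid(1))  # BD_000000001

# Padded IDs sort as strings in numeric order; unpadded IDs do not.
padded = sorted(format_bdid(n) for n in (10, 2, 1))
unpadded = sorted(f"BD_{n}" for n in (10, 2, 1))
print(padded)    # ['BD_000000001', 'BD_000000002', 'BD_000000010']
print(unpadded)  # ['BD_1', 'BD_10', 'BD_2']
```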
Predictive Coding Overlay Fields
BDPC Auto Code
For a predictive coding (PC) classifier (model), this field contains the recommended coding decision for every document. This field is only populated when the predictive coding session in Brainspace is closed by clicking on “Close Session.” This Relativity® field will only be populated if the user closes out the active PC session.
This field cannot be removed from overlay. Users may choose not to close out the PC session, which will leave this field blank or null in Relativity®.
BDPC Control Set
This field identifies all the documents that are included in the control set (model). This field is only populated when using Brainspace’s predictive coding (PC) workflow.
This field cannot be removed from overlay. This Relativity® field will only be populated if the user creates a control set for a PC session in Brainspace.
BDPC Is Responsive
This is the coding field used to code documents in Relativity® that will be used to train a Brainspace classifier. This field is only populated when using Brainspace’s predictive coding workflow.
This field cannot be removed from overlay. The field will only be populated if the user applies this field in Relativity® to code documents for a Brainspace PC session.
BDPC Needs Review
This field identifies all the documents in Relativity® that need to be reviewed for a given Brainspace training round. This field is only populated when using Brainspace’s predictive coding (PC) workflow.
This field cannot be removed from overlay. This Relativity® field will only be populated if the user creates a PC session in Brainspace and creates a control set or training round.
BDPC Predictive Rank
This field contains the most recent predictive rank. This field is only populated when using Brainspace’s predictive coding (PC) workflow. This Relativity® field will only be populated if the user creates a Brainspace PC session.
This field cannot be removed from overlay. This field is populated and then updates each time the user runs a PC training round in Brainspace.
BDPC Use for Training
This field identifies which documents will be used for training the model.
This field cannot be removed from overlay. This field is populated and then updates each time the user runs a PC training round in Brainspace.
CMML Overlay Fields
BD CMML ## Score Relativity® Field Name
This field is only populated when using Brainspace’s CMML workflow. This Relativity® field will only get populated if the user creates a Brainspace CMML classifier where a “Connect Tag” (Relativity® coding field) was used to train the classifier. A “BD CMML ## Score Relativity® Field Name” field will be created in Relativity® to store the predictive rank for that classifier where ## is the corresponding CMML classifier ID in Brainspace and “Relativity® Field Name” is the name of the Relativity® field connected to the classifier in Brainspace.
This field cannot be removed from overlay. This field is populated and then updates each time the user runs a training round in Brainspace for a CMML classifier. Multiple CMML classifiers can be created and run concurrently if more than one issue must be investigated.
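The naming pattern described above can be sketched in Python. The helper names and the `Responsive` field name are hypothetical. The documentation does not state whether the classifier ID is zero-padded, so this sketch inserts it as a plain integer, which is an assumption.

```python
import re
from typing import Optional, Tuple

# Pattern for "BD CMML <id> Score <Relativity field name>" as described above.
_CMML_SCORE = re.compile(r"^BD CMML (\d+) Score (.+)$")


def cmml_score_field_name(classifier_id: int, relativity_field_name: str) -> str:
    """Build the score field name for a classifier and its connected field."""
    return f"BD CMML {classifier_id} Score {relativity_field_name}"


def parse_cmml_score_field_name(name: str) -> Optional[Tuple[int, str]]:
    """Return (classifier ID, Relativity field name), or None if no match."""
    match = _CMML_SCORE.match(name)
    if match is None:
        return None
    return int(match.group(1)), match.group(2)


name = cmml_score_field_name(12, "Responsive")
print(name)                               # BD CMML 12 Score Responsive
print(parse_cmml_score_field_name(name))  # (12, 'Responsive')
```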
Relativity® Plus Configuration Options
Ingest Batch Size
The number of documents to be retrieved by each HTTP export request from Relativity®.
Analytics Overlay Batch Size
The number of documents to be sent by each HTTP overlay request to Relativity®.
Embed Native Viewer URL
Whether or not a Relativity® Document URL should be generated, per document.
HTTP(s) Request Timeout
The maximum number of milliseconds that any given HTTP request will wait for Relativity® to respond.
Maximum HTTP(s) Requests Per Second
The maximum number of HTTP requests that the Brainspace application will send to Relativity®, per second.
Validate User Facing URLs
Whether or not the Brainspace application should verify the base document URL and OAuth URLs.
API Query Page Size
The number of objects (documents, Relativity® Workspaces, saved searches, fields) that should be retrieved when querying the Relativity® API, per request.
Document Condition Size Limit
The number of documents that will be used for the optimized incremental ingest query.
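To see how the batch size and request rate options interact, the following Python sketch estimates how many HTTP requests an ingest or overlay needs and the minimum time the rate limit alone would impose. The class, attribute names, and numeric values are hypothetical placeholders, not Brainspace setting names or defaults. Real throughput also depends on response times, CPU count, and network bandwidth.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class PlusConnectorOptions:
    """Hypothetical container for a few of the options listed above."""
    ingest_batch_size: int          # documents per HTTP export request
    overlay_batch_size: int         # documents per HTTP overlay request
    request_timeout_ms: int         # maximum wait per HTTP request
    max_requests_per_second: float  # cap on requests sent to Relativity

    @property
    def request_timeout_seconds(self) -> float:
        return self.request_timeout_ms / 1000


def request_count(total_documents: int, batch_size: int) -> int:
    """Number of requests needed to cover all documents at this batch size."""
    return math.ceil(total_documents / batch_size)


def min_seconds_at_rate_limit(requests: int, options: PlusConnectorOptions) -> float:
    """Lower bound on elapsed time imposed by the request rate cap alone."""
    return requests / options.max_requests_per_second


# Placeholder values for illustration only.
options = PlusConnectorOptions(1000, 500, 60000, 5)
export_requests = request_count(250_000, options.ingest_batch_size)
overlay_requests = request_count(250_000, options.overlay_batch_size)
print(export_requests, min_seconds_at_rate_limit(export_requests, options))    # 250 50.0
print(overlay_requests, min_seconds_at_rate_limit(overlay_requests, options))  # 500 100.0
```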
Note
This product may only be used by parties with valid licenses for Relativity®, a product of Relativity ODA LLC. Relativity ODA LLC does not test, evaluate, endorse or certify this product.