Dataset Build Options
  • 29 Oct 2024

Incremental Analytics with Ingest

Use this build action to incrementally add documents to the dataset and update the indexes. This action updates Email Threading, Near Duplicates, and Exact Duplicates, but does not update the Brain or Clusters. The dataset will continue to use the existing Brain and will fold the new documents into the existing Clusters and Cluster Wheel. To update the Brain and Clusters, use the Full Analytics with Ingest build action. If the dataset has CMML classifiers, new documents will receive a score of 0 until the classifier is trained with the new documents.

Note

An Incremental build only ingests new documents and adds them to an existing dataset. It will not pick up changes to documents that are already in the dataset. To include changes to existing documents, you must run a full build with ingest (the Full Analytics with Ingest build action).

Additionally, if the user has not changed the Relativity® saved search and no new documents have been added to the search, the platform/plugin will detect that there is nothing new to ingest and will prevent the Incremental build from starting.
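
The behavior described in this note can be pictured as a simple comparison of document identifiers. The following is a minimal sketch only, assuming the check compares the IDs returned by the saved search against the IDs already in the dataset. The function names and the ID format are hypothetical and are not part of the product.

```python
def plan_incremental_build(dataset_ids, search_ids):
    """Return the document IDs an incremental build would ingest.

    Hypothetical helper: only IDs that appear in the saved search but
    not in the dataset count as new. Documents already in the dataset
    are never re-read, so edits to them are not picked up.
    """
    return sorted(set(search_ids) - set(dataset_ids))


def should_start_incremental_build(dataset_ids, search_ids):
    """Return False when the saved search contains nothing new."""
    return len(plan_incremental_build(dataset_ids, search_ids)) > 0


# Illustrative data (assumed, not taken from a real dataset).
dataset = {"DOC-001", "DOC-002", "DOC-003"}
unchanged_search = ["DOC-001", "DOC-002", "DOC-003"]
grown_search = ["DOC-001", "DOC-002", "DOC-003", "DOC-004"]

print(should_start_incremental_build(dataset, unchanged_search))  # False
print(plan_incremental_build(dataset, grown_search))  # ['DOC-004']
```

In this sketch an unchanged saved search produces an empty plan, which corresponds to the build being prevented from starting.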

Note

Incremental build handles Exact Duplicate Groups (EDG) in a special way so that the groups remain stable. It follows these rules:

  • Always preserve existing EDG pivots.

  • When forming a new EDG that involves an existing document, the existing document is always chosen as the pivot.
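
The two pivot rules above can be illustrated with a short sketch. It assumes that exact duplicates are identified by a shared content hash, and it breaks ties by choosing the smallest document ID. Both the data model and the tie-break are assumptions made for illustration, not a description of the product's internals.

```python
def assign_pivots(existing_docs, existing_pivots, new_docs):
    """Choose a pivot for every exact duplicate group after an incremental build.

    existing_docs:   dict of doc_id -> content hash for documents already ingested
    existing_pivots: dict of content hash -> pivot doc_id for groups that already exist
    new_docs:        dict of doc_id -> content hash for newly ingested documents
    """
    # Collect existing and new members for each content hash.
    groups = {}
    for doc_id, digest in existing_docs.items():
        groups.setdefault(digest, {"existing": [], "new": []})["existing"].append(doc_id)
    for doc_id, digest in new_docs.items():
        groups.setdefault(digest, {"existing": [], "new": []})["new"].append(doc_id)

    pivots = {}
    for digest, members in groups.items():
        if len(members["existing"]) + len(members["new"]) < 2:
            continue  # a single document does not form a duplicate group
        if digest in existing_pivots:
            # Rule 1: an existing pivot is always preserved.
            pivots[digest] = existing_pivots[digest]
        elif members["existing"]:
            # Rule 2: a new group that involves an existing document
            # uses the existing document as its pivot.
            pivots[digest] = min(members["existing"])
        else:
            # Group made only of new documents (tie-break is an assumption).
            pivots[digest] = min(members["new"])
    return pivots


existing_docs = {"A": "h1", "B": "h1", "C": "h2", "D": "h4"}
existing_pivots = {"h1": "B"}
new_docs = {"E": "h1", "F": "h2", "G": "h3", "H": "h3"}

print(assign_pivots(existing_docs, existing_pivots, new_docs))
# {'h1': 'B', 'h2': 'C', 'h3': 'G'}
```

Here group h1 keeps its pivot B even though E joins it, and the new group h2 takes the existing document C as its pivot rather than the new document F.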

Add New Documents to an Existing Dataset

To add new documents to an existing dataset:

  1. In the user drop-down menu, click Administration.

    The Datasets screen will open.

  2. Locate the dataset in the list, and then click the Settings icon.

    The Dataset Settings dialog will open.

  3. Click the Build button.

    The Dataset Build Options dialog will open.

  4. Click the Run This Build Type button for Incremental Analytics with Ingest.

    A dialog will open.

  5. Select any tags or duplicates to include in the build, and then click the Build button.

    The License Checks dialog will open.

  6. Click the Proceed button.

    The Datasets page will refresh and show the Dataset Queue build in progress.

  7. While the build is in progress, you can click the View Status button to view the build steps in progress. For information on each step in the build process, see Dataset Build Steps.

Full Analytics without Ingest

Use this build action to perform full analytics on previously ingested documents. This action will reanalyze the documents currently in the dataset and refresh the indexes.

Reanalyze Existing Documents

To reanalyze the existing documents:

  1. In the user drop-down menu, click Administration.

    The Datasets screen will open.

  2. Locate the dataset in the list, and then click the Settings icon.

    The Dataset Settings dialog will open.

  3. Click the Build button.

    The Dataset Build Options dialog will open.

  4. Click the Run This Build Type button for Full Analytics without Ingest.

    A dialog will open.

  5. Select any tags or duplicates to include in the build, and then click the Build button.

    The License Checks dialog will open.

  6. Click the Proceed button.

    The Datasets page will refresh and show the Dataset Queue build in progress.

  7. While the build is in progress, you can click the View Status button to view the build steps in progress. For information on each step in the build process, see Dataset Build Steps.

Full Analytics with Ingest

Use this build action to ingest all documents associated with the data source and to perform full analytics on the ingested documents. This action will add or update metadata associated with the documents. If the dataset has CMML classifiers, new documents will receive a score of 0 until the classifier is trained with the new documents.

To ingest and fully analyze the documents:

  1. In the user drop-down menu, click Administration.

    The Datasets screen will open.

  2. Locate the dataset in the list, and then click the Settings icon.

    The Dataset Settings dialog will open.

  3. Click the Build button.

    The Dataset Build Options dialog will open.

  4. Click the Run This Build Type button for Full Analytics with Ingest.

    A dialog will open.

  5. Select any tags or duplicates to include in the build, and then click the Build button.

    The License Checks dialog will open.

  6. Click the Proceed button.

    The Datasets page will refresh and show the Dataset Queue build in progress.

  7. While the build is in progress, you can click the View Status button to view the build steps in progress. For information on each step in the build process, see Dataset Build Steps.
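
As a recap of when each of the three build actions applies, the guidance in this article can be condensed into a small decision helper. This is a sketch for illustration only: the function and its parameters are hypothetical, and the product itself is operated through the Dataset Build Options dialog described above.

```python
def choose_build_action(has_new_documents, existing_documents_changed,
                        refresh_brain_and_clusters):
    """Suggest a build action based on the guidance in this article."""
    # Changes to documents already in the dataset are only picked up by a
    # full build with ingest. The same action is needed when new documents
    # arrive and the Brain and Clusters must be updated as well.
    if existing_documents_changed or (has_new_documents and refresh_brain_and_clusters):
        return "Full Analytics with Ingest"
    # New documents only: fold them into the existing Brain and Clusters.
    if has_new_documents:
        return "Incremental Analytics with Ingest"
    # Nothing to ingest: reanalyze the documents already in the dataset.
    return "Full Analytics without Ingest"


print(choose_build_action(True, False, False))   # Incremental Analytics with Ingest
print(choose_build_action(True, False, True))    # Full Analytics with Ingest
print(choose_build_action(False, True, False))   # Full Analytics with Ingest
print(choose_build_action(False, False, True))   # Full Analytics without Ingest
```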

