Metadata Filtering
  • 19 Nov 2024

Part of Reveal's search capability is filtering on metadata. The categories of Formats and People are available in the Sidebar. Some metadata, such as selectable Dates, Document Types, and Custodians, may be selected from the Dashboard Graph. All fields may be added to a search under +Add Condition > Fields in Advanced Search.

Note

Unless you have the role permission Allow All Field / Tag Access for Search, your search, Sidebar Filter options, and Dashboard data visualizations are limited to the fields and tags in the profiles assigned to your teams or, if you are not assigned to a team, to the Default field and tag profiles.

Starting with keyword search, we will set out the ways that metadata may be used as filters in Reveal.

Start with a keyword search, for example, the term raptor.

27 - 01 - Format Filter select-1-4

In this illustration, the keyword appears as a "pill" in the Search Box, and the search results are reflected in the Dashboard graphs below. Under Filters in the Sidebar, we have opened the Formats item, which now displays the top five document types (as expressed by their common Windows file extensions) and the number of documents of each type in the search result. The View all... link, which opens a Format selection window, is covered below.

All items in the Sidebar, including the Filter selections under Formats, are added to the search by clicking on the item, one at a time; each creates a new "pill" in the search. Here is the result of adding the xls format to the search, where an [Advanced] pill may be clicked to open a view of the keyword and filter in the Advanced Search builder:

27 - 02 - Format Filter result-1-4

A few notes on the changes to the Dashboard screen:

  1. The filter pill Extension: xls is added in the Search Box after the original keyword with an AND Boolean connector, meaning that both criteria must be satisfied by the search.

  2. All Filter values will be re-sorted by updated retrieval results.

  3. The Timeline, set here to Application Created Date, is updated to reflect the values found in the filtered XLS documents. Note that the Timeline would show No data to display if Date Sent, a value present only in emails, were selected instead.

  4. The Document Types graph now displays only xls, the one value chosen.

  5. In fact, all data visualization graphs are updated to show the current results for XLS file types containing the term raptor.

  6. The Results Pane at the right contains only spreadsheets.

  7. To remove the filter pill Extension: xls, click on the X at the right of the pill.
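
The behavior described in these notes can be pictured with a short Python sketch. It is an illustrative model only, not Reveal's implementation: the `docs` records, their field names, and the `matches_keyword` and `top_formats` helpers are all hypothetical. It shows the keyword and the Extension: xls pill joined by AND, and how the per-format counts follow the current result set.

```python
from collections import Counter

# Hypothetical document records; the field names are assumptions for illustration.
docs = [
    {"id": 1, "text": "Raptor hedge summary", "extension": "xls"},
    {"id": 2, "text": "Raptor memo", "extension": "doc"},
    {"id": 3, "text": "Quarterly budget", "extension": "xls"},
    {"id": 4, "text": "raptor valuation", "extension": "xls"},
]

def matches_keyword(doc, term):
    """Case-insensitive substring match standing in for a keyword search."""
    return term.lower() in doc["text"].lower()

def top_formats(results, n=5):
    """Count documents per extension, most frequent first (like the Formats list)."""
    return Counter(d["extension"] for d in results).most_common(n)

# Keyword pill only.
keyword_hits = [d for d in docs if matches_keyword(d, "raptor")]

# Adding the Extension: xls pill ANDs a second condition onto the keyword.
filtered = [d for d in keyword_hits if d["extension"] == "xls"]

print(top_formats(keyword_hits))  # [('xls', 2), ('doc', 1)]
print(top_formats(filtered))      # [('xls', 2)]
```

Removing the pill corresponds to dropping the second condition, which restores the counts for the keyword alone.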

Adding Multiple Filters of the Same Type

Multiple filters may be selected for any type using the options box opened by clicking View all… at the bottom of the Formats list. The View all... option is available in many filters having more than five values, which in the current listing would be Formats, People>Custodians and Tags.

27 - 03 - Format Filter - More-1

Note

You may also select all representatives of an application supertype (e.g., Spreadsheets, Word Processing) under the Application Type filter starting with Reveal 2024.2; this would allow filtering all .xls, .xlsx, .wks and .wk1 files in the project dataset by simply selecting Spreadsheets.

The option box, which under Formats is labeled Extension, opens with a multi-selectable list of items and their counts in the current search.

If a filter of this type is already selected, its value is the only option shown. To remove the filter (for example, the pill for Extension: xls), click the X at the right of the pill in the Search Box.

In addition to the ability to select values from the list, you may also select Any Value (indicating that there is some metadata value for a document record) or No Value (which would collect all document records for which there is no File Extension metadata value). You may further specify a multi-value filter using the following:

  • Select whether to use the exact Is or the less specific Contains. Is XLS would retrieve all documents with that specific entry for File Extension, while Contains XLS would retrieve documents with either the XLS or the XLSX file extension.

  • Show project-wide counts is a checkbox that, if selected, will show counts for listed items for the entire project regardless of the current search. Leaving this unchecked, which is the default, shows the results within current search criteria.

  • Connect the selected items (if more than one) using the default AND or the OR connector.
     27 - 04 - Format Filter Modal selections-1
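
The difference between Is, Contains, Any Value, No Value, and the connector can be sketched in a few lines of Python. This is a simplified model under stated assumptions, not Reveal's matching logic: the function names are hypothetical, matching is assumed to be case-insensitive, and a missing value is represented as `None`.

```python
def is_match(value, target):
    """Is: the whole field value must equal the target (case-insensitive here)."""
    return bool(value) and value.lower() == target.lower()

def contains_match(value, target):
    """Contains: the target may appear anywhere inside the field value."""
    return bool(value) and target.lower() in value.lower()

def any_value(value):
    """Any Value: the record has some value for the field."""
    return value not in (None, "")

def no_value(value):
    """No Value: the record has no value for the field."""
    return value in (None, "")

def matches(value, targets, operator=is_match, connector="OR"):
    """Combine several selected values with the chosen connector."""
    checks = [operator(value, t) for t in targets]
    return all(checks) if connector == "AND" else any(checks)

# Hypothetical File Extension values, including one record with no value.
extensions = ["xls", "xlsx", "doc", "pdf", "msg", None]

is_xls = [e for e in extensions if is_match(e, "XLS")]
contains_xls = [e for e in extensions if contains_match(e, "XLS")]
missing = [e for e in extensions if no_value(e)]
multi = [e for e in extensions if matches(e, ["xls", "doc", "pdf"], connector="OR")]

print(is_xls)        # ['xls']
print(contains_xls)  # ['xls', 'xlsx']
print(missing)       # [None]
print(multi)         # ['xls', 'doc', 'pdf']
```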

This pop-up box can also Export the list to CSV (icon to the right of AND<->OR) or Quick Search... the list for values. Click Add To Search when selections are complete. Here is what the above selections look like in Advanced Search once the Advanced pills are expanded:

27 - 05 - Format Filter Modal selections in Advanced Search

Clicking on the File Extension: 3 items pill re-opens the File Extension options modal.

Clicking Search closes Advanced Search and returns to the Dashboard. Here is how the current search (which includes the keyword raptor and the filtering extensions xls OR doc OR pdf) is displayed:

27 - 06 - Format Multi-Filter result-1-4

Note

We have changed the Timeline graph to display the Master Date value. Note too that even though we have three selected extensions, other family file formats are displayed in both FILTERS > Formats in the Sidebar and the Document Types graph.

Adding Other Metadata Filters

Further filters may be applied from several locations.

  1. Filters (Sidebar): The filters and subfilters are presented in alphabetical order under the FILTERS section of the Sidebar. Values from People or Tags may be added, in addition to Deduplication and selections from other non-metadata sections of the Sidebar (see Grid for further details):  

    1. Annotations - Quickly examine a list of documents based upon reviewer notes and markups of document images.

    2. Application Type – Supersets of application categories (e.g., Spreadsheets, Word Processing, Audio, Video) as identified in Reveal Processing, useful in quickly aggregating all user data of a certain type.

    3. Batch Folders – Quickly examine folders created by AI-Driven Batch process for your presently-selected tag profile.

    4. Deduplication - As with the Dashboard's Candy Bar, offers to filter by Exact Duplicates or Near Duplicates.

    5. Document Status - Selects on some Fixed Field and other metadata attributes of documents. Has Been Produced, a flag set by checking Mark Documents as Produced in creating a production job, has been added as of Reveal 2024.2 to quickly display produced and non-produced document lists.

    6. Email Threads - Identify messages that belong to the same email conversation chain, duplicates within the thread and messages containing unique content not present in any other message.

    7. Emotional Intelligence - Shows analytic scoring for Positivity, Negativity and other assessments of emotional content found in the language of documents.

    8. Entities - Categories of items identifiable by analytics and useful in filtering.

    9. Formats - Lists the top five Document Types (by extension) in the currently viewed document set as well as Any Value or No Value, with the ability to View all... in a checkbox modal of all available values for filtering.

    10. People - The subfolder Custodians lists the top five people in the currently viewed document set as well as Any Value or No Value, with the ability to View all... in a checkbox modal of all available values for filtering.

    11. Privileged Tags – Quickly examine all documents in the current result set coded with tags in the present tag profile flagged as Privileged. Counts are shown for tags.

    12. Reviewed Status - Shows branches for each of the user's tag profiles, and breaks out documents under each profile as Reviewed or Not Reviewed based upon the settings for each tag profile.

    13. Reviewed Workflow – Filters for Reviewed Status tags coded by User and by Tag Profile, along with Any Value and No Value aggregations for each.
       27 - 06a - Review Workflow filter

    14. Tags - Shows all tags in the user's currently selected tag profile for filtering selection. Counts are shown for tags.

  2. Advanced Search: The search may be viewed and manipulated in Advanced Search, including updates using +Add. The Fields section under +Add in Search Builder lists all fields for selection as metadata filters.  

    1. Here we will select Custodian ...
       27 - 07 - Format Multi-Filter Add Field Condition

    2. ...and enter a value in the field's search box to find and select the desired value(s).
       27 - 08 - Format Multi-Filter Add Custodian dialog-1

    3. This adds AND Custodian: skilling as a third clause of the search.
       27 - 09 - Format Multi-Filter AND Custodian Advanced Search

    4. The AND may be toggled to an OR with a click. Items may be repositioned within the current pill by clicking and dragging the handles at left.
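
The three-clause search built in these steps, and the effect of toggling the last connector, can be modeled roughly as follows. This is a sketch under assumptions, not Reveal's query engine: the `Clause` class, `evaluate`, and `build_search` are hypothetical names, the sample records (including the custodian "lay") are invented, and clauses are assumed to be evaluated left to right because the article does not state precedence rules.

```python
from dataclasses import dataclass
from typing import Callable

@dataclass
class Clause:
    label: str
    test: Callable[[dict], bool]
    connector: str = "AND"  # how this clause joins the result so far

def evaluate(doc, clauses):
    """Fold clauses left to right (assumed order; real precedence may differ)."""
    result = None
    for clause in clauses:
        hit = clause.test(doc)
        if result is None:
            result = hit
        elif clause.connector == "OR":
            result = result or hit
        else:
            result = result and hit
    return bool(result)

def build_search(custodian_connector="AND"):
    """Keyword, then the three-extension filter, then the Custodian clause."""
    return [
        Clause("keyword: raptor", lambda d: "raptor" in d["text"].lower()),
        Clause("File Extension: 3 items",
               lambda d: d["extension"] in {"xls", "doc", "pdf"}),
        Clause("Custodian: skilling",
               lambda d: d["custodian"] == "skilling", custodian_connector),
    ]

# Hypothetical records for illustration.
docs = [
    {"id": 1, "text": "Raptor hedge", "extension": "xls", "custodian": "skilling"},
    {"id": 2, "text": "Raptor memo", "extension": "pdf", "custodian": "lay"},
    {"id": 3, "text": "Lunch plans", "extension": "msg", "custodian": "skilling"},
]

and_ids = [d["id"] for d in docs if evaluate(d, build_search("AND"))]
or_ids = [d["id"] for d in docs if evaluate(d, build_search("OR"))]

print(and_ids)  # [1]
print(or_ids)   # [1, 2, 3]
```

In this model, toggling the Custodian clause from AND to OR widens the result from documents that satisfy all three clauses to documents that satisfy either the first two clauses or the Custodian clause alone.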

Data Visualization Filters

The Reveal Dashboard contains graphs representing values in the currently retrieved data, and clicking on any of these values will add further filters to the search. In addition to the Timeline and the Candy Bar (Originals / Duplicates / Near Duplicates), the analytic and filtering options available from dashboard data visualizations include the following modifiable and movable graphics.

27 - 10 - Filters from Graphs

  1. Extensions, or other selectable metadata. Any change to the selected metadata value will persist to subsequent sessions.

  2. Entities, either showing all types of people, places or things tracked or a selected type. Any change to the selected value will persist to subsequent sessions.

  3. Custodian, to filter by the party providing a certain set of documents. Any change to the selected metadata value will persist to subsequent sessions.

  4. Documents by Predictive Scores, analytics shown by model used.

  5. Senders + Recipients to filter by persons either sending or receiving messages.

  6. Domains to filter a search by domains encountered in messages sent or received, and whether some documents were unique.

  7. Emotional Intelligence to apply language analytics to contents returned by a search.

