Migration Import - Logikcull
- Updated on 19 Nov 2024
Once you have exported your data from Open Discovery, Local Discovery or Reveal, there are two parts to the Logikcull import process:
1. Importing your data to Logikcull.
2. Quality checking your upload.
Importing Data Into Logikcull
Once you’ve exported your data from your old review platform, you’re ready to begin moving it to Logikcull. Here, Logikcull will help you automatically map your old fields to your new platform.
If you’ll be migrating data over several uploads, consider the timeline of your upcoming review. Begin with the data that needs to be accessed first, such as unreviewed documents or files needing a second-pass review.
Note
During project creation, be mindful of the timezone setting. Typically, the source database's timezone should match the destination's. You also have the option to normalize timezones by choosing UTC.
When you have a completed archive export that meets Logikcull’s production requirements, create a new project in your Logikcull account.
Go to the Uploads tab, click Create a new upload and select Production Upload. The next step will ask for the load files, which should be accessible outside of the .zip containing your exported data.
Select I'd like to import a data load and check Images, Natives, and/or Text, depending on what was included in the export. Usually, the volume will have subfolders titled Images, Natives and/or Text that let you know what was provided.
Drag and drop, or click to upload your load files. The top box is for .DAT or .CSV files; the bottom box is for .OPT or .LFP files.
Helpful Tip: Image load files
If the images are in multi-page PDF/TIFF format and the filenames match their corresponding Bates numbers, attaching an image load file (i.e., an OPT/LFP file) is optional. For example, for Begin Doc BATES000001 with a multi-page image named BATES000001.pdf or BATES000001.tif, Logikcull will automatically locate the images and load them accordingly.

Map your import fields. Logikcull will automatically map suggested matches based on the similarity of the field names and mark those rows in green. It is recommended to check that they are mapped correctly; the field definition and the sample data pulled from the first three lines of your load file can help you confirm or modify each choice. All fields must be mapped before you can move on to the next step. When mapping a field, you can choose to: (1) import it into an existing Logikcull document field, (2) import it as a new custom field, or (3) not import it.
Helpful Tips: Field Mapping
• If there is no corresponding Logikcull field for one of the imported fields, it may be helpful to choose the "import as a new field" option so the metadata is still searchable in Logikcull, as needed.
• If you’re uploading multiple datasets from your old platform, you only have to map fields once! Mapping templates can be re-used when uploading multiple productions, as the fields are expected to be consistent. When you’re done with the initial field mapping, save it as a template in the bottom left, then select that template for subsequent uploads.
• If you have a large number of fields, you can check individual fields, or use the "bulk map" dropdown to quickly select unmapped/mapped fields or deselect all fields. Then, you can choose to "import as new", "do not import", or "reset to Logikcull selections" in a single action.
• After successfully mapping all fields, validate the load file. If you encounter any errors during load file validation, please contact support through in-app chat for assistance.

Upload your data volume. This should be the entire set of images/natives/text files compressed into a single .zip file. Input a name for this import. Click Next to upload and validate the volume.
Note
The following options are checked by default: if any of your documents don’t have searchable text files or rendered images, Logikcull will generate them automatically.
- Attempt OCR if no text is detected
- Render native if no image is detected

While data is transferring, indicated by the blue progress bar, do not log out or disconnect your session, as doing so will interrupt the transfer.
Helpful Tips:
• Make sure your computer doesn’t go into sleep mode, as this interrupts the transfer.
• If the file is on a network/shared drive, make sure you have a continuous connection. If not, copy that file to a local drive and start the transfer from there.

When the progress bar turns green, you can safely disconnect and log out. At this stage the data is in Logikcull and processing will begin. If you run into any errors or have any questions about this workflow, feel free to contact our support team through in-app chat.
Helpful Tip: Upload Times
Consider the timeline of your review once you’ve migrated documents into Logikcull. If you’re migrating your archive through several uploads, begin with the data that needs to be accessed first. Once the data is successfully transferred to Logikcull (green progress bar), as a rule of thumb, processing 200-300GB of data takes around 3-4 days.
Post-upload quality control and setting up your review
Once your data is uploaded to Logikcull, give it a quick review to ensure everything is ready to go. Here are our top QC tips, plus suggestions to help you set up your review and get going quickly.
Once your upload completes, you'll receive a notification via email and in-app.
Before you can start reviewing, analyzing, and searching your data, we recommend working through this post-upload quality control checklist:
- Check that the entire document Bates range and page count shown in the upload report match your data load file (see the sketch after this checklist).
- Where applicable, check that the images, natives, and text are displayed in the document viewer. Please note that some images may reflect slip sheets as intended, depending on the file type or production.
- If parent-child relationships (also known as document families) exist, use the Family Status carousel filter to verify that “Is Parent” and “Is Child” items were properly identified.
- Check that the expected File or Email information from your DAT or CSV metadata load file, such as document date, file path, and email from/to/cc/bcc, was loaded appropriately.
- Execute a Bates number search to find a specific Begin Doc or Control ID number. Sample syntax: bates:FAM00000001
- If any corrections are needed, note that Logikcull allows you to perform overlays to update any of these items: images, natives, text, or metadata.
- For database uploads, search through your imported metadata fields to transfer work product by bulk tagging results.
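If you want to double-check the Bates range, document count, and page count reported by the upload against your own load files, a quick tally can help. The sketch below is not a Logikcull tool; it assumes a conventional Opticon-style OPT file (one line per page, with a "Y" in the fourth column at each document break) and a hypothetical filename.

```python
# Minimal sketch: tally documents and pages from an Opticon-style OPT file.
# Assumes the conventional layout BATES,VOLUME,IMAGE_PATH,DOC_BREAK,... with
# one line per page and "Y" in the DOC_BREAK column on a document's first page.
first_bates = last_bates = None
pages = docs = 0

with open("loadfile.opt", encoding="utf-8") as f:   # hypothetical filename
    for line in f:
        if not line.strip():
            continue
        cols = line.rstrip("\n").split(",")
        pages += 1
        if cols[3].strip().upper() == "Y":
            docs += 1
            last_bates = cols[0]
            if first_bates is None:
                first_bates = cols[0]

print(f"Documents: {docs}  Pages: {pages}  Bates range: {first_bates} - {last_bates}")
```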
You are now migrated and ready to go. As you continue your review, the appendix below covers Logikcull's upload format specifications for reference.
APPENDIX: Logikcull Upload Format Specifications
Please review the following data upload specifications and recommendations for preparing uploads to Logikcull.
The database volume must be compressed with a .zip extension.
If the zip file is password-protected, you will have to copy the database volume contents into a zip file without a password.
If the zip file has multiple parts (e.g. .zip, .z01, .z02, etc…), the partitioned zips must be fully extracted, combined into a single volume, and compressed into a single .zip file.
We recommend that the database volume be no larger than 50GB; for larger productions, ask that they be split into multiple volumes of 50GB or less.
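As a quick sanity check before uploading, you can verify these requirements locally. The sketch below is illustrative only; the volume filename is hypothetical, and it simply flags oversized, multi-part, or password-protected zips rather than fixing them.

```python
import zipfile
from pathlib import Path

MAX_SIZE_GB = 50  # recommended maximum volume size

def check_volume(path: str) -> None:
    p = Path(path)
    if p.suffix.lower() != ".zip":
        print("Volume must be a single file with a .zip extension")
    # look for split-archive parts such as VOL001.z01, VOL001.z02, ...
    if list(p.parent.glob(p.stem + ".z[0-9]*")):
        print("Multi-part zip detected: extract, combine, and re-zip into one file")
    size_gb = p.stat().st_size / 1024 ** 3
    if size_gb > MAX_SIZE_GB:
        print(f"{size_gb:.1f} GB exceeds the recommended {MAX_SIZE_GB} GB; consider splitting")
    with zipfile.ZipFile(p) as zf:
        # flag bit 0x1 marks an encrypted (password-protected) entry
        if any(info.flag_bits & 0x1 for info in zf.infolist()):
            print("Password-protected entries found: repackage the contents without a password")

check_volume("VOL001.zip")  # hypothetical volume name
```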
Production Upload Folder Structure
Directory structure typically includes the following folders within a top-level volume folder:
DATA
Contains the required metadata load file (DAT, CSV, or TXT) and an image load file (OPT or LFP). Logikcull accepts the following encoding formats for these load files:
UTF8
UTF16
ISO-8859-1 (Latin 1 - also sometimes referred to as Western European)
IMAGES
Supported formats:
Single page black and white TIFF and color JPEG
Multi-page black and white TIFF
Multi-page PDF (black and white, or color)
NATIVES
If Natives are included, Native path must be provided in the DAT file.
TEXT
If Text is included, Text path must be provided in the DAT file.
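For illustration, a volume following this structure might look like the tree below (all folder and file names are hypothetical):

```
VOL001.zip
└── VOL001/
    ├── DATA/
    │   ├── loadfile.dat
    │   └── loadfile.opt
    ├── IMAGES/
    │   └── 0001/
    │       ├── BATES000001.tif
    │       └── BATES000002.tif
    ├── NATIVES/
    │   └── 0001/
    │       └── BATES000003.xlsx
    └── TEXT/
        └── 0001/
            ├── BATES000001.txt
            └── BATES000003.txt
```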
Load File Structure
The sort order between the metadata load file and the image load file must be aligned.
A metadata load file must have a header row with unique, clear field names so that they can be mapped properly.
Metadata load files must be in delimited text format, but you can choose which delimiters to use. A metadata load file makes use of three (3) main delimiters:
Column/Field delimiter: Denotes the separation between column/fields.
Quote/Text Qualifier: Used to separate actual field data from the delimiters around it (especially useful when your field delimiter is a common character, such as a comma).
Newline: Code to denote a newline within the data encompassed by your quote delimiter (most often seen in extracted text fields or other long text fields).
We recommend these characters as delimiters, by default:
Column (separates load file columns) – ASCII 020 (¶)
Quote (marks the beginning and end of each load file field) – ASCII 254 (þ)
Newline (marks the end of a line in any extracted or long text field) – ASCII 174 (®)
A metadata load file must include at least an identifier field (the “Begin Doc” field). You can provide any additional fields in the load file. Fields that cannot be mapped to Logikcull directly can be imported as new fields, or not imported at all.
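To make the delimiters concrete, here is a minimal sketch (not a Logikcull utility) that reads a metadata load file using the recommended characters; the filename is hypothetical, and it assumes UTF-8 encoding.

```python
import csv

# Column delimiter ASCII 020 (\x14) and quote/text qualifier ASCII 254 (þ, \xfe).
# Newline characters (ASCII 174, ®) inside long text fields are left untouched.
with open("loadfile.dat", newline="", encoding="utf-8") as f:
    reader = csv.reader(f, delimiter="\x14", quotechar="\xfe")
    header = next(reader)              # unique field names from the header row
    for row in reader:
        record = dict(zip(header, row))
        print(record["Begin Doc"])     # the minimum required identifier field
```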
Date and Time Field Formatting
Date and time fields in the metadata load file are preferably separate fields, and should be formatted as follows:
Date:
mm/dd/yyyy
m/d/yyyy
m/dd/yyyy
mm/d/yyyy
mm-dd-yyyy
m-d-yyyy
m-dd-yyyy
mm-d-yyyy
yyyy-mm-dd
yyyymmdd
Time:
hh:mm:ss [AM|PM]
hh:mm [AM|PM]
hh:mm:ss
hh:mm
h:mm [AM|PM]
h:mm:ss [AM|PM]
hhmmss
Date-Time:
12/31/2019 1:13:30 PM
12/31/2019 1:13 PM
12/31/2019 1 13 30 PM
12/31/2019 1 13 PM
12-31-2019 1:30 PM
12-31-2019 13:13:30
12-31-2019 13:13
2019-12-31 13:13:30
2019-12-31 13:13
20191231 131330
20191231 1313
2019-12-31T13:13:30-06:00
2019-12-31T13:13:30Z
Months (mm), days (dd), and hours (hh) may all be denoted by a single digit except for yyyy-mm-dd and yyyymmdd formatting.
If the time is in military format, the AM/PM indicator is not required.
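As a quick illustration of the date formats above (this is not Logikcull's parser, just a sketch), a metadata value can be tried against a handful of the accepted patterns:

```python
from datetime import datetime

# A few of the accepted date formats listed above, expressed as strptime patterns.
ACCEPTED_DATE_FORMATS = ["%m/%d/%Y", "%m-%d-%Y", "%Y-%m-%d", "%Y%m%d"]

def parse_date(value: str) -> datetime:
    for fmt in ACCEPTED_DATE_FORMATS:
        try:
            return datetime.strptime(value, fmt)
        except ValueError:
            continue
    raise ValueError(f"Unrecognized date format: {value!r}")

print(parse_date("12/31/2019"))  # mm/dd/yyyy
print(parse_date("20191231"))    # yyyymmdd
```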
Fields available for direct mapping in a Production Upload
| CONTROL NUMBERS | FILE METADATA |
| --- | --- |
| Begin Doc (minimum required) | File Name (preferably with an extension) |
| End Doc | File Path |
| Family ID* | Document Date |
| Begin Family* | File Size |
| End Family* | File Author |
| Begin Attach* | Last Saved By |
| End Attach* | File Company |
| | File Keywords |
| EMAIL METADATA | File Title |
| Email From | File Comments |
| Email To | File Subject |
| Email CC | File Revision Number |
| Email BCC | File Date Created |
| Email Subject | File Time Created |
| Email Date Sent | File Date Last Modified |
| Email Time Sent | File Time Last Modified |
| Email Date Received | Full Text |
| Email Time Received | |
| SPECIAL FIELDS | |
| Native File Link† | |
| Text File Link† | |
Establishing Family Relationships
“Family ID”, “Begin Family” and “End Family”, or “Begin Attach” and “End Attach” fields are utilized to establish a family relationship. This can only be accomplished during the initial upload. Family relationships cannot be overlaid. An example of supported metadata information for family members is shown below.
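For instance, a hypothetical parent email (BATES000001-BATES000003) with two attachments might carry attachment-range metadata like this, with every family member sharing the same Begin Attach/End Attach values:

| Begin Doc | End Doc | Begin Attach | End Attach |
| --- | --- | --- | --- |
| BATES000001 | BATES000003 | BATES000001 | BATES000007 |
| BATES000004 | BATES000005 | BATES000001 | BATES000007 |
| BATES000006 | BATES000007 | BATES000001 | BATES000007 |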
Custodian Field
If you'd like to populate Logikcull's existing custodian field, it must be included in the initial metadata load file used to upload the production; the custodian field cannot be overlaid. Custodian names must be fewer than 50 characters and restricted to letters, numbers, spaces, and [,._-"()]
Native, Text, and Image Links
Native File Link and Text File Link in the metadata load file must be properly mapped and match the relative file paths in the volume directory. The Image File Link in the image load file, if provided, also must match the relative file path to images in the volume directory.
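For illustration, using the hypothetical volume layout shown earlier in this appendix, a load file row for BATES000003 might carry links such as Native File Link NATIVES\0001\BATES000003.xlsx and Text File Link TEXT\0001\BATES000003.txt; both values must resolve to files at exactly those relative paths inside the uploaded .zip.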