importing and exporting (med) Flashcards
what are some of the primary goals of processing?
- Discern, at an item level, exactly what data is found in a certain source.
- Record all item-level metadata as it existed prior to processing.
- Enable defensible reduction of data by selecting only items that are appropriate to move forward to review.
t or f: processing performs language identification.
false
what enables you to efficiently gather runtime diagnostic information?
the logging framework
what can you use logging for?
troubleshooting application problems when you need a very granular level of detail
why shouldn’t you set your logging to verbose when publishing documents to a workspace?
can cause your worker to run out of resources, which can then cause your publish job to cease entirely
what is a processing profile?
object that stores the numbering, deNIST, extraction and deduplication settings that the processing engine refers to when publishing documents in each data source that you attach to your processing set
what happens if you delete a processing profile that is associated with a processing set that you’ve already started?
the in-progress processing job will continue with the original profile settings you applied when the job was submitted
why won’t relativity re-extract text for a re-discovered file (unless an extraction error occurs)?
because processing always refers to the original master document and the original text stored in the database
is it possible for your workspace to contain a document family that has both suffixed and non-suffixed child documents?
yes
why is the same NIST list used for all workspaces in the environment?
because it is stored on the worker manager server
why don’t you need to set the Extract children field to Yes to have the files within PST and other container files extracted and processed?
because relativity breaks down container files by default without the need to specify to extract children
what happens when you select the dtSearch Text Extraction Method?
the Excel Header/Footer Extraction field below is made unavailable for selection because dtSearch automatically extracts header and footer information and places it at the end of the text
what would happen when you process files with both the OCR and the OCR Text Separator fields enabled?
any section of a document that required OCR will include text that says “OCR from image”, and this can pollute a dtSearch index, since now it has to add text that was not originally in the document
what would happen if you change the deduplication method in the middle of running a processing set?
could result in blank DeDuped Custodians or DeDuped Paths fields after publish, when those fields would otherwise display deduplication information
what does it mean to globally deduplicate your documents?
arranges for documents from each processing data source to be de-duplicated against all documents in all other data sources in your workspace; there should be no exact email duplicates in the workspace after you publish
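A minimal sketch of the idea behind global deduplication, not Relativity's actual implementation. It assumes a SHA-256 hash of the raw content as a stand-in for the real deduplication hash, and the function name and input shape are hypothetical.

```python
import hashlib


def global_dedupe(data_sources):
    """Keep the first copy of each item across ALL data sources.

    data_sources: dict mapping a source name to a list of
                  (item_name, content_bytes) tuples (hypothetical shape).
    Returns (masters, duplicates), each a list of (source_name, item_name).
    """
    seen = set()  # content hashes already encountered in any source
    masters, duplicates = [], []
    for source, items in data_sources.items():
        for name, content in items:
            # Assumption: SHA-256 of the bytes stands in for the real dedup hash.
            digest = hashlib.sha256(content).hexdigest()
            if digest in seen:
                duplicates.append((source, name))
            else:
                seen.add(digest)
                masters.append((source, name))
    return masters, duplicates
```

Because the set of seen hashes is shared across every source, an item in a later data source is compared against all earlier sources, not only against its own.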
what is a processing set?
an object to which you attach a processing profile and at least one data source and then use as the basis for a processing job
why should you never upgrade your relativity version while there are jobs in progress in your environment?
leads to inaccurate results when you attempt to finish those jobs after your upgrade is complete. This is especially important for imaging and processing jobs.
A single processing set can contain ____ data sources.
multiple
how many processing profiles can be added to a processing set?
only one
you can’t delete a workspace if: (3)
there is an in-progress:
inventory
discovery
publish job in the processing queue
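The rule above can be sketched as a simple check. The job-type strings, the "in progress" status value, and the function name are assumptions made for illustration, not Relativity identifiers.

```python
# Hypothetical labels for the three job types named on the card.
BLOCKING_JOB_TYPES = {"inventory", "discovery", "publish"}


def can_delete_workspace(queue_jobs):
    """Return True if no blocking processing job is in progress.

    queue_jobs: iterable of (job_type, status) tuples for one workspace
                (hypothetical representation of the processing queue).
    """
    return not any(
        job_type in BLOCKING_JOB_TYPES and status == "in progress"
        for job_type, status in queue_jobs
    )
```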
what would happen if you added documents to a workspace and linked those documents to an in-progress processing set?
distorts the processing set’s report data
how should you NEVER stop relativity services?
through Windows Services or IIS as a way to stop a processing job
inventory only populates:
job level errors
discovery populates:
job and document level errors
what field can you add to any view in order to indicate which processing set a document came from?
originating processing set document field
what mass operations are not available for use with processing sets?
copy, edit and replace
what instance setting determines the frequency with which the processing set console refreshes?
ProcessingSetStatusUpdateInterval
email notifications are sent:
upon the completion of processing sets, not data sources (unless there is a job-level error)
during publish, when does relativity start the next data source?
as soon as the previous one reaches the deduplication and document ID generation stage
If you add, edit, or delete a data source associated with a processing set that has already been inventoried but not yet discovered, then:
you must run inventory again on that processing set
You can’t add or delete a data source to or from a processing set that has: (2)
already been discovered, or if there’s already a job in the processing queue for the processing set
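The two data source rules above can be combined into one small decision function. This is a sketch for study purposes only: the function name, the boolean flags, and the returned strings are all hypothetical, and it covers only adding or deleting a data source.

```python
def data_source_change_effect(inventoried, discovered, job_in_queue):
    """Describe what happens when a data source is added to or deleted
    from a processing set, given the set's current state.

    All three arguments are booleans describing the processing set.
    """
    # Once the set is discovered, or a job for it sits in the queue,
    # data sources can no longer be added or deleted.
    if discovered or job_in_queue:
        return "not allowed"
    # Inventoried but not yet discovered: the change is permitted,
    # but inventory has to be run again.
    if inventoried:
        return "allowed, rerun inventory"
    return "allowed"
```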
what prefix takes precedence over the prefix of the processing profile?
prefix from the custodian
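As a quick illustration of that precedence, assuming a blank custodian prefix means "fall back to the profile" (the function and parameter names are hypothetical):

```python
def effective_prefix(profile_prefix, custodian_prefix=None):
    """Pick the numbering prefix: the custodian's prefix wins when set,
    otherwise the processing profile's prefix is used."""
    # Assumption: None or an empty string means no custodian prefix was set.
    return custodian_prefix if custodian_prefix else profile_prefix
```

For example, `effective_prefix("REL", "SMITH")` picks the custodian value, while `effective_prefix("REL")` falls back to the profile value.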
when new custodians are created using the quick-create set(s) layout, what is the classification set to?
custodian - processing
what does relativity do if you add processing to an environment that already has custodian information in its database?
creates separate custodian entries
what does relativity processing offer?
complete metadata extraction as well as full container extraction
what is involved in the setup of the relativity processing workflow?
- create new custodian entities
- create processing profile
- create processing set
what is involved in the process step of the relativity processing workflow?
- inventory and filter
- discover the files
- publish discovered files to a workspace