Electronic Discovery Quiz Flashcards

1
Q

Deposition

A

interrogation of a party or witness (“deponent”) under oath, where both the questions and responses are recorded for later use in hearings or at trial

2
Q

Interrogatories

A

Interrogatories are written questions posed by one party to another to be answered under oath. No recording is taken.

3
Q

Requests for Production:

A

demand to inspect or obtain copies of tangible evidence and documents

4
Q

Requests for Physical and Mental Examination:

A

request to examine a party’s physical or mental condition

5
Q

Requests for Admission:

A

used to require parties to concede, under oath, that particular facts and matters are true or that a document is genuine.

6
Q

Subpoena:

A

a directive requiring the recipient to take some action, typically to appear and give testimony or hand over or permit inspection of specified documents or tangible evidence.

7
Q

Scope of discovery:

A

nonprivileged, proportional, and relevant

8
Q

Protective order:

A

“The court may, for good cause, issue an order to protect a party or person from annoyance, embarrassment, oppression, or undue burden or expense”

9
Q

ESI competence

A

Assessment, Preservation, Sources, Custodian, Search, Collection, Counsel, Conference, Production

10
Q

EDRM

A

Electronic Discovery Reference Model

11
Q

Stages in EDRM

A

Information Governance, Identification, Preservation, Collection, Processing, Review, Analysis, Production, Presentation

12
Q

Information Governance

A

Getting your electronic house in order to mitigate risk & expenses, from initial creation of ESI (electronically stored information) through its final disposition.

13
Q

Identification:

A

Locating potential sources of ESI & determining its scope, breadth & depth.

14
Q

Preservation

A

Ensuring that ESI is protected against inappropriate alteration or destruction.

15
Q

Collection

A

Gathering ESI for further use in the e-discovery process (processing, review, etc.)

16
Q

Processing

A

Reducing the volume of ESI and converting it, if necessary, to forms more suitable for review & analysis.

17
Q

Review

A

Evaluating ESI for relevance & privilege.

18
Q

Analysis

A

Evaluating ESI for content & context, including key patterns, topics, people & discussion.

19
Q

Production

A

Delivering ESI to others in appropriate forms & using appropriate delivery mechanisms.

20
Q

Presentation

A

Displaying ESI before audiences

21
Q

Analog

A

analog recording— the variations in the recording are analogous to the variations in the music

22
Q

Areal Density

A

the quantity of data (in bits) that can be stored on a given surface area of a computer storage medium.

23
Q

Actuator Arm

A

on an electromagnetic hard drive, holds the read/write heads

24
Q

Bates Numbering

A

an organizational method to label and identify legal documents, especially those produced in discovery.

25
Q

Behind the Firewall:

A

refers to on-premises (“on prem”) computers and networks that exist within a party’s physical dominion

26
Q

CD

A

optical media, holds about 700 MB

27
Q

CHS Addressing

A

file storage locations were based on the physical geometry of the platters, addressed by Cylinder, Head and Sector tuples
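
As a worked illustration, a CHS tuple can be converted to a single linear sector number with the conventional formula LBA = (C x heads + H) x sectors_per_track + (S - 1). The geometry defaults below (16 heads, 63 sectors per track) and the function name are assumptions chosen for the example, not values from the card.

```python
def chs_to_lba(cylinder: int, head: int, sector: int,
               heads_per_cylinder: int = 16, sectors_per_track: int = 63) -> int:
    """Convert a Cylinder/Head/Sector tuple to a linear block address.

    Cylinders and heads count from 0; sectors count from 1.
    The default geometry is an illustrative assumption.
    """
    if not 1 <= sector <= sectors_per_track:
        raise ValueError("sector must be between 1 and sectors_per_track")
    if not 0 <= head < heads_per_cylinder:
        raise ValueError("head must be between 0 and heads_per_cylinder - 1")
    if cylinder < 0:
        raise ValueError("cylinder must not be negative")
    return (cylinder * heads_per_cylinder + head) * sectors_per_track + (sector - 1)


print(chs_to_lba(0, 0, 1))  # first sector on the disk -> 0
print(chs_to_lba(1, 0, 1))  # first sector of the second cylinder -> 1008
```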

28
Q

Cloud:

A

cloud servers typically reside in facilities not physically accessible to the persons using them, and the servers are not typically dedicated to a single user.

29
Q

Clusters

A

operating systems speed the process by grouping sectors into contiguous chunks of data called clusters.

30
Q

Cylinders

A

Tracks that overlie one another on both sides of a platter and across multiple platters

31
Q

De-Duplication

A

Hashing serves to flag identical documents, permitting a single, consistent assessment of an item that might otherwise have cropped up hundreds of times and been differently characterized

32
Q

De-NISTing

A

cull data collected from computers that couldn’t be evidence because it isn’t a custodian’s work product. It’s done by matching hash values of collected data files to hash values corresponding to common retail software and operating systems.

33
Q

DVD

A

Optical media, holds 4.7 GB (single-layer)

34
Q

Electromagnetic

A

examples include floppy disks, electromagnetic hard drives, and magnetic storage tape

35
Q

Encoding

A

how data is translated to be stored in various forms of media, such as binary

36
Q

Form Factor

A

hardware design aspect that defines and prescribes the size, shape, and other physical specifications of components, particularly in electronics

37
Q

Hashing

A

the use of mathematical algorithms to calculate a unique sequence of letters and numbers to serve as a reliable digital “fingerprint” for electronic data.
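
A minimal sketch of the fingerprint idea, using Python's standard hashlib module. The helper name `fingerprint` and the choice of SHA-256 are assumptions for illustration; e-discovery tools may use other algorithms such as MD5 or SHA-1.

```python
import hashlib


def fingerprint(data: bytes, algorithm: str = "sha256") -> str:
    """Return the hexadecimal hash of the data (its digital fingerprint)."""
    return hashlib.new(algorithm, data).hexdigest()


original = b"Meet me at noon."
altered = b"Meet me at noon!"  # one character changed

# Identical data always yields the same fingerprint.
print(fingerprint(original) == fingerprint(original))  # True
# Even a one-character change yields a different fingerprint.
print(fingerprint(original) == fingerprint(altered))   # False
```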

38
Q

Master File Table

A

NTFS uses a powerful and complex file system database called the Master File Table or MFT to manage file storage.

39
Q

Network Share

A

An allocation of remote storage employed to facilitate routine backup of user data. Although network shares are not local to the user’s computer, they are typically addressed using drive letters (e.g., M: or T:) as if they were local hard drives. When the user stores data to the mapped drive, that data is backed up along with the contents of the file server.

40
Q

NTFS:

A

Windows file system, uses MFT

41
Q

Platters

A

round, flat discs on an EM hard drive, coated on both sides with a special material able to store data as magnetic patterns

42
Q

RAID Array:

A

Redundant Arrays of Independent Disks. Data divided across multiple drives using a technique called striping. “When a drive fails using RAID 1, you’ve still got one copy of the data; when a drive fails using RAID 0, you’ve got nothing—zip, ZERO!.”

43
Q

Read-Write Head:

A

on an electromagnetic hard drive, reads and writes data from a platter. The gap between the head and the platter is a thin cushion of air

44
Q

SAN and NAS:

A

storage devices: a SAN (for Storage Area Network) or a NAS (for Network Attached Storage).

45
Q

Sectors

A

Disk formatting lays out a platter first as concentric rings of data called tracks, then subdivides each track into tiny arcs called sectors

46
Q

SIM card

A

serve both to authenticate and identify a communications device on a cellular network and to store SMS messages and phone book contacts.

47
Q

Solid State Storage

A

storage devices with no moving parts, where the data resides entirely within the solid semiconductor material that makes up the memory chips. Examples include flash drives, memory cards, SIMs and solid-state drives

48
Q

Media Tracks:

A

low level formatting divides each platter into tens of thousands of densely packed concentric circles called tracks. Each track is broken down into physical sectors of 4096 bytes.

49
Q

Partitioning

A

divides drives into volumes, which users see as drive letters (e.g., C:, E:, F: and so on).

50
Q

Formatting

A

defines the logical structures on the partition and places necessary operating system files at the start of the disk to facilitate booting.

low level = carving out tracks and sectors in the old days; high level = defines structures on a partition

51
Q

“Local” or “on-prem” servers

A

employ hardware that’s physically available to the party that owns or leases the servers

52
Q

“Peer-to-Peer” (P2P) networks:

A

exploit the fact that any computer connected to a network has the potential to serve data across the network.

53
Q

In order of data capacity:

A

Bits < Bytes (8 bits) < Sectors (512 bytes, conventional) < Clusters (8 sectors = 4096 bytes) < Tracks < Cylinders < Platters < Drive < Array

54
Q

EXIF data

A

photo metadata, detailing information about the date and time the photo was taken, the camera, settings, exposure, lighting, even precise geolocation data.

55
Q

Load files:

A

ancillary files that accompany a production and carry the metadata that was stripped away when files were converted to TIFF images

56
Q

Header data:

A

detailing the routing and other information about message transit and delivery for email

57
Q

“MAC dates”

A

Last Modified, Last Accessed and Created. Last modified is most useful, the last accessed is least useful

58
Q

Chain of custody

A

describes the processes used to track and document the acquisition, storage and handling of evidence to be able to demonstrate that the integrity of the evidence has not been compromised.

59
Q

Preserving family relationships

A

safeguarding the association between the data and metadata

60
Q

e-discovery review platforms

A

the software tools lawyers use to search, sort, read and tag electronic evidence

61
Q

TIFF images:

A

converting files to TIFF images strips away all the metadata

62
Q

Williams v. Sprint/United Mgmt Co., 230 F.R.D. 640 (D. Kan. 2005)

A

The court responded by ordering production of all metadata as maintained in the ordinary course of business, save only privileged and expressly protected metadata.

63
Q

Privilege log

A

disclose what’s been withheld or redacted

64
Q

Delimited load file

A

Metadata may be produced as a database or housed in a delimited load file, a text file in which designated characters separate the field values

65
Q

Crucial Distinctions: System versus Application Metadata

A

File tables hold system metadata about the file (e.g., name, locations on disk, MAC dates): it’s CONTEXT residing outside the file
Files hold application metadata (e.g., geolocation data in photos, comments in docs): it’s CONTENT embedded in the file.
System Metadata Examples: File names, file sizes, Modified, Accessed and Created (MAC) dates, file locations (path), custodian.
Application Metadata Examples: Comments, tracked changes, editing times, last printed dates.
System Metadata values must be collected and produced in delimited text files called “Load Files.”
Application Metadata is embedded in native files, but when files are not produced in native formats, Application Metadata must likewise be extracted and produced in load files.

66
Q

Active data

A

available to users

67
Q

Encoded data

A

Log files and system files are examples of encoded data that reveal info about a user’s behavior

68
Q

Unallocated clusters and slack space

A

hold discarded data; they are forensic artifacts

69
Q

Slack space

A

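the unused space between the end of a file’s data and the end of the last cluster allocated to it (the file size rounded up to a whole number of clusters, minus the file size)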
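
The arithmetic can be sketched in a few lines. The function name is hypothetical, and the 4096-byte default cluster size is an assumption matching the cluster size quoted elsewhere in this deck.

```python
def slack_bytes(file_size: int, cluster_size: int = 4096) -> int:
    """Bytes left over in the last cluster allocated to a file."""
    if file_size < 0 or cluster_size <= 0:
        raise ValueError("file_size must be >= 0 and cluster_size must be > 0")
    clusters_needed = -(-file_size // cluster_size)  # integer ceiling division
    return clusters_needed * cluster_size - file_size


print(slack_bytes(5000))  # needs 2 clusters (8192 bytes) -> 3192 bytes of slack
print(slack_bytes(4096))  # fills its cluster exactly -> 0
```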

70
Q

Computer forensics

A

is the expert acquisition, interpretation and presentation of the data within these three categories (Active, Encoded and Forensic data)

71
Q

Master file table

A

AKA MFT, used by Windows’ NTFS file system to track location of files

72
Q

Resident files

A

files small enough to be stored fully in the MFT

73
Q

Formatting

A

low level = carving out tracks and sectors in the old days; high level = defines structures on a partition

74
Q

Partitioning:

A

into volumes, such as C: or E:

75
Q

Bro-Tech Corp. v. Thermax,

A

an examiner should have no trouble understanding what he or she was expected to examine

76
Q

Examination protocol

A

an order of a court or an agreement between parties that governs the scope and procedures for testing and inspection of a source of electronic evidence

77
Q

Windows registry system

A

central database, organized into hives, that stores the information the OS needs to manage the system

78
Q

Shellbags

A

maintain information about folder configuration, such as when it was opened

79
Q

swap/page file

A

File on disk that’s an extension of RAM

80
Q

Named entities

A

passwords, phone numbers, English text, etc.

81
Q

Volume shadow copies

A

Windows feature, store a copy of basically everything

82
Q

Fragmentation

A

splitting files for storage

83
Q

File table

A

file directory

84
Q

File carving

A

copying/recovering a deleted file by binary signature, remnant directory data, or keyword

85
Q

Binary signature

A

unique signature identifying file type

86
Q

Noise hits

A

false positives when searching by keyword

87
Q

Hexavigesimal:

A

Base 26 encoding
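
A small sketch of base 26 conversion. Mapping the digits 0 through 25 to the letters A through Z is an assumption made for this example (other conventions exist, such as the spreadsheet-column scheme that has no zero digit), and the function names are hypothetical.

```python
import string

ALPHABET = string.ascii_uppercase  # assumed digit set: A=0, B=1, ... Z=25


def to_base26(number: int) -> str:
    """Encode a non-negative integer as base 26 text."""
    if number < 0:
        raise ValueError("number must not be negative")
    if number == 0:
        return ALPHABET[0]
    digits = []
    while number:
        number, remainder = divmod(number, 26)
        digits.append(ALPHABET[remainder])
    return "".join(reversed(digits))


def from_base26(text: str) -> int:
    """Decode base 26 text back to an integer."""
    value = 0
    for letter in text.upper():
        value = value * 26 + ALPHABET.index(letter)
    return value


print(to_base26(25))      # Z
print(to_base26(26))      # BA
print(from_base26("BA"))  # 26
```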

88
Q

Steganography

A

Hidden messages, one system was invented by Francis Bacon

89
Q

File format:

A

establishes the way to encode and order data for storage within a file.

90
Q

.TGA

A

Targa Graphics files

91
Q

.TAG

A

Dataflex data file

92
Q

File type identification

A

done using binary file signatures and file extensions

93
Q

Binary file signature

A

(also called a magic number) will typically occupy the first few bytes of a file’s contents. It will always be hexadecimal values
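
A minimal sketch of signature-based identification: compare the first bytes of a file against a table of known magic numbers. The table below is a tiny illustrative subset built from signatures mentioned in this deck plus the ZIP signature; real tools consult far larger signature lists, and the function name is hypothetical.

```python
# Leading bytes -> file type (illustrative subset only).
SIGNATURES = {
    b"%PDF-": "PDF document",
    b"\xff\xd8\xff": "JPEG image",
    b"FWS": "Shockwave Flash (uncompressed)",
    b"PK\x03\x04": "ZIP container",
}


def identify(header: bytes) -> str:
    """Return the file type whose signature matches the leading bytes."""
    for magic, label in SIGNATURES.items():
        if header.startswith(magic):
            return label
    return "unknown"


print(b"%PDF-".hex())             # 255044462d, the PDF signature in hex
print(identify(b"%PDF-1.7\n..."))  # PDF document
print(identify(b"plain text"))     # unknown
```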

94
Q

Pdf files binary signature

A

Starts with hex corresponding to %PDF- in ASCII

95
Q

MS Tape archive

A

Starts with hex corresponding to TAPE in ASCII

96
Q

Adobe Shockwave Flash

A

Extension .SWF but file signature starts with FWS

97
Q

JPG image

A

Starts with hex FF D8 FF E0, which displays as ÿØÿà in extended ASCII

98
Q

Offset addressing

A

beginning retrieval at a specified number of bytes from the start of the file (offset from the start) and retrieving a specified extent of data from that offset forward
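
In code, offset addressing amounts to a seek followed by a read. This sketch uses an in-memory stream so it runs anywhere; the sample bytes and the helper name `read_extent` are made up for illustration.

```python
import io


def read_extent(stream, offset: int, length: int) -> bytes:
    """Read `length` bytes starting `offset` bytes from the start of the stream."""
    stream.seek(offset)
    return stream.read(length)


sample = io.BytesIO(b"HEADERpayload-bytesTRAILER")
print(read_extent(sample, 6, 13))  # b'payload-bytes'
```

The same function works on a file opened in binary mode, since file objects support `seek` and `read` in the same way.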

99
Q

Chunk structure

A

data is labeled within the file to indicate its beginning and ending, or it may be tagged (“marked up”) for identification. Most common file structure

100
Q

Directory structure

A

constructs a file as a small operating environment. The directory keeps track of what’s in the file, what it’s called and where it begins and ends. Examples: ZIP, MS office files after Office 2007

101
Q

Lossless compression

A

compression in which the algorithm preserves all of the original data, so the file can be restored exactly. Example: ZIP uses an algorithm called DEFLATE (free, efficient, most common)
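
A quick round-trip sketch using Python's standard zlib module, which implements DEFLATE. The sample data is made up; the point is only that decompression returns every original byte.

```python
import zlib

original = b"ESI " * 1000          # highly repetitive sample data
compressed = zlib.compress(original)
restored = zlib.decompress(compressed)

print(len(original), len(compressed))  # the compressed form is far smaller
print(restored == original)            # True: nothing was lost
```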

102
Q

Lossy compression

A

jettisons data. Example: JPEG; sharpness and color depth are lost in JPEG compression, which can leave rough margins called “jaggies.” MPEG and MP3 are also lossy

103
Q

Identification tool exception

A

If file type cannot be determined from metadata or file signature, flag the file as unknown or pursue other methods such as byte frequency analysis (BFA)

104
Q

Only binary files have signatures

A

Yes

105
Q

Run-Length Encoding

A

It works especially well for images containing consecutive, identical data elements, like the ample white space of a fax transmission.
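
A toy sketch of the idea: each run of identical symbols is stored as a (symbol, count) pair. Representing a scan line as a string of "W" (white) and "B" (black) pixels is an assumption for readability; real fax encodings are more elaborate.

```python
from itertools import groupby


def rle_encode(pixels: str) -> list:
    """Collapse each run of identical symbols into a (symbol, count) pair."""
    return [(symbol, len(list(run))) for symbol, run in groupby(pixels)]


def rle_decode(pairs: list) -> str:
    """Expand (symbol, count) pairs back into the original sequence."""
    return "".join(symbol * count for symbol, count in pairs)


line = "W" * 40 + "B" * 3 + "W" * 37  # mostly white space, like a fax page
encoded = rle_encode(line)
print(encoded)                         # [('W', 40), ('B', 3), ('W', 37)]
print(rle_decode(encoded) == line)     # True
```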

106
Q

Media (MIME) Type Detection

A

MIME, which stands for Multipurpose Internet Mail Extensions, is a seminal Internet standard that enables the grafting of text enhancements, foreign language character sets (Unicode) and multimedia content (e.g., photos, video, sounds and machine code) onto plain text e-mails. Used by Linux and Mac OS. All email is in MIME format

107
Q

Internet Assigned Numbers Authority (IANA):

A

oversees global Internet addressing and defines the hierarchy of media type designation. At IANA’s prompting, MIME Types are now called Media Types

108
Q

Media types follow a path-like tree structure under one of the following standard types: application, audio, image, text and video (collectively called discrete media types) and message and multipart (called composite media types)

A

Just study

109
Q

Not IANA

A

File types prefixed with x- are not IANA

110
Q

Vendor specific

A

prefixed vnd. (e.g., application/vnd.ms-excel)

111
Q

Octet stream

A

When a file type cannot be identified, the file is labeled an octet stream: an arbitrary sequence or “stream” of data presumed to be binary data stored as eight-bit bytes or “octets.” It applies to any file the processor fails to recognize
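
The fallback behavior can be sketched with Python's standard mimetypes module. Note the assumption: this helper guesses from the file name only, whereas a processing tool would also inspect the file's signature. The helper name and sample file name are hypothetical.

```python
import mimetypes


def media_type(filename: str) -> str:
    """Guess a media type from the file name, falling back to octet-stream."""
    guessed, _encoding = mimetypes.guess_type(filename)
    return guessed or "application/octet-stream"


# An unrecognized extension falls through to the catch-all type.
print(media_type("evidence.zzunknownext"))  # application/octet-stream
```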

112
Q

ESI is much different than paper documents in crucial ways:

A
  • ESI collections tend to be exponentially more voluminous than paper collections
  • ESI is stored digitally, rendering it unintelligible absent electronic processing
  • ESI is electronically searchable while paper documents require laborious human scrutiny
  • ESI is readily culled, filtered and deduplicated, and inexpensively stored and transmitted
  • ESI carries metainformation that is always of practical use and may be probative evidence
  • ESI and associated metadata change when opened in native applications
113
Q

Native applications are not suited to e-discovery

A

and you shouldn’t use them for review. E-discovery review tools are the only way to go.

114
Q

Two broad approaches used by processing tools to extract content from files.

A

One is to use the Application Programming Interface (API) of the application that created the file. The other is to turn to a published file specification or reverse engineer the file (document filters) to determine where the data sought to be extracted resides and how it’s encoded.

115
Q

Document filters:

A

lay out where content is stored within each file type and how that content is encoded and interpreted. The leading example, Oracle’s Outside In, is used by most e-discovery tools

116
Q

Aspose Pty. Ltd

A

an Australian concern, licenses libraries of commercial APIs, enabling software developers to read and write to, e.g., Word documents, Excel spreadsheets, PowerPoint presentations, PDF files and multiple e-mail container formats. Aspose tools can both read from and write to the various formats, the latter considerably more challenging.

117
Q

Hyland Software’s Document Filters

A

is another developer’s toolkit that facilitates file identification and content extraction for 500+ file formats, as well as support for OCR, redaction and image rendering. Per Hyland’s website, its extraction tools power e-discovery products from Catalyst and Reveal Software.

118
Q

dtSearch

A

commercial product that lies at the heart of several e-discovery and computer forensic tools, serving as both content extractor and indexing engine.

119
Q

open source side, Apache’s Tika

A

is a free toolkit for extracting text and metadata from over a thousand file types, including most encountered in e-discovery. Tika was a subproject of the open source Apache Lucene project, Lucene being an indexing and search tool at the core of several commercial e-discovery tools.

120
Q

Compound files

A

Modern productivity files like Microsoft Office documents are rich, layered containers

121
Q

OLE (Object Linking and Embedding)

A

OLE supports dragging and dropping content between applications and the dynamic updating of embedded content

122
Q

unitization

A

update the database with information about what data came from what file, a relationship called unitization.

123
Q

Family Tracking

A

In the context of e-mail, recording the relationship between a transmitting message and its attachments is called family tracking: the transmitting message is the parent object and the attachments are child objects.

124
Q

important metadata values to preserve and pair

A

One of the most important metadata values to preserve and pair with each object is the object’s custodian or source.

125
Q

non-searchable documents

A

Common examples of non-searchable documents are faxes and scans, as well as TIFF images and Adobe PDF documents lacking a text layer.

126
Q

Exceptions report:

A

A processing tool must track all exceptions and be capable of generating an exceptions report to enable counsel and others with oversight responsibility to act to rectify exceptions by, e.g., securing passwords, repairing or replacing corrupt files and running OCR against the files. Exceptions resolution is key to a defensible e-discovery process.

127
Q

Lexical preprocessing

A

computers apply rules assigned by programmers to normalize, tokenize, and segment natural language

128
Q

Character Normalization

A

Unicode equivalency, diacriticals (accents) and case (capitalization).

129
Q

Unicode Normalization

A

All accented characters are normalized in the same way. The Unicode Consortium promulgates normalization algorithms that produce a consistent (“normalized”) encoding for each identical character
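
A brief sketch of why this matters, using Python's standard unicodedata module. The word "café" can be encoded with a single precomposed é or with a plain e followed by a combining accent; the two look identical but compare as different until normalized. The choice of the NFC form here is an assumption; a tool could equally standardize on another form.

```python
import unicodedata

composed = "caf\u00e9"     # é as one precomposed code point
decomposed = "cafe\u0301"  # e followed by a combining acute accent

print(composed == decomposed)  # False: different code point sequences
print(unicodedata.normalize("NFC", decomposed) == composed)  # True after normalizing
```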

130
Q

Diacritical Normalization

A

This requires normalizing the data to forge a false equivalency between accented characters and their non-accented ASCII counterparts. So, if you search for “resume” or “cafe,” you will pick up instances of “resumé” and “café.” As well, we must normalize ligatures like the German Eszett (ß) seen in the word “straße,” or “street.”
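
One way to sketch this folding with the standard library: decompose each character, drop the combining accent marks, then apply case folding, which also expands ß to "ss". The function name is hypothetical, and folding case in the same step is a simplifying assumption for the example.

```python
import unicodedata


def fold_diacritics(text: str) -> str:
    """Strip accents and fold case so accented and plain spellings match."""
    decomposed = unicodedata.normalize("NFKD", text)
    without_marks = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return without_marks.casefold()  # casefold also maps ß to "ss"


print(fold_diacritics("résumé"))  # resume
print(fold_diacritics("Café"))    # cafe
print(fold_diacritics("straße"))  # strasse
```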

131
Q

Case Normalization:

A

Treat uppercase and lowercase characters as the same

132
Q

Time Zone Normalization:

A

a common processing task is to normalize date and time values according to a single temporal baseline, often Coordinated Universal Time (UTC)— essentially Greenwich Mean Time—or to any other time zone the parties choose
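
A minimal sketch of the conversion with Python's standard datetime module. The fixed UTC-6 offset labeled "CST" and the sample timestamp are assumptions for illustration; production tools would draw on a full time zone database to handle daylight saving time.

```python
from datetime import datetime, timedelta, timezone

# Assumed fixed offset for the example (no daylight saving handling).
CENTRAL_STANDARD = timezone(timedelta(hours=-6), "CST")


def to_utc(stamp: datetime) -> datetime:
    """Convert a time-zone-aware timestamp to UTC."""
    if stamp.tzinfo is None:
        raise ValueError("timestamp must carry a time zone to be normalized")
    return stamp.astimezone(timezone.utc)


sent = datetime(2021, 3, 1, 17, 30, tzinfo=CENTRAL_STANDARD)
print(to_utc(sent).isoformat())  # 2021-03-01T23:30:00+00:00
```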

133
Q

Normalization vs Tokenization(Important)

A

Normalization is the process of reformatting data to a standardized form, such as setting the date and time stamp of files to a uniform time zone or converting all content to the same character encoding. Normalization facilitates search and data organization.
Tokenization is a method of document parsing that identifies words (“tokens”) to be used in a full-text index. Because computers cannot read as humans do but only see sequences of bytes, computers employ programmed tokenization rules to identify character sequences that constitute words and punctuation.
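
A toy tokenizer makes the distinction concrete: lowercasing is the normalization step, and the regular expression is the tokenization rule. The rule used here (runs of letters, digits and underscores are words; everything else is a separator) is an assumption for the sketch, not any particular product's rule set.

```python
import re

TOKEN_PATTERN = re.compile(r"\w+")  # assumed rule: word characters form tokens


def tokenize(text: str) -> list:
    """Normalize case, then split the text into indexable tokens."""
    return TOKEN_PATTERN.findall(text.lower())


print(tokenize("E-mail the Q3 report, ASAP!"))
# ['e', 'mail', 'the', 'q3', 'report', 'asap']
```

Note how the hyphen in "E-mail" is treated as a separator, so a search for the exact string "e-mail" depends entirely on how the tool's rules handle punctuation.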

134
Q

Relativity

A

uses dtSearch as an indexing tool, and dtSearch has reserved the character “%”. Relativity treats all the following characters as spaces:
!”#$&’()*+,./:;<=>?@[\]^`{|}~. The following characters CANNOT be made searchable in dtSearch and Relativity: ( ) * ? % @ ~ & : =

135
Q

The Concordance Index:

A

The term “concordance” describes an alphabetical listing, particularly a mapping, of the important words in a text.
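
A small sketch of such a mapping: an alphabetical list of words, each pointing to the documents that contain it. The document IDs and text are invented for the example, and this version indexes every word rather than only the "important" ones.

```python
import re
from collections import defaultdict


def build_index(documents: dict) -> dict:
    """Map each word to the sorted list of document IDs containing it."""
    index = defaultdict(set)
    for doc_id, text in documents.items():
        for word in re.findall(r"\w+", text.lower()):
            index[word].add(doc_id)
    # Sort alphabetically so the result reads like a concordance.
    return {word: sorted(ids) for word, ids in sorted(index.items())}


docs = {
    "DOC-001": "Ship the contract today",
    "DOC-002": "The contract is void",
}
index = build_index(docs)
print(index["contract"])  # ['DOC-001', 'DOC-002']
print(list(index))        # ['contract', 'is', 'ship', 'the', 'today', 'void']
```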

136
Q

Culling and selecting dataset

A

We can also cull the dataset by immaterial item suppression, de-NISTing and deduplication, all discussed infra. The crudest but most common culling method is keyword and query filtering; that is, lexical search.

137
Q

Immaterial Item Suppression

A

Immaterial items are those extracted for forensic completeness but having little or no intrinsic value as discoverable evidence. Common examples of immaterial items include the folder structure in which files are stored and the various container files (like ZIP, RAR files and other containers, e.g., mailbox files like Outlook PST and MBOX, and forensic disk image wrapper files like .E0x or .AFF) that tend to have no relevance apart from their contents.

138
Q

De-NISTing

A

De-NISTing is a technique used in e-discovery and computer forensics to reduce the number of files requiring review by excluding standard components of the computer’s operating system and off-the-shelf software applications like Word, Excel and other parts of Microsoft Office. Eliminating this noise is called “de-NISTing” because those noise files are identified by matching their cryptographic hash values (i.e., digital fingerprints, explanation to follow) to a huge list of software hash values maintained and published by the National Software Reference Library, a branch of the National Institute of Standards and Technology (NIST). The better focused the e-discovery collection effort (i.e., the more targeted the collection), the smaller the volume of data culled via de-NISTing.
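
The matching step can be sketched as a set lookup. Everything here is a stand-in: the "known" hash set is built from a made-up byte string rather than loaded from the NSRL list, the file names are invented, and SHA-256 is an assumed algorithm choice.

```python
import hashlib


def file_hash(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()


def de_nist(files: dict, known_hashes: set) -> dict:
    """Keep only files whose hash is NOT on the known-software list."""
    return {name: data for name, data in files.items()
            if file_hash(data) not in known_hashes}


system_file = b"stand-in bytes for a stock operating system file"
known_hashes = {file_hash(system_file)}  # placeholder for the published list

collected = {
    "C:/Windows/system.dll": system_file,
    "C:/Users/ann/memo.docx": b"custodian work product",
}
print(list(de_nist(collected, known_hashes)))  # ['C:/Users/ann/memo.docx']
```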

139
Q

Near-Deduplication

A

the first file is sometimes called the “pivot file,” and subsequent files with matching hashes are suppressed as duplicates; the instances of each duplicate and certain metadata are typically noted in a deduplication or “occurrence” log
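
A sketch of the pivot-and-log bookkeeping described here: the first file seen with a given hash is kept as the pivot, and later matches are recorded in an occurrence log instead of being kept. The paths, contents and function name are hypothetical, and SHA-256 is an assumed algorithm choice.

```python
import hashlib


def deduplicate(items: list) -> tuple:
    """items: (path, data) pairs. Returns (pivots, occurrence_log) keyed by hash."""
    pivots = {}          # hash -> path of the first ("pivot") file
    occurrence_log = {}  # hash -> paths of suppressed duplicates
    for path, data in items:
        digest = hashlib.sha256(data).hexdigest()
        if digest not in pivots:
            pivots[digest] = path
            occurrence_log[digest] = []
        else:
            occurrence_log[digest].append(path)
    return pivots, occurrence_log


items = [
    ("ann/budget.xlsx", b"Q3 budget"),
    ("bob/budget-copy.xlsx", b"Q3 budget"),  # same bytes, different name
    ("cy/notes.txt", b"meeting notes"),
]
pivots, log = deduplicate(items)
print(sorted(pivots.values()))            # ['ann/budget.xlsx', 'cy/notes.txt']
print([dupes for dupes in log.values() if dupes])  # [['bob/budget-copy.xlsx']]
```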

140
Q

Deduplication by hashing

A

requires the same source data and the same, consistent application of algorithms
When parties cannot deduplicate e-mail, the reasons will likely be one or more of the following:
1. They are working from different forms of the ESI
2. They are failing to consistently exclude inherently non-identical data (like message headers and IDs) from the hash calculation
3. They are not properly normalizing the message data (such as by ordering all addresses alphabetically without aliases)
4. They are using different hash algorithms
5. They are not preserving the hash values throughout the process; or
6. They are changing the data.
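
The normalization points above can be sketched as follows: hash only selected fields, strip aliases from addresses, sort recipients, and ignore whitespace differences. The choice of fields, the separator characters and the function name are all assumptions; two parties would need to agree on the same recipe and the same algorithm for their hashes to match.

```python
import hashlib
from email.utils import getaddresses


def message_hash(sender: str, recipients: list, subject: str, body: str) -> str:
    """Hash normalized message fields so equivalent copies produce the same value."""
    # Drop display names (aliases), lowercase, and sort recipient addresses.
    sender_addr = getaddresses([sender])[0][1].lower()
    recipient_addrs = sorted(addr.lower() for _name, addr in getaddresses(recipients))
    # Collapse whitespace so line-ending differences do not change the hash.
    normalized_body = " ".join(body.split())
    canonical = "\n".join(
        [sender_addr, ";".join(recipient_addrs), subject.strip(), normalized_body]
    )
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


copy_a = message_hash("Ann Lee <ANN@example.com>",
                      ["bob@example.com", "Cy <cy@example.com>"], "Hi", "Lunch?\r\n")
copy_b = message_hash("ann@example.com",
                      ["cy@example.com", "Bob <bob@example.com>"], "Hi", "Lunch?")
print(copy_a == copy_b)  # True: the two copies deduplicate
```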

141
Q

Entropy Testing

A

Entropy testing is a statistical method by which to identify encrypted files and flag them for special handling.
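
One common way to do this is to compute Shannon entropy over byte values, as in the sketch below. The 7.5 bits-per-byte threshold is an arbitrary illustrative value, not a standard. Compressed data also scores high, so a flag from a test like this is a prompt for special handling rather than proof of encryption.

```python
import math
from collections import Counter


def shannon_entropy(data: bytes) -> float:
    """Return the Shannon entropy of data in bits per byte (0.0 to 8.0)."""
    if not data:
        return 0.0
    total = len(data)
    return -sum((n / total) * math.log2(n / total) for n in Counter(data).values())


def looks_encrypted(data: bytes, threshold: float = 7.5) -> bool:
    """Flag data whose byte distribution is close to uniformly random."""
    return shannon_entropy(data) >= threshold


ordinary_text = b"plain text " * 100
uniform_bytes = bytes(range(256)) * 16  # every byte value equally frequent
print(round(shannon_entropy(ordinary_text), 2), looks_encrypted(ordinary_text))
print(round(shannon_entropy(uniform_bytes), 2), looks_encrypted(uniform_bytes))
```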

142
Q

Bad Extension Flagging

A

Most processing tools warn of a mismatch between a file’s binary signature and its extension, potentially useful to resolve exceptions and detect data hiding.
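
A rough Python sketch of the check. The signature table is a small illustrative subset of well-known file signatures, and the function name is hypothetical; real tools rely on far larger signature databases.

```python
# Illustrative subset of well-known file signatures ("magic numbers").
SIGNATURES = {
    b"%PDF": {".pdf"},
    b"PK\x03\x04": {".zip", ".docx", ".xlsx", ".pptx"},
    b"\xff\xd8\xff": {".jpg", ".jpeg"},
    b"\x89PNG\r\n\x1a\n": {".png"},
}


def extension_mismatch(filename: str, header: bytes) -> bool:
    """Return True when the header matches a known signature but the extension does not."""
    dot = filename.rfind(".")
    ext = filename[dot:].lower() if dot != -1 else ""
    for magic, extensions in SIGNATURES.items():
        if header.startswith(magic):
            return ext not in extensions
    return False  # unknown signature: nothing to compare against


print(extension_mismatch("report.txt", b"%PDF-1.7"))  # PDF content, .txt name
print(extension_mismatch("report.pdf", b"%PDF-1.7"))  # consistent
```

A PDF renamed to `.txt` is flagged, while a file with an unrecognized header is simply not assessed.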

143
Q

Color Detection

A

When color conveys information, it’s useful to detect such usage and direct color-enhanced items to production formats other than grayscale TIFF imaging.

144
Q

Hidden Content Flagging

A

It’s common for evidence, especially Microsoft Office content, to incorporate relevant content (like collaborative comments in Word documents and PowerPoint speaker notes) that won’t appear in the production set. Flagging such items for special handling is a useful way to avoid missing that discoverable (and potentially privileged) content.

145
Q

N-Gram and Shingle Generation

A

Increasingly, advanced analytics like predictive coding aid the review process and depend upon the ability to map document content in ways that support algorithmic analysis. N-gram generation and text shingling are text sampling techniques that support latent-semantic analytics.
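
As a minimal sketch, word-level shingling can be written in Python as follows. The Jaccard comparison of shingle sets is added only to show one way the shingles might be used; it is an illustrative assumption, not a technique named in the card.

```python
def word_ngrams(text: str, n: int):
    """Return the overlapping n-word sequences (shingles) found in text."""
    words = text.lower().split()
    return [tuple(words[i:i + n]) for i in range(len(words) - n + 1)]


def jaccard(a: str, b: str, n: int = 3) -> float:
    """Jaccard similarity of two texts' shingle sets (1.0 means identical sets)."""
    set_a, set_b = set(word_ngrams(a, n)), set(word_ngrams(b, n))
    if not set_a and not set_b:
        return 1.0
    return len(set_a & set_b) / len(set_a | set_b)


print(word_ngrams("the quick brown fox", 2))
print(jaccard("please preserve all e-mail today",
              "please preserve all e-mail now"))
```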

146
Q

Optical Character Recognition (OCR)

A

OCR is the primary means by which text stored as imagery, thus lacking a searchable text layer (e.g., TIFF images, JPGs and PDFs) can be made text searchable.

147
Q

Virus Scanning

A

Files collected in e-discovery may be plagued by malware, so processing may include methods to quarantine afflicted content via virus scanning.

148
Q

Did you read 210 to 220?

A

Yes

149
Q

MFT entries beginning

A

FILE0

150
Q

Six of E-Discovery:

A

Key Custodians’ E-Mail (Sources: server, local, archived and cloud),
Key Custodians’ Documents and Data: Network Shares,
Mobile Devices: Phones, Tablets, IoT,
Key Custodians’ Documents and Data: Local Storage, Social Networking Content,
Databases (server, local and cloud),
Seven - Cloud

151
Q

Key Custodians’ E-Mail

A

These servers may be physical hardware managed by IT staff or virtual machines leased from a cloud provider, either running mail server software, most likely applications called Microsoft Exchange or Lotus Domino. A third potential source is a Software as a Service (SaaS) offering from a cloud provider
On desktops and laptops, e-mail is found locally (on the user’s hard drive) in container files with the file extensions .pst and .ost for Microsoft Outlook users or .nsf for Lotus Notes users. Finally, each user may be expected to have a substantial volume of archived e-mail spread across several on- and offline sources, including backup tapes, journaling servers and local archives on workstations and in network storage areas called shares (discussed below).
Types of messages (did they retain both Sent Items and Inbox contents? Have they retained messages as they were foldered by users?);
Temporal range of messages (what are the earliest dates of e-mail messages, and are there significant gaps?); and
Volume (numbers of messages and attachments versus total gigabyte volume—not the same thing).

152
Q

Key Custodians’ Documents and Data: Network Shares

A

in the form of productivity documents MS Word, Excel, PPT etc
network share or file share.
“shared” among multiple users depending upon the access privileges granted to them by the network administrator. (group share)
deal rooms or work rooms where users “meet” and collaborate in cyberspace

153
Q

Mobile Devices: Phones, Tablets, IoT

A

The bottom line is: if you’re not including the data on phones and tablets, you’re surely missing relevant, unique and often highly probative information.

154
Q

Key Custodians’ Documents and Data: Local Storage

A

Though it’s expedient to assume that no unique, potentially-responsive information resides in local storage, it’s rarely a sensible or defensible assumption absent documented efforts to establish that the no-local-storage policy and the local storage reality are one-and-the-same.

155
Q

Databases (server, local and cloud)

A

Databases hold so-called structured data, a largely meaningless distinction when one considers that most of the data stored within databases is unstructured, and much of what we deem unstructured data, like e-mail, is housed in databases.
If standard reports aren’t sufficient to meet the needs in discovery, inquire into the database’s schema (i.e., its structure) and determine what query language the database supports to explore how data can be extracted.

156
Q

Challenges in Electronic Evidence Preservation

A

Preserving electronically stored information (ESI) poses unique challenges because:
* Touching ESI changes it
Windows NTFS file system offers three “slots” for storing file dates (i.e., Modified, Accessed and Created), the CD-R’s Joliet file structure supplies just one.
Similar incongruities may impact the ability to store long filenames as well as the precision of time values
* Digital evidence is ill-suited to printing
* ESI must be interpreted to be used
All digital data are just streaming information denoted as ones and zeroes. For these streams of data to convey anything intelligible to humans, the data must be interpreted by a computer using specialized programming called “interfaces” and “applications.”
* Storage media are fragile and dynamic, changing all the time
* Digital storage media are disposable and recyclable
A later version of a document may overwrite—and by so doing, destroy—an earlier draft, and storage space released by the deletion of one file may well be re-used for storage of another. This is in sharp contrast to paper preservation, where you can save a revised printout of a document without affecting—and certainly not obliterating– a prior printed version.

157
Q

DUTY TO PRESERVE:

A

The duty to preserve evidence may arise before, and certainly arises without, a preservation letter. A party’s obligation to preserve evidence is generally held to arise when the party knows or has reason to know that evidence may be relevant to future litigation. The preservation letter is but one of several events sufficient to trigger the duty to preserve evidence, but the preservation letter is an explicit, decisive trigger.

158
Q

BALANCE, REASONABLENESS AND PROPORTIONALITY

A

If your goal is to keep the other side from destroying relevant evidence, any judge will support you in that effort if your demands aren’t cryptic, overbroad or unduly burdensome. In a word: proportionate.

The preservation letter neither creates the duty to preserve nor constrains it. Parties must still think for themselves. If the evidence was relevant and discoverable, its intentional destruction is spoliation, even if you didn’t cite it in your preservation demand.

159
Q

PRESERVATION ESSENTIALS

A

If your preservation letter boils down to “save everything about anything by everyone, everywhere at any time,” it’s time to re-draft it because not only will no trial court enforce it, many will see it as discovery abuse.

160
Q

THE NATURE OF THE CASE

A

More elucidative

161
Q

WHEN TO SEND A PRESERVATION LETTER

A

The conventional wisdom is that preservation letters should go out as soon as you can identify potential defendants

162
Q

WHO GETS THE LETTER?

A

Certainly, if an individual will be the target of the action, he or she should receive the preservation letter. Consider who is most likely to unwittingly destroy evidence and be certain that person receives a preservation letter. Sending a preservation letter to a person likely to destroy evidence intentionally is a different story. The letter may operate as the triggering event to spoliation, so you may need to balance the desire to give notice against the potential for irretrievable destruction.

163
Q

HOW MANY PRESERVATION LETTERS?

A

It’s common to dispatch a single, comprehensive request, but might it instead be wiser to present your demands in a series of focused requests, broken out by, e.g., type of digital medium, issues, business units, or the roles of key players?

164
Q

SPECIFYING FORM OF PRESERVATION

A

Your preservation letter should not demand preservation in forms other than those used in the ordinary course of business. However, when your specification operates to ease the cost or burden to the producing party or otherwise help the producing party fulfill its preservation obligation, an alternate format might be suggested.

165
Q

Backup tapes:

A

The tapes containing the oldest backed-up information are typically recycled. This practice is “tape rotation,” and the interval between use and reuse of a tape or set of tapes is the “rotation cycle” or “rotation interval.”

166
Q

When does deletion destroy data?

A

Deletion rarely erases data. In fact, there are three and only three ways that information is destroyed on a personal computer:
1. Completely overwriting the deleted data on magnetic media (e.g., floppy disks, tapes or conventional hard drives) with new information;
2. Strongly encrypting the data and then “losing” the encryption key; or
3. Physically damaging the media to such an extent that it cannot be read.

167
Q

Preservation letter may specify

A
  1. Act to Prevent Spoliation
  2. System Sequestration or Forensically Sound Imaging [When Implicated]
    2a. The products of forensically sound duplication are called, inter alia, “bitstream images” of the evidence media. A forensically sound preservation method guards against changes to metadata evidence and preserves all parts of the electronic evidence, including deleted evidence within “unallocated clusters” and “slack space.”
    2b. Be advised that a conventional copy or backup of a hard drive does not produce a forensically sound image because it captures only active data files and fails to preserve forensically significant data existing in, e.g., unallocated clusters and slack space.
  3. Further Preservation by Imaging
168
Q

Recursion

A

mechanism by which a processing tool explores, identifies, unpacks and extracts all embedded content from a file

169
Q

three elements often absent from the adversarial process

A

communication, cooperation and trust.

170
Q

Backup Systems Case

A

Going back as far as 1986 and Col. Oliver North’s deletion of e-mail subject to subpoena in the Reagan-era Iran-Contra affair, it’s long been backup systems that ride to truth’s rescue with “smoking gun” evidence.

171
Q

Disaster recovery

A

Backup tapes are made for disaster recovery, i.e., picking up the pieces of a damaged or corrupted data storage system. Some call backups “snapshots” of data, and like a photo, backup tapes capture only what’s in focus.

172
Q

full backups

A

typically focus on all user created data

173
Q

Incremental backups

A

grab just what’s been created or changed since the last full or incremental backup.

174
Q

tape restoration

A

The process of putting data back together again from backup tapes.

175
Q

tape rotation

A

older tapes are supposed to be recycled by overwriting them in a practice called tape rotation.

176
Q

legacy tapes

A

business records—sometimes the last surviving copy

177
Q

save data from loss or corruption via one of three broad measures

A

duplication, replication and backup.

178
Q

Duplication

A

is the most familiar–protecting the contents of a file by making a copy of the file to another location.
1. If the copy is made to another location on the same medium (e.g., another folder on the hard drive), the risk of corruption or overwriting is reduced.
2. If the copy is made to another medium (another hard drive), the risk of loss due to media failure is reduced.
3. If the copy is made to a distant physical location, the risk of loss due to physical catastrophe is reduced.

179
Q

Replication

A

is duplication without discretion. That is, the contents of one storage medium are periodically or continuously mirrored to another storage medium. (Example RAID1)

180
Q

backup

A
  1. involves (reversible) alteration of the data and logging and cataloging of content.
  2. Typically, backup entails the use of software or hardware that compresses and encrypts data.
  3. Further, backup systems are designed to support iteration, e.g., they manage the scheduling and scope of backup, track the content and timing of backup “sets” and record the allocation of backup volumes across multiple devices or media.
181
Q

Major Elements of Backup Systems

A
  1. Source Data (Logical or Physical)
  2. Backup Set (Physical or Logical, Full or Changed-File)
  3. Backup Catalog vs. Tape Log
182
Q

Source Data (Logical or Physical)

A
  1. Drive imaging, a specialized form of backup employed by IT specialists and computer forensic examiners, may draw the logical hierarchy of a drive, collecting a “bitstream” of the drive’s contents reflecting the contents of the medium at the physical level.
  2. The bitstream of the medium may be stored in a single large file, but more often it’s broken into manageable, like-sized “chunks” of data to facilitate more flexible storage.
183
Q

Backup Set (Physical or Logical, Full or Changed-File)

A
  1. A backup set may refer to a physical collection of media housing backed up data,
  2. i.e., the collective group of magnetic tape cartridges required to hold the data, or the “set” may reference the logical grouping of files (and associated catalog) which collectively comprise the backed up data.
  3. Types: Full and Changed file backups.
  4. Changed-file backups fall into three types: incremental backups (faster to create, less redundancy), differential backups (easier to restore, greater redundancy) and delta block-level backups.
  5. The first two identify changed files based on either the status of a file’s archive bit or a file’s created and modified date values.
  6. The essential difference is that every differential backup duplicates files added or changed since the last full backup, where incremental backups duplicate files added or changed since the last full or incremental backup.
  7. The delta block-level method examines the contents of a file and stores only the differences between the version of the file contained in the full backup and the modified version.
  8. This approach is trickier, but it permits the creation of more compact backup sets and accelerates backup and restoration.
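
The difference between differential and incremental selection comes down to which cutoff time is used. The Python sketch below assumes files are compared by modified time only (one of the two signals mentioned above); the function name and integer timestamps are hypothetical.

```python
def files_to_back_up(files, last_full, last_backup, mode):
    """Select changed files by comparing modified times to a cutoff.

    files: mapping of file name -> last-modified time
    last_full: time of the last full backup
    last_backup: time of the most recent backup of any kind
    mode: "differential" or "incremental"
    """
    if mode == "differential":
        cutoff = last_full      # everything changed since the last full backup
    elif mode == "incremental":
        cutoff = last_backup    # only what changed since the previous backup
    else:
        raise ValueError("mode must be 'differential' or 'incremental'")
    return sorted(name for name, modified in files.items() if modified > cutoff)


# Full backup at time 0, an intervening backup at time 10, now backing up again.
files = {"old.ppt": -3, "a.doc": 5, "b.xls": 12}
differential = files_to_back_up(files, last_full=0, last_backup=10, mode="differential")
incremental = files_to_back_up(files, last_full=0, last_backup=10, mode="incremental")
print(differential)
print(incremental)
```

The differential set re-copies `a.doc` even though an earlier backup already captured it, which is the extra redundancy that makes differentials easier to restore: the full backup plus the latest differential suffices, whereas incrementals must be replayed in sequence.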
184
Q

Backup Catalog vs. Tape Log:

A
  1. The backup catalog tracks, inter alia, the source and metadata of each file or component of the backup set as well as the location of the element within the set.
  2. Without a catalog setting out the logical organization of the data as stored, it would be impossible to distinguish between files from different sources having the same names or to extract selected files without restoration of all of the backed up data.
  3. tape log, which is typically a simple listing of backup events and dates, machines and tape identifier
185
Q

Tape Backup

A
  1. modern backup tapes use a scaled-up version of that back-and-forth or linear serpentine recording scheme
  2. SAIT-2 tape systems employ a helical recording system that writes data in parallel tracks running diagonally across the tape, much like a household VCR.
186
Q

factors that impact restoration:

A
  • Tape format;
  • Device interface, i.e., SCSI or fiber channel;
  • Compression;
  • Device firmware;
  • The number of devices sharing the bus;
  • The operating system driver for the tape unit;
  • Data block size (large blocks fast, small blocks slow);
  • File size (with millions of small files, each must be cataloged);
  • Processor power and adapter card bus speed;
  • Tape condition (retries eat up time);
  • Data structure (e.g., big database vs. brick level mailbox accounts);
  • Backup methodology (striped data? multi server?).
187
Q

Real World Transfer Time

A

Real World Transfer Time (in hours) = Native Cartridge Capacity (in GB) / (1.8 * Drive Native Transfer Speed)
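
A worked example in Python may help. The card does not state the units of the drive speed; this sketch assumes megabytes per second, under which the 1.8 divisor amounts to about half the rated speed (rated speed alone would give a divisor of 3.6, since 1 GB is roughly 1,000 MB and an hour is 3,600 seconds). The 800 GB and 120 MB/s figures are illustrative inputs only.

```python
def real_world_transfer_hours(native_capacity_gb: float, native_speed_mb_per_s: float) -> float:
    """Apply the rule of thumb: hours = capacity (GB) / (1.8 * speed).

    Assumes the drive's native transfer speed is expressed in MB/s.
    """
    return native_capacity_gb / (1.8 * native_speed_mb_per_s)


def rated_transfer_hours(native_capacity_gb: float, native_speed_mb_per_s: float) -> float:
    """Hours to move the cartridge's contents at the full rated speed."""
    return (native_capacity_gb * 1000) / (native_speed_mb_per_s * 3600)


# Illustrative inputs: an 800 GB cartridge on a drive rated at 120 MB/s.
real_world = real_world_transfer_hours(800, 120)
rated = rated_transfer_hours(800, 120)
print(round(real_world, 2), "hours by the rule of thumb")
print(round(rated, 2), "hours at rated speed")
```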

188
Q

Common Tape Formats

A

LTO

189
Q

Disc to Disc

A

Hard disk arrays can now hold months of disaster recovery data at a cost that competes favorably with tape. Thus, tape is ceasing to be a disaster recovery medium and is instead being used solely for long-term data storage; that is, as a place to migrate disk backups for purposes other than disaster recovery, i.e., archival.

190
Q

vertical deduplication

A

Deduplicating within a single custodian’s mailboxes and documents is called vertical deduplication

191
Q

horizontal deduplication

A

Deduplication of messages and documents across multiple custodians is called horizontal deduplication.

192
Q

in-line deduplication

A

If an identical file has already been stored, the duplicate is not added to the backup media but, instead, a pointer or stub to the duplicate is created

193
Q

post-process deduplication

A

all files are first stored on the backup medium, then analyzed and selectively culled to eliminate duplicates.

194
Q

Non-native restoration

A

A key enabler of low-cost access to tapes and other backup media has been the development of software tools and computing environments that support non-native restoration. Non-native restoration dispenses with the need to locate copies of particular backup software or to recreate the native computing environment from which the backup was obtained

195
Q

Sampling

A

Sampling backup tapes entails selecting parts of the tape collection deemed most likely to yield responsive information and restoring and searching only those selections before deciding whether to restore more tapes.

196
Q

TEN PRACTICE TIPS FOR BACKUPS IN CIVIL DISCOVERY (Study Page No: 269)

A
  1. Backup ≠ Inaccessible.
  2. Determine if your client: Routinely restores backup tapes
  3. Don’t blindly pull tapes for preservation.
  4. Be prepared to put forward a sensible sampling protocol in lieu of wholesale restoration.
  5. Test and sample backups to determine if they hold responsive, material and unique ESI.
  6. Be prepared to show that the relevant data on tapes is available from more accessible sources.
  7. Know the limits of backup search capabilities.
  8. Appearances matter!
  9. If using a cloud-based backup system, consider bringing your e-discovery tools to the data in the Cloud instead of spending days getting the data out.
  10. Backup tape is for disaster recovery.
197
Q

Table

A

Collection of data, with rows and columns

198
Q

Field

A

each separate information item you entered (e.g., name, address, court) is called a “field.”

199
Q

Record

A

The group of items you assembled for each client (probably organized in columns and arranged in a row to the right of each name) is collectively called a “record.”

200
Q

Key field

A

Because the client’s name is the field that governs the contents of each record, it would be termed the “key field.”

201
Q

Flat File Database

A

distinguished by the characteristic that all the fields and records (the cards) comprise a single file (i.e., each a deck of cards) with no relationships or links between the various records and fields except the table structure (the order of the deck and the order of fields on the cards).

202
Q

Relational Database

A

Think of it as adding rudimentary intelligence to a database, allowing it to “recognize” that records sharing common fields likely relate to common information.

203
Q

DBMS

A

the software used to create, maintain and interrogate those tables is called the Database Management System

204
Q

Primary Key

A

it cannot repeat within the table for which it serves as primary key, and a properly-designed database will prevent a user from creating duplicate primary keys.

205
Q

Foreign Key

A

This is done by incorporating the primary key of the referenced table as a “foreign key” in the referencing table. The table referenced is the “parent table,” and the referencing table is the “child table” in this joining of the two relations.
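
The parent/child relationship can be tried out with Python's built-in `sqlite3` module. The `clients` and `matters` tables and their columns are hypothetical examples invented for this sketch. Note that SQLite enforces foreign keys only after `PRAGMA foreign_keys = ON`.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled

# Parent table: client_id is the primary key and cannot repeat.
conn.execute("CREATE TABLE clients (client_id INTEGER PRIMARY KEY, name TEXT NOT NULL)")
# Child table: client_id here is a foreign key pointing at the parent table.
conn.execute(
    "CREATE TABLE matters ("
    " matter_id INTEGER PRIMARY KEY,"
    " court TEXT,"
    " client_id INTEGER NOT NULL REFERENCES clients(client_id))"
)
conn.execute("INSERT INTO clients (client_id, name) VALUES (1, 'Acme Corp')")
conn.execute("INSERT INTO matters (matter_id, court, client_id) VALUES (100, 'S.D. Tex.', 1)")


def try_insert(sql: str) -> bool:
    """Run an INSERT and report whether the database accepted it."""
    try:
        conn.execute(sql)
        return True
    except sqlite3.IntegrityError:
        return False


# A repeated primary key is rejected.
duplicate_key_ok = try_insert("INSERT INTO clients (client_id, name) VALUES (1, 'Other Co')")
# A child row pointing at a nonexistent parent is rejected.
orphan_child_ok = try_insert(
    "INSERT INTO matters (matter_id, court, client_id) VALUES (101, 'N.D. Tex.', 99)"
)
print(duplicate_key_ok, orphan_child_ok)
```

Both inserts fail: the first because a primary key cannot repeat, the second because the child row would reference a parent that does not exist.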

206
Q

Constraints

A

Field size
Data type
Unique fields
Group or member lists
Validation rules
Required data

207
Q

Structured Query Language

A

SQL’s sole purpose is the creation, management and interrogation of databases.

208
Q

logical schema

A

detailing how the database is designed in terms of its table structures, attributes, fields, relationships, joins and views.

209
Q

physical schema

A

setting out the hardware and software implementation of the database on machines, storage devices and networks.

210
Q

Data Dictionary

A

houses the schema

211
Q

ERD

A

Entity-Relationship Modeling (ERM) is a system and notation used to lay out the conceptual and logical schema of a relational database. The resulting diagrams (akin to flow charts) are called Entity-Relationship Diagrams or ERDs

212
Q

Field Mapping

A

The deconstruction of fielded data is accomplished by a process called Field Mapping.

213
Q

data map

A

A “data map” might be better termed an “Information Inventory.”

214
Q

Tuples

A

The contents of each row in a table are called a “tuple,” defined as an ordered list of elements.

215
Q

Attributes

A

Columns

216
Q

relation

A

is defined as a set of tuples that have the same attributes (table)

217
Q

system catalog

A

In an SQL database, the compendium of all that metadata is called the system catalog.

218
Q

Laws: per Rule 30(b)(6) in Federal civil practice or Rule 199(b)(1) of the Texas Rules of Civil Procedure. (Page No: 290)
Study?

A

Yes

219
Q

referential integrity

A

By pointing each child table to that definitive parent via the use of foreign keys, you promote so-called “referential integrity” of the database.

220
Q

fixed length field record

A

A fixed length field record may begin with information describing all of the fields in the record, such as each field’s name (e.g., COURT), followed by its data type (e.g., alphanumeric), length (7 characters) and format.

221
Q

Variable length field

A

employ pointer fields that seamlessly redirect data retrieval to a designated point in the memo file where the variable length field data begins (or continues).

222
Q

A data map might encompass:

A
  • Custodian and/or source of information;
  • Location;
  • Physical device or medium;
  • Currency of contents;
  • Volume (e.g., in bytes);
  • Numerosity (e.g., how many messages and attachments?)
  • Time span (including intervals and significant gaps)
  • Purpose (How is the ESI resource tasked?);
  • Usage (Who uses the resource and when?);
  • Form; and
  • Fragility (What are the risks it may go away?).
223
Q

“TAR” or “Predictive Coding”:

A

Enterprise archival and advanced analytics, termed “TAR” or “Predictive Coding,” promise to make it easier, cheaper and faster to search and collect responsive e-mail, but they’re costly and complex to implement.

224
Q

Dominant spaces in Email

A

Outlook, IBM Lotus, third is Novell GroupWise

225
Q

POP3 (Post Office Protocol, version 3)

A

Using POP3, you connect to a mail server, download copies of all messages and, unless you have configured your e-mail client to leave copies on the server, the e-mail is deleted on the server and now resides on the hard drive of the computer you used to pick up mail.
In short, POP is locally-stored e-mail that supports some server storage; but, again, this once dominant protocol is little used anymore.

226
Q

IMAP (Internet Message Access Protocol)

A

IMAP e-mail clients afford users the ability to synchronize the server files with a local copy of the e-mail and folders. When an IMAP user reconnects to the server, local e-mail stores are updated (synchronized) and messages drafted offline are transmitted. So, to summarize, IMAP is server-stored e-mail, with support for synchronized local storage.

227
Q

Difference between POP3 and IMAP

A

Under POP3, the local collection is deemed authoritative whereas in IMAP the server collection is authoritative

228
Q

MAPI (Messaging Application Programming Interface)

A

Like IMAP, MAPI e-mail is typically stored on the server and not necessarily on the client machine.

229
Q

HTTP (Hyper Text Transfer Protocol) mail

A

or web-based/browser-based e-mail, dispenses with the local e-mail client and handles all activities on the server, with users managing their e-mail using their Internet browser to view an interactive web page.

230
Q

Which mail protocol do companies choose?

A

Companies choose server-based e-mail systems (e.g., IMAP and MAPI) for two principal reasons. First, such systems make it easier to access e-mail from various locations and machines. Second, it’s easier to back up e-mail from a central location. Because IMAP and MAPI systems store e-mail on the server, the backup system used to protect server data can yield a mother lode of server e-mail.

231
Q

Outgoing Mail: SMTP and MTA

A

SMTP for Simple Mail Transfer Protocol, takes care of outgoing e-mail
A server that uses SMTP to route e-mail over a network to its destination is called an MTA for Message Transfer Agent. Examples of MTAs you might hear mentioned by IT professionals include Sendmail, Exim, Qmail and Postfix

232
Q

everything in an e-mail is plain text

A

Yes
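
A short Python sketch using the standard library's `email` package shows the idea: even a binary attachment travels as text once it is encoded. The addresses and file name are made-up examples, and base64 is simply the encoding the library applies to binary attachments by default.

```python
from email.message import EmailMessage

msg = EmailMessage()
msg["From"] = "sender@example.com"
msg["To"] = "recipient@example.com"
msg["Subject"] = "Binary attachment demo"
msg.set_content("See attached.")

# Four arbitrary non-text bytes standing in for a binary file.
msg.add_attachment(
    bytes([0, 255, 128, 7]),
    maintype="application",
    subtype="octet-stream",
    filename="blob.bin",
)

raw = msg.as_string()          # the message as it would be transmitted
is_all_ascii = raw.isascii()   # True when every character is plain ASCII text
print(is_all_ascii)
print(raw)
```

Printing `raw` shows the headers, the body and the attachment as one block of text, with the binary bytes represented as a short base64 string between MIME boundaries.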

233
Q

near deduplication:

A

By hashing particular segments of messages and selectively comparing the hash values, it’s possible to gauge the relative similarity of e-mails and perhaps eliminate the cost to review messages that are inconsequentially different.

234
Q

Other places to look for emails apart from server:

A

Temporary Internet Files folder and the Short Message Service (SMS) exchanges
Microsoft Outlook archive file (archive.pst) and offline synchronization file (Outlook.ost)

235
Q

Easily Accessible:

A
  1. E-Mail Server: E-mail residing in active files on enterprise servers: MS Exchange (e.g., .edb, .stm, .log files), Office 365, Lotus Notes (.nsf files).
  2. File Server: E-mail saved as individual messages or in container files on a user’s network file storage area (“network share”)
  3. Desktops and Laptops: E-mail stored in active files on local or external hard drives of user workstations (e.g., .pst, .ost files for Outlook and .nsf for Lotus Notes), laptops (.ost, .pst, .nsf), mobile devices, and home systems, particularly those with remote access to networks.
  4. OLK system subfolders holding viewed attachments to Microsoft Outlook messages, including deleted messages.
  5. Mobile devices: An estimated 65% of e-mail messages were opened using mobile phones and tablets in Q4 2015. As many of these were downloaded to a local mail app, they reside on the device and do not necessarily lose such content when the same messages are deleted from the server. E-mail on mobile devices is readily accessible to the user, but poses daunting challenges for preservation and collection in e-discovery workflows.
  6. Nearline e-mail: Optical “juke box” devices, backups of user e-mail folders.
  7. Archived or journaled e-mail: e.g., HP Autonomy Zantaz Enterprise Archive Solution, EMC EmailXtender, NearPoint Mimosa, Symantec Enterprise Vault.
236
Q

Accessible, but Often Overlooked:

A
  1. E-mail residing on non-party servers: ISPs (IMAP, POP, HTTP servers), Office 365, Gmail, Yahoo! Mail, Hotmail, etc.
  2. E-mail forwarded and cc’d to external systems: Employee forwards e-mail to self at personal e-mail account.
  3. E-mail threaded as text behind subsequent exchanges.
  4. Offline local e-mail stored on removable media: External hard drives, thumb drives and memory cards, optical media: CD-R/RW, DVD-R/RW, floppy drives, zip drives.
  5. Archived e-mail: Auto-archived or saved under user-selected filename.
  6. Common user “flubs”: Users experimenting with export features unwittingly create e-mail archives.
  7. Legacy e-mail: Users migrate from e-mail clients “abandoning” former e-mail stores. Also, e-mail on mothballed or re-tasked machines and devices.
  8. E-mail saved to other formats: PDF, .tiff, .txt, .eml, .msg, etc.
  9. E-mail contained in review sets assembled for other litigation/compliance purposes.
  10. E-mail retained by vendors or third parties (e.g., former service provider or attorneys)
  11. Paper print outs.
237
Q

Less Accessible:

A
  1. Offline e-mail on server backup tapes and other media.
  2. E-mail in forensically accessible areas of local hard drives and re-tasked/reimaged legacy machines: deleted e-mail, internet cache, unallocated clusters.
238
Q

Finding Outlook E-Mail

A
  1. .PST: single, often massive, database file
  2. .OST: offline synchronization file is commonly encountered on laptops configured for Exchange Server environments. It exists for the purpose of affording access to messages when the user has no active network connection.
  3. Archive.pst: Users can customize these intervals, turn archiving off or instruct the application to permanently delete old items.
  4. “Temporary” OLK Folders: Holds attachments in tmp folders
239
Q

Email Servers:

A

E-mail originating on servers is generally going to fall into two realms, being online “live” data, which is deemed reasonably accessible, and offline “archival” data, routinely deemed inaccessible based on considerations of cost and burden

240
Q

Older versions of Exchange Server stored data

A

Older versions of Exchange Server stored data in a Storage Group containing a Mailbox Store and a Public Folder Store, each composed of two files: an .edb file and a .stm file

241
Q

Priv1.edb

A

is a rich-text database file containing user’s e-mail messages, text attachments and headers.

242
Q

Priv1.stm

A

is a streaming file holding SMTP messages and containing multimedia data formatted as MIME data.

243
Q

Public Folder Store, Pub1.edb,

A

is a rich-text database file containing messages, text attachments and headers for files stored in the Public Folder tree.

244
Q

Pub1.stm

A

is a streaming file holding SMTP messages and containing multimedia data formatted as MIME data.

245
Q

Later versions of Exchange Server did away with STM files altogether, shifting their content into the EDB database files.

A

True

246
Q

Recovery Storage Group:

A

supports collection of e-mail from the server without any need to interrupt its operation or restore data to a separate recovery computer

247
Q

Exchange Server Mailbox Merge Wizard

A
  1. Universally called ExMerge, it allows for rudimentary filtering of messages for export, including by message dates, folders, attachments and subject line content.
  2. PST carved out of EDB using ExMerge
248
Q

Three types of journaling:

A

Message-only journaling
Bcc journaling,
Envelope Journaling

249
Q

Message-only journaling

A

Message-only journaling which does not account for blind carbon copy recipients, recipients from transport forwarding rules, or recipients from distribution group expansions;

250
Q

Bcc journaling

A

Bcc journaling, which is identical to Message-only journaling except that it captures Bcc addressee data; and

251
Q

Envelope Journaling

A

Envelope Journaling which captures all data about the message, including information about those who received it. Envelope journaling is the mechanism best suited to e-discovery preservation and regulatory compliance.

252
Q

Best-suited journaling

A

Envelope Journaling

253
Q

Transport rules agents

A

Transport rules agents “listen” to e-mail traffic, compare the content or distribution to a set of rules (conditions, exceptions and actions) and, if particular characteristics are present, intercede to block, route, flag or alter suspect communications.

254
Q

Lotus local files extensions

A

.id and .nsf

255
Q

.MSG

A

All we can say is that the MSG file is a highly compatible near-native format for individual Outlook messages–more complete than the transiting e-mail and less complete than the native PST

256
Q

Exchange Database and sporting the file extension

A

.EDB

257
Q

Gmail Law case

A

The question whether it’s feasible to produce Gmail in its native form triggered an order by U.S. Magistrate Judge Mark J. Dinsmore in a case styled, Keaton v. Hannum, 2013 U.S. Dist. LEXIS 60519 (S.D. Ind. Apr. 29, 2013). It’s a seamy, sad suit brought pro se by an attorney named Keaton against both his ex-girlfriend, Christine Zook, and the cops who arrested Keaton for stalking Zook.

258
Q

Latest RFC

A

The latest protocol revision is called RFC 5322 (2008)

259
Q

system catalog

A

the compendium of all that metadata is also called the system catalog

260
Q

Three reasons to retain information

A

Business, Regulatory obligation, Litigation Duty

261
Q

Litigation Duty:

A

Duty owed to court

262
Q

Duty to Preserve:

A

Reasonable and good faith efforts to retain information (Sedona Principle 5)

263
Q

Identify and Preserve

A

Identify: All data, data map
Preserve: potentially responsive ESI

264
Q

Legal hold Notice vs Preservation Letter

A

A legal hold notice goes to your client; a preservation letter goes to the other side.

265
Q

Zubulake decisions

A

(Southern District of New York): The producing party ordinarily pays for production. The greater the burden relative to relevance, the stronger the case for cost shifting.
Counsel must work with clients, monitor and audit compliance, and educate to ensure ESI is preserved.
Cost sharing: 75% UBS, 25% Zubulake
Rule 37(e) of the Federal Rules
Verdict: April 6, 2005, awarded $29.2 million: $9.1 million compensatory, $20.1 million punitive
Seek input from key players

266
Q

TRCP Rule 196.4 (Effective 1999)

A

The requesting party must specifically request production of electronic or magnetic data and specify the form. The responding party should produce the electronic or magnetic data that is responsive and reasonably available (similar to Federal Rule 34). The court must also order that the requesting party pay the reasonable expenses of any extraordinary steps. Under the federal rules, by contrast, the producing party ordinarily pays, not the requesting party.

267
Q

MAPI

A

Microsoft Outlook and Microsoft Exchange are database applications that talk to each other using a protocol (machine language) called MAPI, for Messaging Application Programming Interface.

268
Q

chain of custody

A

Demonstrating the authenticity of ESI as evidence entails protecting the integrity of the evidence (data and metadata) and proving a proper chain of custody. No ESI should be produced in discovery or offered into evidence absent the ability to trace it back to its origins.

269
Q

collection tends to employ two techniques:

A

targeted collection and forensic imaging.

270
Q

Forensic imaging

A

entails the duplication of the complete contents of a storage medium, typically encompassing the readily-accessible active data areas and the inaccessible, forensically-significant regions of the medium like unallocated clusters and slack space.

271
Q

Targeted collection

A
  1. targeted collection is the identification and duplication of potentially relevant ESI according to specific characteristics of the files and folders in which it resides.
  2. Targeted collection tends to reduce data volumes, with a commensurate reduction in costs to process and host ESI.
  3. Decreased volume also means less data to search and review, prompting greater savings.
  4. But the savings sought from targeted collection must be weighed against the risk of leaving relevant and responsive ESI behind and the expenditures required to scope and carry out targeted collection.
  5. A “targeted collection” is the copying of some or all the active data from an evidence source.
272
Q

“write protection” or “write blocking.”

A

Forensic examiners are careful to prevent the preservation process from effecting changes to the source evidence in a process called “write protection” or “write blocking.”

273
Q

Public Cloud

A

Public Cloud (i.e., on servers shared using the public Internet and managed by third-party service providers like Amazon Web Services and Microsoft’s Azure). Forensic artifacts like deleted files in unallocated clusters don’t exist in the public cloud, making targeted collection the only option.

274
Q

In re Weekley Homes:

A

The requesting party asked for relevant deleted e-mails and the defendant’s hard drives, and had its own forensic examiner.
Wanted to search within a one-year scope
21 search terms
Defendant objected that the request was too intrusive and would expose private conversations, trade secrets and privileged communications.
Mandamus: an action to compel, here brought by the litigant to compel the judge

275
Q

Custodial hold

A

Custodial hold is relying upon the custodians (the creators and holders) of data to preserve it.
You should backstop custodial holds with objective preservation measures tending to defray the risk of reliance on custodial holds.

276
Q

Ten Elements of a “Perfect” Legal Hold Notice

A
  1. Timely
  2. Communicated through an effective channel
  3. Issued by person(s) with clout
  4. Sent to all necessary custodians
  5. Communicates gravity and accountability
  6. Supplies context re: claim or litigation
  7. Offers clear, practical guidance re: actions and deadlines
  8. Sensibly scopes sources and forms
  9. Identifies mechanism and contact for questions
  10. Incorporates acknowledgement, follow up and refresh
277
Q

E-Discovery from Mobile Devices

A

Data volume is growing at a compound annual rate of 42 percent. Today’s data volumes matter because we are facing volumes ten times as great in five years, and fifty times as great in ten years.
According to the U.S. Centers for Disease Control, more than 41% of American households have no landline phone. For those between the ages of 25 and 29, two-thirds are wireless-only.
Apple sold more than one billion iPhones worldwide from 2007 to 2016. These hold apps drawn from the more than 2.3 million apps offered in the iOS App Store, compounding the more than 25 billion times these apps have been downloaded and installed.
Phones running the Android operating system account for three times as many activations as Apple phones.
“Today many of the more than 90% of American adults who own cell phones keep on their person a digital record of nearly every aspect of their lives.” (Riley v. California, 573 U.S.)
We’ve been acquiring from hard drives for thirty years, using two principal interfaces: PATA and SATA.
try seven generations of iOS in five years
Moreover, and by law, any phone sold in the U.S. must be capable of precise GPS-style geolocation in order to support 9-1-1 emergency response services. Your phone broadcasts its location all the time with a precision better than ten meters.
Using If This Then That (IFTTT), I can ask iPhone’s Siri to turn lights on and off by texting them, at the cost of leaving an indelible record of even that innocuous act.
Contents can often be erased by users entering the wrong password repeatedly, and it’s not uncommon to see users making this “mistake” on the eve of being required to surrender their phones.

278
Q

Challenges in EDRM for Mobile devices: - Information Governance:

A

Businesses adopt a BYOD (Bring Your Own Device) model when they allow employees to connect their personal phones and tablets to the corporate network.

279
Q

Challenges in EDRM for Mobile devices: Identification:

A

Mobile devices tend to be replaced and upgraded more frequently than laptop and desktop computers; accordingly, it’s harder to maintain an up-to-date data map for mobile devices. Mobile devices also do not support remote collection software of the sort that makes it feasible to search other network-connected computer systems. Too, the variety of apps and the difficulty of navigating the file systems of mobile devices complicate the ability to catalog contents.

280
Q

Challenges in EDRM for Mobile devices: Preservation:

A

Even the seemingly simple task of preserving text messages can be daunting to the user who realizes that, e.g., the iPhone offers no easy means to download or print text messages.

281
Q

Challenges in EDRM for Mobile devices: Collection

A

Collection from the device is typically performed by a computer forensic expert and tends to be harder, slower and costlier than collection from PC/server environments.

282
Q

Challenges in EDRM for Mobile devices: Processing:

A

complicated by the fact that so many devices have their own unique operating systems. Moreover, each tends to secure data in unique, effective ways, such that encrypted data cannot be processed at all if it is not first decrypted.

283
Q

Challenges in EDRM for Mobile devices: Review:

A

Review tools like Concordance and Relativity are not equipped to support ingestion and review of all the types and forms of electronic evidence that can be elicited from modern mobile devices and applications.

284
Q

Challenges in EDRM for Mobile devices: Analysis:

A

Data from mobile devices tends not to be a good candidate for advanced analytics tools like predictive coding.

285
Q

Challenges in EDRM for Mobile devices: Production:

A

Much work remains with respect to forms of production best suited to mobile data and how to preserve the integrity, completeness and utility of the data as it moves out of the proprietary phone/app environment and into the realm of more conventional e-discovery tools.

286
Q

Four Options for Mobile Preservation

A

1. Prove You Don’t Have to Do It: This was easier in the day when many companies employed Blackberry Enterprise Servers to redirect data to then-ubiquitous Blackberry phones. Today, it’s much harder to posit that a mobile device has no unique content. But, if that’s your justification to skip retention of mobile data, you should be prepared to prove that anything you’d have grabbed from the phone was obtained from another source.
2. Sequester the Device: From the standpoint of overall cost of preservation, it may be cheaper and easier to replace the device, put the original in airplane mode (to prevent changes to contents and remote wipes) and sequester it. Be sure to obtain and test credentials permitting access to the contents before sequestration.
3. Search for Software Solutions: For example, if you only need to preserve messaging, there are applications geared to that purpose, such as iMazing, Decipher TextMessage or Ecamm PhoneView. Before using unknown software, assess what its limitations may be in terms of the potential for altering metadata values or leaving information behind.
4. Get the credentials, Hire a Pro and Image It: Forensic examiners expert in mobile acquisition will have invested in specialized tools like Cellebrite UFED, Micro Systemation XRY, Lantern or Oxygen Forensic Suite. Forensic imaging exploits three levels of access to the contents of mobile devices referred to as Physical, Logical and File System access. Though a physical level image is the most complete, it is also the slowest and hardest to obtain in that the device may need to be “rooted” or “jailbroken” in order to secure access to data stored on the physical media.

287
Q

Three Principles: underscore the need for efficient, defensible preservation of relevant mobile content:

A
  • When mobile data may be unique and relevant, it should be preserved in anticipation of litigation. This principle is especially compelling when the preservation burden is trivial (as by use of the backup technique described below). You can demonstrate the absence of relevant data by, e.g., sampling the contents of devices; but standing alone, a policy barring the use of a device to store relevant data is not sufficient proof that such device has not, in fact, been used to store data. Too often, practice belies policy, particularly for messaging
  • Mobile preservation should be a customary feature of a defensible litigation hold; but absent issues of spoliation, few matters warrant the added cost of mobile preservation by forensics experts or the burden and disruption of separating users from mobile devices.
  • Legitimate concerns respecting personal privacy and privilege do not justify a failure to preserve relevant mobile data, although they will dictate how data is protected, processed, searched, reviewed and produced.
288
Q

Three Provisos:

A

The method demonstrated here is but one simple, scalable and defensible
Preservation isn’t production.
Please challenge, but don’t dismiss.

289
Q

The advantages of custodian-directed preservation of mobile devices by backup are:

A
  • Custodians need not make judgments concerning relevance, materiality and privilege;
  • Custodians need not run searches and require no special tools or training;
  • The backup process is speedy, easy to authenticate and lets custodians retain their phones;
  • It’s difficult to omit content from a backup and, once created, backups are hard to alter.
290
Q

An iPhone backup won’t preserve e-mail stored on the iPhone.

A
  • Per Apple, an unencrypted iTunes backup also won’t include:
  • Content from the iTunes and App Stores, or PDFs downloaded directly to iBooks
  • Content synced from iTunes, like imported MP3s or CDs, videos, books, and photos
  • Photos already stored in the cloud, like My Photo Stream, and iCloud Photo Library
  • Touch ID settings
  • Apple Pay information and settings
  • Activity, Health and Keychain data
291
Q

Why not use iCloud?

A

At some point, you will use iCloud for preservation; but currently, an iCloud backup is not equal to an iTunes backup. It preserves less data, and byte-for-byte, it takes more time to create than an iTunes backup. Additionally, iCloud encrypts all backups, making them a future challenge for processing and search should a user’s credentials be unavailable.

292
Q

Social Media Content

A

Applications like X1 Social Discovery and service providers like Hanzo can help with SMC preservation; but the task demands little technical savvy and no specialized tools
JSON (JavaScript Object Notation)
keyword search remains the most common approach to e-discovery
Relativity and dtSearch use the common “w/n” to denote a search for two terms or phrases within the number “n” words of one another. OpenText Insight uses the syntax “NEAR/n” for the same purpose and DISCO uses, simply, “/n” when searching within n words in any order and “+n” when searching within n words where the first term must precede the second
Effective processing must be recursive, thoroughly cycling through all the levels of encoding and applying the correct decoding methods to harvest all desired content.
when you search for information in an e-discovery tool, you are not searching the source data; you are searching a collection of information that has been extracted from the source data and indexed.
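
The proximity operators described above can be illustrated with a minimal sketch. It assumes a naive lowercase tokenizer and counts distance in whole tokens; real review platforms tokenize and count differently, and `tokenize` and `within_n` are hypothetical names, not any vendor's API.

```python
import re

def tokenize(text):
    """Split text into lowercase word tokens (an illustrative rule only)."""
    return re.findall(r"[a-z0-9']+", text.lower())

def within_n(text, first, second, n, ordered=False):
    """Return True if `first` and `second` occur within n tokens of each other.

    ordered=False mimics an any-order operator (w/n, NEAR/n, /n).
    ordered=True mimics an operator where `first` must precede `second` (+n).
    """
    tokens = tokenize(text)
    first_pos = [i for i, t in enumerate(tokens) if t == first.lower()]
    second_pos = [i for i, t in enumerate(tokens) if t == second.lower()]
    for i in first_pos:
        for j in second_pos:
            distance = j - i
            if ordered:
                hit = 0 < distance <= n
            else:
                hit = 0 < abs(distance) <= n
            if hit:
                return True
    return False

sample = "The hold notice was sent after the preservation letter arrived."
print(within_n(sample, "notice", "letter", 6))                # any order
print(within_n(sample, "letter", "notice", 6, ordered=True))  # order matters
```

Note that the function searches the extracted token list, not the original string, which mirrors the point that a search runs against an index rather than the source data.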

293
Q

Interrogatory: For each electronic system or index that will be searched to respond to discovery, please state:

A
  1. The rules employed by the system to tokenize data so as to make it searchable;
  2. The stop words used when documents, communications or ESI were added to the system or index;
  3. The number and nature of documents or communications in the system or index which are not searchable because of the system or index being unable to extract their full text or metadata; and
  4. Any limitation in the system or index, or in the search syntax to be employed, tending to limit or impair the effectiveness of keyword, Boolean or proximity search in identifying documents or communications that a reasonable person would understand to be responsive to the search.
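
Why tokenization rules and stop words matter can be seen in a small sketch. The tokenizing rule and the stop word list below are assumptions chosen for illustration; each real system has its own, which is exactly what the interrogatory asks about.

```python
import re

# Hypothetical stop word list; real indexes ship their own defaults.
STOP_WORDS = {"a", "an", "and", "of", "the", "to"}

def index_terms(text):
    """Tokenize on runs of letters and digits, then drop stop words."""
    tokens = re.findall(r"[a-z0-9]+", text.lower())
    return [t for t in tokens if t not in STOP_WORDS]

terms = index_terms("Send the R&D memo to AT&T")
print(terms)
```

Under these assumed rules the ampersand splits "R&D" and "AT&T" into fragments, so a keyword search for either term as a unit would find nothing in this index even though the source text contains both.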
294
Q

optical character recognition

A

Files (like scanned paper documents and photos) that depict but don’t store text may be subjected to optical character recognition to enable electronic search.

295
Q

DISCO performs a search using the following order of operations:

A

Term modifiers: !, *, ~
Exact phrases: “ ”
Groupings: ( )
Proximity: /n, +n
Family subsearch: +family
&, and, %, not
[space], or

296
Q

Blair and Maron study

A

Attorneys believed their searches had found 75% of the relevant documents; they had actually found only about 20%.

297
Q

Review is responsible for 75-90% of cost.
Human reviewers miss 20-75% of responsive documents
Only 58% of docs deemed relevant by one reviewer are deemed relevant by the next

A

Study Yes

298
Q

Main ways to make keyword/ESI search better at finding relevant documents

A

Test Keywords, Iterate searches, Automate

299
Q

Recall vs Precision

A

Recall: proportion of responsive documents retrieved
Precision: proportion retrieved that are responsive
Broad search - Recall high, Precision low
Narrow search - Recall low, Precision high
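
As a minimal sketch, the two ratios can be computed directly from three counts. The function and parameter names are hypothetical, and the figures are the ones used in the worked example card (20 responsive items in total, 25 items returned, 15 of them responsive).

```python
def recall_precision(total_responsive, retrieved, responsive_retrieved):
    """Return (recall, precision) as fractions between 0 and 1."""
    recall = responsive_retrieved / total_responsive
    precision = responsive_retrieved / retrieved
    return recall, precision

recall, precision = recall_precision(
    total_responsive=20, retrieved=25, responsive_retrieved=15
)
print(f"Recall = {recall:.0%}, Precision = {precision:.0%}")
# prints "Recall = 75%, Precision = 60%"
```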

300
Q

Example: Total Responsive = 20
25 items returned
15 responsive
10 bad

Recall = 15/20 = 75%
Precision = 15/25 = 60%

A

Solved?

301
Q

TAR is latent Semantic

A

True

302
Q

O’Keefe case:

A

Given this complexity, for lawyers and judges to dare opine that a certain search term or terms would be more likely to produce information than the terms that were used is truly to go where angels fear to tread. Accordingly, if defendants are going to contend that the search terms used by the government were insufficient, they will have to specifically so contend in a motion to compel and their contention must be based on evidence that meets the requirements of Rule 702 of the Federal Rules of Evidence.

303
Q

FORMS OF PRODUCTION IN THE FEDERAL RULES

A

Rule 26(f)(3)(C) requires the parties to submit a discovery plan to the Court prior to the first pretrial conference. The plan must address “any issues about disclosure or discovery of electronically stored information, including the form or forms in which it should be produced.”
Rule 34(b)(1)(C) permits requesting parties to “specify the form or forms in which electronically stored information is to be produced,” yet it’s common for requests for production to be wholly silent on forms of production, despite pages of detailed definitions and instructions.

304
Q

The Federal Rules lay out FIVE STEPS to seeking and objecting to forms of production:

A
  1. Before the first pretrial conference, parties must hash out issues related to “the form or forms in which [ESI] should be produced.” FRCP 26(f)(3)(C)
  2. Requesting party specifies the form or forms of production for each type of ESI sought: paper, native, near-native, imaged formats or a mix of same. FRCP 34(b)(1)(C)
  3. If the responding party will supply the specified forms, the parties proceed with production. If not, the responding party must object and designate the forms in which it intends to make production. If the requesting party fails to specify forms sought, responding party must state the form or forms it intends to produce. FRCP 34(b)(2)(D)
    The Notes to Rule 34(b) add: “A party that responds to a discovery request by simply producing electronically stored information in a form of its choice, without identifying that form in advance of the production . . . runs a risk that the requesting party can show that the produced form is not reasonably usable and that it is entitled to production of some or all of the information in an additional form.”
  4. If requesting party won’t accept the forms the producing party designates, requesting party must confer with the producing party in an effort to resolve the dispute. FRCP 37(a)(1)
  5. If the parties can’t agree, requesting party files a motion to compel, and the Court selects the forms to be produced.
305
Q

Options for forms of production include:

A
  • Paper [where the source is paper and the volume small]
  • Page Images [best for items requiring redaction and scanned paper records]
  • Native [spreadsheets, electronic presentations and word processed documents]
  • Near-native [e-mail and database content]
  • Hosted production
306
Q

Paper

A

Converting searchable electronic data to paper is rarely a reasonable form of production for ESI, but paper remains an option where the items to be produced are paper documents and so few in number that electronic searchability isn’t essential.

307
Q

Page Images

A

Parties produce digital “pictures” of documents, e-mails and other electronic records, typically furnished in Adobe’s Portable Document Format (PDF) or as Tagged Image File Format (TIFF) images. Converting ESI to TIFF images strips its electronic searchability and metadata. Accordingly, TIFF image productions are accompanied by load files holding searchable text and selected metadata (a so-called “TIFF+ production”). Searchable text is obtained by extraction from an electronic source or, for scanned paper documents, by use of optical character recognition (OCR). Load files are composed of delimited text, i.e., values following a predetermined sequence and separated by characters like commas, tabs or quotation marks. The organization and content of load files must be negotiated, and is often pegged to review software like Summation, Concordance or Relativity.
Pros: Imaged formats are ideal for production of scanned paper records, microfilm and microfiche, especially when OCR serves to add electronic searchability.
Cons: Imaged production breaks down when ESI holds embedded information (e.g., collaborative content like comments or formulae in spreadsheets) or non-printable information (e.g., voice mail, video or animation and structured data). Imaged productions may also serve to degrade evidence when the information is fielded (e.g., structured data and messaging) or functional (e.g., animations in presentations, table relationships in structured data or threads in e-mail).

308
Q

Native Production

A

Parties produce the actual data files containing responsive information, e.g., Word documents in their native .DOC or .DOCX formats, Excel spreadsheets as .XLS and .XLSX files and PowerPoint presentations in native .PPT and .PPTX. Native production is cheaper and superior in competent hands using tools purpose-built for native review.
Pros: The immediate benefits to the producing party are speed and economy—little or nothing must be spent on image conversion, text extraction or OCR.
The benefits to the requesting party are substantial. Using native review tools or applications like those used to create the data (Careful here! See Cons below), requesting parties see the evidence as it appeared to the producing party. Embedded commentary and metadata aren’t stripped away, deduplication is facilitated, e-mail messages can be threaded into conversations, time zone irregularities are normalized and costs are reduced and utility enhanced every step of the way. Moreover, native file sizes tend to be many times more compact than their counterparts converted to static images, making native forms much less costly to ingest for processing and host for review.
Cons: Applications needed to view rare and obscure data formats may be prohibitively expensive (e.g., specialized engineering applications or enterprise database software). If native applications are (unwisely) tasked to review, e.g., Microsoft Word for reviewing Word documents, copies must be used to avoid altering evidence.

309
Q

Near-Native Production

A

When some ESI cannot be feasibly or prudently tendered in true native formats, near-native forms preserve the essential utility, content and searchability of native forms but are not, strictly speaking, native forms. Examples:
* Enterprise e-mail – Enterprise e-mail systems store messages in monolithic container formats like Exchange Server’s EDB format; so, exported messages tend to be stored in container or single-message formats not native to the mail server. These replicate the pertinent content and essential functionality of the source, but again, are not, strictly speaking, native forms.
* Databases - Exports from databases are often produced in delimited formats not native to the database yet supporting the ability to interpret the data in ways faithful to the source.
* Social networking content - Content from social networking sites like Facebook won’t replicate the precise way the content is stored in the cloud, so near-native forms seek to replicate its essential utility, completeness and searchability.

310
Q

Hosted Production

A

Hosted production is more a delivery medium than a discrete form of production. Hosted production resides on a secure website. Requesting parties access data using their web browser, searching, viewing, annotating and downloading data. The electronic forms of production above are the grist ingested by hosting providers (service providers) to comprise the hosted collection.

311
Q

Load files

A

Metadata about paper documents into load files before
We employ load files as a sort of road map and as assembly instructions laying out, inter alia, where document images and the load files holding their searchable text and metadata are located on disks or other media used to store and deliver productions and how the various pieces relate to one another.
So, to review, some load files carry extracted text to facilitate search, some carry metadata about the documents and some carry information about how the pieces of the production are stored and how they fit together.
Load files using commas to separate values are called “comma separated value” or CSV files.

312
Q

In e-discovery, there are three principal functions for delimiters:

A
  • Document Delimiter: A Document Delimiter signals a switch from one document to the next. In the most common load file format, a carriage return/line feed serves this purpose.
  • Field Delimiter: A Field Delimiter signals a change from one field to the next.
  • Quote Delimiter: A Quote Delimiter permits a delimiting character to be used within the fielded data without it being treated as a field delimiter. When, for example, a comma serves as a delimiter in a CSV file, the Quote Delimiter enables the comma to be treated as a comma in the text rather than indicating a shift from one field to the next.
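
A short sketch of the three delimiter roles, using Python's standard `csv` module. The field names and values are invented for illustration; the point is only that quoted commas survive a round trip while a naive split on commas breaks the record.

```python
import csv
import io

# Hypothetical two-row load file: a header plus one document record.
rows = [
    ["BegBates", "Custodian", "Subject"],
    ["ACME000001", "Smith, Jane", "Q3 forecast, revised"],
]

# Write: fields containing a comma are wrapped in the quote delimiter (").
buffer = io.StringIO()
csv.writer(buffer).writerows(rows)
raw = buffer.getvalue()
print(raw)

# Read: the quote delimiter tells the parser those commas are just text.
parsed = list(csv.reader(io.StringIO(raw)))

# A naive split treats every comma as a field delimiter and miscounts fields.
naive = raw.splitlines()[1].split(",")
print(len(parsed[1]), len(naive))
```

Here the line ending acts as the document delimiter, the comma as the field delimiter and the double quote as the quote delimiter.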
313
Q

Concordance load files

A

use the file extension .DAT and the þ (thorn, ALT-0254, Unicode 00FE) as the Quote Delimiter and the ¶ (pilcrow, ALT-0182, Unicode 00B6) character as the Field Delimiter.
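
A rough sketch of reading one record with the delimiters as this card gives them. It assumes every field is wrapped in the quote delimiter and that one line holds one record; `parse_dat_line`, `make_dat_line` and the sample field names are hypothetical, and a production parser would need to handle more cases.

```python
QUOTE = "\u00fe"  # thorn, the quote delimiter named in the card
FIELD = "\u00b6"  # pilcrow, the field delimiter named in the card

def parse_dat_line(line):
    """Split one record into field values, removing the quote delimiters."""
    record = line.rstrip("\r\n")
    if record.startswith(QUOTE) and record.endswith(QUOTE):
        record = record[1:-1]
    return record.split(QUOTE + FIELD + QUOTE)

def make_dat_line(values):
    """Build a record in the same layout (used here only to create sample data)."""
    return QUOTE + (QUOTE + FIELD + QUOTE).join(values) + QUOTE

header = make_dat_line(["BEGDOC", "CUSTODIAN", "SUBJECT"])
row = make_dat_line(["ACME000001", "Smith, Jane", "Q3 forecast, revised"])

fields = dict(zip(parse_dat_line(header), parse_dat_line(row)))
print(fields["CUSTODIAN"])
```

Because neither delimiter is a character that normally appears in documents, commas inside the field values need no special handling.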

314
Q

Summation load files

A

use the file extension .DII, and separate each record like so:

315
Q

Opticon load files

A

(file extension .OPT) are used in conjunction with Concordance load files to pair Bates numbered pages with corresponding page images and to define the unitization of each document; that is, where they begin and end.

316
Q

Overlay load files

A

are used to update or correct existing database content by replacing data in fields in the order in which the records occur. Thus, it’s crucial that the order of data within the overlay file match the order of data replaced. Data must be sorted in the same way, and the overlay must not add or omit fields.

317
Q

Produce delimited load file(s) supplying relevant system metadata field values for each information item by Bates number. Typical field values supplied include:

A

a. Source file name (original name of the item or file when collected from the source custodian or system);
b. Source file path (fully qualified file path from the root of the location from which the item was collected);
c. Last modified date and time (last modified date and time of the item);
d. UTC Offset (the UTC/GMT offset of the item’s modified date and time, e.g., -500);
e. Custodian or source (unique identifier for the original custodian or source);
f. Document type;
g. Production File Path (file path to the item from the root of the production media);
h. MD5 hash (MD5 hash value of the item as produced);
i. Redacted flag (indication whether the content or metadata of the item has been altered after its collection from the source custodian or system);
j. Embedded Content Flag (indication that the item contains embedded or hidden comments, content or tracked changes); and
k. Deduplicated instances (by full path).
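
The MD5 hash field in item h can be computed with Python's standard `hashlib`. This is a minimal sketch: the function names are hypothetical, and the sample bytes stand in for a produced file.

```python
import hashlib
import io

def md5_of_stream(stream, chunk_size=1024 * 1024):
    """Hash a binary stream in chunks so large files need not fit in memory."""
    digest = hashlib.md5()
    for chunk in iter(lambda: stream.read(chunk_size), b""):
        digest.update(chunk)
    return digest.hexdigest()

def md5_of_file(path):
    """Hash the file at `path` exactly as it sits on the production media."""
    with open(path, "rb") as handle:
        return md5_of_stream(handle)

# Two stand-in items differing by a single character.
original = md5_of_stream(io.BytesIO(b"Quarterly forecast, v1"))
altered = md5_of_stream(io.BytesIO(b"Quarterly forecast, v2"))
print(original)
print(altered)
```

Any change to the bytes of an item yields a different hash value, which is why a per-item hash in the load file supports both authentication and deduplication.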

318
Q

The following additional fields shall accompany production of e-mail messages:

A

l. To (e-mail address(es) of intended recipient(s) of the message);
m. From (e-mail address of the person sending the message);
n. CC (e-mail address(es) of person(s) copied on the message);
o. BCC (e-mail address(es) of person(s) blind copied on the message);
p. Subject (subject line of the message);
q. Date Received (date the message was received);
r. Time Received (time the message was received);
s. Attachments (beginning Bates numbers of attachments);
t. Mail Folder Path (path of the message from the root of the mail folder); and
u. Message ID (unique message identifier).

319
Q

Bates identifier according to the following protocol:

A

i. The first four (4) characters of the filename will reflect a unique alphanumeric designation identifying the party making production;
ii. The next six (6) characters will be a designation reserved to the discretionary use of the party making production for the purpose of, e.g., denoting the case or matter. This value shall be padded with leading zeroes as needed to preserve its length;
iii. The next nine (9) characters will be a unique, consecutive numeric value assigned to the item by the producing party. This value shall be padded with leading zeroes as needed to preserve its length;
iv. The final six (6) characters are reserved to a sequence consistently beginning with a dash (-) or underscore (_) followed by a five digit number reflecting pagination of the item when printed to paper or converted to an image format for use in proceedings or when attached as exhibits to pleadings.
v. By way of example, a Microsoft Word document produced by Acme in its native format might be named: ACMESAMPLE000000123.docx. Were the document printed out for use in deposition, page six of the printed item must be embossed with the unique identifier ACMESAMPLE000000123_00006. Bates identifiers should be endorsed on the lower right corner of all printed pages, but not so as to obscure content.
vi. This format of the Bates identifier must remain consistent across all productions. The number of digits in the numeric portion and characters in the alphanumeric portion of the identifier should not change in subsequent productions, nor should spaces, hyphens, or other separators be added or deleted except as set out above.
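
The naming protocol above can be sketched as a small helper. The function name and its validation choices are assumptions made for illustration; the field widths (4, 6, 9, then a separator plus 5 digits) and the Acme example come from the protocol itself.

```python
def bates_identifier(party, matter, item_number, page=None, separator="_"):
    """Build a Bates identifier: 4-char party, 6-char matter, 9-digit item,
    and, for printed or imaged pages, a separator plus a 5-digit page number."""
    if len(party) != 4 or not party.isalnum():
        raise ValueError("party designation must be 4 alphanumeric characters")
    matter = str(matter).rjust(6, "0")  # pad with leading zeroes
    if len(matter) != 6:
        raise ValueError("matter designation must fit in 6 characters")
    if not 0 <= item_number <= 999_999_999:
        raise ValueError("item number must fit in 9 digits")
    identifier = f"{party}{matter}{item_number:09d}"
    if page is not None:
        if separator not in ("-", "_"):
            raise ValueError("separator must be a dash or an underscore")
        if not 1 <= page <= 99_999:
            raise ValueError("page number must fit in 5 digits")
        identifier += f"{separator}{page:05d}"
    return identifier

print(bates_identifier("ACME", "SAMPLE", 123) + ".docx")
# prints "ACMESAMPLE000000123.docx"
print(bates_identifier("ACME", "SAMPLE", 123, page=6))
# prints "ACMESAMPLE000000123_00006"
```

Fixed widths keep identifiers sortable and make it easy to spot a malformed name in a later production.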

320
Q

Rule 1 of the Federal Rules of Civil Procedure requires the court and parties to construe, administer and employ the Rules to “secure the just, speedy, and inexpensive determination of every action and proceeding.”

A

Study

321
Q

a black and white screenshot of each page called a Tagged Image File Format or TIFF image plus a load file or files holding text and metadata

A

Study

322
Q

The Federal Rules of Civil Procedure and most states’ rules empower a requesting party to specify the form or forms in which electronically stored information is to be produced. FRCP Rule 34(b)(1)(C). If a requesting party fails to specify a form of production (and you should NEVER fail to specify), a producing party must supply ESI in the form or forms in which it is ordinarily maintained or in a reasonably usable form or forms. FRCP Rule 34(b)(2)(E)(ii)

A

Study

323
Q

The most-commonly seen specification for TIFF image production calls for

A

single-page monochrome Group IV images at 300 dpi resolution.

324
Q

“Monochrome”

A

“Monochrome” specifies that the image be devoid of color, i.e., rendered in black and white (which reduces the byte size of the image but sacrifices appearance and intelligibility when color is used to convey content, like color coding and highlighting).

325
Q

“Group IV”

A

“Group IV” is a bitmap image compression specification, and 300dpi (for “300 dots per inch”) is a measure of print resolution. The higher the number of dots per inch, the crisper and more detailed the image.

326
Q

There are three ways I’ve seen vendors handle comments and all three significantly degrade searchability:

A

Delete the comments (spoliation);
merge the comments into the adjacent body text; and, frequently,
aggregate the comments and dump them at the end of the load file with no clue as to the page or text they reference.

327
Q

TIFF

A

Tagged Image File Format

328
Q

dpi

A

dots per inch

329
Q

the keys to preserving unitization

A

the keys to preserving unitization lie in both the ordering of documents by Bates numbers and the metadata supplied in load file

330
Q

ASCII

A

ASCII is an acronym for American Standard Code for Information Interchange

331
Q

Rule 1.

A

Scope and Purpose: the Rules should be construed, administered, and employed by the court and the parties to secure the just, speedy, and inexpensive determination of every action and proceeding

332
Q

Rule 26(b)

A

Discovery Scope and Limits
Scope in General. Unless otherwise limited by court order, the scope of discovery is as follows: Parties may obtain discovery regarding any nonprivileged matter that is relevant to any party’s claim or defense and proportional to the needs of the case, considering the importance of the issues at stake in the action, the amount in controversy, the parties’ relative access to relevant information, the parties’ resources, the importance of the discovery in resolving the issues, and whether the burden or expense of the proposed discovery outweighs its likely benefit. Information within this scope of discovery need not be admissible in evidence to be discoverable.
Limitations on Frequency and Extent. (B) Specific Limitations on Electronically Stored Information. A party need not provide discovery of electronically stored information from sources that the party identifies as not reasonably accessible because of undue burden or cost. On motion to compel discovery or for a protective order, the party from whom discovery is sought must show that the information is not reasonably accessible because of undue burden or cost. If that showing is made, the court may nonetheless order discovery from such sources if the requesting party shows good cause, considering the limitations of Rule 26(b)(2)(C). The court may specify conditions for the discovery.

333
Q

Rule 26f

A

Conference of Parties; Planning for Discovery
Conference Timing: the parties must confer as soon as practicable— and in any event at least 21 days before a scheduling conference is to be held or a scheduling order is due under Rule 16(b)
Conference Content; Parties’ Responsibilities: In conferring, the parties must consider the nature and basis of their claims and defenses and the possibilities for promptly settling or resolving the case; make or arrange for the disclosures required by Rule 26(a)(1); discuss any issues about preserving discoverable information; and develop a proposed discovery plan. The attorneys of record and all unrepresented parties that have appeared in the case are jointly responsible for arranging the conference, for attempting in good faith to agree on the proposed discovery plan, and for submitting to the court within 14 days after the conference a written report outlining the plan. The court may order the parties or attorneys to attend the conference in person.
Discovery Plan. A discovery plan must state the parties’ views and proposals on:
(B) the subjects on which discovery may be needed, when discovery should be completed, and whether discovery should be conducted in phases or be limited to or focused on particular issues;
(C) any issues about disclosure, discovery, or preservation of electronically stored information, including the form or forms in which it should be produced;
(D) any issues about claims of privilege or of protection as trial-preparation materials, including—if the parties agree on a procedure to assert these claims after production—whether to ask the court to include their agreement in an order under Federal Rule of Evidence 502;
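The two timing requirements in this rule (confer at least 21 days before the scheduling conference, report within 14 days after conferring) reduce to simple date arithmetic. The sketch below is only an illustration: the dates and function names are hypothetical, and it does not apply the time-computation rules of Rule 6, local rules, or any court order that might alter the deadlines.

```python
from datetime import date, timedelta


def latest_conference_date(scheduling_conference: date) -> date:
    """Last day to confer: at least 21 days before the scheduling conference."""
    return scheduling_conference - timedelta(days=21)


def report_due(conference_held: date) -> date:
    """The written report is due within 14 days after the Rule 26(f) conference."""
    return conference_held + timedelta(days=14)


# Hypothetical scheduling conference date.
sched = date(2024, 6, 28)
confer_by = latest_conference_date(sched)
due = report_due(confer_by)
print(confer_by.isoformat(), due.isoformat())
```

Because the rule says "as soon as practicable," the computed conference date is an outer limit, not a target.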

334
Q

Rule 26g

A

Signature Required; Effect of Signature.
By signing, an attorney or party certifies that to the best of the person’s knowledge, information, and belief formed after a reasonable inquiry:
(A) with respect to a disclosure, it is complete and correct as of the time it is made; and
(B) with respect to a discovery request, response, or objection, it is:
(i) consistent with these rules and warranted by existing law or by a nonfrivolous argument for extending, modifying, or reversing existing law, or for establishing new law;
(ii) not interposed for any improper purpose, such as to harass, cause unnecessary delay, or needlessly increase the cost of litigation; and
(iii) neither unreasonable nor unduly burdensome or expensive, considering the needs of the case, prior discovery in the case, the amount in controversy, and the importance of the issues at stake in the action.

335
Q

Amendments - Rule 26b

A

Under this rule, a responding party should produce electronically stored information that is relevant, not privileged, and reasonably accessible, subject to the (b)(2)(C) limitations that apply to all discovery.
The responding party must also identify, by category or type, the sources containing potentially responsive information that it is neither searching nor producing. The identification should, to the extent possible, provide enough detail to enable the requesting party to evaluate the burdens and costs of providing the discovery and the likelihood of finding responsive information on the identified sources.
A party’s identification of sources of electronically stored information as not reasonably accessible does not relieve the party of its common-law or statutory duties to preserve evidence. Whether a responding party is required to preserve unsearched sources of potentially responsive information that it believes are not reasonably accessible depends on the circumstances of each case. It is often useful for the parties to discuss this issue early in discovery.
If the parties cannot agree whether, or on what terms, sources identified as not reasonably accessible should be searched and discoverable information produced, the issue may be raised either by a motion to compel discovery or by a motion for a protective order. Once it is shown that a source of electronically stored information is not reasonably accessible, the requesting party may still obtain discovery by showing good cause, considering the limitations of Rule 26(b)(2)(C) that balance the costs and potential benefits of discovery. The decision whether to require a responding party to search for and produce information that is not reasonably accessible depends not only on the burdens and costs of doing so, but also on whether those burdens and costs can be justified in the circumstances of the case. Appropriate considerations may include: (1) the specificity of the discovery request; (2) the quantity of information available from other and more easily accessed sources; (3) the failure to produce relevant information that seems likely to have existed but is no longer available on more easily accessed sources; (4) the likelihood of finding relevant, responsive information that cannot be obtained from other, more easily accessed sources; (5) predictions as to the importance and usefulness of the further information; (6) the importance of the issues at stake in the litigation; and (7) the parties’ resources.

336
Q

Amendment Rule 26f

A

When a case involves discovery of electronically stored information, the issues to be addressed during the Rule 26(f) conference depend on the nature and extent of the contemplated discovery and of the parties’ information systems. It may be important for the parties to discuss those systems, and accordingly important for counsel to become familiar with those systems before the conference. With that information, the parties can develop a discovery plan that takes into account the capabilities of their computer systems. In appropriate cases identification of, and early discovery from, individuals with special knowledge of a party’s computer systems may be helpful.
Rule 26(f)(3) explicitly directs the parties to discuss the form or forms in which electronically stored information might be produced. The parties may be able to reach agreement on the forms of production, making discovery more efficient. Rule 34(b) is amended to permit a requesting party to specify the form or forms in which it wants electronically stored information produced. If the requesting party does not specify a form, Rule 34(b) directs the responding party to state the forms it intends to use in the production. Early discussion of the forms of production may facilitate the application of Rule 34(b) by allowing the parties to determine what forms of production will meet both parties’ needs. Early identification of disputes over the forms of production may help avoid the expense and delay of searches or productions using inappropriate forms.
Rule 26(f) is also amended to direct the parties to discuss any issues regarding preservation of discoverable information during their conference as they develop a discovery plan. This provision applies to all sorts of discoverable information, but can be particularly important with regard to electronically stored information. The volume and dynamic nature of electronically stored information may complicate preservation obligations. The ordinary operation of computers involves both the automatic creation and the automatic deletion or overwriting of certain information. Failure to address preservation issues early in the litigation increases uncertainty and raises a risk of disputes.
The parties should take account of these considerations in their discussions, with the goal of agreeing on reasonable preservation steps.
Computer programs may retain draft language, editorial comments, and other deleted matter (sometimes referred to as “embedded data” or “embedded edits”) in an electronic file but not make them apparent to the reader. Information describing the history, tracking, or management of an electronic file (sometimes called “metadata”) is usually not apparent to the reader viewing a hard copy or a screen image. Whether this information should be produced may be among the topics discussed in the Rule 26(f) conference. If it is, it may need to be reviewed to ensure that no privileged information is included, further complicating the task of privilege review.

337
Q

Amendment 26b1

A

Nor is the change intended to permit the opposing party to refuse discovery simply by making a boilerplate objection that it is not proportional. The parties and the court have a collective responsibility to consider the proportionality of all discovery and consider it in resolving discovery disputes.
Framing intelligent requests for electronically stored information, for example, may require detailed information about another party’s information systems and other information resources.

338
Q

Rule 34

A

The wide variety of computer systems currently in use, and the rapidity of technological change, counsel against a limiting or precise definition of electronically stored information. Rule 34(a)(1) is expansive and includes any type of information that is stored electronically. A common example often sought in discovery is electronic communications, such as e-mail. The rule covers—either as documents or as electronically stored information—information “stored in any medium,” to encompass future developments in computer technology. Rule 34(a)(1) is intended to be broad enough to cover all current types of computer-based information, and flexible enough to encompass future changes and developments.
The addition of testing and sampling to Rule 34(a) with regard to documents and electronically stored information is not meant to create a routine right of direct access to a party’s electronic information system, although such access might be justified in some circumstances. Courts should guard against undue intrusiveness resulting from inspecting or testing such systems.
The amendment to Rule 34(b) permits the requesting party to designate the form or forms in which it wants electronically stored information produced. The form of production is more important to the exchange of electronically stored information than of hard-copy materials, although a party might specify hard copy as the requested form. Specification of the desired form or forms may facilitate the orderly, efficient, and cost-effective discovery of electronically stored information. The rule recognizes that different forms of production may be appropriate for different types of electronically stored information. Using current technology, for example, a party might be called upon to produce word processing documents, e-mail messages, electronic spreadsheets, different image or sound files, and material from databases. Requiring that such diverse types of electronically stored information all be produced in the same form could prove impossible, and even if possible could increase the cost and burdens of producing and using the information. The rule therefore provides that the requesting party may ask for different forms of production for different types of electronically stored information.
The rule does not require that the requesting party choose a form or forms of production. The requesting party may not have a preference. In some cases, the requesting party may not know what form the producing party uses to maintain its electronically stored information, although Rule 26(f)(3) is amended to call for discussion of the form of production in the parties’ prediscovery conference.
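A party that responds to a discovery request by simply producing electronically stored information in a form of its choice, without identifying that form in advance of the production in the response required by Rule 34(b), runs a risk that the requesting party can show that the produced form is not reasonably usable and that it is entitled to production of some or all of the information in an additional form. Additional time might be required to permit a responding party to assess the appropriate form or forms of production.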
If the form of production is not specified by party agreement or court order, the responding party must produce electronically stored information either in a form or forms in which it is ordinarily maintained or in a form or forms that are reasonably usable. Rule 34(a) requires that, if necessary, a responding party “translate” information it produces into a “reasonably usable” form. Under some circumstances, the responding party may need to provide some reasonable amount of technical support, information on application software, or other reasonable assistance to enable the requesting party to use the information. The rule does not require a party to produce electronically stored information in the form it [sic] which it is ordinarily maintained, as long as it is produced in a reasonably usable form. But the option to produce in a reasonably usable form does not mean that a responding party is free to convert electronically stored information from the form in which it is ordinarily maintained to a different form that makes it more difficult or burdensome for the requesting party to use the information efficiently in the litigation. If the responding party ordinarily maintains the information it is producing in a way that makes it searchable by electronic means, the information should not be produced in a form that removes or significantly degrades this feature.
(B) Form for Producing Electronically Stored Information Not Specified. If a subpoena does not specify a form for producing electronically stored information, the person responding must produce it in a form or forms in which it is ordinarily maintained or in a reasonably usable form or forms.
(D) Inaccessible Electronically Stored Information. The person responding need not provide discovery of electronically stored information from sources that the person identifies as not reasonably accessible because of undue burden or cost. On motion to compel discovery or for a protective order, the person responding must show that the information is not reasonably accessible because of undue burden or cost. If that showing is made, the court may nonetheless order discovery from such sources if the requesting party shows good cause, considering the limitations of Rule 26(b)(2)(C). The court may specify conditions for the discovery.

339
Q

TRCP Rule 196.4 Electronic or Magnetic Data (enacted 1999)

A

To obtain discovery of data or information that exists in electronic or magnetic form, the requesting party must specifically request production of electronic or magnetic data and specify the form in which the requesting party wants it produced. The responding party must produce the electronic or magnetic data that is responsive to the request and is reasonably available to the responding party in its ordinary course of business. If the responding party cannot – through reasonable efforts – retrieve the data or information requested or produce it in the form requested, the responding party must state an objection complying with these rules. If the court orders the responding party to comply with the request, the court must also order that the requesting party pay the reasonable expenses of any extraordinary steps required to retrieve and produce the information.

340
Q

Green v. Blitz:

A

(Judge Ward, Texas) This case speaks to the need for competence in those responsible for preservation and collection and what constitutes a defensible eDiscovery strategy. What went wrong here? What should have been done differently? Note the extraordinary sanctions meted out by the Court.
Blitz violated Federal Rule of Civil Procedure 26(g) by failing to make a reasonable effort to search for and produce documents responsive to a discovery request.
The Fifth Circuit has instructed that for sanctions under Rule 37(b)(2), “[f]irst, any sanction must be ‘just;’ second, the sanction must be specifically related to the particular ‘claim’ which was at issue in the order to provide discovery.”
Blitz should have gotten IT help.

341
Q

In re: Weekley Homes: (Texas Supreme Court)

A

This is one of the three most important Texas cases on ESI.
You should understand the elements of proof which the Court imposes for access to an opponent’s storage devices and know the terms of TRCP Rule 196.4, especially the key areas where the state and Federal ESI rules diverge.
It is one of very few cases in Texas applying the first (and accordingly, the oldest) electronic discovery procedural rule in the United States: TRCP 196.4, which states (circa 1999):
Rule 196.4 Electronic or Magnetic Data: To obtain discovery of data or information that exists in electronic or magnetic form, the requesting party must specifically request production of electronic or magnetic data and specify the form in which the requesting party wants it produced. The responding party must produce the electronic or magnetic data that is responsive to the request and is reasonably available to the responding party in its ordinary course of business. If the responding party cannot - through reasonable efforts - retrieve the data or information requested or produce it in the form requested, the responding party must state an objection complying with these rules. If the court orders the responding party to comply with the request, the court must also order that the requesting party pay the reasonable expenses of any extraordinary steps required to retrieve and produce the information.
In this case, HFG failed to make the good-cause showing necessary to justify the trial court’s order. The harm Weekley will suffer from being required to relinquish control of the Employees’ hard drives for forensic inspection, and the harm that might result from revealing private conversations, trade secrets, and privileged or otherwise confidential communications, cannot be remedied on appeal. Accordingly, Weekley is entitled to mandamus relief.

342
Q

Zubulake: (Judge Scheindlin, New York)

A

The Zubulake series of decisions are seminal to the study of e-discovery in the U.S. Zubulake remains the most cited of all EDD cases, so it is still a potent weapon even after the Rules amendments codified much of its lessons. Know what the case is about, how the plaintiff persuaded the court that documents were missing and what the defendant did or didn’t do in failing to meet its discovery obligations. Know what an adverse inference instruction is and how it was applied in Zubulake versus what must be established under FRCP Rule 37(e) after 2015. Know what Judge Scheindlin found to be a litigant’s and counsel’s duties with respect to preservation. Seven-point analytical frameworks (as for cost-shifting) make good test fodder.
In Zubulake I, Judge Scheindlin addresses the proper scope of discovery and her seven-factor analytic framework for shifting costs of discovery between parties.
FRCP Rule 37(e) changed the standard for imposition of serious sanctions for spoliation of electronic evidence.
By and large, the solution has been to consider cost-shifting: forcing the requesting party, rather than the answering party, to bear the cost of discovery.
Cost-shifting should be considered only when electronic discovery imposes an “undue burden or expense” on the responding party. The burden or expense of discovery is, in turn, “undue” when it “outweighs its likely benefit, taking into account the needs of the case, the amount in controversy, the parties’ resources, the importance of the issues at stake in the litigation, and the importance of the proposed discovery in resolving the issues.”
Whether production of documents is unduly burdensome or expensive turns primarily on whether it is kept in an accessible or inaccessible format.
“Backup” and “Deleted” data are inaccessible.
For the reasons set forth above, the costs of restoring any backup tapes are allocated between UBS and Zubulake seventy-five percent and twenty-five percent, respectively.
A lawyer cannot be obliged to monitor her client like a parent watching a child.
UBS acted willfully in destroying potentially relevant information, which resulted either in the absence of such information or its tardy production

343
Q

Williams v. Sprint: (Judge Waxse, Kansas).

A

Williams is a seminal decision respecting metadata.
In Williams v. Sprint, the matter concerned purging of metadata and the locking of cells in spreadsheets in the context of an age discrimination action after a reduction in force.
Judge Waxse applied Sedona Principle 12 in its earliest (and now twice revised) form. What should Sprint have done? Did the Court sanction any party? Why or why not? This case was discussed various times in your Workbook readings (e.g., pp. 59 & 78), not in Canvas.
[W]hen a party is ordered to produce electronic documents as they are maintained in the ordinary course of business, the producing party should produce the electronic documents with their metadata intact, unless that party timely objects to production of metadata, the parties agree that the metadata should not be produced, or the producing party requests a protective order. - Sedona 12

344
Q

Columbia Pictures v. Bunnell: (Judge Chooljian, California)

A

What prompted the Court to require the preservation of such fleeting, ephemeral information? Why were the defendants deemed to have control of the ephemeral data? Unique to its facts?
This court concludes that data in RAM constitutes electronically stored information under Rule 34.
A pen register is essentially a device which captures outgoing telephone numbers or IP addresses.
A trap and trace device essentially captures incoming IP addresses or telephone numbers.
Bunnell had engaged in “systematic destruction” of evidence by intentionally deleting information from the website’s RAM (Random Access Memory) after he was put on notice of the lawsuit.
The court’s ruling regarding RAM data was significant because it recognized the potential evidentiary value of volatile data, such as RAM, which can be lost if not immediately preserved. The decision affirmed that the duty to preserve evidence extends to data in volatile memory, and that intentional spoliation (destruction) of such data can result in serious legal consequences.

345
Q

In re NTL, Inc. Securities Litigation: (Judge Peck, New York)

A

Be prepared to discuss what constitutes control for purposes of imposing a duty to preserve and produce ESI in discovery and how it played out in this case. I want you to appreciate that, while a party may not be obliged to succeed in compelling the preservation or production of relevant information beyond its care, custody or control, a party is obliged to exercise all such control as the party actually possesses, whether as a matter of right or by course of dealing.
What does The Sedona Conference think about that? Recall you read The Sedona Conference Commentary on Rule 34 and Rule 45 “Possession, Custody, or Control” (pp. 475-527 only).
The test for the production of documents is control, not location.
“‘control’ does not require that the party have legal ownership or actual physical possession of the documents at issue; rather, documents are considered to be under a party’s control when that party has the right, authority, or practical ability to obtain the documents from a non-party to the action.”
Court holds that defendant NTL Europe had “control” over documents and ESI at New NTL for the purpose of this litigation even though most of those documents and ESI ended up in the physical possession of non-party New NTL.

346
Q

RAMBUS (Judge Whyte, California):

A

I expect you to know what happened and to appreciate that the mere reasonable anticipation of litigation, particularly by the party who expects to bring the action, triggers the common law duty to preserve.
Be prepared to address the sorts of situations that might or might not trigger a duty to initiate a legal hold.
Although the evidence does not support a conclusion that Rambus deliberately shredded documents it knew to be damaging, the court concludes that Rambus nonetheless spoliated evidence in bad faith or at least willfully.
This conclusion is based upon the facts that: (1) Rambus destroyed records when litigation was reasonably foreseeable; (2) Karp, the officer in charge of the destruction, was experienced in litigation and undoubtedly knew that relevant documents should not be destroyed when litigation is reasonably foreseeable; (3) the destruction was part of a litigation plan; (4) one of the motives of the destruction was to dispose of potentially harmful documents; and (5) Rambus shredded a huge number of documents without keeping any records of what it was destroying.
The court declines to apply the unclean hands doctrine as a complete defense to Rambus’s patent infringement claims.

347
Q

United States v. O’Keefe

A

Know this case for its language (why and where do angels fear to tread?) and for His Honor’s consideration of the limits and challenges of keyword search. Given this complexity, for lawyers and judges to dare opine that a certain search term or terms would be more likely to produce information than the terms that were used is truly to go where angels fear to tread.
The last is a topic that bears scrutiny wherever it has been addressed in the material. That is, does keyword search work as well as lawyers think, and how can we improve upon it and compensate for its shortcomings?
Accordingly, if defendants are going to contend that the search terms used by the government were insufficient, they will have to specifically so contend in a motion to compel and their contention must be based on evidence that meets the requirements of Rule 702 of the Federal Rules of Evidence.

348
Q

Victor Stanley v. Creative Pipe I (Judge Grimm, D. Maryland):

A

Read VS I with an eye toward understanding the circumstances when inadvertent production triggers waiver (pre-FRE 502). What are the three standards applied to claims of waiver? What needs to be in the record to secure relief?
(1) the waiver is intentional;
(2) the disclosed and undisclosed communications or information concern the same subject matter; and
(3) they ought in fairness to be considered together.
Victor Stanley I is one of the first decisions to speak of the need for “sampling” and “quality assurance” in the context of competent search and retrieval of responsive or privileged ESI
the Defendants have waived any claim to attorney client privilege or work-product protection for the 165 documents at issue because they failed to take reasonable precautions by performing a faulty privilege review of the text-searchable files and by failing to detect the presence of the 165 documents, which were then given to the Plaintiff as part of Defendants’ ESI production.
Defendants waived any privilege or work-product protection for the 165 documents at issue by disclosing them to the Plaintiff. Accordingly, the Plaintiff may use these documents as evidence in this case, provided they are otherwise admissible. In this regard, the Plaintiff has only sought use of the documents themselves, and the court has not been asked to rule, and accordingly does not, that there has been any waiver beyond the documents themselves.

349
Q

Maurer v. Sysco Albany LLC (Judge Hummel, ND New York):

A

Representative example of Court dealing/struggling with dispute over keyword search terms and scope of search. Did the Court embrace the sentiments Judge Facciola expressed so eloquently in U.S. v O’Keefe?
Use of predictive coding
Based on the volume of hits generated by name-only searches using the decisionmakers’ names, the Court concludes that defendants have demonstrated that their proposal to use the “name plus” search criteria for the initial search-term-based search of the decisionmakers’ names in plaintiff’s Sysco email account, excluding auto-generated emails and hits resulting solely as a result of an email address or signature line, is more proportional to the discovery needs of this case.
As courts in this circuit have stated, “the best solution in the entire area of electronic discovery is cooperation among counsel.”
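One way to picture a “name plus” search is sketched below. It rests on assumptions that do not come from the opinion: “name plus” is read as the name together with at least one additional term, a line consisting of `--` is treated as the start of a signature block, and the names, terms, and sample emails are all hypothetical. Plain substring matching is also far cruder than what a review platform would offer.

```python
import re

EMAIL_ADDRESS = re.compile(r"\S+@\S+")


def searchable_body(body: str) -> str:
    """Drop email addresses and everything after an assumed '--' signature line."""
    kept = []
    for line in body.splitlines():
        if line.rstrip() == "--":  # assumed signature delimiter
            break
        kept.append(EMAIL_ADDRESS.sub(" ", line))
    return "\n".join(kept).lower()


def name_plus_hit(email: dict, name: str, plus_terms: list[str]) -> bool:
    """True when the name and at least one additional term both appear."""
    if email.get("auto_generated"):
        return False  # auto-generated emails are excluded outright
    text = searchable_body(email["body"])
    if name.lower() not in text:
        return False
    return any(term.lower() in text for term in plus_terms)


# Hypothetical sample emails.
emails = [
    {"id": 1, "auto_generated": False,
     "body": "Pat Smith approved the termination yesterday."},
    {"id": 2, "auto_generated": False,
     "body": "Lunch order attached.\n--\nPat Smith\npat.smith@example.com"},
    {"id": 3, "auto_generated": True,
     "body": "Pat Smith: your termination report is ready."},
    {"id": 4, "auto_generated": False,
     "body": "Pat Smith is out of office Friday."},
]
hits = [e["id"] for e in emails
        if name_plus_hit(e, "Pat Smith", ["termination", "schedule"])]
print(hits)
```

A name-only search would have returned all four emails; the narrower criteria keep only the one where the name appears in the body alongside an additional term.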

350
Q

Anderson Living Trust v. WPX Energy Production, LLC (Judge Browning, New Mexico):

A

This case looks at the application and intricacies of FRCP Rule 34 when it comes to ESI versus documents. My views about the case are set out in an article called “Breaking Badly” at https://craigball.net/2014/05/08/1653/
Regrettable, because under the 2006 Federal Rules the term “documents” includes ESI. In fact, defining “Documents” to encompass data compilations has a long and uncontentious history in the Rules.
Distinguish the three dimensions of discoverable ESI: form, organization and scope.

Quoting Professor Rabiej, the Court notes that “while (E)(i) document production gives the producing party the right to choose whether to produce ‘in the usual course of business’ or ‘label[ed] … to correspond to the categories in the request,’ (E)(ii) puts the ball in the requesting party’s court by first giving them the option to ‘specify a form for producing’ ESI. Fed. R. Civ. P. 34(b)(2)(E)(i)-(ii). It is only if the requesting party declines to specify a form that the producing party is offered a choice between producing in the form ‘in which it is ordinary maintained’ — native format — or ‘in a reasonably useful form or forms.’ Fed. R. Civ. P. 34(b)(2)(E)(i)-(ii).”
That’s powerful stuff, and dead right. Producing parties have long assumed that they were free to ignore a requesting party’s specification of form so long as they produced in a form claimed to be “reasonably usable.” Not so. As the Court notes, the “reasonably usable” option applies only when a requesting party fails to specify a form.
The lesson for requesting parties is ALWAYS, ALWAYS, ALWAYS specify forms for production in your requests. If you want Word documents produced natively, SAY SO! If you want e-mail in functional forms, specify the forms!

351
Q

Fast v. GoDaddy.com (Judge Campbell, D. Arizona):

A

Employment discrimination sanctions case applying FRCP Rule 37. Judge Campbell chaired the Rules Advisory Committee that drafted Rule 37(e) and approves sanctions for loss of ESI under Rule 37(c)(1), laying waste to the supposition that Rule 37(e) is the exclusive spoliation avenue for lost ESI.
His Honor chastises the plaintiff for failing to make a backup copy of her mobile phone before it was stolen. You can read my take on this case at https://craigball.net/2022/02/19/fast-v-godaddy-com-exemplary-jurisprudence-and-overlooked-opportunity/
“Spoliation is the destruction or material alteration of evidence, or the failure to otherwise preserve evidence, for another’s use in litigation.”
Spoliation arises from the failure to preserve relevant evidence once a duty to preserve has been triggered.
This rule establishes three prerequisites to sanctions: the ESI should have been preserved in the anticipation or conduct of litigation, it is lost through a failure to take reasonable steps to preserve it, and it cannot be restored or replaced through additional discovery. If these requirements are satisfied, the rule authorizes two levels of sanctions. Section (e)(1) permits a court, upon finding prejudice to another party from the loss of ESI, to order measures no greater than necessary to cure the prejudice. Section (e)(2) permits a court to impose more severe sanctions such as adverse inference jury instructions or dismissal, but only if it finds that the spoliating party “acted with the intent to deprive another party of the information’s use in the litigation.” Fed. R. Civ. P. 37(e)(2).
Intent»» Prejudice

352
Q

In re: State Farm Lloyds

A

(Texas Supreme Court):
Proportionality is the buzzword here; but does the Court elevate proportionality to the point of being a costly hurdle serving to complicate a simple issue?
What does this case portend for Texas litigants in terms of new hoops to jump through for issues as straightforward as forms of production?
The requesting party seeks ESI in native form while the responding party has offered to produce in searchable static form, which the responding party asserts is more convenient and accessible given its routine business practices. Agreeing with the requesting party, the trial court ordered production in native form, subject to a showing of infeasibility. The court of appeals denied mandamus relief.
As requested by the homeowners, the trial court ordered all ESI to be produced in its native or near-native forms rather than in the alternative, “reasonably usable” format State Farm proposed in a competing discovery protocol.
State Farm has offered to produce ESI in searchable, but “static” form. PDF, TIFF, and JPEG files are common examples of static electronic formats. Static forms of ESI are created by converting native formats into static images, which removes metadata from the native files. Static form may be searchable—to a more limited extent than native form—using optical character recognition (OCR) software.
ECS: State Farm’s Enterprise Claims System server
Referring to static-form production as “the electronic equivalent of a print out,” the homeowners’ expert explained that useful metadata would not be viewable in static form, including tracked changes and commenting in Word documents; animations, other dynamic information, and speaker notes in static printouts of PowerPoint documents; and threading information in emails that would allow construction of a reasonable timeline related to State Farm’s processing of the homeowners’ claims.

353
Q

Monique Da Silva Moore, et al. v. Publicis Groupe & MSL Group and Rio Tinto Plc v. Vale S.A., (Judge Peck, New York):

A

Da Silva Moore is the first federal decision to approve the use of the form of Technology Assisted Review (TAR) called Predictive Coding as an alternative to linear, manual review of potentially responsive ESI.
Rio Tinto is Judge Peck’s follow-up, reaffirming the viability of the technology without establishing an “approved” methodology.

354
Q

Brookshire Bros. v. Aldridge (Texas Supreme Court):

A

This case sets out the Texas law respecting spoliation of ESI…or does it? Are the outcome and “analysis” here consistent with the other preservation and sanctions cases we’ve covered?
Note the procedural difference between Texas and Federal practice here: a federal judge can submit spoliation fact issues to the trier of fact (i.e., to the jury); but as I read Brookshire Brothers, a Texas state judge cannot