Data Files and Serialization Flashcards
What is data integration?
Data integration is combining data that exists in different sources and presenting it to the user in a unified way.
File transfer - when two systems need to communicate, they produce shared files that both can make use of.
Shared database - often used because it is language & platform independent.
Handling different data formats depending on the use case.
A common data transfer mechanism is essential - communication between systems should be independent of language & platform.
What is File Based data integration?
When two or more applications make use of shared files.
The integrator is responsible for defining the file format and the other shared conventions that govern how these files are produced and consumed.
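A minimal sketch of file-based integration, assuming the two applications have agreed on a UTF-8 CSV file with a header row. The field names, the function names `export_orders` / `import_orders`, and the file name are hypothetical and only serve the illustration.

```python
import csv
import tempfile
from pathlib import Path

# Agreed-upon format (an assumption for this sketch): UTF-8 CSV with a header row.
FIELDS = ["order_id", "customer", "total"]


def export_orders(path, orders):
    """Producer side: write the shared file in the agreed format."""
    with open(path, "w", newline="", encoding="utf-8") as f:
        writer = csv.DictWriter(f, fieldnames=FIELDS)
        writer.writeheader()
        writer.writerows(orders)


def import_orders(path):
    """Consumer side: read the shared file, knowing only its format."""
    with open(path, newline="", encoding="utf-8") as f:
        return list(csv.DictReader(f))


# The shared location is a temporary directory here; in practice it would be
# a folder or file share that both applications can reach.
shared_file = Path(tempfile.mkdtemp()) / "orders.csv"
export_orders(shared_file, [{"order_id": 1, "customer": "Ada", "total": 9.5}])
received = import_orders(shared_file)
print(received)
```

Neither function knows how the other application works; they only share the format. Note that the consumer gets every value back as a string.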
Mention a few pros for File Based Data Integration.
Files are popular because they are compatible with all popular languages.
Integrators only need to know the input/output, not how the consuming apps work.
Highly flexible.
No extra tools needed.
Easy to understand
Mention a few cons for File Based Data Integration.
Scalability - Adaptability - Security - Data synchronisation.
Also, a format must be agreed upon before files can be shared.
More desirable approaches are often a shared database, messaging, or RPC.
Which file types/formats are popular in Data Files?
CSV - XML - JSON
TEXT (not used for integration; mostly used for logging errors/processes, testing, or reading tokens)
When would you use text files as your data file?
When logging errors/processes, testing, or reading tokens.
When could you use CSV files as Data file?
To move large amounts of data (fast because there is no markup overhead as in JSON or XML)
To view a data table (very human readable)
To better organise data
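A rough sketch of the "no overhead" point, using made-up records: CSV writes the field names once in the header, while JSON repeats them in every record. The sizes depend on the data, so treat this as an illustration and not a benchmark.

```python
import csv
import io
import json

# Hypothetical sample data: 1000 small records.
rows = [{"id": i, "name": f"user{i}", "score": i * 10} for i in range(1000)]

# CSV: field names appear once, in the header row.
buf = io.StringIO()
writer = csv.DictWriter(buf, fieldnames=["id", "name", "score"])
writer.writeheader()
writer.writerows(rows)
csv_size = len(buf.getvalue().encode("utf-8"))

# JSON: field names are repeated in every record.
json_size = len(json.dumps(rows).encode("utf-8"))

print("CSV bytes:", csv_size)
print("JSON bytes:", json_size)
```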
What pros are there for using CSV as Data file?
Human readable and easy to modify.
Faster to handle (no overhead).
It is easy to generate CSV files.
Very popular; many use CSV and therefore it is well known.
What cons are there for using CSV as Data file?
It can only handle basic, flat data; a comma inside a value breaks the structure unless the value is quoted, since the comma is used as the separator.
No distinction between text & numeric values.
Problems when importing CSV to SQL (no distinction between NULL and an empty string)
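The three cons above can be seen with Python's standard `csv` module. This is a small sketch with made-up sample values: a quoting-aware writer protects the embedded comma, but a naive split on commas does not, and both numbers and missing values lose their type on the way back.

```python
import csv
import io

buf = io.StringIO()
writer = csv.writer(buf)
writer.writerow(["name", "age", "note"])
writer.writerow(["Doe, Jane", 42, None])  # comma in a value, a number, a missing value
writer.writerow(["John", 7, ""])          # an empty string

text = buf.getvalue()

# A CSV-aware reader handles the quoted comma correctly.
rows = list(csv.reader(io.StringIO(text)))

# A naive split on commas breaks the first data row into too many fields.
naive = text.splitlines()[1].split(",")

print(rows[1])     # every value is a string, including the age
print(rows[2])     # None and "" have both become ""
print(len(naive))  # more fields than columns
```

Any typing (numbers, NULL) has to be reapplied by the consumer, based on a convention agreed outside the file.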
When is JSON used as a Data file?
When creating micro-services & RESTful services
When storing data temporarily. Much easier than storing in a database.
Often used as configuration files & manifest.
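A minimal sketch of JSON as a configuration file, using only the standard library. The keys and values are hypothetical; the point is that numbers, booleans, null, and nested lists survive the round trip with their types intact.

```python
import json
import tempfile
from pathlib import Path

# Hypothetical settings; the keys are made up for this example.
config = {
    "service": "orders",
    "port": 8080,
    "debug": False,
    "hosts": ["a.local", "b.local"],
    "proxy": None,
}

config_path = Path(tempfile.mkdtemp()) / "config.json"
config_path.write_text(json.dumps(config, indent=2), encoding="utf-8")

# Reading the file back gives native Python types, not just strings.
loaded = json.loads(config_path.read_text(encoding="utf-8"))
print(loaded["port"] + 1)
print(loaded["hosts"])
```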
What pros are there for using JSON as Data file?
Easy to read - both for machines and humans
Easy to parse - Most languages support parsing JSON.
Usually better performance than XML (less verbose, faster to parse, easier to read)
Very popular.
What cons are there for using JSON as Data file?
Can’t use comments
It is encapsulated: the structure needs to be divided into separate objects if you want to add new data.
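A short sketch of the comment limitation, using Python's standard `json` module, which follows the JSON grammar strictly. The `_comment` key in the workaround is only a common convention (an assumption here, not part of JSON): it is ordinary data that consumers must ignore.

```python
import json

with_comment = """
{
  // port the service listens on
  "port": 8080
}
"""

try:
    json.loads(with_comment)
    parsed_ok = True
except json.JSONDecodeError as err:
    # The strict parser rejects the comment line.
    parsed_ok = False
    print("rejected:", err.msg)

# Workaround by convention: store the note as a regular key.
workaround = json.loads('{"_comment": "port the service listens on", "port": 8080}')
print(workaround["port"])
```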
When could you use XML files as Data file?
Heavily used in SOA.
In SOAP, when you need complex objects
Configuration files and manifests for apps (Maven's pom.xml, Android manifests, etc.)
Standard for Office file formats
What pros are there for using XML as Data file?
Great at describing data using attributes
Lightweight when parsing as it has a strict syntax
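A small sketch of describing data with attributes, parsed with Python's standard `xml.etree.ElementTree`. The `order` document, its element names, and its attribute names are invented for the example.

```python
import xml.etree.ElementTree as ET

# Hypothetical document: metadata lives in attributes, content in element text.
doc = """
<order id="17" currency="EUR">
  <item sku="A-1" qty="2">Keyboard</item>
  <item sku="B-9" qty="1">Mouse</item>
</order>
"""

root = ET.fromstring(doc)
order_id = root.get("id")

# Attribute values are always strings, so qty is converted explicitly.
items = [
    (item.get("sku"), int(item.get("qty")), item.text)
    for item in root.findall("item")
]

print(order_id, root.get("currency"))
print(items)
```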
What cons are there for using XML as Data file?
Data redundancy, which leads to large size and storage costs.
Has no native array type; lists must be represented as repeated elements.
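To illustrate both cons with a made-up list of scores: XML has no list literal, so a list is usually written as repeated child elements, and the opening and closing tags around every value add size compared with the same data in JSON. This is a sketch using the standard library only.

```python
import json
import xml.etree.ElementTree as ET

scores = [90, 85, 77]

# XML: one <score> element per list entry.
root = ET.Element("scores")
for value in scores:
    child = ET.SubElement(root, "score")
    child.text = str(value)

xml_text = ET.tostring(root, encoding="unicode")
json_text = json.dumps({"scores": scores})

# Reading the list back means collecting the repeated elements by name
# and converting each text value, since XML text is untyped.
parsed = [int(el.text) for el in ET.fromstring(xml_text).findall("score")]

print(xml_text)
print(json_text)
print(len(xml_text), len(json_text))
```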