08 - External data representation Flashcards by Franparre francisco9

Explain the difference between internal and transferred data representation.

Internal data is represented as data structures, arrays, or objects in programming languages like C, Java, or Python.
Transferred data is represented as a byte sequence, so in-memory structures must be flattened before transmission.

What are the key aspects of data transmission format?

Option 1: agree on a common format for the transmitted data; the transmitter converts to it and the receiver converts from it.
Option 2: transmit the data in the sender’s format, with an indication of the format used, and let the receiver convert it if necessary.

What is the process of data serialization and deserialization in the context of data transfer and its requirements?

Serialization involves converting data structures into a format that can be stored or transmitted.
Deserialization is the reverse process, where the received or stored data is converted back into usable data structures.
Requirements: correctness, efficiency, interoperability, and ease of use.
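
As a minimal sketch of the round trip, the example below flattens a small record into bytes and rebuilds it. The record layout (a 2-byte length, a UTF-8 name, a 4-byte age) and the function names `serialize` and `deserialize` are assumptions made for illustration, not a format taken from the cards.

```python
import struct

def serialize(name: str, age: int) -> bytes:
    """Flatten a (name, age) record: 2-byte length, UTF-8 name, 4-byte age."""
    raw = name.encode("utf-8")
    # ">" fixes the byte order to big-endian so both ends agree on it.
    return struct.pack(">H", len(raw)) + raw + struct.pack(">i", age)

def deserialize(data: bytes) -> tuple:
    """Rebuild the (name, age) record from the byte sequence."""
    (length,) = struct.unpack_from(">H", data, 0)
    name = data[2:2 + length].decode("utf-8")
    (age,) = struct.unpack_from(">i", data, 2 + length)
    return name, age

wire = serialize("Ada", 36)
print(wire.hex())          # 000341646100000024
print(deserialize(wire))   # ('Ada', 36)
```

The round trip is correct only because both functions share the same layout, which is the interoperability requirement in miniature.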

Discuss the limitations of CORBA, Java Object Serialization, DCOM COM+, JSON, Plain Text, and XML in data representation.

CORBA: Over-designed and heavyweight.
Java Object Serialization: Tailored to the Java environment.
DCOM COM+: Tailored to Windows environment.
JSON, Plain Text, XML: Lack a protocol description, have high parsing overhead, and require maintenance by the programmer.

Explain the concept of endianness in data representation.

Endianness refers to the order of bytes in a multi-byte number.
Big-endian stores higher-order bytes at lower memory addresses.
Little-endian stores lower-order bytes at lower memory addresses.
In networking, multi-byte header fields such as IP addresses and port numbers are sent in network byte order, which is big-endian.
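
The two byte orders can be observed directly with Python's `struct` module. This is a small illustration; the sample value and the sample IPv4 address are arbitrary choices.

```python
import socket
import struct

value = 0x0A0B0C0D

big = struct.pack(">I", value)     # most significant byte at the lowest address
little = struct.pack("<I", value)  # least significant byte at the lowest address

print(big.hex())     # 0a0b0c0d
print(little.hex())  # 0d0c0b0a

# An IPv4 address is packed in network byte order (big-endian).
packed_addr = socket.inet_aton("192.168.0.1")
print(packed_addr.hex())  # c0a80001
```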

What challenges arise in data representation due to heterogeneity in communication?

Properties such as alignment, data size, and pointer representation vary across machines.
Alignment on word boundaries can change the size of a structure between machines.
Pointers, while convenient, have no meaning outside their defining machine.
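
The effect of alignment on structure size can be sketched with `struct.calcsize`, which can report both a padding-free size and the size under the local C compiler's rules. The layout (one char followed by one int) is an arbitrary example.

```python
import struct

# "c" is a 1-byte char and "i" is an int (4 bytes in standard mode).
packed_size = struct.calcsize("=ci")   # standard sizes, no padding: 1 + 4
native_size = struct.calcsize("@ci")   # native sizes and alignment of this machine

print(packed_size)   # 5
print(native_size)   # platform dependent, commonly 8 because of padding
```

Because the native size depends on the machine, sending the raw in-memory layout is not portable; a wire format has to state sizes and padding explicitly.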

How should data be represented for efficient communication?

Decide on the data types to support: base types, flat types, complex types.
Determine encoding methods for data transmission and decoding methods for data reception.
Choose between self-describing encoding (tags travel with the data) and implicit description (both ends know the layout in advance).
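
The difference between the two descriptions can be sketched as follows. The tag values, the tag-length-value layout, and all function names are hypothetical and chosen only for this illustration.

```python
import struct

TAG_INT, TAG_STR = 1, 2   # hypothetical tag values

def encode_implicit(age: int, name: str) -> bytes:
    """Implicit: no type information; the receiver must already know the layout."""
    raw = name.encode("utf-8")
    return struct.pack(">i", age) + struct.pack(">I", len(raw)) + raw

def encode_tagged(values) -> bytes:
    """Self-describing: every value carries a 1-byte tag and a 4-byte length."""
    out = b""
    for v in values:
        if isinstance(v, int):
            out += struct.pack(">BI", TAG_INT, 4) + struct.pack(">i", v)
        else:
            raw = v.encode("utf-8")
            out += struct.pack(">BI", TAG_STR, len(raw)) + raw
    return out

def decode_tagged(data: bytes) -> list:
    """Decode without prior knowledge of the field order or types."""
    values, pos = [], 0
    while pos < len(data):
        tag, length = struct.unpack_from(">BI", data, pos)
        pos += 5
        body = data[pos:pos + length]
        pos += length
        if tag == TAG_INT:
            values.append(struct.unpack(">i", body)[0])
        else:
            values.append(body.decode("utf-8"))
    return values

implicit = encode_implicit(36, "Ada")
tagged = encode_tagged([36, "Ada"])
print(len(implicit), len(tagged))   # 11 17
print(decode_tagged(tagged))        # [36, 'Ada']
```

The tagged form costs extra bytes per value, but the decoder needs no out-of-band agreement on the order of the fields.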

What is the role of stub generation in data representation?

Systems generate stub code from an independent specification (IDL).
IDL (Interface Description Language) describes an interface in a language-neutral way.
Stub generation separates logical data description from dispatching, marshaling/unmarshaling, and data wire format.

Describe the key features of XDR’s approach to standardizing data representations.

XDR defines a single byte order (Big Endian) and floating-point representation (IEEE).
It decouples programs creating/sending portable data from those using/receiving it.
New machines or languages don’t affect existing programs; they just need to convert between standard and local representations.
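
A hand-written sketch of the conversion to the standard representation is shown below. It does not use any XDR library; the function names are hypothetical, and only the 4-byte big-endian integer and the 8-byte IEEE 754 double are covered.

```python
import struct

def xdr_encode_int(value: int) -> bytes:
    """4 bytes, big-endian, two's complement."""
    return struct.pack(">i", value)

def xdr_decode_int(data: bytes) -> int:
    return struct.unpack(">i", data[:4])[0]

def xdr_encode_double(value: float) -> bytes:
    """8 bytes, IEEE 754 double precision, big-endian."""
    return struct.pack(">d", value)

print(xdr_encode_int(1).hex())        # 00000001
print(xdr_encode_int(-2).hex())       # fffffffe
print(xdr_encode_double(1.0).hex())   # 3ff0000000000000
```

Each machine only needs these two conversions (local to standard and back), regardless of how many other architectures take part.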

What are the canonical data types defined by XDR?

Basic data: Integer, Unsigned Integer, (Hyper) Integer, Floating-Point, Void.
Variable size data: Fixed-Length Opaque Data, Variable-Length Opaque Data, String.
Composed data: Fixed-Length Array, Variable-Length Array, Structure, Discriminated Union.
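
Two of these types can be sketched by hand: a string and a variable-length array of integers. The sketch assumes the usual XDR conventions of a 4-byte length or element count and zero padding up to a multiple of four bytes; the helper names are hypothetical.

```python
import struct

def pad4(raw: bytes) -> bytes:
    """Pad with zero bytes so the length is a multiple of 4."""
    return raw + b"\x00" * (-len(raw) % 4)

def xdr_string(text: str) -> bytes:
    """4-byte length, then the bytes of the string, then padding."""
    raw = text.encode("ascii")
    return struct.pack(">I", len(raw)) + pad4(raw)

def xdr_int_array(values) -> bytes:
    """4-byte element count, then each element as a 4-byte integer."""
    body = b"".join(struct.pack(">i", v) for v in values)
    return struct.pack(">I", len(values)) + body

print(xdr_string("hello").hex())     # 0000000568656c6c6f000000
print(xdr_int_array([1, 2]).hex())   # 000000020000000100000002
```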

What is the function of the XDR Library in data representation?

XDR library solves data portability problems by transforming data to/from a canonical format.
It allows reading and writing arbitrary C constructs in a consistent and well-documented manner.
The library consists of functions, known as filters, that encode and decode data according to the defined data structures.

Describe the types of data filters provided by XDR.

Basic Data Filters: Handle types like char, short, int, long, float, double, and void.
Variable Size Data Filters: For handling fixed-length opaque data, variable-length opaque data, and strings.
Composed Data Filters: Used for fixed-length arrays, variable-length arrays, structures, and discriminated unions.
Pointer Filters: Manage data structures involving pointers, with functions for transformation and memory allocation.

How is XDR used in data transfer?

XDR data structures don’t contain metadata, so type determination from binary data is impossible.
Client and server must agree on the format of transferred data.
XDR routines generated from an RPC (Remote Procedure Call) specification give both sides a compatible encoder and decoder.
The same code is used for both encoding and decoding data, and functions must be called in the same order.
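
The idea of one filter serving both directions can be imitated in Python. This is a toy model, not the C library: `XdrStream` and `xdr_point` are hypothetical names, and the stream supports only 4-byte integers.

```python
import struct

class XdrStream:
    """Toy stand-in for an XDR handle: a buffer plus a direction flag."""

    def __init__(self, data: bytes = b"", encoding: bool = True):
        self.buf = bytearray(data)
        self.pos = 0
        self.encoding = encoding

    def xdr_int(self, value=None) -> int:
        """One filter for both directions: append when encoding, read when decoding."""
        if self.encoding:
            self.buf += struct.pack(">i", value)
            return value
        (value,) = struct.unpack_from(">i", self.buf, self.pos)
        self.pos += 4
        return value

def xdr_point(stream: XdrStream, point: dict) -> dict:
    """Composite filter: the same call order is used to encode and to decode."""
    point["x"] = stream.xdr_int(point.get("x"))
    point["y"] = stream.xdr_int(point.get("y"))
    return point

out = XdrStream(encoding=True)
xdr_point(out, {"x": 3, "y": -1})
wire = bytes(out.buf)
print(wire.hex())            # 00000003ffffffff

inp = XdrStream(wire, encoding=False)
print(xdr_point(inp, {}))    # {'x': 3, 'y': -1}
```

Since the stream carries no type information, swapping the two `xdr_int` calls on one side only would silently exchange `x` and `y`.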

What are the main features of Google Protocol Buffers?

Defined by Google and widely used internally and externally.
Supports common types and service definitions.
Natively generates C++, Java, and Python code, with over 20 other languages supported by third parties.
Efficient binary encoding and readable text encoding, significantly smaller and faster to process than XML.
It’s not a full RPC system but handles marshalling, with many third-party RPC implementations available.
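
Part of the compactness of the binary encoding comes from variable-length integers (varints), where small numbers take fewer bytes. The following is a minimal sketch of that scheme for non-negative integers; the function names are hypothetical and this is not the generated library code.

```python
def encode_varint(value: int) -> bytes:
    """Base-128 varint: 7 payload bits per byte, high bit set on all but the last byte."""
    if value < 0:
        raise ValueError("this sketch handles non-negative integers only")
    out = bytearray()
    while True:
        byte = value & 0x7F
        value >>= 7
        if value:
            out.append(byte | 0x80)   # more bytes follow
        else:
            out.append(byte)          # last byte
            return bytes(out)

def decode_varint(data: bytes, pos: int = 0):
    """Return (value, position after the varint)."""
    result = shift = 0
    while True:
        byte = data[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        if not byte & 0x80:
            return result, pos
        shift += 7

print(encode_varint(1).hex())      # 01
print(encode_varint(150).hex())    # 9601
print(encode_varint(300).hex())    # ac02
print(decode_varint(bytes.fromhex("ac02")))   # (300, 2)
```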

What properties characterize Google Protocol Buffers?

Efficient binary serialization.
Support for protocol evolution, allowing addition of new parameters.
The order of specified parameters is not important, and non-essential parameters can be skipped.
Supports somewhat complex structures and provides compile-time error checking.
Used for RPC calls, serializing data to non-relational databases, and as a long-term storage format due to its backward compatibility.
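
Order independence and the skipping of unknown fields follow from each field carrying its own number on the wire. The parser below is a sketch under that assumption: it handles only the varint and length-delimited wire types, and `parse_known` is a hypothetical helper, not part of any protobuf library.

```python
def read_varint(data: bytes, pos: int):
    """Return (value, position after the varint)."""
    result = shift = 0
    while True:
        byte = data[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        if not byte & 0x80:
            return result, pos
        shift += 7

def parse_known(data: bytes, known: set) -> dict:
    """Return only the fields whose numbers are in `known`; skip the rest."""
    fields, pos = {}, 0
    while pos < len(data):
        key, pos = read_varint(data, pos)
        number, wire_type = key >> 3, key & 0x07
        if wire_type == 0:                      # varint
            value, pos = read_varint(data, pos)
        elif wire_type == 2:                    # length-delimited
            length, pos = read_varint(data, pos)
            value, pos = data[pos:pos + length], pos + length
        else:
            raise ValueError("wire type not handled in this sketch")
        if number in known:
            fields[number] = value
    return fields

# Field 2 (value 5) arrives before field 1 ("hi"), followed by an unknown field 9.
wire = bytes.fromhex("1005" "0a026869" "4807")
print(parse_known(wire, {1, 2}))   # {2: 5, 1: b'hi'}
```

An older receiver that knows only fields 1 and 2 can therefore still read a message produced by a newer sender that added field 9.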

What is the primary goal of Protocol Buffers?

Provide a language- and platform-neutral way to specify and serialize data.
Ensure the serialization process is efficient, extensible, and simple to use.
Allow serialized data to be stored or transmitted over the network.

Describe the features of the Protocol Buffer Language.

Messages contain uniquely numbered fields.
Each field is declared with a field-type (required, optional, or repeated), a data-type, a field-name, an encoding-value (the field’s unique number), and an optional default value.
Supports primitive, enumerated, and nested message data-types.
Enables structuring data into a hierarchy.

What are the different field types in Protocol Buffers?

Required fields: Must be present exactly once in a well-formed message.
Optional fields: Can appear zero or one time in a well-formed message.
Repeated fields: Can appear any number of times (including zero) in a well-formed message.

What is the role of a .proto file in Protocol Buffers?

The .proto file contains the specification of the message.
It is compiled by the protoc tool, which generates code allowing programmers to manipulate the message type.

Explain the function of protoc-c in Protocol Buffers.

The programmer defines the message in a .proto file following the language syntax, and protoc-c compiles it to C.

message M1{
    required string str = 1;
    optional int32 i = 2;
}

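protoc-c generates .h and .c files, including structures and functions for manipulating messages.
The generated code provides functions like m1__init, m1__get_packed_size, m1__pack, m1__unpack, and m1__free_unpacked (the message name and the operation are separated by a double underscore).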
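
To make the result of packing concrete, the sketch below builds by hand the bytes that the message M1 above should correspond to under the Protocol Buffers wire format. It is written in Python rather than generated C, `pack_m1` is a hypothetical helper, and negative values of `i` are not handled.

```python
def varint(value: int) -> bytes:
    """Base-128 varint for non-negative integers."""
    out = bytearray()
    while value > 0x7F:
        out.append((value & 0x7F) | 0x80)
        value >>= 7
    out.append(value)
    return bytes(out)

def pack_m1(s: str, i=None) -> bytes:
    """Encode M1: field 1 is a string, field 2 is an optional int32."""
    raw = s.encode("utf-8")
    # Key = (field number << 3) | wire type; wire type 2 is length-delimited.
    out = varint((1 << 3) | 2) + varint(len(raw)) + raw
    if i is not None:
        # Wire type 0 is varint; an absent optional field is simply not written.
        out += varint((2 << 3) | 0) + varint(i)
    return out

print(pack_m1("hi", 5).hex())   # 0a0268691005
print(pack_m1("hi").hex())      # 0a026869
```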

What are the field rules in Protoc-c?

Required fields must appear exactly once in a well-formed message.
Optional fields may appear zero or one time, with a boolean flag indicating their presence.
Repeated fields can occur multiple times in a message, and their count is tracked.

How are optional fields handled in Protoc-c?

Optional fields can have default values defined in the .proto file.
If no value is assigned, the field takes a type-specific default value.
The C structure includes a boolean to indicate if an optional field is transmitted.

Explain the handling of repeated fields in Protoc-c.

Repeated fields represent arrays or lists in a message.
The C structure includes a count of the number of elements and a pointer to the array.
Memory management for repeated fields must be handled manually in C.
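
On the wire, a repeated field without the packed option is written as one key-value pair per element, and the receiver collects the elements into a list whose length plays the role of the count in the C structure. The sketch assumes a hypothetical field declared as `repeated int32 values = 3;` and handles only non-negative varint values.

```python
def varint(value: int) -> bytes:
    """Base-128 varint for non-negative integers."""
    out = bytearray()
    while value > 0x7F:
        out.append((value & 0x7F) | 0x80)
        value >>= 7
    out.append(value)
    return bytes(out)

def read_varint(data: bytes, pos: int):
    """Return (value, position after the varint)."""
    result = shift = 0
    while True:
        byte = data[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        if not byte & 0x80:
            return result, pos
        shift += 7

def pack_repeated(field_number: int, values) -> bytes:
    """Write the key before every element (non-packed encoding)."""
    key = varint((field_number << 3) | 0)
    return b"".join(key + varint(v) for v in values)

def unpack_repeated(data: bytes, field_number: int) -> list:
    """Collect every occurrence of the field; assumes all fields are varints."""
    values, pos = [], 0
    while pos < len(data):
        key, pos = read_varint(data, pos)
        value, pos = read_varint(data, pos)
        if key >> 3 == field_number:
            values.append(value)
    return values

wire = pack_repeated(3, [1, 2, 300])
print(wire.hex())                     # 1801180218ac02
decoded = unpack_repeated(wire, 3)
print(len(decoded), decoded)          # 3 [1, 2, 300]
```

In Python the list grows automatically; in C the corresponding array has to be allocated and freed by the programmer, as the card notes.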

Describe the use of enumerated fields in Protoc-c.

Enumerated fields allow a field to have one of a predefined list of values.
Enums are defined in the .proto file and translated into C enums.
Fields can be defined to use these enumerated types, providing a set of allowed values.

How are message types used in Protoc-c?

Message types are created for each specific use case, like RPC calls or data storage.
Each message type has its own structure and set of fields.
Fields can be required, optional, or repeated, and are manipulated using generated C code.

08 - External data representation Flashcards

(25 cards)