Version control systems Flashcards

1
Q

Version

A

Snapshot

2
Q

Branch

A

(temporarily) independent development line

3
Q

Merge

A

Integration of a branch into another development line

4
Q

Merge conflict

A

A problem that occurs when attempting a merge with changes that cannot be integrated easily
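
A minimal sketch of how such a conflict can arise, using Git purely as an example tool. The throwaway repository, the file name greeting.txt, and the branch name feature are hypothetical. Two development lines change the same line of the same file, so the merge cannot be integrated automatically.

```shell
#!/bin/sh
# Hypothetical throwaway repository; all names are for illustration only.
set -eu
repo=$(mktemp -d)
git -C "$repo" init -q
git -C "$repo" config user.name "Example User"
git -C "$repo" config user.email "user@example.com"
base=$(git -C "$repo" symbolic-ref --short HEAD)   # name of the default branch

echo "hello" > "$repo/greeting.txt"
git -C "$repo" add greeting.txt
git -C "$repo" commit -q -m "Add greeting"

# Change the same line differently on two development lines.
git -C "$repo" checkout -q -b feature
echo "hello from feature" > "$repo/greeting.txt"
git -C "$repo" commit -q -am "Change greeting on feature"

git -C "$repo" checkout -q "$base"
echo "hello from $base" > "$repo/greeting.txt"
git -C "$repo" commit -q -am "Change greeting on $base"

# The merge stops with a conflict instead of completing.
if git -C "$repo" merge feature >/dev/null 2>&1; then
    merge_result="merged"
else
    merge_result="conflict"
fi
echo "$merge_result"
```

After the failed merge, greeting.txt contains conflict markers that have to be resolved by hand before the merge can be committed.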

5
Q

Development line

A

A logically sequential line of versions

6
Q

Tag

A

Named version in the repository
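
As an illustration, a tag could be created like this in Git (the tool choice, the throwaway repository, and the tag name v1.0.0 are assumptions made for this sketch):

```shell
#!/bin/sh
# Hypothetical throwaway repository; names are for illustration only.
set -eu
repo=$(mktemp -d)
git -C "$repo" init -q
git -C "$repo" config user.name "Example User"
git -C "$repo" config user.email "user@example.com"

echo "first release" > "$repo/README.txt"
git -C "$repo" add README.txt
git -C "$repo" commit -q -m "First release"

# Give the current version a name.
git -C "$repo" tag v1.0.0

tags=$(git -C "$repo" tag --list)
tagged=$(git -C "$repo" rev-parse v1.0.0)
head=$(git -C "$repo" rev-parse HEAD)
echo "$tags"
```

The tag and HEAD resolve to the same commit here, so the name v1.0.0 keeps pointing at this version even after later commits.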

7
Q

SVN stands for

A

Subversion

8
Q

What does it mean that SVN is a client-server version control system?

A

SVN operates on a client-server model, which means that there are two main components: the client and the server.
Developers have a local working copy on their machine (but not a full repo with complete history).
The server is a centralised repo that stores the complete version history of the project.

9
Q

SVN definition

A

Subversion is a centralised version control system used for tracking changes in files and directories.

10
Q

SVN repository

A

SVN uses a centralised repository to store the complete version history of a project.

11
Q

SVN branching and merging

A

Supports branching and merging, but both can be more manual and less intuitive than in Git.

12
Q

True or false:
SVN allows checking out parts of the repository (directories/files)

A

True

13
Q

SVN import

A

Initialise repository from a local directory.
svn import C:\repository https://www.example.com/svn

From computer to repository.
Put a directory under version control by sending its data from the local computer to the remote repository. This is done only once per directory to initialise the repository with the respective data.

14
Q

SVN Checkout

A

Initialise working copy from repository.
svn checkout https://www.example.com/svn C:\repository

Initialise a local working copy with the data from the repository. This is only done once for a directory from the repository when copying to the local computer.

15
Q

SVN commit

A

Send changes from working copy to repository.
svn commit -m "Completed printing feature."

Send changes to the repository, which then makes an effort to integrate them into its current state (even if that state has changed since the last update).

16
Q

SVN update

A

Retrieve changes from repository to working copy.
svn update

Retrieve changed data from the repository where it is integrated into the current state of the working copy.

17
Q

SVN add

A

The add command tells SVN to keep the specified files under version control, i.e., when committing to the repository, it checks whether there were changes and, if so, stores them as a new version.
Put new file(s) under version control.
svn add Example1.java
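
The SVN commands from the cards above can be tried together in one local round trip. This is a sketch under assumptions: a repository created with svnadmin and addressed through a file:// URL stands in for the central server, and the directory names (project, wc) and the file Example0.java are hypothetical.

```shell
#!/bin/sh
set -eu
# Skip gracefully when the Subversion tools are not installed.
if ! command -v svn >/dev/null 2>&1 || ! command -v svnadmin >/dev/null 2>&1; then
    echo "svn not installed; skipping"
    exit 0
fi

tmp=$(mktemp -d)
svnadmin create "$tmp/repo"          # stands in for the central server
url="file://$tmp/repo/trunk"

# import: send an unversioned directory to the repository (done once).
mkdir "$tmp/project"
echo "class Example0 {}" > "$tmp/project/Example0.java"
svn import -q -m "Initial import" "$tmp/project" "$url"

# checkout: create a working copy from the repository (done once).
svn checkout -q "$url" "$tmp/wc"

# add + commit: put a new file under version control and send it to the repository.
echo "class Example1 {}" > "$tmp/wc/Example1.java"
svn add -q "$tmp/wc/Example1.java"
svn commit -q -m "Completed printing feature." "$tmp/wc"

# update: retrieve the latest state of the repository into the working copy.
svn update -q "$tmp/wc"

files=$(svn list "$url")
echo "$files"
```

Note that the imported directory (project) does not itself become a working copy; the working copy is the separate directory created by checkout.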

18
Q

Git definition

A

Git is a distributed version control system that allows multiple developers to work on a project individually.

19
Q

Git repository

A

Each developer has a complete copy of the repository, including the full version history, making it a distributed system.

20
Q

Git branching and merging

A

Branching and merging are fundamental and easy to use.

21
Q

Git Workflow

A

Follows a distributed model, enabling developers to work offline and commit changes locally before pushing to a central repository.

22
Q

Git init

A

Initialise local repository in the working directory.
git init
Put a directory under version control by creating a local repository for it. The command itself stores no file contents; files enter the local repository only once they are added and committed.
This is done only once per directory. The data is, at this point, not yet available in the remote repository and needs to be pushed.

22
Q

Git clone

A

Initialise local repository from remote repository.
git clone https://github.com/chseidl/sdse_students.git

Initialise a local repository and local working copy with the data from the remote repository. This is only done once for the remote repository when copying to the local computer.

22
Q

Git pull

A

Retrieve changes from remote repository to local repository and working copy.
git pull

Send the data from the remote repository to the local repository where it is integrated into the current state of the working copy.

22
Git commit
Send changes from working copy to local repository.
git commit -m "Completed printing feature."

Send changed data to the local repository, which then makes an effort to integrate it into the current state of the repository (even if it may have changed since the last update). The data is, at this point, not yet available in the remote repository and needs to be pushed.
22
Git push
Send changes from local to remote repository.
git push

Send the data from the local repository to the remote one.
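
The interplay of clone, commit, push, and pull can be sketched locally. The assumptions here: a bare repository on disk stands in for the remote repository, and the developers Alice and Bob as well as all file names are hypothetical.

```shell
#!/bin/sh
set -eu
tmp=$(mktemp -d)
git init -q --bare "$tmp/remote.git"     # stands in for the remote repository

# clone (Alice): local repository and working copy from the remote repository.
# Cloning an empty repository prints a warning, which is silenced here.
git clone -q "$tmp/remote.git" "$tmp/alice" 2>/dev/null
git -C "$tmp/alice" config user.name "Alice Example"
git -C "$tmp/alice" config user.email "alice@example.com"

# add + commit: the change exists only in Alice's local repository.
echo "readme" > "$tmp/alice/README.txt"
git -C "$tmp/alice" add README.txt
git -C "$tmp/alice" commit -q -m "Initial commit"

# push: the change becomes available in the remote repository.
git -C "$tmp/alice" push -q origin HEAD

# clone (Bob): Bob receives the full history so far.
git clone -q "$tmp/remote.git" "$tmp/bob"

# Alice commits and pushes a second change.
echo "printing" > "$tmp/alice/feature.txt"
git -C "$tmp/alice" add feature.txt
git -C "$tmp/alice" commit -q -m "Completed printing feature."
git -C "$tmp/alice" push -q origin HEAD

# pull: Bob retrieves the new commit into his local repository and working copy.
git -C "$tmp/bob" pull -q

bob_commits=$(git -C "$tmp/bob" rev-list --count HEAD)
echo "$bob_commits"
```

Between Alice's second commit and her push, the new version exists only in her local repository; Bob sees it only after his pull.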
23
Git add
Put new/changed file(s) under version control.
git add Example1.java

The add command tells Git to keep the specified file(s) under version control, i.e., when committing to the local repository and pushing to the remote repository, it checks whether there were changes and, if so, stores them as a new version.
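
A small sketch of what add changes, observed through git status. The throwaway repository is an assumption; the file name is the one from the card.

```shell
#!/bin/sh
set -eu
repo=$(mktemp -d)
git -C "$repo" init -q

echo "class Example1 {}" > "$repo/Example1.java"

# Before add: Git reports the file as untracked ("??").
before=$(git -C "$repo" status --porcelain)

git -C "$repo" add Example1.java

# After add: the file is staged for the next commit ("A ").
after=$(git -C "$repo" status --porcelain)

echo "$before"
echo "$after"
```

The file is still not part of any version at this point; that only happens with the next commit.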
23
Create a branch in Git
git checkout -b BRANCHNAME

The checkout command switches to another branch of development. When used with -b, a new branch is created before immediately switching to it.
24
Merge branch with master branch in Git
git checkout -b printing
# development
git commit -m "Realized new feature."
git checkout master
git merge printing

The merge command merges the changes of the specified branch into the currently active branch, i.e., when wanting to merge into master (the default), one has to switch to it first.
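
The same sequence as a self-contained sketch that can be run in a throwaway repository. The file names are hypothetical, and the name of the default branch is read from Git instead of assuming master, since newer installations may use a different default.

```shell
#!/bin/sh
set -eu
repo=$(mktemp -d)
git -C "$repo" init -q
git -C "$repo" config user.name "Example User"
git -C "$repo" config user.email "user@example.com"
base=$(git -C "$repo" symbolic-ref --short HEAD)   # default branch, e.g. master

echo "base" > "$repo/app.txt"
git -C "$repo" add app.txt
git -C "$repo" commit -q -m "Initial version"

# Create the branch and switch to it in one step.
git -C "$repo" checkout -q -b printing

# Development happens on the branch.
echo "printing" > "$repo/printing.txt"
git -C "$repo" add printing.txt
git -C "$repo" commit -q -m "Realized new feature."

# Switch back to the target branch first, then merge into it.
git -C "$repo" checkout -q "$base"
git -C "$repo" merge -q printing

merged_files=$(git -C "$repo" ls-files)
echo "$merged_files"
```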
24
Stable release
The current version of the software as the majority of people would use it.
24
Pre release (alpha/beta/cutting edge/nightly build/etc.)
The version of the software that contains the newest features, but has not been tested properly for a general release.
25
Long term support version
For late adopters that cannot update frequently. It is supposed to keep receiving critical updates over long periods of time, but no new functionality.
26
Semantic versioning scheme for releases
Major.minor.patch
Major: significant new program functionality, which may break compatibility with older versions
Minor: new program functionality that is compatible with old functionality
Patch: bug fixes and minor internal changes
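
A minimal sketch of how the three parts behave when a release is bumped. The helper bump_version is hypothetical and handles only plain MAJOR.MINOR.PATCH numbers without pre-release suffixes.

```shell
#!/bin/sh
# bump_version VERSION PART prints the next version number.
# PART is one of: major, minor, patch.
bump_version() {
    version=$1
    part=$2
    major=${version%%.*}     # text before the first dot
    rest=${version#*.}       # text after the first dot
    minor=${rest%%.*}
    patch=${rest#*.}
    case "$part" in
        major) echo "$((major + 1)).0.0" ;;            # lower parts reset to 0
        minor) echo "$major.$((minor + 1)).0" ;;
        patch) echo "$major.$minor.$((patch + 1))" ;;
        *) echo "unknown part: $part" >&2; return 1 ;;
    esac
}

bump_version 1.4.2 patch   # prints 1.4.3
bump_version 1.4.2 minor   # prints 1.5.0
bump_version 1.4.2 major   # prints 2.0.0
```

Raising a part resets all parts to its right to zero, which is why a minor bump of 1.4.2 gives 1.5.0 rather than 1.5.2.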
27
What are three typical challenges in big data / data science?
1. Data: large, sparse, replicated data
2. Coordination: communication, query data, similar setup
3. Calculation: scaling, parallelism, distribution
28
Hadoop
Solves massive data problems with distributed computers:
- storage and processing of big data
- compensates for hardware failures
Main parts: MapReduce (processing) and HDFS (distributed file system), running on many cheap computers.
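
The MapReduce idea can be illustrated with a word count. This is only a single-machine analogy built from standard shell tools, not Hadoop itself; the function name word_count and the sample input are hypothetical.

```shell
#!/bin/sh
# Map:     split the input into one word per line (each word acts as a key).
# Shuffle: sort brings identical keys next to each other.
# Reduce:  uniq -c collapses each group of identical keys into a count.
word_count() {
    tr -s ' ' '\n' | sort | uniq -c | awk '{ print $2, $1 }'
}

counts=$(printf 'big data big cluster\ndata big\n' | word_count)
echo "$counts"
```

In Hadoop the same three phases run in parallel across many machines, with the input and output stored in HDFS.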
29
Spark
Distributed computing system. Especially supports iterative and interactive/exploratory programming models, such as those needed by training algorithms for machine learning. Apache Spark is designed for in-memory data processing, which can make it much faster than traditional disk-based data processing frameworks like Apache Hadoop.
30
Apache HBase
Data storage tool. Distributed, fault-tolerant, column-oriented non-relational database on top of HDFS. Logo is an orca.
31
Apache Phoenix
Data storage tool. Distributed relational database engine with SQL support using HBase.
32
Apache Druid
Data storage tool. Distributed column-oriented data store for real-time analytics.
33
Apache Cassandra
Data storage tool. Distributed wide-column data store for big data.
34
Apache Hive
Data storage tool. Data warehouse for simplified/unified data query and analysis.
35
Apache Drill
Data storage tool. Standard SQL queries on Hadoop for big data.
36
Apache ZooKeeper
Coordination tool. Centralised service for distributed access to a hierarchical key-value store. Apache ZooKeeper is an open-source distributed coordination service designed to manage and synchronise configuration information, naming, and various other distributed services across a large distributed system.
37
Apache Pig
Calculation tool. High-level platform for creating programs that run on Hadoop. Pig's intent is to make development of applications for Hadoop easier.
38
Apache Kafka
Calculation tool. Collects and distributes data streams in real time from/to interested clients.
39
Apache Samza
Calculation tool. Used to develop applications that process streaming data, e.g. from Kafka.
40
Apache Mahout
ML tool. Collection of distributed, scalable machine learning algorithms; distributed linear algebra framework.
41
ml4j
ML tool. ML library for Java.
42
DL4J
ML tool. Distributed deep learning library for Java.
43
WEKA
ML tool. Data mining through machine learning.
44
For one of the core libraries of your system, a colleague recommends using a “pre-release” version. Argue for or against that idea with at least two benefits/drawbacks that this may entail.
Arguing for using a "pre-release" version:
1. Access to new features and improvements (benefit): Pre-release versions often include the latest features, enhancements, and bug fixes. By using a pre-release version, you can gain early access to these improvements, allowing you to take advantage of new functionality and optimizations.
2. Early testing and feedback (benefit): Adopting a pre-release version allows you to participate in early testing and provide feedback to the developers. This can be valuable for both you and the development team, as it helps identify and address issues before the stable release. Your input could contribute to a more robust and reliable final release.

Arguing against using a "pre-release" version:
1. Stability and reliability concerns (drawback): Pre-release versions are inherently less stable than their stable counterparts. They may contain bugs or incomplete features, or undergo significant changes that could impact the reliability of your system. Depending on your project's requirements, relying on a pre-release version might introduce unnecessary risks.
2. Compatibility issues (drawback): Pre-release versions may not be backward compatible with the stable releases or with other libraries/tools in your ecosystem. This could lead to integration challenges and increase the complexity of your development and deployment processes. Using a stable release ensures a more predictable and compatible environment.
45
Contrast client-server version control systems and distributed version control systems using the examples of SVN and Git. Specifically mention, for each type of system, which repositories exist, where they are stored and how communication between them works. For describing the communication, use the terms of the respective commands in SVN and Git.
Look at slides