Lecture 2: Terminology and Basics Flashcards

1
Q

Describe what Tracking is.

A

Tracking provides data about some object, user, etc. that is being tracked.

With regards to MR, tracking usually refers to positional tracking. The amount of (positional) data being tracked may vary depending on the area of application.

Tracking is considered an integral part of modern VR systems, as it is often seen as the key to interactive environments.

Current AR/VR devices are bundled or equipped with a number of different tracking systems that vary in tracked data, precision, flexibility, latency, ease of use, etc.

2
Q

What type of data does tracking provide, and which body parts or devices may it include?

A

A tracking system provides (positional) data for devices or objects.

May include but is not limited to:
- Head-mounted displays
- Controllers
- Fingers
- Hands
- Full body
- …

3
Q

How fast do we track?

What does “latency” mean in the context of tracking? What should we keep in mind when we think about latency and tracking?

A

In tracking, latency is how long it takes from a change of the tracked object or device until we know about the change.

When tracking, keep in mind that we not only need to measure a change and send the data to our application; rendering a new frame based on the new information takes time as well (60 fps should be considered a minimum for VR)!

Even seemingly low latency values (15-20 ms) are clearly noticeable when it comes to movement in VR!
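
To get a feel for the numbers, here is a minimal latency-budget sketch; the tracking and transmission timings are illustrative assumptions, only the 60 fps minimum is from above:

```python
# Minimal sketch of a motion-to-photon latency budget.
# Tracking and transmission timings are illustrative assumptions.

def motion_to_photon_ms(tracking_ms: float, transmission_ms: float, fps: float) -> float:
    """Total latency: measuring the change, sending it, and rendering a frame."""
    render_ms = 1000.0 / fps  # one frame of rendering time
    return tracking_ms + transmission_ms + render_ms

# At the 60 fps minimum, rendering alone costs ~16.7 ms per frame, so even
# small tracking and transmission delays push the total towards 20 ms.
print(motion_to_photon_ms(tracking_ms=2.0, transmission_ms=1.0, fps=60.0))  # ~19.7
```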

4
Q

Differentiate Rotation and Translation

A

Rotation is a turning movement around an axis; it changes the orientation of an object. Translation is a movement along an axis; it changes the position of an object. Both can occur around/along each of the three spatial axes.

5
Q

What does DOF refer to? Give an example of what a DOF can be.

A

Degrees of Freedom (DOF)
DOF refers to the number of parameters that can change individually.

With regards to tracking, a degree of freedom may be:
* Rotation around a single axis
* Translation along a single axis

“Rotation” is NOT a single degree of freedom. To freely express rotation, 3 degrees of freedom (one per axis) are needed!
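
As a small illustration (the class and field names are my own, not from the lecture), a 6DOF pose can be sketched as six independently changeable parameters:

```python
from dataclasses import dataclass

@dataclass
class Pose6DOF:
    # Three translational degrees of freedom (position along each axis)
    x: float = 0.0
    y: float = 0.0
    z: float = 0.0
    # Three rotational degrees of freedom (one angle per axis, in degrees)
    pitch: float = 0.0  # rotation around the x axis
    yaw: float = 0.0    # rotation around the y axis
    roll: float = 0.0   # rotation around the z axis

# A 3DOF tracker would only update pitch, yaw and roll; a 6DOF tracker
# updates all six parameters, each of which can change individually.
head = Pose6DOF()
head.yaw = 45.0
```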

6
Q

Differentiate 3DOF and 6DOF

A

3DOF tracking provides only rotational data: rotation around the three axes (orientation). 6DOF tracking additionally provides translational data: translation along the three axes, i.e. both position and orientation.

7
Q

Illustrate the meaning of 3DOF and 6DOF with a sketch

A
8
Q

What is the goal of VR? Is it the same with AR?

A

When users enter a virtual world, ideally they should:
* Feel like they are actually in the virtual environment
* React to input like they would in the real world

These concepts apply only to VR, as AR does not try to remove users from the real world.

9
Q

Give a definition of Presence. What influences Presence?

A

Presence is the feeling of actually being in the virtual world and will cause the user to react to virtual input just as they would in the real world.

Convincing a user that they are present in a virtual environment is usually the goal of an interactive VR system!

Presence does not only depend on the technology used to “enter” a virtual environment; it is also influenced by:
* Content
* Interactivity
* …

10
Q

Describe the term Immersion

A

Slater:
Let’s reserve the term ‘immersion’ to stand simply for what the technology delivers from an objective point of view. The more that a system delivers displays (in all sensory modalities) and tracking that preserves fidelity in relation to their equivalent
real-world sensory modalities, the more that it is ‘immersive’.

–> By this definition, immersion refers solely to the technological aspects of a VR system.

Other definitions of immersion refer more to the sensation of being surrounded or enveloped by a virtual environment:

Witmer and Singer:
Immersion is a psychological state characterized by perceiving oneself to be enveloped by, included in, and interacting with an environment that provides a continuous stream of stimuli and experiences.

11
Q

What can be said about immersion and presence? Is there a clearly defined line between them?

A

Immersion and presence are connected.
There is no clearly defined line between immersion and presence. High immersion helps achieve good presence in a virtual environment!

12
Q

How can we measure immersion and presence?

A

Usually when we want to collect data about immersion and presence, we need to talk to the users! User Studies help us collect these types of data. Multiple questionnaires are available that you can build upon!

We can use:
* SUS
* Presence Questionnaire
* SSQ (Simulator Sickness Questionnaire)

13
Q

Describe what cybersickness is. What causes it?

A

Some people experience symptoms similar to motion sickness when using VR systems.
This is not as common with AR, as we are watching the real world! Cybersickness is usually caused by a discrepancy between senses. Although cybersickness is influenced by many factors like quality of the tracking system
and type of content, there is currently no cure or way to avoid it for everybody!

14
Q

Describe Binocular Vision

A

Front-facing eyes
Two eyes facing the same direction perceive two images from slightly different directions.
The resulting perception is a single three-dimensional image.
We can use this principle to create the illusion of depth when artificially creating images!

15
Q

Describe what parallax is

A

When we move a camera, the objects seem to move within the picture. When looking at the world through two eyes, the objects’ positions within the pictures shift for each eye. This shift is called parallax.
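
A toy calculation makes this concrete. Under a simple pinhole-camera assumption (all numbers below are illustrative), the shift of a point between the two images shrinks with distance:

```python
# Toy parallax calculation for a stereo pair under a pinhole-camera assumption:
# shift (disparity) = focal length * eye separation / depth.

def disparity_px(focal_px: float, baseline_m: float, depth_m: float) -> float:
    """Horizontal shift of a point between the left and right image, in pixels."""
    return focal_px * baseline_m / depth_m

for depth in (0.5, 1.0, 2.0, 10.0):
    # nearby objects shift a lot between the two views, distant ones barely at all
    print(depth, disparity_px(focal_px=800.0, baseline_m=0.064, depth_m=depth))
```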

16
Q

Binocular vision is a big part of depth perception. But are there more approaches to achieve an illusion of depth?

A

Our brains can gather depth information not only from binocular vision, but also from other sources: movement, occlusion, etc. (e.g. video games, “wiggle gifs”). A single, static two-dimensional image does not work well for conveying depth!

17
Q

Briefly explain Stereoscopic Rendering. What is called a stereogram?

A

Stereoscopic rendering produces two images from slightly different viewpoints. One image is presented to each eye, creating the illusion of depth – if done correctly.

A pair of images – one image for the left, the other for the right eye – is called a stereogram, though the term is not commonly used.
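
A minimal sketch of the camera setup behind a stereogram, assuming a 64 mm IPD and plain tuple maths (all names are illustrative):

```python
import math

# One virtual camera per eye, offset by half the IPD to each side along the
# head's local "right" direction. The 64 mm IPD is an example value.

def eye_positions(head, right, ipd_m=0.064):
    length = math.sqrt(sum(c * c for c in right))
    half = tuple(0.5 * ipd_m * c / length for c in right)
    left_eye = tuple(h - o for h, o in zip(head, half))
    right_eye = tuple(h + o for h, o in zip(head, half))
    return left_eye, right_eye

head = (0.0, 1.7, 0.0)        # head position at standing eye height
right_dir = (1.0, 0.0, 0.0)   # head's local right direction
left_eye, right_eye = eye_positions(head, right_dir)
print(left_eye, right_eye)    # (-0.032, 1.7, 0.0) (0.032, 1.7, 0.0)
# Render the scene once from each position and present the left image to the
# left eye and the right image to the right eye.
```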

18
Q

What does IPD stand for?

A

Interpupillary Distance
IPD is the distance between a human’s (or dog’s) pupils.
To achieve a good illusion of depth, the IPD is adjusted to the individual we are rendering a stereogram for.

19
Q

What does Vergence mean?

A

The viewing directions of our eyes are not parallel. When focusing on an object, the eyes move in opposite directions to aim at the target. This movement is called vergence.

20
Q

What does Accommodation mean?

A

Besides vergence, our eyes also focus so that we see a clear image despite varying distances. This is called accommodation. Accommodation is achieved by adjusting the lenses in our eyes.

21
Q

Describe the Problem with Vergence and Accommodation

A

Vergence-Accommodation Conflict:
Stereoscopy only creates the illusion of depth, and our eyes try to focus on the object that we are displaying. The distance to the display and to the virtual object may not be identical, resulting in a mismatch of vergence and accommodation.

The vergence-accommodation conflict can cause problems with focus, eyestrain, fatigue, etc.

It is a problem we currently face with HMDs, 3D displays/TVs, etc.
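
A small worked example, assuming an HMD whose optics focus at roughly 1.5 m and a 64 mm IPD (both illustrative numbers): the angle the eyes converge to for a near virtual object clearly differs from the angle that would match the display's focal distance:

```python
import math

def vergence_angle_deg(distance_m: float, ipd_m: float = 0.064) -> float:
    """Angle between the two eyes' viewing directions when fixating a point."""
    return math.degrees(2.0 * math.atan(ipd_m / (2.0 * distance_m)))

display_focus_m, virtual_object_m = 1.5, 0.4
print(vergence_angle_deg(virtual_object_m))  # eyes converge for 0.4 m: ~9.2 deg
print(vergence_angle_deg(display_focus_m))   # lenses accommodate for 1.5 m: ~2.4 deg
# The mismatch between these two cues is the vergence-accommodation conflict.
```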

22
Q

Describe what FOV stands for. What can limit the FOV?

A

Field of View (FOV):
The field of view denotes the area visible to a device or being. The FOV may be limited by a recording device (or eye) or by an output device (like an HMD).

23
Q

How do we measure FOV? What should we consider when reading about FOV?

A

For our purposes FOV is usually measured in degrees.
When reading about FOV, be mindful of the direction: FOV may be horizontal, vertical
or diagonal!
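
For illustration, the FOV covered by a flat display can be computed from its extent and viewing distance (the numbers below are assumptions):

```python
import math

# FOV covered by a flat display of a given extent seen from a given distance.

def fov_deg(extent_m: float, distance_m: float) -> float:
    return math.degrees(2.0 * math.atan(extent_m / (2.0 * distance_m)))

# The same formula yields horizontal, vertical or diagonal FOV, depending on
# whether the display's width, height or diagonal is passed in.
print(fov_deg(extent_m=1.2, distance_m=2.0))  # ~33.4 degrees horizontally
```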

24
Q

What is another term for Rendering?

A

image synthesis

25
Q

Is rendering for VR or AR different from “normal” computer graphics?

A

Rendering for VR or AR is not different from “normal” computer graphics (CG).

Stereoscopic rendering just means rendering a different 2D image for each eye.

26
Q

Input Data

How can we represent the data?

A

There is a wide variety of ways to describe input data, ranging from geometric data (like points or polygons) through volumes to mathematical descriptions (like surfaces).

27
Q

Name different Data Types. Are all Rendering Methods applicable to all Data Types?

A

Data types include points (point clouds), lines, polygons/triangle meshes and voxels (volume data). Not all rendering methods are applicable to all data types; some data may first need to be converted (e.g. a point cloud computed into a triangle mesh) before a given method can render it.

28
Q

Name three Primitives (rendering primitives). Why should we tell an API about the Rendering Primitive?

A

Common rendering primitives are points, lines and triangles. The API needs to know the rendering primitive to interpret the vertex data correctly, e.g. whether two consecutive points form a line or three consecutive points form a triangle.

29
Q

What is point-based rendering?

A

For our purpose a “point” is a rendering primitive that stores at least a position in space.

A set of points, or point cloud, can be used to render a continuous three-dimensional
surface.

Point-based rendering was proposed as early as the mid-1980s.

30
Q

Point-Based Rendering

What additional attributes may points store?

A

Besides a position, points may store additional attributes such as a colour, a normal vector or a size/radius.

31
Q

What negative effects can you encounter with Point-Based Rendering? Name one approach to counteract this side effect.

A

Given enough points we can render a seemingly continuous three-dimensional surface.

But: If we move the camera too close to the surface, holes may appear between points!

Various techniques have been proposed to use point clouds to render smooth surfaces.
One simple approach is to render small discs instead of just the points, to create a surface.

32
Q

Is point based rendering efficient? Justify your answer.

Where are point clouds commonly used today?

A

Rendering points is very efficient:
* Each point is rendered individually (including lighting)
* No connectivity information needed, all points are individual
* No interpolation between edges (like with polygons)
* Supported by GPUs and most APIs

Point clouds are commonly used today:
* Laser scanning
* Stereo-Vision
* Photogrammetry

33
Q

From Data to Image

Describe Line Rendering

A

Lines:
Building upon points, a line is a rendering primitive that is defined by its two endpoints.
Line rendering is not commonly used for surfaces, but may be useful at times and is
supported by most rendering APIs.

We only know the data stored in the endpoints. To generate data for any position on the line, the data from the endpoints is interpolated.

Line rendering is common in SciVis!
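
A minimal sketch of this endpoint interpolation (a plain linear blend; the function name is my own):

```python
# Any attribute stored in the two endpoints (position, colour, ...) is
# blended linearly along the line.

def lerp(a, b, t):
    """Interpolate component-wise between endpoint attributes a and b, t in [0, 1]."""
    return tuple(x + t * (y - x) for x, y in zip(a, b))

p0, p1 = (0.0, 0.0, 0.0), (2.0, 4.0, 0.0)  # endpoint positions
c0, c1 = (1.0, 0.0, 0.0), (0.0, 0.0, 1.0)  # endpoint colours (red -> blue)
print(lerp(p0, p1, 0.25))  # position a quarter of the way along the line
print(lerp(c0, c1, 0.25))  # colour at the same spot: still mostly red
```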

34
Q

What is a streamline?

A

A streamline is a line in 3D space representing the path of flow of a fluid, particle, etc.

Streamlines allow an easy-to-understand representation for visualising turbulence, paths of least resistance, etc.

35
Q

What are Polygons and triangles?

A

A polygon is a geometric figure that is described by a number of straight lines.

As a rendering primitive triangles are most commonly used. A triangle primitive consists
of three points forming three edges (lines) and a planar surface.

As seen with lines, data for any position on the surface is generated by interpolating data from the three points.
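
A small sketch of this interpolation using barycentric weights, the standard scheme for triangles (names and numbers are illustrative):

```python
# Each point on the triangle mixes the data stored in the three vertices using
# barycentric weights w0, w1, w2 (non-negative and summing to 1).

def interpolate(v0, v1, v2, w0, w1, w2):
    return tuple(w0 * a + w1 * b + w2 * c for a, b, c in zip(v0, v1, v2))

# Vertex colours of a triangle: red, green and blue.
c0, c1, c2 = (1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0)
print(interpolate(c0, c1, c2, 1/3, 1/3, 1/3))  # triangle centre: even grey mix
print(interpolate(c0, c1, c2, 1.0, 0.0, 0.0))  # exactly at vertex 0: pure red
```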

36
Q

What do we need to approximate complex surfaces with triangles?
What is a Triangle Mesh?

A

To approximate complex surfaces we need a large number of triangles plus connectivity information describing which points they share. A set of triangles connected via shared vertices and edges is called a triangle mesh.

37
Q

Polygons

What do modern APIs provide to improve storage efficiency?

What is required to form triangles from a set of points?

A

To improve storage efficiency modern APIs support vertex and index buffers to avoid redundancy:

  • Vertex buffers store just point data
  • Index buffers store the indices of points in the vertex buffer; three consecutive indices form a triangle

Forming triangles from a set of points requires information about connectivity: which groups of 3 points form a triangle?

Since triangles may be connected, a single point can be part of multiple triangles.
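
A minimal sketch of this buffer layout for a quad built from two triangles, with plain Python lists standing in for GPU buffers:

```python
# Four shared points are stored once in the vertex buffer; six indices in the
# index buffer describe two triangles without duplicating any point data.

vertex_buffer = [
    (0.0, 0.0, 0.0),  # 0: bottom left
    (1.0, 0.0, 0.0),  # 1: bottom right
    (1.0, 1.0, 0.0),  # 2: top right
    (0.0, 1.0, 0.0),  # 3: top left
]
index_buffer = [0, 1, 2,   0, 2, 3]  # three consecutive indices per triangle

triangles = [tuple(vertex_buffer[i] for i in index_buffer[t:t + 3])
             for t in range(0, len(index_buffer), 3)]
print(len(triangles))  # 2 triangles; vertices 0 and 2 are shared by both
```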

38
Q

Colours and Textures

Where may additional surface data (like colours and textures) be stored?

A

Additional surface data may be either stored in the mesh (vertex data) or in a separate
map/image/lookup table (texture).

39
Q

Vertex Colours

What is a simple way to add colours to meshes? How is this approach limited?

A

A simple way to add colours to meshes is to store colour data in the vertices.

Each vertex stores a colour value; for each point of the triangle surface the colour value is computed by interpolating the colours from the vertices.

Vertex colours are a rather simple approach that is quite limiting, since the amount of
detail in the surface colour depends on the size and number of triangles.

40
Q

What is a more sophisticated approach of storing colour values?

A

Ideally, vertex data should define the shape, but not limit detail in the colours. Instead of storing colour values in the mesh, we map an image onto the surface. The image is called a texture.

To know what part of the image goes where, each vertex stores a texture coordinate
(often denoted as uv coordinate), usually in the range [0-1].

As with vertex colours, interpolation across the triangle is used, but here for the texture
coordinates and not the colour values.
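
A tiny sketch of such a texture lookup with uv coordinates in the range [0-1], using nearest-neighbour sampling of a hand-made 2x2 "image" (real renderers typically filter between texels):

```python
# A 2x2 example texture: each entry is an RGB colour.
texture = [
    [(255, 0, 0), (0, 255, 0)],    # row 0: red, green
    [(0, 0, 255), (255, 255, 0)],  # row 1: blue, yellow
]

def sample(tex, u, v):
    """Nearest-neighbour lookup: map u, v in [0, 1] to a texel."""
    h, w = len(tex), len(tex[0])
    x = min(int(u * w), w - 1)  # clamp so u = 1.0 stays inside the image
    y = min(int(v * h), h - 1)
    return tex[y][x]

# During rendering, u and v are themselves interpolated across the triangle
# and then used for lookups like this one.
print(sample(texture, 0.1, 0.9))  # (0, 0, 255): the blue texel
```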

41
Q

What may occur when creating textures?

A

When creating textures, stretching and shrinking of parts of the image may occur to match the uv-mapping and the model.

Nowadays, textures can be more than
just images. They may store a wide variety of data that is read and mapped to a model during rendering (e.g. materials, normal vectors, etc.).

42
Q

What does MipMapping mean?

A

Mipmapping means storing a texture together with multiple pre-computed copies of progressively lower resolution (mipmaps), so that a suitably sized version can be chosen during rendering.

43
Q

How to create and use mipmaps? What can be a disadvantage?

A

Mipmapping takes a texture and creates multiple versions with progressively lower
resolutions: for each new mipmap of a texture the horizontal and vertical resolution is
halved.

During rendering a mipmap may be used instead of the original texture depending on the distance between the camera and a pixel (greater distance, lower-resolution mipmap).

Mipmapping can avoid artefacts and increase rendering speed, but at the cost of texture memory.
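
A small sketch of a mipmap chain and its memory cost (the 1024x1024 base resolution is an example):

```python
# Build the mipmap chain by halving width and height until reaching 1x1.

def mip_chain(width, height):
    levels = [(width, height)]
    while width > 1 or height > 1:
        width, height = max(width // 2, 1), max(height // 2, 1)
        levels.append((width, height))
    return levels

chain = mip_chain(1024, 1024)
print(chain[:4])  # [(1024, 1024), (512, 512), (256, 256), (128, 128)]

base = 1024 * 1024
extra = sum(w * h for w, h in chain[1:])
print(extra / base)  # ~0.333: the chain costs about a third more texture memory
```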

44
Q

What is a Voxel?

A

A voxel is a “three-dimensional pixel”. It represents a value for a cell within a grid in space.

Voxels are commonly used to represent volume data. A number of voxels, often as part
of a regular grid, describe the parameters of the volume for their given region.

Volume data are often used in scientific applications and can contain all kinds of data, e.g.: Pressure, moisture, flow speed and direction, temperature, …
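
A minimal sketch of a regular voxel grid (grid size, cell size and values are illustrative assumptions):

```python
# A 4x4x4 grid of voxels, each storing one scalar (say, a temperature).
NX, NY, NZ = 4, 4, 4
CELL_SIZE = 0.5  # metres per cell

grid = [[[20.0 for _ in range(NZ)] for _ in range(NY)] for _ in range(NX)]
grid[1][2][3] = 37.5  # set the value of a single voxel

def voxel_at(x_m, y_m, z_m):
    """Look up the voxel whose cell covers a position in space."""
    return grid[int(x_m / CELL_SIZE)][int(y_m / CELL_SIZE)][int(z_m / CELL_SIZE)]

print(voxel_at(0.6, 1.2, 1.7))  # position falls into cell (1, 2, 3) -> 37.5
```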

45
Q

Where is Volume data frequently encountered?

A
  • In medical applications: CT scans, medical imaging, …
  • Simulation and Supercomputing
46
Q

What is called “rendering”?

A

The process of creating an image from data is called “rendering”.

47
Q

How is image generation most commonly done?

A

Rasterisation is the process of mapping our objects to the pixels of an image.

Rasterisation is well supported in modern hardware and APIs and is commonly used in
video games, real-time graphics, etc.

Each primitive to be rendered is mapped to pixels of the resulting image. Due to its performance and hardware support, most VR rendering is done using rasterisation.

48
Q

What is Ray Tracing?

A

Ray tracing simulates individual rays of light.
This approach allows us to simulate many physical effects, especially when recursively
tracing rays. Ray tracing is very computationally intensive, though.

Ideally we would want to render everything through ray tracing. Due to the required
computational power this is, however, currently not possible. Current advancements in hard- and software try to integrate ray tracing with rasterisation for improved reflections, shadows, etc. (Nvidia RT cores, AMD ray accelerators).

49
Q

Can we simplify ray tracing?
If yes, what may be the trade-off?

A

Ray casting simplifies the idea of ray tracing by casting a ray from the camera through
each pixel of the image. When a ray hits an object, we have a colour value.

Ray casting is a lot simpler and faster than ray tracing; it does lose a lot of ray tracing features though, as it does not simulate the physical effects as well.
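
A minimal ray-casting sketch: one ray per pixel from the camera, tested against a single hard-coded sphere, with a hit mapped straight to a "colour" (here just an ASCII character). The scene and resolution are illustrative assumptions:

```python
import math

SPHERE_C, SPHERE_R = (0.0, 0.0, -3.0), 1.0  # a single sphere in front of the camera
W, H = 12, 6                                # a tiny "image"

def hit_sphere(ox, oy, oz, dx, dy, dz):
    # Solve |o + t*d - c|^2 = r^2 for t; a real solution means the ray hits.
    lx, ly, lz = ox - SPHERE_C[0], oy - SPHERE_C[1], oz - SPHERE_C[2]
    b = 2.0 * (dx * lx + dy * ly + dz * lz)
    c = lx * lx + ly * ly + lz * lz - SPHERE_R * SPHERE_R
    return b * b - 4.0 * c >= 0.0  # discriminant, with a = d.d = 1 for unit d

for j in range(H):
    row = ""
    for i in range(W):
        # Map the pixel to a direction through an image plane at z = -1.
        dx = (i + 0.5) / W * 2.0 - 1.0
        dy = 1.0 - (j + 0.5) / H * 2.0
        dz = -1.0
        n = math.sqrt(dx * dx + dy * dy + dz * dz)
        row += "#" if hit_sphere(0.0, 0.0, 0.0, dx / n, dy / n, dz / n) else "."
    print(row)
```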

50
Q

Can we simulate the real world?
What does PBR mean in this context?

A

Physically Based Rendering (PBR)

Physically Based Rendering tries to generate images by simulating what happens in the real world (light bouncing around, being reflected, refracted, etc.).

Usually aims to create photo-realistic results, but is computationally intensive!

PBR used to be for cinematic rendering only, but has now started to enter the real-time world as well!

51
Q

What is the difference between direct and indirect lighting?
What is the aim of Global Illumination?

A

Direct light is light directly from a light source hitting an object.

In reality, light bounces off surfaces. Every surface reflects some amount of light, usually changing the colour. This reflected light also lights the surfaces around it -> indirect lighting.

Global Illumination aims to add light from reflections to create a more realistic image.

52
Q

When rendering for MR, especially VR, what is considered crucial?

A

performance!

Not only do low framerates cause cybersickness, but we also need to render separate images for each eye, so twice the frames!

As a result, methods to increase performance, ideally without losing visual quality, are very important in VR!

53
Q

When rendering, everything gets mapped to…

What can we say about how many pixels of an image an object is mapped to?

A

pixels

How many pixels of an image an object is mapped to depends on scale, distance, etc.
Fewer pixels means fewer details are visible!

54
Q

What does LoD mean?

A

Level of Detail (LoD). Objects are rendered at varying levels of detail (e.g. meshes with fewer triangles, lower-resolution textures) depending on how many pixels they cover; small or distant objects do not need full detail.

55
Q

What does Culling mean?

A

Not all parts of a scene are always visible.

Some objects may be outside the area visible by the camera, others may just be hidden by some object and for non-transparent objects we usually cannot see the backside.

Culling refers to the removal of objects or parts of objects that are not visible. Usually we talk about frustum culling, occlusion culling and backface culling.
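
As an illustration, backface culling boils down to a dot-product test per triangle (the vector helpers and winding convention below are my own assumptions):

```python
# A triangle is culled when its normal (from the vertex winding order) points
# away from the camera.

def sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]

def is_backface(v0, v1, v2, camera):
    normal = cross(sub(v1, v0), sub(v2, v0))
    return dot(normal, sub(v0, camera)) >= 0.0  # facing away -> cull

cam = (0.0, 0.0, 5.0)
tri = ((0, 0, 0), (1, 0, 0), (0, 1, 0))          # counter-clockwise seen from cam
print(is_backface(*tri, cam))                    # False: front face, keep it
print(is_backface(tri[0], tri[2], tri[1], cam))  # True: reversed winding, cull it
```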

56
Q

What is called foveated rendering?

A

We mentioned that we render objects with varying detail, depending on the amount of detail we need.

Humans only see full detail if we fixate on some point. Anything in our field of vision that we do not focus on is not perceived as detailed.

Knowing where a user looks allows us to render with high detail only where needed
(= what the user looks at). This is called foveated rendering.

57
Q

When rendering a scene, the final picture is created…

Go in more detail regarding Drawcalls.

A

incrementally

Drawcalls:
In a scene we usually have many objects. Most often, each object is sent to the GPU individually and drawn onto the image. When all objects are drawn, the image is complete.

Drawcalls can be expensive and can cause CPU overhead, so fewer drawcalls are better/faster!

58
Q

What does Geometry Instancing describe?

A

In many scenes, objects are rendered many times: grass/vegetation, crowds, flocks of
birds, etc!

When rendering an object multiple times we can use geometry instancing!

Instead of drawing one object many times, we send the geometry to the GPU once together with a list of varying attributes (position, rotation, etc.)

Geometry Instancing can reduce overhead and improve performance (when CPU bound).
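
A conceptual sketch of the idea; draw_instanced below is a hypothetical stand-in for the single instanced drawcall a real graphics API would issue, not an actual API function:

```python
# The base geometry is stored once; per-instance attributes (here just a 2D
# position offset) are passed alongside it in a single submission.

base_blade = [(0.0, 0.0), (0.02, 0.3), (-0.02, 0.3)]      # one triangle of "grass"
instance_offsets = [(i * 0.5, 0.0) for i in range(1000)]  # 1000 placements

def draw_instanced(geometry, offsets):
    """Hypothetical stand-in: one submission producing many placed copies,
    instead of 1000 individual drawcalls."""
    return [[(vx + ox, vy + oy) for vx, vy in geometry] for ox, oy in offsets]

all_blades = draw_instanced(base_blade, instance_offsets)
print(len(all_blades))  # 1000 placed triangles from a single "drawcall"
```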

59
Q

What does Instanced Stereo mean?

A

In VR, we always render everything twice (once per eye)!

In essence, we render each object twice: once for the left and once for the right eye. Instanced stereo (also known as single pass stereo) replaces drawcalls with instanced drawcalls to render both eyes at once (into a single packed texture). Can help to increase VR performance!

60
Q

What are ways for Data Generation?

A

Manual Generation: time-consuming, requires artistic skill and creativity

Automated Generation

61
Q

What data will be created when using a laser scanner?

A

Tools like laser scanners usually scan the environment and create a point cloud for
any solid surface.

The resulting data can either be rendered directly or computed into a triangle mesh, volume data, …

62
Q

What is Photogrammetry?

A

One variety of scanning the environment is photogrammetry.

Photogrammetry uses a camera as a sensor.
Given many photos of a single object or environment, an automated algorithm can match the photos, find overlapping areas and compute how the images fit together. If a point is visible in multiple images, its position in 3D space can be reconstructed. This process is repeated to create a (coloured) point cloud.