Computer Vision Terminology Flashcards

Question 1

Q

Computer vision

Answer

A

Computer vision is a subfield of artificial intelligence and computer science that focuses on enabling computers to understand and interpret the visual world. Essentially, it’s about teaching computers to “see” and understand digital images or videos.

Question 2

Q

Simultaneous localization and mapping (SLAM)

Answer

A

Simultaneous Localization and Mapping, or SLAM, is a computational problem in the field of robotics. As the name implies, it’s about doing two things at the same time:

Localization: Determining where a robot is located in an environment.

Mapping: Building a map of that environment.

Question 3

Q

vSLAM: Initialization

Answer

A

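In visual SLAM (vSLAM), “initialization” refers to the process of setting up the starting point for the algorithm: the first camera pose, which defines the map’s coordinate frame, and an initial set of 3D map points estimated from the first few frames.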

Question 4

Q

vSLAM: Local mapping

Answer

A

Local mapping is where the robot builds a smaller, more immediate map of its surroundings, often referred to as a local map.

Question 5

Q

vSLAM: Loop closure

Answer

A

The idea of loop closure is to correct accumulated drift (the error that builds up in the robot’s estimated position over time) by recognizing when the robot returns to a place it has visited before. When the robot recognizes such a place, it can “close the loop”, correcting its current position estimate and map to align with the previous visit.
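
As a rough illustration, the sketch below spreads the error revealed by a loop closure linearly along the trajectory. It assumes 2D positions and assumes the last pose should coincide with the first; the function name `close_loop` is hypothetical. Real vSLAM systems typically use pose-graph optimization rather than this naive linear correction.

```python
def close_loop(poses):
    """Spread the end-of-loop error linearly over a list of (x, y) estimates."""
    n = len(poses)
    if n < 2:
        return list(poses)
    # Error revealed by the loop closure: the last pose should equal the first.
    ex = poses[-1][0] - poses[0][0]
    ey = poses[-1][1] - poses[0][1]
    corrected = []
    for i, (x, y) in enumerate(poses):
        w = i / (n - 1)  # weight grows from 0 at the start to 1 at the end
        corrected.append((x - w * ex, y - w * ey))
    return corrected


# A square-ish loop whose final estimate has drifted away from the start.
drifted = [(0.0, 0.0), (1.0, 0.1), (1.1, 1.2), (0.2, 1.3), (0.3, 0.4)]
corrected = close_loop(drifted)
print(corrected[0], corrected[-1])
```

After the correction the first and last poses coincide, and intermediate poses are shifted in proportion to how far along the loop they are.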

Question 6

Q

vSLAM: Relocalization

Answer

A

Relocalization refers to the ability of a robot to determine its current location in a map that it previously built, or in a known environment, after it has lost track of its position because of an error or disturbance, or because it has been manually moved (the latter is known as the “kidnapped robot” problem).

Question 7

Q

vSLAM: Tracking

Answer

A

“Tracking” typically refers to the process of continuously estimating the robot’s motion and position over time based on its sensor data.

Question 8

Q

Human pose estimation (HPE)

Answer

A

Human pose estimation (HPE) is a computer vision task that involves determining the position and orientation of the human body, along with the positions of various body parts such as the head, arms, and legs, often in real time.

Question 9

Q

Rigid pose estimation (RPE)

Answer

A

Rigid Pose Estimation (RPE) is a concept in computer vision and robotics that involves determining the position and orientation (the “pose”) of an object that does not deform or change shape — in other words, a “rigid” object. The term ‘rigid’ indicates that the distance between any two points on the object remains constant over time, regardless of the object’s movement or orientation.
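
The distance-preserving property can be checked numerically. The sketch below is a 2D simplification (a pose here is an assumed rotation angle plus a translation; the helper names are hypothetical) that applies the same pose to two points and compares their distance before and after.

```python
import math


def apply_pose(point, theta, tx, ty):
    """Rotate a 2D point by theta (radians), then translate by (tx, ty)."""
    x, y = point
    c, s = math.cos(theta), math.sin(theta)
    return (c * x - s * y + tx, s * x + c * y + ty)


def distance(p, q):
    return math.hypot(p[0] - q[0], p[1] - q[1])


a, b = (0.0, 0.0), (3.0, 4.0)
theta = math.radians(30)
a2 = apply_pose(a, theta, 2.0, -1.0)
b2 = apply_pose(b, theta, 2.0, -1.0)

before = distance(a, b)
after = distance(a2, b2)
print(before, after)  # the two distances agree up to floating-point rounding
```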

Question 10

Q

Global map optimization

Answer

A

Global map optimization is the process of improving the accuracy and consistency of the entire map that a robot has created of its environment.

Question 11

Q

Global positioning system (GPS) signal

Answer

A

The Global Positioning System (GPS) is a satellite-based navigation system that provides location and time information in all weather conditions, anywhere on or near the Earth where there is an unobstructed line of sight to four or more GPS satellites.

Question 12

Q

GPS-degraded environment

Answer

A

A GPS-degraded environment refers to any situation or location where the Global Positioning System (GPS) signals are unreliable, weak, or completely unavailable.

Question 13

Q

GPS-denied environment

Answer

A

A GPS-denied environment is a location or situation where the Global Positioning System (GPS) signals are not available at all.

Question 14

Q

Dead reckoning data

Answer

A

Dead reckoning is a process used in navigation to determine one’s current position from a previously known position, or fix, by advancing that position according to the known or estimated speed, the elapsed time, and the direction in which the person or vehicle is known or estimated to have moved.
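
A minimal sketch of one dead-reckoning update, assuming a flat 2D world, constant speed during each time step, and a heading measured counterclockwise from the x axis (these conventions and the function name `dead_reckon` are assumptions for illustration):

```python
import math


def dead_reckon(x, y, speed, heading_rad, dt):
    """Advance a known position by speed * dt along the given heading."""
    return (x + speed * dt * math.cos(heading_rad),
            y + speed * dt * math.sin(heading_rad))


# Start at a known fix, travel 5 s at 2 m/s along +x, then 3 s at 1 m/s along +y.
pos = (0.0, 0.0)
pos = dead_reckon(pos[0], pos[1], 2.0, 0.0, 5.0)
pos = dead_reckon(pos[0], pos[1], 1.0, math.pi / 2, 3.0)
print(pos)  # approximately (10, 3)
```

Each update builds on the previous estimate, so any error in speed or heading carries forward into every later position.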

Question 15

Q

Robot drift

Answer

A

“Robot drift” is a term often used in the context of robotics and refers to the accumulated error in a robot’s estimated position and orientation over time.
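
To illustrate how such error accumulates, the sketch below assumes a robot moving in a straight line whose odometry overestimates every step by 1% (an invented figure; the function name `drift_after` is hypothetical).

```python
def drift_after(steps, true_step=1.0, scale_error=0.01):
    """Return estimated minus true position after a number of steps."""
    true_pos = 0.0
    est_pos = 0.0
    for _ in range(steps):
        true_pos += true_step
        # The odometry reports each step slightly too long.
        est_pos += true_step * (1.0 + scale_error)
    return est_pos - true_pos


print(drift_after(10), drift_after(100))  # the error grows with distance travelled
```

A small per-step error is harmless over a few steps but becomes significant over a long trajectory, which is why corrections such as loop closure are needed.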

Question 16

Q

Object occlusion

Answer

A

Object occlusion in the context of computer vision refers to the event where a part or all of an object in the scene is hidden from view by some other object in the scene. In simple words, when an object is in front of another object, blocking it from view, we say that the second object is occluded.

Question 17

Q

Inertial measurement unit (IMU)

Answer

A

An Inertial Measurement Unit, or IMU, is a device that measures and reports a body’s linear acceleration (specific force) and angular rate, and sometimes its orientation, using a combination of accelerometers, gyroscopes, and sometimes magnetometers. IMUs are typically used to aid navigation and tracking systems, particularly when GPS data is unavailable or unreliable.

Question 18

Q

Light detection and ranging (LiDAR)

Answer

A

Light Detection and Ranging, more commonly known as LiDAR, is a method of remote sensing that uses light in the form of a pulsed laser to measure distances to an object. In airborne surveying, these light pulses, combined with other data recorded by the airborne system, generate precise, three-dimensional information about the shape of the Earth and its surface characteristics. On robots and vehicles, the same principle produces 3D measurements of the surrounding environment.
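
For a pulsed (time-of-flight) LiDAR, the distance follows from how long a pulse takes to travel to the object and back. A minimal sketch, assuming the round-trip time is measured directly and the pulse travels at the vacuum speed of light (the function name is hypothetical):

```python
SPEED_OF_LIGHT_M_PER_S = 299_792_458.0


def range_from_time_of_flight(round_trip_s):
    """Distance to the target: the pulse covers the distance twice."""
    return SPEED_OF_LIGHT_M_PER_S * round_trip_s / 2.0


# A pulse that returns after 200 nanoseconds.
d = range_from_time_of_flight(200e-9)
print(d)  # roughly 30 metres
```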

Question 19

Q

Odometry sensor

Answer

A

An odometry sensor is a device, such as a wheel encoder, whose measurements of a vehicle’s own motion are used to estimate the change in position over time of that vehicle, like a car or a robot.
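
One common case is wheel odometry on a differential-drive robot. The sketch below assumes wheel encoders that count ticks, no wheel slip, and invented values for the encoder resolution, wheel radius, and track width; all names are hypothetical.

```python
import math


def wheel_distance(ticks, ticks_per_rev, wheel_radius_m):
    """Distance rolled by one wheel for a given number of encoder ticks."""
    return 2.0 * math.pi * wheel_radius_m * ticks / ticks_per_rev


def diff_drive_update(left_ticks, right_ticks, ticks_per_rev,
                      wheel_radius_m, track_width_m):
    """Return (distance travelled, change in heading in radians)."""
    dl = wheel_distance(left_ticks, ticks_per_rev, wheel_radius_m)
    dr = wheel_distance(right_ticks, ticks_per_rev, wheel_radius_m)
    return (dl + dr) / 2.0, (dr - dl) / track_width_m


# Both wheels turn one full revolution: straight-line motion, no rotation.
dist, dtheta = diff_drive_update(360, 360, 360, 0.05, 0.30)
print(dist, dtheta)

# The right wheel turns further than the left: the robot curves to the left.
dist2, dtheta2 = diff_drive_update(300, 360, 360, 0.05, 0.30)
print(dist2, dtheta2)
```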

Question 20

Q

Sensor fusion model

Answer

A

Sensor fusion is a method used in robotics and automation that involves merging data from different sensors to improve the understanding of the environment. This process can reduce uncertainty, improve accuracy, and make the system more robust to failures of individual sensors.
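
A minimal sketch of one simple fusion rule, inverse-variance weighting, which combines two measurements of the same quantity. It assumes the two sensors have independent, zero-mean errors with known variances; the function name `fuse` and the numbers are illustrative only.

```python
def fuse(z1, var1, z2, var2):
    """Combine two noisy measurements, trusting the less noisy one more."""
    w1 = 1.0 / var1
    w2 = 1.0 / var2
    fused = (w1 * z1 + w2 * z2) / (w1 + w2)
    fused_var = 1.0 / (w1 + w2)
    return fused, fused_var


# Sensor A reads 10.0 (variance 4.0); sensor B reads 12.0 (variance 1.0).
estimate, variance = fuse(10.0, 4.0, 12.0, 1.0)
print(estimate, variance)
```

The fused estimate lies closer to the more reliable sensor, and its variance is smaller than that of either sensor alone, which is the sense in which fusion reduces uncertainty.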

Question 21

Q

Bundle adjustment

Answer

A

Bundle adjustment is a mathematical process that tries to minimize the errors in the 3D reconstruction of the scene and the camera positions. It takes all the images and the initial guesses of the camera positions and 3D scene points, and refines these guesses by minimizing the difference between the observed image points and the projections of the 3D points onto the image planes of the cameras (the reprojection error).
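
The quantity being minimized can be sketched for a single camera. The code below only evaluates the summed squared reprojection error; it does not perform the optimization itself. It assumes a simplified pinhole camera with no rotation, looking along +z, with an invented focal length; all names are hypothetical.

```python
def project(point, cam_pos, focal):
    """Project a 3D point into a pinhole camera located at cam_pos."""
    x = point[0] - cam_pos[0]
    y = point[1] - cam_pos[1]
    z = point[2] - cam_pos[2]
    return (focal * x / z, focal * y / z)


def reprojection_error(points, cam_pos, focal, observations):
    """Sum of squared differences between observed and projected image points."""
    total = 0.0
    for p, (u, v) in zip(points, observations):
        pu, pv = project(p, cam_pos, focal)
        total += (u - pu) ** 2 + (v - pv) ** 2
    return total


points = [(0.0, 0.0, 5.0), (1.0, 1.0, 5.0)]
true_cam = (0.0, 0.0, 0.0)
observations = [project(p, true_cam, 100.0) for p in points]

err_true = reprojection_error(points, true_cam, 100.0, observations)
err_guess = reprojection_error(points, (0.1, 0.0, 0.0), 100.0, observations)
print(err_true, err_guess)  # the wrong camera position gives a larger error
```

A bundle adjustment solver would adjust the camera parameters and the 3D points together, over all cameras, to drive this error as low as possible.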

Question 22

Q

Edge computing

Answer

A

Edge computing is a distributed computing paradigm that brings computation and data storage closer to the sources of data. This is done to improve response times and save bandwidth. The “edge” refers to the edge of a network, closer to the devices that produce or consume data, as opposed to a centralized data center or cloud.

Question 23

Q

Key points/pairs

Answer

A

Key points, also known as feature points or interest points, are distinct and unique points in an image that are easy to find and accurately describe. These points are usually selected because they represent corners, edges, or other interesting aspects of the image, and they are used in many computer vision tasks for things like object recognition, image alignment, and 3D reconstruction.

Question 24

Q

Keyframe selection

Answer

A

The robot selects certain frames as “keyframes”. These are the frames that contain significant or new information — maybe the robot has moved to a new location or turned a corner, for example. By focusing on these keyframes, the robot can create a map of its environment and track its position more efficiently.
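
One simple selection rule is to add a keyframe whenever the robot has moved or turned enough since the last keyframe. The sketch below assumes 2D poses given as (x, y, heading in radians) and invented thresholds; the function name is hypothetical, and practical systems often use other criteria as well, such as how many previously seen features are still visible.

```python
import math


def select_keyframes(poses, min_translation=0.5, min_rotation=math.radians(15)):
    """Return indices of frames that differ enough from the last keyframe."""
    if not poses:
        return []
    keyframes = [0]  # the first frame is always a keyframe
    for i in range(1, len(poses)):
        kx, ky, kth = poses[keyframes[-1]]
        x, y, th = poses[i]
        moved = math.hypot(x - kx, y - ky)
        # Wrap the heading difference into [-pi, pi] before taking its size.
        turned = abs(math.atan2(math.sin(th - kth), math.cos(th - kth)))
        if moved >= min_translation or turned >= min_rotation:
            keyframes.append(i)
    return keyframes


poses = [(0.0, 0.0, 0.0), (0.1, 0.0, 0.0), (0.3, 0.0, 0.0),
         (0.6, 0.0, 0.0), (0.7, 0.0, 0.5), (0.8, 0.0, 0.5)]
selected = select_keyframes(poses)
print(selected)  # [0, 3, 4]
```

Frame 3 is selected because the robot has moved far enough, and frame 4 because it has turned; the frames in between add little new information and are skipped.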

Question 25

Q

Optimization

Answer

A

Optimization is the method of obtaining the most effective and/or efficient solution to a given problem, typically by minimizing or maximizing an objective function.
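
As a small illustration, gradient descent is one widely used optimization method. The sketch below minimizes the assumed example function f(x) = (x - 3)^2, whose minimum is at x = 3; the step size and iteration count are arbitrary choices, and the function name is hypothetical.

```python
def gradient_descent(grad, x0, lr=0.1, steps=100):
    """Repeatedly step against the gradient to reduce the objective."""
    x = x0
    for _ in range(steps):
        x -= lr * grad(x)
    return x


# Gradient of f(x) = (x - 3)^2 is 2 * (x - 3).
x_min = gradient_descent(lambda x: 2.0 * (x - 3.0), x0=0.0)
print(x_min)  # very close to 3
```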