Computer vision

Related papers: 20

About

Computer vision is the field of artificial intelligence concerned with enabling machines to interpret and understand visual information from the world, including images, video, depth maps, and point clouds. In robotics and AI, it serves as a foundational sensing capability, allowing systems to perceive their surroundings, recognize objects, estimate poses, navigate environments, and interact with physical objects. Key applications include autonomous driving, where camera and LiDAR streams are processed to detect obstacles and lanes; visual SLAM, where robots build maps and localize themselves at the same time using camera feeds; visual servoing, where real-time image data guides robotic manipulators; and 6D object pose estimation, which recovers an object's 3D position and 3D orientation to enable precise grasping. Modern computer vision increasingly relies on deep learning techniques, such as convolutional neural networks, to extract meaningful features from raw sensor data. Benchmarks such as KITTI have been instrumental in advancing the field by providing standardized evaluation datasets. Computer vision matters because it bridges the gap between raw sensor input and actionable robot behavior, making autonomous and intelligent systems practical in unstructured, real-world environments.

Top Cited Papers

Are we ready for autonomous driving? The KITTI vision benchmark suite

Andreas Geiger, Philip Lenz, Raquel Urtasun

Citations: 14348 • 2012

Vision meets robotics: The KITTI dataset

Andreas Geiger, Philip Lenz, Christoph Stiller, Raquel Urtasun

Citations: 9681 • 2013

A versatile camera calibration technique for high-accuracy 3D machine vision metrology using off-the-shelf TV cameras and lenses

R. Tsai

Citations: 5813 • 1987

Color indexing

Michael J. Swain, Dana H. Ballard

Citations: 5591 • 1991

Robot Motion Planning

Jean‐Claude Latombe

Citations: 5429 • 1991

3D is here: Point Cloud Library (PCL)

Radu Bogdan Rusu, Steve Cousins

Citations: 4825 • 2011

VoxelNet: End-to-End Learning for Point Cloud Based 3D Object Detection

Yin Zhou, Oncel Tuzel

Citations: 4542 • 2018

VINS-Mono: A Robust and Versatile Monocular Visual-Inertial State Estimator

Tong Qin, Peiliang Li, Shaojie Shen

Citations: 4390 • 2018

Parallel Tracking and Mapping for Small AR Workspaces

Georg Klein, David W. Murray

Citations: 4244 • 2007

Simultaneous localization and mapping: part I

Hugh Durrant‐Whyte, T. Bailey

Citations: 4107 • 2006

Computer and Robot Vision

Robert M. Haralick, Linda G. Shapiro

Citations: 3952 • 1991

A benchmark for the evaluation of RGB-D SLAM systems

Jürgen Sturm, Nikolas Engelhard, Felix Endres, Wolfram Burgard, Daniel Cremers

Citations: 3918 • 2012

MonoSLAM: Real-Time Single Camera SLAM

Andrew J. Davison, Ian Reid, Nicholas Molton, Olivier Stasse

Citations: 3909 • 2007

Robot Vision

Berthold K. P. Horn

Citations: 3635 • 1986

VoxNet: A 3D Convolutional Neural Network for real-time object recognition

Daniel Maturana, Sebastian Scherer

Citations: 3579 • 2015

A tutorial on visual servo control

Seth Hutchinson, Gregory D. Hager, Peter Corke

Citations: 3499 • 1996

SECOND: Sparsely Embedded Convolutional Detection

Yan Yan, Yuxing Mao, Bo Li

Citations: 3212 • 2018

Past, present, and future of simultaneous localization and mapping: Toward the robust-perception age

César Cadena, Luca Carlone, Henry Carrillo, Yasir Latif, Davide Scaramuzza, José Neira, Ian Reid, John J. Leonard

Citations: 3158 • 2016

Domain randomization for transferring deep neural networks from simulation to the real world

Josh Tobin, Rachel Fong, Alex Ray, Jonas Schneider, Wojciech Zaremba, Pieter Abbeel

Citations: 2736 • 2017

Manipulability of Robotic Mechanisms

Tsuneo Yoshikawa

Citations: 2516 • 1985