Recognizing Textured Planar Objects with OpenCV

How can we recognize textured objects in images under the assumption that they are planar? OpenCV provides extensive support for this task: it helps with local feature extraction, fast approximate matching, and robust model fitting. This article describes a minimalistic detector that relies solely on OpenCV for all vision subtasks, and also offers pictures, videos, and the source code.

Here is the problem to begin with: We have a few planar textured objects that we want to recognize in images. We are permitted to create models of these objects in a training phase. Then the task is to recognize and locate these objects in still 2D gray-scale images.

The approach presented in this article is available as open source (tpofinder) and was developed for robotic machine vision (mappotino). The technique is very similar to the one used in some panorama stitching applications. More precisely, it is a bag-of-features approach: the test images and the models are described by local features that are reasonably robust to changes in scale, rotation, and illumination. An approximate nearest-neighbor algorithm matches the local features of the test image against those of each model, which yields correspondences between each model and the test image. Finally, a robust model fitting algorithm estimates homographies between the models and the test image from the matched keypoints. A real-time video shows the detector in action.

This approach is not really new. There are already a few implementations of object detectors available on the web. To mention just a few that I am aware of, there are both pure object detectors and frameworks, such as BLORT, MOPED, RoboEarth, and Object Recognition Kitchen. However, for the robot challenge we required software that can be installed, compiled, and integrated quickly. This was a few months ago, and after a few attempts with one third-party detector or another, we decided to reinvent the wheel and roll our own detector, tpofinder.

The models are obtained from multiple views. The collection process currently requires manual annotation of the region of interest in an image that shows the object of interest. Also, if there are multiple views of the object, the homographies between the views need to be specified through pairs of annotated keypoints on the respective images. Apart from that, tpofinder can be tested out of the box with a webcam. If no object from the sample model database is at hand, then a picture of it will do as well. Here the simplifying assumption of planarity pays off in an unexpected way: to the detector, a flat picture of a planar object looks just like the object itself. During testing, the detector is able to locate more than one object in the image. However, in the implementation, the time required for detection increases with the number of models: tested with five objects on a two-year-old notebook, tpofinder processed 2 frames per second.

In the end, tpofinder is a small object detector that, apart from some Boost libraries, is based purely on OpenCV. In fact, most of the credit belongs to OpenCV, which provides algorithms for feature extraction and matching, as well as functions for robust model fitting. The OpenCV project has made big strides in the last year in several respects: incorporation of recent vision algorithms, cleanup of legacy code, a proper website, improved documentation, and migration to git. Hopefully this progress continues.


a blog by Julius Adorf
