DL Feature Matching¶
This module introduces different algorithms for matching features across pairs of images with Deep Learning methods, such as SuperGlue, LOFTR or LightGlue.
Feature matching consists of extracting corresponding points between two images of the same scene/object. This is a fundamental step in many computer vision applications, such as object detection, tracking, and motion estimation, as well as in the photogrammetric process of image-based 3D reconstruction.
Introduction¶
First, let's load the required modules.
Additionally, even though this step is not mandatory, it is recommended to set up a logger to see the output of the matching process. If no logger is set up, the output of the process is suppressed.
from icepy4d.core import Image
from icepy4d.utils import setup_logger
from icepy4d.matching import (SuperGlueMatcher, LOFTRMatcher, LightGlueMatcher, Quality, TileSelection, GeometricVerification)
setup_logger()
Jupyter environment detected. Enabling Open3D WebVisualizer.
[Open3D INFO] WebRTC GUI backend enabled.
[Open3D INFO] WebRTCWindowSystem: HTTP handshake server disabled.
We can load the images as numpy arrays.
We will use the Image class implemented in ICEpy4D, which creates an Image instance from the path to the image file, as in Image('path_to_image').
Creating the Image instance reads the EXIF data of the image and stores them in the Image object. The actual pixel values are read only when the Image.value property is accessed.
Alternatively, one can use the OpenCV imread function to read the image as a numpy array. Pay attention to the channel order: it should be RGB, while OpenCV uses BGR.
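The channel-order caveat can be illustrated with a minimal sketch. It assumes the image is already available as an H x W x 3 numpy array in BGR order, as returned by cv2.imread. The helper name bgr_to_rgb is hypothetical and not part of ICEpy4D, and a synthetic one-pixel array stands in for a real image so that the snippet runs without OpenCV.

```python
import numpy as np


def bgr_to_rgb(image_bgr: np.ndarray) -> np.ndarray:
    """Reverse the channel axis of an H x W x 3 array (BGR -> RGB)."""
    if image_bgr.ndim != 3 or image_bgr.shape[2] != 3:
        raise ValueError("expected an array of shape (H, W, 3)")
    # Slicing with ::-1 flips the last axis; the copy makes it contiguous.
    return np.ascontiguousarray(image_bgr[..., ::-1])


# Stand-in for the array returned by cv2.imread(path): one pure-blue pixel,
# which OpenCV stores as (B, G, R) = (255, 0, 0).
fake_bgr = np.array([[[255, 0, 0]]], dtype=np.uint8)
image_rgb = bgr_to_rgb(fake_bgr)
print(image_rgb[0, 0])  # blue now sits in the last channel
```

With OpenCV installed, cv2.cvtColor(image, cv2.COLOR_BGR2RGB) performs the same conversion.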
image0 = Image('../data/img/p1/IMG_2650.jpg').value
image1 = Image('../data/img/p2/IMG_1125.jpg').value
print(f"Image data-type: {type(image0)}")
print(f"Image0 shape: {image0.shape}")
print(f"Image1 shape: {image1.shape}")
Image data-type: <class 'numpy.ndarray'>
Image0 shape: (4008, 6012, 3)
Image1 shape: (4008, 6012, 3)
The Matcher class¶
All the matching algorithms in ICEpy4D are implemented as classes, which can be initialized by passing a dictionary of parameters as input.
The actual matching is then run by calling the match method of the class instance.
Some parameters are common to all the matching algorithms, such as the Tiling parameters, which are used to split the image into tiles to reduce memory usage, and the Geometric Verification parameters, which are used to filter outliers out of the matching results.
The common parameters are presented here, while the specific parameters for each algorithm are presented in the corresponding section.
When running the matching, additional parameters can be given as arguments to the match method to define the matching behavior. The parameters are the following:
- image0: the first image to be matched.
- image1: the second image to be matched.
- quality: defines the resize factor for the input images. Possible values are "highest", "high", "medium" and "low". With "high", images are matched at full resolution. With "highest", images are up-sampled by a factor of 2. With "medium" and "low", images are down-sampled by a factor of 2 and 4, respectively. The default value is "high".
- tile_selection: tile selection approach. Possible values are TileSelection.NONE, TileSelection.EXHAUSTIVE, TileSelection.GRID or TileSelection.PRESELECTION. Refer to the "Tile Selection" section below for more information. The default value is TileSelection.PRESELECTION.
- grid: if tile_selection is not TileSelection.NONE, this parameter defines the grid size.
- overlap: if tile_selection is not TileSelection.NONE, this parameter defines the overlap between tiles.
- do_viz_matches: if True, the matches are visualized. Default value is False.
- do_viz_tiles: if True, the tiles are visualized. Default value is False.
- save_dir: if not None, the matches are saved in the given directory. Default value is None.
- geometric_verification: defines the geometric verification approach.
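As a rough illustration of the quality setting described in the list above, the following sketch maps each quality level to the scale factor stated there and computes the image size that results. The names QUALITY_SCALE and resized_shape are hypothetical and only mirror the text; they are not taken from the ICEpy4D source.

```python
# Scale factors as described in the parameter list (assumed mapping).
QUALITY_SCALE = {
    "highest": 2.0,  # up-sampled by a factor of 2
    "high": 1.0,     # full resolution
    "medium": 0.5,   # down-sampled by a factor of 2
    "low": 0.25,     # down-sampled by a factor of 4
}


def resized_shape(shape, quality="high"):
    """Return (height, width) after applying the quality scale factor."""
    height, width = shape[:2]
    scale = QUALITY_SCALE[quality]
    return (round(height * scale), round(width * scale))


# Shape of the example images loaded earlier in this page.
full_shape = (4008, 6012, 3)
for level in QUALITY_SCALE:
    print(level, resized_shape(full_shape, level))
```

For the 4008 x 6012 px example images, "low" gives 1002 x 1503 px, which shows why lowering the quality is an effective way to reduce GPU memory usage.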
Tile Selection¶
To guarantee the highest collimation accuracy, by default the matching is performed on full-resolution images. However, due to the limited memory capacity of mid-range GPUs, high-resolution images captured by DSLR cameras may not fit into GPU memory. To overcome this limitation, ICEpy4D divides the images into smaller tiles with a maximum dimension of 2000 px, computed over a regular grid. The tile selection can be performed in four different ways:
- TileSelection.NONE: Images are matched as a whole in just one step. No tiling is performed.
- TileSelection.EXHAUSTIVE: All the tiles in the first image are matched with all the tiles in the second image. This approach is very computationally demanding, as the tile pairs are all the possible combinations of tiles from the two images, and the total number of pairs grows quadratically with the number of tiles. Additionally, several spurious matches may be found in tiles that do not overlap in the two images.
- TileSelection.GRID: Tile pairs are selected only based on the position of each tile in the grid, i.e., tile 1 in imageA is matched with tile 1 in imageB, tile 2 in imageA is matched with tile 2 in imageB, and so on. This approach is less computationally demanding than the exhaustive one, but it is suitable only for images that are well aligned along a stripe with regular viewing geometry.
- TileSelection.PRESELECTION: This is the only actual 'preselection' of the tiles, as the process is carried out in two steps. First, a matching is performed on downsampled images. Subsequently, the full-resolution images are subdivided into regular tiles, and only the tiles that have corresponding features in the low-resolution images are selected as candidates for a second matching step.
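The difference between the three tiled strategies can be sketched by enumerating the tile pairs each one would process. This is a simplified illustration, not the ICEpy4D implementation: the function names, the row-major tile numbering and the min_matches argument are assumptions made for the example, and the coarse matches are made-up coordinates.

```python
from collections import Counter
from itertools import product


def exhaustive_pairs(n_tiles):
    """Every tile of image 0 paired with every tile of image 1."""
    return list(product(range(n_tiles), repeat=2))


def grid_pairs(n_tiles):
    """Only tiles at the same grid position are paired."""
    return [(i, i) for i in range(n_tiles)]


def tile_index(pt, image_size, grid):
    """Row-major index of the tile containing pt=(x, y); grid=(cols, rows)."""
    (x, y), (width, height), (cols, rows) = pt, image_size, grid
    col = min(int(x * cols / width), cols - 1)
    row = min(int(y * rows / height), rows - 1)
    return row * cols + col


def preselection_pairs(coarse_matches, image_size, grid, min_matches=1):
    """Keep tile pairs supported by at least min_matches coarse matches."""
    counts = Counter(
        (tile_index(p0, image_size, grid), tile_index(p1, image_size, grid))
        for p0, p1 in coarse_matches
    )
    return sorted(pair for pair, n in counts.items() if n >= min_matches)


image_size, grid = (6012, 4008), (3, 2)  # width x height, cols x rows
# Hypothetical matches from the low-resolution pass, in full-res pixels.
coarse = [
    ((100, 100), (2500, 300)),
    ((150, 200), (2600, 350)),
    ((5000, 3000), (5500, 3500)),
]
print(len(exhaustive_pairs(6)), len(grid_pairs(6)))
print(preselection_pairs(coarse, image_size, grid))
```

With a 3 x 2 grid, the exhaustive strategy processes 36 pairs and the grid strategy 6, while preselection processes only the pairs that the coarse matches point to.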
When a tiling approach is chosen, the tile grid must be defined by the grid argument. This is a list of integers that defines the number of tiles along the x and y directions (i.e., number of columns and number of rows). For example, grid=[3,2] defines a grid with 3 columns and 2 rows.
Additionally, the overlap between different tiles can be defined by the overlap argument. This is an integer that defines the number of pixels of overlap between adjacent tiles. For example, overlap=200 defines an overlap of 200 pixels between adjacent tiles. The overlap helps to avoid missing matches at the tile boundaries.
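To make the grid and overlap arguments concrete, the sketch below computes the pixel bounds of each tile for a given grid. It is only one possible interpretation: it assumes that each tile is extended by half of the overlap on every interior side, so that two adjacent tiles share exactly `overlap` pixels, and that tiles are numbered row by row. The function tile_bounds is hypothetical and is not the ICEpy4D tiler.

```python
def tile_bounds(image_size, grid, overlap=0):
    """Return (x0, y0, x1, y1) for each tile, in row-major order.

    image_size is (width, height) in pixels and grid is (cols, rows).
    """
    width, height = image_size
    cols, rows = grid
    tile_w, tile_h = width // cols, height // rows
    half = overlap // 2
    bounds = []
    for row in range(rows):
        for col in range(cols):
            # Extend towards the neighbours, but never past the image border.
            x0 = max(col * tile_w - half, 0)
            y0 = max(row * tile_h - half, 0)
            # The last column/row absorbs any remainder of the division.
            x1 = width if col == cols - 1 else (col + 1) * tile_w + half
            y1 = height if row == rows - 1 else (row + 1) * tile_h + half
            bounds.append((x0, y0, x1, y1))
    return bounds


tiles = tile_bounds((6012, 4008), grid=(3, 2), overlap=200)
for index, box in enumerate(tiles):
    print(index, box)
```

Under these assumptions, the first two tiles of the top row span x = 0..2104 and x = 1904..4108, so they share a 200 px wide strip.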
The following figure shows the tile preselection process. The tiles selected for the second matching step are highlighted in green.
Geometric Verification¶
Geometric verification of the matches is performed with Pydegensac (Mishkin et al., 2015), which allows for a robust estimation of the fundamental matrix. The maximum re-projection error to accept a match is set to 1.5 px by default, but it can be changed by the user. The successfully matched features, together with their descriptors and scores, are saved as a Features object for each camera and stored in the current Epoch object.
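The inlier test behind this kind of verification can be sketched as follows. The example does not estimate the fundamental matrix (that is the job of Pydegensac); it assumes a matrix F is already given and that the threshold is a distance in pixels between a point in the second image and the epipolar line of its match. The function names and the toy matrix, which corresponds to a purely horizontal camera shift, are assumptions made for illustration.

```python
import math


def epipolar_distance(F, p0, p1):
    """Distance (px) of p1 from the epipolar line F @ p0 in image 1."""
    x0 = (p0[0], p0[1], 1.0)
    x1 = (p1[0], p1[1], 1.0)
    # Epipolar line l = F @ x0, written as a*x + b*y + c = 0.
    line = [sum(F[i][j] * x0[j] for j in range(3)) for i in range(3)]
    numerator = abs(sum(line[i] * x1[i] for i in range(3)))
    return numerator / math.hypot(line[0], line[1])


def filter_inliers(F, matches, threshold=1.5):
    """Keep the matches whose epipolar distance is within the threshold."""
    return [m for m in matches if epipolar_distance(F, *m) <= threshold]


# Toy fundamental matrix of a horizontally shifted camera pair: matched
# points must lie on the same image row, so the distance is |y0 - y1|.
F = [[0.0, 0.0, 0.0],
     [0.0, 0.0, -1.0],
     [0.0, 1.0, 0.0]]
matches = [
    ((10.0, 20.0), (50.0, 20.5)),  # 0.5 px off the epipolar line
    ((10.0, 20.0), (50.0, 30.0)),  # 10 px off the epipolar line
]
inliers = filter_inliers(F, matches, threshold=1.5)
print(len(inliers), "of", len(matches), "matches kept")
```

A robust estimator repeats this test for many candidate matrices and keeps the one with the largest inlier set.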
SuperGlue matching¶
SuperGlue is a Deep Learning-based feature matching algorithm that uses a SuperPoint keypoint detector and a SuperGlue feature matcher. You can find some more information on SuperGlue in the original paper and in the original repository.
For running the matching with SuperGlue, a new SuperGlueMatcher object must be initialized. A set of additional parameters can be set when initializing the SuperGlueMatcher object. The parameters are given as a dictionary (see the documentation of the class for more details).
The configuration dictionary may contain the following keys:
- "weights": defines the type of the weights used for SuperGlue inference. It can be either "indoor" or "outdoor". The default value is "outdoor".
- "keypoint_threshold": threshold for the SuperPoint keypoint detector. The default value is 0.001.
- "max_keypoints": maximum number of keypoints to be detected by SuperPoint. If -1, no limit to keypoint detection is set. The default value is -1.
- "match_threshold": threshold for the SuperGlue feature matcher. Default value is 0.3.
- "force_cpu": if True, SuperGlue will run on CPU. Default value is False.
- "nms_radius": radius for non-maximum suppression. Default value is 3.
- "sinkhorn_iterations": number of iterations for the Sinkhorn algorithm. Default value is 20.
If the configuration dictionary is not given, the default values are used.
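The fallback to default values can be pictured as a dictionary merge. The sketch below is only an illustration of the idea: the default values are copied from the list above, while the helper build_config and its strict check on unknown keys are assumptions, not the way SuperGlueMatcher necessarily handles its configuration.

```python
# Default values as documented in the list above.
DEFAULT_CFG = {
    "weights": "outdoor",
    "keypoint_threshold": 0.001,
    "max_keypoints": -1,
    "match_threshold": 0.3,
    "force_cpu": False,
    "nms_radius": 3,
    "sinkhorn_iterations": 20,
}


def build_config(user_cfg=None):
    """Return the defaults overridden by the user-supplied keys."""
    user_cfg = user_cfg or {}
    unknown = set(user_cfg) - set(DEFAULT_CFG)
    if unknown:
        # Hypothetical strictness: catch misspelled keys early.
        raise KeyError(f"unknown configuration keys: {sorted(unknown)}")
    return {**DEFAULT_CFG, **user_cfg}


cfg = build_config({"match_threshold": 0.2, "max_keypoints": 8192})
print(cfg["match_threshold"], cfg["max_keypoints"], cfg["nms_radius"])
```

Keys that are not given, such as "nms_radius" here, keep their documented defaults.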
matching_cfg = {
    "weights": "outdoor",
    "keypoint_threshold": 0.0001,
    "max_keypoints": 8192,
    "match_threshold": 0.2,
    "force_cpu": False,
}
matcher = SuperGlueMatcher(matching_cfg)
matcher.match(
    image0,
    image1,
    quality=Quality.HIGH,
    tile_selection=TileSelection.PRESELECTION,
    grid=[3, 2],
    overlap=200,
    min_matches_per_tile=5,
    do_viz_tiles=False,
    save_dir="./matches/superglue_matches",
    geometric_verification=GeometricVerification.PYDEGENSAC,
    threshold=1.5,
)
2023-10-03 09:09:38 | [INFO ] Running inference on device cuda
Loaded SuperPoint model
Loaded SuperGlue model ("outdoor" weights)
2023-10-03 09:09:39 | [INFO ] Matching by tiles...
2023-10-03 09:09:39 | [INFO ] Matching tiles by preselection tile selection
2023-10-03 09:09:39 | [INFO ] - Matching tile pair (0, 1)
2023-10-03 09:09:42 | [INFO ] - Matching tile pair (2, 1)
2023-10-03 09:09:45 | [INFO ] - Matching tile pair (2, 2)
2023-10-03 09:09:48 | [INFO ] - Matching tile pair (2, 4)
2023-10-03 09:09:51 | [INFO ] - Matching tile pair (3, 2)
2023-10-03 09:09:53 | [INFO ] - Matching tile pair (3, 3)
2023-10-03 09:09:56 | [INFO ] - Matching tile pair (3, 4)
2023-10-03 09:09:59 | [INFO ] - Matching tile pair (3, 5)
2023-10-03 09:10:02 | [INFO ] - Matching tile pair (4, 4)
2023-10-03 09:10:04 | [INFO ] - Matching tile pair (5, 4)
2023-10-03 09:10:07 | [INFO ] - Matching tile pair (5, 5)
2023-10-03 09:10:10 | [INFO ] Restoring full image coordinates of matches...
2023-10-03 09:10:10 | [INFO ] Matching by tile completed.
2023-10-03 09:10:10 | [INFO ] Matching done!
2023-10-03 09:10:10 | [INFO ] Performing geometric verification...
2023-10-03 09:10:10 | [INFO ] Pydegensac found 1150 inliers (50.46%)
2023-10-03 09:10:10 | [INFO ] Geometric verification done.
2023-10-03 09:10:10 | [INFO ] [Timer] | [Matching] preselection=0.864, matching=30.168, geometric_verification=0.022, Function match took 31.0577 seconds
True
The matches with their descriptors and scores are saved in the matcher object. All the results are saved as numpy arrays with float32 dtype. They can be accessed as follows:
# Get matched keypoints
mktps0 = matcher.mkpts0
mktps1 = matcher.mkpts1
print(f"Number of matches: {len(mktps0)}")
print(f"Matches on image0 (first 5):\n{mktps0[0:5]}")
print(f"Matches on image1 (first 5):\n{mktps1[0:5]}")
# Get descriptors
descs0 = matcher.descriptors0
descs1 = matcher.descriptors1
print(f"Descriptors shape: {descs0.shape}")
# Get scores of each matched keypoint
scores0 = matcher.scores0
scores1 = matcher.scores1
print(f"Scores shape: {scores0.shape}")
# Matching confidence
confidence = matcher.mconf
print(f"Confidence shape: {confidence.shape}")
print(f"Confidence (first 5): {confidence[0:5]}")
Number of matches: 1150
Matches on image0 (first 5):
[[   8. 1356.]
 [   8. 1383.]
 [   8. 1384.]
 [  10. 1372.]
 [  11. 1313.]]
Matches on image1 (first 5):
[[5342.   98.]
 [5335.  137.]
 [5335.  137.]
 [5341.  122.]
 [5449.    8.]]
Descriptors shape: (256, 1150)
Scores shape: (1150,)
Confidence shape: (1150,)
Confidence (first 5): [0.05422363 0.1714691 0.14908865 0.14837408 0.1697674 ]
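Since the results are plain numpy arrays, they can be post-processed with boolean masks. As a small sketch, the snippet below keeps only the matches whose confidence exceeds a chosen value. It uses the first five matches printed above as stand-in data; the helper filter_by_confidence and the 0.15 cut-off are arbitrary choices for illustration, not part of ICEpy4D.

```python
import numpy as np


def filter_by_confidence(mkpts0, mkpts1, conf, min_conf):
    """Return the matches (and confidences) with conf >= min_conf."""
    keep = conf >= min_conf  # boolean mask, one entry per match
    return mkpts0[keep], mkpts1[keep], conf[keep]


# First five matches from the output above, used as stand-in data.
pts0 = np.array([[8, 1356], [8, 1383], [8, 1384], [10, 1372], [11, 1313]],
                dtype=np.float32)
pts1 = np.array([[5342, 98], [5335, 137], [5335, 137], [5341, 122], [5449, 8]],
                dtype=np.float32)
conf = np.array([0.05422363, 0.1714691, 0.14908865, 0.14837408, 0.1697674],
                dtype=np.float32)

good0, good1, good_conf = filter_by_confidence(pts0, pts1, conf, min_conf=0.15)
print(f"Kept {len(good0)} of {len(pts0)} matches")
```

The same mask is applied to both point arrays, so the row-by-row correspondence between image0 and image1 is preserved.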
You can also plot the matches by using the plot_matches function of the ICEpy4D visualization module.
from icepy4d.visualization import plot_matches
out = plot_matches(image0=image0, image1=image1, pts0=mktps0, pts1=mktps1, path="./matches/superglue_matches.jpg")
LightGlue matching¶
LightGlue is a Deep Learning-based feature matching algorithm that uses either SuperPoint or DISK as keypoint detector. It is a recent evolution of the SuperGlue matcher, developed by the Computer Vision Group of ETH Zurich. You can find more information on LightGlue in the original paper and in the original repository.
The process of running the matching with LightGlue is very similar to that of SuperGlue. You just need to initialize a LightGlueMatcher object and run the matching.
matcher = LightGlueMatcher()
matcher.match(
    image0,
    image1,
    quality=Quality.HIGH,
    tile_selection=TileSelection.PRESELECTION,
    grid=[2, 3],
    overlap=200,
    origin=[0, 0],
    do_viz_matches=True,
    do_viz_tiles=True,
    min_matches_per_tile=3,
    max_keypoints=10240,
    save_dir="./matches/LIGHTGLUE",
    geometric_verification=GeometricVerification.PYDEGENSAC,
    threshold=2,
    confidence=0.9999,
)
2023-10-03 09:10:13 | [INFO ] Running inference on device cuda
2023-10-03 09:10:13 | [INFO ] Matching by tiles...
2023-10-03 09:10:13 | [INFO ] Matching tiles by preselection tile selection
2023-10-03 09:10:14 | [INFO ] - Matching tile pair (0, 2)
2023-10-03 09:10:15 | [INFO ] - Matching tile pair (1, 1)
2023-10-03 09:10:16 | [INFO ] - Matching tile pair (1, 4)
2023-10-03 09:10:18 | [INFO ] - Matching tile pair (2, 4)
2023-10-03 09:10:19 | [INFO ] - Matching tile pair (2, 5)
2023-10-03 09:10:21 | [INFO ] - Matching tile pair (3, 3)
2023-10-03 09:10:22 | [INFO ] - Matching tile pair (4, 3)
2023-10-03 09:10:24 | [INFO ] - Matching tile pair (4, 4)
2023-10-03 09:10:25 | [INFO ] - Matching tile pair (5, 4)
2023-10-03 09:10:26 | [INFO ] - Matching tile pair (5, 5)
2023-10-03 09:10:28 | [INFO ] Restoring full image coordinates of matches...
2023-10-03 09:10:28 | [INFO ] Matching by tile completed.
2023-10-03 09:10:28 | [INFO ] Matching done!
2023-10-03 09:10:28 | [INFO ] Performing geometric verification...
2023-10-03 09:10:28 | [INFO ] Pydegensac found 1763 inliers (51.66%)
2023-10-03 09:10:28 | [INFO ] Geometric verification done.
2023-10-03 09:10:29 | [INFO ] [Timer] | [Matching] preselection=0.621, matching=14.147, geometric_verification=0.019, Function match took 16.0027 seconds
True
LOFTR matching¶
The LOFTR matcher shares the same interface as the SuperGlue matcher, therefore the same parameters can be used for the match method.
The only difference is in the matcher initialization, which takes no parameters, as the default values are taken from Kornia (see the documentation of the class for more details).
The matched points can be retrieved as before, but the descriptors are not saved in the matcher object, as they are not computed by LOFTR.
matcher = LOFTRMatcher()
matcher.match(
    image0,
    image1,
    quality=Quality.HIGH,
    tile_selection=TileSelection.PRESELECTION,
    grid=[5, 4],
    overlap=50,
    save_dir="./matches/LOFTR_matches",
    geometric_verification=GeometricVerification.PYDEGENSAC,
    threshold=1.5,
)
mktps0 = matcher.mkpts0
mktps1 = matcher.mkpts1
print(f"Number of matches: {len(mktps0)}")
2023-10-03 09:10:29 | [INFO ] Running inference on device cuda
2023-10-03 09:10:29 | [INFO ] Matching by tiles...
2023-10-03 09:10:29 | [INFO ] Matching tiles by preselection tile selection
2023-10-03 09:10:30 | [INFO ] - Matching tile pair (1, 1)
2023-10-03 09:10:30 | [INFO ] - Matching tile pair (4, 3)
2023-10-03 09:10:31 | [INFO ] - Matching tile pair (5, 1)
2023-10-03 09:10:32 | [INFO ] - Matching tile pair (5, 2)
2023-10-03 09:10:33 | [INFO ] - Matching tile pair (8, 8)
2023-10-03 09:10:34 | [INFO ] - Matching tile pair (8, 9)
2023-10-03 09:10:35 | [INFO ] - Matching tile pair (9, 9)
2023-10-03 09:10:36 | [INFO ] - Matching tile pair (9, 12)
2023-10-03 09:10:37 | [INFO ] - Matching tile pair (9, 13)
2023-10-03 09:10:37 | [INFO ] - Matching tile pair (10, 10)
2023-10-03 09:10:38 | [INFO ] - Matching tile pair (10, 13)
2023-10-03 09:10:39 | [INFO ] - Matching tile pair (10, 14)
2023-10-03 09:10:40 | [INFO ] - Matching tile pair (11, 10)
2023-10-03 09:10:41 | [INFO ] - Matching tile pair (11, 14)
2023-10-03 09:10:42 | [INFO ] - Matching tile pair (11, 15)
2023-10-03 09:10:43 | [INFO ] - Matching tile pair (12, 16)
2023-10-03 09:10:44 | [INFO ] - Matching tile pair (13, 12)
2023-10-03 09:10:44 | [INFO ] - Matching tile pair (13, 13)
2023-10-03 09:10:45 | [INFO ] - Matching tile pair (13, 17)
2023-10-03 09:10:46 | [INFO ] - Matching tile pair (14, 13)
2023-10-03 09:10:47 | [INFO ] - Matching tile pair (14, 17)
2023-10-03 09:10:48 | [INFO ] - Matching tile pair (15, 14)
2023-10-03 09:10:49 | [INFO ] - Matching tile pair (15, 18)
2023-10-03 09:10:50 | [INFO ] - Matching tile pair (17, 17)
2023-10-03 09:10:51 | [INFO ] - Matching tile pair (18, 17)
2023-10-03 09:10:52 | [INFO ] Restoring full image coordinates of matches...
2023-10-03 09:10:52 | [INFO ] Matching by tile completed.
2023-10-03 09:10:52 | [INFO ] Matching done!
2023-10-03 09:10:52 | [INFO ] Performing geometric verification...
2023-10-03 09:10:53 | [INFO ] Pydegensac found 3393 inliers (17.91%)
2023-10-03 09:10:53 | [INFO ] Geometric verification done.
2023-10-03 09:10:53 | [INFO ] [Timer] | [Matching] preselection=0.309, matching=22.242, geometric_verification=0.815, Function match took 23.3744 seconds
Number of matches: 3393
# Clean up result folders
import os
import shutil

# Remove the folders created by the examples above, if they exist
if os.path.exists("./matches"):
    shutil.rmtree("./matches")
if os.path.exists("./logs"):
    shutil.rmtree("./logs")