CDeC-Net: Composite Deformable Cascade Network for Table Detection in Document Images

Illustration of complex table detection results. Blue and green rectangles correspond to ground truth and predicted bounding boxes using CDeC-Net, respectively.

Abstract

The proposed network is a multistage extension of Mask R-CNN with a dual backbone that uses deformable convolution. It is designed to detect tables of varying scale with high detection accuracy at higher IoU thresholds.
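To make "higher IoU threshold" concrete, the following minimal sketch computes the intersection over union (IoU) of a predicted box and a ground-truth box, then checks it against several thresholds. The function name box_iou, the (x1, y1, x2, y2) box format, and the example coordinates are assumptions chosen for illustration; they are not taken from the paper.

```python
def box_iou(box_a, box_b):
    """IoU of two axis-aligned boxes given as (x1, y1, x2, y2)."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    # Width and height of the overlap region (zero if the boxes are disjoint).
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    return inter / union if union > 0 else 0.0


# Hypothetical boxes: the prediction is shifted one unit to the right.
ground_truth = (0, 0, 10, 10)
prediction = (1, 0, 11, 10)
score = box_iou(ground_truth, prediction)

for threshold in (0.5, 0.8, 0.9):
    # A detection counts as correct only if its IoU reaches the threshold.
    print(f"IoU >= {threshold}: {score >= threshold}")
```

Here the IoU is 90/110, roughly 0.82, so the same prediction counts as correct at thresholds 0.5 and 0.8 but as a miss at 0.9. A stricter threshold therefore rewards tighter localization.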

Our solution has three important properties:

CDEC-NET: COMPOSITE DEFORMABLE CASCADE NETWORK

Cascade Mask R-CNN

Cascade R-CNN

Composite Backbone

We use a dual-backbone architecture that creates composite connections between the parallel stages of two adjacent ResNeXt-101 backbones (one is called the assistant backbone and the other the lead backbone).
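The data flow of such a dual backbone can be sketched in a few lines. This is a toy illustration, not the authors' implementation. It assumes a CBNet-style composition in which the output of assistant stage l passes through a composite connection and is added to the input of lead stage l. Features are plain floats here, whereas a real network uses feature maps and a learned composite connection. All names (run_backbone, run_composite, the toy stages) are hypothetical.

```python
from typing import Callable, List, Sequence

Stage = Callable[[float], float]


def run_backbone(stages: Sequence[Stage], x: float) -> List[float]:
    """Run a plain backbone and collect the output of every stage."""
    outputs = []
    for stage in stages:
        x = stage(x)
        outputs.append(x)
    return outputs


def run_composite(assistant: Sequence[Stage], lead: Sequence[Stage],
                  composite: Sequence[Stage], x: float) -> List[float]:
    """Run the assistant first, then feed its stage outputs into the lead."""
    assist_feats = run_backbone(assistant, x)
    lead_feats = []
    h = x
    for l, stage in enumerate(lead):
        if l > 0:
            # Composite connection (assumed form): add the transformed output
            # of the assistant's parallel stage to the lead stage's input.
            h = h + composite[l](assist_feats[l])
        h = stage(h)
        lead_feats.append(h)
    return lead_feats


# Toy stages standing in for ResNeXt-101 stages (hypothetical).
toy_stages = [lambda v: v + 1.0, lambda v: v * 2.0]
identity = [lambda v: v, lambda v: v]

lead_outputs = run_composite(toy_stages, toy_stages, identity, 1.0)
print(lead_outputs)
```

Only the lead backbone's stage outputs are returned, reflecting the assumption that the detection head consumes the lead's features while the assistant only enriches them.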

CBNetV2: A Composite Backbone Network Architecture for Object Detection

Deformable Convolution

We replace the conventional convolution, which has a fixed receptive field, with deformable convolution [22] in both backbones of our dual-backbone architecture. The sampling grid is deformable because each grid point can be moved by a learnable offset.
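The effect of the offsets can be illustrated for a single output location. The sketch below assumes the standard deformable convolution formulation: each of the nine points of a 3x3 grid is shifted by its own offset, and the input is read at the resulting fractional position with bilinear interpolation. In a real layer the offsets are predicted by an extra convolution and learned; here they are passed in by hand. The function names and the toy image are hypothetical.

```python
import math


def bilinear(img, y, x):
    """Bilinearly interpolate img at fractional (y, x); outside counts as zero."""
    h, w = len(img), len(img[0])
    y0, x0 = math.floor(y), math.floor(x)
    val = 0.0
    for yy in (y0, y0 + 1):
        for xx in (x0, x0 + 1):
            if 0 <= yy < h and 0 <= xx < w:
                weight = (1 - abs(y - yy)) * (1 - abs(x - xx))
                val += weight * img[yy][xx]
    return val


def deformable_conv_at(img, weights, offsets, cy, cx):
    """3x3 deformable convolution response at the output location (cy, cx).

    weights: 9 kernel weights in row-major order.
    offsets: 9 (dy, dx) pairs, one per grid point (learned in a real layer).
    """
    grid = [(dy, dx) for dy in (-1, 0, 1) for dx in (-1, 0, 1)]
    out = 0.0
    for k, (dy, dx) in enumerate(grid):
        oy, ox = offsets[k]
        # Regular grid position plus the per-point offset.
        out += weights[k] * bilinear(img, cy + dy + oy, cx + dx + ox)
    return out


# Toy 4x4 image whose value grows linearly with position (hypothetical data).
image = [[float(r * 4 + c) for c in range(4)] for r in range(4)]
ones = [1.0] * 9

regular = deformable_conv_at(image, ones, [(0.0, 0.0)] * 9, 1, 1)
shifted = deformable_conv_at(image, ones, [(0.5, 0.5)] * 9, 1, 1)
print(regular, shifted)
```

With all offsets set to zero the function reduces to an ordinary 3x3 convolution at that location; nonzero offsets move the sampling points, which is what lets the receptive field adapt to tables of different shapes and scales.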

Deformable Convolution

EXPERIMENTS