Interpretable Visual Question Answering by Reasoning on Dependency Trees.

Cao, Qingxing; Liang, Xiaodan; Li, Bailin; Lin, Liang
IEEE Transactions on Pattern Analysis and Machine Intelligence, 2019
350
cao2019interpretableieee

Abstract

Collaborative reasoning over each image-question pair is critical but underexplored for interpretable visual question answering (VQA) systems. Although recent works have attempted to use explicit compositional processes to assemble the multiple subtasks embedded in a question, their models rely heavily on annotations or handcrafted rules to obtain valid reasoning processes, leading to either heavy workloads or poor performance on compositional reasoning. In this paper, to better align the image and language domains in diverse and unrestricted cases, we propose a novel neural network model that performs global reasoning on a dependency tree parsed from the question; we therefore name our model the parse-tree-guided reasoning network (PTGRN). The network consists of three collaborative modules: i) an attention module that exploits the local visual evidence for each word parsed from the question, ii) a gated residual composition module that composes the previously mined evidence, and iii) a parse-tree-guided propagation module that passes the mined evidence along the parse tree. PTGRN is thus capable of building an interpretable VQA system that gradually derives image cues by following a question-driven parse-tree reasoning route. Experiments on relational datasets demonstrate the superiority of PTGRN over current state-of-the-art VQA methods.
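
The three-module description above can be made concrete with a toy sketch. The code below is not the authors' implementation: the `Node` class, the dot-product attention, the scalar sigmoid gate, and the summing of child evidence are all assumptions chosen for illustration, and the embeddings and region features are made-up two-dimensional values. It only shows the general shape of the idea, namely evidence gathered per word and passed bottom-up along a dependency tree.

```python
import math
from dataclasses import dataclass, field
from typing import List

Vector = List[float]


@dataclass
class Node:
    """One word of a dependency tree (hypothetical structure, for illustration)."""
    word: str
    embedding: Vector
    children: List["Node"] = field(default_factory=list)


def dot(a: Vector, b: Vector) -> float:
    return sum(x * y for x, y in zip(a, b))


def softmax(scores: List[float]) -> List[float]:
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend(word_vec: Vector, regions: List[Vector]) -> Vector:
    """Attention stand-in: weight image regions by similarity to the word."""
    weights = softmax([dot(word_vec, r) for r in regions])
    dim = len(regions[0])
    return [sum(w * r[i] for w, r in zip(weights, regions)) for i in range(dim)]


def gated_residual(local: Vector, child_sum: Vector) -> Vector:
    """Composition stand-in: a scalar sigmoid gate (an assumption) scales the
    children's evidence before it is added to the local evidence."""
    gate = 1.0 / (1.0 + math.exp(-dot(local, child_sum)))
    return [l + gate * c for l, c in zip(local, child_sum)]


def reason(node: Node, regions: List[Vector], trace: List[str]) -> Vector:
    """Propagation stand-in: post-order traversal, children before parent."""
    child_sum = [0.0] * len(regions[0])
    for child in node.children:
        evidence = reason(child, regions, trace)
        child_sum = [a + b for a, b in zip(child_sum, evidence)]
    local = attend(node.embedding, regions)
    trace.append(node.word)  # record the reasoning route for inspection
    return gated_residual(local, child_sum)


# Toy usage: two image regions and a three-word tree rooted at "holding".
regions = [[1.0, 0.0], [0.0, 1.0]]
tree = Node("holding", [1.0, 1.0],
            [Node("man", [2.0, 0.0]), Node("what", [0.0, 2.0])])
route: List[str] = []
root_evidence = reason(tree, regions, route)
print(route, root_evidence)
```

The recorded `route` is what makes such a scheme inspectable: it lists the words in the order their evidence was composed, so each intermediate attention result can be examined alongside the word that produced it.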

Citation

ID: 53758
Ref Key: cao2019interpretableieee

DOI: 10.1109/TPAMI.2019.2943456