Computer Vision with Open Vocabulary

Introduction

  1. What do we mean by open-world vision?
    • Referring expression segmentation
    • Referring expression matting
    • Open-vocabulary panoptic segmentation
  2. How are these systems trained?
  3. What are the popular models and their current limitations?
  4. What are the popular benchmarks?
  5. How do we accurately measure the performance of such systems?
  6. What are the novel applications enabled by these systems?

References

  1. https://www.cs.cmu.edu/~shuk/open-world-vision.html
  2. https://github.com/nvlabs/odise
  3. https://github.com/JizhiziLi/RIM
  4. https://github.com/isl-org/lang-seg
  5. https://github.com/ngthanhtin/owlvit_segment_anything