Referring Expression Generation for Visual Data

Referring expressions, which are sentences that pick out a single object within a group, have been studied extensively in the natural language processing (NLP) community and are considered a basic building block for any natural language generation system. However, generating these expressions for visual data requires taking into account many factors that are absent when working only with lists of objects and attributes (as is commonly done in NLP). These include uniquely visual factors such as visual saliency, classifier uncertainty, and spatial arrangements. My research addresses these issues by developing algorithms that take such factors into consideration, and by using human experiments to measure their effectiveness.
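As a rough illustration of the kind of algorithm involved, the sketch below shows a greedy attribute-selection loop in the spirit of classic NLP referring-expression generation (e.g., the Dale and Reiter incremental algorithm), extended so that attributes come with classifier confidences rather than certain labels. This is a minimal, hypothetical sketch, not the method from the papers below; the attribute names, probability scores, and scoring rule are all illustrative assumptions.

```python
# Minimal sketch: greedy referring-expression construction with
# uncertain attributes. NOT the papers' actual method; the attribute
# names, scores, and scoring rule are illustrative assumptions.

def build_expression(target, others, attr_probs, max_attrs=3):
    """Greedily pick attributes that best separate `target` from `others`.

    attr_probs[person][attr] is a classifier's (hypothetical,
    precomputed) probability that the person has the attribute.
    """
    chosen = []
    remaining = list(others)
    attrs = set(attr_probs[target])
    for _ in range(max_attrs):
        best_attr, best_score = None, 0.0
        for a in attrs - set(chosen):
            # Expected number of distractors ruled out, weighted by the
            # confidence that the target actually has the attribute.
            ruled_out = sum(1.0 - attr_probs[o][a] for o in remaining)
            score = attr_probs[target][a] * ruled_out
            if score > best_score:
                best_attr, best_score = a, score
        if best_attr is None:
            break
        chosen.append(best_attr)
        # Keep only distractors the attribute plausibly still matches.
        remaining = [o for o in remaining if attr_probs[o][best_attr] > 0.5]
        if not remaining:
            break
    return chosen

# Toy usage: refer to person "A" among three people.
probs = {
    "A": {"wearing glasses": 0.9, "smiling": 0.6},
    "B": {"wearing glasses": 0.1, "smiling": 0.7},
    "C": {"wearing glasses": 0.2, "smiling": 0.8},
}
print(build_expression("A", ["B", "C"], probs))  # -> ['wearing glasses']
```

Weighting each attribute's discriminative power by the classifier's confidence captures the basic trade-off the prose describes: an attribute that perfectly separates the target from the distractors is useless if the vision system is unsure the target actually has it.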

Fig 1. Two examples of our algorithm constructing referring expressions for the people marked with the red rectangles in the images.

Publications:

A. Sadovnik, Y. Chiu, N. Snavely, S. Edelman and T. Chen. "Image Description with a Goal: Building Efficient Discriminating Expressions for Images," IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2012.

A. Sadovnik, A. Gallagher and T. Chen. "It's Not Polite To Point: Describing People With Uncertain Attributes," IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2013.

A. Sadovnik, A. Gallagher and T. Chen. "Not Everybody's Special: Using Neighbors in Referring Expressions with Uncertain Attributes," The V&L Net Workshop on Language for Vision, IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2013.
