Wednesday, November 16, 2011

Game with A Purpose

Two games are designed to collect semantic tags.

The first game is analogous to the Peek-A-Boom game:

Inverse-Problem game
Boom:
Given an image with a city name (probably from Panoramio)
Asked to point out some predefined attributes (tree, building, etc.) for Peek
Peek
Guess the location of the city using as few of the revealed attributes as possible
Effect:
Collects the locations of the most representative objects for the place



The second game is analogous to the Herd-It game:

Given one query image (with or without ground truth location) and multiple players
Game:
Ask which of k other images best matches the query image.
Effect:
Collects images that players judge similar to the query image and, at the same time, outputs a possible location for the query image
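
One way the players' answers could be combined is a simple majority vote. The sketch below is only an illustration under that assumption; the function name, the candidate ids, and the location labels are hypothetical and not part of the game design above.

```python
from collections import Counter


def aggregate_votes(votes, candidate_locations):
    """Return (winning candidate id, its location label, agreement ratio).

    votes: list of candidate ids, one per player (hypothetical format).
    candidate_locations: dict mapping candidate id -> location label,
        or None when the candidate has no known location.
    """
    if not votes:
        raise ValueError("at least one vote is required")
    counts = Counter(votes)
    winner, n = counts.most_common(1)[0]
    return winner, candidate_locations.get(winner), n / len(votes)


# Hypothetical round: five players choose among three candidate images.
votes = ["img_b", "img_a", "img_b", "img_c", "img_b"]
locations = {
    "img_a": "Paris, France",
    "img_b": "Madrid, Spain",
    "img_c": "Lisbon, Portugal",
}
winner, location, agreement = aggregate_votes(votes, locations)
print(winner, location, agreement)
```

The agreement ratio gives a rough confidence for the collected similarity tag; a low ratio could mean that none of the k images is a good match.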



Social Mobilization + GWAP - VFYW breaker:

Crowdsource the answer to the VFYW contest by having people play a game (such as the second game)
Collect useful tags at the same time
The highest-scoring player earns the right to own the answer submitted to the VFYW contest




Thursday, October 27, 2011

Geometric Feature Pruning

Geometric Feature Pruning uses the semantic tags on maps to form a feature based on the geometric relationships between tags, which reduces the search space of the image localization problem.

Semantic tags:
The Google Maps API allows us to extract semantic tags such as
road
man-made buildings, etc.
Google Style Map Spec

Geometric Feature:
In the Madrid example, the angle between roads is used.

Experimental Setup:
The Madrid example is used to verify the idea.

The full search space is a rectangle around the ground truth location.

The search space is 425 m in width and height, which is about 0.03 of the area of the city (m^2/m^2).

Semantic map:
(a) road
(b) building and space

We use 9 maps, each the same size as the example image, to cover the search space.

Geometric feature:
We developed an intersection descriptor that can be extracted automatically from a styled Google Map.
Two features of the intersection descriptor can be used to prune the search space:
1. the number of corners at the intersection.
2. the angle of the corner.
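
A minimal sketch of how these two features could prune candidates is given below. It assumes an exact match on corner count and an angle tolerance; the `Intersection` record, the tolerance value, and the database entries are hypothetical. Only the query values (2 corners, 73.88 degrees) come from the Madrid experiment.

```python
from dataclasses import dataclass


@dataclass
class Intersection:
    ident: str        # hypothetical identifier of a map intersection
    num_corners: int  # number of corners at the intersection
    angle_deg: float  # angle of the corner, in degrees


def prune(candidates, query, angle_tol_deg=10.0):
    """Keep candidates with the same corner count and a similar angle.

    angle_tol_deg is an assumed tolerance, not a value from the experiment.
    """
    return [
        c for c in candidates
        if c.num_corners == query.num_corners
        and abs(c.angle_deg - query.angle_deg) <= angle_tol_deg
    ]


query = Intersection("query", 2, 73.88)
# Made-up database entries, for illustration only.
database = [
    Intersection("a", 2, 75.0),
    Intersection("b", 4, 74.0),
    Intersection("c", 2, 30.0),
    Intersection("d", 2, 70.5),
]
kept = prune(database, query)
print([c.ident for c in kept])
```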

Experiment:
The query intersection is computed from rectified images.

Here "Number of Corner: 2" and "Angle: 73.88" are used to prune the search space.

The ground truth location is at the center of the image.

The matching score is represented by the radius of the blue circle.
The radius r is computed by
r = R * exp(-d(ang0, ang1) / sigma)
where R and sigma are constants, and d(. , .) is the 2-norm distance between the query angle and the angle in the database.
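
The radius formula can be sketched in a few lines. The values of R and sigma below are assumptions chosen for illustration; the post does not state the constants that were used.

```python
import math


def match_radius(query_angle, db_angle, R=20.0, sigma=10.0):
    """Radius of the score circle: r = R * exp(-d / sigma).

    For scalar angles the 2-norm distance d is the absolute difference.
    R and sigma are assumed constants, not values from the experiment.
    """
    d = abs(query_angle - db_angle)
    return R * math.exp(-d / sigma)


# A perfect match gives the maximum radius R; the radius decays as angles differ.
print(match_radius(73.88, 73.88))
print(match_radius(73.88, 83.88))
```

A larger sigma makes the score more tolerant of angle error, so more circles stay visible on the map.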

Another Example: Paris
Query: "Number of Corner: 4" and "Angle: 57.68" are used to prune the search space.
Result:
The ground truth location is at the center of the image.






Tuesday, October 18, 2011

Bibliography

Image-based localization:
From Structure-from-Motion Point Clouds to Fast Location Recognition CVPR'09
Location Recognition using Prioritized Feature Matching ECCV'10
Fast Image-Based Localization using Direct 2D-to-3D Matching ICCV'11

3D reconstruction:
Piecewise Planar and Non-Planar Stereo for Urban Scene Reconstruction
- multiview stereo with piecewise constraint
- not talking about roof

Fusion of feature- and area-based information for urban buildings modeling from aerial imagery
- have all data (range data)
Manhattan-world Stereo
- Piecewise planar assumption everywhere

Blocks World Revisited: Image Understanding Using Qualitative Geometry and Mechanics
- claim only qualitative reconstruction is feasible and "impossible for metric reconstruction from a single image"
- aim at a fully automatic system

Closing the Loop in Scene Interpretation
-

Friday, October 14, 2011

VFYW Dataset

Introduction:

The VFYW dataset is designed for the image localization problem. The dataset currently consists of 73 query images labeled with a ground truth location and negative locations. The ground truth location is represented by a city and a geographic coordinate. Negative locations were collected from wrong guesses and normalized to city names to avoid ambiguity. Other negative locations can be obtained from the results of feeding the query image into Google Search by Image.

All query images and human responses are collected from the VFYW contest, and the dataset will be updated weekly along with the contest.

Each query image can be assigned one of three difficulty levels according to the precision of the winner's answer:
a. No one Correct
No one answered the correct city. (9/71)
b. Up to a City
The winner answered the correct city but did not provide the exact location. (12/71)
c. Exact location
The winner gave the exact location of the query image. (50/71)

Here are some sample images of different degrees of difficulty.
a. No one Correct
b. Up to a City
c. Exact location

An example (#71) is shown here to illustrate the VFYW dataset.

Query image (solved with exact location)

Ground truth location:
'Dhaka, Bangladesh'
'23.754239,90.392075'

Negative location (guesses from human)
'Tripoli, Libya'
'Maputo, Maputo City, Mozambique'
'Kinshasa, Democratic Republic of Congo'
'Dar es Salaam, Tanzania'
'Iquitos, Peru'
'Managua, Nicaragua'
'Cairo, Egypt'
'Beirut, Lebanon'
'Famagusta , Cyprus'
'Bangkok, Thailand'
'Mumbai, India'
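
A dataset entry like this one could be held in a small record. This is only a sketch: the class name, field names, and helper methods are hypothetical and do not describe the dataset's actual storage format. The values are copied from example #71, with a subset of the negative locations.

```python
from dataclasses import dataclass, field


@dataclass
class VFYWEntry:
    contest_id: int
    city: str                 # ground truth city
    coordinate: str           # "lat,lon" string as shown in the example
    negative_cities: list = field(default_factory=list)

    def latlon(self):
        """Parse the coordinate string into a (lat, lon) tuple of floats."""
        lat, lon = (float(v) for v in self.coordinate.split(","))
        return lat, lon

    def is_correct_city(self, guess):
        """Case-insensitive comparison of a guessed city with the ground truth."""
        return guess.strip().lower() == self.city.lower()


entry = VFYWEntry(
    contest_id=71,
    city="Dhaka, Bangladesh",
    coordinate="23.754239,90.392075",
    negative_cities=["Tripoli, Libya", "Cairo, Egypt", "Mumbai, India"],
)
print(entry.latlon())
print(entry.is_correct_city("Cairo, Egypt"))
```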

Negative location (Google Search by Image; the keyword 'google map panoramio' is added to ensure every retrieved image has a geotag)

VFYW #70 (Baby Step)


VFYW # 70: Edinburgh, UK
The query image:

The GUI helps a human perform metrology on the query image.
The precise scale of the ROI in the overhead view can be reconstructed by image metrology.
The reconstructed overhead image is then matched to the satellite image provided by Google Maps.

The reconstructed overhead view in this image shows that the road in the query image is curved. This clue can reduce the search space, given that many roads run straight.
The above image is the road map of Edinburgh. The search space can be reduced by looking only at roads with a certain curvature.
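
As a rough illustration of filtering roads by curvature, the sketch below treats each road as a 2D polyline and uses the Menger curvature of consecutive point triples. The polyline representation, the threshold, and the sample roads are assumptions made for this example, not part of the experiment.

```python
import math


def menger_curvature(a, b, c):
    """Curvature (1 / radius) of the circle through three 2D points."""
    ab = math.dist(a, b)
    bc = math.dist(b, c)
    ca = math.dist(c, a)
    if ab == 0 or bc == 0 or ca == 0:
        return 0.0
    # The cross product is twice the signed triangle area; curvature = 4 * area / (ab * bc * ca).
    cross = (b[0] - a[0]) * (c[1] - a[1]) - (b[1] - a[1]) * (c[0] - a[0])
    return 2.0 * abs(cross) / (ab * bc * ca)


def max_curvature(polyline):
    """Largest curvature over consecutive point triples (0.0 if fewer than 3 points)."""
    return max(
        (menger_curvature(*polyline[i:i + 3]) for i in range(len(polyline) - 2)),
        default=0.0,
    )


def filter_roads(roads, min_curvature):
    """Return names of roads whose maximum curvature reaches the threshold."""
    return [name for name, pts in roads.items() if max_curvature(pts) >= min_curvature]


# Hypothetical roads in metres: a straight segment and an arc of radius 10.
arc = [(10 * math.cos(t), 10 * math.sin(t)) for t in (0.0, math.pi / 4, math.pi / 2)]
roads = {"straight": [(0, 0), (10, 0), (20, 0), (30, 0)], "curved": arc}
print(filter_roads(roads, min_curvature=0.05))
```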


VFYW #57 (Baby Step)


VFYW # 57: Muskegon, MI
The query image:
The GUI helps a human perform metrology on the query image.
The precise scale of the ROI in the overhead view can be reconstructed by image metrology.
The reconstructed overhead image is then matched to the satellite image provided by Google Maps.

In this case, the reconstructed overhead image helps a human find the exact match between the ground image and the satellite image. Furthermore, the camera position can also be estimated from the homography. If the labels 'Ocean' and 'Ground' are given in both the query and satellite images, an automatic indexing algorithm could be developed.

Tuesday, October 11, 2011

Four Types of VFYW Players


Interesting comment from #35

1) The people who write, "I am absolutely, 100% sure of the location, because I was just there last year and will never forget it," followed by a completely wrong answer. -- I always feel a little sorry for these people. They were so certain! It must be a blow.
2) The people who write, "I am absolutely, 100% sure of the location because I happen to be: a) sitting there right now! b) looking at the exact same picture that I took last week when I was there! or c) *fill in the blank with another amazing coincidence!*" -- These people make me jealous on some petty level that I'd rather not examine too closely
3) The people who write, "I am absolutely, 100% sure of the location because I recognized some infinitesimally arcane detail in the photo, and after spending several hours researching Dumpster Colors of the Southern Hemisphere, I was able to narrow it down to a city, whose streets I then spent several days examining, block by block for a LONG TIME on Google Earth, until I found the spot! Oh and, I got the last 3 contests correct!" -- These people fill me with awe and admiration and maybe a just a little bit of fear.
4) The people who write, "I have no clue. I threw some search terms into Google and came up with this. Hope I'm right." -- These are my kind of people, and they have inspired me to join the VFYW Contest fray.

Monday, October 10, 2011

VFYW #32 (Baby Step):

VFYW #32:
The graffiti wall indicates the picture was taken in Madrid. But where is it?
Baby Step: Put the query image and the candidates side by side and see if a human can VERIFY the match.


Original Image (#32):

Compare the rectified image with the Google road map and satellite map:

The acute angle between the graffiti wall and the road helps to verify that these two images could show the same place.

Since the image is rectified up to metric, the area of the plaza should be the same in both.
The plaza region is annotated by Google Maps (the plaza region can be filled in easily).
The area of an annotated region can be a strong cue for verification.

In this case:
1) the acute angle between the road and the graffiti wall helps us verify the match.
2) the area and shape of an annotated region also help us filter out major parts of the map.
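
Both cues can be computed with elementary geometry. The sketch below assumes that the wall and road are given as 2D direction vectors and the plaza as a polygon in metric coordinates; the tolerances and sample values are hypothetical.

```python
import math


def acute_angle_deg(u, v):
    """Acute angle (0 to 90 degrees) between two lines given by direction vectors."""
    dot = u[0] * v[0] + u[1] * v[1]
    norm = math.hypot(*u) * math.hypot(*v)
    # abs() folds obtuse angles onto their acute complement; clamp for float safety.
    cos = max(-1.0, min(1.0, abs(dot) / norm))
    return math.degrees(math.acos(cos))


def polygon_area(points):
    """Area of a simple polygon by the shoelace formula."""
    s = 0.0
    n = len(points)
    for i in range(n):
        x0, y0 = points[i]
        x1, y1 = points[(i + 1) % n]
        s += x0 * y1 - x1 * y0
    return abs(s) / 2.0


def consistent(query_angle, map_angle, query_area, map_area,
               angle_tol_deg=10.0, area_tol=0.2):
    """Check both cues; the tolerances are assumed values for illustration."""
    angle_ok = abs(query_angle - map_angle) <= angle_tol_deg
    area_ok = abs(query_area - map_area) <= area_tol * map_area
    return angle_ok and area_ok


# Hypothetical wall and road directions, and a 10 m x 10 m plaza.
angle = acute_angle_deg((1, 0), (-1, 1))
area = polygon_area([(0, 0), (10, 0), (10, 10), (0, 10)])
print(angle, area)
print(consistent(angle, 48.0, area, 110.0))
```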

To reject a false positive from Google search by image:
The image is the first entry retrieved by Google Search by Image with the keyword "google map panoramio".
We use two rules (check angle and area) to verify this image.
(This image is not annotated with buildings...)

The reasons we can reject this image:
1) Only a few acute angles between roads are shown.
2) Since the test and reference images are at the same scale, it is easy to show that there is no good region match between the two images.