Tuesday, January 7, 2014

What's up with Photosynth 2

I don't work at Microsoft or on Photosynth, but a couple years ago, I had a project called PhotoCity that was sort of the awkward academic sister of Photosynth. Both were descended from the Photo Tourism project and made the tech available to anyone online.

These speculations about Photosynth 2 are reconstructed (hah) from my own experience working on similar things and from a talk Blaise gave at CVPR last summer.

Underlying Geometry 

Take a bunch of photos of the same scene from different angles, and you can reconstruct 3D information about the scene. It's hard to reconstruct a clean, correct, complete 3D model, but you can usually get a sparse idea of what's going on, like a point cloud or depth estimates for each image, as well as how the images relate to one another in 3D space.
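To make the "different angles" part concrete, here's a minimal sketch of triangulating one 3D point from two photos. It assumes you already know each camera's position and the direction of the ray through the matching pixel (finding those is the hard part that tools like Bundler handle). The function name and the midpoint method are my choices for illustration, not what any particular pipeline does.

```python
def triangulate_midpoint(c1, d1, c2, d2):
    """Estimate a 3D point from two camera rays.

    c1, c2: camera centers; d1, d2: ray directions through the matched pixel.
    Returns the midpoint of the shortest segment between the two rays.
    """
    def dot(u, v):
        return sum(x * y for x, y in zip(u, v))

    w = [p - q for p, q in zip(c1, c2)]
    a, b, c = dot(d1, d1), dot(d1, d2), dot(d2, d2)
    d, e = dot(d1, w), dot(d2, w)
    denom = a * c - b * b
    if abs(denom) < 1e-12:
        # Parallel rays: the two views have no usable parallax.
        raise ValueError("rays are parallel; cannot triangulate")
    s = (b * e - c * d) / denom  # distance along ray 1
    t = (a * e - b * d) / denom  # distance along ray 2
    p1 = [ci + s * di for ci, di in zip(c1, d1)]
    p2 = [ci + t * di for ci, di in zip(c2, d2)]
    return [(x + y) / 2 for x, y in zip(p1, p2)]


# Two cameras one unit apart, both looking at a point two units away.
point = triangulate_midpoint((0, 0, 0), (0.5, 0, 2), (1, 0, 0), (-0.5, 0, 2))
```

When the rays are nearly parallel the denominator shrinks toward zero and the estimate becomes unstable, which is why photos from a single spot don't reconstruct well.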

Some software and apps that are out there: Bundler + PMVS, VisualSFM, and 123D Catch.

Dense (PMVS) point cloud reconstruction of Hing Hay Park in Seattle, photographed by Erik Andersen

My notes say that Photosynth reconstructs a point cloud (probably dense, something like PMVS) and rough geometry (probably from the point cloud). It doesn't seem to use the point cloud in the rendering. Instead, each image has its own piecewise planar proxy geometry of the scene. As you move between images, that geometry (or your view of the geometry) is interpolated.
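I don't know how Photosynth actually interpolates between images, but a rough sketch of the idea is to blend the two cameras' positions and look directions as you move from one photo to the next. Everything here (the `(position, look_direction)` camera representation, the function names, plain linear blending) is an assumption for illustration.

```python
import math


def lerp(a, b, t):
    """Linear blend between two vectors; t=0 gives a, t=1 gives b."""
    return [x + (y - x) * t for x, y in zip(a, b)]


def normalize(v):
    length = math.sqrt(sum(x * x for x in v))
    return [x / length for x in v]


def interpolate_view(cam_a, cam_b, t):
    """Blend two cameras, each given as (position, look_direction).

    Assumes the two look directions are not exactly opposite, since
    their blend would then pass through the zero vector.
    """
    position = lerp(cam_a[0], cam_b[0], t)
    look = normalize(lerp(cam_a[1], cam_b[1], t))
    return position, look


# Halfway between a camera looking along +x and one looking along +y.
position, look = interpolate_view(((0, 0, 0), (1, 0, 0)), ((2, 0, 0), (0, 1, 0)), 0.5)
```

A real viewer would also have to blend the per-image proxy geometry, which this sketch leaves out entirely.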

View-dependent textures

I am certain that if you looked at the Photosynth 2 geometry without any texture, it would look like crap. You can sometimes tell how bad the geometry is in the transitions.

Screenshot of mid-transition. There are ghosted objects (that moved as if they were separate geometry) and tears and other issues, but they don't actually detract from the experience. View this ship yourself. 

UPDATE: YES YES LOOK AT THOSE CAMERA FRUSTA AND WACKY GEOMETRY... press 'c' on any synth to get this inside look for yourself!

One piece of magic though, the thing that smooths out all the terrible geometry blemishes, is the fact that you're looking at the beautiful original photos plastered onto the crappy geometry. AND you're looking from an angle that is similar to where the photo was actually taken. In that sense, it doesn't really matter what shape the geometry is because it's just going to look like the original photo.
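Here's a minimal sketch of one common way to do view-dependent texturing: weight each source photo by how closely its camera direction matches the current viewing direction. This is a generic technique, not Photosynth's confirmed method, and the function name and the `sharpness` parameter are made up for the example.

```python
import math


def view_dependent_weights(view_dir, camera_dirs, sharpness=8.0):
    """Blend weights for source photos, favoring cameras that face the
    same way as the current view. Returned weights sum to 1."""
    def unit(v):
        length = math.sqrt(sum(x * x for x in v))
        return [x / length for x in v]

    view = unit(view_dir)
    raw = []
    for cam_dir in camera_dirs:
        cosine = sum(a * b for a, b in zip(view, unit(cam_dir)))
        # Cameras facing away from the view contribute nothing.
        raw.append(max(0.0, cosine) ** sharpness)
    total = sum(raw)
    if total == 0:
        raise ValueError("no source camera faces the viewing direction")
    return [w / total for w in raw]


# Looking exactly along the first camera's direction.
weights = view_dependent_weights((1, 0, 0), [(1, 0, 0), (0, 1, 0)])
```

When the view lines up with a real camera, that photo gets essentially all the weight, so what you see is just the original photo no matter how bad the geometry underneath is.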

Sometimes the viewpoint and geometry will be such that you can see past something to a place that's not textured. It looks like they just use some kind of texture hole filling to deal with that.

My friend and colleague Alex Colburn has some really cool work on using view-dependent textures to do cinematic parallax photography and to model and remodel home interiors.

Camera manifolds

Ye olde Photosynth let you take photos all higgledy-piggledy and then let you browse them in a similarly haphazard fashion. I think many of the design choices of Photosynth 2 stem directly from trying to address the problems with capture and navigation of the original. (My PhotoCity solution was to show the 3D point cloud instead of the photos or show the point cloud projected on a 2D map.)

Oldschool synth: a sea of photographs. But! Since you're using Photosynth and not PhotoCity, you don't get much feedback about which photos work/fail and you don't get the chance to improve your model except by doing the whole thing again.


The types of manifolds Photosynth 2 supports are (at least):
  1. object: taking photos in a loop looking in at an object
  2. panorama: taking photos in a loop looking out, e.g. around a room or at the vista on top of a mountain
  3. strafe: take photos in a line moving sideways
  4. walk: take photos in a line moving forwards

My notes about types of camera manifolds, including object, panorama, strafe, and walk, plus concepts like "negative radius panorama" (shooting across a room)


These manifolds give the photographer some constraints about how to shoot, probably make the computation easier, and make navigation a hell of a lot easier for viewers (by constraining it).
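The two loop-shaped manifolds (object and panorama) differ only in which way the camera faces. Here's a small sketch that generates idealized camera poses for either one. The 2D top-down setup, the evenly spaced cameras, and the `loop_cameras` name are all simplifying assumptions; real captures are much messier.

```python
import math


def loop_cameras(count, radius, look_inward):
    """Evenly spaced cameras on a circle around the origin (top-down 2D).

    look_inward=True  -> "object" loop, cameras face the center.
    look_inward=False -> "panorama" loop, cameras face away from it.
    Returns a list of (position, look_direction) pairs.
    """
    cameras = []
    for i in range(count):
        theta = 2 * math.pi * i / count
        outward = (math.cos(theta), math.sin(theta))
        position = (radius * outward[0], radius * outward[1])
        look = (-outward[0], -outward[1]) if look_inward else outward
        cameras.append((position, look))
    return cameras


object_loop = loop_cameras(12, radius=3.0, look_inward=True)
panorama_loop = loop_cameras(12, radius=0.3, look_inward=False)
```

Strafe and walk would be the same idea on a line instead of a circle, with the look direction perpendicular to or along the direction of travel.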

Navigation UI magic

The camera manifolds make it super easy to navigate. Spin in a circle! Move along a single path! I think the Photosynth 2 people are really proud (and rightfully so) of being able to poke at a synth (via a touch screen) and have it respond in a sensible and pleasing way.

The second piece of navigation magic is how if you stop actively moving around, the synth just drifts to the nearest actual photo. When it's moving, the artifacts kind of blend together and get ignored by your brain. When it stops, you're looking at something free of artifacts, something that looks like a photo because it is a photo. But, unlike a normal photo, you can touch it to get a new view of the scene from a different angle.
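The drift-to-a-photo behavior is easy to sketch once the viewer's position is a single number along the manifold. This is a guess at the logic, not the actual implementation: the function name and the `loop_length` parameter are hypothetical, and positions are just distances (or angles) along the path.

```python
def nearest_photo_index(position, photo_positions, loop_length=None):
    """Index of the photo closest to `position` along a 1D manifold.

    loop_length: total length of a closed loop (object, panorama), so that
    distance wraps around; leave as None for open paths (strafe, walk).
    """
    def distance(a, b):
        d = abs(a - b)
        if loop_length is not None:
            d %= loop_length
            d = min(d, loop_length - d)  # going the other way may be shorter
        return d

    return min(
        range(len(photo_positions)),
        key=lambda i: distance(position, photo_positions[i]),
    )


photo_angles = [0, 90, 180, 270]
on_loop = nearest_photo_index(350, photo_angles, loop_length=360)
on_path = nearest_photo_index(350, photo_angles)
```

On a loop, a viewer at 350 degrees is only 10 degrees from the photo at 0, so it drifts there; on an open path the same position is closest to the last photo. A viewer would then ease toward the chosen photo over a few frames rather than jumping.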

A new art medium  

I haven't used Photosynth 2 yet (or even the original because I had my own version) but I can attest to the fact that taking photos for PhotoCity or for any other reconstruction/stitching pipeline deeply changed how I take photos and think about photography. Instead of hiking up a mountain and taking a single photo at the top, I want to take a bunch of photos and reconstruct the whole valley. (But ugh, not enough parallax from one peak!)

I think Photosynth 2 is a little more modest by enabling people to make reaaallly sexy/rich/immersive panorama-like things. Something pretty familiar, but also much enhanced. And on top of that, people will uncover quirks in the medium, like being able to capture dynamic, lively scene action in a stop-motion kind of way. For example, a friend/past-colleague/actual Photosynth engineer shot this synth in Hawaii and there are occasional wave splashes! Like a cinemagraph but in 3D! Compare that to your ghosted/amputated people in normal stitched panoramas.

2 comments:

  1. Your insight on photosynth is great. i've been digging around to try to find something similar to photosynth and have come up short. It appears the Photosynth 2 team is no longer working on the project. Do you have any recommendations how to create something similar with anything out there? I can figure out how to get the matching points of photos and also how to create a depth map with the photos but other than that as far as play back etc I'm stuck.

    1. Hi Omar,
      There's a cool startup called Mapillary that is in this space now. Their focus is on crowdsourced streetview (through an easy to use mobile app) but they also do 3D reconstruction like Photosynth, and they have a 3D viewer: http://blog.mapillary.com/update/2015/07/30/better-transitions.html

      Some good academic Structure from Motion libraries are:
      http://ccwu.me/vsfm/ and http://www.theia-sfm.org/
