Wednesday, June 26, 2013

CVPR brain dump, part 2 of n

I couldn't stay away!! Again, in reverse chronological order starting with the Amazon mixer I just got back from.


I remember watching those kids' TV shows about factories and how certain things were made (e.g. Mister Rogers at the Crayon Factory) and thinking, a) there are a lot of weird, crazy processes behind these crayons and other things my kid self interacts with, and b) it's someone's job to be involved with all of that. Many people's jobs, even, to invent it, build it, manage it, whatever. I wondered if I would have some wacky, behind-the-scenes, "make the world go round" job when I grew up.

faster amazon deliveries
I got a glimpse of being able to have that (if I wanted, which I'm not sure I do) while talking to people from the new computer vision research lab at Amazon. Their focus is on using cameras and computer smarts to track everything and enhance the warehouse and fulfillment stuff -- the physical back-end of Amazon retail services. I think it's going to have a big impact. It might be kinda subtle at first with our Amazon packages arriving ever so slightly quicker in ever sleeker packaging, but then bam, one day we'll have everything we need all the time like in Wallace and Gromit when Wallace gets tipped out of bed into his clothes for the day. And we'll just drink Soylent all the time. And then we'll be in space and living on other planets and that level of automation will help us focus on other things, like mining resources from the environment and setting up Amazon warehouses on Mars.

automagical dressing+coffee robot/alarm contraption

Apps and Demos

I should know this already, given the amount of traction I've gotten from Sketch-a-bit, but nothing beats a slick, simple demo. I need a kick in the pants to go make some short and sweet (and self-contained) apps of things rattling around my brain. Something(s) that I can pull out to either show off an idea in a tangible way, explain what kind of work I do, or just inspire conversation. Some nice examples of other apps from the last few days:
  • This face tracker was running on a nice, big display at the far end of the posters. It also just worked really, really well. I want to use it in my stuff.
  • I got to poke at some new Photosynths, which have a very nice feel to them.
  • Someone was playing around with these Microsoft cliplet/cinemagraph/blink things on a windows phone and made this one of me:
These things are fun! They're nice to play with! They are delightful. They stick in my brain. It's also making me aware of demos and apps (including my own) that just feel like they're "missing something." I think that missing "something" is usually feedback to show that you made a difference or created something new (and hopefully shareable).

Art (and Photosynth 2)

A couple days ago now I saw a talk on Photosynth 2, basically a sneak peek of the TED talk Blaise just gave in Scotland. The new synths have nice 3D and parallax effects. They've also tried to make them easier to capture and navigate with a touch screen. Their exact nature shall be revealed to the internet soon enough. I mainly wanted to comment on how they (Blaise et al.) are (and have been) treating synths as a new artistic medium. I caught a glimpse of one Photosynth person emailing another a synth of their recent hiking adventure, of a place that suited the medium really well. And then the synth itself is this neat artifact that can be appreciated independently. I kinda felt that way about PhotoCity models, so it's nice to see it validated in this more corporate world.

The Blink/Cinemagraph thing has a similar new-form-of-artistic-expression feel. It's cool. I want to create more things like that. To create the opportunities for new types of creative expression, actually, rather than the cinemagraphs themselves... You're getting all artsy-fartsy around the edges, Microsoft!

Tuesday, June 25, 2013

CVPR brain dump, part 1 of n

Sitting on the floor outside the poster session -- so many ideas and whatnot have gotten into my head that I need to take time to get some of them out again. This will kind of be in reverse chronological order.

CVPR 2016

Is there someone in my current field of view wearing a shirt supporting the bid to hold CVPR 2016 in Seattle? Yes! There is! The vote is tonight: Seattle vs. Los Angeles. (Also ladies vs. gentlemen, with the Seattle organizing committee made up almost entirely of women.) Someone left a pamphlet on a table promoting the LA side of things that contained a misleading snowcapped mountain range. I know those mountains (San Gabriel?) can sometimes have snow, but I think they're just jealous and had snowy mountain envy or something.


There was a talk called Fine-Grained Crowdsourcing for Fine-Grained Recognition from Stanford that I went to because I wanted to see what "crowdsourcing" meant to these people. I've seen it mean:
  • "find things on the internet that people posted for their own personal reasons and use them for research"
  • "pay people to do perception-related tasks to explore how humans understand images"
  • "pay someone who's not a grad student to do a bunch of menial labor like image labeling or segmentation"
  • "attempt to disguise menial labor as fun so that you don't even have to pay"
  • "involve the crowd with something they're kinda interested in/personally invested in, possibly through a game"

To my pleasant surprise, these folks had designed a pretty reasonable game, which they simply paid people on mturk to play. The game involved categorizing a blurry black-and-white picture of an object/animal into one of two categories, using "bubbles"/circles to expose parts of the image in full color and resolution to help you make a better distinction. The player's goal was to guess the category correctly while using as few bubbles as possible. The underlying goal is to learn which parts of an image are important for distinguishing very specific things, like two similar breeds of birds.

It's a lot like von Ahn's game Peekaboom, in which you expose parts of an image until your partner can guess what it is. In this bubbles game, the game mechanics of exposing only what you need to categorize an image are directly tied to the computer vision problem of figuring out which features/parts the computer should look at to make the same decision automatically.
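A toy sketch of how that incentive structure might look in code (this is purely my own illustrative reconstruction -- the function name, reward, and per-bubble cost are made up, not from the paper):

```python
# Toy version of the bubbles-game scoring incentive: reward a correct
# category guess, but charge for every bubble revealed. All the numbers
# here are my own guesses, not the authors' actual implementation.

def bubble_score(guessed_correctly, bubbles_used, reward=100, cost_per_bubble=10):
    """Score a single round: correct answers earn the reward minus a
    penalty per bubble; wrong answers earn nothing."""
    if not guessed_correctly:
        return 0
    return max(reward - cost_per_bubble * bubbles_used, 0)

# A player who nails the category with 2 bubbles beats one who needs 6,
# which is exactly what pushes players to reveal only the telling parts.
print(bubble_score(True, 2))   # 80
print(bubble_score(True, 6))   # 40
print(bubble_score(False, 1))  # 0
```

The interesting bit is that the bubbles a good player reveals double as labels for which image regions are discriminative.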

They called it a "machine-human collaboration", which I am 100% in support of. The more we expose of how these algorithms work and why, the more we can let humans identify and correct the silly assumptions computers sometimes make, and have the humans guide the computers to success!

More to come!

SAVE and publish, damn you.... slow conference internet...

Sunday, June 2, 2013

Estimating task times

I've got more work to do this weekend than I want to do. Possibly more work than I have time to do, at least if I get too distracted by nice weather and errands. So I'm going to take a few minutes to blog about time management!

I have two goals here:


Goal 1: Not get needlessly overwhelmed by large projects

I have a tendency to think about scheduling like this: "I need to do LARGE PROJECT X, which consists of several pretty large pieces A, B, and C... But I don't really know how long each will take so I'll just try to do A in the first half of the day and B later that day... and C the next day... But in actuality, part A has all these secret little traps or yaks to shave so parts B and C get put off and/or condensed." Perhaps there is a better way to plan out projects such that I can have more confidence in the schedule I've set forth.


Goal 2: Be better at predicting how long things will take

Timing and productivity are the things that make me most anxious in my life right now. I think I'm reasonably okay at predicting how long things will take, but I think I could be a whole lot better about it and thus feel a lot more in control of my schedule and my productivity.

Mindful scheduling experiment of the evening

I had a list of tasks today:
  1. fix face alignment (ETA: 15 minutes)
  2. change collection flow output (ETA: 30 minutes)
  3. hook up 3D face code to pipeline (ETA: 20 minutes)

I was actually expecting things to take 3x longer than predicted. That usually seems like a good approximation, although it's not great that the math works out that way.

Task 1 actually took 50 minutes, or about 3 times as long (hooray, right on schedule). Why? Part of it was an OpenCV method returning weird things (either an empty transformation matrix or a wonky one) for reasons I still don't fully understand. A second part was a bug I actually introduced where I was shortening some array I'd already shortened. I didn't discover this bug until it was already part of my web/twisted/c++ pipeline. And the third part was that I was testing my code inefficiently by making it load some files it didn't need and take 5 seconds instead of <1s. In summary: Something perplexing, something totally my fault, something inefficient.

Task 2 also actually took 50 minutes, or 20 minutes more than expected. I spent about 6 minutes planning out what I'd do. It was a more straightforward task than the previous one, and time went towards things like collecting test cases and looking up various documentation. I think I was just overly optimistic with my time estimate and didn't count the actual number of mini-tasks I'd need to do. I expected it to go smoothly, and it did.

Task 3 took four times as long as I anticipated, coming in at a whopping 80 minutes. The main reason was that I looked at the code I was planning to plug straight into some bigger pipeline, and I discovered it was the mutant, bloated beast that had evolved out of something else. So about 40 minutes went to just restructuring the code and cleaning out the stuff I didn't need. Then I spent about 10 minutes making sure it ran on my remote machine and 25 minutes actually plugging it into the pipeline. There we go, my estimate for "plug this into that" was about 25 minutes! I had just lost track of some details until I took a closer look.
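For fun, here's the arithmetic on those three tasks, using the estimated and actual minutes from above:

```python
# Estimated vs. actual minutes for the three tasks above.
tasks = {
    "fix face alignment": (15, 50),
    "change collection flow output": (30, 50),
    "hook up 3D face code": (20, 80),
}

for name, (eta, actual) in tasks.items():
    print(f"{name}: {actual}/{eta} = {actual / eta:.1f}x")

total_eta = sum(eta for eta, _ in tasks.values())        # 65 minutes planned
total_actual = sum(actual for _, actual in tasks.values())  # 180 minutes spent
print(f"overall: {total_actual / total_eta:.1f}x")       # ~2.8x
```

The overall multiplier comes out to about 2.8x, so the "expect 3x longer" rule of thumb held up pretty well tonight, even though the individual tasks ranged from 1.7x to 4x.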


That was a lot of different tasks and many different reasons for underestimating the time they'd take. Some of the bigger patterns might be:
  • Perplexing things will take a long, long time as I let my brain try to wrap itself around them. And that should be okay, as long as I don't get stuck in an infinite loop.
  • Reminder to stop writing bugs.
  • I could look for ways to be more efficient with my testing.
  • It seems like I don't want to give myself too long of a time estimate, because something that should take 15 minutes is way easier to start on than something that will take an hour.