Wednesday, November 28, 2012

Forgetting the Code Signing Password to Sketch-a-bit

There are 3 things I remember completely forgetting in my life, in that a) it sort of matters (mattered) that I remember these things and b) even many years later, I have NOT finally remembered. The first was my ATM PIN. It wasn't even a new PIN. It just escaped my memory one day and was gone forever. I called Adam and was like, "Hey, did I ever share my PIN with you? No? Darn." Later I called the bank and chose a more memorable one.

The second and third things I've forgotten are the code signing passwords to Sketch-a-bit and apparently Impressionist Fingerpaint. In my defense, Adam picked out and typed those passwords, but did show them to me. So they're not in my muscle memory anywhere at all.

Why this sucks

It means we can't update the Sketch-a-bit application at all.

It used to be that you could use a new key and the update would require people to remove the app and reinstall instead of just normal updating. Apparently this happened to Twitter and a few other apps and it was mostly inconvenient and embarrassing.

Now, security in the Google Play store seems tighter. If I try to upgrade my APK file, it yells at me and says, "The apk must be signed with the same certificates as the previous version."

It also means that we can't update Impressionist Fingerpaint, either. I have a feeling that password (since it was invented much more recently) will be easier/possible to recover from someone's brainmeat, though, and it doesn't have very many users.

But I wanted to make Sketch-a-bit suuuuper awesome!!

I still do!! Ahhh, I wish I could just upgrade the main app!

Now that we've seen how people cope with starting from another random user's random image, I kind of want to see what artistry people are capable of if they get to have more choice and agency. To achieve this, I was thinking of having a gallery and letting people select which sketches they want to start from.

I also want to make the system a little less anonymous by having random user ids instead of no identity whatsoever. Right now I can hypothesize about the same artist drawing several images in a row, but there's no way to know for sure.

At a retreat for my lab at UW, I learned that the color red was probably the first color people were aware of after dark and light. Here's a link to a Wikipedia page with more info. Adam and I thought it would be totally awesome to introduce RED into Sketch-a-bit, then.

Decisions, decisions

The questions now are whether or not to spend more time trying to remember those passwords. We tried cracking one and learned that it's probably longer than 5 characters.

And then whether or not to release a Sketch-a-bit 2 and how many features to add. Just the bug fixes? Or more of those desired features that I don't really have time to implement? Seems like an entirely new version warrants some significant changes.

Good enough solution: put the APK online

I realized on the bus this morning that this is what I should do to get the fixes that I'd made to the two people who were requesting them. So now it's online. In a slightly secret location. I want to add the user ids and something on the server that tracks users and version numbers, and THEN I'll publicize the location.

Currently there are 3 people out there in the world capable of drawing in red. Here's what one of them made!

What have we learned here?

If you're going to release some code at 2am right before going to sleep, or 7pm right before grabbing a beer, PICK A MEMORABLE PASSWORD for your code signing key. Make a mnemonic for it. Write it down on a post-it. Email the password or the mnemonic or both to yourself. Whatever. In this case, I think losing the password was more trouble than having other people steal it and publish apps under our name.


Life moves on and we're still enabling people to make awesome things!

Tuesday, November 27, 2012

How to train your classifier (on a mac with opencv)

Or rather, how to experiment with using OpenCV to train a classifier and ultimately fail.

The following two links were my guides. They might make better guides than what follows here, but I'm going to write this stuff down for myself.

I wanted to try training my own OpenCV Haar cascade classifier. OpenCV comes with some pre-trained ones for faces and body parts and whatnot. For fun, I tried to train it to recognize my cat Jpeg, which Picasa humorously recognized as a person in a handful of photos last month:

picasa now apparently detects cat faces

Step 1: Download pictures from my own Flickr account

I use this python program to get Flickr image urls:
I added my flickr API key and changed the main of that python file to look through MY pictures for things tagged 'cat'. 

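if __name__ == '__main__':
    # Search my own photos (by Flickr user id) for anything tagged 'cat'.
    pix = photos_search('16977922@N00', False, '', '', 'cat')
    for p in pix:
        # One large-size image URL per line, so the output can be fed to wget.
        print(p.getLarge())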

Then, because bash is one of my best friends these days:

python > cat_pix.txt && for i in `cat cat_pix.txt`; do wget $i; done

Actually the first time I downloaded the pictures, I wanted to resize them with ImageMagick's mogrify (in-place image editing) command so I ran:

for i in *.jpg; do mogrify -resize 1024x1024 "$i"; done

Step 2: Mark "objects" to make positive examples

One of the links above has an object marking program for windows. This link has one that I was able to compile on my mac. It's kind of amateur code and I have vague plans to rewrite it.

I compiled it with the following command after commenting out two useless includes that were complaining at me. BRIEF INTERJECTION: I just installed OpenCV with MacPorts. If you do that, your OpenCV stuff will probably be in the same place. Installation of OpenCV didn't work properly until I made sure my MacPorts was up to date.

g++ -I/opt/local/include -L/opt/local/lib -lopencv_core -lopencv_highgui -lopencv_imgproc -lopencv_objdetect objectmarker.cpp -o objectmarker

For context, I'm inside a directory of cat images. Running that objectmarker program is as so:

../objectmarker positive_examples.txt ./

Right, so, run that program from the directory with your images in it because the code doesn't actually build the right path to the image file if you run it from a different directory, even though it looks like it intended to. Then click boxes and press the appropriate SPACE and Capital B keys until all the objects/cats are marked.

Step 3: Get negative examples by making a list of images that don't have the object in them

I wound up using images from flickr tagged "eddie", which is the name of my mom's chubby chihuahua. Pictures of other dogs named Eddie were also included, but that didn't matter. The only thing that mattered was that Jpeg was not in any of the images. Which is true because he's never met any dogs.

The command went something like this:

ls eddie/ > negative_examples.txt

Step 4: Fix the paths

My images were in folders called "jpeg/" and "eddie/" and some of the jpeg images didn't actually have Jpeg's face clearly shown. I had to change the image paths in positive_examples.txt and negative_examples.txt from "eddie_4787878333_4dd7bcd3d8_o.jpg" to "eddie/eddie_4787878333_4dd7bcd3d8_o.jpg". I did that with vim and the mass insert thing!
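If you'd rather script the path fix than do it in vim, here's a minimal Python sketch. The function names prefix_paths and prefix_file are made up for this example. It assumes every non-empty line in the list file starts with the image filename, which holds for both of my files.

```python
def prefix_paths(lines, folder):
    """Prepend 'folder/' to every non-empty line; blank lines pass through."""
    prefix = folder.rstrip("/") + "/"
    return [prefix + line if line.strip() else line for line in lines]


def prefix_file(list_path, folder):
    """Rewrite a list file in place so each entry starts with 'folder/'.

    Hypothetical helper: reads the whole file, then overwrites it.
    """
    with open(list_path) as f:
        lines = f.read().splitlines()
    with open(list_path, "w") as f:
        f.write("\n".join(prefix_paths(lines, folder)) + "\n")
```

Something like prefix_file("negative_examples.txt", "eddie") would then do the same job as the vim block insert. It only touches the start of each line, so the object counts and rectangles in the positive file are left alone.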

Step 5: Run opencv_createsamples

Finally! Getting to the running of the actual training code. Almost.
So I had these two files:
  • positive_examples.txt
  • negative_examples.txt
Negative examples just had image paths in it. Positive examples had image paths and # of objects in the image and the bounding rectangles for each image. For some reason, you have to convert your handy list to a binary representation to feed to the next step. Actually that reason is that you can use the createsamples program to take one image showing off your object and warp and distort it a whole bunch to create artificial training data. But if you actually have many positive examples of your data, you need to run this to just process it and get it in the right form.
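For reference, each line of the positive examples file is the image path, then the number of objects, then x y width height for each object. Here's a small sketch for writing and reading those lines in Python. The function names and the sample paths in the comments are invented for illustration, and it assumes paths contain no spaces.

```python
def format_info_line(image_path, boxes):
    """Build one line: path, object count, then x y w h for each box."""
    parts = [image_path, str(len(boxes))]
    for x, y, w, h in boxes:
        parts += [str(x), str(y), str(w), str(h)]
    return " ".join(parts)


def parse_info_line(line):
    """Split one line back into (path, [(x, y, w, h), ...])."""
    fields = line.split()
    path, count = fields[0], int(fields[1])
    nums = [int(v) for v in fields[2:]]
    if len(nums) != 4 * count:
        raise ValueError("expected %d boxes in line: %r" % (count, line))
    boxes = [tuple(nums[i:i + 4]) for i in range(0, len(nums), 4)]
    return path, boxes
```

Parsing every line before running createsamples is a cheap way to catch a line where the count and the number of rectangles disagree.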

I had 19 positive examples and 17 negative images so I ran this:

/opt/local/bin/opencv_createsamples -vec jpeg_pos.vec -info positive_examples.txt -show -w 32 -h 32

One very important thing I learned is that you should not put a number like 100 into the height and width fields because that will make your computer freak out (have a lot to process) at the next step.

Step 6: Run opencv_traincascade

Finally, for reals. Let's do it. 

/opt/local/bin/opencv_traincascade -data output_jpeg_cascade/ -vec jpeg_pos.vec -bg negative_examples.txt -numPos 19 -numNeg 17 -w 32 -h 32

Your -w and -h numbers have to be the same ones you passed to opencv_createsamples. When I tried them at 100, the program was trying to use 6GB of RAM and making very slow progress. When I knocked it down to 32x32, it finished in under 30 seconds. I hear from those other links on the internet that training can take 5 days with 2000 positive and negative examples. Maybe that'll be me some day, and maybe it won't.
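One way to keep the sample size in sync between the two steps is to build both commands from the same values. This is only a sketch: cascade_commands is a made-up helper, it uses just the flags shown in this post, and it assumes both programs are on your PATH rather than under /opt/local/bin.

```python
def cascade_commands(info, bg, vec, out_dir, num_pos, num_neg, width, height):
    """Return (createsamples_args, traincascade_args) sharing one -w/-h pair."""
    size_args = ["-w", str(width), "-h", str(height)]
    create = ["opencv_createsamples", "-vec", vec, "-info", info] + size_args
    train = ["opencv_traincascade", "-data", out_dir, "-vec", vec, "-bg", bg,
             "-numPos", str(num_pos), "-numNeg", str(num_neg)] + size_args
    return create, train
```

Each list can be handed to subprocess.check_call as is. I left out -show here because it opens a window for every sample.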

Step 7: Detect cat

I hacked some code online that used the face detector classifier and replaced "haarcascade_frontalface_alt2.xml" or whatever with "output_jpeg_cascade/cascade.xml". I don't remember where the code came from so I don't have a good link to that particular resource.
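Since I can't link the original, here's a rough sketch of what detection code like that looks like. It is not the code I actually ran. It assumes the cv2 Python bindings are installed, and the function names detect and largest_box are invented for this example.

```python
def largest_box(boxes):
    """Return the (x, y, w, h) box with the biggest area, or None if empty."""
    boxes = [tuple(int(v) for v in b) for b in boxes]
    return max(boxes, key=lambda b: b[2] * b[3]) if boxes else None


def detect(image_path, cascade_path="output_jpeg_cascade/cascade.xml"):
    """Run the trained cascade on one image and return the detected boxes."""
    import cv2  # imported here so largest_box works without OpenCV installed

    cascade = cv2.CascadeClassifier(cascade_path)
    if cascade.empty():
        raise IOError("could not load cascade: " + cascade_path)
    image = cv2.imread(image_path)
    if image is None:
        raise IOError("could not read image: " + image_path)
    # Haar cascades work on grayscale images.
    gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
    # scaleFactor and minNeighbors are guesses, not tuned values.
    return cascade.detectMultiScale(gray, scaleFactor=1.1, minNeighbors=3)
```

largest_box(detect("some_cat.jpg")) would give the biggest detection, where "some_cat.jpg" is a placeholder filename. Raising minNeighbors usually cuts down on false positives at the cost of missing more real ones.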

In many cases, it completely failed to detect Jpeg. Even though this image was part of the training:

In another instance, it got different ends of the cat confused: 

Voodoo magic

This was the first time I ever tried anything like this. There's probably a lot I could learn about how to make it work better. At the same time, I wish I could teach the classifier more specific things to look for, or have it tell me exactly why it thinks the cat butt is the face. Feeding lots of data into a black box and crossing my fingers doesn't seem like a fun approach.

Sunday, November 25, 2012

Design Patterns of Crowdsourced Art

Sometimes I try to google "crowdsourced art" in hopes that I'll find a comprehensive list of all such projects. Maybe there's something out there that I'm unaware of... that'll be on that list... that will really make my day or something.

What I'm finding instead is either stuff that I've already heard about or things that fit into the same general pattern, one that I'm frankly getting a little tired of.

So, I'm going to share my own Comprehensive List. And then talk about the most common pattern I've spotted, as well as a few alternative ones.

30 Crowdsourced Art Projects

(Organized by how people contribute)

Drawing w/ spatial context
Webcam photo
Animating/drawing multiple frames
Cats (er, I mean drawing and writing..)
Playing w/ data
Full project remixing w/ assets
Mixed: photos/videos/text/drawings

Design Patterns

Many of the above are like, "Hey everybody, do a thing that's the same thing everyone else is doing, but since you did it, it'll be sorta unique to you! Then we'll put everyone's things together and maybe it'll be kinda neat to see the similarities and differences!" When everyone goes out and does their own thing, and then the results get smooshed back together, I call it the flat contribution model.

Flat Contribution Model (v1)

The Life in a Day project is a great example of how tons of effort and creativity were filtered down into one consumable artifact. An artifact that is meant to accentuate the artists' differences, yet also our shared humanity and whatnot... The idea was that on July 24, 2010, people all over the world would record videos of their daily lives, maybe answering some questions like "What do you love?" and "What do you fear?" and those would be curated and edited into a single movie. In the end, 80,000 submissions comprising 4,500 hours of material were edited down to 92 minutes. Whoa! Since a mere 0.03% of the submitted footage (92 minutes out of 270,000) is actually in the movie, it hardly captures the diversity it could have. On the other hand, 92 minutes was good enough for me as a viewer/consumer. There was a TON of material, a lot of which was likely crap, and some amazing editing went into constructing a coherent viewing experience.

The gist of the model is this: There's some prompt, people go out and do it on their own, submit their creations, and then some curator accepts or denies their contribution and integrates it into the One True Artifact.

Flat Contribution Model (v2) 

While I may have been too snarky about Life in a Day, I reaallly like the Johnny Cash Project. Maybe because I'm impatient and the artifact here is only a few minutes long. Maybe because I really like looking at hand-drawn illustrations. Or maybe because I can actually contribute to this project still; the Life in a Day thing is over and I'm not a video person anyway.

The idea with Johnny Cash is that people re-draw frames of a music video in their own artistic styles. Then when you watch the video, different aesthetics flicker in and out, like realistic and impressionistic and sketchy and haunting and fan art and non sequitur... Here's the link again if you don't want to scroll up and find it. Go take a look!

Some differences for version 2 of this flat contribution model are that a) people work from a common input artifact (frames of a video) and b) they're not working totally in isolation. You can see what other people have drawn before you start drawing your own frame. Working from a common artifact is a lot like the map part of map reduce, and there's research on how to actually use the map reduce model with humans.

Hierarchical Contribution Model

I was moving in this direction by mentioning that in the Johnny Cash Project, people can see what other people have done so far, and that may or may not influence how they draw their own frame.

About 2.5 years ago, Adam and I made an Android app called Sketch-a-bit. It's a gray-scale drawing application, which gives it a similar aesthetic to Johnny Cash, but any time you want to open the app and sketch something, your canvas isn't blank, but an image drawn by another user. Every uploaded sketch is added to the global pool of images for another user to randomly download some day. Every sketch also remembers its parent sketch, so the lineage of each drawing can be traced back to the one white canvas that we seeded the system with. Here's a cool skull drawing that you can trace the ancestry of -- look out for a "SHDH" as a child of one of the ancestors. ;)
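To make the lineage idea concrete, here's a toy sketch of tracing a drawing back to the seed canvas. It is not our server code. It assumes sketches are stored as a plain dict mapping each sketch id to its parent id, with None for the white seed canvas, and the names lineage and children are invented for this example.

```python
def lineage(sketch_id, parent_of):
    """Return the chain [sketch, parent, grandparent, ..., seed]."""
    chain = [sketch_id]
    seen = {sketch_id}
    while parent_of.get(chain[-1]) is not None:
        parent = parent_of[chain[-1]]
        if parent in seen:
            # Should never happen in a real tree; guards against bad data.
            raise ValueError("cycle in parent links at %r" % (parent,))
        chain.append(parent)
        seen.add(parent)
    return chain


def children(sketch_id, parent_of):
    """Return the sorted ids of every sketch drawn on top of sketch_id."""
    return sorted(s for s, p in parent_of.items() if p == sketch_id)
```

With parent_of = {0: None, 1: 0, 2: 1, 3: 1}, lineage(3, parent_of) walks 3, 1, 0, and children(1, parent_of) finds the siblings 2 and 3, which is how you'd spot a drawing hanging off one of the ancestors.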

The gist of this model, and of Sketch-a-bit, is that people build off other people's work, and the "artifact" -- the 80,000 images that people have drawn so far -- is never really done. It's this thing that's alive and continues to grow and evolve over time.

You may have noticed that a large percentage of those projects described above have Aaron Koblin as their instigator/creative director. His work is super cool, but I'm proud of him for finally making something hierarchical -- The Exquisite Forest -- which debuted this summer, about 2 years after Sketch-a-bit.

The projects in the "voting/consensus" category do follow this hierarchical (or evolutionary) model, but instead of contributing by executing some kind of artistic vision, people contribute simply by voting for their preferred automatically generated offspring.

Of the projects listed above, I would say that Sketch-a-bit, Exquisite Forest, Drawception, and especially Monster Mash, are all somewhat related to the old-timey surrealist parlor game Exquisite Corpse. So this hierarchical contribution model isn't really new, I just haven't seen as many instances of it as I would like.

Upside-down Contribution (Distribution?) Model

I totally just invented this one right now based on the two projects in the "Playing w/ data" category. In the House of Cards project, point cloud data from laser scanning Thom Yorke's face is released on Google Code and anyone can go make stuff with it. In the White Glove Tracking project, Michael Jackson's gloved hand is hand-tracked (by the crowd) through an entire music video of Billie Jean, and then the tracked hand data is released for people to play around with.

Instead of many people contributing to one artifact, many people consume one artifact (the data) and produce many different things from it.

Towards... an Ecosystem of Ecosystems! 

This deserves its own blog post some day. The idea is that I care about how and what people create based on what they've seen other people create, and that the creation process itself is more valuable than the artifact that you get at the end. We've seen a bunch of crowdsourced or collaborative art projects now, some of which are like little ecosystems in which individuals evolve each other's artifacts. Why not evolve the very nature of those creative ecosystems themselves?


Adam and I wrote a paper about some of our findings with Sketch-a-bit: Emergent Remix Culture in an Anonymous Collaborative Art System. The pictures are from a slide deck I made for the talk I gave at a Human Computation in Games and Entertainment workshop at AIIDE.