23 February 2009

Mathematica Home Edition: US$295

Hi there -

Following two small self-educational / art projects here using Wolfram Mathematica (see through bushes! funkify your portrait!), a Wolfram representative wrote to me so I could let you readers know that a fully-featured license to Mathematica 7 Home Edition can be had for just $295.

This is a good deal. You can learn more about it here.

By the way, we use Mathematica 7 (I defy you to click that link) at Actuality, and have noticed some great enhancements over v6.

For example, it now runs wonderfully on my 2004-vintage IBM ThinkPad T41 - whereas, in the past, 3-D visualization would crash the program. It also seems to take advantage of multi-core processors. And it comes with new image-processing functionality that makes it easier to... aw, heck, check it out for yourself. If you use the professional edition at work, I think it comes with a home-use license too.

g-fav

18 February 2009

Good-enough synthetic aperture photography

Hello -

Today I'd like to share progress on my efforts to learn about "synthetic aperture photography," a branch of computational photography.

Jenn and I drove up Mass. Ave. in Lexington (i.e., Paul Revere Lexington), put a video camera out the window, and filmed the buildings whizzing by.

Here is a typical collection of frames of, say, a school (or apartment building?) blocked by trees:



After picking, say, 16 successive movie frames, you can stack them up, shear them, and slice them to simulate a camera whose lens is 20 feet across or so. Why? Because manipulating the data this way creates a final image of the building with the trees removed:



This won't be news to those of you who have been in computer graphics since, say, 1997, but there's a lot of activity in analyzing optical systems - such as large-lens or multi-lens systems - for their benefits in graphics. Marc Levoy, a professor at Stanford, taught a class on a collection of topics he called "computational photography." Today the field, building on work in computer vision and 3-D display, includes:

Snapping a photo, and not worrying about focusing it until later. E.g., when you get home.

Visualizing 3-D scenes from angles between those at which you've taken pictures, such as the post-inauguration Microsoft / CNN project.

Taking many snapshots of a scene - say of a building obscured by bushes - and then "erasing" the bushes to show what's behind.

I've been attending talks, flipping through papers, and watching colleagues try their hand at this burgeoning field.

How It's Done

(1) Take video at constant velocity along a linear track, (2) collect the frames into an array - a spatio-perspective volume - and recenter the synthetic film plane on the region of interest, and (3) average over all frames & display the result.

I like Mathematica 7 for its functionality and elegance (though not its memory management). We took a few minutes of video & chose some few-second clips using our MacBook's video editing software. We exported them as a QuickTime movie and then, from QuickTime Pro, exported an AVI. (For some reason my Mathematica won't play nice with MOV.)


(Video: the drive-by footage)


Then...

Import a good 16- or 32-frame segment into an array of images. I chose frames that showed a building in the background with plenty of occluders, minimal vertical jumpiness (remember, we were driving), and relatively constant speed. To save memory, I cropped the images to a horizontal region. Mind you, the linear camera motion doesn't need to be at constant velocity, but constant velocity will make your life easier if you'd like to automate the process.
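Here's a minimal sketch of this step in Mathematica 7. The filename, frame range, and crop rows are placeholders - substitute whatever your own clip calls for:

 (* import 16 consecutive frames from the AVI as a list of images *)
 frames = Import["driveby.avi", {"ImageList", Range[40, 55]}];

 (* crop each frame to a horizontal band of rows, to save memory *)
 frames = ImageTake[#, {100, 220}] & /@ frames;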

Verify the constant velocity by viewing a 2-D slice of the 3-D "spatio-perspective volume," an informal approximation of an "epipolar-plane image," as Bolles and Baker called it in the 1980s.

What's that? For the purposes of this blog post, it is a 2-D image with time occupying the vertical axis and space occupying the horizontal.

Hang with me for a second. Even if you're not a computer graphics nut, I think this offers an interesting way to view the world, in the spirit of an earlier post regarding how engineers sometimes find it easier to manipulate information if it's first converted into a different format.


(Links to: Jan Neumann)

Imagine taking a sequence of frames from that movie, above, printing them out, and then stacking them up like a deck of cards. The back card will be the car's starting point, and the front card will be the car's ending point. Got it? Now imagine whipping out your sharpest knife, slicing through the deck, and peering down at the cut.

It would look like this:



Since nearby objects whiz past our field of view quickly, they cover lots of ground in very little time. The nearby trees and signposts therefore are the most horizontally-slanted objects in the image above. On the other hand, the school is in the distance, so it appears to creep along as we drive past it. It's nearly vertical in this representation.
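If you'd like to build this slice yourself, it's a one-liner in Mathematica 7. The row number is a placeholder - pick one that cuts through the building:

 (* one pixel row from each frame, stacked so time runs down the page *)
 epiRow = 60;
 epi = Image[ImageData[#][[epiRow]] & /@ frames]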

Fortunately most of the tracks through the image are linear, meaning that this process can be completed with a minimum of pain.

What if we wanted to "freeze" the motion of the building, so we could synthetically "focus" on it? We'd need to recenter the film plane by shearing this stack of images. By trial and error, I found that the building moves 4 pixels to the right in each successive frame.

We recenter the data by incrementally padding it. If we do it correctly, the building's spatio-temporal tracks will be vertical:


Recentered film plane.

Great!
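In code, the recentering can be done by sliding a crop window across each frame (cropping rather than padding - it comes to the same thing for our purposes). A sketch, using the 4-pixel shift found above:

 shift = 4;  (* pixels per frame, from trial and error *)
 {w, h} = ImageDimensions[First[frames]];
 win = w - shift Length[frames];  (* width of the shared window *)
 recentered = MapIndexed[
   ImageTake[#1, {1, h}, {1 + shift (First[#2] - 1),
     win + shift (First[#2] - 1)}] &, frames];

The window chases the building to the right by 4 pixels per frame, so the building's track through the volume becomes vertical.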

Now we just need to eliminate those pesky trees. Real-life lenses are good at doing that, because if the lens is big enough, it can "look around" the trees. In aggregate, little pieces of the lens really do get to see the whole face of the building. And so does our video camera.

We can simulate this giant-lens action by simply averaging over all of those frames. (And thanks to my co-worker, Joshua Napoli, for putting it so simply. Here is his similar project from last year - viewing houses through trees - but his blog post has since gone AWOL.)
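In Mathematica, the whole giant-lens trick is one line - a pixel-by-pixel average of the recentered frames:

 synthetic = Image[Mean[ImageData /@ recentered]]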

What does it look like?



Success. Trees are blurred out. Compare to the photos at the top of this post.

How can we take this a step further? We can simulate a variable-focus lens by computing what every possible set of shearing parameters does. That is, we can tilt our deck of cards by varying degrees so that the space-time paths of various objects become vertical, and hence can be imaged by our gigantic synthetic lens.

Here's how this looks. Let's recenter with a shift of 7 pixels per frame so we can "focus" on a tree in the center, and "stop down" the aperture by averaging over fewer frames than we did above.
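Wrapping the shear-and-average into one function makes this kind of trial and error painless. A sketch, with the per-frame shift and the frame count (our aperture stop) as the two knobs; Take simply grabs the first n frames:

 refocus[fs_, shift_, n_] := Module[{sub = Take[fs, n], w, h, win},
   {w, h} = ImageDimensions[First[sub]];
   win = w - shift Length[sub];
   Image[Mean[ImageData /@ MapIndexed[
     ImageTake[#1, {1, h}, {1 + shift (First[#2] - 1),
       win + shift (First[#2] - 1)}] &, sub]]]]

 treeFocus = refocus[frames, 7, 8]  (* 7 px/frame, 8-frame aperture *)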



Whew!

Still awake? If you're interested in this stuff, try out:

-g

P.S. A big thank-you in advance to anyone who can explain why Blogger (1) inserts images at the top of the post rather than at the cursor, and (2) adds an extra linefeed to every paragraph whenever I insert a photo.

16 February 2009

Facebook keeps your stuff even if you quit

Plink co-founder and Polaris Ventures partner Simeon Simeonov wrote a blog post regarding a recent change to the Facebook terms of service. He links to this piece at Mashable, which explains the changes suggesting that Facebook owns what you've posted even if you deactivate your account.

You are solely responsible for the User Content that you Post on or through the Facebook Service. You hereby grant Facebook an irrevocable, perpetual, non-exclusive, transferable, fully paid, worldwide license (with the right to sublicense) to (a) use, copy, publish, stream, store, retain, publicly perform or display, transmit, scan, reformat, modify, edit, frame, translate, excerpt, adapt, create derivative works and distribute (through multiple tiers), any User Content you (i) Post on or in connection with the Facebook Service or the promotion thereof subject only to your privacy settings or (ii) enable a user to Post, including by offering a Share Link on your website and (b) to use your name, likeness and image for any purpose, including commercial or advertising, each of (a) and (b) on or in connection with the Facebook Service or the promotion thereof. You represent and warrant that you have all rights and permissions to grant the foregoing licenses.

-g-fav

10 February 2009

Audio signal processing, or "On T-Pain & Kanye West..."

Hi - 

Have you heard Kanye West's "Love Lockdown," an entrancing bit of intriguingly spare but heavily-engineered instrumentation?  Like many 2000s hip-hop songs, it features a vocal effect attributed to pitch-correction software - Auto-Tune - from Antares Audio Technologies. [1]

According to a note on XXLmag.com (ref. 1), we're supposed to understand that the pitch-locking vocal effect is not strictly a plain vocoder, but the proprietary Antares technology. Perhaps it is described by U.S. Pat. 5,973,252 [Google Patents], "Pitch detection and intonation apparatus and method."  (Side note: Antares's Andy Hildebrand is a former seismic 3-D data analyst!  Small world, signal processing...)

Commenters on XXLmag.com offer examples of extreme Auto-Tune use (e.g. T-Pain), as opposed to traditional vocoder effects, but, frankly, I can't recall enough from my grad-school speech-synthesis days to say what the differences are between a vocoder, a phase vocoder, and Auto-Tune.  Ah, well.  I know a few of you reading this do indeed know the difference, so please chime in.

Before I leave you with a few YouTube clips, I want to explore the orchestration of "Love Lockdown" because it is a relatively minimal presentation of a theme for Top-40 radio.

First, the core tracks are available on Kanye's website, on the page "Love Lock Down Stems." They include a cappella and (quite) distorted, guitar-esque vocals, a 4-beat 808 drum track (||: 1&2 3 ... :||), African percussion, piano (tutorial here, though I don't know if it's in the right key [YouTube]: C#m7, F#m7, B), and an outro synth.

The African percussion sounds like a straightforward 1-bar loop - though, for a moment there, it made me wonder if it's actually 4 bars in some subtle way.  Anyhow, omitting the lead-in bar crossing and the quarter-note clap:

(0:55)
 1e&a 2e&a 3e&a 4e&a 1e&a 2e&a 3e&a 4e&a
 x xx x x  x xx x xx x xx x x  x xx x xx

 1e&a 2e&a 3e&a 4e&a 1e&a 2e&a 3e&a 4e&a
 x xx x x  x xx x xx x xx x x  x xx x xx

From the listener's point of view, four tracks are active during most of the song: the 808, vocals, piano, and African percussion.  During the first half of the song, they layer (much like standard electronic "house music," i.e. kick, snare, bass, synth, black & white movie samples... :-):

(timings from iTunes; M = measures)

time   length  808  voc  pno  Afrc
0:00   4M      x
0:07   16M     x    x
0:40   8M      x    x    x
0:55   8M      x    x    x    x
1:13   16M     x    x    x

Frankly I got a bit more into it than that, mapping out the song:


Song structure

Then, dear reader, around 10pm I started to wonder why on earth I was putting so much effort into this when I could be, I don't know, doing just about anything else.  I think my intrigue with the song's simplicity became... a spreadsheet.

Anyhow, here are a few YouTube clips for you to consider:

[embedding disabled]
(Video. Kanye West, "Love Lockdown" - commercial version)



(Kanye West, "Love Lockdown" - an evidently leaked SNL version)

[embedding disabled]
(T-Pain and Lonely Island / SNL digital short: "I'm On a Boat")

[embedding disabled]
(Kraftwerk, "The Robots")



(Video. See starting @ 3:00 - poking fun at how poor of a singer one can be & still sound in-tune-ish)

(A plain vocoder sounds quite different: YouTube clip)

Finally, being a New Jerseyite, I must leave you with Bon Jovi's "Livin' on a Prayer."  No idea what this effect is.



-g-fav



Hot display technologies at CES 2009

Display industry consultancy Insight Media has released its "CES 2009 Best Buzz Awards," which include a Samsung cellphone with a built-in projector, a USB-powered 7" monitor (?), and a Vizio LCD TV.  As usual, Insight Media digs into the details.

g-fav

04 February 2009

Peer Review

Every now and then, I'm asked to help with the anonymous review of science or engineering journal manuscripts.

For some reason I got a kick out of this paragraph (this is from the OSA):

Although this paper need not be exceptional, it should add significantly to the field for you to recommend acceptance or revision. Lately, a substantial number of papers have been submitted that can be called "not wrong" papers. These are papers that contain no errors, but they also lack any new and useful information that would move your field forward; they may provide no citable results, or document so little progress that researchers in your field will ignore them. These papers take up your time and ours; they clutter up the literature; and they do not advance research in the field. If you find this paper fits this description, you should recommend that the paper be rejected.

-g

03 February 2009

For my entrepreneur friends who are tired of pitching to VCs

I stumbled across Merlin Mann's entry in the "worst website ever" panel at SXSW 2008.

Dear reader, you need more acronyms! You need to open the kimono! You need DRM and paradigms! You need more socially-networked "friends"!

Here you go. (movie)


g-fav

01 February 2009

Netbooks, baby!

With nice, compact, powerful laptops like the MacBook, what exactly makes netbooks so enticing? I'm trying to figure that out, and seeing if anything's worth buying... yet.

Here is a good top-10 review from C|Net Crave (in the UK). It's from Oct. 2008, but it seems reasonably up-to-date.

(Do you have and like your netbook? Ever written a document on it? Use Linux / StarOffice? Read a technical paper in PDF? I'm curious.)

g