Somehow I doubt it really is that easy.
I've done it. Granted, it doesn't do real-time video classification, but image classification is piss-easy. The only really difficult part is getting it to work with images of varying dimensions, but one could just resize them all to one fixed set of dimensions and feed them into the network that way.
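To show what I mean by the resize trick, here's a rough sketch in plain Python. Everything in it is made up for illustration: the `resize_nearest` helper, the tiny grayscale "images", and the 4x4 target size. Nearest-neighbor sampling is just the simplest option; in practice you'd let an image library do this with better interpolation.

```python
def resize_nearest(image, new_h, new_w):
    """Resize a 2D grid of pixel values to new_h x new_w by nearest-neighbor sampling."""
    old_h, old_w = len(image), len(image[0])
    return [
        # Map each output pixel back to the closest source pixel.
        [image[r * old_h // new_h][c * old_w // new_w] for c in range(new_w)]
        for r in range(new_h)
    ]


# Two hypothetical grayscale "images" with different shapes.
small = [[0, 255],
         [255, 0]]
wide = [[10, 20, 30, 40, 50, 60]]

TARGET = (4, 4)  # hypothetical fixed input size for the network

# After this, every image in the batch has the same dimensions.
batch = [resize_nearest(img, *TARGET) for img in (small, wide)]
```

The network only ever sees `TARGET`-sized inputs, so the original dimensions stop mattering.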
Allow me to go into details!
Okay, so for image classification (or anything where you need more variable pattern matching) you don't use a normal (termed "dense") neural network, but a convolutional neural network. These take in a small patch of the input at a time (say, a 3x3 pixel area) and slide that same patch detector across the whole image. This allows the network to notice patterns regardless of location. Work this magic together with a recurrent neural network (which is purpose-built for time-series data like video) and you can have something that can classify any video it is given.
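Here's a minimal sketch of the patch idea in plain Python, so you can see why location doesn't matter. The `PLUS` kernel is hand-picked and the helper names are hypothetical; a real convolutional net learns its kernels during training and stacks many of them, so treat this as an illustration of one convolution step and nothing more.

```python
def convolve2d(image, kernel):
    """Slide the kernel over every patch of the image (no padding, stride 1)."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(image[r + i][c + j] * kernel[i][j]
                for i in range(kh) for j in range(kw))
            for c in range(out_w)
        ]
        for r in range(out_h)
    ]


def stamp(height, width, pattern, top, left):
    """Build a blank image and draw the pattern with its corner at (top, left)."""
    image = [[0] * width for _ in range(height)]
    for i, row in enumerate(pattern):
        for j, value in enumerate(row):
            image[top + i][left + j] = value
    return image


def best_match(image, kernel):
    """Return (response, row, col) for the patch that responds most strongly."""
    response = convolve2d(image, kernel)
    return max((value, r, c)
               for r, row in enumerate(response)
               for c, value in enumerate(row))


# Hand-picked 3x3 kernel that responds to a plus-shaped pattern (an assumption;
# a trained network would learn its own kernels).
PLUS = [[0, 1, 0],
        [1, 1, 1],
        [0, 1, 0]]

img_a = stamp(6, 6, PLUS, top=0, left=0)  # pattern in the top-left corner
img_b = stamp(6, 6, PLUS, top=3, left=2)  # same pattern, different spot

match_a = best_match(img_a, PLUS)
match_b = best_match(img_b, PLUS)
```

The same nine weights give the same peak response in both images; only the position of the peak changes. A dense network would have to learn the pattern separately for every location.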
I've done convolutional nets before; they require no more than a few dozen lines of Python to get trained, if you use the right libraries, all of which are totally free.
The only other challenge, besides gathering initial data for training, testing, and validation (which could be legally iffy, given the content in question), is tuning the network's topology to increase accuracy by allowing it to pick out the right set of features in any given image or video. Once it's done training and you've validated that it's actually accurate, running the thing is extremely quick. Hell, my local Walmart uses it to detect moving people in a live CCTV feed.
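For the data side, the training / validation / testing split is about as simple as it sounds. A quick sketch, assuming a list of (filename, label) pairs; the file names, the 70/15/15 proportions, and the `split_dataset` helper are all made up for illustration:

```python
import random


def split_dataset(samples, val_frac=0.15, test_frac=0.15, seed=0):
    """Shuffle the samples and cut them into train / validation / test lists."""
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)  # fixed seed so the split is repeatable
    n_test = round(len(shuffled) * test_frac)
    n_val = round(len(shuffled) * val_frac)
    test = shuffled[:n_test]
    val = shuffled[n_test:n_test + n_val]
    train = shuffled[n_test + n_val:]
    return train, val, test


# Hypothetical labeled data: (filename, label), label 1 = "flag for review".
samples = [(f"img_{i:03d}.png", i % 2) for i in range(100)]

train, val, test = split_dataset(samples)
```

You train on `train`, tune the topology against `val`, and only touch `test` at the very end to check that the accuracy number is real.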
I learned this in an intermediate-level computer science course. Tumblr, with all the money it no doubt has, with all the developers it's got on-call, could surely get a couple good servers with high-end GPUs (this is why Nvidia is raking in money like crazy; GPUs are insanely good at machine learning) and have their trained monkeys spend a few moons (at the absolute most) creating, training, and optimizing classifiers for both static images and video to find and flag (not remove, just flag for human review) potential child pornography.
The resources are out there, and they're totally fuckin' free. Tumblr just needed to get off its ass and spend some of that sweet, sweet advertiser cheddar to get it done. But that requires time, effort, and money. As we all know, that's the devil's work.