
Metal Toad CEO: The Machines are Watching

As Metal Toad has gotten more involved in the AWS machine learning ecosystem, I’ve found myself having conversations that sound a little like this:

Friend: “So, what have you been up to?”
Me: “Learning about machine learning. There’s so much you can do with it!”
Friend: “Neat. Like what?”
Me: “The limits are your imagination!”
Friend: “… So how’s the weather where you are at?”

While I hope most of my conversations are more dynamic and engaging in reality, the question “What can you do with machine learning?” doesn’t have a simple answer. With over two dozen algorithms available on AWS alone, I want to get more specific about particular use cases, which I’ll share as a series. The first one is automated content moderation.

The challenge

Everyone these days knows about user-generated content. Videos uploaded to YouTube (or Twitter, Facebook, and other social platforms) are not created by vetted, closely managed companies but by individuals. Anyone with a cellphone can, in fact, upload to YouTube.

As of this writing, approximately 6.648 billion people, or 83.72% of the world’s population, have a cellphone. That is the potential pool of contributors to every video and content platform in the world.

If we zoom in on YouTube as a microcosm, 500 hours of video are uploaded to YouTube every minute. That’s 30,000 hours of video every hour, which means that to have a person look at every video, you would need a workforce of 30,000 people watching around the clock, without ever taking a break or blinking. Clearly, this is an almost impossible task.

Solution 1: Do nothing

One simple solution to this challenge would be to allow everything to be uploaded. While throwing up our hands might seem like a rational response, people do some pretty weird things. By way of example, during the pandemic I decided to attend some Los Angeles City Council meetings virtually. It was early days and things were still being figured out, so the meetings were hosted on Zoom. Several hundred people ended up dialing in, and within that group, not one but two people decided to use their video feed to display a pornographic movie.

Eventually (after 20 minutes or so), the city council got things under control, but now imagine billions of anonymous contributors. There is going to be a LOT of content that needs to be vetted.

Solution 2: Bring in the robots

If there aren’t enough people to do the job, we enter the world of machine algorithms. Described simply, these are software systems that can “watch” video (or other types of content) and decide whether it should be approved, flagged for human review, or deleted. All the big tech companies have their own, built up over time.
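To make that concrete, here is a minimal sketch of the decision logic such a system might apply. The thresholds and the label format are hypothetical, not any particular vendor’s defaults:

```python
# A minimal sketch of automated moderation decision logic.
# The thresholds and the shape of the labels are hypothetical.

def moderate(labels):
    """Decide what to do with a piece of content, given the list of
    moderation labels (name + confidence, 0-100) a model assigned to it."""
    if not labels:
        return "approve"  # the model found nothing objectionable
    top = max(label["confidence"] for label in labels)
    if top >= 95:
        return "delete"  # near-certain violation: act automatically
    if top >= 60:
        return "flag_for_human_review"  # uncertain: escalate to a person
    return "approve"  # low-confidence noise

print(moderate([{"name": "Violence", "confidence": 72.5}]))
# -> flag_for_human_review
```

The key design choice is the middle band: content the model is unsure about goes to a person rather than being silently deleted or approved.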

Each company has its own standards of acceptable use and has been able to spend millions of dollars developing its own content moderation algorithms (which I will sometimes affectionately refer to as “robots”).

These robots have several advantages over a human workforce:

• They can watch literally everything
• They never take a break or go on vacation
• They treat content the same (unlike a workforce of 30,000 humans)
• They prevent real people from having to see disturbing things

The AWS solution

Not every company has millions of dollars to spend building an algorithm, but AWS does. They call their service Rekognition (I talk more about its capabilities here), and within its moderation capabilities, Rekognition can reliably flag the following:

• Explicit nudity
• Suggestive content
• Violence
• Visually disturbing content (dead bodies, explosions, etc.)
• Rude gestures
• Drugs
• Tobacco
• Alcohol
• Gambling
• Hate symbols
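As a concrete example of what calling it looks like, here is a sketch using boto3, the AWS SDK for Python. The bucket and file names are placeholders; substitute your own:

```python
import boto3  # the AWS SDK for Python

# A sketch of a single image moderation check with Amazon Rekognition.
rekognition = boto3.client("rekognition")

response = rekognition.detect_moderation_labels(
    Image={"S3Object": {"Bucket": "my-uploads-bucket", "Name": "frame-0042.jpg"}},
    MinConfidence=60,  # drop labels the model is less than 60% confident in
)

for label in response["ModerationLabels"]:
    # ParentName is the top-level category (e.g. "Violence"); Name may be a
    # more specific subcategory. ParentName is empty for top-level labels.
    print(label["ParentName"] or label["Name"], f"{label['Confidence']:.1f}%")
```

For stored video, Rekognition offers an asynchronous equivalent (StartContentModeration/GetContentModeration) that returns the same kind of labels along with timestamps.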

Using the algorithm, any video or image feed can be reviewed on demand. The cost for this service generally runs around $0.10 per minute, which means an hour’s worth of video would cost $6 to review this way. That’s great compared to the human cost, but it’s unlikely that YouTube will be switching over: at 30,000 hours uploaded every hour, reviewing their upload stream would cost $180,000 per hour.
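For the curious, the back-of-the-envelope math works out like this:

```python
# Back-of-the-envelope cost math from the figures above.
price_per_minute = 0.10          # approximate Rekognition rate, USD per minute
upload_rate_hours_per_min = 500  # hours of video uploaded to YouTube per minute

cost_per_hour_of_video = price_per_minute * 60            # $6.00
hours_uploaded_per_hour = upload_rate_hours_per_min * 60  # 30,000
cost_per_hour_of_uploads = cost_per_hour_of_video * hours_uploaded_per_hour

print(f"${cost_per_hour_of_uploads:,.0f} per hour")  # -> $180,000 per hour
```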

The moral perspective

With every machine learning workload, I think it’s important for humans to take a beat to understand the moral implications of delegating work from people to robots.

This shift means less money being paid out to unskilled workers, but in the case of content moderation, the trade-off of not needing to do highly repetitive, sometimes psychologically disturbing work is more than worth it.

One risk that does exist is accidental flagging, where benign content is unintentionally censored. This could be as simple as an elbow being mistaken for buttocks, or a geometric pattern in a background being mistakenly flagged as a swastika. There are also more subjective calls: What is art? What should be censored? Ultimately, the controls within Rekognition give you broad latitude over what is or isn’t flagged or deleted, but these are still important questions to ask and answer.
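For example, a platform might only act on certain categories, each at its own confidence bar. Which categories to enforce, and at what thresholds, is an editorial choice; the values in this sketch are hypothetical:

```python
# Sketch: apply a site-specific policy on top of Rekognition's labels.
# The categories and thresholds here are hypothetical editorial choices.
POLICY = {
    "Violence": 80,      # act only when the model is quite confident
    "Hate Symbols": 50,  # act even on lower-confidence matches
    # "Suggestive" deliberately absent: allowed on this platform
}

def violations(moderation_labels):
    """Return the policy categories a piece of content violates."""
    hits = []
    for label in moderation_labels:
        category = label.get("ParentName") or label["Name"]
        threshold = POLICY.get(category)
        if threshold is not None and label["Confidence"] >= threshold:
            hits.append(category)
    return hits
```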

Just because we can do something doesn’t mean we should, especially when it comes to machine learning. In the case of content moderation, the trade-off is well worth it. We’d love to hear from you or talk more about any ideas this has generated for you.

Joaquin Lippincott, Metal Toad CEO, Founder