Synthetic media is the rapidly growing field of AI-generated media, closely tied to generative adversarial networks. It covers the creation or modification of media by automated means. And even though it is mostly associated with its latest, most popular implementations such as deepfakes, it spans a much wider spectrum, including text-to-speech generation, music synthesis, image synthesis, and more.
Though “synthetic media” is the more general term, more specific methods such as text and voice synthesis are sometimes referred to as “deepfakes” by extension (“deepfakes for text”, “deepfakes for voices”, etc.).
In the blink of an eye, media makers got upgraded. Creating a new synthetic digital world where anything is possible is now a reality. We are entering a new digital era: the era of digital doppelgangers, where the phrase “you can be anybody” gets a whole new meaning.
Historically, synthetic media has developed hand in hand with the evolution of computers. The first algorithmic and generative experiments date back to the 1950s. Like other fields tied to computer science, synthetic media stayed in hibernation until the late 1980s, when computational power started to grow. More consumer-friendly personal computers and the spread of the World Wide Web pushed forward algorithmic research in music and the visual arts.
In 1997, Christoph Bregler, Michele Covell, and Malcolm Slaney published a paper describing their Video Rewrite program. It built upon older work that interpreted faces, synthesized audio from text, and modeled lips in 3D space, but was the first to put all of this together and animate it convincingly. Academics laid the groundwork for the upcoming era of AI. It is fair to say that the rise of synthetic media started in the early 2000s.
Facial recognition improved drastically, and technologies like motion tracking and voice recognition became commercially accessible.
Until a Reboot Do Us Part
It seems like synthetic branded promotions are becoming routine. Forget the telemarketer shouting at you from the TV, trying to sell you things you don’t need. Marketing for a digital society is different. You don’t need to convince anyone. You just need to be, well, synthetic.
A Japanese adult-game brand launched a PR campaign to attract more players to its game New Wife: Lovely X Cation. One of the unique selling propositions was the chance to marry a virtual girl of your choice. Apparently, plenty of young people were keen to try this unique experience. As a result, 100 “grooms” were selected, each “marrying” a character of their choice from the erotic game. But how can you kiss a virtual bride? Well, it is a little tricky: a staff member with a controller and a marshmallow will help.
Virtual influencers like Imma or Miquela don’t surprise us anymore. Synthetic celebrities have millions of followers. Hatsune Miku tours and sells concert tickets. Your home assistant greets you when you come back home.
So, in the context of synthetic media, content is automatically “translated” and adjusted for its target audience. It is a more personalized, microtargeted approach. All of this interplays with other AI trends, such as affective computing, giving content the ability to laser-focus on its audience.
More flexible, microtargeted content equals more engagement. It is only a matter of years before avatar celebrities are customized to appeal to different audiences, much like character customization in a video game.
Mass automation of creative and journalistic jobs is now possible. We can now write a headline, hand it over to a computer system, and get back a full article written top to bottom based on that headline. This technology makes misinformation campaigns run by trained algorithms even scarier.
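Real headline-to-article systems are built on large neural language models, but the core idea of statistical next-word prediction can be illustrated with a toy first-order Markov chain. Everything below, including the tiny sample corpus, is invented purely for illustration:

```python
import random
from collections import defaultdict

# A tiny "training corpus" -- a real system would train on millions of articles.
corpus = (
    "the market rallied after the report "
    "the report surprised the market "
    "analysts said the market would rally"
).split()

# Learn which words follow which: a first-order Markov model.
model = defaultdict(list)
for prev, nxt in zip(corpus, corpus[1:]):
    model[prev].append(nxt)

def generate(seed_word, length=8, rng_seed=0):
    """Continue a 'headline' word by word using learned transitions."""
    rng = random.Random(rng_seed)
    out = [seed_word]
    for _ in range(length):
        options = model.get(out[-1])
        if not options:          # dead end: no word ever followed this one
            break
        out.append(rng.choice(options))
    return " ".join(out)

print(generate("the"))
```

A modern system replaces the word-count table with a neural network conditioned on the whole headline, but the loop — predict the next token, append, repeat — is the same.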
In the audio domain, we have the technology to synthesize speech in a person’s voice. You record a couple of hours of the person speaking and then do what’s called text-to-speech: you type, and the computer synthesizes anything you want the person to say.
How do deepfakes work?
You have two algorithms. One is called the synthesis engine (the generator) and one is called the detector (the discriminator). The synthesis engine’s job is to generate an image of a person. It starts from random noise, produces an image, and hands it to the detector. The detector’s job is to say whether this looks like a real person or not. It has many images of real people at its disposal to compare against.
Every round, the detector’s verdict flows back to the synthesis engine, which adjusts its output and tries again. This process repeats millions of times, and eventually, through trial and error, the synthesis engine learns to create a convincing image, synthesizing everything from hair to clothes.
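That feedback loop is a generative adversarial network (GAN). As a minimal sketch of the idea, here is a 1-D toy GAN in plain NumPy: the “real images” are just numbers drawn from a target distribution, the synthesis engine and detector are one-parameter-pair linear models, and all names and constants are invented for this example:

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(x):
    # Clip logits to avoid overflow in exp().
    return 1.0 / (1.0 + np.exp(-np.clip(x, -30.0, 30.0)))

# "Real" data: samples around 4.0 -- a stand-in for real images.
def real_batch(n):
    return rng.normal(4.0, 0.5, size=n)

gw, gb = 1.0, 0.0   # synthesis engine (generator): noise z -> gw*z + gb
dw, db = 0.1, 0.0   # detector (discriminator): sample -> P("real")

lr = 0.02
for step in range(3000):
    z = rng.normal(size=32)
    fake = gw * z + gb          # the generator's current output
    real = real_batch(32)

    # Train the detector: push D(real) toward 1 and D(fake) toward 0.
    for x, label in ((real, 1.0), (fake, 0.0)):
        p = sigmoid(dw * x + db)
        grad = p - label        # gradient of cross-entropy w.r.t. the logit
        dw -= lr * np.mean(grad * x)
        db -= lr * np.mean(grad)

    # Train the generator: adjust gw, gb so the detector says "real".
    p = sigmoid(dw * fake + db)
    grad = (p - 1.0) * dw       # chain rule back through the detector
    gw -= lr * np.mean(grad * z)
    gb -= lr * np.mean(grad)

# gb should have drifted toward the real mean (4.0) during training.
print(f"generator mean ≈ {gb:.2f}, real mean = 4.0")
```

A deepfake generator works the same way, except the generator and detector are deep convolutional networks and the samples are images instead of single numbers.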
Types of deepfakes: face swap, lip-sync, and puppet master.
Adversarial machine learning has other uses besides generative modeling, and can be applied to models other than neural networks. In control theory, adversarial learning based on neural networks was used in 2006 to train robust controllers in a game-theoretic sense.
Attention turned to the field of synthetic media in 2017, when media outlets reported on the emergence of pornographic videos featuring celebrities, with the faces of famous actresses swapped in by AI algorithms.
New synthetic technologies are a fast-moving, important field of human-machine collaboration. On one hand, it is exciting and innovative; on the other, disturbing and ethically questionable. We are going to witness a shake-up of public discourse for years to come, no doubt about that.
We have a tool that makes it possible to create both lucid dreams and endless nightmares. But that’s the case with every new tool humanity has come to possess, right?