AI Archives - Friend Michael
https://friendmichael.com/Tags/ai

Protecting Your Identity in 2025: Proof of Humanity
https://friendmichael.com/Blog/protecting-your-identity-in-2025-proof-of-humanity.html (Fri, 06 Dec 2024)

Imagine receiving a desperate call from someone who sounds exactly like a loved one, pleading for help. Or encountering a seemingly legitimate website with perfectly crafted profiles that mirror real individuals. These scenarios are no longer confined to the realm of science fiction; they represent the alarming reality of AI-driven scams. As artificial intelligence evolves, so do the tactics of scammers, making it imperative for us to adapt and safeguard our digital identities.

The Rise of AI-Driven Scams

The Federal Bureau of Investigation (FBI) has issued a stark warning about the growing sophistication of AI-powered scams. Criminals are leveraging advanced AI tools to create hyper-realistic fake content, from profile photos and identification documents to chatbots on fraudulent websites. These tools eliminate the telltale signs of scams we used to recognize, such as poor grammar or awkwardly doctored images, making it harder than ever to discern truth from deception.

One of the most concerning developments involves the use of AI to clone voices. With just a few seconds of your voice, malicious actors can generate convincing replicas to orchestrate scams or impersonate you. This technology has already been used in distressing ways, such as fake emergency calls designed to manipulate victims into giving away sensitive information. The implications are profound, affecting not only individuals but also businesses and public figures.

Steps to Protect Your Digital Identity

To reduce your risk of falling victim to these scams, the FBI advises limiting the public availability of your voice and images online. Social media, a common repository for personal content, should be approached with caution. Consider making your accounts private and restricting followers to people you know personally. This simple step can significantly reduce the chances of your content being used maliciously.

Another proactive measure involves adopting the concept of a “proof of humanity” word. First introduced by AI developer Asara Near in 2023, this is a unique word or phrase shared only with trusted contacts. The idea is straightforward: if someone receives a suspicious voice or video call claiming to be from you, they can ask for this secret word to verify your identity. While it may seem low-tech in comparison to the high-tech threat, its simplicity is its strength.
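For the technically inclined, the mechanics are the same as any shared-secret check. Here's a minimal, purely illustrative Python sketch; the secret word is a made-up example, and in practice the "verification" happens out loud between two people, not in code:

```python
import hmac

# The shared secret, agreed on in person and never posted online.
# "sunflower-tugboat" is a made-up example; pick your own.
FAMILY_SECRET = "sunflower-tugboat"

def verify_caller(claimed_secret: str) -> bool:
    """Return True only if the caller knows the family's secret word.

    hmac.compare_digest runs in constant time, so a mismatch is not
    revealed any faster than a match.
    """
    return hmac.compare_digest(claimed_secret.strip().lower(), FAMILY_SECRET)

if verify_caller(input("What's our word? ")):
    print("Identity plausible. Continue the conversation.")
else:
    print("Could not verify. Hang up and call back on a known number.")
```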

Additionally, be mindful of the content you share online. Photos, videos, and even casual voice recordings can be exploited by AI algorithms to create deepfakes. By being selective about what you post and with whom you share it, you can reduce the likelihood of becoming a target. Tools and platforms that prioritize privacy and security can also play a key role in maintaining your digital safety.

The Timeless Power of Simple Solutions

It’s fascinating how an ancient concept like passwords remains relevant in combating modern threats. Long before the internet, passwords were used to verify identities in various historical contexts. Now, amid the rise of AI-generated deepfakes, this old practice is making a comeback in the form of secret words. It serves as a reminder that sometimes the simplest solutions are the most effective, even in the face of cutting-edge technology.

While technology often feels like a double-edged sword, these developments challenge us to think critically about how we engage with it. By implementing thoughtful, proactive measures, we can outsmart even the most advanced scams. Knowledge is our greatest ally in this endeavor, empowering us to protect not only ourselves but also those around us.

Key Takeaways

  • AI tools are being used to create convincing scams, including deepfake voices and fake profiles.
  • Limit public access to your voice and images by making social media accounts private and restricting followers.
  • Adopt a “proof of humanity” word to verify your identity in suspicious situations.
  • Be cautious about the content you share online and use privacy-focused platforms whenever possible.
  • Simple solutions, like secret words, can be powerful tools for combating high-tech threats.

Source: Your AI clone could target your family, but there’s a simple defense – Ars Technica

So, what do you do?
https://friendmichael.com/Blog/so-what-do-you-do.html (Sun, 22 Dec 2019)

One of the most perplexing questions I get, and I’m sure many of you do too, is “What do you do?” For better or worse, it’s how the world used to work. We were products of our situation, our education, and some arbitrary societal rules. You could assume a lot about a person by their answer to that question. It’s usually a horribly simplified version of themselves that doesn’t begin to express their value. And well, we know how assumptions turn out.

There are tens of millions of people who subscribe to a single-occupation philosophy, and that’s not going to change any time soon. It’s also true that many of us have evolved to be stateless in our endeavors and exploration. We’ve chosen to pursue many things: things that interest us, challenge us, or motivate us.

As we move forward, some things are true… automation is going to increase its footprint exponentially, and it’s going to impact those with a single occupation more than others. Not just in the manual labor, transportation, and retail markets, but in places thought to be human-exclusive domains.

AI and machines are doing more in music and the arts than you might expect, from composition (Aiva) to performance, and that’s before you consider things like digital effects (Adobe’s magic wand, powered by Sensei AI). That’s not even scratching the surface of the changes coming in the creative arts.

And the medical field? Look up Giovanni Montana’s work in X-ray imaging and AI. Legal? Luminance and eBrevia. The list is a mile long. All of the incredibly lucrative “college required” fields are subject to automation too. Oh, and look up Neocis (dentistry).

What about construction? There’s ICON.

This isn’t a bad thing in my opinion – speaking as someone who simply loves and embraces technology of all kinds. It can be alarming if you’re not ready to hear it, but that fear won’t change it.

Here’s a reality check… our current government is not prepared to handle this outcome. Our “leadership” tasked staffers with developing questions to grill Zuckerberg. Zuckerberg runs intellectual, Hyperloop-powered circles around our government officials. This should be quite alarming to you. If it’s not, it could be that you’re not fully grasping what it means to have people in our government who don’t “do email,” or even text.

We have a chance to change that in 2020. All but one of the candidates running in the Democratic race are more of the same, specifically when it comes to their technological background. Our country can no longer afford to be led by someone who couldn’t grill someone like Zuckerberg directly, and publicly. We need leadership that is intellectually comparable to the best minds in technology, and that doesn’t have to rely on staff recommendations and lobbyists as its primary data set.

This is why #ISupportAndrewYang in 2020. He’s the only candidate that makes sense for the 21st century. Learn more about Andrew Yang here. #humanityfirst

What do computer graphics, assisted intelligence, Watson, and actual humans have in common?
https://friendmichael.com/Blog/computer-graphics-assisted-intelligence-watson-actual-humans-common.html (Tue, 27 Dec 2016)

Recently I spent some time reading, researching, and watching the state of the art in several areas of interest, notably CGI/computer animation, VR, text-to-voice, and facial pattern recognition. The takeaway is that all of the pieces are in place for an idea I’ve had for several years. Be advised, many of the videos included below are around a year old, so following up directly with their makers may reveal even better implementations.

What follows is a short story about each piece, then a video showing how this piece fits today. The concepts, taken as a whole, will form the foundation of an idea that could potentially change how we get things done in real life. It’s worth the read, and it’s a doozy. Grab a cup of your favorite whatever.

As always, if you like this content, please share it so others can enjoy it too. You never know who will read these things… they might be the one to take this and actually change the world.

Input: the human voice.

Human-computer interaction (HCI) has taken many forms over the years. We know the keyboard and mouse, the stylus, touch, motion, and now, of course, person-to-person social chat. One of the fastest-growing segments in tech is voice recognition, and with it the race to own your users’ homes. This is voice to text to [internet/cloud/action] to text to voice.

Apple has Siri, Amazon has Alexa, Microsoft has Cortana, and Google has, well, “Hey Google.” All of these services empower their owners to ask questions about seemingly anything, to change the TV channel, to order more diapers, or close the garage. They’re all early from a technology standpoint, and have varying degrees of friendliness and usability. But the core is here, and they’ll only get better with time.

All you have to do is talk, in your native language, and magic (or algorithms) happens.
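For a sense of how thin the glue can be, here's a minimal sketch of the voice-to-text leg using the open-source SpeechRecognition package; the library choice is my assumption, and the "action" step is stubbed out:

```python
# pip install SpeechRecognition pyaudio
import speech_recognition as sr

recognizer = sr.Recognizer()

with sr.Microphone() as mic:
    recognizer.adjust_for_ambient_noise(mic)  # calibrate to room noise
    print("Say something...")
    audio = recognizer.listen(mic)

try:
    # Voice -> text: ship the captured audio to a cloud recognizer.
    text = recognizer.recognize_google(audio)
    print(f"You said: {text}")
    # Text -> [internet/cloud/action] would happen here.
except sr.UnknownValueError:
    print("Sorry, I didn't catch that.")
```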

Input: plain old text.

Another interface is text to text. We know these as chatbots, and you interact with them through several input channels: Twitter, SMS, Facebook Messenger, and dozens of others. Companies like Conversable are doing a great job in this space: “I’d like a large cheese pizza delivered to my house. Use my saved payment method.”

While one is initiated by voice and the other by text, they’re just hairs apart from a technology perspective. Speech to text is nothing new; it’s been a work in progress since the earliest days of computing. Add assisted intelligence to the text output, and now we’re cooking.

Need something? Just type it, and the tech will respond.
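Here's a toy sketch of that text-to-text loop, with a hand-rolled intent table standing in for the trained language models a company like Conversable actually runs; the patterns and canned replies are invented for illustration:

```python
import re

# A toy intent table: pattern -> handler. Real chatbots use trained
# NLU models, but the request/response shape is the same.
INTENTS = [
    (re.compile(r"pizza", re.I), lambda m: "Ordering a large cheese pizza "
                                           "with your saved payment method."),
    (re.compile(r"weather", re.I), lambda m: "It's 72F and sunny. (Stubbed.)"),
]

def reply(message: str) -> str:
    for pattern, handler in INTENTS:
        match = pattern.search(message)
        if match:
            return handler(match)
    return "Sorry, I don't know how to help with that yet."

print(reply("I'd like a large cheese pizza delivered to my house."))
```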

Input: human emotion.

While voice and text are great inputs, video is an even better one. Recognizing a person has become so trivial that it can be done with a simple API call. Technology can be sure it’s “you” before you interact with it. Microsoft uses this to automatically sign you in to Xbox with a Kinect device.

More than detecting who you are, computers can also detect emotion in video. This used to require a room at a university, and was only done as a part of a research project. Today, we can accomplish this with just about any standard web cam, or a front facing camera on a smartphone.

We know it’s you, and how you’re feeling at this precise moment. “How can I help, Michael?”
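A rough sketch of what that looks like in practice: OpenCV's stock face detector finds the face, and the crop is shipped off to an emotion API. The endpoint URL here is a placeholder of my own, not a real service:

```python
# pip install opencv-python requests
import cv2
import requests

# OpenCV ships a stock Haar cascade for frontal-face detection.
cascade = cv2.CascadeClassifier(
    cv2.data.haarcascades + "haarcascade_frontalface_default.xml")

ok, frame = cv2.VideoCapture(0).read()         # grab one webcam frame
gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
faces = cascade.detectMultiScale(gray, scaleFactor=1.1, minNeighbors=5)

for (x, y, w, h) in faces:
    face_crop = frame[y:y + h, x:x + w]
    _, jpeg = cv2.imencode(".jpg", face_crop)
    # https://api.example.com/emotion is a placeholder; swap in whichever
    # face/emotion API you actually use.
    resp = requests.post("https://api.example.com/emotion",
                         files={"image": jpeg.tobytes()})
    print("Detected emotion:", resp.json().get("emotion"))
```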

Output: human voice.

Even the best synthesized voices still sound, well, electronic. Enter a new technology from Adobe called “vocal editing,” demoed as Project VoCo at Adobe MAX 2016, which uses a recorded voice to allow the “photoshopping” of that voice.

It’s early, but this tech exists, and it could be the voice-interaction component of this idea. The demo uses a recording just a few seconds long. Imagine what would be possible with dozens of hours of training recordings. The only input required after that is text, and text is the primary output of all of today’s Assisted Intelligence (AI) applications, like Watson, for example (IBM Watson, How to build a chatbot in 6 minutes). This is the next logical step:

This technology could easily be used in real time to allow “bots” to make calls to humans, and the humans would be none the wiser. They could even use your voice, if you allow it. Bots can take voice as input (voice to text) and output text, which gets sent back in the form of audio (text to voice), using this technology.

Any text input, from any source, with one remarkably consistent voice response. Maybe even your own.
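Adobe's vocal-editing tech isn't something you can pip-install, but the shape of the loop (text in, one pinned voice out) can be sketched with an off-the-shelf TTS engine like pyttsx3. The library choice is mine, and available voice IDs vary by platform:

```python
# pip install pyttsx3
import pyttsx3

engine = pyttsx3.init()

# Pin one voice so every response sounds identical, no matter which
# bot (or human) produced the text. Available voice IDs vary by OS.
voices = engine.getProperty("voices")
engine.setProperty("voice", voices[0].id)
engine.setProperty("rate", 175)  # speaking rate, words per minute

def speak(text: str) -> None:
    """The single, consistent voice for every reply in the system."""
    engine.say(text)
    engine.runAndWait()

speak("Your appointment has been confirmed for two o'clock.")
```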

Display: character-persistent, animated avatar.

When I had the original idea, the avatar the user interacted with was a simple CGI character, an obviously rendered character that would remove any interaction distraction. I wanted every touch point with the avatar to be as simple as possible, so you’d spend all of your time focused on the task, not distracted by its interface. This may still be the best option, but I see that gap closing quickly.

Here’s Faceshift at GDC 2015 (since acquired by Apple); others, like Faceware and nagapi, exist in the market. Notice two completely different actors playing the same character.

Disney Research Labs has similar technology already in use.

The movie viewer never sees the actor, only the character. With the voice tech above and a character representation, we’ve removed two persistence problems: voice and character. Anyone (or anything, including AI) can provide a consistent human voice, and anyone (any sex, build, race, location, whatever) can power the physical representation.

Every single time you interact with the tech, the avatar looks and sounds the same – no matter who the actor is on the other side of the animation.

Display: the human presence.

We’ve seen remarkable leaps forward in an age-old (in computer years) technology called motion capture. Meatspace actors wear a variety of sensors and gadgets that allow computers to record their motion for later use. This used to appear only in the best of the best games and movies. Just about everything you see in major releases today (from a CGI standpoint) is based on motion capture, if it involves humans.

Traditional motion capture was just one part of the process, though. Scenes would be shot on a green screen, then, through (sometimes) years of refinement, a final movie would grace theater screens or appear in an epic game release.

At SIGGRAPH 2016, Epic Games (makers of the Unreal Engine) featured a new process that amounts to “just-in-time acting.” Instead of capturing the actors and using that motion later in scenes, Epic used a process that rendered results in real time. It’s mind-blowing: using the camera in the game engine to record a motion-captured actor, in-game.

Display: enhanced human presence.

The problem with CGI and humans is something called the uncanny valley: “a computer-generated figure or humanoid robot bearing a near-identical resemblance to a human being arouses a sense of unease or revulsion in the person viewing it.”

Explained in Texan: “Well, it might be close, but it ain’t quite right.” It may be getting close enough.

There are several ways humans protect themselves from attack. One of the simplest is recognizing fellow humans. Sometimes they may want to harm us, other times hug us. But either way, we’re really, really good at recognizing deception.

Until now. This piece was created with Lightwave, Sculptris and Krita, and composited with Davinci Resolve Lite – in 2014 (two years ago).

In 2015, a video was released by USC ICT Graphics Laboratory showing how advanced skin rendering techniques can be deceptively good. Another video by Disney’s Research Hub shows a remarkable technology for rendering eyes. And earlier this year, Nvidia released a new demo of Ira.

Display: enhanced facial reenactment method.

An advancement I didn’t expect to see so soon takes standard digital video footage (a newscast or a YouTube video, for example) and allows an actor, using a standard webcam, to transfer expressions to the target actor. It’s a clever technology.

If the technology works as well as it appears to with simple, low resolution sources, imagine what could be done with professional actors creating 48 hours of source video. That could in turn be targeted by digital actors using a combination of the above video technologies. The interface to this technology would be a recorded human actor with transferred facial expressions from a digital actor, all rendered in real time.

Bringing it all together.

Inputs: voice, text, video, and emotion.

Processing: assisted intelligence, APIs: input to voice.

Outputs: text, human voice, and/or photo realistic CGI/animated characters/human.

But wait. There’s one more thing.

This is great for an AI-powered personal assistant. Marrying all of this tech together into one simple, cohesive interface would make everything else feel amateur.

But what if we could add an actual person (or four) to the mix? Real human beings available 24/7 (in shifts) to leave notes on your account, or to call your next appointment to let them know you’ve arrived early? What if your assistant could call the local transit agency, or cancel a flight, by voice, in whatever the local language happens to be?

All of the technologies mentioned above create an intentional gap between the inputs and outputs, allowing any number of “actors” in between. If a task is suitable for a bot to handle, then a bot should handle it and reply. If a human is required, the user should never know a human stepped in to take control of the interaction. The voice, text, and display will be 100% the same, protecting the experience.
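Here's a minimal sketch of that intentional gap, with class names invented purely for illustration: one interface in front, interchangeable bot and human handlers behind it, and a single consistent output path.

```python
from abc import ABC, abstractmethod

class Handler(ABC):
    """Anything that can take a user request and return reply text."""
    @abstractmethod
    def handle(self, request: str) -> str: ...

class BotHandler(Handler):
    def handle(self, request: str) -> str:
        return f"Done: {request}"

class HumanHandler(Handler):
    def handle(self, request: str) -> str:
        # In reality this would route to a live operator's console;
        # the reply comes back in exactly the same shape as the bot's.
        return f"Done: {request}"

def respond(request: str, needs_human: bool) -> str:
    handler: Handler = HumanHandler() if needs_human else BotHandler()
    reply = handler.handle(request)
    # The reply is rendered through the same voice and the same avatar
    # either way, so the user never learns which side of the gap answered.
    return reply

print(respond("Cancel my 6 a.m. flight.", needs_human=True))
```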

Think about it: any language in (from either side), and your specific language and video representation out. If there were a maximum of four people who knew you more intimately than your family, but you knew you’d never, ever have to think about this problem again, would you do it?

In summary, I’ve outlined a highly personalized virtual assistant, with 100% uptime and omnipresence across every device and interface you have (including VR, but let’s save that for another time).

What you won’t know is whether you’re talking to a human or a machine.

If you liked this, please share it. Your friends will dig it too. Thank you!
