AI and Privacy in 2024: The Challenges that Lie Ahead

Jorge Felix  - Cybersecurity Expert
Last updated: January 5, 2024
Read time: 13 minutes

This article highlights the ins and outs of AI and the privacy challenges you can face today.

THE TAKEAWAYS

The use of artificial intelligence is becoming more popular and widespread every day, and multiple AI tools can help you with everyday tasks. However, things are not as simple as they seem: AI systems gather your personal information, and that carries real privacy risks. That doesn’t mean you should stop using AI, but you should take precautions to use it securely. For example, only use AI tools from reputable companies, check each company’s data privacy policy, and read its terms and conditions.

AI use among the general public has increased significantly over the last year with the appearance of tools like ChatGPT, Claude, and others. While this has created a lot of excitement among businesses and consumers alike because of AI’s immense potential to transform and improve our lives, privacy concerns are also on the rise – and rightly so. As corporations and governments feed increasing amounts of personal data into evolving AI models that grow more powerful by the day, consumers are right to worry about how their data is crunched and why.

This article covers AI’s pros and cons and its privacy risks. We’ll also show you how to use AI tools while protecting your privacy.

What is Artificial Intelligence (AI)?

Artificial intelligence (AI) is a broad, state-of-the-art branch of computer science focused on designing and building machines that carry out tasks commonly associated with human intelligence. It goes well beyond writing code, requiring a multidisciplinary approach that includes machine learning and deep learning.

AI aims to create models that mimic or improve upon human activity, especially intellectual activity. It covers many things, from self-driving cars to generative AI creation tools. Artificial intelligence is becoming a staple of modern life and is drawing intense interest from the business community.

Artificial Intelligence: Advantages and disadvantages

AI can deal with amounts of data that would overwhelm and paralyze a human researcher. AI models trained with machine learning can quickly sift through that data and turn it into useful information. As we write this, the main problem with AI is how costly it is to process the immense quantities of training data needed to build a good model.

Advantages of AI

  • Good at detail-oriented jobs: AI is very effective at diagnosing some types of cancer (breast, melanoma), sometimes even outperforming human specialists.  
  • Reduced time for data-heavy tasks: Industries needing efficient analysis of large data sets to reach valuable conclusions quickly are increasingly turning to AI models. Banking, securities, pharma, and the financial sector are good examples. AI, for instance, is good at detecting fraud in loan applications.  
  • Saves labor and increases productivity: It can make all activities more productive, saving time and other resources and multiplying output.  
  • Delivers consistent results: Despite their intended similarity to human interaction, these are still machines, so they can perform consistently and constantly.  
  • Can improve customer satisfaction through personalization: AI can take personalization to an epic level, thus providing unprecedented user satisfaction.  
  • AI-powered virtual agents are always available: They need no sleep, food, breaks, vacations, or interruptions of any kind.

Disadvantages of AI

  • Very expensive.
  • Requires deep technical expertise.
  • Qualified workers to build AI tools are scarce.
  • It’s only as good as its training data.
  • It doesn’t generalize from one task to another.
  • It displaces human jobs.

Strong vs weak AI

AI comes in two flavors, and knowing the difference helps us understand it as well as possible.

  • Weak AI: Also known as narrow AI, it’s a one-trick pony: designed and trained to do one thing only. Industrial robots are the most obvious example.
  • Strong AI: Also known as Artificial General Intelligence (AGI), it describes programming that tries to replicate the cognitive abilities of the human brain. It can use techniques such as fuzzy logic to search for solutions to unfamiliar problems, applying knowledge from other domains. The idea is for this type of AI to pass the legendary Turing test.

Artificial Intelligence examples

The versatility inherent to AI technology allows it to take as many forms as the mind can imagine, from the now very popular chatbots to wearable gadgets. Let’s look at some of the more common examples of AI in use today:

1. ChatGPT

It is an artificial intelligence language model that is accessible to the public as a chatbot capable of holding “conversations” with users and delivering written text of different kinds, as well as computer code. It became widely available in November 2022, courtesy of OpenAI. It is also available as an iOS app.

Despite its popularity, some countries, like Italy, have temporarily banned its usage due to privacy concerns.


2. Google Maps

Google Maps combines your smartphone’s GPS data with user reports on events like traffic accidents or construction sites to model traffic flow in many cities and find the fastest way to navigate the streets.


3. Smart Assistants  

Examples include Alexa, Siri, and Cortana. They use Natural Language Processing (NLP) to hear user instructions and follow them within their capabilities. They can set alarms, perform searches, control a room’s lights, and answer general questions. The experience quickly improves as the assistant learns the user’s preferences.


4. Snapchat filters

ML algorithms allow Snapchat filters to tell the difference between a subject and the background in an image, track facial motion, and adjust on-screen effects to follow the user’s face accurately.


5. Self-driving cars  

Deep learning, in the form of deep neural networks, allows self-driving cars to “see” surrounding objects, identify traffic signs, and do everything else needed to move around safely.


6. Wearables  

The wearable devices that are quickly becoming ubiquitous apply deep learning models to assess each user’s health, estimating glucose levels, blood pressure, heart rate, and much more. They can also use the user’s past medical data to anticipate future health needs.


7. MuZero  

DeepMind’s MuZero is one of the current leading candidates on the road to artificial general intelligence. It has learned to play games without being given their rules, from classic Atari titles to chess, by combining relentless self-play with a learned model of each game’s dynamics.


AI and privacy: The challenges

1. Data collection without consent or regard for copyright

AI models need data for training, and they get it wherever they can find it. They scan every corner of the web for useful material, regardless of copyright and other intellectual property considerations. As a result, many AI vendors use large amounts of copyrighted material, whether artwork or text, without the knowledge or consent of the owners.

The same data is then used to train, retrain, fine-tune, and feed the AI models. Can’t you just trace the relevant material back to the owner? We hear you ask. Well, no, you can’t. Current models have grown so complicated that tracing training information back to its source is no longer possible, at least with any degree of confidence. After a certain point, even the developers can’t tell what material went into the model’s training.


2. Unauthorized incorporation of user data

As users create prompts to use AI models for whatever task they have, there is always the chance that those prompts will train the model in the future. This could be a problem if unsuspecting or sloppy users feed the system with sensitive information.

Not long ago, three Samsung employees fed sensitive corporate information to ChatGPT. It was an information leak by any meaningful standard, and that data could now become training material for the language model. While many AI vendors are trying to address this issue, there is no way to be sure they will keep private data out of future training sets.


3. Lack of regulations and safeguards

Governments and legislation are always far behind technology. It’s no surprise. Some governments are moving forward, trying to develop AI regulations and safe-use guidelines and policies. However, we are far from having any significant standard that could hold AI vendors accountable for how they create, train, and publish their models.

Several AI vendors are already feeling the heat over alleged IP transgressions, murky data collection processes, and even murkier training practices. However, as things stand right now, each vendor retains the power to decide everything about its model, from data storage to security and user rules, without external input.


4. Abuse of biometric data

Facial recognition, fingerprints, voice recognition, and other biometric elements are slowly but surely replacing traditional passwords as authentication tokens in many devices. And that’s just the beginning. Public surveillance cameras are quickly incorporating facial recognition and other biometric markers to scan individuals for swift identification.

Biometric authentication is convenient and practical. It’s also largely unregulated, and so is whatever use AI companies make of such data once they have it in their power. This data is being collected, stored, and analyzed with AI, and there’s no telling how it will be used.


5. Stealthy metadata collection

Suppose that you interact on the web with an ad, a TikTok short video, a social media post, or any other piece of content, as you probably already do on a daily basis. Each interaction creates a trail of metadata. Put that together with your search history and other information about your digital life, and AI has many new elements with which to understand you better and design a targeting strategy that will reach you sooner or later.

This kind of metadata collection is a familiar part of the digital experience, as it has been going on for years. However, AI has the potential to turbocharge it, dramatically increasing both the scale of collection and the depth of interpretation. It could enable digital corporations to target their messages at specific users precisely and effectively, while the user never has the slightest clue.
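To make the idea concrete, here is a purely illustrative sketch of the kind of metadata a single ad click might leave behind. The field names and values are invented for this example; they don’t come from any specific ad platform or tracker.

```python
# Purely illustrative: the kind of metadata one ad click might generate.
# Field names and values are invented for this example, not taken from any real platform.
ad_click_event = {
    "timestamp": "2024-01-05T14:32:07Z",
    "user_id_hash": "a3f9c1d2",            # pseudonymous, yet linkable across sites
    "page_url": "https://news.example.com/some-article",
    "ad_campaign": "sneakers-winter-sale",
    "device": {"os": "Android 14", "model": "Pixel 8", "language": "en-US"},
    "approx_location": {"city": "Austin", "country": "US"},
    "referrer": "https://socialapp.example.com/feed",
}

# Aggregated over months and merged with search history, records like this
# become the raw material for the AI-driven profiling described above.
```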

Yes, most websites have published policies acknowledging this data collection. Of course, it’s what we used to call the “fine print” in the good old days. This information is buried in an enormous amount of unreadable legalese and is only mentioned in passing, so even the very few users who take the time to read such policies can be none the wiser. And, of course, everyone using such a website accepts those terms by default.


6. Weak security features for AI models  

Some AI engines include a security baseline as a default element of their architecture. However, that is not the rule, and many others have no security protection in place at all. This means unwanted users (including criminals) can easily get at other users’ information, including personally identifiable data.


7. Long storage periods

So, how long will these AI organizations keep the data they store? Where? Why? Not many vendors are clear about these things, and most keep their records for extended periods.

Take the policy of OpenAI, of ChatGPT fame. It says that user input and output could remain stored for up to a month “to identify abuse.” And how is this abuse identified? How does the company justify examining a user’s information more closely without letting them know? We don’t know. We wonder if they do.


Privacy and the collection of AI data

Web scraping and web crawling

Web scraping and web crawling are free and unrestricted for the most part. And where they are limited somehow, the blocks are trivial to dodge. And let’s not forget the sheer size of the web. So, AI tools have a lot of latitude in acquiring training data of all kinds through these two procedures.

The content is out there, freely available to internet users worldwide. More recently, metadata collection through web scraping and crawling has taken center stage. It mostly comes from marketing and advertising datasets and websites with clear targeting procedures.
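As a rough illustration of how low the barrier is, here is a minimal scraping sketch. It assumes the widely used requests and BeautifulSoup libraries and a placeholder URL; it is not the pipeline of any particular AI vendor.

```python
# Minimal web-scraping sketch: a few lines are enough to harvest page text.
# Assumes the `requests` and `beautifulsoup4` packages; the URL is a placeholder.
import requests
from bs4 import BeautifulSoup

response = requests.get("https://example.com/articles", timeout=10)
soup = BeautifulSoup(response.text, "html.parser")

# Grab every paragraph of visible text -- no consent check, no copyright check.
paragraphs = [p.get_text(strip=True) for p in soup.find_all("p")]
print(f"Collected {len(paragraphs)} text snippets for a hypothetical training corpus.")
```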


User queries in AI models

When you issue a prompt for an AI model to perform a task, it will likely remain stored for at least a few days. It may never be used for any other purpose, but many AI tools keep this data to improve their future training.


Biometric technology

The range of hardware that can be turned into a biometric collector is practically unlimited. Every piece of surveillance equipment, from facial and fingerprint scanners down to plain microphones, can capture signals, such as a human voice, that carry a biometric signature. All of them can feed data into an AI model that identifies a person without their consent or knowledge.

Rules on using this kind of technology are moving forward only at the most local levels, at least for now. Meanwhile, the current situation allows this data to be collected freely, without anybody’s permission or awareness.


IoT sensors and devices

Internet of Things (IoT) sensors and other state-of-the-art systems collect unimaginable numbers of data points around the clock, processed in physically nearby centers so that larger and more powerful calculations can be completed. This highly specialized information from IoT gadgets is very valuable to AI systems.


APIs

APIs make it easier for users to interact with an AI model because they provide an accessible interface. The flip side is that they also make it easier for corporations to collect valuable data for AI training. The right API design and deployment can yield large amounts of high-quality information with minimal effort.
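Here is a hypothetical sketch of why API access is such a convenient collection channel: every prompt travels to the vendor’s servers as part of a normal request, where it can be logged and retained. The endpoint, key, and payload format below are placeholders, not any real vendor’s API.

```python
# Hypothetical sketch: a prompt sent through an AI vendor's API leaves your machine
# and can be logged on the vendor's side. Endpoint, key, and payload are placeholders.
import requests

API_URL = "https://api.example-ai-vendor.com/v1/chat"  # placeholder endpoint
API_KEY = "YOUR_API_KEY"                               # placeholder credential

payload = {"prompt": "Summarize our internal Q3 sales figures: ..."}  # this text is now vendor data too
response = requests.post(
    API_URL,
    json=payload,
    headers={"Authorization": f"Bearer {API_KEY}"},
    timeout=30,
)
print(response.status_code)
```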


Public records

Public records are among the most sought-after documents for AI training, and they don’t even need to be digitized. Everything there is to know about public corporations, historical events past and present, criminal records, immigration records, and whatever else is in the public domain is up for grabs as far as AI models are concerned. They don’t need to ask anybody’s permission.


User surveys and questionnaires

This method seems out of fashion. However, it remains a highly effective one; it’s tried-and-true, and AI vendors love it.

Users provide relevant information about their experience with the AI service and ways to improve it, but any question will do. The answers tell the vendor what it needs to target users more narrowly in the future.


So, what can we do to solve AI and privacy concerns?

This isn’t the end of the story. As a matter of fact, we are at the very beginning of the AI revolution. The good news is that there are things we can do. A few good practices and resources will go a long way toward letting us keep harnessing all the advantages AI brings without giving up our privacy completely. Take the following tips into account:

  • Identify an appropriate use policy for AI: Internal teams must be aware of the types of data they can use, and how, why, and when, while interacting with AI tools. This becomes even more important if an organization deals with sensitive customer data.
  • Invest in data governance and security tools: It’s all about extended detection and response (XDR). AI tools need data loss prevention, threat intelligence, and monitoring software. Specialized tools provide these services, protect your data, and ensure that your data handling complies with regulations.
  • Read the fine print: Oh yes, it’s all about the small print and those terms of use nobody reads but the paranoid. The documentation is there, even if only at an elementary level. Please read it and identify any red flags. If you have questions, ask a representative and make sure you understand everything correctly.
  • Use only non-sensitive data: Your private and sensitive data is not something you should share with AI models; this should be clear from the beginning (see the sketch after this list for one way to enforce it).
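As promised above, here is a minimal sketch of what “use only non-sensitive data” can look like in practice: masking obvious personal identifiers before a prompt ever leaves your machine. The regular expressions are illustrative only and nowhere near exhaustive; real data loss prevention tools go much further.

```python
# Illustrative sketch: strip obvious personal identifiers from a prompt before
# sending it to any AI tool. The patterns below are examples, not an exhaustive list.
import re

PATTERNS = {
    "email": re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+"),
    "phone": re.compile(r"\+?\d[\d\s().-]{7,}\d"),
    "card":  re.compile(r"\b(?:\d[ -]?){13,16}\b"),
}

def redact(prompt: str) -> str:
    """Replace anything matching a known sensitive pattern with a placeholder."""
    for label, pattern in PATTERNS.items():
        prompt = pattern.sub(f"[{label.upper()} REDACTED]", prompt)
    return prompt

print(redact("Contact Jane at jane.doe@corp.com or +1 555 867 5309 about invoice 42."))
# -> Contact Jane at [EMAIL REDACTED] or [PHONE REDACTED] about invoice 42.
```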

Conclusion

The AI models popping up all over the place give consumers and corporations new options, advantages, and automated services that were previously impossible even to imagine. Limited as these tools still are in their infancy, the tasks they can do are amazing. Yes, they can improve our lives and productivity significantly. But their power comes at the price of even greater risks on other fronts, like job destruction and, most importantly for us, privacy.

So, should I stop using AI tools? We hear you ask. Not at all. But you should learn how to use them correctly, and yes, that implies additional effort on your part. If you are aware of the profound implications AI can have for privacy, you’ll agree that the additional work is worth it.

The technology is far from mature, but it will get there, and it will be everywhere, involved in most of our activities. In this context, we will need to keep track of legal developments in the AI arena to ensure that more specific regulations give us the legal framework to stay safe and private without sacrificing AI’s undeniable benefits.

FAQs

What is the main privacy problem with AI?

The main problem is that AI can be too powerful. It could learn to infer and spread sensitive information about a person, such as their location, preferences, and habits, creating risks of identity theft and unfair surveillance beyond anything previously imaginable.

What can go wrong with generative AI and privacy?

Several things can go wrong; the main concerns are data privacy, bad decisions based on wrong information, employee misuse, ethical risks, and copyright violations.

Will AI destroy jobs?

Job destruction is the most prominent concern right now. According to an Oxford University study, 47% of US jobs could be replaced by automation in the coming two decades.

Can AI-powered surveillance be biased?

Unfair bias and discrimination are real risks if the AI systems trained for surveillance start from incomplete or biased data.


About the Author

Jorge Felix

Cybersecurity Expert

Jorge Félix (Mexico City, 1975). Theoretical physicist specializing in Cosmology and Superstring Theory. He’s been a writer on scientific and technological issues for more than 23 years and has ample experience and expertise in computer technology and a keen interest in digital security issues.

