Published in: Generative AI

Understand and Evaluate Generative AI

Author Deepinder

Published on: September 4, 2024

Generative AI is a technology that creates new content by learning from existing content. Its output ranges from text and images to animated videos and 3D models.

Its rise in recent months has been extraordinary, primarily because people can prompt these models in natural language, which has multiplied the number of use cases. Across industries, AI generators now serve as a ‘personal assistant’ for writing, research, coding, and more.

How does it work?

Surprisingly, the technology is not that new. In 2014, generative adversarial networks (GANs), a type of machine learning algorithm, were introduced, and they made it possible for generative AI to create convincing images, video, and audio of real people. Traditional AI models, known as discriminative models, could only classify or predict outcomes. For instance, a discriminative model could differentiate between images of bread and images of sandwiches, whereas a generative model can produce new data, such as new images of sandwiches, after training on a huge dataset of sandwich images.

Today’s generative AI models are the “unsung” heroes of AI. GPT-3 (since succeeded by GPT-4), BERT, and numerous others are examples of such models in action.

In simple terms, GenAI ‘learns’ from existing content in a process called training, which produces a statistical model. When given a ‘prompt’ by the user, the GenAI uses this model to predict what the response should be, which leads to the creation of new content. What exactly is a prompt here? A prompt is natural language text that asks the generative AI to perform a specific task. The quality of the input determines the quality of the output, so when asking a GenAI model to perform a task, provide as much context as possible.
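To make the idea of "train a statistical model, then predict from a prompt" concrete, here is a minimal sketch using a bigram word model. It is a deliberately tiny stand-in and not how production LLMs are built; the corpus and the function names `train_bigram_model` and `generate` are invented for illustration.

```python
from collections import Counter, defaultdict


def train_bigram_model(corpus):
    """'Training': count which word follows which in the existing content."""
    counts = defaultdict(Counter)
    for sentence in corpus:
        words = sentence.lower().split()
        for current_word, next_word in zip(words, words[1:]):
            counts[current_word][next_word] += 1
    return counts


def generate(model, prompt, max_new_words=5):
    """Extend the prompt by repeatedly picking the most likely next word."""
    words = prompt.lower().split()
    for _ in range(max_new_words):
        followers = model.get(words[-1])
        if not followers:  # the model never saw anything follow this word
            break
        words.append(followers.most_common(1)[0][0])
    return " ".join(words)


# Hypothetical toy corpus, invented for illustration.
corpus = [
    "the cat sat on the mat",
    "the cat ate the fish",
    "the dog sat on the rug",
]
model = train_bigram_model(corpus)
print(generate(model, "the dog", max_new_words=3))
```

The model only knows word pairs it has seen, which is why a richer prompt (more context) gives a real model more to work with.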

The basic step-by-step process for building such a model is:

Define the objective: specify the kind of content that the model is expected to generate.
Collect data: choose a dataset that is aligned with the objective.
Choose the right model architecture, such as GANs or transformers.
Train the model, refining its parameters to reduce the difference between the generated output and the desired result.
Evaluate and optimize by adjusting the model’s architecture, training parameters, or dataset.
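The training step, refining parameters to shrink the gap between the model's output and the desired result, can be sketched with plain gradient descent. This example fits a simple straight line rather than a generative model, and the data, learning rate, and function names are assumptions chosen for illustration.

```python
def mean_squared_error(weight, bias, inputs, targets):
    """Average squared difference between the model's output and the target."""
    return sum(((weight * x + bias) - y) ** 2 for x, y in zip(inputs, targets)) / len(inputs)


def train(inputs, targets, learning_rate=0.05, epochs=1000):
    """Fit y = weight * x + bias by gradient descent on mean squared error."""
    weight, bias = 0.0, 0.0
    n = len(inputs)
    for _ in range(epochs):
        # Difference between the generated output and the desired result.
        errors = [(weight * x + bias) - y for x, y in zip(inputs, targets)]
        # Gradients of the mean squared error with respect to each parameter.
        grad_weight = 2 * sum(e * x for e, x in zip(errors, inputs)) / n
        grad_bias = 2 * sum(errors) / n
        # Refine the parameters in the direction that reduces the error.
        weight -= learning_rate * grad_weight
        bias -= learning_rate * grad_bias
    return weight, bias


# Hypothetical training data generated from y = 2x + 1.
inputs = [0.0, 1.0, 2.0, 3.0]
targets = [1.0, 3.0, 5.0, 7.0]

loss_before = mean_squared_error(0.0, 0.0, inputs, targets)
weight, bias = train(inputs, targets)
loss_after = mean_squared_error(weight, bias, inputs, targets)
print(round(weight, 3), round(bias, 3), loss_before > loss_after)
```

Real generative models follow the same loop in spirit, with millions or billions of parameters instead of two.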

Understanding Generative AI models

Different types of GenAI models are trained a little differently. Let’s look at how the most common ones are trained:

Generative Adversarial Networks (GANs)

GANs consist of two neural networks: a generator and a discriminator. The generator’s job is to create new data that resembles the existing data points, whereas the discriminator tries to distinguish between original and fake data. Over time, the generator learns to create more realistic data that can fool the discriminator, which results in better-quality outputs. Example: StyleGAN, a GAN from NVIDIA, generates photorealistic images of human faces that do not belong to any real person.
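The tug-of-war between the two networks shows up in their loss functions. The sketch below computes both losses from hypothetical discriminator outputs, assuming the commonly used binary cross-entropy formulation (with the "non-saturating" generator loss); the networks themselves are omitted and all scores are invented.

```python
import math


def binary_cross_entropy(predictions, labels):
    """Average binary cross-entropy between predicted probabilities and 0/1 labels."""
    eps = 1e-12  # keeps log() away from zero
    total = 0.0
    for p, y in zip(predictions, labels):
        p = min(max(p, eps), 1 - eps)
        total += -(y * math.log(p) + (1 - y) * math.log(1 - p))
    return total / len(predictions)


def discriminator_loss(scores_on_real, scores_on_fake):
    """The discriminator wants real data scored as 1 and generated data as 0."""
    real_loss = binary_cross_entropy(scores_on_real, [1] * len(scores_on_real))
    fake_loss = binary_cross_entropy(scores_on_fake, [0] * len(scores_on_fake))
    return real_loss + fake_loss


def generator_loss(scores_on_fake):
    """The generator wants its fakes scored as 1, i.e. to fool the discriminator."""
    return binary_cross_entropy(scores_on_fake, [1] * len(scores_on_fake))


# Hypothetical discriminator outputs (probability that a sample is real).
real_scores = [0.9, 0.8, 0.95]
early_fakes = [0.1, 0.2, 0.1]   # early in training, fakes are easy to spot
later_fakes = [0.6, 0.7, 0.8]   # later, fakes fool the discriminator more often

print(generator_loss(early_fakes) > generator_loss(later_fakes))
print(discriminator_loss(real_scores, early_fakes) < discriminator_loss(real_scores, later_fakes))
```

As the fakes improve, the generator's loss falls while the discriminator's loss rises, which is the adversarial dynamic described above.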

Transformers

The original transformer architecture consists of an encoder and a decoder. The encoder converts input text into an intermediate representation, which is passed to the decoder and converted into useful output text.

Transformers, like those used in large language models (LLMs), have revolutionized natural language processing (NLP). Models like BERT and GPT are based on transformers and are capable of tasks such as text classification, translation, and text generation. LLaMA, from Meta, has been trained on various data sources, including social media posts, web pages, and news articles, to support Meta applications such as content moderation, search, and personalization.

Diffusion models

Diffusion models learn the probability distribution of data by looking at how it diffuses throughout a system. These models gradually destroy training data by adding noise and then learn to recover the data by reversing this noising process. Example: Stable Diffusion creates photorealistic images, videos, and animations from text and image prompts.
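The "destroy by adding noise" half of the process is simple enough to sketch directly. The example below assumes a linear noise schedule of the kind used in the DDPM family of diffusion models; the schedule values, the four-number "image", and the function names are illustrative assumptions, and the learned reverse (denoising) network is not shown.

```python
import math
import random


def make_alpha_bars(num_steps, beta_start=1e-4, beta_end=0.02):
    """Cumulative products of (1 - beta_t) for a linear noise schedule."""
    alpha_bars = []
    running_product = 1.0
    for t in range(num_steps):
        beta = beta_start + (beta_end - beta_start) * t / (num_steps - 1)
        running_product *= 1.0 - beta
        alpha_bars.append(running_product)
    return alpha_bars


def add_noise(data, t, alpha_bars, rng):
    """Forward (noising) process: blend the data with Gaussian noise at step t."""
    signal_scale = math.sqrt(alpha_bars[t])
    noise_scale = math.sqrt(1.0 - alpha_bars[t])
    noise = [rng.gauss(0.0, 1.0) for _ in data]
    noisy = [signal_scale * x + noise_scale * e for x, e in zip(data, noise)]
    return noisy, noise  # a trained model would learn to predict `noise`


rng = random.Random(0)
alpha_bars = make_alpha_bars(num_steps=1000)
clean = [1.0, -1.0, 0.5, 0.0]  # hypothetical stand-in for pixel values

slightly_noisy, _ = add_noise(clean, t=10, alpha_bars=alpha_bars, rng=rng)
mostly_noise, _ = add_noise(clean, t=999, alpha_bars=alpha_bars, rng=rng)
print(alpha_bars[10] > 0.9, alpha_bars[999] < 0.01)
```

Early steps keep almost all of the original signal, while the last step is essentially pure noise. Training teaches the model to undo these steps one at a time.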

How to evaluate generative AI models

Selecting the right model for a particular task is crucial since different tasks have their own specific needs and goals. For example, one model might be great at producing high-quality images, while another excels at generating coherent text.

What factors would you judge them against? Accuracy of the answer? Simplicity of the explanation? Creativity of the response? Whether the tone matches the audience?

As you can see, there are many factors, considered individually or together, that are vital in determining the most suitable model for a given task. Evaluation not only helps in choosing the right model but also helps you identify areas that require improvement. As a result, you can refine the model and increase the likelihood of achieving the desired results, ultimately enhancing the overall success of the AI system.
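One simple way to weigh several factors together is a weighted score per model. The sketch below is only one possible approach; the model names, ratings, and weights are hypothetical, and in practice the ratings might come from human reviewers or automated metrics.

```python
def weighted_score(scores, weights):
    """Combine per-criterion scores (0 to 1) into one weighted average."""
    total_weight = sum(weights.values())
    return sum(scores[criterion] * weight for criterion, weight in weights.items()) / total_weight


def weakest_criterion(scores):
    """Return the criterion with the lowest score, i.e. where to improve first."""
    return min(scores, key=scores.get)


def rank_models(model_scores, weights):
    """Order model names from best to worst weighted score."""
    return sorted(model_scores, key=lambda name: weighted_score(model_scores[name], weights), reverse=True)


# Hypothetical ratings on a 0-1 scale.
model_scores = {
    "model_a": {"accuracy": 0.9, "simplicity": 0.6, "creativity": 0.5, "tone": 0.7},
    "model_b": {"accuracy": 0.7, "simplicity": 0.8, "creativity": 0.9, "tone": 0.8},
}
# Hypothetical weights for two different tasks.
factual_task = {"accuracy": 0.6, "simplicity": 0.1, "creativity": 0.1, "tone": 0.2}
creative_task = {"accuracy": 0.2, "simplicity": 0.1, "creativity": 0.5, "tone": 0.2}

print(rank_models(model_scores, factual_task))
print(rank_models(model_scores, creative_task))
print(weakest_criterion(model_scores["model_a"]))
```

The same two models rank differently depending on the task's weights, and the weakest criterion points at what to improve next.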

Concerns and Future State

There is no doubt that we have never seen anything produce content as fast as generative AI, but critics are concerned about the potential unreliability of, and mistakes in, the generated output.

Generative AI models require extensive training data. For example, the original GPT-3 model was trained on about 570 GB of text data. In some instances, portions of these training datasets were taken directly from the internet and contain copyright-protected text and images, leading to potential copyright infringement issues.

Copyright owners have filed many lawsuits against AI developers, but simple answers to these complex legal questions are unlikely to emerge soon. Moreover, there is no easy way to assess how much a single work contributed to training an AI model, or to compensate its owner accordingly.

The ability to determine whether an image was generated using AI is also essential. Human society is built on trust among citizens and trust in information. If we cannot easily determine whether an image is AI-generated, trust in any kind of information is lost. Here, we need to pay special attention to vulnerable populations that may be particularly susceptible to adversarial uses of this technology. The progress in a machine’s capability to generate content is very exciting, but we need to be attentive to the ways in which these capabilities will disrupt our everyday lives, our communities, and our role as world citizens.
