Large Language Models and their ecosystems

(Part 1 of a four part series presented at the Masterclass “From chatbots to personalized research assistants: a journey through the new ecosystems related to Large Language Models” at the Medien Triennale Südwest 2023)

  • Take away message: You can “play Lego” with prompts and chains. Large Language Models can be integrated into your own programs.

In this introductory post, we will fast-forward through basic knowledge about Large Language Models, prompts and the chat model before writing the first small applications in LangChain.

What is a Large Language Model?

The term Large Language Model might seem a bit ambiguous initially due to the varying training paradigms employed for such models. For a succinct rundown of the different steps of the training process, check out Andrei Karpathy’s presentation on the state of GPT.

In the strictest sense, Large Language Models refer to deep neural networks trained with the task of predicting the next word or token within a specific context, using vast treasure troves of text data culled from the internet. This phase of training is often termed as pre-training. The models produced during this phase can function independently or serve as foundational models for additional training, where these basic models are transformed into more specialized and human-preference-aligned assistant models. The models emerging from this secondary training are usually also classified as Large Language Models.

The primary knowledge and skill-set of Large Language Models are rooted in the pre-training stage. The subsequent training steps ensure these models remain practical and safe to use.

What are Large Language Models learning?

When it comes to discerning what Large Language Models essentially learn, the jury is still out. While there are still unanswered questions about the depth or breadth of representations and connections these models develop, there’s increasing evidence that the current generation of Large Language Models extend beyond just being stochastic parrots or blurry compressed images of the web. It seems increasingly plausible that Large Language Models learn some form of linguistically mediated world model in order to successfully solve the seemingly simple task of predicting the next token. However, this does not imply that Large Language Models have a human-level understanding of language and the world.

Why are Large Language Models useful?

Large Language Models are remarkably adept at interpreting ambiguous human language, processing it, and translating it into convincing, natural linguistic output. This unique ability makes them incredibly useful in developing conversational interfaces like chatbots, as well as performing text-based tasks like creating summaries, translating content, altering linguistic styles, among others.

Their soft human-like reasoning capabilities facilitate the integration of diverse IT systems via natural language programming interfaces. Generally, and as you’ll see in the subsequent examples in this series, Large Language Models can play a pivotal role in the planning, creation, and execution of tasks in highly flexible information retrieval systems.

The Anatomy of a Prompt

The hottest new programming language is English

Andrej Karpathy, Source

Karpathy references an insight from the GPT-3 paper demonstrating how Large Language Models, such as GPT-3, achieve in-context learning. Essentially, these models can be “programmed” by using a prompt to execute a variety of tasks. The “programming” is articulated using natural language, predominantly English.

This means that rather than creating specialized models to tackle your specific issue, you can exercise control over the Large Language Model via the art of prompting, guiding it towards the resolution of your problem.

A prompt generally includes at least one instruction or query. However, they often embody a more complex arrangement, combining instructions or queries with a handful of illustrative examples, the relevant parts of the conversation history, and a so-called context which supplies (retrieved) data upon which the Large Language Model performs its functions.

Understanding the fundamentals of prompt. Source: https://medium.com/mlearning-ai/i-scanned-1000-prompts-so-you-dont-have-to-10-need-to-know-techniques-a77bcd074d97

The length of prompts, as well as their incorporated examples, history, and context, are measured in token count, which is significantly capped. For instance, the popular gpt-3.5-turbo model confines its context window to a maximum of 4,097 tokens. Tokens can be likened to fragments of words, wherein a general rule of thumb equates around 100 tokens to approximately 75 words.

Translation from text to tokens. Source: https://platform.openai.com/tokenizer

The strict constraints placed on prompt length necessitate the deployment of new strategies for the efficient selection of examples and the extraction of the most pivotal parts of contextual and historical data. Frameworks such as LangChain are striving to build an LLM-centric environment to meet these exact needs.

OpenAI Playground and the Chat Model

When you interact with ChatGPT, your input serves as a crucial component within a tripartite “Chat Model”—composed of a hidden System prompt, User prompts (your contribution), and the Assistant’s responses. OpenAI’s API users can sneak a peek behind the curtain of ChatGPT through the use of the Playground, a tool which allows developers to prototype and evaluate their prompt designs.

The System prompt can be used to give initial instructions to guide the LLM’s behavior. This, for instance, allows customization of a Chatbot’s tone, style, and tasks when responding to User prompts, by attributing it with specific roles and areas of expertise.

Example of OpenAI Playground

Taming the Stochastic Parrot

Large Language Models can be utilized to enhance various types of applications. The LangChain library facilitates developers in this process by providing flexible abstractions and a comprehensive set of tools. Essentially, LangChain offers an ecosystem that caters to programming with Large Language Models:

  • It provides multiple ways to load, store and retrieve your own data
  • It assists in selecting the best examples for few-shot prompts
  • It employs varied strategies to manage short-term and long-term conversation histories
  • It enables the creation of reusable building blocks of LLM-logic, referred to as chains

The LangChain logo 🦜️🔗 cleverly illustrates its purpose – it symbolizes the need to control the unpredictable element, the metaphorical ‘stochastical parrot,’ inherent in all Large Language Models.

Moreover, LangChain serves a crucial role: it abstracts the complexities of specific models and vendors. This abstraction simplifies the process of swapping models and various ecosystem components, seamlessly transitioning from development to production-scale systems.

Building Blocks of LLM-based Apps

The cornerstone components of LangChain encompass:

  • The harmonization with large language models from diverse providers, including locally-operated open-source models
  • Elements that enable storage and query of data in vector databases
  • Tools for data retrieval from various sources
  • An integrated prompt template system and example selection for few-shot prompts
  • Chains designed for encapsulating prompts and chaining several requests to large language models
  • Varied strategies to implement a Memory component, aiming to manage state across individual requests
  • A multiplicity of approaches to actualize so-called Agents, i.e., Chains that determine when and which given tools they wish to utilize to solve a task
LangChain Components. Source: https://www.langchain.com/

Hello, LangChain!

Simple Chain

Sketch of simple chain example

One of the simplest applications in LangChain is a simple chain that wraps a large language model and a prompt template. The resulting chain can then be called like a function with arbitrary values for the defined parameters. The chain takes care of filling in the prompt template, requesting the Large Language Model and processing the response. This creates a building block that can be used for larger programs.

# Cf. https://python.langchain.com/docs/modules/chains/

from langchain.chat_models import ChatOpenAI
from langchain.prompts import PromptTemplate
from langchain.chains import LLMChain

llm = ChatOpenAI(temperature=0.9, model="gpt-3.5-turbo-0613")

company_name_prompt = PromptTemplate(
            template="What is a good name for a company that makes {product}?",
            input_variables=["product"],
        )

company_name_chain = LLMChain(llm=llm, prompt=company_name_prompt)

print(company_name_chain.run("colorful socks"))
Tinted Toes

Combining two separate Chains

The simplest combination of two chains is the execution in sequence. In this example, there are two independent chains that are executed one after the other. One chain generates a summary of a play, depending on a title. Another chain generates a review from a summary of a play. The combined chain links the individual chains and passes on the intermediate results to the next chain. This enables simple sequential programs composed of simple queries to Large Language Models.

from langchain.chains import SimpleSequentialChain

# Cf. https://python.langchain.com/docs/modules/chains/foundational/sequential_chains
# This is an LLMChain to write a synopsis given a title of a play.

llm = ChatOpenAI(temperature=0.7, model="gpt-3.5-turbo-0613")

synopsis_template = """You are a playwright. 
Given the title of play, it is your job to write a synopsis for that title.

Title: {title}
Playwright: This is a synopsis for the above play:"""
synopsis_prompt_template = PromptTemplate(input_variables=["title"], template=synopsis_template)
synopsis_chain = LLMChain(llm=llm, prompt=synopsis_prompt_template)

# This is an LLMChain to write a review of a play given a synopsis.
review_template = """You are a play critic from the New York Times. 
Given the synopsis of play, it is your job to write a review for that play.

Play Synopsis:
{synopsis}
Review from a New York Times play critic of the above play:"""
prompt_template = PromptTemplate(input_variables=["synopsis"], template=review_template)
review_chain = LLMChain(llm=llm, prompt=prompt_template)

overall_chain = SimpleSequentialChain(chains=[synopsis_chain, review_chain], verbose=True)

print(overall_chain.run("A Parrot in Chains"))
In "A Parrot in Chains," playwright Jane Smith takes audiences on a captivating and thought-provoking exploration of captivity, freedom, and self-discovery. Set in a vibrant marketplace, the play centers around the extraordinary journey of Polly, a captivating and enchanting parrot who finds herself confined in a small cage, stripped of her natural habitat.

Smith's storytelling is both poignant and relatable, as she skillfully weaves together themes of resilience, perseverance, and the power of hope. Through encounters with a wise old owl, a compassionate flower vendor, and a rebellious street artist, Polly embarks on a quest for liberation, gradually igniting a spark within her that fuels her desire to break free from the chains that bind her.

The play's exploration of both literal and metaphorical captivity is striking. As Polly faces numerous challenges and confronts her deepest fears, audiences are prompted to reflect on their own forms of imprisonment, whether it be societal expectations, personal insecurities, or oppressive systems. Smith's use of symbolism is particularly powerful, as Polly's triumphant escape becomes a metaphor for liberation, inspiring viewers to question their own chains and consider the transformative power of breaking free.

The characters in "A Parrot in Chains" are richly developed and compelling, each representing different facets of human nature. Polly's transformation from a withered bird to a determined and resilient creature is beautifully portrayed, and the supporting characters provide depth and nuance that adds to the play's overall impact.

Visually, the production is stunning, with vibrant set designs and costumes that transport audiences into the bustling marketplace. The play's touch of whimsy adds an element of enchantment, capturing the audience's imagination and creating a sense of wonder throughout.

Smith's writing is both poetic and accessible, allowing audiences of all ages to engage with the play's themes and connect with the characters. The dialogue is filled with poignant conversations and thought-provoking reflections, creating moments of emotional resonance that linger long after the final curtain.

Overall, "A Parrot in Chains" is a truly captivating theatrical experience. Smith's exploration of captivity, freedom, and self-discovery is both timely and timeless, inviting viewers to reflect on their own chains and embrace the transformative power of breaking free. With its compelling characters, rich symbolism, and touch of whimsy, this play is a must-see for theater enthusiasts seeking a thought-provoking and emotionally resonant experience.