import pandas as pdpd.options.display.max_colwidth =Nonedf = pd.read_csv("bills.csv")df
title
0
Relates to wellness programs under life and accident and health insurance policies
1
Requires certain information be provided to prospective maternity patients
2
Relates to the distribution of fines from speed violation in work zones
3
Relates to emergency response plans relating to the notification of downed wires.
Set up a function that you will call for every row. We are going to build a prompt with a fill-in-the-blank for the title name, and try our best to demand only the category name in response. We’re also using temperature=0 because we do not want the model to act “creatively.”
from openai import OpenAIclient = OpenAI(api_key="ABC123")prompt_template ="""Categorize the following legislative bill as ENVIRONMENT, HEALTHCARE, IMMIGRATION, TAXES/FINES, or OTHER. Only respond with the category name.Bill title: {text}"""def llm_request(row): prompt = prompt_template.format(text=row['title']) messages = [ { "role": "system", "content": "You are a legislative assistant."}, { "role": "user", "content": prompt} ] chat_completion = client.chat.completions.create( messages=messages, model="gpt-3.5-turbo", temperature=0 )return chat_completion.choices[0].message.content
We’ll start by testing with the first row, which has the title Relates to wellness programs under life and accident and health insurance policies.
llm_request(df.iloc[0])
'HEALTHCARE'
We can also make a fake row to test things that aren’t in our data set.
llm_request({'title': 'A bill to drill for oil on every streetcorner'})
'ENVIRONMENT'
Seems reasonable! Now let’s use it for every single row.
# Add the new columndf['category'] = df.apply(llm_request, axis=1)df
title
category
0
Relates to wellness programs under life and accident and health insurance policies
HEALTHCARE
1
Requires certain information be provided to prospective maternity patients
HEALTHCARE
2
Relates to the distribution of fines from speed violation in work zones
TAXES/FINES
3
Relates to emergency response plans relating to the notification of downed wires.
ENVIRONMENT
Looks great! You might want to force a JSON response if you want anything more complex (a list of categories, for example).