
AI Analyst

# How to Create ChatGPT Assistants Programmatically Using the Python API

In this blog post, we will explore how to create assistants for ChatGPT programmatically using the Python API. This guide is aimed at developers looking to integrate AI-powered assistants into their applications to automate workflows, improve customer support, or even automate security analysis. Read on to understand the steps to create your own ChatGPT assistant and how srcport.com leverages this technology for automated solutions.

## What is a ChatGPT Assistant V2 Beta?

A ChatGPT Assistant is a customizable, task-oriented AI model created to assist users in a variety of ways, ranging from answering queries to performing automated actions based on specific inputs. The V2 Beta version introduces even more advanced capabilities, such as better context retention, improved accuracy in task execution, and the ability to chain complex tasks together more efficiently.

ChatGPT Assistant V2 Beta allows for fine-tuning based on your needs, letting you specify behavior, tone, and output format, among other things. This version also integrates more seamlessly with external data sources, enabling enhanced automation.

## How srcport.com Uses ChatGPT Assistants

At srcport.com, we utilize ChatGPT assistants to automate complex security analysis and create dynamic playbooks. Our assistants are designed to recommend how different automation tasks can be chained together for comprehensive security audits or custom workflows.

By leveraging these assistants, we can automate time-consuming tasks such as incident analysis, threat detection, and vulnerability scanning. Our assistants provide intelligent recommendations based on previous actions and data, making the workflow seamless and efficient. These automated systems free up time for security professionals, allowing them to focus on critical issues rather than repetitive tasks.

## Creating Your ChatGPT Assistant Programmatically Using Python

To create a ChatGPT assistant, you can use OpenAI's Python API. Before we can do that, we need to get our function definitions into the right format. At srcport.com we have a public API for subscribers which can be viewed in an OpenAPI format, and we internally have a Python script which converts the OpenAPI specification into the function format specified by ChatGPT (a rough sketch of this idea follows after the format example below). Each function you require is defined individually using the following JSON format:

```json
{
    "name": "get_weather",
    "description": "Determine weather in my location",
    "strict": true,
    "parameters": {
        "type": "object",
        "properties": {
            "location": {
                "type": "string",
                "description": "The city and state e.g. San Francisco, CA"
            },
            "unit": {
                "type": "string",
                "enum": ["c", "f"]
            }
        },
        "additionalProperties": false,
        "required": ["location", "unit"]
    }
}
```

Note the key structure and fields:

- Each function is a single JSON object;
- Each function has a `name` field;
- When `strict` is set to `true`, assistant creation will fail if the definition does not match this structure exactly; when set to `false`, the API is more forgiving and simply drops invalid fields;
- `parameters` are defined in standard JSON Schema format.
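As mentioned above, we convert our OpenAPI specification into this function format with an internal script. The snippet below is only a minimal sketch of that idea, not our production converter: it assumes a simple spec whose `requestBody` schemas are already inlined (no `$ref` resolution), and the `openapi.yaml` path is a placeholder.

```python
import yaml  # pip install pyyaml


def openapi_to_functions(spec_path):
    """Convert each operation in a simple OpenAPI spec into a
    ChatGPT-style function definition (name, description, parameters)."""
    with open(spec_path, "r") as f:
        spec = yaml.safe_load(f)

    functions = []
    for path, methods in spec.get("paths", {}).items():
        for method, op in methods.items():
            # Skip non-operation keys such as path-level "parameters"
            if method.lower() not in ("get", "post", "put", "delete", "patch"):
                continue
            # Use the operationId as the function name, falling back to the path
            name = op.get("operationId", path.strip("/").replace("/", "_"))
            body_schema = (
                op.get("requestBody", {})
                .get("content", {})
                .get("application/json", {})
                .get("schema", {})
            )
            functions.append({
                "name": name,
                "description": op.get("summary", ""),
                # Wrap the request body schema in a standard JSON Schema object
                "parameters": {
                    "type": "object",
                    "properties": {"requestBody": body_schema},
                },
            })
    return functions


# Example usage (hypothetical spec file):
# function_defs = openapi_to_functions("openapi.yaml")
```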
For srcport.com, we have a rather large JSON file containing the definitions of all our capabilities, partially detailed below. Note that we have structured the data as an array of objects, where each object defines one function.

```json
[
    {
        "type": "function",
        "function": {
            "name": "backlinks",
            "description": "Check the backlinks of a website to see who is linking to it.",
            "parameters": {
                "type": "object",
                "properties": {
                    "requestBody": {
                        "type": "object",
                        "properties": {
                            "config": {
                                "type": "object",
                                "properties": {
                                    "crawl_target": { "type": "boolean", "default": true },
                                    "delay_sec": { "type": "number", "default": 0.1 },
                                    "disable_cache": { "type": "boolean", "default": false },
                                    "prefer_https": { "type": "boolean", "default": true },
                                    "threads": { "type": "number", "default": 5 },
                                    "timeout_sec": { "type": "number", "default": 120 },
                                    "verify_https": { "type": "boolean", "default": false }
                                }
                            },
                            "options": {
                                "type": "object",
                                "properties": {
                                    "num_links": {
                                        "type": "string",
                                        "default": "100",
                                        "enum": ["10", "100", "1000", "quarter", "half", "all"]
                                    }
                                }
                            },
                            "target": { "type": "string", "default": "https://example.com" }
                        }
                    }
                }
            }
        }
    },
    {
        "type": "function",
        "function": {
            "name": "bannergrab",
            "description": "Performs a banner grab scan against various ports to identify the service running on the port.",
            "parameters": {
                "type": "object",
                "properties": {
                    "requestBody": {
                        "type": "object",
                        "properties": {
                            "config": {
                                "type": "object",
                                "properties": {
                                    "crawl_target": { "type": "boolean", "default": true },
                                    "delay_sec": { "type": "number", "default": 0.1 },
                                    "disable_cache": { "type": "boolean", "default": false },
                                    "prefer_https": { "type": "boolean", "default": true },
                                    "threads": { "type": "number", "default": 5 },
                                    "timeout_sec": { "type": "number", "default": 120 },
                                    "verify_https": { "type": "boolean", "default": false }
                                }
                            },
                            "options": {
                                "type": "object",
                                "properties": {
                                    "port": {
                                        "type": "string",
                                        "default": "common",
                                        "enum": ["all", "common", "database", "email"]
                                    }
                                }
                            },
                            "target": { "type": "string", "default": "example.com" }
                        }
                    }
                }
            }
        }
    },
    ... (continued)
```

The key points to note in this example are:

- Each object has a `type` field set to `function`;
- Each `function` object contains the `name` and other fields detailed above; it is necessary to wrap your functions in a `function` object like this when you are defining several at once with the ChatGPT API (a short sketch of this wrapping follows below).

Throughout the rest of this blog, when we refer to `functions.json`, it references the full contents of the file above.
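For completeness, here is a minimal sketch of that wrapping step, reusing the hypothetical `openapi_to_functions` helper sketched earlier; our actual tooling differs, so treat the names as illustrative.

```python
import json


def build_functions_file(function_defs, out_path="functions.json"):
    """Wrap bare function definitions in the {"type": "function", "function": {...}}
    structure expected when passing several tools to the Assistants API."""
    wrapped = [{"type": "function", "function": fn} for fn in function_defs]
    with open(out_path, "w") as f:
        json.dump(wrapped, f, indent=2)
    return wrapped


# Example usage with the converter sketched earlier (hypothetical):
# build_functions_file(openapi_to_functions("openapi.yaml"))
```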
Now, to get into the actual Python script.

### Basic Script setup

The script imports a few packages necessary for execution, all of which can be installed using `pip install <package>`. You'll also require an API key for ChatGPT, which can be obtained from OpenAI's website. We store our API key in an environment variable, but you can simply hard-code it if needed.

The script takes as parameters a `name` field, which is the name of the assistant; an `instructions` field, which is the default set of instructions provided to the agent at the start of each request (you can append to this at runtime as well); and the path of the `functions.json` file we described above. You'll also need to provide an `organization` ID, which can be found on your OpenAI account page, and make sure to set the `default_headers` field to `OpenAI-Beta` with the value `assistants=v2`, as we are using the second version of the API.

```python
import openai
import os
import argparse
import json

# Initialize OpenAI client with the API key
api_key = os.environ.get("CHAT_GPT_API_KEY")
if api_key is None or api_key == "":
    raise Exception("Please set the CHAT_GPT_API_KEY environment variable.")

# Parse command line arguments
parser = argparse.ArgumentParser()
parser.add_argument("--name", required=True, help="Name of the assistant.")
parser.add_argument("--instructions", required=False, help="The instructions for the assistant.")
parser.add_argument("--functions_json", default="/code/webserver/res/api/functions.json", help="Path to the JSON file containing function definitions.")
args = parser.parse_args()

name = args.name
instructions = args.instructions
functions = args.functions_json

# Set up OpenAI API client
client = openai.OpenAI(
    api_key = api_key,
    organization = "org-<your org ID>",
    default_headers = {"OpenAI-Beta": "assistants=v2"},
)
```

### Read the Data

The next step is simply to read the contents of your definitions, already in the format described above:

```python
# Read the functions from the JSON file
with open(functions, 'r') as f:
    functions_data = json.load(f)
```

### Create the assistant

The final step is to call the `client.beta.assistants.create` function of the OpenAI API to actually create the agent. We set the `name` and `instructions` fields and then pass the functions JSON read from the file as the `tools` parameter. The only other required field at this point is the `model` field, set to whatever your chosen model is. The `top_p`, `temperature` and `response_format` fields are optional.

```python
# Create the assistant
assistant = client.beta.assistants.create(
    name = name,
    instructions = instructions,
    tools = functions_data,
    temperature = 1.0,
    top_p = 1.0,
    model = "gpt-4o-mini",
    response_format = {
        'type': 'json_schema',
        'json_schema': {
            'name': 'tool_recommendation',
            'schema': {
                "$schema": "http://json-schema.org/draft-07/schema#",
                "title": "ToolRecommendationResponse",
                "type": "object",
                "properties": {
                    "capabilities": {
                        "type": "array",
                        "items": {
                            "type": "object",
                            "properties": {
                                "module": {
                                    "type": "string",
                                    "description": "The name of the module being recommended. This is the name field without a 'function.' prefix."
                                },
                                "justification": {
                                    "type": "string",
                                    "description": "The justification for the recommendation. This should be a human-readable string explaining why the recommendation was made."
                                },
                                "options": {
                                    "type": "object",
                                    "additionalProperties": {
                                        "type": "string"
                                    },
                                    "description": "A single key-value pair object containing the options for the recommendation. This should only include options, no 'config' fields."
                                }
                            },
                            "required": ["module", "justification", "options"]
                        }
                    }
                },
                "required": ["capabilities"]
            }
        }
    }
)
```

### Response Format

In our example above, we are using the new `json_schema` response format to get guaranteed valid JSON results from queries. This works by adding a `response_format` field to the create function and specifying an object with the `type` set to `json_schema` (not to be confused with `json_object` or the default `text`).

```python
response_format = {
    'type': 'json_schema',
    'json_schema': {
        'name': 'tool_recommendation',
        'schema': {
            "$schema": "http://json-schema.org/draft-07/schema#",
            "title": "ToolRecommendationResponse",
            "type": "object",
            "properties": {
                "capabilities": {
                    "type": "array",
                    "items": {
                        "type": "object",
                        "properties": {
                            "module": {
                                "type": "string",
                                "description": "The name of the module being recommended. This is the name field without a 'function.' prefix."
                            },
                            "justification": {
                                "type": "string",
                                "description": "The justification for the recommendation. This should be a human-readable string explaining why the recommendation was made."
                            },
                            "options": {
                                "type": "object",
                                "additionalProperties": {
                                    "type": "string"
                                },
                                "description": "A single key-value pair object containing the options for the recommendation. This should only include options, no 'config' fields."
                            }
                        },
                        "required": ["module", "justification", "options"]
                    }
                }
            },
            "required": ["capabilities"]
        }
    }
}
```

You must then provide a `json_schema` field which is a valid JSON Schema specification for the response format that you want back. In our example, we are attempting to get tool recommendations matching the structure below:

```go
type ToolResponse struct {
	Capabilities []Recommendation `json:"capabilities"`
}

type Recommendation struct {
	Module        string            `json:"module"`
	Justification string            `json:"justification"`
	Options       map[string]string `json:"options"`
}
```

When using the `json_schema` response format, only valid JSON can be returned, so you must ask for valid JSON back. You also cannot ask for a bare array to be returned, because the highest-level structure must be an object, which is why we ask for a `capabilities` field containing an array of recommendation objects instead.

In the `description` fields of the `response_format` schema you can provide instructions to the model to keep it on track. We found that it was including a prefix of `functions.<capability_label>` where we just wanted the capability label, so we include that instruction in the description.

### Finishing Up

After all that you should successfully have a new assistant. If you add the line below to print the assistant ID, you'll be able to reference your assistant in your code moving forward and ask questions of it.

```python
print(f"Successfully created assistant with id: {assistant.id}")
```
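With the assistant created, a minimal sketch of how you might ask it a question through the Assistants v2 API looks like the following: create a thread, add a user message, start a run, poll until it finishes, and read the reply. The assistant ID and the question are placeholders, and a real integration would also need to handle the `requires_action` status that occurs when the model decides to call one of your functions.

```python
import os
import time
import openai

client = openai.OpenAI(
    api_key=os.environ["CHAT_GPT_API_KEY"],
    default_headers={"OpenAI-Beta": "assistants=v2"},
)

ASSISTANT_ID = "asst_..."  # hypothetical: the ID printed when the assistant was created

# Create a thread and add a user message to it
thread = client.beta.threads.create()
client.beta.threads.messages.create(
    thread_id=thread.id,
    role="user",
    content="Which capabilities would you recommend for auditing https://example.com?",
)

# Start a run and poll until it leaves the queued/in_progress states
run = client.beta.threads.runs.create(thread_id=thread.id, assistant_id=ASSISTANT_ID)
while run.status in ("queued", "in_progress"):
    time.sleep(1)
    run = client.beta.threads.runs.retrieve(run_id=run.id, thread_id=thread.id)

# A status of "requires_action" means the model wants to call one of your functions;
# submitting tool outputs for that case is beyond the scope of this sketch.
if run.status == "completed":
    messages = client.beta.threads.messages.list(thread_id=thread.id)
    # The first message in the list is the most recent assistant reply
    print(messages.data[0].content[0].text.value)
else:
    print(f"Run ended with status: {run.status}")
```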