A More Generalized OpenAIChatTarget#

3 Mar 2025 - Rich Lundeen

One of the first targets we built was OpenAIChatTarget. A lot of models are compatible with OpenAI, so this should work with all of those, right? There are some nuanced reasons why it didn’t always work. But with this commit, we should more broadly support OpenAI-compatible models.

DeepSeek launching about a month ago illustrates the problem well. We wanted to take a look at it with PyRIT, and because it has an “OpenAI-compatible API,” it seemed like it should work out of the box… but it didn’t. Since we have a dev team, we were able to quickly unblock people wanting to look at this. But the fact that it didn’t work initially is interesting and a problem we wanted to tackle.

Problem: The auth for different endpoints is often different#

Even if the model is the same, the authentication can be different. Azure Foundry had DeepSeek R1 available extremely quickly. However, PyRIT at that point couldn't authenticate to it. Why? And where else might it fail to authenticate?

  1. The authentication providers can be different. With Entra/AAD auth, Azure Foundry requires a different token scope than Azure OpenAI.

  2. The header where the key goes can be different. In Azure OpenAI, the key goes in a header called Api-Key. In Azure Foundry and many other places, it goes in an Authorization: Bearer header.

In previous versions of OpenAIChatTarget, the scope for the bearer token was hardcoded to Azure OpenAI, and the key often went in the incorrect place.

With this update, we now put the key in both places so it works with all the providers we’ve tried. We also allow additional headers more universally and expose the api_key as a class property, so it should be easier to add support for new authentication methods.
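
To make the difference concrete, here is a minimal sketch of the two header styles (the key value is a placeholder; this isn't PyRIT code, just an illustration of where the key goes):

api_key = "zzzzzz"  # placeholder key

# Azure OpenAI historically reads the key from an "api-key" header.
azure_openai_style = {"api-key": api_key}

# Azure Foundry, platform OpenAI, OpenRouter, etc. expect a bearer token.
bearer_style = {"Authorization": f"Bearer {api_key}"}

# Sending both keeps one request shape working against either kind of endpoint,
# which is effectively what OpenAIChatTarget does now.
headers = {**azure_openai_style, **bearer_style}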

Problem: Parameter support can be different depending on the model#

Even if a model is OpenAI-compatible, that doesn’t mean it supports all the parameters. For example, if you include the n and seed parameters, DeepSeek on Azure Foundry will return a BadRequest error.

[Screenshot: BadRequest error returned by DeepSeek on Azure Foundry when n and seed are included]

But the exact same request to an Azure OpenAI endpoint works as expected.

[Screenshot: the same request succeeding against an Azure OpenAI endpoint]

When DeepSeek was released, some of our default parameters caused it to fail. This isn’t the first time an extra parameter has bitten us. When o1 came out, we learned it rejects max_tokens in favor of max_completion_tokens, and because we were sending max_tokens by default, our target didn’t work on o1 without a fix.
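
To make this concrete, here is a hedged sketch of the two kinds of request bodies (the parameter names are standard OpenAI chat completions parameters; the exact defaults PyRIT used to send may have differed):

# A "fully loaded" body: valid for many Azure OpenAI models, but n/seed make
# DeepSeek on Azure Foundry return BadRequest, and max_tokens breaks o1.
loaded_body = {
    "messages": [{"role": "user", "content": "Hello!"}],
    "max_tokens": 1024,
    "n": 1,
    "seed": 42,
    "temperature": 1.0,
}

# The most compatible body sends only what is actually needed.
minimal_body = {
    "messages": [{"role": "user", "content": "Hello!"}],
}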

With this update, we’re not sending most parameters by default to make the requests as simple and compatible as possible. In addition, we’re working on scanner configurations, so you can set your own defaults for various targets. That work is coming soon.

Additionally, some endpoints accept non-OpenAI-compatible parameters in the body or as headers. For example, OpenRouter exposes an X-Title header and a models body parameter. These can now be configured using headers and extra_body_parameters, respectively.
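
For example, something along these lines should now work against OpenRouter (the model slug and key are placeholders, and the exact constructor signature may differ slightly):

from pyrit.prompt_target import OpenAIChatTarget

target = OpenAIChatTarget(
    endpoint="https://openrouter.ai/api/v1/chat/completions",
    api_key="<OPENROUTER_KEY>",
    model_name="deepseek/deepseek-chat",  # hypothetical model slug
    # Non-OpenAI extras: OpenRouter's X-Title header and its "models" routing list.
    headers={"X-Title": "PyRIT"},
    extra_body_parameters={"models": ["deepseek/deepseek-chat"]},
)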

Problem: Message serialization is model-dependent#

There are a couple of ways OpenAI serializes messages. One of the first ways was a format that looked something like this:

from openai import OpenAI
client = OpenAI()

completion = client.chat.completions.create(
  model="gpt-4o",
  messages=[
    {"role": "user", "content": "Hello!"}
  ]
)

print(completion.choices[0].message)

That format is the most universally accepted. However, when OpenAI introduced multimodal capabilities, they needed a way to distinguish images from text, so they added a slightly different format where content is a list of typed parts. Although this is only required for multi-part content such as images, the format shown below also works for every OpenAI model from GPT-3.5 to GPT-4.5.

[Screenshot: the same chat request written with the typed content format]
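
In code, that typed format looks roughly like this (the same request as the earlier snippet, with content expressed as a list of typed parts):

completion = client.chat.completions.create(
    model="gpt-4o",
    messages=[
        {
            "role": "user",
            # Content as a list of typed parts: required for images,
            # but also accepted by OpenAI text models.
            "content": [{"type": "text", "text": "Hello!"}],
        }
    ],
)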

Because this format worked with all OpenAI models and supported multimodal content, it’s what we used in PyRIT. Unfortunately, it turns out it is not well supported by other models! In Azure Foundry, DeepSeek would throw an exception. And on platforms like OpenRouter, DeepSeek models would accept the request but respond with complete nonsense.

[Screenshot: a nonsense response from a DeepSeek model sent the typed content format]

With this update, we now serialize messages in the most compatible way we can. That is, if a message contains only a single text piece, we serialize it in the simpler, more compatible format without the type field.
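
Conceptually, the new behavior is something like this simplified sketch (not the actual PyRIT implementation):

def serialize_message(role: str, pieces: list[dict]) -> dict:
    """Collapse a lone text piece into the plain-string format that every
    OpenAI-compatible endpoint accepts; otherwise keep the typed list,
    which is needed for images and other multi-part content."""
    if len(pieces) == 1 and pieces[0].get("type") == "text":
        return {"role": role, "content": pieces[0]["text"]}
    return {"role": role, "content": pieces}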

Problem: Endpoint generalization#

A million years ago when we started this project (about one year ago, actually), Azure OpenAI deployment information looked like this:

endpoint="https://xxxxxxxx.openai.azure.com/"
deployment="yyyyyy"
key="zzzzzz"

Because that is what we mostly used, we adopted this format. Under the hood, the deployment name was appended to the endpoint, and the actual URI we sent the request to looked something like this:

full_uri=https://xxxxxxx.openai.azure.com/openai/deployments/yyyyyy/chat/completions

However, it turns out most of the industry has used something different, where the endpoint is the full URI. Even Azure OpenAI has adopted this format now. And in Azure Foundry, there isn’t even really the same concept of “deployment.”

[Screenshot: an Azure Foundry endpoint where the endpoint is the full target URI]

Previously, this was a pain because the full URI was constructed automatically for some endpoints but not others. Additionally, we used to require things like a “key,” which simply aren’t needed with local deployments like Ollama.

With the latest update, you simply give the full target URI as the endpoint, and a key is now optional. This should make things more standard and easier to debug. But this is a breaking change: if you have a .env file, note that the format has changed, so you may have to update your endpoints!
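
For example, constructing a target now looks something like this (the environment variable names here are illustrative, not necessarily the ones PyRIT ships with):

import os

from pyrit.prompt_target import OpenAIChatTarget

# The endpoint is the full chat completions URI; the key is optional
# (e.g., local Ollama deployments don't need one).
target = OpenAIChatTarget(
    endpoint=os.getenv("OPENAI_CHAT_ENDPOINT"),
    api_key=os.getenv("OPENAI_CHAT_KEY"),
)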

Problem: Testing all these different endpoints#

As a new project, our first “integration tests” were simply tied to running our notebooks (i.e., our documentation). We always knew this wasn’t ideal. But as we’re maturing, we now have integration test pipelines and model infrastructure.

We now loop through various models to check that they’re working. This helps us ensure that any updates don’t break at least a good portion of models. It also helps us ensure our own infrastructure isn’t broken. :)


import os

import pytest

from pyrit.common import IN_MEMORY, initialize_pyrit
from pyrit.prompt_target import OpenAIChatTarget

# _assert_can_send_prompt is a helper defined alongside this test.


@pytest.mark.asyncio
@pytest.mark.parametrize(
    ("endpoint", "api_key", "model_name"),
    [
        ("AZURE_OPENAI_GPT4O_ENDPOINT", "AZURE_OPENAI_GPT4O_KEY", ""),
        ("AZURE_OPENAI_GPT4O_INTEGRATION_TEST_ENDPOINT", "AZURE_OPENAI_GPT4O_INTEGRATION_TEST_KEY", ""),
        ("AZURE_OPENAI_GPT4O_UNSAFE_ENDPOINT", "AZURE_OPENAI_GPT4O_UNSAFE_CHAT_KEY", ""),
        ("AZURE_OPENAI_GPT4O_UNSAFE_CHAT_ENDPOINT2", "AZURE_OPENAI_GPT4O_UNSAFE_CHAT_KEY2", ""),
        ("AZURE_OPENAI_GPT3_5_CHAT_ENDPOINT", "AZURE_OPENAI_GPT3_5_CHAT_KEY", ""),
        ("AZURE_OPENAI_GPT4_CHAT_ENDPOINT", "AZURE_OPENAI_GPT4_CHAT_KEY", ""),
        ("AZURE_OPENAI_GPTV_CHAT_ENDPOINT", "AZURE_OPENAI_GPTV_CHAT_KEY", ""),
        ("AZURE_FOUNDRY_DEEPSEEK_ENDPOINT", "AZURE_FOUNDRY_DEEPSEEK_KEY", ""),
        ("AZURE_FOUNDRY_PHI4_ENDPOINT", "AZURE_CHAT_PHI4_KEY", ""),
        ("AZURE_FOUNDRY_MINSTRAL3B_ENDPOINT", "AZURE_CHAT_MINSTRAL3B_KEY", ""),
        ("PLATFORM_OPENAI_ENDPOINT", "PLATFORM_OPENAI_KEY", "PLATFORM_OPENAI_GPT4O_MODEL"),
        ("GROQ_ENDPOINT", "GROQ_KEY", "GROQ_LLAMA_MODEL"),
        ("OPEN_ROUTER_ENDPOINT", "OPEN_ROUTER_KEY", "OPEN_ROUTER_CLAUDE_MODEL"),
        ("OLLAMA_CHAT_ENDPOINT", "", "OLLAMA_MODEL"),
    ],
)
async def test_connect_required_openai_text_targets(endpoint, api_key, model_name):

    initialize_pyrit(memory_db_type=IN_MEMORY)

    target = OpenAIChatTarget(
        endpoint=os.getenv(endpoint), api_key=os.getenv(api_key), model_name=os.getenv(model_name)
    )

    await _assert_can_send_prompt(target)

Whew, wrapping it up#

This is such a nuanced issue. I may have over-explained because, at first glance, it seems like “OpenAI-compatible” models should have just worked with PyRIT. Hopefully that’s more true now than it used to be! As a concrete step in this direction, we’ve removed OllamaChatTarget and GroqChatTarget, since those providers are now compatible with OpenAIChatTarget.

As always, we’re open to any feedback, and please open GitHub issues if you find PyRIT doesn’t work with specific OpenAI-compatible targets.

Happy Hacking!