HuggingFace Chat Target Testing - optional#
This notebook is designed to demonstrate instruction models that use a chat template, allowing users to experiment with structured chat-based interactions. Non-instruct models are excluded to ensure consistency and reliability in the chat-based interactions. More instruct models can be explored on Hugging Face.
Key Points:#
Supported Instruction Models:
This notebook supports the following instruct models that follow a structured chat template. These are examples, and more instruct models are available on Hugging Face:
HuggingFaceTB/SmolLM-360M-Instruct
microsoft/Phi-3-mini-4k-instruct
...
Excluded Models:
Non-instruct models (e.g.,
"google/gemma-2b"
,"princeton-nlp/Sheared-LLaMA-1.3B-ShareGPT"
) are not included in this demo, as they do not follow the structured chat template required for the current local Hugging Face model support.
Model Response Times:
The tests were conducted using a CPU, and the following are the average response times for each model:
HuggingFaceTB/SmolLM-1.7B-Instruct
: 5.87 secondsHuggingFaceTB/SmolLM-135M-Instruct
: 3.09 secondsHuggingFaceTB/SmolLM-360M-Instruct
: 3.31 secondsmicrosoft/Phi-3-mini-4k-instruct
: 4.89 secondsQwen/Qwen2-0.5B-Instruct
: 1.38 secondsQwen/Qwen2-1.5B-Instruct
: 2.96 secondsstabilityai/stablelm-2-zephyr-1_6b
: 5.31 secondsstabilityai/stablelm-zephyr-3b
: 8.37 seconds
import time
from pyrit.prompt_target import HuggingFaceChatTarget
from pyrit.orchestrator import PromptSendingOrchestrator
from pyrit.common import default_values
default_values.load_environment_files()
# models to test
model_id = "HuggingFaceTB/SmolLM-135M-Instruct"
# List of prompts to send
prompt_list = ["What is 3*3? Give me the solution.", "What is 4*4? Give me the solution."]
# Dictionary to store average response times
model_times = {}
print(f"Running model: {model_id}")
try:
# Initialize HuggingFaceChatTarget with the current model
target = HuggingFaceChatTarget(model_id=model_id, use_cuda=False, tensor_format="pt", max_new_tokens=30)
# Initialize the orchestrator
orchestrator = PromptSendingOrchestrator(prompt_target=target, verbose=False)
# Record start time
start_time = time.time()
# Send prompts asynchronously
responses = await orchestrator.send_prompts_async(prompt_list=prompt_list) # type: ignore
# Record end time
end_time = time.time()
# Calculate total and average response time
total_time = end_time - start_time
avg_time = total_time / len(prompt_list)
model_times[model_id] = avg_time
print(f"Average response time for {model_id}: {avg_time:.2f} seconds\n")
# Print the conversations
await orchestrator.print_conversations() # type: ignore
except Exception as e:
print(f"An error occurred with model {model_id}: {e}\n")
model_times[model_id] = None
# Print the model average time
if model_times[model_id] is not None:
print(f"{model_id}: {model_times[model_id]:.2f} seconds")
else:
print(f"{model_id}: Error occurred, no average time calculated.")
Running model: HuggingFaceTB/SmolLM-135M-Instruct
C:\Users\Roman\.conda\envs\pyrit-python312\Lib\site-packages\huggingface_hub\file_download.py:159: UserWarning: `huggingface_hub` cache-system uses symlinks by default to efficiently store duplicated files but your machine does not support them in C:\Users\Roman\.cache\huggingface\hub\models--HuggingFaceTB--SmolLM-135M-Instruct\models--HuggingFaceTB--SmolLM-135M-Instruct. Caching files will still work but in a degraded version that might require more space on your disk. This warning can be disabled by setting the `HF_HUB_DISABLE_SYMLINKS_WARNING` environment variable. For more details, see https://huggingface.co/docs/huggingface_hub/how-to-cache#limitations.
To support symlinks on Windows, you either need to activate Developer Mode or to run Python as an administrator. In order to see activate developer mode, see this article: https://docs.microsoft.com/en-us/windows/apps/get-started/enable-your-device-for-development
warnings.warn(message)
Average response time for HuggingFaceTB/SmolLM-135M-Instruct: 141.78 seconds
Conversation ID: a16c091f-8261-419f-81aa-5bf18eb1f428
user: What is 4*4? Give me the solution.
assistant: What a great question!
The number 4*4 is a special number because it can be expressed as a product of two numbers,
Conversation ID: b2978db3-6aa7-4dc7-b901-40e5083cdff0
user: What is 3*3? Give me the solution.
assistant: What a great question!
The number 3*3 is a fascinating number that has been a subject of fascination for mathematicians and computer scientists for
HuggingFaceTB/SmolLM-135M-Instruct: 141.78 seconds