
Lab 3.3 Scenario 3: Evaluate your models using Prompt Flow to keep optimizing

Overview

In this lab, you will explore the AI-powered evaluation capabilities in Azure AI Foundry and perform A/B testing on your LLM nodes to evaluate the performance of the prompt and the LLM. You will learn how to create variants, which let you test the model's behavior under different conditions such as different wording, formatting, context, temperature, or top-k, compare the results, and find the prompt and configuration that best maximize the model's accuracy, diversity, or coherence.
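The lab itself defines variants on an LLM node through the Prompt Flow UI, but the underlying idea of variant A/B testing can be sketched in plain Python. The snippet below is a minimal, illustrative sketch rather than the lab's actual flow: it assumes the `openai` Python package, an Azure OpenAI deployment named `gpt-4o`, endpoint and key in environment variables, and a couple of hypothetical test questions. It runs two variants that differ in system prompt and temperature and prints the outputs side by side so they can be compared or scored later with the built-in evaluators in Azure AI Foundry.

```python
# Minimal sketch of variant A/B testing outside the Prompt Flow UI.
# Assumptions (not from the lab): AzureOpenAI client from the `openai` package,
# AZURE_OPENAI_ENDPOINT / AZURE_OPENAI_API_KEY environment variables, a
# deployment named "gpt-4o", and a small hypothetical test set.
import os
from openai import AzureOpenAI

client = AzureOpenAI(
    azure_endpoint=os.environ["AZURE_OPENAI_ENDPOINT"],
    api_key=os.environ["AZURE_OPENAI_API_KEY"],
    api_version="2024-06-01",
)

# Two prompt/parameter variants, analogous to variant_0 / variant_1 on an LLM node.
VARIANTS = {
    "variant_0": {"system": "Answer concisely in one sentence.", "temperature": 0.2},
    "variant_1": {"system": "Answer in detail with step-by-step reasoning.", "temperature": 0.8},
}

# Hypothetical test questions; a real run would load a JSONL test set.
QUESTIONS = [
    "What is Prompt Flow used for?",
    "Why would you A/B test two prompt variants?",
]

def run_variant(name: str, question: str) -> str:
    """Call the model with the given variant's system prompt and temperature."""
    cfg = VARIANTS[name]
    resp = client.chat.completions.create(
        model="gpt-4o",  # assumed deployment name
        temperature=cfg["temperature"],
        messages=[
            {"role": "system", "content": cfg["system"]},
            {"role": "user", "content": question},
        ],
    )
    return resp.choices[0].message.content

# Print the variants' answers side by side for manual comparison or later scoring.
for q in QUESTIONS:
    print(f"\nQ: {q}")
    for name in VARIANTS:
        print(f"  [{name}] {run_variant(name, q)}")
```

In the lab, the same comparison is done by adding variants to an LLM node in Prompt Flow and reviewing the evaluation metrics for each variant in Azure AI Foundry.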

(Figure: LLMOps)

🥇Other Resources

Here are reference architectures, best practices, and guidance on this topic. Please refer to the resources below.

  • https://learn.microsoft.com/en-us/azure/ai-studio/concepts/evaluation-approach-gen-ai
  • https://github.com/Azure-Samples/llm-evaluation


Distributed under the MIT license. This hands-on lab was developed by Microsoft AI GBB (Global Black Belt).