Using custom system prompts with LMEval
Install the TrustyAI operator, create a test namespace (the example below uses tas), and apply the LMEvalJob custom resource shown below to that namespace.
Explanation of the Custom Resource (CR)
The Custom Resource (CR) extends the LMEvalJob configuration to work with unitxt, a library for defining and evaluating text-generation tasks. It centers on custom system prompts and task recipes, so that the evaluation setup matches both the task at hand and unitxt's expected formats.
Key features of the CR:
Custom system prompts: system prompts are defined once under the custom section and referenced by name from a task recipe. This lets you set the context and expectations for the model's responses per evaluation scenario, improving the relevance of the results.
Custom task recipes: each task recipe combines a unitxt card, a template, and a system prompt. The template spells out the instruction, input, and output formats in unitxt's JSON form, so the data handed to and returned by the model is in a shape unitxt can process.
Evaluation metrics: the outputs generated under the custom prompts are scored with the metrics attached to the unitxt card, giving insight into the model's capabilities and areas for improvement.
Integration with unitxt: the CR fields map directly onto unitxt concepts (cards, templates, system prompts), so unitxt's catalog and custom artifacts can be mixed in one job.
With this CR, an LMEvalJob can run unitxt evaluations that are tailored to a specific use case rather than being limited to the built-in task definitions.
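To make the template fields concrete, here is a simplified sketch (not unitxt's actual implementation) of how the instruction, input_format, and target_prefix of the tp_0 template combine with the system prompt into a single model request. The example fields are hypothetical WNLI-style values.

```python
# Illustrative sketch only: unitxt performs this rendering internally and
# its real logic (verbalizers, demos, separators) is more involved.

def render_prompt(template: dict, system_prompt: str, example: dict) -> str:
    """Fill the template's format strings with one example's fields."""
    instruction = template["instruction"].format(**example)
    model_input = template["input_format"].format(**example)
    target_prefix = template["target_prefix"].format(**example)
    # The system prompt is prepended so it frames every request.
    return f"{system_prompt}\n{instruction}\n{model_input}\n{target_prefix}"

tp_0 = {
    "instruction": ("Given a {text_a_type} and {text_b_type} classify the "
                    "{type_of_relation} of the {text_b_type} to one of {classes}."),
    "input_format": "{text_a_type}: {text_a}\n{text_b_type}: {text_b}",
    "target_prefix": "The {type_of_relation} class is ",
}

example = {  # hypothetical WNLI-style fields
    "text_a_type": "premise",
    "text_b_type": "hypothesis",
    "type_of_relation": "entailment",
    "classes": "entailment, not entailment",
    "text_a": "The trophy does not fit in the suitcase because it is too big.",
    "text_b": "The trophy is too big.",
}

sp_0 = "Be concise. At every point give the shortest acceptable answer."
print(render_prompt(tp_0, sp_0, example))
```

The model is expected to continue from the target prefix with one of the classes, which is why the short, directive system prompt in sp_0 pairs well with this template.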
apiVersion: trustyai.opendatahub.io/v1alpha1
kind: LMEvalJob
metadata:
  name: custom-card-template
  namespace: tas
spec:
  allowOnline: true
  allowCodeExecution: true
  model: hf
  modelArgs:
    - name: pretrained
      value: google/flan-t5-base
  taskList:
    taskRecipes:
      - template:
          ref: tp_0        # references the custom template defined below
        systemPrompt:
          ref: sp_0        # references the custom system prompt defined below
        card:
          name: "cards.wnli"
    custom:
      templates:
        - name: tp_0
          value: |
            {
              "__type__": "input_output_template",
              "input_format": "{text_a_type}: {text_a}\n{text_b_type}: {text_b}",
              "output_format": "{label}",
              "target_prefix": "The {type_of_relation} class is ",
              "instruction": "Given a {text_a_type} and {text_b_type} classify the {type_of_relation} of the {text_b_type} to one of {classes}.",
              "postprocessors": [
                "processors.take_first_non_empty_line",
                "processors.lower_case_till_punc"
              ]
            }
      systemPrompts:
        - name: sp_0
          value: "Be concise. At every point give the shortest acceptable answer."
  logSamples: true
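The two postprocessors named in the template normalize the model's raw generation before it is compared against the expected label. The sketch below reimplements them for illustration; unitxt's own processors may differ in edge cases.

```python
# Illustrative reimplementations of the two postprocessors referenced in
# the tp_0 template; these are not unitxt's actual implementations.
import re

def take_first_non_empty_line(text: str) -> str:
    """Return the first line of the generation that is not blank."""
    for line in text.splitlines():
        if line.strip():
            return line.strip()
    return ""

def lower_case_till_punc(text: str) -> str:
    """Lower-case the text up to the first punctuation mark."""
    match = re.search(r"[.,!?;:]", text)
    end = match.start() if match else len(text)
    return text[:end].lower() + text[end:]

raw = "\nEntailment.\nBecause the premise implies the hypothesis."
print(lower_case_till_punc(take_first_non_empty_line(raw)))
```

Chaining them turns a verbose, capitalized generation into a short lower-case answer, which is what makes exact-match scoring against the {label} field workable.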