parent 42debdeab0
commit 591cac8bfa
12 changed files with 55 additions and 81 deletions
@@ -1 +1,5 @@

# Citing Sources

For the most part, LLMs cannot accurately cite sources. This is because they do not have access to the Internet and do not remember exactly where their information came from. They frequently generate sources that look plausible but are entirely inaccurate.

Strategies like search-augmented LLMs (LLMs that can search the Internet and other sources) can often fix this problem.
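
As an illustration, here is a minimal sketch of search-augmented prompting: retrieved snippets are placed in the prompt and the model is asked to cite only from them. The `search` and `complete` functions are hypothetical placeholders, not any specific library's API.

```python
# Minimal sketch of search-augmented citation (hypothetical helpers).

def search(query: str, k: int = 3) -> list[dict]:
    """Placeholder for a real search backend; returns snippets with URLs."""
    raise NotImplementedError("plug in your search engine or vector store here")

def complete(prompt: str) -> str:
    """Placeholder for a call to your LLM provider."""
    raise NotImplementedError("plug in your LLM API call here")

def answer_with_citations(question: str) -> str:
    snippets = search(question)
    sources = "\n".join(
        f"[{i + 1}] {s['url']}\n{s['text']}" for i, s in enumerate(snippets)
    )
    prompt = (
        "Answer the question using ONLY the numbered sources below. "
        "Cite sources inline like [1]. If the sources are insufficient, say so.\n\n"
        f"Sources:\n{sources}\n\nQuestion: {question}\nAnswer:"
    )
    return complete(prompt)
```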
@@ -1,2 +1,4 @@

# Bias

LLMs are often biased towards generating stereotypical responses. Even with safeguards in place, they will sometimes say sexist/racist/homophobic things. Be careful when using LLMs in consumer-facing applications, and also be careful when using them in research (they can generate biased results).

@@ -1,35 +1,3 @@

# Math

When working with language models, it's essential to understand their challenges and limitations when incorporating mathematics. In this section, we'll discuss some common math-related pitfalls in the context of prompt engineering and provide suggestions for addressing them.

LLMs are often bad at math. They have difficulty solving even simple math problems, and they are often unable to solve more complex ones.

## Numerical Reasoning Limitations

Language models like GPT-3 have limitations when it comes to numerical reasoning, especially with large numbers or complex calculations. They might not always provide accurate answers or interpret the numerical context correctly.

**Recommendation**: For tasks that require precise numerical answers or involve complex calculations, consider using specialized math software or verifying the model's output using other means.
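
For instance, here is a minimal sketch of that kind of verification, assuming a hypothetical `complete` helper that calls your LLM: the model's arithmetic is checked against Python's own result.

```python
import re

def complete(prompt: str) -> str:
    """Placeholder for your LLM API call."""
    raise NotImplementedError

def checked_product(a: int, b: int) -> int:
    # Ask the model for the product, but trust Python for the ground truth.
    reply = complete(f"What is {a} * {b}? Reply with only the number.")
    match = re.search(r"-?\d+", reply.replace(",", ""))
    model_answer = int(match.group()) if match else None
    expected = a * b
    if model_answer != expected:
        print(f"Model said {model_answer}, correct answer is {expected}")
    return expected  # use the verified value downstream
```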

## Ambiguous Math Questions

Ambiguous or ill-defined math questions are likely to receive incorrect or nonsensical answers. Vague inputs make it challenging for the model to understand the context and provide sensible responses.

**Recommendation**: Try to make math questions as clear and specific as possible. Provide sufficient context and use precise language to minimize ambiguities.

## Units and Conversion

Language models might not automatically take units into account or perform the necessary unit conversions when working with mathematical problems, which could result in incorrect answers.

**Recommendation**: Explicitly mention the desired units and, when needed, ask the model to perform unit conversions to ensure the output aligns with the expected format or measure.
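
As a sketch (the example distance and prompt wording are purely illustrative), you can state the units explicitly in the prompt and double-check the conversion yourself:

```python
# Spell out units in the prompt and verify the conversion locally.
KM_PER_MILE = 1.609344  # exact definition of the international mile

distance_miles = 26.2  # illustrative value

prompt = (
    f"A runner covers {distance_miles} miles. "
    "How far is that in kilometers? "
    "Give the answer in kilometers, rounded to two decimal places."
)

# Independent check, so the model's answer can be validated.
expected_km = round(distance_miles * KM_PER_MILE, 2)
print(prompt)
print(f"Expected answer: {expected_km} km")
```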

## Incorrect Interpretation of Notation

Mathematics often uses specialized notation or symbols that the language model might misinterpret. The risk of misunderstanding increases especially when the input uses symbols or notation that differ from standard plain text.

**Recommendation**: Make sure to use clear and common notation when presenting math problems to the model. If necessary, explain the notation or provide alternative representations to minimize confusion.

## Building on Incorrect Responses

If a sequence of math problems depends on previous answers, the model might not correct its course after providing an incorrect response. This could cascade and result in multiple subsequent errors.

**Recommendation**: Be cautious when using the model's output as the basis for subsequent calculations or questions. Verify the correctness of the intermediate steps before proceeding.
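
One way to do this, sketched below with a hypothetical `complete` helper and an invented two-step pricing task: each intermediate result is verified before it is fed into the next prompt.

```python
def complete(prompt: str) -> str:
    """Placeholder for your LLM API call."""
    raise NotImplementedError

def solve_in_steps(prices: list[float], tax_rate: float) -> float:
    # Step 1: ask the model for the subtotal, but verify it before moving on.
    subtotal_reply = complete(f"Add these prices and reply with only the number: {prices}")
    subtotal = float(subtotal_reply)
    if abs(subtotal - sum(prices)) > 1e-6:
        subtotal = sum(prices)  # fall back to the verified value

    # Step 2: only now build on the (verified) intermediate result.
    total_reply = complete(
        f"A subtotal of {subtotal:.2f} is taxed at {tax_rate:.0%}. "
        "What is the final total? Reply with only the number."
    )
    return float(total_reply)
```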

By being aware of these math-related pitfalls and applying the recommendations, you can improve the effectiveness and accuracy of your prompts when engaging language models with mathematical tasks.

@@ -1,31 +1,27 @@

# Pitfalls of LLMs

LLMs are extremely powerful, but they are by no means perfect. There are many pitfalls that you should be aware of when using them.

In this section, we'll discuss some of the common pitfalls you might encounter when working with LLMs, particularly in the context of prompt engineering. By understanding these pitfalls, you can develop prompts more effectively and avoid potential issues that may affect the performance and utility of your model.

### Model Guessing Your Intentions

Sometimes, LLMs might not fully comprehend the intent of your prompt and may generate generic or safe responses. To mitigate this, make your prompts more explicit or ask the model to think step by step before providing a final answer.
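
For example (the wording and the contract scenario below are only an illustrative template, not a prescribed one), an explicit prompt with a step-by-step instruction might look like this:

```python
# A vague prompt vs. a more explicit, step-by-step version of the same request.
vague_prompt = "Tell me about this contract clause."

explicit_prompt = (
    "You are reviewing a software licensing contract.\n"
    "Task: identify risks in the clause below for the buyer.\n"
    "Think step by step: first summarize the clause in one sentence, "
    "then list up to three concrete risks, then give a final recommendation.\n\n"
    "Clause: 'The vendor may modify service terms at any time without notice.'"
)

print(explicit_prompt)
```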

### Sensitivity to Prompt Phrasing

LLMs can be sensitive to the phrasing of your prompts, which might result in completely different or inconsistent responses. Ensure that your prompts are well-phrased and clear to minimize confusion.
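
A quick way to check for this, sketched with a hypothetical `complete` helper and invented paraphrases: run several rewordings of the same question and see whether the answers agree.

```python
def complete(prompt: str) -> str:
    """Placeholder for your LLM API call."""
    raise NotImplementedError

paraphrases = [
    "Is a tomato a fruit or a vegetable?",
    "Botanically speaking, is the tomato classified as a fruit?",
    "Would you call a tomato a vegetable?",
]

# Collect one answer per phrasing and flag disagreement.
answers = [complete(p).strip().lower() for p in paraphrases]
if len(set(answers)) > 1:
    print("Inconsistent answers across phrasings:", answers)
```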

### Model Generating Plausible but Incorrect Answers

In some cases, LLMs might generate answers that sound plausible but are actually incorrect. One way to deal with this is by adding a step for the model to verify the accuracy of its response or by prompting the model to provide evidence or a source for the given information.
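
One hedged sketch of such a verification step, again assuming a hypothetical `complete` helper: ask for a draft answer, then ask the model to check that answer in a second pass.

```python
def complete(prompt: str) -> str:
    """Placeholder for your LLM API call."""
    raise NotImplementedError

question = "In which year was the first transatlantic telegraph cable completed?"

# Pass 1: get a draft answer.
draft = complete(question)

# Pass 2: ask the model to double-check the draft and explain its reasoning.
review = complete(
    f"Question: {question}\n"
    f"Proposed answer: {draft}\n"
    "Verify whether the proposed answer is correct. Explain your reasoning, "
    "and finish with 'VERDICT: correct' or 'VERDICT: incorrect'."
)
print(review)
```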

### Verbose or Overly Technical Responses

LLMs, especially larger ones, may generate responses that are unnecessarily verbose or overly technical. To avoid this, explicitly guide the model by making your prompt more specific, asking for a simpler response, or requesting a particular format.

### LLMs Not Asking for Clarification

When faced with an ambiguous prompt, LLMs might try to answer it without asking for clarification. To encourage the model to seek clarification, you can prepend your prompt with "If the question is unclear, please ask for clarification."

### Model Failure to Perform Multi-part Tasks

Sometimes, LLMs might not complete all parts of a multi-part task or might only focus on one aspect of it. To avoid this, consider breaking the task into smaller, more manageable sub-tasks or ensure that each part of the task is clearly identified in the prompt.
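
A minimal sketch of splitting a multi-part request into separate prompts (the `complete` helper and the sub-task wording are hypothetical):

```python
def complete(prompt: str) -> str:
    """Placeholder for your LLM API call."""
    raise NotImplementedError

article = "..."  # the text to process

# Instead of one prompt asking for a summary, a title, and keywords at once,
# run one clearly scoped prompt per sub-task.
subtasks = [
    "Summarize the following article in two sentences:\n",
    "Suggest a short, descriptive title for the following article:\n",
    "List five keywords for the following article, comma-separated:\n",
]

results = [complete(task + article) for task in subtasks]
for task, result in zip(subtasks, results):
    print(task.splitlines()[0], "->", result)
```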

By being mindful of these pitfalls and implementing the suggested solutions, you can create more effective prompts and optimize the performance of your LLM.

@@ -1 +1,18 @@

# Math

As a prompt engineer, you can take the following steps to improve the reliability of LLMs for mathematical tasks (a sketch combining several of these strategies follows the list):

- Clear and specific prompts: Craft clear and specific prompts that provide the necessary context for the mathematical task. Specify the problem type, expected input format, and desired output format. Avoid ambiguous or vague instructions that can confuse the LLM.
- Formatting cues: Include formatting cues in the prompts to guide the LLM on how to interpret and generate mathematical expressions. For example, use LaTeX formatting or explicit notation for mathematical symbols, equations, or variables.
- Example-based prompts: Provide example-based prompts that demonstrate the desired input-output behavior. Show the model correct solutions for different problem types to help it understand the expected patterns and formats.
- Step-by-step instructions: Break down complex mathematical problems into step-by-step instructions. Provide explicit instructions on how the model should approach the problem, such as defining variables, applying specific rules or formulas, or following a particular sequence of operations.
- Error handling: Anticipate potential errors or misconceptions the LLM might make, and explicitly instruct it on how to handle those cases. Provide guidance on common mistakes and offer corrective feedback to help the model learn from its errors.
- Feedback loop: Continuously evaluate the model's responses and iterate on the prompts based on user feedback. Identify areas where the LLM is consistently making errors or struggling, and modify the prompts to address those specific challenges.
- Context injection: Inject additional context into the prompt to help the model better understand the problem. This can include relevant background information, specific problem constraints, or hints to guide the LLM towards the correct solution.
- Progressive disclosure: Gradually reveal information or subtasks to the LLM, rather than providing the entire problem at once. This can help the model focus on smaller subproblems and reduce the cognitive load, leading to more reliable outputs.
- Sanity checks: Include sanity checks in the prompt to verify the reasonableness of the model's output. For example, you can ask the model to show intermediate steps or validate the solution against known mathematical properties.
- Fine-tuning and experimentation: Fine-tune the LLM on a dataset that specifically focuses on mathematical tasks. Experiment with different prompt engineering techniques and evaluate the impact on the model's reliability. Iterate on the fine-tuning process based on the results obtained.
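
As announced above, here is a hedged sketch that combines a few of these strategies (formatting cues, step-by-step instructions, and a built-in sanity check) in a single prompt; the exact wording and the example equation are illustrative, not canonical:

```python
# One prompt combining several of the strategies above:
# explicit LaTeX-style notation, step-by-step instructions, output format,
# and a sanity check.
problem_latex = r"3x + 7 = 22"

prompt = (
    "You are solving a linear equation.\n"
    f"Problem (LaTeX notation): solve for $x$ in ${problem_latex}$\n\n"
    "Instructions:\n"
    "1. Isolate $x$ step by step, showing each algebraic operation.\n"
    "2. State the final answer on its own line as 'x = <value>'.\n"
    "3. Sanity check: substitute your value back into the equation and "
    "confirm that both sides are equal.\n"
)
print(prompt)
```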

By applying these prompt engineering strategies, you can guide the LLM towards more reliable and accurate responses for mathematical tasks, improving the overall usability and trustworthiness of the model.

Learn more at [learnprompting.org](https://learnprompting.org/docs/reliability/intro)

@@ -1 +1,9 @@

# Improving Reliability

To a certain extent, most of the techniques covered previously have to do with improving completion accuracy, and thus reliability; self-consistency in particular does. Beyond these basic prompting strategies, however, there are a number of other techniques that can be used to improve reliability.

LLMs have been found to be more reliable than we might expect at interpreting what a prompt is trying to say when responding to misspelled, badly phrased, or even actively misleading prompts. Despite this ability, they still exhibit various problems, including hallucinations, flawed explanations with chain-of-thought (CoT) methods, and multiple biases, including majority label bias, recency bias, and common token bias. Additionally, zero-shot CoT can be particularly biased when dealing with sensitive topics.

Common solutions to some of these problems include calibrators to remove a priori biases and verifiers to score completions, as well as promoting diversity in completions.
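
For instance, here is a minimal sketch of promoting diversity in completions and then aggregating them by majority vote (self-consistency); the `complete` helper and its `temperature` parameter are hypothetical placeholders for your LLM API:

```python
from collections import Counter

def complete(prompt: str, temperature: float = 1.0) -> str:
    """Placeholder for an LLM call that supports sampling temperature."""
    raise NotImplementedError

def self_consistent_answer(prompt: str, n_samples: int = 5) -> str:
    # Sample several diverse completions, then keep the most common final answer.
    answers = []
    for _ in range(n_samples):
        reply = complete(
            prompt
            + "\nThink step by step, then give a final answer on the last line "
              "as 'Answer: <value>'.",
            temperature=0.7,
        )
        answers.append(reply.strip().splitlines()[-1])
    most_common, _count = Counter(answers).most_common(1)[0]
    return most_common
```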

Learn more at [learnprompting.org](https://learnprompting.org/docs/reliability/intro)