From dad520c2b168b0e662eb826f95ac5776f5813355 Mon Sep 17 00:00:00 2001 From: Jeongmoon Choi Date: Wed, 4 Mar 2026 12:48:04 -0500 Subject: [PATCH 1/4] Revise copilot instructions for clarity and precision in coding agent policies --- .github/copilot-instructions.md | 263 ++++++++++++++++---------------- 1 file changed, 131 insertions(+), 132 deletions(-) diff --git a/.github/copilot-instructions.md b/.github/copilot-instructions.md index 2802891..2138b94 100644 --- a/.github/copilot-instructions.md +++ b/.github/copilot-instructions.md @@ -3,142 +3,141 @@ layout: default title: "Agentic Coding" --- -# Agentic Coding: Humans Design, Agents code! +# Agentic Coding 2026: Precision Implementation Rules -> If you are an AI agent involved in building LLM Systems, read this guide **VERY, VERY** carefully! This is the most important chapter in the entire document. Throughout development, you should always (1) start with a small and simple solution, (2) design at a high level (`docs/design.md`) before implementation, and (3) frequently ask humans for feedback and clarification. +> This file is the execution policy for coding agents (GPT, Claude, Gemini, and others). Treat these rules as mandatory unless a human explicitly overrides them. {: .warning } -## Agentic Coding Steps - -Agentic Coding should be a collaboration between Human System Design and Agent Implementation: - -| Steps | Human | AI | Comment | -|:-----------------------|:----------:|:---------:|:------------------------------------------------------------------------| -| 1. Requirements | ★★★ High | ★☆☆ Low | Humans understand the requirements and context. | -| 2. Flow | ★★☆ Medium | ★★☆ Medium | Humans specify the high-level design, and the AI fills in the details. | -| 3. Utilities | ★★☆ Medium | ★★☆ Medium | Humans provide available external APIs and integrations, and the AI helps with implementation. | -| 4. Data | ★☆☆ Low | ★★★ High | AI designs the data schema, and humans verify. | -| 5. 
Node | ★☆☆ Low | ★★★ High | The AI helps design the node based on the flow. | -| 6. Implementation | ★☆☆ Low | ★★★ High | The AI implements the flow based on the design. | -| 7. Optimization | ★★☆ Medium | ★★☆ Medium | Humans evaluate the results, and the AI helps optimize. | -| 8. Reliability | ★☆☆ Low | ★★★ High | The AI writes test cases and addresses corner cases. | - -1. **Requirements**: Clarify the requirements for your project, and evaluate whether an AI system is a good fit. - - Understand AI systems' strengths and limitations: - - **Good for**: Routine tasks requiring common sense (filling forms, replying to emails) - - **Good for**: Creative tasks with well-defined inputs (building slides, writing SQL) - - **Not good for**: Ambiguous problems requiring complex decision-making (business strategy, startup planning) - - **Keep It User-Centric:** Explain the "problem" from the user's perspective rather than just listing features. - - **Balance complexity vs. impact**: Aim to deliver the highest value features with minimal complexity early. - -2. **Flow Design**: Outline at a high level, describe how your AI system orchestrates nodes. - - Identify applicable design patterns (e.g., [Map Reduce](./design_pattern/mapreduce.md), [Agent](./design_pattern/agent.md), [RAG](./design_pattern/rag.md)). - - For each node in the flow, start with a high-level one-line description of what it does. - - If using **Map Reduce**, specify how to map (what to split) and how to reduce (how to combine). - - If using **Agent**, specify what are the inputs (context) and what are the possible actions. - - If using **RAG**, specify what to embed, noting that there's usually both offline (indexing) and online (retrieval) workflows. - - Outline the flow and draw it in a mermaid diagram. 
For example: - ```mermaid - flowchart LR - start[Start] --> batch[Batch] - batch --> check[Check] - check -->|OK| process - check -->|Error| fix[Fix] - fix --> check - - subgraph process[Process] - step1[Step 1] --> step2[Step 2] - end - - process --> endNode[End] - ``` - - > **If Humans can't specify the flow, AI Agents can't automate it!** Before building an LLM system, thoroughly understand the problem and potential solution by manually solving example inputs to develop intuition. - {: .best-practice } - -3. **Utilities**: Based on the Flow Design, identify and implement necessary utility functions. - - Think of your AI system as the brain. It needs a body—these *external utility functions*—to interact with the real world: -
- - - Reading inputs (e.g., retrieving Slack messages, reading emails) - - Writing outputs (e.g., generating reports, sending emails) - - Using external tools (e.g., calling LLMs, searching the web) - - **NOTE**: *LLM-based tasks* (e.g., summarizing text, analyzing sentiment) are **NOT** utility functions; rather, they are *core functions* internal in the AI system. - - For each utility function, implement it and write a simple test. - - Document their input/output, as well as why they are necessary. For example: - - `name`: `get_embedding` (`utils/get_embedding.py`) - - `input`: `str` - - `output`: a vector of 3072 floats - - `necessity`: Used by the second node to embed text - - Example utility implementation: - ```python - # utils/call_llm.py - from openai import OpenAI - - def call_llm(prompt): - client = OpenAI(api_key="YOUR_API_KEY_HERE") - r = client.chat.completions.create( - model="gpt-4o", - messages=[{"role": "user", "content": prompt}] - ) - return r.choices[0].message.content - - if __name__ == "__main__": - prompt = "What is the meaning of life?" - print(call_llm(prompt)) - ``` - - > **Sometimes, design Utilities before Flow:** For example, for an LLM project to automate a legacy system, the bottleneck will likely be the available interface to that system. Start by designing the hardest utilities for interfacing, and then build the flow around them. - {: .best-practice } - - > **Avoid Exception Handling in Utilities**: If a utility function is called from a Node's `exec()` method, avoid using `try...except` blocks within the utility. Let the Node's built-in retry mechanism handle failures. - {: .warning } - -4. **Data Design**: Design the shared store that nodes will use to communicate. - - One core design principle for PocketFlow is to use a well-designed [shared store](./core_abstraction/communication.md)—a data contract that all nodes agree upon to retrieve and store data. - - For simple systems, use an in-memory dictionary. 
- - For more complex systems or when persistence is required, use a database. - - **Don't Repeat Yourself**: Use in-memory references or foreign keys. - - Example shared store design: - ```python - shared = { - "user": { - "id": "user123", - "context": { # Another nested dict - "weather": {"temp": 72, "condition": "sunny"}, - "location": "San Francisco" - } - }, - "results": {} # Empty dict to store outputs - } - ``` - -5. **Node Design**: Plan how each node will read and write data, and use utility functions. - - For each [Node](./core_abstraction/node.md), describe its type, how it reads and writes data, and which utility function it uses. Keep it specific but high-level without codes. For example: - - `type`: Regular (or Batch, or Async) - - `prep`: Read "text" from the shared store - - `exec`: Call the embedding utility function. **Avoid exception handling here**; let the Node's retry mechanism manage failures. - - `post`: Write "embedding" to the shared store - -6. **Implementation**: Implement the initial nodes and flows based on the design. - - 🎉 If you've reached this step, humans have finished the design. Now *Agentic Coding* begins! - - **"Keep it simple, stupid!"** Avoid complex features and full-scale type checking. - - **FAIL FAST**! Leverage the built-in [Node](./core_abstraction/node.md) retry and fallback mechanisms to handle failures gracefully. This helps you quickly identify weak points in the system. - - Add logging throughout the code to facilitate debugging. - -7. **Optimization**: - - **Use Intuition**: For a quick initial evaluation, human intuition is often a good start. - - **Redesign Flow (Back to Step 3)**: Consider breaking down tasks further, introducing agentic decisions, or better managing input contexts. - - If your flow design is already solid, move on to micro-optimizations: - - **Prompt Engineering**: Use clear, specific instructions with examples to reduce ambiguity. 
- - **In-Context Learning**: Provide robust examples for tasks that are difficult to specify with instructions alone. - - - > **You'll likely iterate a lot!** Expect to repeat Steps 3–6 hundreds of times. - > - >
- {: .best-practice } +## Non-Negotiable Principles + +1. **Design before code**: Update `docs/design.md` first, then implement. +2. **Minimalist by default**: Choose the simplest flow that satisfies requirements. +3. **Deterministic data contracts**: Use explicit schema + type hints for shared data and node I/O. +4. **Node-owned reliability**: Handle failures with PocketFlow node retries/fallbacks, not utility-level exception swallowing. +5. **Modern Python only**: Prefer explicit types, f-strings, and async/await when work is I/O-bound. + +## Required Build Order (Do Not Skip) + +1. **Requirements** + - Restate user problem as concrete user-facing outcomes. + - Keep first implementation narrow and testable. + +2. **Flow Design (in `docs/design.md`)** + - Specify design pattern (Workflow, Agent, RAG, MapReduce, etc.) and why. + - Provide one-line purpose for each node. + - Include a Mermaid diagram with actions/branches. + +3. **Schema-First Data Design (in `docs/design.md` before coding)** + - Define the Shared Store schema first. + - Add explicit Python typing for store shape (prefer `TypedDict`, dataclasses, or precise aliases). + - For each node, document: + - keys read in `prep` + - value returned by `exec` + - keys written in `post` + - If implementation changes data flow, update schema in `docs/design.md` first. + +4. **Utility Design** + - Utilities are external interfaces only (API calls, file/database I/O, web/search, etc.). + - Keep utilities small, composable, and side-effect transparent. + - Document utility input/output types. + +5. **Node/Flow Implementation** + - Implement from design with minimal deviation. + - Keep logic in nodes; keep utilities thin. + +6. **Reliability + Validation** + - Add focused checks/tests for critical paths. + - Verify schema consistency and action routing. 
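The schema-first rule above can be sketched as a typed Shared Store. A minimal illustration; the key names below (`text`, `embedding`) are hypothetical, chosen only to mirror the embedding example used elsewhere in this guide:

```python
from typing import TypedDict


class SharedStore(TypedDict):
    """The data contract all nodes agree on (document it in docs/design.md)."""

    text: str               # read by the embed node's prep
    embedding: list[float]  # written by the embed node's post


def init_store(text: str) -> SharedStore:
    # Populate every key up front so a missing key fails fast, not mid-flow.
    return {"text": text, "embedding": []}


store = init_store("hello world")
assert set(store) == {"text", "embedding"}
```

Keeping the schema in one `TypedDict` gives each node's `prep`/`post` a single typed source of truth to read from and write to.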
+ +## PocketFlow Lifecycle Contract (Mandatory) + +For **every Node**, strictly follow `prep -> exec -> post`: + +- **`prep(shared)`**: read and preprocess from shared store only. +- **`exec(prep_res)`**: compute only; no shared-store mutation. +- **`post(shared, prep_res, exec_res)`**: write results + decide next action. + +Do not collapse responsibilities across phases unless there is a clear and documented reason. + +## Strict Type Hinting Policy (Mandatory) + +- All new or modified Python functions must include type hints. +- Every node method (`prep`, `exec`, `post`, async variants) must have explicit return types. +- Shared Store keys must be typed via schema objects (prefer `TypedDict`; use nested types where needed). +- Utility function signatures must be fully typed. +- Avoid `Any` unless justified in `docs/design.md`. + +## Fault Tolerance Policy (Mandatory) + +- **Forbidden in utility functions**: `try/except` that catches and masks operational errors. +- Utilities should raise errors naturally. +- Reliability belongs to nodes via: + - `max_retries` + - `wait` + - `exec_fallback` / `exec_fallback_async` +- Use assertions/validation in `exec` to trigger retries when outputs are malformed. + +## Modern Python + Minimalism Standard + +- Use Python 3.11+ style where possible: + - f-strings for formatting + - `pathlib` over brittle string paths + - `async/await` for I/O-bound concurrency + - small pure functions and clear names +- Keep PocketFlow code lean: + - minimal abstractions + - no speculative architecture + - no unnecessary dependencies + +## Agent Execution Checklist (Before Writing Code) + +The coding agent must verify all items are true: + +1. `docs/design.md` exists and reflects latest requirements. +2. Shared Store schema is explicit and typed. +3. Node lifecycle responsibilities are defined per node. +4. Utility interfaces are listed with typed signatures. +5. Reliability strategy uses node retries/fallbacks (not utility `try/except`). 
+ +If any item is missing, update `docs/design.md` first, then proceed. + +## Reference Node Template (Typed) + +```python +from __future__ import annotations + +from typing import TypedDict +from pocketflow import Node + + +class SharedStore(TypedDict): + question: str + answer: str + + +class AnswerNode(Node): + def prep(self, shared: SharedStore) -> str: + return shared["question"] + + def exec(self, prep_res: str) -> str: + response = call_llm(f"Answer briefly: {prep_res}") + assert isinstance(response, str) and response.strip() + return response + + def post(self, shared: SharedStore, prep_res: str, exec_res: str) -> str: + shared["answer"] = exec_res + return "default" +``` + +## Enforcement Tone for All LLM Agents + +When acting as an implementation agent in this repository: -8. **Reliability** - - **Node Retries**: Add checks in the node `exec` to ensure outputs meet requirements, and consider increasing `max_retries` and `wait` times. - - **Logging and Visualization**: Maintain logs of all attempts and visualize node results for easier debugging. - - **Self-Evaluation**: Add a separate node (powered by an LLM) to review outputs when results are uncertain. +- Be precise, constrained, and schema-driven. +- Prefer correctness over cleverness. +- Never skip design/schema updates. +- Never bypass the node lifecycle contract. +- Never move fault handling into utility `try/except`. 
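The node-owned reliability rules above amount to a retry loop wrapped around `exec`. Below is a simplified, dependency-free sketch of that loop; PocketFlow's actual `Node` implements the same idea via `max_retries`, `wait`, and `exec_fallback`, and its internals differ in detail:

```python
import time
from typing import Callable, Optional, TypeVar

T = TypeVar("T")
R = TypeVar("R")


def run_exec_with_retries(
    exec_fn: Callable[[T], R],
    prep_res: T,
    max_retries: int = 3,
    wait: float = 0.0,
    exec_fallback: Optional[Callable[[T, Exception], R]] = None,
) -> R:
    """Retry exec_fn up to max_retries times; defer to the fallback on final failure."""
    last_exc: Exception = RuntimeError("max_retries must be >= 1")
    for attempt in range(max_retries):
        try:
            # The utility itself raises naturally; only this node-level loop catches.
            return exec_fn(prep_res)
        except Exception as exc:
            last_exc = exc
            if attempt < max_retries - 1 and wait:
                time.sleep(wait)
    if exec_fallback is not None:
        return exec_fallback(prep_res, last_exc)
    raise last_exc


calls = {"n": 0}

def flaky_upper(text: str) -> str:
    # Fails twice, then succeeds, to exercise the retry path.
    calls["n"] += 1
    if calls["n"] < 3:
        raise RuntimeError("transient failure")
    return text.upper()

assert run_exec_with_retries(flaky_upper, "ok", max_retries=3) == "OK"
assert calls["n"] == 3
```

Note that the `try/except` lives in the retry loop, not in the utility: this is exactly the separation the fault tolerance policy mandates.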
## Example LLM Project File Structure From e92c827189fb416b03e935a09a98b0de232ebb7b Mon Sep 17 00:00:00 2001 From: Jeongmoon Choi Date: Wed, 4 Mar 2026 12:53:31 -0500 Subject: [PATCH 2/4] Revise copilot instructions for clarity and detail in agent coding steps --- .github/copilot-instructions.md | 263 ++++++++++++++++---------------- 1 file changed, 132 insertions(+), 131 deletions(-) diff --git a/.github/copilot-instructions.md b/.github/copilot-instructions.md index 2138b94..2802891 100644 --- a/.github/copilot-instructions.md +++ b/.github/copilot-instructions.md @@ -3,141 +3,142 @@ layout: default title: "Agentic Coding" --- -# Agentic Coding 2026: Precision Implementation Rules +# Agentic Coding: Humans Design, Agents code! -> This file is the execution policy for coding agents (GPT, Claude, Gemini, and others). Treat these rules as mandatory unless a human explicitly overrides them. +> If you are an AI agent involved in building LLM Systems, read this guide **VERY, VERY** carefully! This is the most important chapter in the entire document. Throughout development, you should always (1) start with a small and simple solution, (2) design at a high level (`docs/design.md`) before implementation, and (3) frequently ask humans for feedback and clarification. {: .warning } -## Non-Negotiable Principles - -1. **Design before code**: Update `docs/design.md` first, then implement. -2. **Minimalist by default**: Choose the simplest flow that satisfies requirements. -3. **Deterministic data contracts**: Use explicit schema + type hints for shared data and node I/O. -4. **Node-owned reliability**: Handle failures with PocketFlow node retries/fallbacks, not utility-level exception swallowing. -5. **Modern Python only**: Prefer explicit types, f-strings, and async/await when work is I/O-bound. - -## Required Build Order (Do Not Skip) - -1. **Requirements** - - Restate user problem as concrete user-facing outcomes. - - Keep first implementation narrow and testable. - -2. 
**Flow Design (in `docs/design.md`)** - - Specify design pattern (Workflow, Agent, RAG, MapReduce, etc.) and why. - - Provide one-line purpose for each node. - - Include a Mermaid diagram with actions/branches. - -3. **Schema-First Data Design (in `docs/design.md` before coding)** - - Define the Shared Store schema first. - - Add explicit Python typing for store shape (prefer `TypedDict`, dataclasses, or precise aliases). - - For each node, document: - - keys read in `prep` - - value returned by `exec` - - keys written in `post` - - If implementation changes data flow, update schema in `docs/design.md` first. - -4. **Utility Design** - - Utilities are external interfaces only (API calls, file/database I/O, web/search, etc.). - - Keep utilities small, composable, and side-effect transparent. - - Document utility input/output types. - -5. **Node/Flow Implementation** - - Implement from design with minimal deviation. - - Keep logic in nodes; keep utilities thin. - -6. **Reliability + Validation** - - Add focused checks/tests for critical paths. - - Verify schema consistency and action routing. - -## PocketFlow Lifecycle Contract (Mandatory) - -For **every Node**, strictly follow `prep -> exec -> post`: - -- **`prep(shared)`**: read and preprocess from shared store only. -- **`exec(prep_res)`**: compute only; no shared-store mutation. -- **`post(shared, prep_res, exec_res)`**: write results + decide next action. - -Do not collapse responsibilities across phases unless there is a clear and documented reason. - -## Strict Type Hinting Policy (Mandatory) - -- All new or modified Python functions must include type hints. -- Every node method (`prep`, `exec`, `post`, async variants) must have explicit return types. -- Shared Store keys must be typed via schema objects (prefer `TypedDict`; use nested types where needed). -- Utility function signatures must be fully typed. -- Avoid `Any` unless justified in `docs/design.md`. 
- -## Fault Tolerance Policy (Mandatory) - -- **Forbidden in utility functions**: `try/except` that catches and masks operational errors. -- Utilities should raise errors naturally. -- Reliability belongs to nodes via: - - `max_retries` - - `wait` - - `exec_fallback` / `exec_fallback_async` -- Use assertions/validation in `exec` to trigger retries when outputs are malformed. - -## Modern Python + Minimalism Standard - -- Use Python 3.11+ style where possible: - - f-strings for formatting - - `pathlib` over brittle string paths - - `async/await` for I/O-bound concurrency - - small pure functions and clear names -- Keep PocketFlow code lean: - - minimal abstractions - - no speculative architecture - - no unnecessary dependencies - -## Agent Execution Checklist (Before Writing Code) - -The coding agent must verify all items are true: - -1. `docs/design.md` exists and reflects latest requirements. -2. Shared Store schema is explicit and typed. -3. Node lifecycle responsibilities are defined per node. -4. Utility interfaces are listed with typed signatures. -5. Reliability strategy uses node retries/fallbacks (not utility `try/except`). - -If any item is missing, update `docs/design.md` first, then proceed. 
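As a concrete instance of the typing policy in this section, a utility that passes the checklist might look like the following. `chunk_text` is a hypothetical helper, not part of PocketFlow; it is small, pure, fully typed, and raises naturally instead of swallowing errors:

```python
def chunk_text(text: str, size: int = 100) -> list[str]:
    """Split text into fixed-size chunks.

    Fully typed and side-effect free; bad input raises a ValueError
    rather than being masked by a try/except inside the utility.
    """
    if size <= 0:
        raise ValueError(f"size must be positive, got {size}")
    return [text[i : i + size] for i in range(0, len(text), size)]


assert chunk_text("abcdefgh", size=3) == ["abc", "def", "gh"]
assert chunk_text("", size=3) == []
```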
- -## Reference Node Template (Typed) - -```python -from __future__ import annotations - -from typing import TypedDict -from pocketflow import Node - - -class SharedStore(TypedDict): - question: str - answer: str - - -class AnswerNode(Node): - def prep(self, shared: SharedStore) -> str: - return shared["question"] - - def exec(self, prep_res: str) -> str: - response = call_llm(f"Answer briefly: {prep_res}") - assert isinstance(response, str) and response.strip() - return response - - def post(self, shared: SharedStore, prep_res: str, exec_res: str) -> str: - shared["answer"] = exec_res - return "default" -``` - -## Enforcement Tone for All LLM Agents - -When acting as an implementation agent in this repository: +## Agentic Coding Steps + +Agentic Coding should be a collaboration between Human System Design and Agent Implementation: + +| Steps | Human | AI | Comment | +|:-----------------------|:----------:|:---------:|:------------------------------------------------------------------------| +| 1. Requirements | ★★★ High | ★☆☆ Low | Humans understand the requirements and context. | +| 2. Flow | ★★☆ Medium | ★★☆ Medium | Humans specify the high-level design, and the AI fills in the details. | +| 3. Utilities | ★★☆ Medium | ★★☆ Medium | Humans provide available external APIs and integrations, and the AI helps with implementation. | +| 4. Data | ★☆☆ Low | ★★★ High | AI designs the data schema, and humans verify. | +| 5. Node | ★☆☆ Low | ★★★ High | The AI helps design the node based on the flow. | +| 6. Implementation | ★☆☆ Low | ★★★ High | The AI implements the flow based on the design. | +| 7. Optimization | ★★☆ Medium | ★★☆ Medium | Humans evaluate the results, and the AI helps optimize. | +| 8. Reliability | ★☆☆ Low | ★★★ High | The AI writes test cases and addresses corner cases. | + +1. **Requirements**: Clarify the requirements for your project, and evaluate whether an AI system is a good fit. 
+ - Understand AI systems' strengths and limitations: + - **Good for**: Routine tasks requiring common sense (filling forms, replying to emails) + - **Good for**: Creative tasks with well-defined inputs (building slides, writing SQL) + - **Not good for**: Ambiguous problems requiring complex decision-making (business strategy, startup planning) + - **Keep It User-Centric:** Explain the "problem" from the user's perspective rather than just listing features. + - **Balance complexity vs. impact**: Aim to deliver the highest value features with minimal complexity early. + +2. **Flow Design**: Outline at a high level, describe how your AI system orchestrates nodes. + - Identify applicable design patterns (e.g., [Map Reduce](./design_pattern/mapreduce.md), [Agent](./design_pattern/agent.md), [RAG](./design_pattern/rag.md)). + - For each node in the flow, start with a high-level one-line description of what it does. + - If using **Map Reduce**, specify how to map (what to split) and how to reduce (how to combine). + - If using **Agent**, specify what are the inputs (context) and what are the possible actions. + - If using **RAG**, specify what to embed, noting that there's usually both offline (indexing) and online (retrieval) workflows. + - Outline the flow and draw it in a mermaid diagram. For example: + ```mermaid + flowchart LR + start[Start] --> batch[Batch] + batch --> check[Check] + check -->|OK| process + check -->|Error| fix[Fix] + fix --> check + + subgraph process[Process] + step1[Step 1] --> step2[Step 2] + end + + process --> endNode[End] + ``` + - > **If Humans can't specify the flow, AI Agents can't automate it!** Before building an LLM system, thoroughly understand the problem and potential solution by manually solving example inputs to develop intuition. + {: .best-practice } + +3. **Utilities**: Based on the Flow Design, identify and implement necessary utility functions. + - Think of your AI system as the brain. 
It needs a body—these *external utility functions*—to interact with the real world: +
+ + - Reading inputs (e.g., retrieving Slack messages, reading emails) + - Writing outputs (e.g., generating reports, sending emails) + - Using external tools (e.g., calling LLMs, searching the web) + - **NOTE**: *LLM-based tasks* (e.g., summarizing text, analyzing sentiment) are **NOT** utility functions; rather, they are *core functions* internal in the AI system. + - For each utility function, implement it and write a simple test. + - Document their input/output, as well as why they are necessary. For example: + - `name`: `get_embedding` (`utils/get_embedding.py`) + - `input`: `str` + - `output`: a vector of 3072 floats + - `necessity`: Used by the second node to embed text + - Example utility implementation: + ```python + # utils/call_llm.py + from openai import OpenAI + + def call_llm(prompt): + client = OpenAI(api_key="YOUR_API_KEY_HERE") + r = client.chat.completions.create( + model="gpt-4o", + messages=[{"role": "user", "content": prompt}] + ) + return r.choices[0].message.content + + if __name__ == "__main__": + prompt = "What is the meaning of life?" + print(call_llm(prompt)) + ``` + - > **Sometimes, design Utilities before Flow:** For example, for an LLM project to automate a legacy system, the bottleneck will likely be the available interface to that system. Start by designing the hardest utilities for interfacing, and then build the flow around them. + {: .best-practice } + - > **Avoid Exception Handling in Utilities**: If a utility function is called from a Node's `exec()` method, avoid using `try...except` blocks within the utility. Let the Node's built-in retry mechanism handle failures. + {: .warning } + +4. **Data Design**: Design the shared store that nodes will use to communicate. + - One core design principle for PocketFlow is to use a well-designed [shared store](./core_abstraction/communication.md)—a data contract that all nodes agree upon to retrieve and store data. + - For simple systems, use an in-memory dictionary. 
+ - For more complex systems or when persistence is required, use a database. + - **Don't Repeat Yourself**: Use in-memory references or foreign keys. + - Example shared store design: + ```python + shared = { + "user": { + "id": "user123", + "context": { # Another nested dict + "weather": {"temp": 72, "condition": "sunny"}, + "location": "San Francisco" + } + }, + "results": {} # Empty dict to store outputs + } + ``` + +5. **Node Design**: Plan how each node will read and write data, and use utility functions. + - For each [Node](./core_abstraction/node.md), describe its type, how it reads and writes data, and which utility function it uses. Keep it specific but high-level without codes. For example: + - `type`: Regular (or Batch, or Async) + - `prep`: Read "text" from the shared store + - `exec`: Call the embedding utility function. **Avoid exception handling here**; let the Node's retry mechanism manage failures. + - `post`: Write "embedding" to the shared store + +6. **Implementation**: Implement the initial nodes and flows based on the design. + - 🎉 If you've reached this step, humans have finished the design. Now *Agentic Coding* begins! + - **"Keep it simple, stupid!"** Avoid complex features and full-scale type checking. + - **FAIL FAST**! Leverage the built-in [Node](./core_abstraction/node.md) retry and fallback mechanisms to handle failures gracefully. This helps you quickly identify weak points in the system. + - Add logging throughout the code to facilitate debugging. + +7. **Optimization**: + - **Use Intuition**: For a quick initial evaluation, human intuition is often a good start. + - **Redesign Flow (Back to Step 3)**: Consider breaking down tasks further, introducing agentic decisions, or better managing input contexts. + - If your flow design is already solid, move on to micro-optimizations: + - **Prompt Engineering**: Use clear, specific instructions with examples to reduce ambiguity. 
+ - **In-Context Learning**: Provide robust examples for tasks that are difficult to specify with instructions alone. + + - > **You'll likely iterate a lot!** Expect to repeat Steps 3–6 hundreds of times. + > + >
+ {: .best-practice } -- Be precise, constrained, and schema-driven. -- Prefer correctness over cleverness. -- Never skip design/schema updates. -- Never bypass the node lifecycle contract. -- Never move fault handling into utility `try/except`. +8. **Reliability** + - **Node Retries**: Add checks in the node `exec` to ensure outputs meet requirements, and consider increasing `max_retries` and `wait` times. + - **Logging and Visualization**: Maintain logs of all attempts and visualize node results for easier debugging. + - **Self-Evaluation**: Add a separate node (powered by an LLM) to review outputs when results are uncertain. ## Example LLM Project File Structure From 7406eaebb20f87a4aaea812221b4c3ec528b7e5b Mon Sep 17 00:00:00 2001 From: Jeongmoon Choi Date: Wed, 4 Mar 2026 20:45:02 -0500 Subject: [PATCH 3/4] Update copilot-instructions.md: refresh outdated content MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit 1. Security: Replace all hardcoded api_key="YOUR_API_KEY_HERE" with os.environ.get(...) across every LLM wrapper example (OpenAI, Anthropic, Azure, Gemini) Env config: Add .env file to project structure with API key placeholders; add python-dotenv to requirements; add load_dotenv() to main.py example 2. LLM wrappers: Google: Fix broken indentation, rename "PaLM API" → "Gemini", use gemini-2.5-flash default Azure: Update API version 2023-05-15 → 2024-12-01-preview, use env vars for endpoint/key/deployment Ollama: Update model llama2 → llama3.3 OpenAI/Claude: Make model configurable via env vars 3. Design patterns: Add 6 newer patterns (Streaming, MCP, Memory, Supervisor, HITL, Majority Vote) with link to cookbook 4. Utility functions: Add MCP Tools link to utility list 5. 
Utils example: Replace buggy Gemini snippet (mismatched use_cache param) with clean OpenAI example --- .github/copilot-instructions.md | 86 +++++++++++++++++++++------------ 1 file changed, 56 insertions(+), 30 deletions(-) diff --git a/.github/copilot-instructions.md b/.github/copilot-instructions.md index 2802891..ad6e44a 100644 --- a/.github/copilot-instructions.md +++ b/.github/copilot-instructions.md @@ -72,12 +72,13 @@ Agentic Coding should be a collaboration between Human System Design and Agent I - Example utility implementation: ```python # utils/call_llm.py + import os from openai import OpenAI - def call_llm(prompt): - client = OpenAI(api_key="YOUR_API_KEY_HERE") + def call_llm(prompt): + client = OpenAI(api_key=os.environ.get("OPENAI_API_KEY")) r = client.chat.completions.create( - model="gpt-4o", + model=os.environ.get("OPENAI_MODEL", "gpt-4o"), messages=[{"role": "user", "content": prompt}] ) return r.choices[0].message.content @@ -151,15 +152,24 @@ my_project/ │ ├── __init__.py │ ├── call_llm.py │ └── search_web.py +├── .env ├── requirements.txt └── docs/ └── design.md ``` +- **`.env`**: Stores API keys and configuration. **Never commit this file to version control.** + ``` + OPENAI_API_KEY=your-api-key-here + # GEMINI_API_KEY=your-gemini-key-here + # ANTHROPIC_API_KEY=your-anthropic-key-here + ``` + - **`requirements.txt`**: Lists the Python dependencies for the project. ``` PyYAML pocketflow + python-dotenv ``` - **`docs/design.md`**: Contains project documentation for each step above. This should be *high-level* and *no-code*. @@ -249,24 +259,22 @@ my_project/ - It's recommended to dedicate one Python file to each API call, for example `call_llm.py` or `search_web.py`. 
- Each file should also include a `main()` function to try that API call ```python - from google import genai import os + from openai import OpenAI def call_llm(prompt: str) -> str: - client = genai.Client( - api_key=os.getenv("GEMINI_API_KEY", ""), + client = OpenAI(api_key=os.environ.get("OPENAI_API_KEY")) + r = client.chat.completions.create( + model=os.environ.get("OPENAI_MODEL", "gpt-4o"), + messages=[{"role": "user", "content": prompt}] ) - model = os.getenv("GEMINI_MODEL", "gemini-2.5-flash") - response = client.models.generate_content(model=model, contents=[prompt]) - return response.text + return r.choices[0].message.content if __name__ == "__main__": test_prompt = "Hello, how are you?" - - # First call - should hit the API print("Making call...") - response1 = call_llm(test_prompt, use_cache=False) - print(f"Response: {response1}") + response = call_llm(test_prompt) + print(f"Response: {response}") ``` - **`nodes.py`**: Contains all the node definitions. @@ -320,6 +328,9 @@ my_project/ - **`main.py`**: Serves as the project's entry point. ```python # main.py + from dotenv import load_dotenv + load_dotenv() + from flow import create_qa_flow # Example main function @@ -387,6 +398,15 @@ From there, it’s easy to implement popular design patterns: - [Structured Output](./design_pattern/structure.md) formats outputs consistently. - [(Advanced) Multi-Agents](./design_pattern/multi_agent.md) coordinate multiple agents. +Additional patterns (see [cookbook examples](https://github.com/The-Pocket/PocketFlow#how-does-pocket-flow-work)): + +- **Streaming**: Real-time token-by-token LLM output with user interrupt capability. +- **MCP (Model Context Protocol)**: Integrate external tool servers as agent actions. +- **Memory**: Short-term and long-term memory for persistent conversations. +- **Supervisor**: Add a supervision layer over unreliable agents. +- **Human-in-the-Loop (HITL)**: Pause flows for human review and feedback. 
+- **Majority Vote**: Improve reasoning accuracy by aggregating multiple attempts. +
@@ -402,6 +422,7 @@ We **do not** provide built-in utilities. Instead, we offer *examples*—please - [Embedding](./utility_function/embedding.md) - [Vector Databases](./utility_function/vector.md) - [Text-to-Speech](./utility_function/text_to_speech.md) +- [MCP Tools](https://modelcontextprotocol.io/) (external tool servers for agents) **Why not built-in?**: I believe it's a *bad practice* for vendor-specific APIs in a general framework: - *API Volatility*: Frequent changes lead to heavy maintenance for hardcoded APIs. @@ -1649,10 +1670,11 @@ Here, we provide some minimal example implementations: 1. OpenAI ```python def call_llm(prompt): + import os from openai import OpenAI - client = OpenAI(api_key="YOUR_API_KEY_HERE") + client = OpenAI(api_key=os.environ.get("OPENAI_API_KEY")) r = client.chat.completions.create( - model="gpt-4o", + model=os.environ.get("OPENAI_MODEL", "gpt-4o"), messages=[{"role": "user", "content": prompt}] ) return r.choices[0].message.content @@ -1666,8 +1688,9 @@ Here, we provide some minimal example implementations: 2. Claude (Anthropic) ```python def call_llm(prompt): + import os from anthropic import Anthropic - client = Anthropic(api_key="YOUR_API_KEY_HERE") + client = Anthropic(api_key=os.environ.get("ANTHROPIC_API_KEY")) r = client.messages.create( model="claude-sonnet-4-0", messages=[ @@ -1677,29 +1700,31 @@ Here, we provide some minimal example implementations: return r.content[0].text ``` -3. Google (Generative AI Studio / PaLM API) +3. Google (Gemini) ```python def call_llm(prompt): - from google import genai - client = genai.Client(api_key='GEMINI_API_KEY') + import os + from google import genai + client = genai.Client(api_key=os.environ.get("GEMINI_API_KEY")) response = client.models.generate_content( - model='gemini-2.5-pro', - contents=prompt - ) - return response.text + model=os.environ.get("GEMINI_MODEL", "gemini-2.5-flash"), + contents=prompt + ) + return response.text ``` 4. 
Azure (Azure OpenAI) ```python def call_llm(prompt): + import os from openai import AzureOpenAI client = AzureOpenAI( - azure_endpoint="https://.openai.azure.com/", - api_key="YOUR_API_KEY_HERE", - api_version="2023-05-15" + azure_endpoint=os.environ.get("AZURE_OPENAI_ENDPOINT"), + api_key=os.environ.get("AZURE_OPENAI_API_KEY"), + api_version="2024-12-01-preview" ) r = client.chat.completions.create( - model="", + model=os.environ.get("AZURE_DEPLOYMENT_NAME", "gpt-4o"), messages=[{"role": "user", "content": prompt}] ) return r.choices[0].message.content @@ -1710,7 +1735,7 @@ Here, we provide some minimal example implementations: def call_llm(prompt): from ollama import chat response = chat( - model="llama2", + model="llama3.3", messages=[{"role": "user", "content": prompt}] ) return response.message.content @@ -1723,10 +1748,11 @@ Feel free to enhance your `call_llm` function as needed. Here are examples: ```python def call_llm(messages): + import os from openai import OpenAI - client = OpenAI(api_key="YOUR_API_KEY_HERE") + client = OpenAI(api_key=os.environ.get("OPENAI_API_KEY")) r = client.chat.completions.create( - model="gpt-4o", + model=os.environ.get("OPENAI_MODEL", "gpt-4o"), messages=messages ) return r.choices[0].message.content From c997a2854164cb5e06031448cd2515cefca7570d Mon Sep 17 00:00:00 2001 From: Jeongmoon Choi Date: Wed, 4 Mar 2026 12:48:04 -0500 Subject: [PATCH 4/4] Revise copilot instructions for clarity and precision in coding agent policies MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Revise copilot instructions for clarity and detail in agent coding steps Update copilot-instructions.md: refresh outdated content 1. Security: Replace all hardcoded api_key="YOUR_API_KEY_HERE" with os.environ.get(...) 
across every LLM wrapper example (OpenAI, Anthropic, Azure, Gemini)
Env config: Add .env file to project structure with API key placeholders; add python-dotenv to requirements; add load_dotenv() to main.py example
2. LLM wrappers:
   Google: Fix broken indentation, rename "PaLM API" → "Gemini", use gemini-2.5-flash default
   Azure: Update API version 2023-05-15 → 2024-12-01-preview, use env vars for endpoint/key/deployment
   Ollama: Update model llama2 → llama3.3
   OpenAI/Claude: Make model configurable via env vars
3. Design patterns: Add 6 newer patterns (Streaming, MCP, Memory, Supervisor, HITL, Majority Vote) with link to cookbook
4. Utility functions: Add MCP Tools link to utility list
5. Utils example: Replace buggy Gemini snippet (mismatched use_cache param) with clean OpenAI example
---
 .github/copilot-instructions.md | 86 +++++++++++++++++++++------------
 1 file changed, 56 insertions(+), 30 deletions(-)