Building a Human‑in‑the‑Loop Editor with LangGraph

Agents, State, Routers, Map‑Reduce, Memory, and Breakpoints—explained with working code

As a Gen‑AI trainer, I often need pipelines that are robust, auditable, and easy to teach. LangGraph, a graph-native orchestration library atop LangChain, gives us exactly that: we model complex LLM workflows as stateful graphs of agents (nodes) connected by edges, with conditional routing, memory/checkpoints, and breakpoints to pause for human review. In this blog, we’ll walk through a real pipeline that:
1. Reads a Markdown blog
2. Extracts user-visible text and code
3. Diagnoses issues
4. Applies minimal fixes
5. Validates HTML rendering
6. Pauses for human feedback (a breakpoint)
7. Routes (router) either to apply changes or save the file
8. Persists state across runs

All of this is implemented using LangGraph’s StateGraph, conditional edges, checkpointing, and interrupts, and driven by LangChain prompts with ChatOpenAI.

Quick glossary

  1. Agent (Node): A function that owns a single responsibility (e.g., extract text, validate code). In LangGraph, nodes are agents.
  2. Graph: A directed graph of these nodes; data flows along the edges (cycles are allowed, as in our feedback loop). We build it with StateGraph.
  3. State: A typed, mergeable record (our AppState) that holds inputs/outputs between nodes. Fields can specify aggregation semantics (e.g., operator.add to append lists); see the sketch after this list.
  4. Router / Conditional Node: A node with logic to conditionally branch. In our case, we route to Apply Feedback or Save HTML based on a function that returns 'True' / 'False'.
  5. Map‑Reduce: Fan‑out parallel processing (text and code in parallel) followed by fan‑in to recombine changes.
  6. Memory / Checkpoints: Persist state across runs (per thread_id) so we can resume, inspect, or time‑travel. Implemented via MemorySaver.
  7. Breakpoints (Interrupts): Places in the graph where execution pauses. Here we pause right before “Human Feedback” so a person can inspect and adjust.
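
To make glossary items 3 and 4 concrete, here is a minimal, self-contained sketch (the state and node names are invented for illustration, not part of the blog pipeline) showing how operator.add appends list updates from successive nodes:

import operator
from typing import List
from typing_extensions import Annotated, TypedDict
from langgraph.graph import StateGraph, END

class DemoState(TypedDict):
    notes: Annotated[List[str], operator.add]  # node updates append rather than overwrite

def node_a(state: DemoState):
    return {"notes": ["note from A"]}

def node_b(state: DemoState):
    return {"notes": ["note from B"]}

demo = StateGraph(DemoState)
demo.add_node("A", node_a)
demo.add_node("B", node_b)
demo.set_entry_point("A")
demo.add_edge("A", "B")
demo.add_edge("B", END)
print(demo.compile().invoke({"notes": []}))
# {'notes': ['note from A', 'note from B']}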

Building Agentic AI Tools for Blog Editing

I want to build a tool that assists in editing my blog posts. Most of my blog posts are Markdown files that contain text, code, images, tables, and other types of data. This tool could be an Agentic AI system that:
1. Reads Markdown files: automatically processes blog posts written in Markdown format.
2. Identifies and corrects issues:
- Grammar and spelling: detects and corrects grammatical errors and misspellings.
- Code verification: analyzes and executes any code snippets within the posts to identify potential issues and suggest improvements.
3. Provides edited output: generates an edited version of the blog post with the corrections applied.

LangSmith flow: (figure: trace of the full pipeline in LangSmith)

import re
import difflib
import operator
import json
from typing import List, Dict, Any, TypedDict
from typing_extensions import Annotated, Literal

from langgraph.graph import StateGraph, END
from langchain_core.prompts import ChatPromptTemplate
from langchain_core.output_parsers import JsonOutputParser
from langgraph.checkpoint.memory import MemorySaver
from langchain_openai import ChatOpenAI

Environment helper to read API key at runtime

import os, getpass

def _set_env(var: str):
    if not os.environ.get(var):
        os.environ[var] = getpass.getpass(f"{var}: ")

_set_env("OPENAI_API_KEY")
llm = ChatOpenAI(model="gpt-4o", temperature=0)

We load the Markdown and pre‑render it to HTML so that text extraction reflects the real user-visible content.

import markdown

def read_file(file_name: str) -> str:
    '''
    Reads the file at file_name and converts its Markdown to HTML.
    '''
    # Use a context manager so the file handle is always closed
    with open(file_name, 'r', encoding='utf-8') as f:
        file_string = f.read()
    return markdown.markdown(file_string)

Agent 1: Extract visible text from Markdown/HTML

This agent enforces a strict JSON schema. It extracts only user-visible text (no HTML tags, no code except comments).

def extract_text(markdown:str) -> List[Dict[str, str]]:
    '''
    Extract text from the markdown
    '''    
    extract_text_prompt = ChatPromptTemplate.from_messages([
        ("system",
         '''You are a precise text extractor for Markdown/HTML content.
          Extract only user-visible text that renders after Markdown is converted to HTML.
          Ignore raw tags and attributes. Preserve order. Output strictly valid JSON.'''),
        ("system",
         '''Sources of text:
         - 'header': from headings (<h1>..</h1>, <h2>..</h2>, '#', '##').
         - 'text': normal paragraphs, lists, blockquotes, table cell text, captions, etc.
         - 'comment': comments inside code blocks (Python '#')'''),
        ("system",
         '''Rules:
         1) Do NOT include HTML tags; only visible text.
         2) Do NOT fabricate text; if empty, return [].
         3) Preserve top-to-bottom order.
         4) Trim whitespace; keep meaningful line breaks.
         5) Code blocks: extract only comments as 'comment'; exclude code.
         6) Images/links: include visible alt/link text as 'text' if present.
         7) Headings: use 'header'; normalize to one line.
         8) Paragraphs: use 'text'.
         9) Output MUST be valid JSON list where each element has exactly one key among 'header','text','comment'.
         10) Keep complete table and complete list in one single 'text' block.
         '''),
        ("system",
         '''Output schema example:
         [
          {{"header": "Title of the document"}},
          {{"text": "Introductory paragraph..."}},
          {{"comment": "This function adds two numbers"}},
          {{"text": "Another paragraph or bullet text..."}}
         ]'''),
        ("human",
         '''
         Input markdown:
         ---------------
         {markdown}
         ---------------
         Return ONLY the JSON list described above. No explanations—just JSON.''')
    ])
    resp = llm.invoke(extract_text_prompt.format_messages(markdown=markdown))
    try:
        return json.loads(resp.content)
    except json.JSONDecodeError:
        try:
            json_parser = JsonOutputParser()
            return json_parser.parse(resp.content)
        except Exception:
            raise ValueError("Failed to parse the JSON")
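
This parse-or-fallback pattern repeats in the next two agents, so a small helper (hypothetical, not in the original code) could factor it out; JsonOutputParser is the fallback because it tolerates JSON wrapped in stray markdown fences:

def parse_json_response(content: str):
    '''Parse an LLM response as JSON, with a more forgiving fallback.'''
    try:
        return json.loads(content)
    except json.JSONDecodeError:
        try:
            return JsonOutputParser().parse(content)
        except Exception:
            raise ValueError("Failed to parse the JSON")

Each agent could then simply end with return parse_json_response(resp.content).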

Agent 2: Identify text issues (grammar, tone, clarity)

This agent performs diagnostics on the extracted text and returns minimal, actionable fixes (not rewrites).

def identify_text_issues(text_extracts: List[Dict[str, str]]) -> List[Dict[str, str]]:
    '''
    Identify issues within the text
    '''    
    identify_text_issues_prompt = ChatPromptTemplate.from_messages([
        ("system",
         '''
        You are a grammar and text quality evaluator for a data science blog.
        Your task:
        - Review extracted text blocks for grammar, spelling, tone, clarity, and consistency.
        - Prioritize UK English for spelling and style.
        - Identify issues and suggest minimal, actionable fixes (not full rewrites).
        - Do NOT fabricate issues; only return the response for texts that have issues.
        The text is in sequential order in the form
        [
          {{"header": "Title of the document"}},
          {{"text": "Introductory paragraph..."}},
          {{"comment": "This function adds two numbers"}},
          {{"text": "Another paragraph or bullet text..."}}
        ]

        Rules:
        1. Preserve the original text as the key in the output.
        2. For each problematic text block, provide a short description of the issue and suggested fix.
        3. Provide example with minimal changes that can fix the text.  
        4. If multiple issues exist for the same block, combine them into one concise string.
        5. Output MUST be valid JSON list of objects: [{{ "<original_text>": "<issue description>" }}, ...].
        6. If no issues found, return [].
        7. Do NOT include explanations outside JSON.
        '''),
        ("system",
         '''Output schema example:
         [
          {{"Title of the document":"Spelling mistake for tite -> title }},
          {{"Introductory paragraph...": "Tone should be changed, example:..."}}
         ]'''),
        ("human",
         '''
         Extracted text:
         ---------------
         {extracted_text}
         ---------------
         Return ONLY the JSON list described above. No explanations—just JSON.''')
    ])

    resp = llm.invoke(identify_text_issues_prompt.format_messages(extracted_text=json.dumps(text_extracts)))
    try:
        return json.loads(resp.content)
    except json.JSONDecodeError:
        try:
            json_parser = JsonOutputParser()
            return json_parser.parse(resp.content)
        except Exception:
            raise ValueError("Failed to parse the JSON")

Agent 3: Extract code (Python/JS/R) with normalization

We isolate executable code for later validation. The prompt dedents, trims, hoists imports, and concatenates snippets by language.

def extract_code(markdown:str) -> Dict[Literal['python', 'javascript', 'R'], str]:
    """
    Extract code from Markdown/HTML and return a dict with keys 'python', 'javascript', 'R'.
    Values contain concatenated code snippets in original order for each language.
    If a language is absent, return an empty string for that key.
    """

    extract_code_prompt = ChatPromptTemplate.from_messages([
        ("system",
         '''You are a precise code extractor and normalizer for Markdown/HTML.
         '''),
        ("system",
        """
        Extract only executable source code for Python, JavaScript, and R from the input.
        Return a single JSON OBJECT (not a list) with exactly these keys:
        {{"python": "...", "javascript": "...", "R": "..."}}.
        Concatenate multiple snippets per language in the original top-to-bottom order.

        Sources of code:
        - Python: fenced blocks ```python ...``` or ```py ...```.
        - R: fenced blocks ```r ...```.
        - JavaScript: fenced blocks ```javascript ...``` or <script ...> ... </script> tags.
        Ignore inline code in single backticks.

        Normalization rules:
        1) Keep ONLY code, not outputs/results or rendered tables/images.
        2) For Python, remove REPL prompts: leading ">>> " and "... ". Also remove lines starting with "In [n]:" or "Out[n]:".
        3) Dedent each snippet, trim trailing spaces, ensure a trailing newline.
        4) Preserve original snippet order when concatenating per language.
        5) If a language has no code, use an empty string for its value.
        6) Optional import hoisting:
           - Python: move `import ...` / `from ... import ...` lines to the top of the Python value.
           - R: move `library(pkg)` calls to the top.
           - JavaScript: move `import ... from 'pkg'` and `require('pkg')` to the top.
        7) Do NOT add commentary or markdown fences. Output MUST be only the JSON object.

        Output examples:
        {{
          "python": "import os\nprint(os.getcwd())\n# --- next snippet ---\n...",
          "javascript": "console.log('Hello');\n",
          "R": "library(ggplot2)\nx <- 1:10\n"
        }}
         """),
        ("human",
         '''
         Input markdown:
         ---------------
         {markdown}
         ---------------
         Return ONLY the JSON OBJECT described above. No explanations or fenced blocks, just JSON WITHOUT ```.''')
    ])
    resp = llm.invoke(extract_code_prompt.format_messages(markdown=markdown))
    try:
        return json.loads(resp.content)
    except json.JSONDecodeError:
        try:
            json_parser = JsonOutputParser()
            return json_parser.parse(resp.content)
        except Exception:
            raise ValueError("Failed to parse the JSON")

Agent 4: Run code & produce minimal change suggestions if it fails

This agent tries to execute the Python code and, if it fails, asks the LLM to propose minimal fixes with diff‑style explanations.

def run_code(extracted_code: Dict[Literal['python', 'javascript', 'R'], str]) -> List[Dict[str, str]]:
    try:
        # Caution: exec runs the extracted snippet in-process; only use on trusted content.
        exec(extracted_code['python'])
        return [{'python': 'No changes to python code'}]
    except Exception as e:
        prompt = """
        I have code that worked some time back but is not working now. The code and error are shared below.
        Is this issue due to data not being present? If so, validate the code and check whether there are any deprecated functions.
        Is this issue due to a missing or deprecated variable? If so, rewrite the deprecated code. Try making minimal changes to the code.
        Return the change that you made in simple English, with sample code and the original code, so that someone else who reads this will understand how to update the code.
        Make as few changes as possible. If no fixes are found, or you find any other error, return "No changes to python code".
        Code: """ + extracted_code['python'] + """\n
        Error: """ + str(e)
        resp = llm.invoke(prompt)
        return [{'python': resp.content}]
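
Because exec() runs the extracted snippet inside the editor's own process, it is only safe on trusted content. A more defensive variant (a sketch under that assumption, not part of the original pipeline) runs the snippet in a separate interpreter with a timeout:

import subprocess
import sys
import tempfile

def run_python_isolated(code: str, timeout: int = 30):
    '''Execute a snippet with a fresh interpreter, capturing stdout/stderr.'''
    with tempfile.NamedTemporaryFile('w', suffix='.py', delete=False) as f:
        f.write(code)
        path = f.name
    result = subprocess.run([sys.executable, path],
                            capture_output=True, text=True, timeout=timeout)
    return result.returncode, result.stdout, result.stderr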

Agent 5: Apply minimal text & code fixes to the original Markdown

This agent recombines the parallel outputs (text/code fixes) and applies minimal edits. This “fan‑in” step is the Reduce in our map‑reduce pattern.

def change_markdown(text_issues: List[Dict[str, str]], code_issues: List[Dict[str, str]], original_markdown: str) -> str:
    """
    Modify the original markdown file based on the issues found.
    Applies minimal changes as suggested by expert comments.
    """

    fix_issues_prompt = ChatPromptTemplate.from_messages([
        ("system",
        """You are a grammar and text quality fixer for a data science blog.

        Your task:
        - Apply the expert's suggested fixes to the original Markdown.
        - Changes must be minimal: correct grammar, spelling, tone, and clarity ONLY where issues were flagged.
        - Do NOT rewrite entire sentences unless necessary.
        - Preserve:
          * Markdown structure (headings, lists, tables)
          * Code blocks and syntax (except for code changes and any changes to the code comments if provided)
          * Hyperlinks, images, and HTML tags
        - Do NOT add new content or explanations.
        - Output MUST be the full corrected Markdown document only, without any extra text or code fences.
        """),
        ("system",
        """Input details:
        - Issues are provided as a JSON list of objects: [{{ "<original_text>": "<issue description>" }}, ...].
        - Original Markdown follows after the delimiter.
        """),
        ("human",
        """Expert comments on text:
        {text_issues}

        Expert comments on code:
        {code_issues}
        Original Markdown:
        -------
        {markdown}
        -------

        Return ONLY the corrected Markdown. Do NOT include ```markdown fences or any commentary.""")
    ])

    # Prepare prompt
    messages = fix_issues_prompt.format_messages(
        text_issues=json.dumps(text_issues, ensure_ascii=False),
        code_issues = json.dumps(code_issues, ensure_ascii=False),
        markdown=original_markdown
    )

    # Invoke LLM
    resp = llm.invoke(messages)
    output = resp.content.strip()

    # Cleanup: Remove accidental markdown fences or explanations
    output = re.sub(r"^```(?:markdown)?\s*", "", output)
    output = re.sub(r"\s*```$", "", output)

    # Validate: Ensure output is not empty (basic length check only)
    if not output or len(output) < 10:
        raise ValueError("Model returned empty or invalid markdown.")

    return output

Agent 6: Validate HTML rendering details

We make render‑aware adjustments (e.g., math delimiters, bullet spacing) while preserving content and code fences.

def validate_html(new_markdown: str) -> str:
    """
    Check for HTML issues, spacing issues, broken links, Markdown issues, etc., to validate the HTML content in the file.
    """
    validate_html_prompt = ChatPromptTemplate.from_messages([
        ("system",
        """You are a html validator and fixer for a markdown to be converted to a html. 
        You evaluate if the markdown file an render properly as a HTML and provide corrections if needed.

        Your task:
        - Identify any potential issues that might arise due to markdown being converted to html file such as:
            * Spaces: Two spaces after every bullet point and paragraph to ensure new line
            * Formula: $$ .. $$ indicates formula in new line, and has to have space after first $$ and before second $$ for it to render as a formula.
        - Identify html issues such as closing tags, etc.
        - Identify header levels and if they make sence: <h1> followed by <h2> and so on. Headers can also be also be represented as #, ##
        - Changes must be minimal and html related (or spaces): correct ONLY where issues were flagged.
        - Preserve:
          * Text
          * Code blocks and syntax
          * Formula blocks (except for adding spaces at the start or end if needed)
        - Do NOT add new content or explanations.
        - Output MUST be the full corrected Markdown document only, without any extra text or code fences.
        """),
        ("human",
        """Original Markdown:
        -------
        {markdown}
        -------

        Return ONLY the corrected Markdown. Do NOT include ```markdown fences or any commentary.""")
    ])

    # Prepare prompt
    messages = validate_html_prompt.format_messages(
        markdown=new_markdown
    )

    # Invoke LLM
    resp = llm.invoke(messages)
    output = resp.content.strip()

    # Cleanup: Remove accidental markdown fences or explanations
    output = re.sub(r"^```(?:markdown)?\s*", "", output)
    output = re.sub(r"\s*```$", "", output)

    # Validate: Ensure output is not empty (basic length check only)
    if not output or len(output) < 10:
        raise ValueError("Model returned empty or invalid markdown.")

    return output

Utility: Diffs for review

These helpers produce colour‑coded diffs so reviewers can see minimal changes quickly.

def print_comparison(m, a, b):
    red = "\033[31m"
    green = "\033[32m"
    blue = "\033[34m"
    reset = "\033[39m"
    for tag, i1, i2, j1, j2 in m.get_opcodes():
        if tag == 'replace':
            print(f'{red}{a[i1:i2]}{reset}', end='')
            print(f'{green}{b[j1:j2]}{reset}', end='')
        if tag == 'delete':
            print(f'{red}{a[i1:i2]}{reset}', end='')
        if tag == 'insert':
            print(f'{green}{b[j1:j2]}{reset}', end='')
        if tag == 'equal':
            print(f'{a[i1:i2]}', end='')

def get_differences(original_text, corrected_text):
    m = difflib.SequenceMatcher(a=original_text, b=corrected_text)
    print_comparison(m, a=original_text, b=corrected_text)
    return m
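
A quick usage example (the sentences here are made up for illustration): deletions and replaced spans print in red, insertions and replacements in green, unchanged text in the default colour.

m = get_differences("The modle is trained on UK data.",
                    "The model is trained on UK data.")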

Router input: Decide if human feedback demands more changes

This function is called by the router condition (below) and returns the string 'True' or 'False', which the conditional edge maps to a branch.

def need_changes(human_feedback):
    """
    Conditional function that determines if there are any changes to be done based on human feedback
    """
    is_human_feedback = ChatPromptTemplate.from_messages([
        ("system",
        """
        A human user has provided feedback on a markdown file modification done by AI.
        You need to determine whether any further changes are needed, or any changes should be reverted, based on the feedback provided by the human.
        If further changes need to be made, return 'True'. Else return 'False'.
        Do not respond with anything other than 'True' or 'False'.
        """),
        ("human",
        """
        {human_feedback}
        Return ONLY 'True' or 'False'.""")
    ])

    # Prepare prompt
    messages = is_human_feedback.format_messages(
        human_feedback = human_feedback
    )

    # Invoke LLM
    resp = llm.invoke(messages)
    return resp.content.strip()
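
Calling an LLM for a yes/no decision on every loop adds latency and cost. A deterministic pre-check (a hypothetical variant, not part of the original pipeline) could short-circuit unambiguous approvals and fall back to the LLM otherwise:

def need_changes_fast(human_feedback: str) -> str:
    # Short-circuit obvious approvals before paying for an LLM call.
    approvals = {"all good", "all good.", "lgtm", "looks good", "no changes"}
    if human_feedback.strip().lower() in approvals:
        return 'False'
    return need_changes(human_feedback)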

Agent 7: Apply the human’s requested changes

This is the human‑in‑the‑loop agent: it applies exactly what the reviewer asked, nothing else.

def apply_changes(new_markdown: str, old_markdown: str, human_feedback: str) -> str:
    """
    Apply changes provided by human feedback
    """
    apply_human_feedback = ChatPromptTemplate.from_messages([
        ("system",
        """You are a html validator and fixer for a markdown to be converted to a html. 
        You have already provided some fixes on the original markdown and a feedback has been provided on it. 
        Basis of the feedback do the changes mentioned.
        Dont do anything else or change anything else.
        Return the modified markdown.
        Do NOT add new content or explanations.
        Output MUST be the full corrected Markdown document only, without any extra text or code fences.
        """),
        ("human",
        """Original Markdown:
        -------
        {old_markdown}
        -------

        Modified Markdown:
        -------
        {new_markdown}
        -------

        Feedback on the changes:
        {human_feedback}

        Return ONLY the corrected Markdown. Do NOT include ```markdown fences or any commentary.""")
    ])

    # Prepare prompt
    messages = apply_human_feedback.format_messages(
        new_markdown=new_markdown,
        old_markdown=old_markdown,
        human_feedback=human_feedback
    )

    # Invoke LLM
    resp = llm.invoke(messages)
    output = resp.content.strip()

    # Cleanup: Remove accidental markdown fences or explanations
    output = re.sub(r"^```(?:markdown)?\s*", "", output)
    output = re.sub(r"\s*```$", "", output)

    # Validate: Ensure output is not empty (basic length check only)
    if not output or len(output) < 10:
        raise ValueError("Model returned empty or invalid markdown.")

    return output

Persistence: Save the final HTML

Simple persistence to disk for the final artefact.

def save_modified_html(modified_html, file_path):
    with open(file_path, "w", encoding="utf-8") as f:
        f.write(modified_html)
    print(f"File saved successfully at: {file_path}")

The State: typed schema & aggregation rules

  1. TypedDict gives us a strongly typed state.
  2. Annotated[..., operator.add] defines merge semantics: if multiple nodes add comments, they append in the shared state.

class AppState(TypedDict):
    file_name: str # path to the markdown file
    markdown: str # initial markdown file

    # Extracted data
    extracted_text: List[Dict[Literal['header', 'text', 'comment'], str]] # Text in the blog
    extracted_code: Dict[Literal['python', 'javascript', 'R'], str] # code chunks in the blog

    # improvements
    improved_text_comments: Annotated[List[Dict[str, str]], operator.add]
    improved_code_comments: Annotated[List[Dict[str, str]], operator.add]
    improved_html_comments: Annotated[List[Dict[str, str]], operator.add]

    # Final updated markdown
    code_and_text_fixed_markdown: str
    updated_markdown: str # final updated markdown

    # Human feedback loop
    human_feedback: str
    need_changes: Literal['True', 'False']
    iteration: int # loop counter

Nodes (agents) as pure functions

Each node is a single‑responsibility agent operating on AppState. Notice the parallelism: we extract & validate text and code concurrently, then reduce them into a single Markdown. The router (need_changes_condition) decides the path after human feedback.

def node_read_file(state: AppState) ->  Dict[str, Any]:
    htmlmarkdown = read_file(state['file_name'])
    return {'markdown': htmlmarkdown}

def node_extract_text(state: AppState) ->  Dict[str, Any]:
    extracted_text = extract_text(state['markdown'])
    return {'extracted_text': extracted_text}

def node_validate_text(state: AppState) ->  Dict[str, Any]:
    improved_text_comments = identify_text_issues(state['extracted_text'])
    return {'improved_text_comments': improved_text_comments}

def node_extract_code(state: AppState) -> Dict[str, Any]:
    extracted_code = extract_code(state['markdown'])
    return {'extracted_code': extracted_code}

def node_validate_code(state: AppState) ->  Dict[str, Any]:
    improved_code_comments = run_code(state['extracted_code'])
    return {'improved_code_comments': improved_code_comments}

def node_implement_changes(state: AppState) ->  Dict[str, Any]:
    new_markdown = change_markdown(state['improved_text_comments'], state['improved_code_comments'], state['markdown'])
    return {'code_and_text_fixed_markdown': new_markdown}

def node_validate_html(state: AppState) ->  Dict[str, Any]:
    candidate = state.get('updated_markdown') or state['code_and_text_fixed_markdown']
    new_markdown = validate_html(candidate)
    return {'updated_markdown': new_markdown}

def node_human_feedback(state: AppState) -> Dict[str, Any]:
    """
    Pass-through. Expect 'human_feedback' to be set externally via compiled.update_state(...)
    """
    fb = state.get("human_feedback", {})
    return {"human_feedback": fb}

def need_changes_condition(state: AppState) -> bool:
    """
    Based on human feedback, do we need further changes to the markdown
    """
    is_need_changes = need_changes(state['human_feedback'])
    print(is_need_changes)
    return is_need_changes == 'True'

def node_apply_changes(state: AppState) ->  Dict[str, Any]:
    new_markdown = apply_changes(state['updated_markdown'], state['markdown'], state['human_feedback'])
    return {'updated_markdown': new_markdown}

def node_save_html(state: AppState):
    save_modified_html(state['updated_markdown'], state['file_name'])
    return {}  # no state update; saving is a side effect

The Graph: wiring nodes, conditional edges, memory, and breakpoints

Key points:
1. Map‑Reduce: Read File fans out to Extract Text and Extract Code, then reduces into Implement Changes.
2. Router: add_conditional_edges branches to Apply Feedback or Save HTML.
3. Memory: MemorySaver is our checkpoint system tied to thread_id.
4. Breakpoint: interrupt_before=['Human Feedback'] pauses the graph right before the human review step.

# Build Graph
graph = StateGraph(AppState)

# Nodes
graph.add_node("Read File", node_read_file)
graph.add_node("Extract Text", node_extract_text)
graph.add_node("Validate Text", node_validate_text)
graph.add_node("Extract Code", node_extract_code)
graph.add_node("Validate Code", node_validate_code)
graph.add_node("Implement Changes", node_implement_changes)
graph.add_node("Validate HTML", node_validate_html)
graph.add_node("Human Feedback", node_human_feedback)
graph.add_node("Apply Feedback", node_apply_changes)
graph.add_node("Save HTML", node_save_html)

# Entry and edges (map: fan-out; reduce: fan-in)
graph.set_entry_point("Read File")
graph.add_edge("Read File", "Extract Text")
graph.add_edge("Extract Text", "Validate Text")
graph.add_edge("Read File", "Extract Code")
graph.add_edge("Extract Code", "Validate Code")
graph.add_edge("Validate Text", "Implement Changes")
graph.add_edge("Validate Code", "Implement Changes")
graph.add_edge("Implement Changes", "Validate HTML")
graph.add_edge("Validate HTML", "Human Feedback")
# Router: conditional edges based on human feedback
graph.add_conditional_edges("Human Feedback", need_changes_condition, 
    {
        True: "Apply Feedback",
        False: "Save HTML"
    }
)
graph.add_edge("Apply Feedback", "Validate HTML")
graph.add_edge("Save HTML", END)

# Compile with memory and a breakpoint
memory = MemorySaver()
compiled = graph.compile(interrupt_before=['Human Feedback'], checkpointer=memory)
from IPython.display import Image, display
display(Image(compiled.get_graph().draw_mermaid_png()))


Run the workflow

This sequence demonstrates checkpointed runs, human‑in‑the‑loop control, conditional routing, and final persistence.

# Start with a file
initial_input = {"file_name": '/docs/Python/LLM Tokenizers.md'}

# Use a persistent thread (for checkpoints)
thread = {"configurable": {"thread_id": "2"}}

# Run until the breakpoint
output = compiled.invoke(initial_input, thread)
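
# The run stops at the breakpoint before "Human Feedback". We can confirm
# where it paused via the checkpoint: get_state returns a snapshot whose
# .next field names the pending node(s). (Standard LangGraph API; the
# output shown is what we expect here.)
snapshot = compiled.get_state(thread)
print(snapshot.next)  # ('Human Feedback',)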

# Inspect diffs
state_values = compiled.get_state(thread).values
get_differences(state_values.get("markdown"), state_values.get("updated_markdown"))
(figure: colour-coded diff of the original vs. updated markdown)

# Human asks to modify or revert code changes
compiled.update_state(thread, {"human_feedback": "Modify the code changes part and revert"})
{'configurable': {'thread_id': '2',
  'checkpoint_ns': '',
  'checkpoint_id': '1f...23'}}
# Resume from the breakpoint and route to "Apply Feedback"
output = compiled.invoke(None, thread)
True
# Human says "All good."
compiled.update_state(thread, {"human_feedback": "All good."}, as_node="Human Feedback")
False
{'configurable': {'thread_id': '2',
  'checkpoint_ns': '',
  'checkpoint_id': '1f...c4'}}
# This routes to "Save HTML"
output = compiled.invoke(None, thread)
File saved successfully at: /docs/Python/LLM Tokenizers.md
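
Since every step was checkpointed per thread_id, we can also audit or time-travel through the run afterwards. A brief sketch using LangGraph's get_state_history:

# List the saved checkpoints for this thread, newest first
for snapshot in compiled.get_state_history(thread):
    print(snapshot.config["configurable"]["checkpoint_id"], snapshot.next)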

Conclusion

This LangGraph pipeline demonstrates how to build auditable, teachable, and production‑friendly LLM systems:
1. Agents as single‑responsibility nodes
2. A stateful graph that supports parallelism and routers
3. Memory and breakpoints for human oversight
4. Map‑reduce for scalable processing
5. Typed state with merge semantics for deterministic behaviour
