Module 14 — Cloud Phase

No-Code Agents with n8n

n8n is an open-source workflow automation platform that lets you build sophisticated AI agents and automation pipelines through a visual drag-and-drop interface — no coding required. With 400+ integrations (Slack, Gmail, Google Sheets, databases, APIs) and native AI nodes for OpenAI, Anthropic, and local models, n8n bridges the gap between business users who need AI automation and the technical infrastructure that powers it. This module covers n8n architecture, building AI workflows, creating tool-using agents, RAG pipelines, and deploying n8n for production use.

n8n Architecture
AI Agent Nodes
Visual RAG Pipelines
Webhook Triggers
400+ Integrations
Self-Hosted Deploy
01

What Is n8n

Plain Language

n8n (pronounced "n-eight-n" or "nodemation") is an open-source workflow automation tool similar to Zapier or Make, but with three crucial differences: it is self-hostable (your data never leaves your infrastructure), it has a code node that lets you write arbitrary JavaScript/Python when the visual interface is not enough, and it has native AI capabilities with dedicated nodes for LLM chat, embeddings, vector stores, and autonomous agents. This combination of no-code simplicity and code-when-needed flexibility makes n8n uniquely powerful for building AI-powered workflows.

The core concept is a workflow — a visual graph of connected nodes where each node performs one action: receive a webhook, call an API, transform data, send an email, query a database, or invoke an LLM. Data flows through the connections between nodes, with each node receiving input from its predecessor and passing output to its successor. Workflows can be triggered by schedules (cron), webhooks (HTTP requests), events (new email, new Slack message), or manual execution. The visual editor makes it easy to build, test, and debug workflows without writing code.

For GenAI applications, n8n's value is in orchestration and integration. While Python code excels at building the core AI logic (RAG pipelines, agent loops, prompt engineering), n8n excels at connecting that AI logic to the rest of your business systems. Need to trigger a RAG query when a customer emails support, send the answer to Slack, log it in a Google Sheet, and create a Jira ticket if the confidence is low? In n8n, this is 10 minutes of drag-and-drop rather than hours of API integration code. The AI does the thinking; n8n does the plumbing.

n8n is particularly popular with teams that need to prototype AI workflows quickly and iterate based on business feedback. A product manager can modify the workflow — changing which Slack channel receives notifications, adjusting the LLM prompt, adding a new approval step — without waiting for an engineering sprint. This rapid iteration cycle is invaluable in the early stages of AI adoption, when requirements change frequently as the team learns what works and what does not.

Deep Dive

# Install n8n locally with Docker
docker run -it --rm \
  --name n8n \
  -p 5678:5678 \
  -v n8n_data:/home/node/.n8n \
  -e GENERIC_TIMEZONE="America/Chicago" \
  n8nio/n8n

# Access the UI at http://localhost:5678

# Or install via npm for development
npm install n8n -g
n8n start

n8n workflows are stored as JSON and can be version-controlled. Here is the structure of a minimal workflow that receives a webhook, calls an LLM, and returns the response:

// Workflow JSON structure (simplified)
{
  "name": "AI Chat Endpoint",
  "nodes": [
    {
      "type": "n8n-nodes-base.webhook",
      "parameters": {
        "httpMethod": "POST",
        "path": "chat",
        "responseMode": "responseNode"
      }
    },
    {
      "type": "@n8n/n8n-nodes-langchain.openAi",
      "parameters": {
        "model": "gpt-4o",
        "prompt": "={{ $json.body.message }}",
        "systemMessage": "You are a helpful assistant."
      }
    },
    {
      "type": "n8n-nodes-base.respondToWebhook",
      "parameters": {
        "respondWith": "json",
        "responseBody": "={{ { response: $json.text } }}"
      }
    }
  ],
  "connections": { /* node connections */ }
}
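Once this workflow is active, the webhook can be called from any HTTP client. A minimal Python sketch of building the request (assuming a default local install; in n8n, production webhooks are served under /webhook/<path>, where <path> matches the Webhook node's "path" parameter — here, "chat"):

```python
import json

N8N_BASE = "http://localhost:5678"  # assumption: default local install

def build_chat_request(message: str) -> tuple[str, dict, bytes]:
    """Build the URL, headers, and JSON body for the /chat webhook.

    The path segment after /webhook/ must match the Webhook node's
    "path" parameter ("chat" in the workflow above).
    """
    url = f"{N8N_BASE}/webhook/chat"
    headers = {"Content-Type": "application/json"}
    body = json.dumps({"message": message}).encode()
    return url, headers, body

url, headers, body = build_chat_request("What is n8n?")
# Send with urllib.request, requests, or httpx; the Respond to
# Webhook node replies with JSON shaped like {"response": "..."}.
```

The request body lands in `$json.body`, which is why the LLM node above reads the user's text with the expression `{{ $json.body.message }}`.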
Self-Hosted Advantage

Unlike Zapier or Make, n8n can run entirely on your infrastructure. API keys, customer data, and LLM interactions never leave your network. This is critical for enterprise compliance (HIPAA, SOC 2, GDPR) and for connecting to internal services that are not exposed to the internet.

02

AI & LLM Nodes

Plain Language

n8n provides a comprehensive set of AI-specific nodes that cover the entire GenAI stack. The Chat Model nodes connect to OpenAI, Anthropic, Google, Ollama, or any OpenAI-compatible endpoint (including your vLLM or self-hosted models). The Embedding nodes generate vector embeddings. The Vector Store nodes connect to Pinecone, Qdrant, Supabase, or in-memory stores. The Text Splitter nodes handle document chunking. And the Agent node orchestrates a complete ReAct agent with tools — all configurable through the visual interface.

The AI nodes use LangChain under the hood, which means they support the same patterns and configurations that the LangChain Python library offers: chat memory for multi-turn conversations, output parsers for structured responses, chain-of-thought prompting, and tool-using agents. The difference is that instead of writing Python code, you configure these components through dropdown menus and text fields in the n8n UI. When the visual interface is not flexible enough, the Code node lets you write custom JavaScript or Python that runs inline within the workflow.

A particularly powerful pattern is using n8n's sub-workflow feature to create reusable AI components. You can build a "RAG query" sub-workflow (with retrieval, reranking, and synthesis), a "document ingestion" sub-workflow (with loading, chunking, and embedding), and a "content moderation" sub-workflow (with PII detection and toxicity filtering). These sub-workflows become building blocks that you compose into larger applications, just like functions in traditional programming.

Deep Dive

The key AI nodes available in n8n and their purpose:

Node | Category | Purpose
OpenAI Chat Model | LLM | GPT-4o, GPT-4o-mini chat completions
Anthropic Chat Model | LLM | Claude 3.5 Sonnet, Haiku
Ollama Chat Model | LLM | Local models via Ollama
AI Agent | Agent | ReAct agent with configurable tools
Vector Store (Pinecone, Qdrant) | RAG | Store and query embeddings
Embeddings (OpenAI, Cohere) | RAG | Generate text embeddings
Text Splitter | RAG | Chunk documents for indexing
Document Loader | RAG | Load PDF, CSV, JSON, web pages
Output Parser | Utility | Parse structured output (JSON, lists)
Memory (Buffer, Summary) | Utility | Conversation history management
Code (JS/Python) | Custom | Run arbitrary code in workflow
03

Building AI Agents in n8n

Plain Language

The n8n AI Agent node creates a complete ReAct agent through the visual interface. You configure the LLM (which model to use), attach tools (other n8n nodes that the agent can invoke), set the system prompt, and configure memory for multi-turn conversations. The agent then operates in the same Reason → Act → Observe loop described in Module 09, but without writing any code. When a user sends a message, the agent reasons about which tools to use, calls them through the n8n workflow, observes the results, and generates a response.

Tools in n8n agents are incredibly flexible because any n8n node can be a tool. This means your agent can: query a database (PostgreSQL, MySQL, MongoDB nodes), search the web (HTTP Request node), read Google Sheets, send Slack messages, create Jira tickets, query your RAG pipeline (Vector Store Retriever node), execute calculations (Code node), or call any REST API. Each tool is described to the agent with a name and description, and the agent decides when to use each tool based on the user's request — exactly like function calling in the code-based approach, but configured visually.

A common production pattern is a customer support agent built entirely in n8n. The workflow starts with a webhook that receives chat messages from your frontend. The AI Agent node uses Claude or GPT-4o as the reasoning engine and has tools attached for: querying the customer database (PostgreSQL node), searching the knowledge base (Vector Store Retriever), checking order status (HTTP Request to your API), and escalating to a human (Slack message node). The agent handles 80% of queries autonomously and escalates the remaining 20% to human support with full context.

Deep Dive

Building an agent in n8n involves connecting five types of nodes to the central AI Agent node:

1. Chat Model — The LLM brain. Connect an OpenAI, Anthropic, or Ollama node to the agent's "model" input. Configure model selection, temperature, and max tokens.

2. Tools — Actions the agent can take. Each tool is a sub-workflow or node connected to the agent's "tools" input with a name and description. The agent sees these descriptions and decides when to invoke each tool.

3. Memory — Conversation history. Connect a Buffer Memory (stores last N messages) or Summary Memory (LLM-summarized history) node to maintain context across turns.

4. Output Parser — Optional structured output. Connect an Output Parser to force the agent's final response into a specific JSON schema.

5. Trigger — How the agent receives input. Typically a Webhook node (for API access), a Chat Trigger node (for the built-in chat widget), or a Schedule Trigger (for periodic tasks).
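Conceptually, the loop the AI Agent node runs over these connected pieces can be sketched in a few lines of Python. This is a toy illustration of the Reason → Act → Observe cycle, not n8n's actual LangChain-based implementation; `llm_decide` stands in for the chat model's tool-selection step:

```python
def run_agent(llm_decide, tools, user_message, max_iterations=10):
    """Toy ReAct loop: ask the LLM for an action, run the chosen
    tool, feed the observation back, repeat until a final answer."""
    observations = []
    for _ in range(max_iterations):
        action = llm_decide(user_message, observations)   # Reason
        if action["type"] == "final":
            return action["answer"]
        result = tools[action["tool"]](action["input"])   # Act
        observations.append((action["tool"], result))     # Observe
    return "Iteration limit reached without a final answer."
```

In n8n, each of these moving parts is configured visually rather than in code: `max_iterations` maps to the agent's iteration cap, `tools` to the nodes wired into the agent's tools input, and the observation history to the attached Memory node.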

Figure 1 — n8n AI Agent workflow: a Webhook trigger (POST /chat) feeds an AI Agent node running the Reason → Act → Observe ReAct loop (max 10 iterations), with Claude 3.5 Sonnet as the chat model and a Buffer Memory of the last 20 messages. Attached tools: a RAG Retriever (vector store + rerank), a read-only PostgreSQL query, an order-status HTTP Request, and a Slack node posting to an escalation channel. The final answer returns through a Respond to Webhook node.
04

RAG Workflows

Plain Language

Building a complete RAG pipeline in n8n requires two workflows: an ingestion workflow that processes documents and stores embeddings, and a query workflow that retrieves context and generates answers. The ingestion workflow uses: a trigger (manual, schedule, or file upload webhook), a Document Loader node (reads PDFs, CSVs, web pages), a Text Splitter node (chunks the content), an Embeddings node (generates vectors), and a Vector Store Insert node (stores in Pinecone/Qdrant/Supabase). The query workflow uses: a Webhook trigger, a Vector Store Retriever node (similarity search), and an LLM node (synthesis with retrieved context).

n8n's RAG implementation is particularly effective for automating document processing pipelines. You can set up a workflow that watches a Google Drive folder for new PDFs, automatically ingests them into the vector store, and notifies a Slack channel when the knowledge base is updated. Users then query the knowledge base through a chatbot interface, and the answers include citations pointing back to the original documents. This entire system — from document upload to answered query — is built without writing a single line of code.

Deep Dive

The ingestion workflow node chain looks like: Google Drive Trigger (new file detected) → Google Drive Download (get file) → Document Loader (extract text from PDF) → Recursive Text Splitter (chunk at 1000 chars, 200 overlap) → OpenAI Embeddings (text-embedding-3-small) → Pinecone Vector Store Insert → Slack (notify: "New document indexed: filename.pdf").
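To make the chunk-size and overlap numbers concrete, here is a simplified sliding-window chunker in Python. This is a naive character-based sketch; n8n's Recursive Text Splitter additionally prefers splitting on paragraph and sentence boundaries before falling back to raw character counts:

```python
def chunk_text(text: str, chunk_size: int = 1000, overlap: int = 200) -> list[str]:
    """Naive chunker: consecutive chunks start (chunk_size - overlap)
    characters apart, so each adjacent pair shares `overlap` characters."""
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break
    return chunks

doc = "".join(str(i % 10) for i in range(2500))
chunks = chunk_text(doc)
# 2500 chars -> chunks starting at 0, 800, and 1600; the last is shorter.
```

The overlap means a sentence that straddles a chunk boundary still appears whole in at least one chunk, which improves retrieval recall at the cost of some index redundancy.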

The query workflow: Webhook (POST /query) → Pinecone Vector Store Retriever (top 5 results) → OpenAI Chat (system prompt: "Answer based on context...") → Respond to Webhook (JSON with answer + sources).

n8n + Custom Code

When the visual RAG nodes are not flexible enough — for example, implementing reranking or hybrid search — use the Code node to call your own FastAPI endpoint that implements the advanced logic. n8n handles the workflow orchestration (triggers, integrations, error handling), and your custom code handles the AI-specific complexity. Best of both worlds.
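The kind of reranking logic you might put behind such an endpoint can be sketched as follows. This is a toy lexical-overlap reranker for illustration only; a production endpoint would score with a cross-encoder model, and the function name and payload shape here are assumptions rather than any n8n or FastAPI API:

```python
def rerank(query: str, candidates: list[dict], top_k: int = 3) -> list[dict]:
    """Re-score vector-search candidates by query-term overlap and
    keep the top_k. Candidates are assumed to be dicts with a
    "text" field, mirroring what a vector store node returns."""
    terms = set(query.lower().split())

    def score(candidate: dict) -> int:
        return len(terms & set(candidate["text"].lower().split()))

    return sorted(candidates, key=score, reverse=True)[:top_k]
```

n8n would POST the query and the raw retrieval results to this endpoint via an HTTP Request node, then pass the reranked list on to the synthesis LLM.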

05

Triggers & Integrations

Plain Language

The real power of n8n for GenAI is in its 400+ pre-built integrations. Every integration is both a trigger (start a workflow when something happens) and an action (do something as part of a workflow). This means your AI agent can be triggered by virtually any business event and can take action in virtually any business system. New email in Gmail? Trigger an AI classification workflow. New row in Google Sheets? Trigger document processing. New message in Slack? Trigger a RAG query and post the answer.

Some of the most powerful AI workflow patterns: Email triage — classify incoming emails by urgency and topic, draft responses for routine queries, and escalate complex ones to the right team. Document summarization — watch a SharePoint folder, summarize new documents, and post summaries to Teams. Data enrichment — new CRM lead triggers an AI agent that researches the company, summarizes findings, and updates the CRM record. Content creation — schedule triggers a workflow that generates social media posts based on recent blog content. Monitoring — poll an API for metrics, use an LLM to analyze trends, and alert on anomalies via PagerDuty.
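The email-triage pattern above boils down to classify-then-route. A toy Python sketch of just the routing step — the classification dict mimics what an LLM node with an Output Parser might emit, and the field names and queue labels are illustrative assumptions:

```python
def route_email(classification: dict) -> str:
    """Map an LLM classification result to a destination queue.
    In n8n this branching is a Switch or IF node downstream of
    the LLM and Output Parser nodes."""
    if classification.get("urgency") == "high":
        return "escalate-to-oncall"
    if classification.get("topic") in {"billing", "refund"}:
        return "finance-queue"
    return "auto-draft-reply"
```

Keeping the routing rules outside the prompt like this makes them auditable: the LLM only classifies, and a deterministic node decides where the email goes.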

Deep Dive

Trigger Type | n8n Nodes | AI Use Case
Webhook | Webhook, Chat Trigger | Chatbot API, form submission AI
Schedule | Schedule Trigger, Cron | Daily report generation, batch processing
Email | Gmail, Outlook Trigger | Email classification, auto-response
Messaging | Slack, Teams, Discord | Slash command bots, channel assistants
File | Google Drive, S3, FTP | Document ingestion, media processing
Database | PostgreSQL, MySQL Trigger | New record enrichment, change analysis
CRM | HubSpot, Salesforce | Lead scoring, account research
06

Production Deployment

Plain Language

For production, n8n should be deployed with persistent storage, proper authentication, and monitoring. The recommended approach is Docker Compose with PostgreSQL for workflow storage (instead of the default SQLite), Redis for queue-based execution (so webhook workflows do not block each other), and a reverse proxy (Nginx, Traefik, or Caddy) for HTTPS termination. On AWS, this maps to an ECS or EC2 deployment with RDS PostgreSQL and ElastiCache Redis.

Key production configurations: enable queue mode so workflows execute asynchronously via Redis queues (prevents webhook timeouts), set execution data pruning to avoid database bloat (keep 30 days of execution history), configure webhook authentication (basic auth, header tokens, or OAuth) to prevent unauthorized triggering, and set up environment variables for API keys rather than storing them in workflow JSON. Also important: configure n8n's built-in error workflow that triggers when any workflow fails, sending alerts to your monitoring system.

Deep Dive

# docker-compose.yml for production n8n
version: "3.8"
services:
  n8n:
    image: n8nio/n8n:latest
    restart: always
    ports:
      - "5678:5678"
    environment:
      - DB_TYPE=postgresdb
      - DB_POSTGRESDB_HOST=postgres
      - DB_POSTGRESDB_DATABASE=n8n
      - DB_POSTGRESDB_USER=n8n
      - DB_POSTGRESDB_PASSWORD=${DB_PASSWORD}
      - EXECUTIONS_MODE=queue
      - QUEUE_BULL_REDIS_HOST=redis
      - N8N_ENCRYPTION_KEY=${ENCRYPTION_KEY}
      - WEBHOOK_URL=https://n8n.yourdomain.com/
      - N8N_BASIC_AUTH_ACTIVE=true
      - N8N_BASIC_AUTH_USER=${N8N_USER}
      - N8N_BASIC_AUTH_PASSWORD=${N8N_PASSWORD}
      - EXECUTIONS_DATA_MAX_AGE=720  # 30 days in hours
      - GENERIC_TIMEZONE=America/Chicago
    volumes:
      - n8n_data:/home/node/.n8n
    depends_on:
      - postgres
      - redis

  postgres:
    image: postgres:16
    restart: always
    environment:
      POSTGRES_DB: n8n
      POSTGRES_USER: n8n
      POSTGRES_PASSWORD: ${DB_PASSWORD}
    volumes:
      - postgres_data:/var/lib/postgresql/data

  redis:
    image: redis:7-alpine
    restart: always

  # n8n worker for queue mode (scale horizontally)
  n8n-worker:
    image: n8nio/n8n:latest
    command: worker
    restart: always
    environment:
      - DB_TYPE=postgresdb
      - DB_POSTGRESDB_HOST=postgres
      - DB_POSTGRESDB_DATABASE=n8n
      - DB_POSTGRESDB_USER=n8n
      - DB_POSTGRESDB_PASSWORD=${DB_PASSWORD}
      - EXECUTIONS_MODE=queue
      - QUEUE_BULL_REDIS_HOST=redis
      - N8N_ENCRYPTION_KEY=${ENCRYPTION_KEY}
    depends_on:
      - postgres
      - redis

volumes:
  n8n_data:
  postgres_data:
Security: API Keys

Never store API keys (OpenAI, Anthropic) in workflow JSON files. Use n8n's credential system, which encrypts keys at rest using the N8N_ENCRYPTION_KEY. If you export workflows for version control, credentials are excluded by default. Set the encryption key once and never change it — changing it invalidates all stored credentials.

🎯

Interview Ready

Elevator Pitch

n8n is an open-source, self-hostable workflow automation platform with native AI nodes that lets you build LLM-powered agents, RAG pipelines, and multi-step automations through a visual drag-and-drop interface. It bridges the gap between business users who need AI workflows fast and engineering teams who need data to stay on-prem — offering 400+ integrations, webhook triggers, and a code node for when the visual builder is not enough.

Top 5 Interview Questions

# | Question | What They're Really Asking
1 | What is n8n and how does it differ from Zapier or Make? | Do you understand the self-hosted, open-source, and code-node advantages?
2 | How would you build an AI agent in n8n? | Can you describe the Agent node, tool connections, memory, and trigger setup?
3 | When should you use n8n versus writing custom Python code? | Do you know the trade-offs between no-code prototyping and production code?
4 | How do webhook triggers work in n8n AI workflows? | Can you explain the request-response lifecycle and async queue mode?
5 | How would you deploy n8n for production use? | Do you understand PostgreSQL storage, Redis queues, encryption, and scaling workers?

Model Answers

Q1 — What is n8n and how does it differ from Zapier or Make?

n8n is an open-source workflow automation platform that connects 400+ services through a visual node-based editor. Unlike Zapier and Make, n8n is fully self-hostable, meaning sensitive data — API keys, customer PII, LLM interactions — never leaves your infrastructure. It also provides a Code node for writing arbitrary JavaScript or Python inline when the visual interface is not flexible enough, and it has native AI/LLM nodes (chat models, embeddings, vector stores, agents) that Zapier and Make lack. This combination of no-code simplicity with code-when-needed power makes it uniquely suited for building AI automations in compliance-sensitive environments.

Q2 — How would you build an AI agent in n8n?

You place the AI Agent node at the center and connect five types of inputs. First, a Chat Model node (OpenAI, Anthropic, or Ollama) as the reasoning engine. Second, tool nodes — any n8n node can be a tool, such as a PostgreSQL node for database queries, an HTTP Request node for API calls, a Vector Store Retriever for RAG, or a Slack node for escalation. Each tool gets a name and description so the agent knows when to use it. Third, a Memory node (Buffer or Summary) for multi-turn conversation history. Fourth, an optional Output Parser for structured JSON responses. Finally, a trigger — typically a Webhook or Chat Trigger node. The agent then runs a ReAct loop: it reasons about which tool to call, executes it, observes the result, and repeats until it has enough information to respond.

Q3 — When should you use n8n versus writing custom Python code?

Use n8n for orchestration and integration — connecting AI logic to business systems like Slack, Gmail, Google Sheets, CRMs, and databases. It excels at rapid prototyping where requirements change frequently, because a product manager can modify the workflow without an engineering sprint. Use custom Python code for core AI logic that requires fine-grained control: custom reranking algorithms, advanced prompt chaining, evaluation pipelines, or latency-critical inference. The best approach is often hybrid — n8n handles the workflow plumbing (triggers, routing, notifications, error handling) and calls a custom FastAPI endpoint via the HTTP Request or Code node for the complex AI-specific logic.

Q4 — How do webhook triggers work in n8n AI workflows?

A Webhook node creates an HTTP endpoint (e.g., POST /chat) that starts the workflow when it receives a request. In "Response Node" mode, the workflow holds the HTTP connection open until a Respond to Webhook node sends back the result — this is how you build synchronous chat APIs. The data from the request body is accessible to all downstream nodes via expressions like {{ $json.body.message }}. In production, you enable queue mode with Redis so each webhook execution runs asynchronously on a worker process, preventing long-running LLM calls from blocking other incoming requests and avoiding webhook timeouts.

Q5 — How would you deploy n8n for production use?

Production n8n uses Docker Compose (or ECS/Kubernetes) with three services: the n8n main instance, PostgreSQL for persistent workflow and execution storage (replacing the default SQLite), and Redis for queue-based execution. You configure queue mode so workflows run on separate worker containers that scale horizontally. API keys are stored through n8n's credential system, encrypted at rest with the N8N_ENCRYPTION_KEY. A reverse proxy (Nginx, Traefik, or Caddy) handles HTTPS termination. You set execution data pruning to 30 days to prevent database bloat, configure webhook authentication to prevent unauthorized triggers, and set up an error workflow that sends alerts to Slack or PagerDuty when any workflow fails.

System Design Scenario

Prompt: "Design an automated customer support system using n8n that handles incoming emails, searches a knowledge base, drafts responses, and escalates to humans when confidence is low."

Approach: Two workflows. Ingestion workflow: Google Drive Trigger watches a folder for new support docs → Document Loader extracts text → Text Splitter chunks at 1000 chars → OpenAI Embeddings generates vectors → Pinecone Vector Store stores them → Slack notifies the team. Support workflow: Gmail Trigger fires on new emails → Code node extracts subject and body → AI Agent node with Claude as the LLM, a Vector Store Retriever tool for knowledge base search, and a PostgreSQL tool for customer lookup → Output Parser extracts the answer and a confidence score → IF node branches on confidence: high confidence sends an auto-reply via Gmail, low confidence posts to a Slack channel with the draft answer and retrieved context for human review → both paths log to Google Sheets for analytics. Deploy with queue mode so email processing does not block, and set up an error workflow for failures.

Common Mistakes

1. Using n8n for everything instead of recognizing its limits. n8n is excellent for orchestration and integration, but it is not the right tool for latency-critical inference pipelines, custom model training, or complex data transformations. The visual interface becomes unwieldy for workflows beyond 30-40 nodes. Know when to offload logic to a dedicated Python service and call it from n8n via HTTP Request.

2. Storing API keys in workflow JSON instead of using the credential system. Workflow JSON files are often exported, shared, or version-controlled. If you paste API keys directly into node parameters, they end up in plaintext in Git repositories. Always use n8n's built-in credential system, which encrypts keys at rest and excludes them from workflow exports.

3. Running production n8n without queue mode. Without queue mode, all workflows execute in the main n8n process. A single long-running LLM call (30+ seconds) blocks every other webhook and scheduled workflow. Queue mode with Redis workers is essential for production — it isolates execution, enables horizontal scaling, and prevents cascading timeouts.
