🧠 AI Agents Interview Questions

Master AI agent concepts, frameworks, orchestration, and building autonomous systems

⚔️

15-Minute AI Agents Cheatsheet

Quick reference for last-minute interview preparation

🤖 What are AI Agents?

Autonomous systems that perceive, reason, and act
LLM-powered decision making
Use tools to interact with environment
Can plan and break down complex tasks
Maintain memory across interactions

🧩 Core Components

Perception: Process inputs from environment
Reasoning: LLM thinks about what to do
Planning: Break goals into steps
Memory: Short-term and long-term storage
Action: Execute tools and APIs

🔄 Common Patterns

ReAct: Thought → Action → Observation loop
Plan-and-Execute: Plan first, then execute
Reflexion: Self-critique and improve
Tree of Thoughts: Explore multiple paths
Multi-agent: Specialized agents collaborate

🧠 Memory Types

Short-term: Current conversation context
Long-term: Persistent facts and experiences
Semantic: Vector embeddings for retrieval
Episodic: Past task executions
Procedural: Learned task patterns

šŸ› ļø Popular Frameworks

LangChain: Chains, agents, tools ecosystem
LangGraph: Graph-based agent workflows
AutoGen: Microsoft multi-agent framework
CrewAI: Role-based agent collaboration
OpenAI Assistants: Managed agent API

🔧 Tool Use Patterns

Function calling: Structured JSON output
Tool descriptions: Help LLM choose right tool
Error handling: Retry and fallback logic
Tool chaining: Output of one → input of next
Parallel tools: Execute independent tools together

📚 RAG Essentials

RAG: Retrieval-Augmented Generation
Chunking: Split docs 500-1500 chars
Embeddings: OpenAI, Cohere, HuggingFace
Vector DBs: Pinecone, Chroma, Weaviate
Hybrid search: Semantic + keyword (BM25)

🦜 LangChain Quick Ref

LCEL: pipe operator for chains
Chains: Fixed, predictable workflows
Agents: Dynamic tool selection
LangGraph: Cycles, state, multi-agent
LangSmith: Debug, trace, evaluate

🔄 ReAct Pattern (Most Common)

1. Thought: I need to search for information about X
2. Action: search[query about X]
3. Observation: Results: found Y and Z...
4. Thought: Now I have enough info to answer
5. Answer: Based on my research...

āš ļø Key Challenges to Discuss

• Hallucination: Agents may invent false info
• Infinite loops: Need max iteration limits
• Error recovery: Graceful handling of failures
• Cost control: LLM calls can be expensive
• Security: Validate tool inputs/outputs
• Observability: Log agent reasoning steps

AI Agents are autonomous systems that can perceive their environment, make decisions, and take actions to achieve goals. They go beyond simple question-answering.

Key Differences:

Traditional Chatbots

  • Follow predefined scripts
  • Rule-based or simple ML
  • Reactive (respond to inputs)
  • Limited context
  • No tool use

AI Agents

  • Autonomous decision-making
  • LLM-powered reasoning
  • Proactive (plan and execute)
  • Long-term memory
  • Use tools and APIs

Agent Components:

  • Perception: Understand environment through inputs
  • Reasoning: LLM-based decision making
  • Planning: Break down goals into steps
  • Memory: Store and retrieve context
  • Action: Use tools to interact with world
  • Learning: Improve from feedback
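The components above can be wired into a minimal control loop. This is a plain-Python sketch, not a framework API; the `decide` function is a stub standing in for the LLM reasoning step:

```python
# Minimal agent loop sketch: perceive -> reason -> act, with memory.
# `decide` is a stub standing in for a real LLM reasoning call.

def decide(observation: str, memory: list[str]) -> str:
    """Stub 'reasoning' step: pick an action from the observation."""
    if "question" in observation:
        return "search"
    return "answer"

def run_agent(observation: str, max_steps: int = 3) -> list[str]:
    memory: list[str] = []                     # short-term memory of this run
    actions = []
    for _ in range(max_steps):
        action = decide(observation, memory)   # reasoning
        actions.append(action)                 # action
        memory.append(f"did {action}")         # memory update
        if action == "answer":
            break
        observation = "results ready"          # new perception
    return actions

print(run_agent("user question"))  # ['search', 'answer']
```

The `max_steps` bound is the iteration limit mentioned under Key Challenges: it keeps a confused agent from looping forever.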

ReAct is a powerful agent pattern that combines reasoning (thinking) with acting (using tools). The agent iteratively thinks about what to do, takes action, and observes results.

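A self-contained version of the Thought/Action/Observation loop might look like the sketch below, where `fake_model` and `search` are canned stand-ins for a real LLM and a real search tool:

```python
# ReAct loop sketch: alternate Action -> Observation until the
# (stubbed) model emits a final answer. Model and tool are fakes.

def fake_model(history: list[str]) -> str:
    """Stand-in for an LLM: searches first, then answers."""
    if not any(line.startswith("Observation:") for line in history):
        return "Action: search[capital of France]"
    return "Answer: Paris"

def search(query: str) -> str:
    return "Paris is the capital of France."   # canned tool result

def react(question: str, max_iters: int = 5) -> str:
    history = [f"Question: {question}"]
    for _ in range(max_iters):                 # guard against infinite loops
        step = fake_model(history)
        if step.startswith("Answer:"):
            return step.removeprefix("Answer:").strip()
        query = step[step.index("[") + 1 : step.index("]")]
        history.append(step)
        history.append(f"Observation: {search(query)}")
    raise RuntimeError("max iterations reached")

print(react("What is the capital of France?"))  # Paris
```

The growing `history` list is what a real implementation feeds back to the model on every turn, so the model can condition each new Thought on prior Observations.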

LangChain and LangGraph are popular frameworks for building AI agents with pre-built components and orchestration capabilities.

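The graph-based orchestration idea can be illustrated without the framework. The sketch below is plain Python, not the LangGraph API: nodes update shared state, and a `step` field plays the role of the edges that route to the next node:

```python
# Graph-style orchestration sketch (plain Python, not the LangGraph API):
# nodes transform shared state; a "step" field selects the next node.

state = {"input": "2+3", "result": None, "step": "parse"}

def parse(s):
    a, b = s["input"].split("+")
    s.update(a=int(a), b=int(b), step="add")   # edge: parse -> add

def add(s):
    s.update(result=s["a"] + s["b"], step="done")  # edge: add -> END

nodes = {"parse": parse, "add": add}

while state["step"] != "done":                 # run until a terminal node
    nodes[state["step"]](state)

print(state["result"])  # 5
```

Because routing is data-dependent, a node can send the state back to an earlier node, which is the cycle support that distinguishes graph workflows from linear chains.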

Advanced agents require sophisticated planning, memory management, and self-reflection capabilities to handle complex tasks autonomously.

📚

RAG (Retrieval-Augmented Generation)

Master RAG concepts for building knowledge-grounded AI systems

RAG (Retrieval-Augmented Generation) is a technique that enhances LLM responses by retrieving relevant information from external knowledge sources before generating answers.

Why RAG is Important:

  • Reduces hallucinations: Grounds responses in actual data
  • Up-to-date knowledge: Access information beyond training cutoff
  • Domain-specific: Incorporate proprietary or specialized data
  • Cost-effective: Cheaper than fine-tuning for knowledge updates
  • Transparency: Can cite sources for generated content
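The retrieve-then-generate flow can be shown end to end with a toy retriever. This sketch ranks chunks by word overlap purely for illustration; a real system would use embeddings and a vector store as described below:

```python
# Toy RAG sketch: retrieve the most relevant chunk by word overlap,
# then build a grounded prompt for the LLM.

DOCS = [
    "The refund window is 30 days from purchase.",
    "Support is available on weekdays from 9 to 5.",
]

def retrieve(query: str) -> str:
    q = set(query.lower().split())
    # Score each chunk by how many query words it shares.
    return max(DOCS, key=lambda d: len(q & set(d.lower().split())))

def build_prompt(query: str) -> str:
    context = retrieve(query)
    return f"Context: {context}\nQuestion: {query}\nAnswer using only the context."

print(build_prompt("What is the refund window?"))
```

Instructing the model to "answer using only the context" is one of the grounding tricks that reduces hallucination.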

RAG vs Fine-tuning:

Use RAG when:

  • Knowledge changes frequently
  • Need source citations
  • Limited training data
  • Quick deployment needed

Use Fine-tuning when:

  • Teaching new behaviors/style
  • Consistent format needed
  • Domain-specific terminology
  • Latency is critical

Chunking is the process of splitting documents into smaller pieces for embedding and retrieval. The strategy significantly impacts RAG quality.

Common Chunking Strategies:

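The simplest strategy is fixed-size chunking with overlap. The sketch below uses small demo values (100 chars, 20 overlap); the 500-1500 character / 10-20% overlap guidance from the best practices applies to real documents:

```python
# Fixed-size chunking with overlap: each chunk starts `overlap`
# characters before the previous one ended, preserving context.

def chunk_text(text: str, size: int = 100, overlap: int = 20) -> list[str]:
    chunks = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + size])
        start += size - overlap          # advance by size minus overlap
    return chunks

doc = "x" * 250
parts = chunk_text(doc)
print([len(p) for p in parts])  # [100, 100, 90, 10]
```

Semantic chunking (splitting at paragraph or sentence boundaries) usually retrieves better, but fixed-size with overlap is the baseline to know for interviews.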

Chunking Best Practices:

  • Chunk size: 500-1500 characters for most use cases
  • Overlap: 10-20% of chunk size to preserve context
  • Metadata: Include source, page number, section headers
  • Semantic boundaries: Split at paragraphs, not mid-sentence
  • Test and iterate: Evaluate retrieval quality with your data

Embedding models convert text into dense vectors that capture semantic meaning. Choosing the right model affects retrieval quality significantly.

Popular Embedding Models:

  • OpenAI text-embedding-3-large (3072 dims): High accuracy, general purpose
  • OpenAI text-embedding-3-small (1536 dims): Cost-effective, good quality
  • Cohere embed-v3 (1024 dims): Multilingual, search optimized
  • sentence-transformers/all-MiniLM-L6-v2 (384 dims): Fast, local, free
  • BAAI/bge-large-en-v1.5 (1024 dims): Open source, high quality
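Retrieval then reduces to comparing vectors, most often with cosine similarity. The sketch below uses hand-made 3-dimensional "embeddings" purely for illustration; real models produce hundreds to thousands of dimensions:

```python
# Cosine similarity over toy hand-made "embeddings": semantically
# related texts should score higher than unrelated ones.
import math

def cosine(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

cat    = [0.9, 0.1, 0.0]
kitten = [0.8, 0.2, 0.1]
car    = [0.0, 0.1, 0.9]

assert cosine(cat, kitten) > cosine(cat, car)   # "cat" is closer to "kitten"
print(round(cosine(cat, kitten), 3))
```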

Selection Criteria:

  • Accuracy: Check MTEB benchmark scores
  • Cost: API costs vs self-hosted
  • Latency: Embedding generation time
  • Language: Multilingual support needed?
  • Privacy: Can data leave your infrastructure?

Vector databases store and efficiently search high-dimensional embeddings using approximate nearest neighbor (ANN) algorithms.

Popular Vector Databases:

  • Pinecone: Fully managed, easy to use, scalable
  • Weaviate: Open source, hybrid search, GraphQL API
  • Chroma: Lightweight, embedded, great for prototyping
  • Milvus: Open source, highly scalable, cloud-native
  • Qdrant: Rust-based, fast, filtering support
  • pgvector: PostgreSQL extension, SQL integration
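The core interface every vector database exposes is add-and-search. The sketch below does exact brute-force top-k search; production databases replace the linear scan with ANN indexes (e.g. HNSW) to scale:

```python
# Minimal in-memory vector store sketch: exact (brute-force) top-k
# search by Euclidean distance. Real vector DBs use ANN indexes.
import math

class TinyVectorStore:
    def __init__(self):
        self.items: list[tuple[str, list[float]]] = []

    def add(self, text: str, vec: list[float]) -> None:
        self.items.append((text, vec))

    def search(self, query: list[float], k: int = 2) -> list[str]:
        ranked = sorted(self.items, key=lambda it: math.dist(query, it[1]))
        return [text for text, _ in ranked[:k]]

store = TinyVectorStore()
store.add("dog", [1.0, 0.0])
store.add("cat", [0.9, 0.1])
store.add("car", [0.0, 1.0])
print(store.search([1.0, 0.05], k=2))  # nearest first: ['dog', 'cat']
```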

Retrieval Methods:

Similarity Search

  • Cosine similarity (most common)
  • Euclidean distance
  • Dot product

Advanced Retrieval

  • MMR (Maximal Marginal Relevance)
  • Re-ranking with cross-encoders
  • Metadata filtering

Hybrid search combines semantic (vector) search with traditional keyword (BM25/TF-IDF) search to get the best of both approaches.

Why Hybrid Search?

  • Semantic search: Understands meaning, handles synonyms
  • Keyword search: Exact matches, specific terms, acronyms
  • Combined: Better recall and precision
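One common way to combine the two signals is a weighted fusion of per-document scores. The scores and `alpha` below are toy values; tuning that weight is the "weight tuning" step in the best practices that follow:

```python
# Hybrid search sketch: weighted fusion of a semantic score and a
# keyword (BM25-style) score per document. Scores here are made up.

def hybrid_rank(semantic: dict[str, float], keyword: dict[str, float],
                alpha: float = 0.5) -> list[str]:
    docs = set(semantic) | set(keyword)
    combined = {
        d: alpha * semantic.get(d, 0.0) + (1 - alpha) * keyword.get(d, 0.0)
        for d in docs
    }
    return sorted(combined, key=combined.get, reverse=True)

semantic = {"doc_a": 0.9, "doc_b": 0.4, "doc_c": 0.2}
keyword  = {"doc_b": 1.0, "doc_c": 0.3}          # exact-match hits
print(hybrid_rank(semantic, keyword))  # ['doc_b', 'doc_a', 'doc_c']
```

Reciprocal rank fusion (combining ranks instead of raw scores) is a popular alternative when the two scores are on incomparable scales.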

Hybrid Search Best Practices:

  • Weight tuning: Test different keyword/semantic ratios
  • Re-ranking: Use cross-encoder for final ranking
  • Query expansion: Add synonyms for better recall
  • Evaluation: Measure with nDCG, MRR, recall@k
🦜

LangChain Framework

Deep dive into LangChain components, patterns, and best practices

LangChain is a framework for building applications with LLMs. It provides modular components that can be composed together.

Core Components:

  • Models: LLMs and Chat Models (OpenAI, Anthropic, etc.)
  • Prompts: Templates for structuring LLM inputs
  • Chains: Sequences of calls (LLM, tools, etc.)
  • Memory: Persist state across chain runs
  • Agents: Use LLMs to decide which tools to use
  • Tools: Functions that agents can call
  • Retrievers: Fetch relevant documents
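The prompt-to-model-to-parser flow that these components implement can be sketched with plain-Python stand-ins (these are not the actual LangChain classes; the model is a stub):

```python
# Sketch of a prompt -> model -> parser chain using plain functions
# as stand-ins for the Prompts, Models, and Output Parser components.

def prompt(inputs: dict) -> str:                  # Prompts: fill a template
    return "Translate to French: {text}".format(**inputs)

def model(prompt_text: str) -> str:               # Models: stubbed LLM call
    return "Bonjour" if "hello" in prompt_text else "?"

def parser(raw: str) -> str:                      # Parser: normalize output
    return raw.strip().lower()

def chain(inputs: dict) -> str:                   # Chains: compose the calls
    return parser(model(prompt(inputs)))

print(chain({"text": "hello"}))  # bonjour
```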

LCEL (LangChain Expression Language) is a declarative way to compose chains using the pipe operator. It's the modern, recommended way to build LangChain applications.

Key Benefits:

  • Streaming: First-class streaming support
  • Async: Native async/await support
  • Parallel: Automatic parallelization
  • Retries: Built-in retry logic
  • Tracing: LangSmith integration
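To build intuition for how a pipe operator can compose steps, here is a hand-rolled illustration; this is not the real LCEL Runnable implementation, just the underlying idea that `__or__` returns a new composed callable:

```python
# How a pipe operator for chains can work: `a | b` builds a new step
# that runs a, then feeds its output to b (left-to-right composition).

class Step:
    def __init__(self, fn):
        self.fn = fn

    def __or__(self, other: "Step") -> "Step":
        return Step(lambda x: other.fn(self.fn(x)))

    def invoke(self, x):
        return self.fn(x)

upper = Step(str.upper)
exclaim = Step(lambda s: s + "!")
chain = upper | exclaim                  # reads left to right
print(chain.invoke("hi"))  # HI!
```

In real LCEL the composed object also carries streaming, batching, and async variants of `invoke`, which is why composition happens at the object level rather than as bare functions.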

Tools in LangChain are functions that agents can call. They require a name, description, and implementation.

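The essential shape of a tool (name, description, implementation) can be shown with a plain dataclass; this mirrors what LangChain tools expose but is not the real `@tool` decorator, and `get_weather` is a stub:

```python
# Tool definition sketch: a name plus a description the LLM reads to
# decide when to call it, plus the typed implementation.
from dataclasses import dataclass
from typing import Callable

@dataclass
class Tool:
    name: str
    description: str              # the LLM chooses tools by this text
    func: Callable[[str], str]

def get_weather(city: str) -> str:
    return f"Sunny in {city}"     # stub; a real tool would call an API

weather = Tool(
    name="get_weather",
    description="Get current weather for a city. Input: city name.",
    func=get_weather,
)
print(weather.func("Paris"))  # Sunny in Paris
```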

Tool Design Best Practices:

  • Clear descriptions: Help LLM understand when to use
  • Error handling: Return helpful error messages
  • Input validation: Use Pydantic for type safety
  • Async support: Implement _arun for async agents

LangSmith is a platform for debugging, testing, evaluating, and monitoring LangChain applications. It's essential for production deployments.

Key Features:

  • Tracing: See every step in your chain
  • Debugging: Inspect inputs/outputs at each step
  • Evaluation: Test chains against datasets
  • Monitoring: Track latency, costs, errors
  • Datasets: Create test sets for evaluation
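What tracing captures can be shown in miniature: inputs, outputs, and latency per step. LangSmith records this automatically via instrumentation; the decorator below is a hand-rolled illustration, not its API:

```python
# Hand-rolled tracing sketch: record each step's name, inputs,
# output, and latency into a trace log.
import functools
import time

TRACE: list[dict] = []

def traced(fn):
    @functools.wraps(fn)
    def wrapper(*args, **kwargs):
        start = time.perf_counter()
        result = fn(*args, **kwargs)
        TRACE.append({
            "step": fn.__name__,
            "inputs": args,
            "output": result,
            "ms": (time.perf_counter() - start) * 1000,
        })
        return result
    return wrapper

@traced
def retrieve(query):
    return ["doc1"]               # stub retrieval step

@traced
def answer(docs):
    return "grounded answer"      # stub generation step

answer(retrieve("query"))
print([t["step"] for t in TRACE])  # ['retrieve', 'answer']
```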

Production Monitoring:

Track

  • Latency per step
  • Token usage & costs
  • Error rates
  • User feedback

Debug

  • Failed runs
  • Hallucinations
  • Prompt issues
  • Tool errors

Understanding when to use each approach is crucial for building effective LangChain applications.

  • LCEL Chains: Use for fixed, predictable workflows; avoid when dynamic tool selection is needed
  • Agents: Use for dynamic reasoning and tool selection; avoid for simple, predictable tasks
  • LangGraph: Use for complex state, cycles, and multi-agent systems; avoid for simple linear workflows

Quick Decision Guide:

  • Simple Q&A or RAG: LCEL Chain
  • Tool use with reasoning: Agent
  • Multi-step with retries: LangGraph
  • Multi-agent systems: LangGraph
  • Production reliability: LangGraph (better control)

Interview Tips for AI Agents