🧠 AI Agents Interview Questions
Master AI agent concepts, frameworks, orchestration, and building autonomous systems
15-Minute AI Agents Cheatsheet
Quick reference for last-minute interview preparation
🤖 What are AI Agents?
🧩 Core Components
🔁 Common Patterns
🧠 Memory Types
🛠️ Popular Frameworks
🔧 Tool Use Patterns
📚 RAG Essentials
🦜 LangChain Quick Ref
🔄 ReAct Pattern (Most Common)
⚠️ Key Challenges to Discuss
AI Agents are autonomous systems that can perceive their environment, make decisions, and take actions to achieve goals. They go beyond simple question-answering.
Key Differences:
| Traditional Chatbots | AI Agents |
|---|---|
| Follow predefined scripts | Autonomous decision-making |
| Rule-based or simple ML | LLM-powered reasoning |
| Reactive (respond to inputs) | Proactive (plan and execute) |
| Limited context | Long-term memory |
| No tool use | Use tools and APIs |
Agent Components:
- Perception: Understand environment through inputs
- Reasoning: LLM-based decision making
- Planning: Break down goals into steps
- Memory: Store and retrieve context
- Action: Use tools to interact with world
- Learning: Improve from feedback
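A minimal sketch of how these components fit together in a single control loop; every class and method name below is hypothetical rather than taken from any particular framework:

```python
# Hypothetical agent skeleton mapping the six components above onto one loop:
# perceive -> reason/plan -> act -> remember, until the goal is met.

class Agent:
    def __init__(self, llm, tools, memory):
        self.llm = llm        # reasoning engine (callable returning a decision dict)
        self.tools = tools    # mapping of tool name -> callable (action)
        self.memory = memory  # list of (tool, observation) pairs

    def run(self, goal: str, max_steps: int = 5) -> str:
        for _ in range(max_steps):
            # Perception + memory: assemble context from the goal and history
            context = f"Goal: {goal}\nHistory: {self.memory}"
            # Reasoning + planning: ask the LLM for the next step
            decision = self.llm(context)  # e.g. {"done": False, "tool": "search", "input": "..."}
            if decision["done"]:
                return decision["answer"]
            # Action: execute the chosen tool and observe the result
            observation = self.tools[decision["tool"]](decision["input"])
            # Learning: feed the observation back into memory for the next step
            self.memory.append((decision["tool"], observation))
        return "Stopped: step budget exhausted"
```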
ReAct (Reasoning + Acting) is the most common agent pattern: it interleaves chain-of-thought reasoning with tool use. The agent iteratively thinks about what to do next, takes an action, observes the result, and repeats until it can produce a final answer.
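A typical ReAct trace looks like the following. The Thought/Action/Observation format comes from the ReAct paper; the question and the `search` tool here are made up for illustration:

```python
# The runtime executes each Action, appends an Observation, and feeds the
# growing trace back to the LLM until it emits a Final Answer.
REACT_TRACE = """
Question: What is the population of the capital of France?
Thought: I first need to find the capital of France.
Action: search["capital of France"]
Observation: The capital of France is Paris.
Thought: Now I need the population of Paris.
Action: search["population of Paris"]
Observation: Paris has about 2.1 million inhabitants.
Thought: I have enough information to answer.
Final Answer: About 2.1 million people.
"""
```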
LangChain and LangGraph are popular frameworks for building AI agents with pre-built components and orchestration capabilities.
Advanced agents require sophisticated planning, memory management, and self-reflection capabilities to handle complex tasks autonomously.
RAG (Retrieval-Augmented Generation)
Master RAG concepts for building knowledge-grounded AI systems
RAG (Retrieval-Augmented Generation) is a technique that enhances LLM responses by retrieving relevant information from external knowledge sources before generating answers.
Why RAG is Important:
- Reduces hallucinations: Grounds responses in actual data
- Up-to-date knowledge: Access information beyond training cutoff
- Domain-specific: Incorporate proprietary or specialized data
- Cost-effective: Cheaper than fine-tuning for knowledge updates
- Transparency: Can cite sources for generated content
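A minimal sketch of the retrieve-then-generate flow; `embed`, `vector_store`, and `llm` are hypothetical helpers, and each piece is covered in detail in the sections below:

```python
# Minimal RAG flow: retrieve relevant chunks first, then generate an
# answer grounded in the retrieved text.

def rag_answer(question: str, vector_store, embed, llm, k: int = 3) -> str:
    # 1. Retrieve: embed the question and fetch the k nearest chunks
    chunks = vector_store.search(embed(question), top_k=k)
    # 2. Augment: place the retrieved text into the prompt as context
    context = "\n\n".join(chunk.text for chunk in chunks)
    prompt = (
        "Answer using only the context below and cite the source.\n\n"
        f"Context:\n{context}\n\nQuestion: {question}"
    )
    # 3. Generate: the LLM answers grounded in the retrieved context
    return llm(prompt)
```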
RAG vs Fine-tuning:
Use RAG when:
- Knowledge changes frequently
- Need source citations
- Limited training data
- Quick deployment needed
Use Fine-tuning when:
- Teaching new behaviors/style
- Consistent format needed
- Domain-specific terminology
- Latency is critical
Chunking is the process of splitting documents into smaller pieces for embedding and retrieval. The strategy significantly impacts RAG quality.
Common Chunking Strategies:
- Fixed-size: split every N characters/tokens; simple, but can cut across sentences
- Recursive: split on a separator hierarchy (paragraphs, then sentences, then words)
- Semantic: split where embedding similarity between adjacent sentences drops
- Document-based: split on structural units such as headings, pages, or code functions
Chunking Best Practices:
- Chunk size: 500-1500 characters for most use cases
- Overlap: 10-20% of chunk size to preserve context
- Metadata: Include source, page number, section headers
- Semantic boundaries: Split at paragraphs, not mid-sentence
- Test and iterate: Evaluate retrieval quality with your data
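A short sketch applying these practices with LangChain's RecursiveCharacterTextSplitter (assumes the langchain-text-splitters package; the file name and parameter values are illustrative):

```python
from langchain_text_splitters import RecursiveCharacterTextSplitter

document_text = open("report.txt").read()  # any long document

splitter = RecursiveCharacterTextSplitter(
    chunk_size=1000,     # characters; inside the 500-1500 sweet spot
    chunk_overlap=150,   # ~15% overlap to preserve context across chunks
    separators=["\n\n", "\n", ". ", " "],  # prefer paragraph/sentence breaks
)

# create_documents attaches metadata (source, page, etc.) to every chunk
docs = splitter.create_documents(
    [document_text], metadatas=[{"source": "report.txt"}]
)
print(len(docs), docs[0].page_content[:80])
```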
Embedding models convert text into dense vectors that capture semantic meaning. Choosing the right model affects retrieval quality significantly.
Popular Embedding Models:
| Model | Dimensions | Best For |
|---|---|---|
| OpenAI text-embedding-3-large | 3072 | High accuracy, general purpose |
| OpenAI text-embedding-3-small | 1536 | Cost-effective, good quality |
| Cohere embed-v3 | 1024 | Multilingual, search optimized |
| sentence-transformers/all-MiniLM-L6-v2 | 384 | Fast, local, free |
| BAAI/bge-large-en-v1.5 | 1024 | Open source, high quality |
Selection Criteria:
- Accuracy: Check MTEB benchmark scores
- Cost: API costs vs self-hosted
- Latency: Embedding generation time
- Language: Multilingual support needed?
- Privacy: Can data leave your infrastructure?
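A quick sketch of local embedding and cosine scoring with the MiniLM model from the table above (assumes the sentence-transformers package):

```python
from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("all-MiniLM-L6-v2")  # 384-dim vectors, runs locally

docs = [
    "RAG retrieves context before generating an answer.",
    "Fine-tuning updates the model's weights on new data.",
]
doc_vecs = model.encode(docs)  # shape: (2, 384)

query_vec = model.encode("How does retrieval-augmented generation work?")
scores = util.cos_sim(query_vec, doc_vecs)  # cosine similarity per document
print(scores)  # the first document should score higher
```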
Vector databases store and efficiently search high-dimensional embeddings using approximate nearest neighbor (ANN) algorithms.
Popular Vector Databases:
- Pinecone: Fully managed, easy to use, scalable
- Weaviate: Open source, hybrid search, GraphQL API
- Chroma: Lightweight, embedded, great for prototyping
- Milvus: Open source, highly scalable, cloud-native
- Qdrant: Rust-based, fast, filtering support
- pgvector: PostgreSQL extension, SQL integration
Retrieval Methods:
Similarity Search
- Cosine similarity (most common)
- Euclidean distance
- Dot product
Advanced Retrieval
- MMR (Maximal Marginal Relevance)
- Re-ranking with cross-encoders
- Metadata filtering
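A prototyping-scale sketch with Chroma (assumes the chromadb package; Chroma embeds documents with a built-in default model unless you supply an embedding function):

```python
import chromadb

client = chromadb.Client()  # in-memory; use PersistentClient for disk
collection = client.create_collection("docs")

collection.add(
    ids=["1", "2"],
    documents=[
        "Paris is the capital of France.",
        "Berlin is the capital of Germany.",
    ],
    metadatas=[{"source": "geo.txt"}, {"source": "geo.txt"}],  # enables filtering
)

results = collection.query(query_texts=["French capital"], n_results=1)
print(results["documents"])  # nearest chunk by vector similarity
```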
Hybrid search combines semantic (vector) search with traditional keyword (BM25/TF-IDF) search to get the best of both approaches.
Why Hybrid Search?
- Semantic search: Understands meaning, handles synonyms
- Keyword search: Exact matches, specific terms, acronyms
- Combined: Better recall and precision
Hybrid Search Best Practices:
- Weight tuning: Test different keyword/semantic ratios
- Re-ranking: Use cross-encoder for final ranking
- Query expansion: Add synonyms for better recall
- Evaluation: Measure with nDCG, MRR, recall@k
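A sketch of weighted score fusion (assumes the rank-bm25 and sentence-transformers packages; the 0.5/0.5 weights are a starting assumption to tune against your evaluation set):

```python
import numpy as np
from rank_bm25 import BM25Okapi
from sentence_transformers import SentenceTransformer, util

docs = [
    "Error code E-404 means the page was not found.",
    "The requested page could not be located on the server.",
]
query = "What does E-404 mean?"

# Keyword side: BM25 over whitespace-tokenized documents
bm25 = BM25Okapi([d.lower().split() for d in docs])
kw_scores = np.array(bm25.get_scores(query.lower().split()))

# Semantic side: cosine similarity of embeddings
model = SentenceTransformer("all-MiniLM-L6-v2")
sem_scores = util.cos_sim(model.encode(query), model.encode(docs)).numpy()[0]

# Normalize both to [0, 1], then blend with tunable weights
norm = lambda x: (x - x.min()) / (x.max() - x.min() + 1e-9)
hybrid = 0.5 * norm(kw_scores) + 0.5 * norm(sem_scores)
print(docs[int(hybrid.argmax())])  # the exact "E-404" match should win
```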
LangChain Framework
Deep dive into LangChain components, patterns, and best practices
LangChain is a framework for building applications with LLMs. It provides modular components that can be composed together.
Core Components:
- Models: LLMs and Chat Models (OpenAI, Anthropic, etc.)
- Prompts: Templates for structuring LLM inputs
- Chains: Sequences of calls (LLM, tools, etc.)
- Memory: Persist state across chain runs
- Agents: Use LLMs to decide which tools to use
- Tools: Functions that agents can call
- Retrievers: Fetch relevant documents
LCEL (LangChain Expression Language) is a declarative way to compose chains using the pipe operator. It's the modern, recommended way to build LangChain applications.
Key Benefits:
- Streaming: First-class streaming support
- Async: Native async/await support
- Parallel: Automatic parallelization
- Retries: Built-in retry logic
- Tracing: LangSmith integration
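A minimal LCEL chain (assumes langchain-openai is installed and OPENAI_API_KEY is set; the model name is illustrative):

```python
from langchain_core.output_parsers import StrOutputParser
from langchain_core.prompts import ChatPromptTemplate
from langchain_openai import ChatOpenAI

prompt = ChatPromptTemplate.from_template("Explain {topic} in one sentence.")

# The pipe operator composes prompt -> model -> parser into one runnable
chain = prompt | ChatOpenAI(model="gpt-4o-mini") | StrOutputParser()

print(chain.invoke({"topic": "RAG"}))         # single call
for token in chain.stream({"topic": "RAG"}):  # first-class streaming
    print(token, end="")
```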
Tools in LangChain are functions that agents can call. They require a name, description, and implementation.
Tool Design Best Practices:
- Clear descriptions: Help LLM understand when to use
- Error handling: Return helpful error messages
- Input validation: Use Pydantic for type safety
- Async support: Implement _arun for async agents
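A sketch of a tool built with the @tool decorator; the exchange-rate example and its toy data are made up for illustration:

```python
from langchain_core.tools import tool

RATES = {("USD", "EUR"): 0.92}  # toy data; a real tool would call an API

@tool
def get_exchange_rate(base: str, target: str) -> str:
    """Return the exchange rate between two currency codes, e.g. USD and EUR."""
    # The docstring above becomes the description the LLM uses to pick this tool
    rate = RATES.get((base.upper(), target.upper()))
    if rate is None:
        # Best practice: return a helpful message instead of raising, so the
        # agent can observe the failure and recover
        return f"No rate found for {base}->{target}; check the currency codes."
    return f"1 {base.upper()} = {rate} {target.upper()}"
```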
LangSmith is a platform for debugging, testing, evaluating, and monitoring LangChain applications. It's essential for production deployments.
Key Features:
- Tracing: See every step in your chain
- Debugging: Inspect inputs/outputs at each step
- Evaluation: Test chains against datasets
- Monitoring: Track latency, costs, errors
- Datasets: Create test sets for evaluation
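Enabling tracing is configuration rather than code changes: with these environment variables set, LangChain runs are sent to LangSmith automatically (the key and project name below are placeholders):

```python
import os

os.environ["LANGCHAIN_TRACING_V2"] = "true"          # turn on tracing
os.environ["LANGCHAIN_API_KEY"] = "<langsmith-key>"  # from the LangSmith UI
os.environ["LANGCHAIN_PROJECT"] = "my-rag-app"       # groups runs per project
```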
Production Monitoring:
Track
- Latency per step
- Token usage & costs
- Error rates
- User feedback
Debug
- Failed runs
- Hallucinations
- Prompt issues
- Tool errors
Understanding when to use each approach is crucial for building effective LangChain applications.
| Approach | Use When | Avoid When |
|---|---|---|
| LCEL Chains | Fixed, predictable workflows | Dynamic tool selection needed |
| Agents | Dynamic reasoning, tool selection | Simple, predictable tasks |
| LangGraph | Complex state, cycles, multi-agent | Simple linear workflows |
Quick Decision Guide:
- Simple Q&A or RAG: LCEL Chain
- Tool use with reasoning: Agent
- Multi-step with retries: LangGraph
- Multi-agent systems: LangGraph
- Production reliability: LangGraph (better control)
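A minimal LangGraph sketch: two stub nodes sharing typed state, wired with explicit edges (assumes the langgraph package; the node bodies are placeholders for real retrieval and LLM calls):

```python
from typing import TypedDict
from langgraph.graph import StateGraph, START, END

class State(TypedDict):
    question: str
    answer: str

def retrieve(state: State) -> dict:
    # Placeholder retriever: nodes return partial state updates
    return {"answer": f"context for: {state['question']}"}

def generate(state: State) -> dict:
    # Placeholder LLM call that consumes the retrieved context
    return {"answer": f"Answer based on [{state['answer']}]"}

graph = StateGraph(State)
graph.add_node("retrieve", retrieve)
graph.add_node("generate", generate)
graph.add_edge(START, "retrieve")
graph.add_edge("retrieve", "generate")
graph.add_edge("generate", END)

app = graph.compile()  # cycles and conditional edges are added the same way
print(app.invoke({"question": "What is LCEL?", "answer": ""}))
```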
Interview Tips for AI Agents
- ✅ Understand agent architecture: perception, reasoning, planning, memory, action
- ✅ Know popular agent patterns (ReAct, Plan-and-Execute, AutoGPT)
- ✅ Be familiar with frameworks like LangChain, LangGraph, AutoGen
- ✅ Understand different memory types (short-term, long-term, semantic)
- ✅ Know how to implement tool use and function calling
- ✅ Be ready to discuss multi-agent systems and coordination
- ✅ Understand challenges: hallucination, reliability, error handling