66 docs tagged with "Intermediate"

Requires basic AI/ML understanding

Agent memory

How AI agents store, retrieve, and reason over information across turns and sessions.

Agent prompt engineering

Best practices for writing system prompts that produce reliable, well-scoped AI agent behavior.

Anthropic Tool Use

Claude's native function/tool calling mechanism using JSON schema definitions, tool_use and tool_result message types, with support for multi-turn tool use, parallel calls, and streaming.

AutoGen

Microsoft's multi-agent conversation framework enabling LLM-powered agents to collaborate via structured message exchanges, with built-in code execution and human-in-the-loop support.

Autonomous agents

Agents that operate with minimal human intervention.

Benchmarks

Standard benchmarks for AI: GLUE, SuperGLUE, MMLU, and more.

BERT

Bidirectional Encoder Representations from Transformers.

Case study — BART

Encoder-decoder predecessor to Gemini; denoising pretraining for summarization and generation.

Case study — DeepSeek

DeepSeek AI's open-weight LLMs with strong reasoning and code; MoE and efficient scaling.

Case study — Qwen

Alibaba's LLM family; multilingual, coding, and long-context support.

Chain-of-thought (CoT)

Step-by-step reasoning to improve LLM outputs.

CI/CD for ML

Continuous integration and delivery adapted for machine learning — testing data, models, and code together.

Claude Code skills

Reusable, invocable prompt templates that extend Claude Code's capabilities — what skills are, how to write them, where to store them, and how to invoke them with /skill-name.

Cohere

Enterprise-focused AI platform specializing in embeddings, reranking, and RAG for search and information retrieval at scale.

Context management

How Claude Code manages the context window across long sessions — automatic compression, conversation history strategies, and practical techniques for keeping sessions effective at scale.

Conversational memory

Memory patterns for chat agents — buffer, summary, vector, and entity memory.

Convolutional neural networks (CNN)

CNNs for spatial and image data.

CrewAI

Role-based multi-agent framework where agents have explicit roles, goals, and backstories, collaborating through structured tasks and crew processes.

Data pipelines

An overview of data pipelines in the ML context — batch vs streaming, ETL vs ELT, data quality, and schema validation.

Data Version Control (DVC)

Git for data and models — versioning datasets, pipelines, and experiments alongside source code.

Deep reinforcement learning (DRL)

RL with deep neural networks for function approximation.

DeepSeek

Chinese AI lab offering open-weights models with state-of-the-art reasoning and coding capabilities at significantly lower cost than proprietary alternatives.

Evaluation metrics

Measuring model performance across tasks.

Experiment tracking

How to systematically log, compare, and reproduce ML experiments using tracking tools.

Fine-tuning

Adapting LLMs to specific tasks and domains.

GPT

Generative Pre-trained Transformer and decoder-only models.

Hugging Face

Platform and libraries for models, datasets, and pipelines.

Infrastructure

Hardware and systems for training and serving AI: GPUs, TPUs, clusters.

LangChain

Framework for LLM applications and agents.

Stateful agent graphs built on LangChain, where nodes are Python functions, edges define routing, and a shared TypedDict state enables cycles, conditional branching, persistence, and human-in-the-loop checkpoints.

LlamaIndex

Data framework for LLM applications and RAG.

Local inference

Running AI models on-device or on-premises instead of cloud APIs.

Meta Llama

Meta's open-weights Llama model family — local deployment, third-party API hosting, fine-tuning, and the open vs. closed model debate.

Mistral AI

Mistral AI's dual open-weights and commercial API platform — efficient models, multilingual strengths, and La Plateforme for enterprise use.

ML monitoring

Comprehensive guide to monitoring machine learning models in production, covering concept drift, data drift, model decay, metrics, alerting strategies, and tooling.

MLflow

Open-source platform for the complete ML lifecycle, covering experiment tracking, projects, models, and the registry.

Model compression

Reducing model size and compute for deployment.

Model Context Protocol (MCP)

An open standard for connecting AI models to external tools, data sources, and services — enabling portable, interoperable tool use across any AI application.

Model registry

Centralized store for versioning, staging, and governing ML model artifacts across their full lifecycle.

Model serving

Strategies and frameworks for deploying ML models as scalable inference services — batch, real-time, and streaming.

Multi-agent systems

Multiple agents collaborating or competing.

Multimodal AI

Models that process and generate across text, image, audio, and video modalities.

ONNX Runtime

Cross-platform, high-performance inference engine for ONNX models with support for CPU, GPU, and NPU execution providers.

Planner-Executor architecture

Architecture where one LLM creates a step-by-step plan and another executes each step independently.

Prompt engineering

Designing prompts to steer LLM behavior and improve outputs.

Prompt ensembling

A technique that runs multiple structurally different prompt variations against the same LLM and aggregates their outputs, trading inference cost for higher accuracy and lower variance than any single prompt can achieve.

PyTorch Mobile

Deploy PyTorch models on mobile and edge devices using TorchScript and the next-generation ExecuTorch runtime.

RAG architecture

Components and design choices in RAG systems.

RAG examples

Example RAG pipelines and code snippets.

ReAct (Reasoning + Acting)

Interleaving reasoning and action in agents.

Reasoning patterns

How LLMs and agents structure reasoning and action.

Recurrent neural networks (RNN)

RNNs and sequential data.

Retrieval-augmented generation (RAG)

Combining retrieval with LLM generation for accurate, grounded answers.

Retrieval-decision-design (RDD)

Spec-driven reasoning pattern combining retrieval and decision design.

Self-consistency

A prompting technique that generates multiple independent chain-of-thought reasoning paths and selects the final answer by majority vote, significantly improving reliability over single-pass chain-of-thought.

Semantic search

Search by meaning using embeddings and similarity.

Spec-driven development

Building AI systems from explicit specifications.

Step-back prompting

A two-step prompting technique that first asks the model a higher-level abstract question, then uses that abstraction as context to answer the original specific question — improving reasoning accuracy on complex tasks.

Streaming (LLMs)

Token-by-token output for lower perceived latency and better UX.

Structured outputs

Techniques for getting LLMs to produce machine-readable structured data — JSON mode, function calling schemas, and Pydantic-based extraction — enabling reliable integration into APIs and automated pipelines.

Subagents

Hierarchical agents and delegation.

TensorFlow Lite

Lightweight runtime for on-device ML inference across Android, iOS, embedded systems, and microcontrollers.

Thinking modes and effort

Extended thinking in Claude Code — what it is, how effort levels affect reasoning depth versus speed, and how to configure thinking behavior for different task types.

Tree of thoughts (ToT)

Exploring multiple reasoning branches.

Vector databases

Storing and searching embeddings for RAG.

Weights & Biases (W&B)

Cloud-native MLOps platform for experiment tracking, hyperparameter sweeps, artifact management, and collaborative reporting.