Pular para o conteúdo principal

Modelos de linguagem grandes (LLMs)

Definição

Grandes modelos de linguagem são modelos baseados em transformers treinados em dados textuais massivos (e às vezes multimodais). They exhibit emergent abilities: few-shot learning, raciocínio, and tool use when scaled and aligned (por ex. via RLHF).

Um modelo mental útil: pré-treinamento aprende previsão do próximo token em enormes corpus e dá ao modelo amplo conhecimento and language ability. Instruction tuning (and similar) trains the model to follow user instructions and formats. Alignment (por ex. RLHF, DPO) shapes behavior to be helpful, honest, and safe. At inference time you can use the model zero-shot, few-shot, or augment it with recuperação (RAG) or tools (agents).

Como funciona

Pré-treinamento aprende previsão do próximo token em grandes corpus e produz um modelo base. Fine-tuning opcional (por ex. fine-tuning) adapts it to tasks or instruction formats; alignment (por ex. RLHF, DPO) optimizes human preference and safety. The deployed model is then used at inference time. You can call it zero-shot (no examples), few-shot (with prompt engineering), or augment it with RAG (recuperação as context) or agents (tools and loops). The diagram summarizes the training pipeline and the two main inference augmentations.

Casos de uso

LLMs are used wherever you need flexible language understanding or generation, from chat to code to analysis.

  • Chat, summarization, and translation
  • Code assistance and generation
  • Question answering and research assistance (often with RAG or tools)

Vantagens e desvantagens

ProsCons
Flexible, one model for many tasksCost and latency
Strong few-shot performanceHallucination, bias
Enables agents and tool useRequires careful evaluation

Documentação externa

Veja também