Natural language processing (NLP)
Definition
NLP covers tasks on text: classification, named entity recognition (NER), question answering (QA), summarization, translation, and generation. Modern NLP is dominated by pretrained transformers (BERT, GPT, etc.) and large language models (LLMs).
Inputs are discrete (tokens); models learn from large corpora and are then adapted to downstream tasks via fine-tuning or prompting. Retrieval-augmented generation (RAG) and agents add retrieval and tool use on top of NLP models for grounded QA and task completion.
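The "discrete inputs" point can be made concrete: text is mapped to integer token IDs via a vocabulary. A minimal sketch with a hypothetical toy vocabulary (real systems use learned subword vocabularies such as BPE or WordPiece, not whitespace splitting):

```python
# Map text to discrete token IDs with a toy, hand-written vocabulary.
# Real tokenizers learn subword vocabs (BPE, WordPiece) from a corpus.
vocab = {"[UNK]": 0, "the": 1, "cat": 2, "sat": 3, "on": 4, "mat": 5}

def tokenize(text: str) -> list[int]:
    """Lowercase, whitespace-split, then map each token to its ID ([UNK] if absent)."""
    return [vocab.get(tok, vocab["[UNK]"]) for tok in text.lower().split()]

print(tokenize("The cat sat on the mat"))  # [1, 2, 3, 4, 1, 5]
```

Everything downstream (embeddings, transformer layers, output heads) operates on these IDs rather than on raw characters.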
How it works
Text is tokenized (split into subwords or words) and optionally normalized. The model (e.g., BERT or GPT) maps token IDs to embeddings and passes them through transformer layers to produce contextual representations. A task-specific output head (e.g., a classifier, span predictor, or next-token decoder) maps those representations to the final prediction. Models are pretrained on large corpora (masked language modeling or next-token prediction), then fine-tuned or prompted for downstream tasks. Classical pipelines chain tokenization, embeddings, and task-specific heads; a single LLM can handle many tasks given the right prompt.
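The forward path above (token IDs → embeddings → representation → output head) can be sketched in a few lines. This is a deliberately tiny stand-in: the embedding values and head weights are made up, and mean pooling replaces the stacked transformer layers a real model would use.

```python
# Toy forward pass: token IDs -> embedding lookup -> pooled vector
# -> linear classification head -> softmax probabilities.
# All weights are hypothetical; real models learn them via pretraining
# and fine-tuning, and use transformer layers instead of mean pooling.
import math

EMB = {  # 2-dimensional embedding per token ID (made-up values)
    0: [0.0, 0.0], 1: [0.2, -0.1], 2: [0.9, 0.4],
}
W = [[1.0, -1.0], [-1.0, 1.0]]  # 2-class linear head weights
b = [0.0, 0.0]                  # head biases

def forward(token_ids: list[int]) -> list[float]:
    # Embedding lookup, then mean-pool token vectors into one vector.
    vecs = [EMB[t] for t in token_ids]
    pooled = [sum(col) / len(vecs) for col in zip(*vecs)]
    # Linear head -> logits -> softmax over the two classes.
    logits = [sum(w_i * x for w_i, x in zip(row, pooled)) + b_j
              for row, b_j in zip(W, b)]
    z = max(logits)
    exps = [math.exp(l - z) for l in logits]
    return [e / sum(exps) for e in exps]

probs = forward([1, 2])  # probabilities for a 2-token input
```

Swapping the head (classifier, span predictor, next-token decoder) while keeping the same backbone is exactly how one pretrained model serves many tasks.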
Use cases
NLP applies to any product or pipeline that needs to understand or generate text at scale.
- Machine translation, summarization, and question answering
- Named entity recognition, sentiment analysis, and text classification
- Chatbots, code generation, and document understanding