Abbreviations

This page lists the technical abbreviations used most often in the book, so that they can be looked up quickly from any part or chapter.

General Abbreviations

Table FM-1: General Abbreviations.

| Abbreviation | Full Name | Description | Main Locations |
| --- | --- | --- | --- |
| A100 | NVIDIA A100 GPU | NVIDIA A100 accelerator | Part 1, Part 10, Part 11 |
| AI | Artificial Intelligence | Artificial intelligence | Whole book |
| AGI | Artificial General Intelligence | Artificial general intelligence | Part 1, Part 11 |
| API | Application Programming Interface | Application programming interface | Part 1, Part 4, Part 10, Part 11 |
| ANN | Approximate Nearest Neighbor | Approximate nearest-neighbor retrieval | Part 1, Part 7 |
| ASR | Automatic Speech Recognition | Automatic speech recognition | Part 3 |
| BM25 | Best Matching 25 | Classic sparse retrieval ranking method | Part 1, Part 7 |
| CI/CD | Continuous Integration / Continuous Deployment | Continuous integration and continuous deployment | Part 1, Part 2, Part 8 |
| CPU | Central Processing Unit | Central processing unit | Part 1, Part 2 |
| CSV | Comma-Separated Values | Comma-separated text format | Part 1 |
| ETL | Extract, Transform, Load | Extract, transform, and load | Part 1, Part 8 |
| GPU | Graphics Processing Unit | Graphics processing unit | Part 1, Part 3, Part 10, Part 11 |
| GUID | Globally Unique Identifier | Globally unique identifier | Part 2 |
| HDFS | Hadoop Distributed File System | Hadoop distributed file system | Part 1, Part 10 |
| H100 | NVIDIA H100 GPU | NVIDIA H100 accelerator | Part 1 |
| JSON | JavaScript Object Notation | Structured data interchange format | Part 1, Part 3, Part 4, Part 10, Part 11 |
| JSONL | JSON Lines | Line-delimited JSON text format | Part 1, Part 3, Part 4, Part 10 |
| KPI | Key Performance Indicator | Key performance indicator | Part 4, Part 8 |
| LLM | Large Language Model | Large language model | Whole book |
| MLOps | Machine Learning Operations | Machine-learning engineering and operations system | Part 1, Part 8 |
| PDF | Portable Document Format | Portable document format | Part 1, Part 3, Part 7, Part 10, Part 11 |
| PII | Personally Identifiable Information | Personally identifiable information | Part 2, Part 9, Part 10 |
| ROI | Return on Investment | Return on investment | Part 1, Part 8 |
| SLA | Service Level Agreement | Service-level agreement | Part 1, Part 4, Part 8 |
| SOPs | Standard Operating Procedures | Standard operating procedures | Part 4, Part 8 |
| SQL | Structured Query Language | Structured query language | Part 1, Part 4, Part 11 |
| TPU | Tensor Processing Unit | Tensor processing unit | Part 1 |
| UTF-8 | 8-bit Unicode Transformation Format | 8-bit Unicode transformation format | Part 2 |

Data Engineering and Platforms

Table FM-2: Data Engineering and Platforms.

| Abbreviation | Full Name | Description | Main Locations |
| --- | --- | --- | --- |
| DataOps | Data Operations | Data operations and data-engineering operations system | Part 2, Part 8, Part 10 |
| DOM | Document Object Model | Document object model | Part 3, Part 11 |
| DVC | Data Version Control | Data version control tool or method | Part 1, Part 2, Part 8 |
| FAISS | Facebook AI Similarity Search | Vector similarity search library | Part 7 |
| FastText | FastText | Lightweight text representation and classification tool | Part 2 |
| LakeFS | LakeFS | Version management system for data lakes | Part 1, Part 8 |
| MATTR | Moving-Average Type-Token Ratio | Moving-average type-token ratio | Part 2 |
| MFU | Model FLOPs Utilization | Model FLOPs utilization | Part 2 |
| MinHash | Min-wise Independent Permutations Hashing | Approximate deduplication method based on min-wise hashing | Part 1, Part 2, Part 11 |
| OOM | Out of Memory | Out-of-memory error | Part 1, Part 2 |
| PPL | Perplexity | Perplexity metric | Part 1, Part 2, Part 4 |
| RDMA | Remote Direct Memory Access | Remote direct memory access | Part 1 |
| ReDoS | Regular Expression Denial of Service | Regular-expression denial-of-service risk | Part 2 |
| TTR | Type-Token Ratio | Type-token ratio, a diversity metric | Part 2 |
| WARC | Web ARChive | Web archive format | Part 2 |
| WebDataset | WebDataset | Data packaging format and tool for large-scale training | Part 2, Part 3 |

Training, Alignment, and Reasoning

Table FM-3: Training, Alignment, and Reasoning.

| Abbreviation | Full Name | Description | Main Locations |
| --- | --- | --- | --- |
| CoT | Chain-of-Thought | Chain-of-thought reasoning | Part 6, Part 10, Part 11 |
| DPO | Direct Preference Optimization | Direct preference optimization | Part 4, Part 11 |
| LoRA | Low-Rank Adaptation | Low-rank adaptation fine-tuning method | Part 11 |
| PPO | Proximal Policy Optimization | Proximal policy optimization | Part 4, Part 11 |
| PRM | Process Reward Model | Process reward model | Part 4, Part 6, Part 10, Part 11 |
| QA | Quality Assurance | Quality assurance | Part 4, Part 10 |
| RAG | Retrieval-Augmented Generation | Retrieval-augmented generation | Part 7, Part 10 |
| RL | Reinforcement Learning | Reinforcement learning | Part 4, Part 11 |
| RLAIF | Reinforcement Learning from AI Feedback | Reinforcement learning from AI feedback | Part 4 |
| RLHF | Reinforcement Learning from Human Feedback | Reinforcement learning from human feedback | Part 4, Part 11 |
| RM | Reward Model | Reward model | Part 4 |
| ROUGE-L | Recall-Oriented Understudy for Gisting Evaluation - Longest Common Subsequence | Text-similarity metric based on the longest common subsequence | Part 2, Part 4 |
| SFT | Supervised Fine-Tuning | Supervised fine-tuning | Part 4, Part 10, Part 11 |

Multimodality and Vision

Table FM-4: Multimodality and Vision.

| Abbreviation | Full Name | Description | Main Locations |
| --- | --- | --- | --- |
| BBox | Bounding Box | Bounding box | Part 3, Part 10, Part 11 |
| ChartQA | Chart Question Answering | Chart question-answering task or dataset | Part 3, Part 11 |
| CLIP | Contrastive Language-Image Pre-training | Contrastive image-text pre-training model | Part 3, Part 11 |
| CLIP-Score | CLIP Score | Image-text relevance score based on CLIP | Part 11 |
| COCO | Common Objects in Context | General object-detection and image-captioning dataset | Part 3, Part 10 |
| DINO | DEtection TRansformer with Improved deNoising anchOr boxes | Detection model family, often used in Grounding DINO contexts | Part 3, Part 11 |
| DocVQA | Document Visual Question Answering | Document visual question-answering task or dataset | Part 11 |
| Grounding | Visual Grounding | Visual grounding or alignment task | Part 3, Part 10, Part 11 |
| IoU | Intersection over Union | Object-detection overlap metric | Part 3 |
| LLaVA | Large Language and Vision Assistant | Multimodal large model and data format name | Part 10, Part 11 |
| OCR | Optical Character Recognition | Optical character recognition | Part 3, Part 7, Part 10, Part 11 |
| OCR-Rich | OCR-Rich Data | Image or document data rich in OCR information | Part 11 |
| SSIM | Structural Similarity Index Measure | Structural similarity metric | Part 11 |
| ViT | Vision Transformer | Vision Transformer encoder | Part 11 |
| VLM | Vision-Language Model | Vision-language model | Part 3, Part 11 |
| VQA | Visual Question Answering | Visual question answering | Part 3, Part 11 |
| XML | eXtensible Markup Language | Extensible markup language | Part 1, Part 3 |
| YOLO | You Only Look Once | Object-detection model family | Part 3 |

Evaluation, Compliance, and Governance

Table FM-5: Evaluation, Compliance, and Governance.

| Abbreviation | Full Name | Description | Main Locations |
| --- | --- | --- | --- |
| AGI-Eval | AGI Evaluation | Evaluation benchmark for general-intelligence capabilities | Part 11 |
| DPIA | Data Protection Impact Assessment | Data protection impact assessment | Part 9 |
| GSM8K | Grade School Math 8K | Grade-school math reasoning benchmark | Part 1, Part 11 |
| MCTS | Monte Carlo Tree Search | Monte Carlo tree search | Part 11 |
| MMLU | Massive Multitask Language Understanding | Massive multitask language-understanding benchmark | Part 1, Part 11 |
| MMMU | Massive Multi-discipline Multimodal Understanding and Reasoning | Multidiscipline multimodal understanding and reasoning benchmark | Part 11 |
| NSFW | Not Safe For Work | Content unsuitable for public or workplace contexts | Part 2 |
| P99 | 99th Percentile | 99th percentile metric | Part 2 |
| P99.9 | 99.9th Percentile | 99.9th percentile metric | Part 2 |
| RoPA | Record of Processing Activities | Record of processing activities | Part 9 |