Category
LLM Engineering
8 published articles
01 Embeddings and Vector Databases
This note covers the core concepts and applications of embeddings and vector databases. Embedding transforms unstructured data such as text and images into high-dimensional vectors that capture semantic information. Vector databases store and retrieve these vectors efficiently, giving LLMs long-term memory and semantic search capabilities. The note also walks through practical uses of text vectorization techniques such as Word2Vec and TF-IDF, and of vector stores such as Faiss.
02 RAG Technology and Applications
This note examines RAG (retrieval-augmented generation) in depth. Starting from core principles, advantages, and application development patterns, it covers embedding model selection and a practical case study on building a local knowledge base retrieval system with DeepSeek and Faiss. It also covers advanced optimization strategies such as query rewriting and online search at query time, aimed at improving the accuracy and timeliness of LLM question-answering systems.
03 RAG Multimodal Data Processing
This document provides a detailed overview of multimodal data processing for RAG, including Gemini’s multimodal capabilities and API usage. Using the Disney RAG Assistant case study, it covers Multimodal-Embedding, Faiss index construction, the unified multimodal vector space, and the query processing workflow. Finally, it compares knowledge chunking strategies and the scenarios each one suits.
04 RAG Advanced Techniques and Optimization
This material explores advanced techniques and optimizations for RAG systems, structured across four main dimensions: Knowledge Base Processing, Efficient Retrieval, GraphRAG, and Agentic RAG. It cove
05 Hands-on Project: Enterprise Knowledge Base
This material explores a winning RAG system for an enterprise knowledge base challenge, focusing on processing complex annual reports for Q&A. It details the complete RAG pipeline from custom PDF pars
06 Function Calling & MCP
This material introduces Function Calling as a mechanism to extend large language models' capabilities by allowing them to invoke external tools for real-world interactions. It then delves into Model
07 Agent Autonomous Planning and Tool Development
This material explores the capabilities and design principles of AI Agents, systems powered by Large Language Models (LLMs) capable of autonomous planning, memory, and tool use to execute complex task
08 Agent Optimization and Performance Evaluation
This material covers methods for optimizing and evaluating AI Agents, focusing on practical tools and frameworks. It introduces a hybrid agent architecture using LangChain/LangGraph, demonstrates the