{"id":24452,"date":"2026-05-15T16:21:53","date_gmt":"2026-05-15T09:21:53","guid":{"rendered":"https:\/\/fpt-is.com\/en\/?post_type=goc_nhin_so&#038;p=24452"},"modified":"2026-05-16T14:57:08","modified_gmt":"2026-05-16T07:57:08","slug":"knowledge-graphs-and-their-application-in-law","status":"publish","type":"goc_nhin_so","link":"https:\/\/fpt-is.com\/en\/insights\/knowledge-graphs-and-their-application-in-law\/","title":{"rendered":"Knowledge Graphs and their application in law: how GraphRAG is rewriting the rules of Legal AI"},"content":{"rendered":"<h2>Summary<\/h2>\n<p>Large Language Models (LLMs) are driving a dramatic transformation in the legal industry. Yet Legal AI systems built on LLMs still suffer from a fundamental flaw: hallucination \u2014 where a model fabricates non-existent case law, misquotes regulations, or draws conclusions with no legal basis whatsoever. A 2024 Stanford study found that even specialized legal research tools from LexisNexis and Thomson Reuters hallucinated between 17% and 33% of the time. In a field where precision and traceability are non-negotiable, this is an especially serious risk.<\/p>\n<p>This article examines the limitations of traditional RAG architecture and introduces GraphRAG combined with Knowledge Graphs as a new foundation for Legal AI. We focus on four key questions: (1) what is a Knowledge Graph and why is legal data uniquely suited to the graph model; (2) what are the shortcomings of traditional RAG and how does GraphRAG improve legal reasoning, retrieval, and verification; (3) how do you actually build a Legal Knowledge Graph and what can you do with it; and (4) what does the road ahead look like for GraphRAG in Vietnam.<\/p>\n<h2><span style=\"color: #ff6600\">1. Introduction: When AI goes to court \u2014 and gets it wrong<\/span><\/h2>\n<p>In June 2023, a federal judge in Manhattan sanctioned two attorneys who had submitted a brief in <em>Mata v. Avianca<\/em> citing six precedents generated by ChatGPT. Every single one was fictional. 
By the end of 2025, hundreds of similar incidents had been documented across the United States, the United Kingdom, Australia, and Canada. These cases exposed a core limitation of first-generation AI in law: models can produce information that <em>looks<\/em> legally sound but has no basis in actual law.<\/p>\n<p>And yet, law remains one of the sectors with the greatest potential for AI transformation. A 2023 Goldman Sachs report estimated that roughly 44% of legal work could be supported by AI, while a 2024 Thomson Reuters survey found that 67% of law firms believe AI will have a major impact on the profession within five years.<\/p>\n<p>The biggest challenge for Legal AI, however, is not speed \u2014 it&#8217;s reliability, traceability, and accurate legal reasoning. That is precisely why Knowledge Graphs and GraphRAG are emerging as the foundation for the next generation of Legal AI. Rather than treating legal text as isolated data fragments, these systems model the entire legal system as a network of entities and relationships, enabling AI to understand the structure, interconnections, and validity of legal documents.<\/p>\n<h2><span style=\"color: #ff6600\">2. What is a Knowledge Graph?<\/span><\/h2>\n<h3>2.1 The Basic Concept<\/h3>\n<p>A Knowledge Graph is a method of organizing data as a network of entities and the relationships between them. Every person, organization, document, or concept is represented as a <strong>node<\/strong>, while the relationships connecting them are represented as <strong>edges<\/strong>. This mirrors how human beings actually remember and reason \u2014 far more naturally than storing data as disconnected text fragments.<\/p>\n<p>A typical Knowledge Graph has three core components:<\/p>\n<ul>\n<li><strong>Nodes:<\/strong> representing entities in the system. 
For example, the node &#8220;Enterprise Law 2020&#8221; might carry attributes like document number &#8220;59\/2020\/QH14&#8221;, enactment date &#8220;June 17, 2020&#8221;, and status &#8220;in effect.&#8221;<\/li>\n<li><strong>Edges:<\/strong> representing relationships between entities. For instance, &#8220;Decree 01\/2021\/ND-CP&#8221; holds a [GUIDES] relationship to &#8220;Enterprise Law 2020.&#8221;<\/li>\n<li><strong>Properties:<\/strong> descriptive metadata attached to nodes or edges, enabling querying, filtering, and inference.<\/li>\n<\/ul>\n<p>The term &#8220;graph&#8221; is apt because legal data doesn&#8217;t exist in a simple linear or hierarchical structure. A single law may be implemented by multiple decrees; a single article may reference multiple other documents; a document may simultaneously amend an older law while itself being amended by a newer one. The entire legal system resembles a complex web of interconnections \u2014 not a fixed tree.<\/p>\n<p>The concept gained widespread recognition in 2012 when Google introduced its Knowledge Graph with the tagline <strong>&#8220;things, not strings&#8221;<\/strong> \u2014 emphasizing a shift from processing character sequences to processing real-world entities and relationships.<\/p>\n<h3>2.2 Why legal data is perfectly suited for the Graph Model<\/h3>\n<p>Legal data doesn&#8217;t exist as standalone documents. It forms a dense network of laws, decrees, circulars, judgments, and legal concepts, all tightly interwoven. This is precisely what makes law such a natural fit for the Knowledge Graph model. Four reasons stand out:<\/p>\n<ol>\n<li><strong>a) Complex interconnectivity<\/strong><\/li>\n<\/ol>\n<p>A single article of law often touches multiple documents and legal relationships. 
Article 17 of the Enterprise Law 2020, for example, is guided by several decrees, references laws including the Law on Cadres and Civil Servants, the Bankruptcy Law, and the Penal Code, and is cited in numerous court judgments. A Knowledge Graph captures all these relationships in a form that closely mirrors how lawyers actually think and analyze.<\/p>\n<ol>\n<li><strong>b) Strict hierarchical structure<\/strong><\/li>\n<\/ol>\n<p>Vietnam&#8217;s legal system is organized in multiple layers \u2014 from the Constitution and Codes, down through Laws, Decrees, Circulars, and local regulations. Within each document, further subdivisions include Parts, Chapters, Sections, Articles, Clauses, and Points. Traditional RAG systems typically destroy this structure when chopping text into small independent chunks, while a Knowledge Graph preserves the hierarchical relationships across the entire legal system.<\/p>\n<ol>\n<li><strong>c) Time-sensitive validity<\/strong><\/li>\n<\/ol>\n<p>The validity of a legal document is not static. A document may have fully lapsed, partially lapsed, be in a transitional period, or apply differently depending on the timeframe. A contract signed in 2024, for instance, may still need to be governed by the Land Law 2013 even after the 2024 Land Law has taken effect. Knowledge Graphs handle this elegantly \u2014 every entity and relationship in the graph can carry temporal and validity metadata.<\/p>\n<ol>\n<li><strong>d) Multi-step legal reasoning<\/strong><\/li>\n<\/ol>\n<p>Many legal questions cannot be answered from a single document. They require chained reasoning across multiple steps. Consider: <em>&#8220;Which still-valid documents guide the laws that the Enterprise Law 2020 references?&#8221;<\/em> To answer this, a system must identify the referenced laws, find their corresponding guidance documents, verify the validity of each, and filter out superseded provisions. 
Vector search can find semantically similar text, but it cannot automatically traverse complex chains of legal relationships. This is where Knowledge Graphs deliver a decisive advantage.<\/p>\n<p><img decoding=\"async\" class=\"aligncenter size-full wp-image-24462\" src=\"https:\/\/cdn.fpt-is.com\/en\/sites\/3\/2026\/05\/Hierachy-of-Vietnamese-Legal-Documents-1778836100.png\" alt=\"Hierachy Of Vietnamese Legal Documents 1778836100\" width=\"741\" height=\"666\" srcset=\"https:\/\/cdn.fpt-is.com\/en\/sites\/3\/2026\/05\/Hierachy-of-Vietnamese-Legal-Documents-1778836100.png 741w, https:\/\/cdn.fpt-is.com\/en\/sites\/3\/2026\/05\/Hierachy-of-Vietnamese-Legal-Documents-1778836100-700x629.png 700w\" sizes=\"(max-width: 741px) 100vw, 741px\" \/><\/p>\n<p style=\"text-align: center\"><strong>Figure 1.<\/strong> <em>Illustration of a partial Legal Knowledge Graph centered on the Enterprise Law 2020. Each node represents an entity (legal document, article, concept, or court judgment); each edge is labeled to indicate the type of legal relationship.<\/em><\/p>\n<h2><span style=\"color: #ff6600\">3. From RAG to GraphRAG<\/span><\/h2>\n<h3>3.1 Traditional RAG and its limits<\/h3>\n<p>Retrieval-Augmented Generation (RAG) is currently the most widely used architecture in AI-powered document retrieval systems. The core idea: instead of relying entirely on a language model&#8217;s pre-trained knowledge, the system pulls additional information from an external data store at query time. 
This keeps the AI up-to-date and reduces hallucination risk.<\/p>\n<p>A standard RAG pipeline works in four steps:<\/p>\n<ol>\n<li><strong>Chunking<\/strong> \u2014 splitting documents into smaller pieces<\/li>\n<li><strong>Embedding<\/strong> \u2014 converting each chunk into a vector representing its semantic meaning<\/li>\n<li><strong>Retrieval<\/strong> \u2014 converting the query into a vector and finding the closest-matching chunks<\/li>\n<li><strong>Generation<\/strong> \u2014 passing retrieved chunks to an LLM to generate a response<\/li>\n<\/ol>\n<p>Think of embedding as &#8220;semantic coordinates&#8221; in a high-dimensional space, where similar passages cluster together. This works well for enterprise FAQs, internal documentation, or customer support systems.<\/p>\n<p>But when applied to law, traditional RAG starts showing serious cracks.<\/p>\n<p><strong>Limitation 1 \u2014 Loss of structural context<\/strong><\/p>\n<p>When legal text is chopped into small chunks, the system loses critical information about where a passage sits within the broader document structure. The phrase &#8220;charter capital,&#8221; for example, appears in multiple articles of the Enterprise Law 2020, each applying to a different legal context \u2014 one as a general definition, one specific to single-member LLCs, another for multi-member LLCs. Vector search registers these passages as semantically similar but fails to grasp the differences in legal context, leading to misretrieved rules and inaccurate answers.<\/p>\n<p><strong>Limitation 2 \u2014 Failure to follow legal cross-references<\/strong><\/p>\n<p>Legal systems are riddled with cross-references: &#8220;shall be implemented in accordance with Article X.&#8221; Traditional RAG can retrieve the passage containing the reference \u2014 but cannot automatically follow it to retrieve the referenced content. 
Ask about procedures for establishing a foreign-invested company, and the system might find the Enterprise Law provision referencing the Investment Law, but fail to retrieve the procedures for obtaining an Investment Registration Certificate (IRC). The answer comes back missing crucial steps.<\/p>\n<p><strong>Limitation 3 \u2014 Inability to track document validity<\/strong><\/p>\n<p>Vector search finds semantically similar content but doesn&#8217;t understand the validity status of legal documents. When asked about land regulations applicable in 2026, a RAG system might return results from both the 2013 Land Law and the 2024 Land Law simultaneously. Without knowing which is currently in force, the AI may base its answer on superseded provisions.<\/p>\n<p><strong>Limitation 4 \u2014 Lost relationships when merging results<\/strong><\/p>\n<p>Traditional RAG assembles multiple chunks from different sources and feeds them to the LLM \u2014 but has no understanding of the legal relationships between those documents. The system doesn&#8217;t know which document amends which, what is the primary text and what is supplementary guidance, or which provisions have been superseded. This leads to answers that simultaneously invoke contradictory regulations.<\/p>\n<p><strong>Limitation 5 \u2014 Failure at multi-hop questions<\/strong><\/p>\n<p>This is the most serious limitation of RAG in Legal AI. Many legal questions require chained reasoning \u2014 not single-document retrieval. Consider: <em>&#8220;Which currently-valid decrees guide the articles of the Enterprise Law 2020 that pertain to single-member limited liability companies?&#8221;<\/em> The system must identify the relevant articles, find the corresponding guidance documents, verify validity status for each, and filter out expired or superseded provisions. 
Traditional vector search handles this kind of multi-hop reasoning poorly, because similarity ranking alone cannot follow a chain of relationships.<\/p>\n<p><em>These five limitations are precisely the problems that the research team at FPT IS AI R&amp;D Center is focused on solving \u2014 specifically for the Vietnamese legal context, with its unique linguistic characteristics and document structure that English-language pipelines have not adequately addressed.<\/em><\/p>\n<h3>3.2 GraphRAG: Where LLMs meet Knowledge Graphs<\/h3>\n<p>GraphRAG was introduced by Microsoft Research in 2024 in the paper <em>&#8220;From Local to Global: A Graph RAG Approach to Query-Focused Summarization.&#8221;<\/em> It quickly became one of the most widely discussed approaches in knowledge-intensive AI, including Legal AI.<\/p>\n<p>Unlike traditional RAG \u2014 which stores data primarily as discrete chunks \u2014 GraphRAG adds a layer of <strong>structural understanding<\/strong> through a Knowledge Graph. Legal texts, articles, judgments, and legal concepts are represented as nodes; relationships like &#8220;references,&#8221; &#8220;guides,&#8221; &#8220;amends,&#8221; or &#8220;applies to&#8221; are represented as edges connecting those nodes.<\/p>\n<p>When a user asks a question, the system doesn&#8217;t just retrieve the most semantically similar passages. It also identifies relevant entities in the graph, traverses legal relationships, expands context according to the actual structure of the legal system, and checks document validity before generating a response.<\/p>\n<p>GraphRAG doesn&#8217;t wholesale replace vector search. The system still uses embeddings to find a relevant starting point, then adds Knowledge Graph traversal to reason and expand context in a controlled, structured way. 
Think of GraphRAG as the combination of:<\/p>\n<ul>\n<li>The <strong>semantic search<\/strong> capability of vector search<\/li>\n<li>The <strong>relational representation<\/strong> of Knowledge Graphs<\/li>\n<li>The <strong>natural language synthesis<\/strong> of LLMs<\/li>\n<\/ul>\n<p>The result: a system that doesn&#8217;t just <em>find<\/em> information but <em>understands<\/em> how legal provisions relate to each other across the entire data network.<\/p>\n<p>A simple analogy: traditional RAG is like randomly asking passersby for directions \u2014 each person knows only a fragment of the story. GraphRAG is like using Google Maps \u2014 the system knows not just the destination but the entire network of connections and the most logical route through the system.<\/p>\n<h3>3.3 By the Numbers: How much better is GraphRAG?<\/h3>\n<p>GraphRAG addresses traditional RAG&#8217;s limitations by layering graph-based reasoning on top of retrieval. The system understands where content sits within legal structure, automatically follows cross-references, tracks validity states, and preserves relationships like &#8220;amends,&#8221; &#8220;supersedes,&#8221; and &#8220;guides.&#8221; This enables AI to produce more accurate synthesized answers and execute multi-step reasoning chains that traditional vector search simply cannot handle.<\/p>\n<p>Quantitative results from Microsoft Research (Edge et al., 2024) and independent benchmarks are striking:<\/p>\n<table>\n<thead>\n<tr>\n<td><strong>Query Type<\/strong><\/td>\n<td><strong>Traditional RAG<\/strong><\/td>\n<td><strong>GraphRAG<\/strong><\/td>\n<\/tr>\n<\/thead>\n<tbody>\n<tr>\n<td>Multi-hop accuracy<\/td>\n<td>23\u201332%<\/td>\n<td>86\u201387%<\/td>\n<\/tr>\n<tr>\n<td>Corpus-wide synthesis (win rate)<\/td>\n<td>~22%<\/td>\n<td>~78%<\/td>\n<\/tr>\n<tr>\n<td>Citation accuracy<\/td>\n<td>~60%<\/td>\n<td>&gt;90%<\/td>\n<\/tr>\n<tr>\n<td>Specific single-instance 
queries<\/td>\n<td>~75%<\/td>\n<td>~88%<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<p><img decoding=\"async\" class=\"aligncenter size-full wp-image-24463\" src=\"https:\/\/cdn.fpt-is.com\/en\/sites\/3\/2026\/05\/knowledge_graph_enterprise_law_en-RS-1778836161.png\" alt=\"Knowledge Graph Enterprise Law En Rs 1778836161\" width=\"828\" height=\"522\" srcset=\"https:\/\/cdn.fpt-is.com\/en\/sites\/3\/2026\/05\/knowledge_graph_enterprise_law_en-RS-1778836161.png 828w, https:\/\/cdn.fpt-is.com\/en\/sites\/3\/2026\/05\/knowledge_graph_enterprise_law_en-RS-1778836161-700x441.png 700w\" sizes=\"(max-width: 828px) 100vw, 828px\" \/><\/p>\n<p style=\"text-align: center\"><strong>Figure 2.<\/strong> <em>A visual comparison of the two architectures. Left: Traditional RAG retrieves the top-k semantically similar chunks with no understanding of the relationships between them. Right: GraphRAG identifies the starting node, then traverses meaningful edges through the knowledge graph.<\/em><\/p>\n<h3>3.4 How GraphRAG actually works<\/h3>\n<p>GraphRAG operates in two main phases: <strong>indexing<\/strong> (building the knowledge graph ahead of time and refreshing it whenever the law changes) and <strong>querying<\/strong> (answering questions, at runtime).<\/p>\n<p><strong>Indexing Phase<\/strong><\/p>\n<ul>\n<li><strong>Structure-aware chunking:<\/strong> Unlike the fixed-size splits typical of traditional RAG, a legal GraphRAG pipeline preserves the natural structure of legal text. Each Article is typically treated as a distinct chunk and tagged with metadata including document number, position in the hierarchy, effective date, and legal status.<\/li>\n<li><strong>Entity extraction:<\/strong> AI reads each chunk to identify key entities \u2014 law names, article numbers, government agencies, legal concepts \u2014 using a combination of rule-based methods and AI models.<\/li>\n<li><strong>Relation extraction:<\/strong> This is GraphRAG&#8217;s most critical step. 
The AI identifies relationships like &#8220;references,&#8221; &#8220;guides,&#8221; &#8220;amends,&#8221; &#8220;supersedes,&#8221; and &#8220;applies to.&#8221; For example, from the sentence &#8220;This Decree details and guides the implementation of certain articles of the Enterprise Law,&#8221; the system must extract a [GUIDES] relationship.<\/li>\n<li><strong>Graph construction:<\/strong> Extracted data is loaded into a graph database like Neo4j. Entity resolution normalizes different referring expressions \u2014 &#8220;Enterprise Law 2020,&#8221; &#8220;Law No. 59\/2020\/QH14,&#8221; &#8220;Enterprise Law&#8221; \u2014 to the same node.<\/li>\n<li><strong>Community detection &amp; hierarchical summarization:<\/strong> Graph algorithms automatically identify clusters of tightly connected nodes \u2014 typically corresponding to domains like corporate law, labor law, or tax \u2014 and generate multi-level summaries to support broad, synthesis-type queries.<\/li>\n<\/ul>\n<p><strong>Query Phase<\/strong><\/p>\n<p>When a user submits a question, GraphRAG analyzes the query to identify intent and relevant legal entities. It then selects the appropriate search strategy \u2014 <strong>local search<\/strong> or <strong>global search<\/strong>.<\/p>\n<p>The system traverses the Knowledge Graph, moving through relevant nodes and relationships to expand context according to actual legal structure. Retrieved data is filtered by temporal validity to eliminate provisions that are no longer current. Finally, an LLM generates the response with specific citations to source articles and documents.<\/p>\n<p>A final <strong>verification layer<\/strong> checks whether the citations actually support the stated conclusions \u2014 providing an additional safeguard against hallucination.<\/p>\n<p><strong>Local vs. 
Global Search:<\/strong><\/p>\n<ul>\n<li><strong>Local search<\/strong> is ideal for specific queries about a particular article or a small number of legal entities \u2014 e.g., &#8220;What does Article 17 provide?&#8221; The system retrieves a small subgraph, making it fast and resource-efficient.<\/li>\n<li><strong>Global search<\/strong> handles broad synthesis questions requiring reasoning across wide scope. It relies on community summaries and the ability to connect and reason across multiple data clusters. Slower and more compute-intensive, global search enables GraphRAG to handle the kinds of questions that traditional RAG effectively cannot.<\/li>\n<\/ul>\n<h2><span style=\"color: #ff6600\">4. Building legal Knowledge Graphs<\/span><\/h2>\n<h3>4.1 The Construction Methodology<\/h3>\n<p>Building a Knowledge Graph for law is equal parts data engineering, natural language processing, and legal expertise. The process typically unfolds in four stages.<\/p>\n<p><strong>Stage 1 \u2014 Data Collection and Normalization<\/strong><\/p>\n<p>In Vietnam, key data sources include vbpl.vn (the National Legal Document Database), the public court judgment portal, ministerial information systems, and the national public services portal. In enterprise settings, the corpus also includes contracts, internal policies, and compliance documents. The major challenge: data exists in multiple formats \u2014 HTML, Word, text-based PDF, and scanned PDF \u2014 each requiring its own processing pipeline. OCR quality for scanned Vietnamese-language documents directly impacts the accuracy of every downstream step.<\/p>\n<p><strong>Stage 2 \u2014 Ontology Definition<\/strong><\/p>\n<p>The ontology is the &#8220;logical framework&#8221; of the Knowledge Graph, defining the types of entities and relationships in the legal system. 
For Vietnamese law, a solid ontology typically models:<\/p>\n<ul>\n<li>Document types: Constitution, Code, Law, Decree, Circular<\/li>\n<li>Internal structure: Part, Chapter, Article, Clause, Point<\/li>\n<li>Legal relationships: &#8220;references,&#8221; &#8220;guides,&#8221; &#8220;amends,&#8221; &#8220;supersedes,&#8221; &#8220;revokes&#8221;<\/li>\n<li>Temporal and validity attributes<\/li>\n<\/ul>\n<p><strong>Stage 3 \u2014 Entity and Relation Extraction<\/strong><\/p>\n<p>This stage directly determines the quality of the resulting Knowledge Graph. Modern systems combine three approaches:<\/p>\n<ul>\n<li><strong>Rule-based<\/strong> for clearly structured patterns<\/li>\n<li><strong>Machine learning<\/strong> with NER and relation extraction models trained on Vietnamese legal data<\/li>\n<li><strong>LLM-based<\/strong> methods for complex semantics and multi-layer references<\/li>\n<\/ul>\n<p>Despite advancing AI capabilities, high-quality systems still require legal expert review to validate and calibrate extracted results.<\/p>\n<p><strong>Stage 4 \u2014 Integration and Validation<\/strong><\/p>\n<p>Extracted data is loaded into a graph database to build the complete Knowledge Graph. This stage involves entity resolution (normalizing different names for the same document), conflict resolution across data sources, tracking revision history and validity status, and building data quality metrics. This is where you ensure consistency, traceability, and reliability across the entire Legal AI system.<\/p>\n<h3>4.2 Storage, Querying, and Legal Inference<\/h3>\n<p>In current Legal AI projects, <strong>Neo4j<\/strong> is the most widely used graph database, thanks to its large developer community and the relatively accessible Cypher query language. 
For very large-scale systems with tens of millions of nodes and relationships, <strong>TigerGraph<\/strong> is a notable alternative for high-performance graph processing.<\/p>\n<p>One of the most powerful capabilities of a Knowledge Graph is supporting multi-dimensional legal inference:<\/p>\n<ul>\n<li><strong>Chain-of-relationship inference:<\/strong> If Document A amends Document B, and Document B guides a particular article, the system can infer that Document A also indirectly affects that article.<\/li>\n<li><strong>Temporal inference:<\/strong> The system can determine which document applied at a specific point in time, based on effective dates, amendment timestamps, and supersession records.<\/li>\n<li><strong>Hierarchical inference:<\/strong> If a regulation applies to a broader legal category, sub-categories typically inherit that regulation \u2014 unless a specific carve-out exists.<\/li>\n<li><strong>Legal principle inference:<\/strong> The graph can support foundational rules of legal reasoning, including <em>lex posterior<\/em> (later law prevails), <em>lex specialis<\/em> (specific law prevails over general), and hierarchical priority when lower-level instruments conflict with higher-level ones.<\/li>\n<\/ul>\n<p><img decoding=\"async\" class=\"aligncenter size-full wp-image-24464\" src=\"https:\/\/cdn.fpt-is.com\/en\/sites\/3\/2026\/05\/rag_vs_graphrag_en-RS-1778836190.png\" alt=\"Rag Vs Graphrag En Rs 1778836190\" width=\"1000\" height=\"535\" srcset=\"https:\/\/cdn.fpt-is.com\/en\/sites\/3\/2026\/05\/rag_vs_graphrag_en-RS-1778836190.png 1000w, https:\/\/cdn.fpt-is.com\/en\/sites\/3\/2026\/05\/rag_vs_graphrag_en-RS-1778836190-700x375.png 700w\" sizes=\"(max-width: 1000px) 100vw, 1000px\" \/><\/p>\n<p style=\"text-align: center\"><strong>Figure 3.<\/strong> <em>The hierarchical structure of Vietnam&#8217;s legal document system, with legal authority increasing from bottom to top.<\/em><\/p>\n<h3>4.3 Real-World Applications for Legal 
AI<\/h3>\n<p><strong>Smart Legal Research.<\/strong> The most fundamental application: natural language querying with synthesized, citation-backed answers. GraphRAG handles questions that traditional keyword search cannot \u2014 multi-hop queries, time-specific questions, old-versus-new regulatory comparisons, case law trend analysis, and assessments of regulatory impact on business operations.<\/p>\n<p><strong>Contract Analysis and Clause Review.<\/strong> GraphRAG enables AI to read contracts, extract key provisions, and cross-reference them against the Legal Knowledge Graph to identify risks or unfavorable terms. This includes classifying clauses, comparing them against the Civil Code and sector-specific laws, and producing risk reports.<\/p>\n<p><strong>Legal Due Diligence for M&amp;A.<\/strong> GraphRAG can automatically review large volumes of documents (contracts, charters, licenses, tax and investment records), check approval requirements under Competition Law, assess foreign investment conditions, and flag risks across tax, labor, intellectual property, and personal data protection \u2014 surfacing legal interdependencies that are easy to miss under time pressure.<\/p>\n<h3>4.4 The limitations: What GraphRAG still can&#8217;t fully solve<\/h3>\n<p>Despite their advantages, Knowledge Graphs and GraphRAG come with real limitations.<\/p>\n<p><strong>High build cost.<\/strong> Constructing and maintaining the system requires a multidisciplinary team of AI engineers, data engineers, and legal experts. Full deployment and calibration can take months to years.<\/p>\n<p><strong>Continuous maintenance required.<\/strong> When the law changes, the graph must be updated. 
Stale data means AI may reason from superseded provisions.<\/p>\n<p><strong>Quality depends on extraction accuracy.<\/strong> If the system misidentifies or misses relationships like &#8220;amends,&#8221; &#8220;revokes,&#8221; or &#8220;guides,&#8221; downstream reasoning will be flawed.<\/p>\n<p><strong>Hallucination is reduced but not eliminated.<\/strong> In Legal AI, the most dangerous errors aren&#8217;t wholesale fabrication \u2014 they&#8217;re partially correct answers that omit a critical legal condition.<\/p>\n<p><strong>Vietnamese-specific challenges.<\/strong> Vietnamese Legal AI faces additional hurdles: a shortage of standardized benchmarks, long complex sentences, nested cross-references, Sino-Vietnamese legal terminology, and the resolution of self-referential phrases (&#8220;this article,&#8221; &#8220;this clause&#8221;). These characteristics make it difficult to directly port English-language models and pipelines.<\/p>\n<p><em>These are the challenges FPT IS AI R&amp;D Center is targeting, by combining PhoBERT models fine-tuned for Vietnamese legal NLP and NER with an ontology designed specifically for Vietnamese legal document structure, plus a verification layer that checks citations before responses are returned. Rather than adapting English pipelines, this approach handles the specific characteristics of Vietnamese legal text \u2014 long sentences, nested references, Sino-Vietnamese terminology, and complex syntactic structures \u2014 minimizing the &#8220;partially correct but legally incomplete&#8221; errors that represent the most dangerous form of hallucination in legal applications.<\/em><\/p>\n<h2><span style=\"color: #ff6600\">5. International Benchmarks and the FPT Approach<\/span><\/h2>\n<p>Between 2023 and 2026, several countries and organizations deployed Knowledge Graphs and GraphRAG for Legal AI at scale. <strong>Harvey<\/strong> partnered with A&amp;O Shearman in the US to build a legal AI assistant for corporate lawyers. 
<strong>Singapore<\/strong> invested heavily in digital legal infrastructure through LawNet and standardized legal ontologies. <strong>China<\/strong> deployed its &#8220;Smart Court&#8221; model, linking millions of judgments to statutory provisions. <strong>Italy and the European Union<\/strong> adopted the Akoma Ntoso standard for encoding legal documents as structured XML.<\/p>\n<p>Three lessons emerge from these international experiences \u2014 and three principles that FPT applies in its own Legal AI development:<\/p>\n<ol>\n<li><strong>Effective Legal AI requires close collaboration between technology companies and legal organizations.<\/strong> FPT&#8217;s development process involves legal experts from the ontology definition phase through output validation.<\/li>\n<li><strong>Data quality and standardization are the most critical foundation.<\/strong> Mirroring Singapore&#8217;s approach with LawNet, FPT has built a normalization pipeline covering all currently-valid Vietnamese legal documents, with continuous updates as the law changes.<\/li>\n<li><strong>AI governance must be built in parallel with technology development.<\/strong> FPT&#8217;s solution integrates audit logging, access control, and responsible AI governance principles from day one of architecture design.<\/li>\n<\/ol>\n<p><strong>The FPT IS AI R&amp;D Center&#8217;s current architecture<\/strong> for Knowledge Graph and GraphRAG-based legal document search and analysis is built around three core principles:<\/p>\n<ul>\n<li><strong>Accurate:<\/strong> Every answer carries clear citations to source documents<\/li>\n<li><strong>Comprehensive:<\/strong> Full coverage of Vietnam&#8217;s currently-valid legal system, continuously updated<\/li>\n<li><strong>Trustworthy:<\/strong> Built on responsible AI governance principles<\/li>\n<\/ul>\n<p>The technical stack consists of five layers:<\/p>\n<ol>\n<li>Data collection and normalization<\/li>\n<li>Vietnamese Legal Knowledge 
Graph<\/li>\n<li>Vietnamese Legal NLP \u2014 using PhoBERT and fine-tuned models for NER and relation extraction<\/li>\n<li>GraphRAG and querying \u2014 combining Local Search, Global Search, and citation verification<\/li>\n<li>Interface and governance \u2014 including audit logging and role-based access control<\/li>\n<\/ol>\n<p>In this model, AI is positioned as a <strong>professional support tool<\/strong> \u2014 accelerating legal research and analysis \u2014 rather than a replacement for the judgment and decision-making authority of lawyers and legal professionals.<\/p>\n<h2>6. Conclusion: the infrastructure layer that Legal AI has been missing<\/h2>\n<p>Knowledge Graphs and GraphRAG are opening a fundamentally new approach to Legal AI \u2014 particularly in domains demanding precision, multi-step reasoning, and rigorous traceability. Rather than searching by keyword or semantic similarity alone, these systems model law as a network of entities and legal relationships \u2014 far closer to how lawyers actually research and analyze.<\/p>\n<p>With its dense cross-referencing, complex hierarchical structure, and time-varying validity rules, law is an extraordinarily natural fit for the graph model. This is why Knowledge Graphs and GraphRAG are becoming the foundational infrastructure for the next generation of Legal AI.<\/p>\n<p>In Vietnam, the convergence of growing digital legal databases, maturing Vietnamese NLP capabilities, and urgent digital transformation demand is creating real conditions for building trustworthy Legal AI systems. But realizing that potential requires genuine collaboration between technology companies, legal professionals, regulators, and the research community.<\/p>\n<p>AI can powerfully augment legal practice \u2014 in research, analysis, and information retrieval. But it cannot replace human judgment and professional responsibility. 
In a discipline built on reasoning and accountability, how you build and govern AI matters as much as the technology itself.<\/p>\n<p><strong>References<\/strong><\/p>\n<p><strong>1. Foundational GraphRAG Research<\/strong><\/p>\n<ul>\n<li>Edge, D., Trinh, H., Cheng, N., et al. (2024). &#8220;From Local to Global: A Graph RAG Approach to Query-Focused Summarization.&#8221; Microsoft Research. arXiv:2404.16130.<\/li>\n<li>Berners-Lee, T., Hendler, J., &amp; Lassila, O. (2001). &#8220;The Semantic Web.&#8221; <em>Scientific American<\/em>, 284(5), 34\u201343.<\/li>\n<li>Yao, S., et al. (2022). &#8220;ReAct: Synergizing Reasoning and Acting in Language Models.&#8221; arXiv:2210.03629.<\/li>\n<\/ul>\n<p><strong>2. Legal AI Research<\/strong><\/p>\n<ul>\n<li>Magesh, V., et al. (2024). &#8220;Hallucination-Free? Assessing the Reliability of Leading AI Legal Research Tools.&#8221; Stanford RegLab and HAI.<\/li>\n<li>Vuong, T. H. Y., et al. (2023). &#8220;Constructing a Knowledge Graph for Vietnamese Legal Cases with Heterogeneous Graphs.&#8221; IEEE-KSE 2023.<\/li>\n<li>Springer (2025). Graph RAG applied to 6 core Vietnamese legal codes.<\/li>\n<li>Vals AI (2025). &#8220;Legal AI Benchmarks: Comparative Performance of Leading Legal AI Tools.&#8221;<\/li>\n<li>American Bar Association (2024). &#8220;Formal Opinion 512: Generative Artificial Intelligence Tools.&#8221;<\/li>\n<\/ul>\n<p><strong>3. Industry Reports<\/strong><\/p>\n<ul>\n<li>Goldman Sachs Global Investment Research (2023). &#8220;The Potentially Large Effects of Artificial Intelligence on Economic Growth.&#8221;<\/li>\n<li>Thomson Reuters (2024). &#8220;Future of Professionals Report.&#8221;<\/li>\n<li>Gartner (2025). &#8220;Predictions for Generative AI in 2025.&#8221;<\/li>\n<\/ul>\n<p><strong>4. Vietnamese Law and Data Sources<\/strong><\/p>\n<ul>\n<li>AI Law No. 134\/2025\/QH15. National Assembly of Vietnam. Effective March 1, 2026.<\/li>\n<li>Enterprise Law No. 59\/2020\/QH14. 
National Assembly of Vietnam.<\/li>\n<li>vbpl.vn: National Database of Legal Documents. Ministry of Justice.<\/li>\n<li>congbobanan.toaan.gov.vn: Supreme People&#8217;s Court Public Judgment Portal.<\/li>\n<\/ul>\n<p><strong>5. Products and Technologies<\/strong><\/p>\n<ul>\n<li>Microsoft GraphRAG: https:\/\/github.com\/microsoft\/graphrag<\/li>\n<li>Neo4j: https:\/\/neo4j.com<\/li>\n<li>Harvey AI: https:\/\/www.harvey.ai<\/li>\n<li>Thomson Reuters CoCounsel: https:\/\/www.thomsonreuters.com\/en\/products\/cocounsel.html<\/li>\n<li>PhoBERT: https:\/\/github.com\/VinAIResearch\/PhoBERT<\/li>\n<li>Akoma Ntoso: http:\/\/www.akomantoso.org<\/li>\n<\/ul>\n<table>\n<tbody>\n<tr>\n<td width=\"623\"><em>Exclusive article by FPT IS Technology Experts, FPT Corporation<\/em><\/p>\n<p><strong>Nguyen Truong An<\/strong><\/p>\n<p><em>AI Product Manager \u2014 FPT IS AI R&amp;D Center<\/em><\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n","protected":false},"author":3,"featured_media":24468,"parent":0,"template":"","nang_luc":[],"danh_muc_goc_nhin_so":[],"dich_vu":[],"linh_vuc":[],"platform":[],"san_pham":[],"the_goc_nhin_so":[],"class_list":["post-24452","goc_nhin_so","type-goc_nhin_so","status-publish","has-post-thumbnail","hentry"],"acf":[],"_links":{"self":[{"href":"https:\/\/fpt-is.com\/en\/wp-json\/wp\/v2\/goc_nhin_so\/24452","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/fpt-is.com\/en\/wp-json\/wp\/v2\/goc_nhin_so"}],"about":[{"href":"https:\/\/fpt-is.com\/en\/wp-json\/wp\/v2\/types\/goc_nhin_so"}],"author":[{"embeddable":true,"href":"https:\/\/fpt-is.com\/en\/wp-json\/wp\/v2\/users\/3"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/fpt-is.com\/en\/wp-json\/wp\/v2\/media\/24468"}],"wp:attachment":[{"href":"https:\/\/fpt-is.com\/en\/wp-json\/wp\/v2\/media?parent=24452"}],"wp:term":[{"taxonomy":"nang_luc","embeddable":true,"href":"https:\/\/fpt-is.com\/en\/wp-json\/wp\/v2\/nang_luc?post=24452"},{"taxonomy":"danh_muc_goc_nhin_so","embeddable":true,"href"
:"https:\/\/fpt-is.com\/en\/wp-json\/wp\/v2\/danh_muc_goc_nhin_so?post=24452"},{"taxonomy":"dich_vu","embeddable":true,"href":"https:\/\/fpt-is.com\/en\/wp-json\/wp\/v2\/dich_vu?post=24452"},{"taxonomy":"linh_vuc","embeddable":true,"href":"https:\/\/fpt-is.com\/en\/wp-json\/wp\/v2\/linh_vuc?post=24452"},{"taxonomy":"platform","embeddable":true,"href":"https:\/\/fpt-is.com\/en\/wp-json\/wp\/v2\/platform?post=24452"},{"taxonomy":"san_pham","embeddable":true,"href":"https:\/\/fpt-is.com\/en\/wp-json\/wp\/v2\/san_pham?post=24452"},{"taxonomy":"the_goc_nhin_so","embeddable":true,"href":"https:\/\/fpt-is.com\/en\/wp-json\/wp\/v2\/the_goc_nhin_so?post=24452"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}