{"id":6403,"date":"2025-10-19T09:59:03","date_gmt":"2025-10-19T01:59:03","guid":{"rendered":"https:\/\/nullthought.net\/?p=6403"},"modified":"2025-10-19T09:59:05","modified_gmt":"2025-10-19T01:59:05","slug":"%e6%95%b4%e5%90%88mineru-chonkie-rag-anything-%e7%9a%84-rag%e7%b3%bb%e7%bb%9f%e6%8a%80%e6%9c%af%e5%ae%9e%e7%8e%b0","status":"publish","type":"post","link":"https:\/\/nullthought.net\/?p=6403","title":{"rendered":"Technical Implementation of a RAG System Integrating MinerU + Chonkie + RAG-Anything"},"content":{"rendered":"\n<p><mark style=\"background-color:rgba(0, 0, 0, 0);color:#cf2e2e\" class=\"has-inline-color\">Today I looked briefly into how to implement a RAG system that integrates MinerU + Chonkie + RAG-Anything:<\/mark><\/p>\n\n\n\n<p><strong>MinerU<\/strong>: out-of-the-box <strong>high-fidelity layout parsing<\/strong> (tables, cross-page content, formulas, figure captions) that outputs <strong>structured JSON + Markdown<\/strong>, which makes it well suited to downstream chunking and evidence display.<\/p>\n\n\n\n<p><strong>Chonkie<\/strong>: focused on <strong>ready-to-use chunking<\/strong>; it is lightweight, fast, and offers a variety of strategies, so it can serve directly as the <strong>unified chunking layer<\/strong>. The TS version runs isomorphically on the front end and back end, which is convenient for processing online document streams.<\/p>\n\n\n\n<p><strong><a href=\"https:\/\/nullthought.net\/?p=6393\" target=\"_blank\" rel=\"noreferrer noopener\">RAG-Anything<\/a><\/strong>: a multimodal \u201call-in-one RAG 
framework\u201d built on <strong>dual graphs + hybrid retrieval + multi-stage orchestration<\/strong>; it fills the gap in retrieval and reasoning across text, tables, figures, and formulas, and is compatible with the LightRAG ecosystem.<\/p>\n\n\n\n<h4 class=\"wp-block-heading\">I. Technical Architecture<\/h4>\n\n\n\n<p>Four-layer architecture: L1 ingestion and parsing \u2192 L2 chunking and archiving \u2192 L3 retrieval and orchestration \u2192 L4 generation and feedback.<\/p>\n\n\n\n<h5 class=\"wp-block-heading\"><strong>L1 Ingestion and Parsing (Documents \u2192 Structured Multimodal)<\/strong><\/h5>\n\n\n\n<p>1) <strong>MinerU parsing service<\/strong> (CLI\/SDK\/API): converts PDF, image, and Office files into Markdown + structured JSON, and also exports position information for tables, formulas, and image fragments along with visual debugging files, so that complex tables, cross-page cells, and similar content are extracted with high fidelity.<\/p>\n\n\n\n<p>2) <strong>Object storage \/ versioning<\/strong>: the original files and MinerU outputs are stored in MinIO\/S3 (<code>raw\/<\/code>, <code>mineru\/md\/<\/code>, <code>mineru\/json\/<\/code>, <code>mineru\/assets\/<\/code>).<\/p>\n\n\n\n<h5 class=\"wp-block-heading\"><strong>L2 Chunking and Archiving (Structured \u2192 Chunks+KG)<\/strong><\/h5>\n\n\n\n<p>3) <strong>Chonkie chunker<\/strong>: performs chunking on MinerU's Markdown\/JSON that is aware of both semantics and structure (sections, paragraphs, tables, figure captions, code, formulas), and preserves metadata such as <code>source_url \/ page \/ bbox \/ section_path \/ modality<\/code>. Chonkie 
provides a range of lightweight, high-performance chunkers (Recursive, ByHeaders, FixedTokens, and so on), which makes it a good fit for the standardized chunking layer on the RAG side.<\/p>\n\n\n\n<p>4) <strong>Knowledge graph construction (RAG-Anything)<\/strong>: treats text chunks, tables, images, formulas, and the like as <strong>nodes<\/strong>, and section relationships, figure-to-caption links, table-to-reference links, cross-page table continuations, and same-name entities as <strong>edges<\/strong>, producing a <strong>dual graph<\/strong> (structural graph + semantic graph) that serves as a navigation skeleton for cross-modal retrieval.<\/p>\n\n\n\n<p>5) <strong>Vector and sparse indexes<\/strong>:<\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li>Text: bge-m3, e5-mistral, and similar models;<\/li>\n\n\n\n<li>Tables: table serialization to text (Markdown\/CSV) + a dedicated embedding;<\/li>\n\n\n\n<li>Images\/charts: CLIP or another multimodal embedding;<\/li>\n\n\n\n<li>Sparse: BM25\/ColBERT-X (optional).<br>(RAG-Anything natively supports multimodal and multi-stage retrieval orchestration and is compatible with the LightRAG ecosystem.)<\/li>\n<\/ul>\n\n\n\n<h5 class=\"wp-block-heading\"><strong>L3 Retrieval and Orchestration (Query \u2192 Multistage Retrieval)<\/strong><\/h5>\n\n\n\n<p>6) 
<strong>Cross-modal retrieval orchestrator (RAG-Anything)<\/strong>: given a query, it first <strong>locates candidate regions<\/strong> on the structural graph (sections, subsections, figure\/table groups), then runs hybrid vector + sparse recall, and finally reranks with a reranker (bge-reranker-v2 or similar) to produce an \u201cevidence set\u201d (text passages, tables, images, formulas).<\/p>\n\n\n\n<p>7) <strong>Fragment fuser<\/strong>: stitches the evidence together by topical thread, layout adjacency, and cross-page table continuation to generate a <strong>Context Pack<\/strong>, preserving citation order and visual coordinates.<\/p>\n\n\n\n<h5 class=\"wp-block-heading\"><strong>L4 Generation and Feedback (Context \u2192 Answer+Citations)<\/strong><\/h5>\n\n\n\n<p>8) <strong>Answer generator<\/strong>: a multimodal LLM (or a text LLM plus dedicated table\/image interpreters) generates the answer under a strict token budget and outputs <strong>itemized citations<\/strong> (document, page number, figure number, cell coordinates).<\/p>\n\n\n\n<p>9) <strong>Feedback loop<\/strong>: when a user clicks a cited piece of evidence, this triggers <strong>correction\/refinement<\/strong> (Refine) and an <strong>index update<\/strong> (low-quality chunks are fed back into the MinerU configuration and the Chonkie chunking strategy), enabling continuous improvement.<\/p>\n\n\n\n<blockquote class=\"wp-block-quote has-small-font-size is-layout-flow 
wp-block-quote-is-layout-flow\">\n<p class=\"has-small-font-size\">Deployment form: Web API (FastAPI) + background tasks (Celery) + vector store (FAISS\/PGVector\/Weaviate) + graph database (Neo4j\/ArangoDB) + object storage (MinIO) + observability (Prometheus + Grafana + OpenTelemetry).<\/p>\n<\/blockquote>\n\n\n\n<h4 class=\"wp-block-heading\">II. Key Workflow (End to End)<\/h4>\n\n\n\n<h5 class=\"wp-block-heading\">1) Parsing and Ingestion (MinerU)<\/h5>\n\n\n\n<ul class=\"wp-block-list\">\n<li>Trigger: an upload, or a watcher on <code>raw\/<\/code>.<\/li>\n\n\n\n<li>Execution: call MinerU (local CLI or server-side API) to output Markdown + structured JSON + assets.<\/li>\n\n\n\n<li>Quality: enable the visual debugging files and the \u201cstructured data files\u201d to support post-processing and quality inspection.<\/li>\n<\/ul>\n\n\n\n<blockquote class=\"wp-block-quote is-layout-flow wp-block-quote-is-layout-flow\">\n<p>MinerU handles <strong>rotated tables and cross-page merged cells<\/strong> with high fidelity, making it suitable for demanding layouts in finance, scientific research, and standards\/specification documents.<\/p>\n<\/blockquote>\n\n\n\n<h5 class=\"wp-block-heading\">2) Semantic Chunking (Chonkie)<\/h5>\n\n\n\n<ul class=\"wp-block-list\">\n<li>Rules: prefer <code>ByHeaders \u2192 Paragraph \u2192 Recursive<\/code>, in that order; tables and figure captions each become their own chunk; long tables are split into \u201cheader + fragment chunks (with row-number ranges)\u201d.<\/li>\n\n\n\n<li>Goal: a <strong>stable token ceiling<\/strong> (for example 300\u2013500) + 
<strong>structural alignment<\/strong> (matching MinerU's section\/anchor).<\/li>\n\n\n\n<li>Implementation: use <code>chonkie<\/code> directly on the Python side; front-end\/Node tasks can reuse the same logic isomorphically via <code>@chonkiejs\/core<\/code>.<\/li>\n<\/ul>\n\n\n\n<h5 class=\"wp-block-heading\">3) Indexing and Graph Construction (RAG-Anything + Custom Indexes)<\/h5>\n\n\n\n<ul class=\"wp-block-list\">\n<li>Embed text, tables, and images separately and write them to the vector store under <code>namespace={modality}<\/code>;<\/li>\n\n\n\n<li>Build the <strong>dual graph<\/strong>:\n<ul class=\"wp-block-list\">\n<li>Structural graph (section hierarchy, chunk adjacency, figure \u2194 figure caption, table \u2194 table caption, body text \u2194 reference);<\/li>\n\n\n\n<li>Semantic graph (similar chunks, entity co-occurrence, cross-modal alignment).<\/li>\n<\/ul>\n<\/li>\n\n\n\n<li>Load the dual graph into Neo4j and write the chunk-to-node IDs back into the metadata of the vector entries.<\/li>\n<\/ul>\n\n\n\n<h5 class=\"wp-block-heading\">4) Multi-Stage Retrieval and Orchestration (RAG-Anything)<\/h5>\n\n\n\n<ol class=\"wp-block-list\">\n<li><strong>Structure first<\/strong>: locate candidate sections and figure\/table groups on the graph;<\/li>\n\n\n\n<li><strong>Hybrid recall<\/strong>: run hybrid vector + BM25 recall within the candidate subgraph (to reduce noise);<\/li>\n\n\n\n<li><strong>Cross reranking<\/strong>: apply a cross-modal 
reranker;<\/li>\n\n\n\n<li><strong>Evidence packing<\/strong>: prioritize same-page adjacency, merge cross-page table continuations, and keep captions with their figures.<br>The design goal of RAG-Anything is precisely \u201c<strong>multimodal, multi-stage, cross-modal understanding<\/strong>\u201d; it differs from traditional text-only RAG in that it treats multimodal content as <strong>interconnected knowledge entities<\/strong> and handles them in a unified way.<\/li>\n<\/ol>\n\n\n\n<h5 class=\"wp-block-heading\">5) Generation and Traceability<\/h5>\n\n\n\n<ul class=\"wp-block-list\">\n<li>The LLM receives the Context Pack (partitioned into body text, table summaries, figure captions, and cross-page patches);<\/li>\n\n\n\n<li>It outputs <strong>per-evidence citations<\/strong> (<code>doc_id#page:line_range \/ Fig.x \/ Tab.y[row-range]<\/code>);<\/li>\n\n\n\n<li>The UI provides an \u201cevidence highlight preview\u201d (using MinerU's bbox to map back to the original PDF view).<\/li>\n<\/ul>\n","protected":false},"excerpt":{"rendered":"<p>Today I looked briefly into how to implement a RAG system that integrates MinerU + Chonkie + RAG-Anything: 
[&hellip;]<\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"site-sidebar-layout":"default","site-content-layout":"","ast-site-content-layout":"default","site-content-style":"default","site-sidebar-style":"default","ast-global-header-display":"","ast-banner-title-visibility":"","ast-main-header-display":"","ast-hfb-above-header-display":"","ast-hfb-below-header-display":"","ast-hfb-mobile-header-display":"","site-post-title":"","ast-breadcrumbs-content":"","ast-featured-img":"","footer-sml-layout":"","ast-disable-related-posts":"","theme-transparent-header-meta":"","adv-header-id-meta":"","stick-header-meta":"","header-above-stick-meta":"","header-main-stick-meta":"","header-below-stick-meta":"","astra-migrate-meta-layouts":"set","ast-page-background-enabled":"default","ast-page-background-meta":{"desktop":{"background-color":"","background-image":"","background-repeat":"repeat","background-position":"center center","background-size":"auto","background-attachment":"scroll","background-type":"","background-media":"","overlay-type":"","overlay-color":"","overlay-opacity":"","overlay-gradient":""},"tablet":{"background-color":"","background-image":"","background-repeat":"repeat","background-position":"center center","background-size":"auto","background-attachment":"scroll","background-type":"","background-media":"","overlay-type":"","overlay-color":"","overlay-opacity":"","overlay-gradient":""},"mobile":{"background-color":"","background-image":"","background-repeat":"repeat","background-position":"center center","background-size":"auto","background-attachment":"scroll","background-type":"","background-media":"","overlay-type":"","overlay-color":"","overlay-opacity":"","overlay-gradient":""}},"ast-content-background-meta":{"desktop":{"background-color":"var(--ast-global-color-5)","background-image":"","background-repeat":"repeat","background-position":"center 
center","background-size":"auto","background-attachment":"scroll","background-type":"","background-media":"","overlay-type":"","overlay-color":"","overlay-opacity":"","overlay-gradient":""},"tablet":{"background-color":"var(--ast-global-color-5)","background-image":"","background-repeat":"repeat","background-position":"center center","background-size":"auto","background-attachment":"scroll","background-type":"","background-media":"","overlay-type":"","overlay-color":"","overlay-opacity":"","overlay-gradient":""},"mobile":{"background-color":"var(--ast-global-color-5)","background-image":"","background-repeat":"repeat","background-position":"center center","background-size":"auto","background-attachment":"scroll","background-type":"","background-media":"","overlay-type":"","overlay-color":"","overlay-opacity":"","overlay-gradient":""}},"footnotes":""},"categories":[3,8],"tags":[39,101,76],"class_list":["post-6403","post","type-post","status-publish","format-standard","hentry","category-it","category-tech","tag-ai","tag-old-programmer-in-ai-era","tag-rag"],"rttpg_featured_image_url":null,"rttpg_author":{"display_name":"NullThought","author_link":"https:\/\/nullthought.net\/?author=1"},"rttpg_comment":0,"rttpg_category":"<a href=\"https:\/\/nullthought.net\/?cat=3\" rel=\"category\">IT<\/a> <a href=\"https:\/\/nullthought.net\/?cat=8\" rel=\"category\">Tech<\/a>","rttpg_excerpt":"Today I looked briefly into how to implement a RAG system that integrates MinerU + Chonkie + RAG-Anything: 
&hellip;","_links":{"self":[{"href":"https:\/\/nullthought.net\/index.php?rest_route=\/wp\/v2\/posts\/6403","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/nullthought.net\/index.php?rest_route=\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/nullthought.net\/index.php?rest_route=\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/nullthought.net\/index.php?rest_route=\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/nullthought.net\/index.php?rest_route=%2Fwp%2Fv2%2Fcomments&post=6403"}],"version-history":[{"count":1,"href":"https:\/\/nullthought.net\/index.php?rest_route=\/wp\/v2\/posts\/6403\/revisions"}],"predecessor-version":[{"id":6404,"href":"https:\/\/nullthought.net\/index.php?rest_route=\/wp\/v2\/posts\/6403\/revisions\/6404"}],"wp:attachment":[{"href":"https:\/\/nullthought.net\/index.php?rest_route=%2Fwp%2Fv2%2Fmedia&parent=6403"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/nullthought.net\/index.php?rest_route=%2Fwp%2Fv2%2Fcategories&post=6403"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/nullthought.net\/index.php?rest_route=%2Fwp%2Fv2%2Ftags&post=6403"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}