{"id":6430,"date":"2025-10-22T09:51:24","date_gmt":"2025-10-22T01:51:24","guid":{"rendered":"https:\/\/nullthought.net\/?p=6430"},"modified":"2025-10-22T09:54:30","modified_gmt":"2025-10-22T01:54:30","slug":"sinq%ef%bc%9a%e6%97%a0%e6%a0%a1%e5%87%86%e5%9d%87%e5%8c%80%e9%87%8f%e5%8c%96","status":"publish","type":"post","link":"https:\/\/nullthought.net\/?p=6430","title":{"rendered":"SINQ: Calibration-Free Uniform Quantization"},
"content":{"rendered":"\n<p><a href=\"https:\/\/nullthought.net\/?p=4092\" target=\"_blank\" rel=\"noreferrer noopener\">Post-training quantization<\/a> (PTQ) is the most common compression route for large language models (LLMs) at deployment time. At weight bit widths of \u22644 bits, however, classic uniform, calibration-free methods such as round-to-nearest (RTN) often degrade markedly because of outliers: a single scale factor has to serve both the extreme values and the ordinary weights, so an entire row or column is dragged down along with the outlier, and accuracy is capped by the shared scale. The paper <strong><a href=\"https:\/\/arxiv.org\/abs\/2509.22944\" target=\"_blank\" rel=\"noreferrer noopener\">SINQ: Sinkhorn-Normalized Quantization for Calibration-Free Low-Precision LLM Weights<\/a><\/strong> targets exactly this calibration-free, uniform-quantization setting. It aims to stay simple to implement, fast, and architecture-agnostic while substantially narrowing the quality gap to calibrated methods and to non-uniform formats such as NF4.<\/p>\n\n\n\n<p>The paper's core ideas and contributions are:<br>1) A \u201cdual-scaling\u201d parameterization for weight quantization: scale vectors s<sup>\u2192<\/sup> (row-wise) and t<sup>\u2192<\/sup> (column-wise) are introduced along both dimensions of the matrix, so that the impact of an outlier can be traded off between its row and its column, avoiding the collateral damage that a single scale cannot resolve.<br>2) A proxy metric for \u201cquantizability\u201d, the matrix imbalance: defined as the ratio of the largest to the smallest standard deviation over all rows and columns, and used as the optimization target.<br>3) A fast iterative method inspired by Sinkhorn\u2013Knopp (Sinkhorn-Normalized Quantization, SINQ): instead of normalizing row and column sums, it normalizes row and column standard deviations, alternating between rows and columns until both converge to a common scale, which lowers the imbalance.<br>4) Systematic experiments: on the Qwen3 family, DeepSeek-V2.5, and other models, SINQ clearly outperforms state-of-the-art baselines for calibration-free uniform quantization, and it can be combined with calibration (AWQ) and with non-uniform quantization (NF4); its quantization time is close to that of RTN.<\/p>\n\n\n\n<p>The authors are Lorenz K. M\u00fcller, Philippe Bich, Jiawei Zhuang, Ahmet \u00c7elik, Luca Benfenati, and Lukas Cavigelli, of Huawei.<\/p>\n\n\n\n<hr class=\"wp-block-separator has-alpha-channel-opacity\"\/>\n\n\n\n<p>SINQ on GitHub: <a href=\"https:\/\/github.com\/huawei-csl\/SINQ\" target=\"_blank\" rel=\"noreferrer noopener\">https:\/\/github.com\/huawei-csl\/SINQ<\/a><\/p>\n","protected":false},
"excerpt":{"rendered":"<p>Post-training quantization (PTQ) is the most common compression route for large language models (LLMs) at deployment time. [&hellip;]<\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"site-sidebar-layout":"default","site-content-layout":"","ast-site-content-layout":"default","site-content-style":"default","site-sidebar-style":"default","ast-global-header-display":"","ast-banner-title-visibility":"","ast-main-header-display":"","ast-hfb-above-header-display":"","ast-hfb-below-header-display":"","ast-hfb-mobile-header-display":"","site-post-title":"","ast-breadcrumbs-content":"","ast-featured-img":"","footer-sml-layout":"","ast-disable-related-posts":"","theme-transparent-header-meta":"","adv-header-id-meta":"","stick-header-meta":"","header-above-stick-meta":"","header-main-stick-meta":"","header-below-stick-meta":"","astra-migrate-meta-layouts":"set","ast-page-background-enabled":"default","ast-page-background-meta":{"desktop":{"background-color":"","background-image":"","background-repeat":"repeat","background
-position":"center center","background-size":"auto","background-attachment":"scroll","background-type":"","background-media":"","overlay-type":"","overlay-color":"","overlay-opacity":"","overlay-gradient":""},"tablet":{"background-color":"","background-image":"","background-repeat":"repeat","background-position":"center center","background-size":"auto","background-attachment":"scroll","background-type":"","background-media":"","overlay-type":"","overlay-color":"","overlay-opacity":"","overlay-gradient":""},"mobile":{"background-color":"","background-image":"","background-repeat":"repeat","background-position":"center center","background-size":"auto","background-attachment":"scroll","background-type":"","background-media":"","overlay-type":"","overlay-color":"","overlay-opacity":"","overlay-gradient":""}},"ast-content-background-meta":{"desktop":{"background-color":"var(--ast-global-color-5)","background-image":"","background-repeat":"repeat","background-position":"center center","background-size":"auto","background-attachment":"scroll","background-type":"","background-media":"","overlay-type":"","overlay-color":"","overlay-opacity":"","overlay-gradient":""},"tablet":{"background-color":"var(--ast-global-color-5)","background-image":"","background-repeat":"repeat","background-position":"center center","background-size":"auto","background-attachment":"scroll","background-type":"","background-media":"","overlay-type":"","overlay-color":"","overlay-opacity":"","overlay-gradient":""},"mobile":{"background-color":"var(--ast-global-color-5)","background-image":"","background-repeat":"repeat","background-position":"center 
center","background-size":"auto","background-attachment":"scroll","background-type":"","background-media":"","overlay-type":"","overlay-color":"","overlay-opacity":"","overlay-gradient":""}},"footnotes":""},"categories":[8],"tags":[39,86],"class_list":["post-6430","post","type-post","status-publish","format-standard","hentry","category-tech","tag-ai","tag-llm"],"rttpg_featured_image_url":null,"rttpg_author":{"display_name":"NullThought","author_link":"https:\/\/nullthought.net\/?author=1"},"rttpg_comment":0,"rttpg_category":"<a href=\"https:\/\/nullthought.net\/?cat=8\" rel=\"category\">Tech<\/a>","rttpg_excerpt":"Post-training quantization (PTQ) is the most common compression route for large language models (LLMs) at deployment time.&hellip;","_links":{"self":[{"href":"https:\/\/nullthought.net\/index.php?rest_route=\/wp\/v2\/posts\/6430","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/nullthought.net\/index.php?rest_route=\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/nullthought.net\/index.php?rest_route=\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/nullthought.net\/index.php?rest_route=\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/nullthought.net\/index.php?rest_route=%2Fwp%2Fv2%2Fcomments&post=6430"}],"version-history":[{"count":3,"href":"https:\/\/nullthought.net\/index.php?rest_route=\/wp\/v2\/posts\/6430\/revisions"}],"predecessor-version":[{"id":6433,"href":"https:\/\/nullthought.net\/index.php?rest_route=\/wp\/v2\/posts\/6430\/revisions\/6433"}],"wp:attachment":[{"href":"https:\/\/nullthought.net\/index.php?rest_route=%2Fwp%2Fv2%2Fmedia&parent=6430"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/nullthought.net\/index.php?rest_route=%2Fwp%2Fv2%2Fcategories&post=6430"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/nullthought.net\/index.php?rest_route=%2Fwp%2Fv2%2Ftags&post=6430"}],"cu
ries":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}