<?xml version="1.0" encoding="utf-8" standalone="yes"?><rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:content="http://purl.org/rss/1.0/modules/content/"><channel><title>Model Evaluation on 杨の草原</title><link>https://thinkless-github-io.pages.dev/tags/%E6%A8%A1%E5%9E%8B%E8%AF%84%E6%B5%8B/</link><description>Recent content in Model Evaluation on 杨の草原</description><generator>Hugo</generator><language>zh-CN</language><lastBuildDate>Mon, 02 Feb 2026 16:27:09 +0800</lastBuildDate><atom:link href="https://thinkless-github-io.pages.dev/tags/%E6%A8%A1%E5%9E%8B%E8%AF%84%E6%B5%8B/index.xml" rel="self" type="application/rss+xml"/><item><title>Reinforcement Learning from Human Feedback (RLHF) 4</title><link>https://thinkless-github-io.pages.dev/posts/%E5%9F%BA%E4%BA%8E%E4%BA%BA%E7%B1%BB%E5%8F%8D%E9%A6%88%E7%9A%84%E5%BC%BA%E5%8C%96%E5%AD%A6%E4%B9%A0rlhf4/</link><pubDate>Mon, 02 Feb 2026 16:27:09 +0800</pubDate><guid>https://thinkless-github-io.pages.dev/posts/%E5%9F%BA%E4%BA%8E%E4%BA%BA%E7%B1%BB%E5%8F%8D%E9%A6%88%E7%9A%84%E5%BC%BA%E5%8C%96%E5%AD%A6%E4%B9%A0rlhf4/</guid><description>Don't be fooled by the "SOTA" benchmark scores that large models flaunt everywhere! Does a high score really mean a model is useful? Evaluating an RLHF model involves far more than pass rates. This post lays out an alignment evaluation framework centered on "HHH" and breaks down the trade-off between reward score and KL divergence during training. From the experimental design of human evaluation, to denoising techniques for automated benchmarks, to adversarial validation through red teaming, it offers an end-to-end evaluation guide that runs from fine-tuning monitoring to safe deployment.</description></item><item><title>Prompt Tuning</title><link>https://thinkless-github-io.pages.dev/posts/%E6%8F%90%E7%A4%BA%E8%AF%8D%E8%B0%83%E4%BC%98/</link><pubDate>Sat, 30 Aug 2025 10:17:19 +0800</pubDate><guid>https://thinkless-github-io.pages.dev/posts/%E6%8F%90%E7%A4%BA%E8%AF%8D%E8%B0%83%E4%BC%98/</guid><description>A prompt tuning workflow: score the prompt against 35 criteria first, then rewrite it iteratively based on the feedback. Well suited to evaluating and polishing complex prompts.</description></item><item><title>Training Datasets and Performance Evaluation</title><link>https://thinkless-github-io.pages.dev/posts/%E8%AE%AD%E7%BB%83%E6%95%B0%E6%8D%AE%E9%9B%86%E4%B8%8E%E6%80%A7%E8%83%BD%E8%AF%84%E6%B5%8B/</link><pubDate>Tue, 29 Apr 2025 21:29:21 +0800</pubDate><guid>https://thinkless-github-io.pages.dev/posts/%E8%AE%AD%E7%BB%83%E6%95%B0%E6%8D%AE%E9%9B%86%E4%B8%8E%E6%80%A7%E8%83%BD%E8%AF%84%E6%B5%8B/</guid><description>A guide to training datasets and evaluation for large models: a roundup of Chinese dataset resources, data processing methods, and model performance evaluation metrics. A practical tutorial on building high-quality training data.</description></item></channel></rss>