📚 Weekly AI Paper Digest

Period: 2026-01-26 ~ 2026-01-31 | Selection: the five papers that drew the most attention this week


๐Ÿ† ์ด๋ฒˆ ์ฃผ Top 5

Rank | Paper | ⬆️ | Deep Dive
🥇 | Can LLMs Clean Up Your Mess? A Survey of… | 181 | DD-001
🥈 | LongCat-Flash-Thinking-2601 Technical Re… | 171 | DD-002
🥉 | Idea2Story: An Automated Pipeline for Tr… | 149 | DD-003
4. | daVinci-Dev: Agent-native Mid-training f… | 123 | DD-004
5. | AgentDoG: A Diagnostic Guardrail Framewo… | 120 | DD-005

📑 Paper Summaries

🥇 1. Can LLMs Clean Up Your Mess? A Survey of Application-Ready Data Preparation with LLMs

arXiv: 2601.17058 | ⬆️ 181 | → View Deep Dive
Tags: llm data-preparation data-cleaning data-integration survey prompt-engineering entity-matching data-centric-ai

This paper matters because it synthesizes hundreds of studies and systematically lays out how large language models (LLMs) can automate and transform data preparation, a complex and costly process that has traditionally depended on manual work.

📖 Detailed analysis: see → View Deep Dive for the in-depth analysis.


🥈 2. LongCat-Flash-Thinking-2601 Technical Report

arXiv: 2601.16725 | ⬆️ 171 | → View Deep Dive
Tags: longcat-flash-thinking mixture-of-experts agentic-ai reinforcement-learning heavy-thinking test-time-scaling llm-reasoning

This report matters because it uses a 560-billion-parameter Mixture-of-Experts (MoE) architecture to deliver state-of-the-art (SOTA) agentic reasoning on complex real-world problems that require tool use and interaction with external environments, and then pushes that ability further with "Heavy Thinking", a technique for scaling test-time compute.

📖 Detailed analysis: see → View Deep Dive for the in-depth analysis.


🥉 3. Idea2Story: An Automated Pipeline for Transforming Research Concepts into Complete Scientific Narratives

arXiv: 2601.20833 | ⬆️ 149 | → View Deep Dive
Tags: autonomous-science llm-agents knowledge-graph research-automation offline-computation scientific-discovery idea-generation

This paper matters because it proposes Idea2Story, a pipeline that moves research-automation systems away from slow, inefficient "real-time online reasoning". It instead draws on a knowledge graph built in advance, which reduces both cost and hallucination while turning vague ideas into complete scientific narratives.

📖 Detailed analysis: see → View Deep Dive for the in-depth analysis.


4. daVinci-Dev: Agent-native Mid-training for Software Engineering

arXiv: 2601.18418 | ⬆️ 123 | → View Deep Dive
Tags: ai-agent llm software-engineering mid-training fine-tuning code-generation scalability mlops

This paper matters because it moves beyond costly post-training and proposes a new paradigm: teaching the foundational abilities behind agent-like behavior on large-scale data from the mid-training stage onward, with the aim of building more scalable and more capable software engineering agents.

📖 Detailed analysis: see → View Deep Dive for the in-depth analysis.


5. AgentDoG: A Diagnostic Guardrail Framework for AI Agent Safety and Security

arXiv: 2601.18491 | ⬆️ 120 | → View Deep Dive
Tags: ai-agent safety guardrail llm security trajectory-analysis atbench diagnostic-framework

This paper matters because it goes beyond simple output filtering for AI agents. It proposes what it presents as the first diagnostic guardrail framework (AgentDoG), together with a benchmark (ATBench), that examines the full execution process (the trajectory) in detail and classifies a problem along three dimensions: where it arose, how it arose, and what went wrong.

📖 Detailed analysis: see → View Deep Dive for the in-depth analysis.


📅 Generated: 2026-02-02 | 🤖 GLM-4.7 Weekly Digest