
DD-020 PaperBanana: Automating Academic Illustration for AI Scientists

arXiv: 2601.23265 | Institution: Google | Upvotes: 137 | Comments: 12 | Rank: This week's Top 5


🍌 PaperBanana: Automating Academic Illustration for AI Scientists (Deep Dive)

Review Status: ✅ Deep Analysis Complete | Target Audience: Junior AI/ML Developers & Researchers | Review Date: 2026-02-02 (Based on paper release)


1. Why does this paper matter?

Today's AutoML systems and "AI scientists" have automated text (paper writing) and code (experiments), but producing illustrations (figures), a core part of any paper, still requires human hands. Existing image generation models fall short on academic accuracy (faithfulness), and code-based approaches (TikZ and similar) are limited in expressiveness.

The paper presents **PaperBanana, a system of five collaborating specialist agents** that reads the method section of a complex paper, looks up reference figures, and then automatically draws figures of publishable quality. Automating that last manual step is what makes the work notable.


2. Understanding the core idea

🎬 Analogy: assembling a "film production crew"

Think of drawing a single figure for a paper as **making a film**. Earlier AI approaches were simply told "shoot the film" in one go, and the result was a mess. PaperBanana addresses this by hiring specialists.

  1. Researcher (Retriever Agent):
    • Goes to the library and brings back well-made existing posters (reference figures) with a feel similar to the paper's topic (e.g., Transformer).
  2. Director & Writer (Planner & Stylist Agent):
    • Reads the source paper and lays out a concrete script and staging plan, such as "use this kind of icon here and keep the background white."
  3. Painter (Visualizer Agent):
    • Draws the actual picture using an image generation model (along the lines of DALL-E or Midjourney).
  4. Critic (Critic Agent):
    • Looks at the drawing and points out problems: "this arrow points the wrong way," "the text is too small."
  5. Repeat (Iterative Refinement):
    • Steps 3 and 4 repeat until the Critic gives a "pass."
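
The draw-and-critique cycle in steps 3 to 5 can be sketched roughly as follows. This is a minimal illustration, not the paper's implementation: the names `refine`, `Critique`, `toy_visualize`, and `toy_critique` are hypothetical, the `max_rounds` cap is an assumption added so the loop always terminates, and the toy functions stand in for real model calls.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Critique:
    passed: bool          # True when the critic accepts the figure
    feedback: List[str]   # concrete complaints to fix in the next round


def refine(plan: str,
           visualize: Callable[[str], str],
           critique: Callable[[str], Critique],
           max_rounds: int = 3) -> str:
    """Redraw until the critic passes the figure or max_rounds redraws are used."""
    figure = visualize(plan)
    for _ in range(max_rounds):
        review = critique(figure)
        if review.passed:
            break
        # Fold the critic's complaints back into the drawing prompt.
        prompt = plan + "\nFix: " + "; ".join(review.feedback)
        figure = visualize(prompt)
    return figure


# Toy stand-ins (assumptions) so the sketch runs without any model access.
def toy_visualize(prompt: str) -> str:
    return f"<figure from: {prompt!r}>"


def toy_critique(figure: str) -> Critique:
    if "Fix:" in figure:
        return Critique(True, [])
    return Critique(False, ["arrow direction is wrong", "text is too small"])


result = refine("Draw the Transformer pipeline", toy_visualize, toy_critique)
print(result)
```

In this toy run the first drawing is rejected, the feedback is appended to the prompt, and the second drawing passes.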

⚙️ Step-by-step operation

The method splits into two parts: a **Linear Planning Phase** and an **Iterative Refinement Loop**.

  1. Input: the paper's method description text ($S$) and a description of the figure ($C$).
  2. Reference retrieval (Retrieval):
    • Pulls the $N$ most similar examples from an existing figure database ($R$). Rather than plain keyword matching, it uses a VLM (vision-language model) to perform logical matching, along the lines of "Is this figure structured as a pipeline?"
    • $$E = \text{VLM}_{\text{Ret}}\left(S, C, \{(S_i, C_i)\}_{E_i \in R}\right)$$
  3. Style optimization and generation: looking at the retrieved reference figures, the system writes a prompt describing a style and layout suited to the current paper.
  4. Feedback loop: once the Visualizer draws the figure, the Critic evaluates **Faithfulness** (content accuracy) and **Aesthetic** (visual polish). If the scores are low, the figure is redrawn.
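
To make the retrieval step concrete, here is a small sketch of selecting the top $N$ references from $R$ for a query $(S, C)$. It rests on a loud assumption: the paper uses a VLM to judge similarity, whereas this example swaps in a toy token-overlap (Jaccard) scorer so it can run offline. The names `retrieve`, `token_overlap`, and `Reference` are hypothetical.

```python
from typing import Callable, List, Tuple

Reference = Tuple[str, str]  # (S_i, C_i): method text and description of a stored figure


def retrieve(S: str, C: str, R: List[Reference],
             score: Callable[[str, str, Reference], float],
             n: int = 2) -> List[Reference]:
    """Return the n references in R that score highest against the query (S, C)."""
    ranked = sorted(R, key=lambda ref: score(S, C, ref), reverse=True)
    return ranked[:n]


def token_overlap(S: str, C: str, ref: Reference) -> float:
    """Toy scorer: Jaccard overlap of lowercase tokens. NOT the paper's VLM."""
    query = set((S + " " + C).lower().split())
    cand = set((ref[0] + " " + ref[1]).lower().split())
    union = query | cand
    return len(query & cand) / len(union) if union else 0.0


R = [
    ("encoder decoder attention layers", "transformer architecture overview"),
    ("reward model policy gradient", "rlhf training pipeline"),
    ("retriever planner critic agents", "multi-agent pipeline overview"),
]
E = retrieve("agents retriever critic loop", "pipeline overview", R, token_overlap, n=2)
print(E[0])
```

Passing the scorer in as a parameter keeps the ranking logic unchanged if the toy function is later replaced by a call to an actual model.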

3. Analysis of experimental results

The paper evaluates performance on PaperBananaBench, a benchmark the authors built themselves.

📊 Performance comparison (by Overall score)

| Method | Model | Faithfulness | Conciseness | Readability | Aesthetic | Overall |
|---|---|---|---|---|---|---|
| Baseline (Vanilla) | GPT-Image-1.5 | 4.5 | 37.5 | 30.0 | 37.0 | 11.5 |
| Baseline (Few-shot) | Nano-Banana-Pro | 41.6 | 49.6 | 37.6 | 60.5 | 41.8 |
| Baseline (Agentic) | Paper2Any | 6.5 | 44.0 | 20.5 | 40.0 | 8.5 |
| PaperBanana (Ours) | Nano-Banana-Pro | 45.8 | 80.7 | 51.4 | 72.1 | 60.2 |
| Human | - | 50.0 | 50.0 | 50.0 | 50.0 | 50.0 |
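
As a quick sanity check on the table, the per-metric gaps can be computed directly. The snippet below only does arithmetic on the numbers reported above; the variable and function names are illustrative.

```python
METRICS = ["Faithfulness", "Conciseness", "Readability", "Aesthetic", "Overall"]

# Scores copied from the table above.
HUMAN = [50.0, 50.0, 50.0, 50.0, 50.0]
PAPERBANANA = [45.8, 80.7, 51.4, 72.1, 60.2]
FEW_SHOT = [41.6, 49.6, 37.6, 60.5, 41.8]


def deltas(a, b):
    """Per-metric difference a - b, rounded to one decimal place."""
    return {m: round(x - y, 1) for m, x, y in zip(METRICS, a, b)}


vs_human = deltas(PAPERBANANA, HUMAN)
vs_few_shot = deltas(PAPERBANANA, FEW_SHOT)

for metric in METRICS:
    print(f"{metric:<13} vs human: {vs_human[metric]:+.1f}   "
          f"vs few-shot: {vs_few_shot[metric]:+.1f}")
```

The gaps show that the lead over the human reference comes mainly from Conciseness and Aesthetic, while Faithfulness is still below the human score.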

🔍 Notable results

  1. A clear lead in the overall score (60.2 vs 50.0):
    • The most striking point is that PaperBanana exceeds the human average score (50.0). From the perspective of the critic (the VLM judge), the AI-generated figures come across as clearer and cleaner (more aesthetic). Humans still lead on conveying complex logic (Faithfulness 50.0 vs 45.8), but PaperBanana wins on overall quality.
  2. A sharp jump in Conciseness (80.7):
    • Earlier AI approaches added a lot of unnecessary decoration, whereas PaperBanana is strong at removing unneeded elements and focusing on the essentials, as academic use requires.
  3. The existing agentic approach (Paper2Any) falls far behind (8.5):
    • Simply using agents does not solve the problem. The result suggests that how well **reference figures (retrieval)** are used is a key driver of performance.

4. Limitations and future research directions

🛑 Limitations stated or implied by the authors

  • Reliability of the VLM judge (VLM-as-a-Judge Reliability):
    • Evaluation relies on another VLM (Gemini-3-Pro), and that judge may not be perfect. The paper notes that it went through a two-stage validation process to check this.
  • Constraints on complex visual elements:
    • Figures that need very complex 3D structures or many specialized icons may still be hard to generate, and broken text rendering may persist.

🚀 Possible improvements

  • Incorporating user feedback (Human-in-the-loop):
    • Today the Critic Agent makes the call. An interactive feature that lets the actual user (the researcher) step in directly with "fix just this part" would round the system out.

5. Practical applicability

💼 Where it can be applied right away

  • AI labs / universities:
    • It can substantially cut paper-writing time. It is especially useful for quickly producing overview figures for the abstract or introduction when finishing a draft.
  • Technical bloggers / documentation writers:
    • It can quickly generate technical architecture diagrams that make documents easier to read.

⚙️ Required resources

  • GPU: Because the system needs a high-performance VLM (vision-language model) and an image generation model, A100- or H100-class GPU resources would likely be required. (PaperBanana itself is a framework, so if the models are served through an API, no local GPU is needed.)
  • Data: Building a separate high-quality figure dataset (reference set $R$) for your own research field is likely to improve results further.

6. Background knowledge for understanding this paper

  1. VLM (Vision-Language Model): a model that understands and generates over images and text together. (Examples: GPT-4o, Gemini Pro Vision)
  2. Agentic AI (AI agents): an AI system that, given a single user instruction, makes its own plan and uses tools to reach the goal.
  3. TikZ: the best-known tool for drawing complex figures as code in LaTeX documents. (Hard to learn, but the output quality is high.)
  4. In-context Learning (few-shot learning): a technique in which the model picks up a pattern from a few examples shown in the prompt, without any update to its parameters.
  5. RAG (Retrieval-Augmented Generation): an approach in which the model retrieves external knowledge (such as references) and uses it to generate its answer. Here it is used to fetch "similar figure examples."
  6. Diffusion Model: a core technique behind recent image generation AI, which progressively recovers an image starting from noise.
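
Items 4 and 5 combine naturally: retrieved examples are placed into the prompt so the model can imitate them without retraining. The sketch below shows one way such a few-shot prompt could be assembled. The function name `build_few_shot_prompt`, the prompt wording, and the sample texts are all assumptions made for illustration, not the prompts used in the paper.

```python
from typing import List, Tuple

Example = Tuple[str, str]  # (method text, figure description) of a reference figure


def build_few_shot_prompt(examples: List[Example], method_text: str) -> str:
    """Assemble a prompt that shows worked examples before the new request."""
    parts = ["Describe a figure for each method section."]
    for i, (text, description) in enumerate(examples, start=1):
        parts.append(f"Example {i}\nMethod: {text}\nFigure: {description}")
    # The final block leaves 'Figure:' open for the model to complete.
    parts.append(f"Now your turn\nMethod: {method_text}\nFigure:")
    return "\n\n".join(parts)


examples = [
    ("An encoder feeds a decoder through attention.",
     "Two stacked blocks joined by an arrow labeled 'attention'."),
    ("A retriever passes documents to a generator.",
     "A left-to-right pipeline with two boxes and one arrow."),
]
prompt = build_few_shot_prompt(
    examples, "Five agents plan, draw, and critique a figure.")
print(prompt)
```

No model weights change here: the examples influence the output only by being present in the prompt text.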

📚 Related Deep Dives this week

| Rank | Paper | Deep Dive |
|---|---|---|
| 🥇 | Green-VLA: Staged Vision-Language-A… | DD-017 |
| 🥈 | ERNIE 5.0 Technical Report | DD-016 |
| 🥉 | Kimi K2.5: Visual Agentic Intellige… | DD-018 |
| 4. | Vision-DeepResearch: Incentivizing … | DD-019 |
| 5. | PaperBanana: Automating Academic Il… | 📍 This document |

📅 Generated: 2026-02-08 | 🤖 GLM-4.7 Deep Dive