📚 Weekly AI Paper Digest

Period: 2026-03-02 ~ 2026-03-07 | Selection: the five papers that drew the most attention this week (Top 5)


🏆 This Week's Top 5

| Rank | Paper | ⬆️ | Deep Dive |
|---|---|---|---|
| 🥇 | Utonia: Toward One Encoder for All Point… | 142 | DD-036 |
| 🥈 | Heterogeneous Agent Collaborative Reinfo… | 140 | DD-037 |
| 🥉 | OmniLottie: Generating Vector Animations… | 134 | DD-038 |
| 4. | Helios: Real Real-Time Long Video Genera… | 133 | DD-039 |
| 5. | From Scale to Speed: Adaptive Test-Time … | 130 | DD-040 |

🔍 This Week's Trends

Key Keywords

  • Universal 3D Representation: training a single encoder that spans point clouds from different domains (remote sensing, indoor/outdoor, and so on).
  • Real-time Generation Efficiency: running large-scale video generation models in real time and optimizing inference speed.
  • Test-Time Scaling: applying chain-of-thought (Image-CoT) techniques that spend inference time to raise image editing quality.
  • Vector Animation: generating high-quality animation in the Lottie format, which is lightweight and easy to control.
  • Collaborative RL: a new paradigm in which heterogeneous agents share training data to improve efficiency.

Common Themes

This week's papers show research converging on maximizing **efficiency (speed, resources)** and **universality** while preserving the performance of large-scale models. Across generation and editing tasks for video, images, and 3D data, the dominant advances either handle several domains with a single model or optimize the inference process to reach real-time performance.

Highlights

Utonia is an ambitious attempt to unify every domain under one encoder for 3D point clouds, much as GPT did for text. Helios runs a 14-billion-parameter video model in real time at 19.5 frames per second on a single H100 GPU, an engineering achievement that substantially improves the practical viability of generative models. From Scale to Speed is also notable: it goes beyond plain generation and applies test-time scaling to the goal-directed task of image editing, aiming for accuracy and efficiency together.
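As a rough sanity check on the Helios figures quoted above, the sketch below converts 19.5 frames per second and 14 billion parameters into a per-frame time budget, a frame count for one minute of video, and a weight-memory estimate. The 16-bit weight precision is an assumption made here for illustration; the digest does not state it, and the estimate ignores activations and caches.

```python
FPS = 19.5            # throughput reported in this digest
PARAMS = 14e9         # 14 billion parameters, as reported in this digest
BYTES_PER_PARAM = 2   # ASSUMPTION: 16-bit weights (for example bf16)


def frame_budget_ms(fps: float) -> float:
    """Average wall-clock time available per generated frame, in milliseconds."""
    return 1000.0 / fps


def frames_for(seconds: float, fps: float) -> float:
    """Number of frames produced in the given time at a constant rate."""
    return seconds * fps


def weight_memory_gb(params: float, bytes_per_param: int) -> float:
    """Memory for the weights alone, in decimal gigabytes."""
    return params * bytes_per_param / 1e9


budget_ms = frame_budget_ms(FPS)
frames_per_minute = frames_for(60, FPS)
weights_gb = weight_memory_gb(PARAMS, BYTES_PER_PARAM)

print(round(budget_ms, 1))   # 51.3
print(frames_per_minute)     # 1170.0
print(weights_gb)            # 28.0
```

Under that assumption the weights alone take about 28 GB, which makes a single-GPU deployment plausible, though it says nothing about how Helios actually reaches its speed.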

Practical Implications

Developers and researchers should pay more attention to **techniques that dramatically raise inference speed (e.g., Helios's optimizations, Image-CoT)** than to simply growing parameter counts. In 3D and animation, they should plan around **general-purpose models (e.g., Utonia)** that are not tied to one data type, both to cut labeling costs and to cover diverse domains. They should also recognize the growing demand for generative models that output formats directly usable on the web and on mobile, such as vector-based animation.


📑 Paper Summaries

🥇 1. Utonia: Toward One Encoder for All Point Clouds

arXiv: 2603.03283 | ⬆️ 142 → View Deep Dive | Tags: point-cloud 3d-vision self-supervised-learning transformer unified-model multimodal ptv3

This work trains one unified model on point cloud data from different domains (indoor, outdoor, aerial, CAD, and others), overcoming the gaps between domains and proposing a broadly reusable standard for 3D perception.

📖 Detailed analysis: see → View Deep Dive for the in-depth analysis.


🥈 2. Heterogeneous Agent Collaborative Reinforcement Learning

arXiv: 2603.02604 | ⬆️ 140 → View Deep Dive | Tags: reinforcement-learning multi-agent-system llm collaborative-learning optimization rlhf marl

This paper matters because it proposes a new collaborative reinforcement learning paradigm: agents with different capabilities and architectures share verified rollouts (simulated experience) during training and learn from one another, yet still operate independently at inference time while maximizing efficiency.
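The idea of sharing only verified rollouts across agents can be pictured with a small data structure. The sketch below is an illustration under stated assumptions, not the paper's algorithm: the names `Rollout`, `SharedRolloutPool`, `submit`, and `batch_for` are hypothetical, and "verified" is assumed to mean a reward of 1.0.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass(frozen=True)
class Rollout:
    agent_id: str
    prompt: str
    response: str
    reward: float


@dataclass
class SharedRolloutPool:
    """Holds only rollouts that pass a verifier (hypothetical design)."""
    verifier: Callable[[Rollout], bool]
    items: List[Rollout] = field(default_factory=list)

    def submit(self, rollout: Rollout) -> bool:
        """Add the rollout if it is verified; report whether it was accepted."""
        accepted = self.verifier(rollout)
        if accepted:
            self.items.append(rollout)
        return accepted

    def batch_for(self, agent_id: str) -> List[Rollout]:
        """Training batch for one agent: its own rollouts first, then peers'."""
        own = [r for r in self.items if r.agent_id == agent_id]
        peers = [r for r in self.items if r.agent_id != agent_id]
        return own + peers


# ASSUMPTION: a rollout counts as "verified" when its reward is 1.0.
pool = SharedRolloutPool(verifier=lambda r: r.reward >= 1.0)
pool.submit(Rollout("small", "2+2?", "4", 1.0))
pool.submit(Rollout("large", "2+2?", "5", 0.0))  # rejected, never shared
pool.submit(Rollout("large", "3*3?", "9", 1.0))

small_batch = pool.batch_for("small")
print([r.agent_id for r in small_batch])  # ['small', 'large']
```

Sharing happens only through the pool during training; nothing here couples the agents at inference time, which matches the independence described above.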

📖 Detailed analysis: see → View Deep Dive for the in-depth analysis.


🥉 3. OmniLottie: Generating Vector Animations via Parameterized Lottie Tokens

arXiv: 2603.02138 | ⬆️ 134 → View Deep Dive | Tags: ai-paper ml

📖 Detailed analysis: see → View Deep Dive for the in-depth analysis.


4. Helios: Real Real-Time Long Video Generation Model

arXiv: 2603.04379 | ⬆️ 133 → View Deep Dive | Tags: video-generation real-time long-video efficiency transformer multimodal helios ai-research

This paper presents the first 14-billion-parameter model that generates minute-scale long videos in real time on a single GPU while keeping the footage visually consistent.

📖 Detailed analysis: see → View Deep Dive for the in-depth analysis.


5. From Scale to Speed: Adaptive Test-Time Scaling for Image Editing

arXiv: 2603.00141 | ⬆️ 130 → View Deep Dive | Tags: image-editing test-time-scaling diffusion-model efficiency adaptive-computation cot-chain-of-thought ai-research

This paper proposes a new test-time scaling framework that allocates inference compute dynamically according to the difficulty of each image editing task, more than doubling computational efficiency while maintaining editing quality.
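To make "allocating inference compute by difficulty" concrete, here is a minimal sketch of one generic form of adaptive computation: keep sampling candidates until one clears a quality threshold or the budget runs out, so easy tasks stop early. This is not the paper's framework. The function `adaptive_best_of_n`, the helper `make_task`, and the threshold and scores are all hypothetical, and the candidates are plain numbers standing in for edited images.

```python
from typing import Callable, Iterable, Tuple, TypeVar

T = TypeVar("T")


def adaptive_best_of_n(
    generate: Callable[[], T],
    score: Callable[[T], float],
    threshold: float,
    max_samples: int,
) -> Tuple[T, float, int]:
    """Sample until a candidate scores >= threshold or the budget is spent.

    Returns the best candidate, its score, and the number of samples used.
    """
    if max_samples < 1:
        raise ValueError("max_samples must be at least 1")
    best = generate()
    best_score = score(best)
    used = 1
    while best_score < threshold and used < max_samples:
        candidate = generate()
        candidate_score = score(candidate)
        used += 1
        if candidate_score > best_score:
            best, best_score = candidate, candidate_score
    return best, best_score, used


def make_task(qualities: Iterable[float]) -> Callable[[], float]:
    """Toy generator: each call returns the next pre-set candidate quality."""
    it = iter(qualities)
    return lambda: next(it)


# ASSUMPTION: quality scores lie in [0, 1] and 0.8 is "good enough".
easy = make_task([0.9, 0.2, 0.1])
hard = make_task([0.1, 0.4, 0.3, 0.2])

_, s_easy, n_easy = adaptive_best_of_n(easy, float, threshold=0.8, max_samples=4)
_, s_hard, n_hard = adaptive_best_of_n(hard, float, threshold=0.8, max_samples=4)

print(n_easy, s_easy)  # 1 0.9
print(n_hard, s_hard)  # 4 0.4
```

The easy task uses one sample and the hard task uses the full budget of four, which is the kind of per-task saving the summary describes, here in a deliberately simplified setting.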

📖 Detailed analysis: see → View Deep Dive for the in-depth analysis.


📅 Generated: 2026-03-08 | 🤖 GLM-4.7 Weekly Digest