noviceforever

MoE (Mixture-of-Experts)

  • MoE Overview
  • MoE Model Comparison and Summary of Key Techniques
  • Distributed Training Fundamentals
  • Expert Parallelism
  • [Optional] NVSHMEM (NVIDIA Shared Memory)
  • AWS Networking for Distributed Training: EFA (Elastic Fabric Adapter)
  • Training MoE Models Efficiently on AWS
  • Distributed Training Strategies
  • Distributed Training Collaboration Guide and Checklist for ML Engineers and Infrastructure Engineers
  • Inference Optimization Overview (Key Techniques for Prefill and Decoding)
  • Model Serving and Optimization Guide Using SageMaker Large Model Inference (LMI)