René's URL Explorer Experiment


Title: Xiaonan Nie’s Home

Open Graph Title: Xiaonan Nie’s Home

Open Graph Description: About me

Open Graph URL: https://codecaution.github.io/


Domain: codecaution.github.io

Meta tags:

og:locale: en-US
og:site_name: Xiaonan Nie's Home
HandheldFriendly: True
MobileOptimized: 320
(unnamed): About me
msapplication-TileColor: #000000
msapplication-TileImage: https://codecaution.github.io/images/mstile-144x144.png?v=M44lzPylqQ
msapplication-config: https://codecaution.github.io/images/browserconfig.xml?v=M44lzPylqQ
theme-color: #ffffff
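
As a sketch of how a report like this can be produced, the snippet below pulls the page title and the meta key/value pairs from the page. The stack (requests plus BeautifulSoup) is an assumption; the crawler actually used in this experiment is not identified.

```python
# Hypothetical metadata-extraction sketch; the experiment's real crawler is unknown.
import requests
from bs4 import BeautifulSoup

resp = requests.get("https://codecaution.github.io/", timeout=10)
soup = BeautifulSoup(resp.text, "html.parser")

print("Title:", soup.title.string if soup.title else "(none)")
for tag in soup.find_all("meta"):
    # Open Graph tags keep their key in "property"; most other meta tags use "name".
    key = tag.get("property") or tag.get("name")
    if tag.get("content"):
        print(f"{key}: {tag['content']}")
```

A meta tag that carries neither a property nor a name attribute comes out of this loop with a key of None, which is the most plausible origin of the unnamed "About me" entry above.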

Links:

Xiaonan Nie's Home <https://codecaution.github.io/>
Github <https://github.com/codecaution>
Google Scholar <https://scholar.google.com/citations?user=99LfmxYAAAAJ&hl=zh-CN&oi=ao>
Prof. Bin Cui <http://net.pku.edu.cn/~cuibin/>
the 1st Google MoE workshop <https://rsvp.withgoogle.com/events/googleworkshopsparsityadaptivecomputation-2022/agenda>
NVIDIA's GPU Technology Conference (GTC) 2024 <https://www.nvidia.com/en-us/on-demand/session/gtc24-s61691/>
HETU <https://github.com/PKU-DAIR/Hetu>
Angel-PTM <https://cloud.tencent.com/developer/article/2245528>
Baichuan-AI <https://www.baichuan-ai.com/home>
GTC 2024 <https://www.nvidia.com/en-us/on-demand/session/gtc24-s61691/>
2021 Synced Machine Intelligence TOP-10 Open Source Awards <https://www.jiqizhixin.com/awards/2021/events>
Pop SOTA! List for AI Developers 2021 <https://mp.weixin.qq.com/s/jHkF9UpgEn1MLZpRH2FOaA>
2021 CCF BDCI Contest <https://mp.weixin.qq.com/s/hSoDMVMZApQxaiNqh2jUSg>
HunYuan-NLP 1T, Top-1 model in CLUE <https://cluebenchmarks.com/rank.html>
PDF <https://arxiv.org/abs/2509.20427>
PDF <https://arxiv.org/abs/2506.09113>
PDF <https://arxiv.org/abs/2505.14683>
PDF <https://arxiv.org/abs/2309.10305>
Efficiently Training 7B LLM with 1 Million Sequence Length on 8 GPUs <https://arxiv.org/abs/2407.12117>
Malleus: Straggler-Resilient Hybrid Parallel Training of Large-scale Models via Malleable Data and Model Parallelization <https://arxiv.org/abs/2410.13333>
PQCache: Product Quantization-based KVCache for Long Context LLM Inference <https://arxiv.org/abs/2407.12820>
ByteScale: Efficient Scaling of LLM Training with a 2048K Context Length on More Than 12,000 GPUs <https://arxiv.org/abs/2502.21231>
NetMoE: Accelerating MoE Training through Dynamic Sample Placement <https://openreview.net/forum?id=1qP3lsatCR>
DataSculpt: A Holistic Data Management Framework for Long-Context LLMs Training <https://arxiv.org/abs/2409.00997>
LSH-MoE: Communication-efficient MoE Training via Locality-Sensitive Hashing <https://arxiv.org/abs/2411.08446>
Improving Automatic Parallel Training via Balanced Memory Workload Optimization <https://arxiv.org/abs/2307.02031>
FlexMoE: Scaling Large-scale Sparse Pre-trained Model Training via Dynamic Device Placement <https://arxiv.org/abs/2304.03946>
Angel-PTM: A Scalable and Economical Large-scale Pre-training System in Tencent <https://arxiv.org/pdf/2303.02868.pdf>
Galvatron: Efficient Transformer Training over Multiple GPUs Using Automatic Parallelism <https://www.vldb.org/pvldb/vol16/p470-miao.pdf>
OSDP: Optimal Sharded Data Parallel for Distributed Deep Learning <https://arxiv.org/abs/2209.13258>
TSPLIT: Fine-grained GPU Memory Management for Efficient DNN Training via Tensor Splitting <https://ieeexplore.ieee.org/document/9835178>
EvoMoE: An Evolutional Mixture-of-Experts Training Framework via Dense-To-Sparse Gate <https://arxiv.org/abs/2112.14397>
Hetu: A highly efficient automatic parallel distributed deep learning system <http://scis.scichina.com/en/2023/117101.pdf>
HET: Scaling out Huge Embedding Model Training via Cache-enabled Distributed Framework <https://dl.acm.org/doi/10.14778/3489496.3489511>
HET-GMP: A Graph-based System Approach to Scaling Large Embedding Model Training <https://dl.acm.org/doi/10.1145/3514221.3517902>
Heterogeneity-Aware Distributed Machine Learning Training via Partial Reduce <https://dl.acm.org/doi/10.1145/3448016.3452773>
Sitemap <https://codecaution.github.io/sitemap/>
GitHub <http://github.com/codecaution>
Feed <https://codecaution.github.io/feed.xml>
Jekyll <http://jekyllrb.com>
AcademicPages <https://github.com/academicpages/academicpages.github.io>
Minimal Mistakes <https://mademistakes.com/work/minimal-mistakes-jekyll-theme/>
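
A minimal sketch of the link-extraction side, under the same assumed requests-plus-BeautifulSoup stack. urljoin resolves relative hrefs against the page URL, which is also how a malformed href (for example, one accidentally wrapped in parentheses) can end up prefixed with the page's own domain, as in the EvoMoE entry as it was originally crawled.

```python
# Hypothetical link-extraction sketch; anchor text plus resolved target URL.
import requests
from bs4 import BeautifulSoup
from urllib.parse import urljoin

base = "https://codecaution.github.io/"
soup = BeautifulSoup(requests.get(base, timeout=10).text, "html.parser")

for a in soup.find_all("a", href=True):
    text = a.get_text(strip=True) or "(no text)"
    # A well-formed absolute href passes through urljoin unchanged; a relative
    # or malformed one is joined onto the base URL.
    print(f"{text} <{urljoin(base, a['href'])}>")
```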

Viewport: width=device-width, initial-scale=1.0


URLs of crawlers that visited me.