Large Models

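Large Models, Large Model Fundamentals

Introduction to Distributed Training for Large Models

crabboss, July 11, 2024 (0 comments)

Training large models requires so much data, so many parameters, and computation…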


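Large Models, Large Model Quantization

Introduction to Large Model Quantization

crabboss, July 11, 2024 (0 comments)

Reduce the GPU memory consumption of large models without excessively degrading model performance…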


Large Models, Large Model Fundamentals

The Evolution of Optimizers

crabboss, July 11, 2024 (0 comments)

Perhaps everyone has gotten used to using Adam and AdamW…


Large Models, Large Model Fundamentals

FlashAttention – How It Works

crabboss, July 10, 2024 (0 comments)

FlashAttention is a technique that leverages soft…


Large Models, Large Model Fundamentals

How AMP Works – Automatic Mixed Precision

crabboss, July 10, 2024 (0 comments)

AMP – automatic mixed precision, large…


Large Models, Large Model Fundamentals

DeepSpeed – Getting Started

crabboss, July 10, 2024 (0 comments)

DeepSpeed, with its solid practicality and ease of use…


Large Model Pre-training

OLMO Pre-training Data Processing

crabboss, July 5, 2024 (0 comments)

The basic approach to pre-training data processing for large models: 1 …


Large Model Pre-training

A First Look at OLMO Pre-training

crabboss, July 5, 2024 (0 comments)

OLMO is a very active player in the open-source community; they open…


Large Model Fundamentals, Uncategorized

Compression Ratios of Different Tokenizers

crabboss, July 5, 2024 (0 comments)

While doing pre-training, I noticed that different tokeniz…


Large Model Fundamentals

Scheduler: Learning Rate Schedulers

crabboss, June 30, 2024 (0 comments)

Sometimes transformer or tor…

