
Huawei claims an AI training approach superior to DeepSeek's, thanks to its Ascend processors.

Huawei's advancements in Artificial Intelligence (AI) model architecture could be substantial, as the company aims to lessen its dependence on American technologies.


Update on Huawei's Advanced AI Breakthrough

A game-changing paper from Huawei's Pangu team was unveiled recently. Written by a core team of 22 contributors and 56 associated researchers, it introduces the concept of Mixture of Grouped Experts (MoGE), an upgrade on the Mixture of Experts (MoE) method that has been pivotal in DeepSeek's affordable AI models.

Although MoE keeps execution costs low for models with massive parameter counts and offers heightened learning capacity, it can run into inefficiencies, according to the paper. The problem stems from the unbalanced activation of the so-called experts, which hurts performance when the model runs on multiple devices simultaneously.

In contrast, the revamped MoGE method groups the experts during selection and distributes the workload more evenly among them, avoiding the inefficiencies often seen in MoE.

In the realm of AI, "experts" refer to specialized sub-models or components within a broader model, each responsible for managing specific tasks or types of data. By leveraging this diverse expertise, the system can achieve enhanced performance overall.
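
To make the idea concrete, here is a minimal, hypothetical sketch of a standard MoE layer in Python/NumPy: a small gating network scores every expert for an input token, only the top-scoring experts are actually run, and their outputs are mixed by the gate weights. The dimensions, weights, and two-expert routing below are illustrative assumptions, not details from Huawei's or DeepSeek's models.

```python
import numpy as np

rng = np.random.default_rng(0)
d_model, num_experts, top_k = 16, 4, 2   # toy sizes, chosen only for illustration

# Each "expert" is a specialised sub-network; here it is just one small weight matrix.
expert_weights = [rng.normal(size=(d_model, d_model)) * 0.1 for _ in range(num_experts)]
router_weights = rng.normal(size=(d_model, num_experts)) * 0.1   # the gating network

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def moe_layer(token):
    """Score all experts for this token, run only the top_k of them, mix their outputs."""
    gate = softmax(token @ router_weights)          # one score per expert
    chosen = np.argsort(-gate)[:top_k]              # indices of the best-scoring experts
    weights = gate[chosen] / gate[chosen].sum()     # renormalise over the chosen experts
    return sum(w * (token @ expert_weights[i]) for w, i in zip(weights, chosen))

token = rng.normal(size=d_model)
print(moe_layer(token).shape)   # (16,): same shape as the input, but only 2 of 4 experts ran
```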

[Embedded video: Nvidia's CEO Huang highlights China as a crucial market during his Beijing visit, remarks on the US AI chip ban]


The Edge of MoGE: Balancing Experts' Workloads for AI Excellence

  • Balanced Expert Workloads: MoGE groups the experts during selection, ensuring a fair distribution of workload across devices during parallel operation. This leads to superior efficiency over MoE, where certain experts are activated far more often than others, creating bottlenecks (see the sketch after this list).
  • Boosted Throughput: By redistributing the computational load more evenly, MoGE can substantially enhance the performance of AI models, particularly in the crucial inference phase, which is vital for real-time applications.
  • Improved Scalability: MoGE proves more suitable for distributed computing environments, as it ensures each device processes a fair share of workload, amplifying overall system effectiveness when multiple devices are employed.
  • Customized for Specific Hardware: MoGE can be customized for specific hardware configurations, like Huawei's Ascend NPUs, enabling more efficient training and inference processes tailored to the capabilities of the underlying hardware.
  • Scalable Large Language Models (LLMs): MoGE is particularly advantageous for implementing complex tasks in LLMs by utilizing a diverse set of specialized sub-models or "experts," grouped for superior performance.
  • Cost-Effective AI Training: By enhancing the efficiency of AI model training, MoGE can diminish the expenses associated with large-scale AI model development and deployment, making it an essential technique for companies aiming to optimize their AI infrastructure.
  • Hybrid Approaches: MoGE supports hybrid approaches in AI, enabling the combination of multiple techniques for better results than a single approach like MoE alone.
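
The balancing claim is easiest to see in a small simulation. The following sketch is an illustration under simplified assumptions (random router scores with a popularity bias, one expert chosen per group), not Huawei's actual Pangu implementation: it compares classic global top-k routing with grouped top-k routing and counts how many token assignments land on each group, which stands in for the per-device workload.

```python
import numpy as np

rng = np.random.default_rng(0)

num_experts = 8        # total experts in the layer
num_groups = 4         # hypothetical: one group of experts per device
k_total = 4            # experts activated per token
tokens = 1000

# Router scores with a per-expert bias, so some experts are systematically "popular".
expert_bias = rng.normal(size=num_experts) * 2.0
scores = rng.normal(size=(tokens, num_experts)) + expert_bias

def moe_topk(scores, k):
    """Classic MoE routing: global top-k per token; usage can concentrate on a few experts."""
    return np.argsort(-scores, axis=1)[:, :k]

def moge_grouped_topk(scores, num_groups, k):
    """Grouped routing: pick k // num_groups experts inside every group,
    so every group (device) serves the same number of experts per token."""
    experts_per_group = scores.shape[1] // num_groups
    k_per_group = k // num_groups
    picks = []
    for g in range(num_groups):
        block = scores[:, g * experts_per_group:(g + 1) * experts_per_group]
        top = np.argsort(-block, axis=1)[:, :k_per_group] + g * experts_per_group
        picks.append(top)
    return np.concatenate(picks, axis=1)

def per_group_load(selection, num_groups, experts_per_group):
    """Count token assignments per group -- a stand-in for per-device workload."""
    counts = np.bincount(selection.ravel(), minlength=num_groups * experts_per_group)
    return counts.reshape(num_groups, experts_per_group).sum(axis=1)

epg = num_experts // num_groups
print("MoE  load per group:", per_group_load(moe_topk(scores, k_total), num_groups, epg))
print("MoGE load per group:", per_group_load(moge_grouped_topk(scores, num_groups, k_total), num_groups, epg))
```

Because the grouped variant picks a fixed number of experts inside every group, each group receives exactly the same number of token assignments, whereas the global top-k counts skew towards the more popular experts.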


MoE vs. MoGE: Comparing the Two Techniques

| Feature | Mixture of Experts (MoE) | Mixture of Grouped Experts (MoGE) |
|-----------------------|-----------------------------------------------------------|-------------------------------------------------------------------|
| Expert Activation | Activated based on inputs, resulting in uneven usage. | Grouped and activated for a balanced workload. |
| Efficiency | Can be inefficient due to uneven expert usage. | More efficient due to better load balancing across devices. |
| Scalability | Less scalable due to uneven load distribution. | Highly scalable for parallel processing environments. |
| Hardware Optimization | Not customized for specific hardware configurations. | Optimizable for specific hardware like Ascend NPUs. |

In summary, MoGE provides significant enhancements over MoE, offering superior efficiency, scalability, and hardware optimization for AI models – paving the way for future advancements in the field.

  • The groundbreaking innovation, MoGE (Mixture of Grouped Experts), addresses the inefficiencies of the Mixture of Experts (MoE) method by distributing the workload evenly among the experts that handle specific tasks or types of data, leading to a more efficient and effective AI system.
  • The MoGE method offers cost-effective AI training, supports hybrid approaches, and is particularly useful for implementing complex tasks in large language models (LLMs), making it an essential technique for companies aiming to optimize their AI infrastructure.
