The cost model leverages SMT‑based solving (Z3) to achieve optimal decoding speed under CPU, I/O, and memory constraints.

How PowerInfer‑2 Turns Your Smartphone Into an AI Workstation

Abstract and 1. Introduction

  2. Background and Motivation
  3. PowerInfer-2 Overview
  4. Neuron-Aware Runtime Inference
  5. Execution Plan Generation
  6. Implementation
  7. Evaluation
  8. Related Work
  9. Conclusion and References

5 Execution Plan Generation

Today's smartphones ship with widely varying hardware: different CPU capabilities, I/O throughput, and DRAM sizes. Users deploying LLMs on these devices also have diverse objectives. Some prioritize a balance between generation speed and memory usage, while others aim to maximize hardware utilization for higher speed. The models themselves also vary in number of weights, structure, and sparsity level. To manage this complexity, PowerInfer-2 includes an offline planner that generates execution plans tailored to these varied requirements.


5.1 Execution Plan


5.2 Input Parameters

Table 2 also lists three categories of input parameters:

• Hardware: Parameters profiled from the hardware, such as CPU FLOPS, I/O throughput, and memory bandwidth.

• User: Parameters specified by the user, such as CPU constraints, memory limit, and lower bound of decoding speed.

• Model: Parameters about the model collected by an offline profiler, such as the size of the model, sparsity levels, and caching characteristics.
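As an illustration, the three parameter categories could be grouped into one planner input as follows. This is a minimal sketch, not PowerInfer-2's actual data layout: all class names, field names, units, and example values are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class HardwareParams:
    """Values profiled from the device (hypothetical field names)."""
    cpu_flops: float         # FLOP/s
    io_throughput: float     # bytes/s read from flash storage
    memory_bandwidth: float  # bytes/s


@dataclass(frozen=True)
class UserParams:
    """Constraints chosen by the user (hypothetical field names)."""
    max_cpu_cores: int
    memory_limit: float        # bytes
    min_decoding_speed: float  # tokens/s


@dataclass(frozen=True)
class ModelParams:
    """Values collected by an offline profiler (hypothetical field names)."""
    model_size: float       # bytes
    sparsity: float         # fraction of FFN neurons inactive per token
    cache_miss_rate: float  # fraction of activated neurons not in the cache


@dataclass(frozen=True)
class PlannerInput:
    hardware: HardwareParams
    user: UserParams
    model: ModelParams


# Made-up values, for illustration only.
example_input = PlannerInput(
    hardware=HardwareParams(cpu_flops=5e10, io_throughput=1e9, memory_bandwidth=4e10),
    user=UserParams(max_cpu_cores=4, memory_limit=8e9, min_decoding_speed=5.0),
    model=ModelParams(model_size=2.4e10, sparsity=0.9, cache_miss_rate=0.2),
)
```

In this made-up example the model is larger than the memory limit, which is the situation where part of the weights must stay on flash storage.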


5.3 Cost Model

After collecting the input parameters, the planner uses a cost model to generate the execution plan. The goal is to maximize the generation speed s (defined by Equation 1) while adhering to the user-specified constraints (Formulas 3-5). The decoding speed s is inversely proportional to the time taken to decode one token (Equation 1), which is determined by the computation time for that token (Equation 2), because we efficiently overlap computation with I/O operations. With the objective function and constraints defined, the resulting model can be solved by mature SMT solvers. In our implementation, we use the Z3 solver [11] to solve the cost model.
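The shape of this optimization problem can be sketched in a few lines. The sketch below is not the paper's cost model: it replaces the Z3 solver with an exhaustive search over a tiny plan space, and the `decode_time` formula, the candidate cache sizes, and all constants are hypothetical. It also assumes that overlapped computation and I/O can be modeled by taking the slower of the two.

```python
from itertools import product

# Hypothetical candidate cache sizes in GB.
CACHE_SIZES_GB = (0.5, 1.0, 2.0, 4.0)


def decode_time(cores, cache_gb):
    """Hypothetical per-token decode time in seconds."""
    compute = 0.20 / cores        # more cores -> less computation time
    io = 0.12 / (1.0 + cache_gb)  # larger cache -> fewer reads from flash
    # Assumption: computation and I/O overlap, so the slower one dominates.
    return max(compute, io)


def plan(max_cores, memory_limit_gb, min_speed):
    """Return (speed, cores, cache_gb) with the highest speed, or None."""
    best = None
    for cores, cache_gb in product(range(1, max_cores + 1), CACHE_SIZES_GB):
        if cache_gb > memory_limit_gb:  # user memory limit
            continue
        speed = 1.0 / decode_time(cores, cache_gb)  # tokens per second
        if speed < min_speed:           # user lower bound on decoding speed
            continue
        if best is None or speed > best[0]:
            best = (speed, cores, cache_gb)
    return best


best_plan = plan(max_cores=4, memory_limit_gb=2.0, min_speed=5.0)
infeasible = plan(max_cores=1, memory_limit_gb=0.5, min_speed=100.0)
```

With these made-up numbers the search selects 4 cores and a 2 GB cache, and returns `None` when no plan satisfies the constraints. A solver-based formulation states the same objective and constraints declaratively instead of enumerating plans.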


To compute the decoding time, we first model the computation time. Because we observed that memory operations are not a significant factor compared to computation, we do not include them in the computation time. Computation time (Equation 6) is primarily influenced by the attention blocks, predictors, and FFN blocks. It is calculated by dividing the computational workload of these components by the CPU FLOPS (defined in Equations 7-8). The FLOPS of the selected CPU cores are specified in Equation 9.
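The workload-over-FLOPS calculation can be illustrated as follows. This is a sketch under stated assumptions, not the paper's Equations 6-9: it assumes the three workloads simply add up and that the FLOPS of the selected cores is the sum of their per-core FLOPS. The function names and all numbers are hypothetical.

```python
def selected_cpu_flops(per_core_flops, selected_cores):
    """Assumed aggregate FLOP/s of the selected cores (sum of per-core values)."""
    return sum(per_core_flops[i] for i in selected_cores)


def computation_time(attn_flop, predictor_flop, ffn_flop, cpu_flops):
    """Seconds to compute one token: total workload divided by CPU FLOP/s."""
    if cpu_flops <= 0:
        raise ValueError("cpu_flops must be positive")
    return (attn_flop + predictor_flop + ffn_flop) / cpu_flops


# Made-up per-core throughput for one big, two middle, and one little core.
per_core = [4e10, 2e10, 2e10, 1e10]
flops = selected_cpu_flops(per_core, [0, 1])     # big core + one middle core
t_comp = computation_time(3e9, 1e9, 2e9, flops)  # roughly 0.1 s per token
```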

Table 2: Symbols used in execution planning.

As FFN block computation overlaps with neuron loading, the planner must also account for I/O transmission time. This is calculated by dividing the volume of neurons transferred from flash storage (Equation 10) by the I/O bandwidth. This transferred volume depends on both the activation rate and the cache miss rate.
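A rough numerical illustration of this relationship is given below. It is a sketch, not the paper's Equation 10: it assumes the transferred volume is the product of neuron count, activation rate, cache miss rate, and bytes per neuron. All parameter names and values are hypothetical.

```python
def io_time(num_neurons, activation_rate, cache_miss_rate,
            bytes_per_neuron, io_bandwidth):
    """Seconds spent reading missing neurons from flash for one token."""
    if io_bandwidth <= 0:
        raise ValueError("io_bandwidth must be positive")
    # Only neurons that are activated AND missing from the cache are fetched.
    transferred_bytes = (num_neurons * activation_rate
                         * cache_miss_rate * bytes_per_neuron)
    return transferred_bytes / io_bandwidth


# Made-up example: 1M neurons, 10% activated, 20% of those miss the cache,
# 10 KB per neuron, 1 GB/s flash bandwidth.
t_io = io_time(1_000_000, 0.1, 0.2, 10_000, 1e9)  # roughly 0.2 s
# Halving the miss rate halves the I/O time in this model.
t_io_better_cache = io_time(1_000_000, 0.1, 0.1, 10_000, 1e9)
```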


Finally, the planner calculates the time to load neurons from memory, which depends on the weight sizes of the attention blocks, predictors, and neurons activated at runtime. The memory time is determined by dividing the total weight size of the activated neurons for one token by the memory bandwidth (Equation 11).
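The memory time follows the same pattern. The sketch below is an illustration rather than the paper's Equation 11: it assumes the bytes read per token are the sum of the attention weights, the predictor weights, and the weights of the activated neurons. The names and numbers are hypothetical.

```python
def memory_time(attn_bytes, predictor_bytes, activated_neuron_bytes,
                memory_bandwidth):
    """Seconds to read one token's weights from DRAM."""
    if memory_bandwidth <= 0:
        raise ValueError("memory_bandwidth must be positive")
    total_bytes = attn_bytes + predictor_bytes + activated_neuron_bytes
    return total_bytes / memory_bandwidth


# Made-up example: 1 GB attention weights, 0.2 GB predictors,
# 0.8 GB of activated neurons, 40 GB/s memory bandwidth.
t_mem = memory_time(1e9, 2e8, 8e8, 4e10)  # roughly 0.05 s
```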


6 Implementation

PowerInfer-2 is developed on top of PowerInfer [30], a state-of-the-art serving framework designed for sparsely-activated LLMs, by integrating an additional 12K lines of C++ code. These enhancements encompass several key areas, including the polymorphic neuron engine, neuron cache, flexible neuron loading, and neuron-cluster-level I/O pipeline.

Since PowerInfer-2 depends on privileged system APIs (e.g., mlock, which locks pages in memory) that need root permission, we built it on the Android [5] platform. Although there is no need to alter the system kernel, a rooted Android system still gives us considerable flexibility in developing and debugging our system. Furthermore, because PowerInfer-2 requires no kernel modifications, it is easily portable to other operating systems, including the iOS [14] platform.

The current implementation of PowerInfer-2 supports a diverse array of LLMs with varying model sizes, including the Llama-2 family [27] (7B, 13B), TurboSparse-Mistral [31] (7B), and TurboSparse-Mixtral [31] (47B).

Table 3: Hardware specifications of smartphones we used in the evaluation. “DRAM” is the physical memory size. “Available” is the maximum memory size that can be occupied by an application.


:::info Authors:

(1) Zhenliang Xue, Co-first author from Institute of Parallel and Distributed Systems (IPADS), Shanghai Jiao Tong University;

(2) Yixin Song, Co-first author from Institute of Parallel and Distributed Systems (IPADS), Shanghai Jiao Tong University;

(3) Zeyu Mi, Institute of Parallel and Distributed Systems (IPADS), Shanghai Jiao Tong University (yzmizeyu@sjtu.edu.cn);

(4) Le Chen, Institute of Parallel and Distributed Systems (IPADS), Shanghai Jiao Tong University;

(5) Yubin Xia, Institute of Parallel and Distributed Systems (IPADS), Shanghai Jiao Tong University;

(6) Haibo Chen, Institute of Parallel and Distributed Systems (IPADS), Shanghai Jiao Tong University.

:::


:::info This paper is available on arXiv under a CC BY 4.0 license.

:::
