The cost model leverages SMT‑based solving (Z3) to achieve optimal decoding speed under CPU, I/O, and memory constraints.

How PowerInfer‑2 Turns Your Smartphone Into an AI Workstation

2025/11/04 03:56

Abstract and 1. Introduction

  2. Background and Motivation
  3. PowerInfer-2 Overview
  4. Neuron-Aware Runtime Inference
  5. Execution Plan Generation
  6. Implementation
  7. Evaluation
  8. Related Work
  9. Conclusion and References

5 Execution Plan Generation

Today’s smartphones vary widely in hardware specifications, such as CPU capability, I/O throughput, and DRAM size. Users deploying LLMs on these devices also have diverse objectives: some prioritize a balance between generation speed and memory usage, while others aim to maximize hardware utilization for higher speed. The models themselves also vary in parameter count, structure, and sparsity level. To manage this complexity, PowerInfer-2 includes an offline planner that generates execution plans to best meet these varied requirements.

5.1 Execution Plan

5.2 Input Parameters

Table 2 also lists three categories of input parameters:

• Hardware: Parameters profiled from the hardware, such as CPU FLOPS, I/O throughput, and memory bandwidth.

• User: Parameters specified by the user, such as CPU constraints, memory limit, and lower bound of decoding speed.

• Model: Parameters about the model collected by an offline profiler, such as the size of the model, sparsity levels, and caching characteristics.
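
As an illustration only, the three categories above could be grouped into a single planner input record. The class names, field names, and units below are assumptions chosen for readability; they are not the symbols of Table 2 or identifiers from the PowerInfer-2 code base.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class HardwareParams:
    """Values profiled on the device (hypothetical field names)."""
    cpu_flops: float         # floating-point operations per second
    io_throughput: float     # bytes per second read from flash
    memory_bandwidth: float  # bytes per second read from DRAM


@dataclass(frozen=True)
class UserParams:
    """Constraints chosen by the user (hypothetical field names)."""
    max_cpu_cores: int
    memory_limit: float        # bytes
    min_decoding_speed: float  # tokens per second


@dataclass(frozen=True)
class ModelParams:
    """Values collected by an offline profiler (hypothetical field names)."""
    model_size: float       # bytes
    activation_rate: float  # fraction of neurons activated per token
    cache_miss_rate: float  # fraction of activated neurons not found in the cache


@dataclass(frozen=True)
class PlannerInputs:
    hardware: HardwareParams
    user: UserParams
    model: ModelParams
```

Freezing the records reflects that these values are gathered once, offline, and then only read by the planner.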

5.3 Cost Model

After collecting the input parameters, the planner uses a cost model to generate the execution plan. The goal is to maximize the generation speed s (as defined by Equation 1) while adhering to user-specified constraints (Formulas 3-5). The decoding speed s is inversely proportional to the time taken to decode one token (Equation 1). Because computation and I/O operations are efficiently overlapped, that time is determined by the computation time for the token (Equation 2). With the objective function and constraints defined, the resulting model can be solved by mature SMT solvers. In our implementation, we use the Z3 solver [11] to solve the cost model.
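
To make the shape of this optimization concrete, the sketch below maximizes decoding speed under a memory limit and a minimum-speed constraint. It is not the paper's formulation: it replaces the SMT solver with an exhaustive search over a tiny plan space, and the timing function, its constants, and the choice to take the slower of computation and I/O as the per-token time are all invented assumptions for illustration.

```python
from itertools import product


def toy_decode_time(cores, cache_bytes):
    """Hypothetical per-token time in seconds (toy numbers, not the paper's equations)."""
    compute = 0.4 / cores                          # more cores, less compute time
    miss_rate = max(0.0, 1.0 - cache_bytes / 8e9)  # larger cache, fewer flash reads
    io = 0.3 * miss_rate
    # Assumption: computation and I/O overlap, so the slower one bounds the token time.
    return max(compute, io)


def best_plan(max_cores, memory_limit, min_speed):
    """Return (speed, cores, cache_gb) with the highest speed, or None if infeasible."""
    best = None
    for cores, cache_gb in product(range(1, max_cores + 1), range(0, 9)):
        cache_bytes = cache_gb * 1e9
        if cache_bytes > memory_limit:   # user constraint: memory limit
            continue
        speed = 1.0 / toy_decode_time(cores, cache_bytes)
        if speed < min_speed:            # user constraint: lower bound on speed
            continue
        if best is None or speed > best[0]:
            best = (speed, cores, cache_gb)
    return best


if __name__ == "__main__":
    print(best_plan(max_cores=4, memory_limit=4e9, min_speed=1.0))
```

Enumeration only works here because the toy plan space has a few dozen points; a solver such as Z3 is the practical choice once the plan has many interacting variables.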

To compute the decoding time, we first model the computation time. Because we observed that memory operations are not a significant factor compared to computation, we do not include them in the computation time. Computation time (Equation 6) is primarily influenced by the attention blocks, predictors, and FFN blocks. It is calculated by dividing the computational workload of these components by the CPU FLOPS (defined in Equations 7-8). The FLOPS of the selected CPU cores are specified in Equation 9.
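
The division described above can be sketched as follows. The function and parameter names are hypothetical, and two modeling choices are assumptions rather than the paper's Equations 6-9: the FFN workload is scaled by the activation rate, and the FLOPS of the selected cores are simply summed.

```python
def compute_time(attn_flop, predictor_flop, ffn_flop, activation_rate, core_flops):
    """Estimate per-token computation time in seconds.

    attn_flop, predictor_flop, ffn_flop: floating-point operations per token
    activation_rate: fraction of FFN neurons activated (assumed to scale FFN work)
    core_flops: FLOPS of each selected CPU core (assumed to add up linearly)
    """
    cpu_flops = sum(core_flops)
    workload = attn_flop + predictor_flop + ffn_flop * activation_rate
    return workload / cpu_flops


if __name__ == "__main__":
    # Toy numbers: 4 GFLOP of effective work on cores totalling 4 GFLOPS.
    print(compute_time(2e9, 1e9, 10e9, 0.1, [1e9, 1e9, 2e9]))
```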

Table 2: Symbols used in execution planning.

As FFN block computation overlaps with neuron loading, the planner must also account for I/O transmission time. This is calculated by dividing the volume of neurons transferred from flash storage (Equation 10) by the I/O bandwidth. This transferred volume depends on both the activation rate and the cache miss rate.
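
A minimal sketch of this estimate, assuming the transferred volume is the product of the FFN weight size, the activation rate, and the cache miss rate. That product form and all names are assumptions for illustration; the paper's Equation 10 may define the volume differently.

```python
def io_time(ffn_bytes, activation_rate, cache_miss_rate, io_bandwidth):
    """Estimate per-token flash I/O time in seconds.

    ffn_bytes: total size of FFN neuron weights in bytes
    activation_rate: fraction of neurons activated for a token
    cache_miss_rate: fraction of activated neurons missing from the in-memory cache
    io_bandwidth: flash read bandwidth in bytes per second
    """
    # Only activated neurons that miss the cache have to be read from flash.
    flash_bytes = ffn_bytes * activation_rate * cache_miss_rate
    return flash_bytes / io_bandwidth


if __name__ == "__main__":
    # Toy numbers: 10 GB of FFN weights, 10% activated, 20% of those missing the cache.
    print(io_time(10e9, 0.1, 0.2, 1e9))
```

Under this model, halving the cache miss rate halves the I/O time, which is why the cache size in the execution plan trades memory for speed.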

Finally, the planner calculates the time to load neurons from memory, which relates to the weight sizes of attention blocks, predictors, and neurons activated at runtime. The memory time is determined by dividing the total weight of activated neurons for one token by the memory bandwidth (Equation 11).
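
The memory-time estimate follows the same pattern. In the sketch below, the names are hypothetical and the way the three weight sizes are combined is an assumption, not a transcription of Equation 11.

```python
def memory_time(attn_bytes, predictor_bytes, ffn_bytes, activation_rate, memory_bandwidth):
    """Estimate per-token time in seconds to read weights from DRAM.

    attn_bytes, predictor_bytes: weights assumed to be read in full for every token
    ffn_bytes: total FFN weight size; only the activated fraction is assumed to be read
    memory_bandwidth: DRAM bandwidth in bytes per second
    """
    touched_bytes = attn_bytes + predictor_bytes + ffn_bytes * activation_rate
    return touched_bytes / memory_bandwidth


if __name__ == "__main__":
    # Toy numbers: 2.5 GB touched per token at 25 GB/s.
    print(memory_time(1e9, 0.5e9, 10e9, 0.1, 25e9))
```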

6 Implementation

PowerInfer-2 is built on top of PowerInfer [30], a state-of-the-art serving framework designed for sparsely-activated LLMs, and adds 12K lines of C++ code to it. These enhancements cover several key areas, including the polymorphic neuron engine, neuron cache, flexible neuron loading, and neuron-cluster-level I/O pipeline.

Since PowerInfer-2 depends on privileged system APIs (e.g., mlock, which locks pages in memory) that require root permission, we built it on the Android [5] platform. Even though there is no need to alter the system kernel, a rooted Android system still provides us with considerable flexibility in developing and debugging our system. Furthermore, because PowerInfer-2 is designed to require no kernel modifications, it is easily portable to other operating systems, including the iOS [14] platform.

The current implementation of PowerInfer-2 supports a diverse array of LLMs with varying model sizes, including the Llama-2 family [27] (7B, 13B), TurboSparse-Mistral [31] (7B), and TurboSparse-Mixtral [31] (47B).

Table 3: Hardware specifications of smartphones we used in the evaluation. “DRAM” is the physical memory size. “Available” is the maximum memory size that can be occupied by an application.

:::info Authors:

(1) Zhenliang Xue, Co-first author from Institute of Parallel and Distributed Systems (IPADS), Shanghai Jiao Tong University;

(2) Yixin Song, Co-first author from Institute of Parallel and Distributed Systems (IPADS), Shanghai Jiao Tong University;

(3) Zeyu Mi, Institute of Parallel and Distributed Systems (IPADS), Shanghai Jiao Tong University (yzmizeyu@sjtu.edu.cn);

(4) Le Chen, Institute of Parallel and Distributed Systems (IPADS), Shanghai Jiao Tong University;

(5) Yubin Xia, Institute of Parallel and Distributed Systems (IPADS), Shanghai Jiao Tong University;

(6) Haibo Chen, Institute of Parallel and Distributed Systems (IPADS), Shanghai Jiao Tong University.

:::


:::info This paper is available on arxiv under CC BY 4.0 license.

:::
