07 June 2025

(The text below was generated by AI translation; the original article was written in Chinese. See Vision and Strategy to the Storage Landscape (Chinese Simplified).)

Vision & Strategy: Insight, Foresight, and Strategy

Vision and Strategy begin with asking questions: where should we be in 1 year, 3-5 years, or even 10 years? What should the team and departments be doing, and how should they be working? Vision is not about chasing the latest technology trends and learning or applying them. Vision means that the “stakeholders” need to correctly predict technology trends, determine investment directions, and support their conclusions with data-driven and systematic analysis.

Overall, this way of thinking is closer to that of a Product Manager, Business Analyst, or Market Researcher than to technical development work. Of course, internally, it also requires a solid technical foundation (see A Holistic View of Distributed Storage Architecture and Design Space). Externally, it requires surveying the market, the competitors, and our own position. Looking forward, it requires forecasting trends and scale.

Why are Vision and Strategy needed? There are many reasons:

  • Career Development: As one advances in level, the expectation for the position gradually shifts from receiving input to providing output. For example, a junior individual developer focuses on completing assigned tasks and receives input from managers. In contrast, an experienced individual developer often needs to formulate project strategies (including technical ones) and regularly provide input to managers, such as possible innovation directions and team development opportunities. The work of a manager is closer to investment (see the Market Analysis section); insight into future trends and executing the right strategies is one of their core responsibilities. See [95].

  • Long-term Planning: Higher positions are expected to manage longer time horizons. For example, a junior developer typically needs to plan for the next 3 months, while an experienced individual developer often needs to plan for the next year. A manager plans even further ahead, often looking 3 to 5 years into the future. Beyond project management plans, this involves foresight and strategy for team development. See [95].

  • Leadership: Leadership means attracting followers through vision and charisma in an environment of equal communication (Manage by influence not authority). Developing leadership requires the individual to become a Visionary or Thought Leader. When others interact with the individual, they should always feel inspired and motivated. Leadership is also one of the requirements for the position of manager. See [96].

  • System Architecture: A good system architecture often lasts more than 10 years, and considering the slower iteration speed of storage systems (data must not be corrupted), development may take 5 years (until maturity and stability). Architects essentially work with a future-oriented approach, needing to understand future market demands and make decisions based on future technological developments. In particular, hardware capabilities develop at an exponential rate. On the other hand, architectural decisions need to be mapped to financial metrics.

  • Innovation: Innovation is a daily task (see the Market Analysis section) and also involves finding growth opportunities for the team. Innovation includes both identifying technological development trends and analyzing changes in market demand, meaning having insight into taking existing systems “a step further,” as well as questioning what the best storage system should be (Gap Analysis). Finding innovation requires Vision, and implementing innovation requires Strategy.

This article mainly focuses on storage systems within the context of the (cloud) storage industry. The following sections will sequentially explain “Some Methodologies,” “Understanding Stock Prices,” “The Storage System Market,” “Market Analysis,” “Hardware in Storage Systems,” and “Case Study: EBOX.”

  • The Methodology section will build the thinking framework for Vision and Strategy.

  • The Stock Price section will analyze the principles behind the stock price, understand the company’s goals, and map them to the team.

  • The Market section provides an overview of the competitive landscape of storage systems, analyzing key market characteristics, disruptive innovations, and value.

  • The Hardware section models hardware capabilities and development speed, with an in-depth look at the key points.

  • The Case Analysis section will apply the analytical methods of this article through an example, yielding many interesting conclusions.

Fun Header Image of Vision


Some Methodologies

Vision and Strategy involve a series of considerations about the future and trends, as well as an understanding of enterprise architecture to ensure projects achieve tangible returns on investment. More importantly, they require systematic analysis and data to support predictive conclusions. Overall, this way of thinking is closer to that of a Product Manager, Business Analyst, or Market Researcher than to purely technical development work.

This chapter introduces methodologies related to Vision and Strategy. It will sequentially cover Critical Thinking, Case Interview, Strategic Thinking, Business Acumen, and information gathering. The content mainly outlines the general framework; what is important are the thinking, practice, and experience.

Critical Thinking

Critical Thinking here is not the literal Chinese sense of “critical” thinking; it is closer to Problem Solving, and “Critical” is more akin to the “critical” in Critical Path. The English term is kept because a direct Chinese translation somewhat distorts the meaning; the same applies to other terms below. LinkedIn’s Critical Thinking course [91] is a good learning resource. The course is comprehensive; this article only lists the interesting or important points, as it does throughout.

Some key points of Critical Thinking:

  • Address the root cause rather than the symptoms. Do not start directly on the task assigned by your boss. First, trace upstream, downstream, and the stakeholders to find the real root-cause problem that needs to be solved. Next comes defining the problem, where the most important parts are the Problem Statement and the goal definition. Before starting, present your definition to other teams and members to see whether it is reasonable.

  • Work efficiently. You need to repeatedly switch between the High Road and Low Road, zooming in and out of perspectives. The High Road is a bird’s-eye view; according to the 20/80 Rule, find the effective 20%, Don’t Boil the Ocean. The Low Road is a ground-level view, such as analyzing specific data. The key is to return to the High Road from time to time to verify its business value (Business Impact), Don’t Polish the Dirt.

  • “Critical” Path. The path you take to solve a problem should be a critical path (graph theory), where each task node is necessary and not redundant (related to the following MECE). Doing work that the boss doesn’t care about is meaningless. What you provide is a professional service, and purchasing your service is expensive, so don’t waste the client’s funds. Individual developers should not only take the Low Road. Finally, carefully consider the priority of tasks.

  • Some Tools. For example, the “5 Whys” is used to trace root causes. The “Seven So-whats” is used to infer the consequences of actions. “We did it this way before” is a “bad smell” that triggers Critical Thinking: does yesterday’s strategy still fit today? First Principles, put plainly, means breaking a problem down (top-down) and then adding it back up (bottom-up) to calculate soundly; this is similar to systems-analysis methods. Shadow practice: for example, take a Gartner report and test whether you can independently analyze and reach the same conclusions. Switching perspectives: for example, explain the problem to someone else and see how they reframe it. Heuristic questions: looking back over the past 10 years, if you were to start over, what would you do differently? Another heuristic question: how could you double your output metrics (Performance Metrics)?

  • To some extent, attention to both the high level and the details is necessary. High-level abstract thinking has great power, but if you live in the “clouds” for too long, you can easily fall into stereotypes and lose the ability to self-correct (something managers need to avoid). Self-correction requires returning to the detail level to test existing assumptions; in other words, returning to First Principles.

Architecture design feedback loop

Case Interview

Case Interview [92] belongs to the field of Business Analytics or Business Consulting and is part of the interview process at consulting firms like McKinsey. However, as Victor Cheng’s book Case Interview Secrets explains, it covers a large number of frameworks and examples for analyzing business problems, which is very useful in itself; the “interview” aspect is secondary.

Some Key Points of Case Interview:

  • Estimating Using Proxy. Quick estimation and mental calculation are fundamental requirements in business consulting. There are various estimation methods, among which the Proxy method is quite interesting, such as estimating the revenue of a newly opened store by counting street foot traffic or the occupancy rate of nearby stores. The Proxy method can be taken further by stratifying according to demographics, identifying Proxy variables, breaking down the problem, and switching to another Proxy for cross-verification.

  • Mindset. The thinking approach in business consulting overlaps significantly with Critical Thinking, for example, Don’t Boil the Ocean, time is expensive, and professional services. On the other hand, it involves being an Independent Problem Solver: if you were placed alone in a department of a Fortune 500 company, could you convince the client (as in the following Conclusiveness), solve problems, and maintain the employer’s image? Strong Soft Skills are essential.

  • Issue Tree Framework. A classic analytical method in Case Interviews is the Issue Tree Framework, such as the Profitability framework. It requires first establishing a hypothesis, then breaking down the problem layer by layer. The breakdown must satisfy the MECE test and the Conclusiveness test; the former means no overlap and no omission, while the latter means that if all branches are True, the conclusion of the parent node cannot be denied. The analysis proceeds layer by layer down the tree, then aggregates back up to the root, often dynamically adjusting the tree structure during the process. There are many templates for the Issue Tree Framework, but it is often customized to the problem at hand. A minimal sketch follows the figure below.

Issue Tree Framework: Profitability
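
To make the Profitability framework concrete, below is a minimal, hypothetical sketch (the function, the numbers, and the “volume dropped” hypothesis are all illustrative, not taken from any real case). It decomposes profit into MECE branches and tests one branch at a time:

```python
# Hypothetical Profitability issue tree:
#   Profit = Revenue - Cost
#   Revenue = Units x Price              (branch 1)
#   Cost = Fixed + Units x VariableCost  (branch 2)
# The branches are MECE: they do not overlap, and together they cover the parent node.

def profit(units, price, fixed_cost, var_cost_per_unit):
    revenue = units * price
    cost = fixed_cost + units * var_cost_per_unit
    return revenue - cost

# Hypothesis: "profit fell because volume dropped, not because costs rose."
# Test it by changing one branch at a time (all numbers illustrative).
baseline = profit(units=1000, price=50, fixed_cost=20_000, var_cost_per_unit=25)
volume_drop = profit(units=800, price=50, fixed_cost=20_000, var_cost_per_unit=25)
print(baseline, volume_drop)  # 5000 0 -> the volume branch alone explains the gap
```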

Strategic Thinking

Strategic Thinking is used to formulate company strategies, especially long-term strategies, and is often linked with Decision Making. It is the action plan and navigation map for where the company should be positioned 3 to 5 years, or even 10 years, into the future. LinkedIn’s Strategic Thinking course [93] provides more explanation.

Some key points of Strategic Thinking:

  • Win the Game. Compared to Critical Thinking, which only involves “me” and “the problem,” Strategic Thinking requires winning the “game,” adding “opponents” into the picture. In the context of market analysis, its perspective is closer to Porter’s Five Forces Analysis.

  • Observation. Strategic Thinking first requires observing people and opponents, paying attention not only to trends but also to micro-trends. One key point is the “bad smell,” which applies not only to programming but also to organizations and culture, such as hearing phrases like “We’ve never done it this way” or “We used to do it this way.” Another key point is Not Being Surprised. Surprise is a term in management; if you feel surprised, it means your observation was insufficient. After observation comes reflection: “It doesn’t take time, it takes space” (see the course section Embrace the strategic thinking mindset).

  • Action. Deciding to do something means deciding not to do something else. Doing nothing is also a decision. Attention should be paid to the multiplicative effect in the value chain: your time -> investment in tasks -> your strategy -> effectiveness at the company level. Ask yourself, 3 to 5 years from now, how will “I” win the game, and where will “I” be? Action means establishing a breakdown of tasks from long-term to short-term (related to the Issue Tree Framework), ultimately mapping to tactics, i.e., specific tasks executable daily. During execution, occasionally switch between High Road and Low Road (related to Critical Thinking).

  • Making Informed Strategy. A good strategy does not need to be innovative; the focus of strategy is decision making. First, consider market trends and the classic Porter Five Forces analysis. Pay attention to collecting perspectives from different sources, including both new and old groups, especially from different viewpoints. Map out your assets and allies. Map out your constraints, especially structural obstacles. Also conduct a SWOT analysis. Place them on the previous action map, which needs to be realistic and attainable.

  • Gaining Support. How to gain support from your boss, colleagues, and employees? Don’t rush to announce your strategy or plan in meetings; there is a lot of work beforehand. First, systematically meet with stakeholders to discuss your plan, get feedback, and address questions. It is foreseeable that there will be a lot of opposition; the key is that you need to anticipate all possible objections (related to Not be Surprised), and reach consensus before the meeting using appropriate concessions and negotiation skills (related to BATNA). Finally, ensure all decisions and tasks are accountable, such as through email meeting summaries and regular timeline reviews.

  • Monitoring Execution Progress. Common project management is used in the strategy execution process. More importantly, set expectations and assumptions. The environment is constantly changing, so frequently re-examine assumptions and ask if there are better alternatives. Reviews should be conducted upfront and retroactively. High-risk parts require earlier and more frequent execution.

Points in Strategic Thinking

Business Acumen

Business Acumen explains how business is conducted from the company’s perspective, as well as how to advance and optimize various parts (Pull the Lever). It covers reading financial reports, business models, strategies, operations, and more. LinkedIn’s Developing Business Acumen course [94] offers further elaboration.

Some key points related to this article in Business Acumen:

  • Financial Reports: Corresponding to company financial statements, the Profit & Loss statement (P&L statement) can be broken down layer by layer: Revenue -> minus COGS -> Gross Profit -> minus Operating Expenses (SG&A) -> Operating Profit -> minus interest, taxes, depreciation, etc. -> Net Income (a plain-text sketch follows below). There are many adjustable “levers” (Pull the Lever) related to financial performance, corresponding to the Profitability Issue Tree Framework mentioned above. Some are long-term, such as R&D and facility construction, while others are short-term, like production cuts and layoffs. You can try to draw conclusions from past financial reports and compare them with the management reports (Financial Brief) released by the company.

  • Business Model: The business model defines how to profit from producing goods. From the perspective of raw material processing, there is the Value Chain; from the perspective of business growth, there is the Growth Strategy; from the investment perspective, there is ROI; and from the cost perspective, there are concepts such as CapEx, Fixed Cost, and Variable Cost.

  • Operation: The business model drives company strategy and personnel allocation, personnel drive operations, and ultimately financial performance is reflected in the financial reports. Centered around operations, first is Strategy, as mentioned earlier. The company’s investment portfolio forms the Initiative Pipeline, which unfolds step by step to realize the company’s future. Marketing strategy selects customers and helps the company retain control of them. R&D is often considered together with Mergers and Acquisitions, the latter saving time-to-market and allowing market share to be acquired. In competition, protecting products requires strategy, such as rapid releases, or copyrights and patents. Personnel strategy involves how to find people, training, organizational structure, turnover rate, and so on. Observing a company’s open positions at various job levels can reveal its personnel strategy, which can in turn be used to infer its business strategy.

Points in Business Acumen
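
As a plain-text companion to the P&L breakdown above, here is a sketch of the waterfall from Revenue down to Net Income (all figures and the function name are hypothetical); the comments mark where some of the “levers” sit:

```python
# Hypothetical P&L waterfall:
# Revenue -> minus COGS -> Gross Profit -> minus Operating Expenses (SG&A)
# -> Operating Profit -> minus interest, taxes, depreciation -> Net Income.

def pnl(revenue, cogs, sga, interest, taxes, depreciation):
    gross_profit = revenue - cogs
    operating_profit = gross_profit - sga
    net_income = operating_profit - interest - taxes - depreciation
    return gross_profit, operating_profit, net_income

gross, operating, net = pnl(
    revenue=100.0,   # long-term levers: R&D, new markets, facility construction
    cogs=60.0,       # levers: procurement, economies of scale, production cuts
    sga=20.0,        # short-term levers: hiring freezes, layoffs
    interest=2.0,
    taxes=5.0,
    depreciation=3.0,
)
print(gross, operating, net)  # 40.0 20.0 10.0
```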

Information gathering

This section briefly describes how to collect market information to support Vision and Strategy analysis.

  • “Underwater” information. Many cutting-edge and valuable pieces of information are often unpublished. For example, paper authors often know about a valuable research direction a year in advance and start working on it. If you only read the papers, you will learn about this information a year later. The same applies to university labs, corporate research institutes, open-source communities, and so on. Obtaining “underwater” information relies more on social interactions, participating in various conferences, face-to-face communication, finding collaborators, and mutual benefits. On the other hand, companies have real customers and supply chains, from which it is easier to obtain potential market trends or even make decisions.

  • Investment. Investment news can provide insights into new technology trends. Compared to reading papers, technologies that receive investment have been “money-validated”, and the amount of investment can be used to gauge their strength. Common investment news includes startups, financing rounds, acquisitions, etc., or seeing a new company start to have “money” to write articles in various media to promote itself. On the other hand, promising papers often quickly receive funding and establish startups, or at least launch websites.

  • Research reports. Many market research firms are happy to predict future directions, such as Gartner and IDC. Although companies may pay for promotion, well-funded companies at least indicate that the direction has development prospects. More examples will be seen in the Storage Systems Market chapter.

Understanding Stock Price

From the company’s perspective, an extremely important goal is the growth of the stock price (sometimes even the sole goal). What kind of stock price growth is reasonable? How can stock price growth be mapped to actual products? For departments or teams, what kind of objectives need to be achieved to support the stock price?

This goal is further broken down into a 3-5 year plan for the team, mapped to Vision and Strategy. In other words, stock price analysis can tell the team how well they should perform. Although stock price may seem unrelated to Vision and Strategy, it is actually an excellent entry point.

What constitutes the stock price

The key to understanding stock prices is the Price-Earnings Ratio (P/E). The English term is easier to understand: the ratio of stock price to earnings.

  • Think of a stock as a passbook, and the reciprocal of the P/E ratio is its interest rate. Here, the stock price is the Share price.

P/E formula

Earnings correspond to Earnings Per Share (EPS). The English term is straightforward: earnings per share. It is obtained by dividing the company’s Net Income by the total number of shares (Average common shares).

  • In the formula, preferred dividends are the dividends of preferred stock. They can be ignored, as they are usually small in amount and rarely used [48].

EPS formula

By substituting earnings per share into the price-to-earnings ratio formula, we can find:

  • The reciprocal of the price-to-earnings ratio is the company’s earnings divided by its market capitalization (the sum of all shares). Imagine the company as a huge passbook; the reciprocal of the P/E ratio is the interest rate of the “company passbook.”

  • In other words, the P/E ratio calculates how many years the “company passbook” needs for its “interest” to cover the market value. That is, the P/E ratio indicates how many years of current earnings it would take for a buyer of the whole company to break even.

P/E interest formula
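
A quick sketch of the relationships above, using the MSFT P/E of 37.32 quoted in the example that follows (treat this as a back-of-the-envelope check, not a valuation tool):

```python
# The reciprocal of P/E is the "company passbook" interest rate
# (net income / market cap); P/E itself reads as the number of years
# of current earnings needed to cover the market value.
pe_ratio = 37.32                # from the MSFT example below
interest_rate = 1 / pe_ratio    # = net income / market cap
print(f"{interest_rate:.2%}")   # ~2.68%
print(pe_ratio)                 # ~37 years at current earnings
```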

How high is the “company passbook” interest rate mentioned above in reality? Take MSFT [49] as an example:

  • Interest rate = 1 / P/E = 1 / 37.32 = 2.68%.

  • In comparison, the 30-year US Treasury yield is 4.6%[50]. (The 3-month US Treasury yield during the same period is even higher, about 5.3%.)

The US Treasury yield is thus considerably higher than the interest rate of the “company passbook” above. Relative to its stock price, the company’s profitability is worse than simply buying risk-free government bonds. Why?

  • Traders believe that although the company’s current profitability is insufficient, the stock price may appreciate in the future, so they continue to buy, causing the stock price to rise and the P/E ratio to increase.

  • In other words, the expectation of stock price appreciation is reflected in the price-to-earnings ratio. Or, the price-to-earnings ratio reflects the company’s future expectations, that is, the expectation of stock price increase [52].

  • If the company’s outlook improves, the stock is bought more, and the price-to-earnings ratio rises. Conversely, if the company performs poorly, the price-to-earnings ratio falls. If the company is expected to maintain current performance, the price-to-earnings ratio should remain stable.

On the other hand, net profit can be further broken down, mapping to market size:

  • Net profit equals the company’s revenue multiplied by net profit margin.

  • Revenue can continue to be broken down into market size and market share.

Net income decomposition formula
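
Restated as code (a sketch; the variable names are mine, the relationships are those described above):

```python
# Net income = market size x market share x net profit margin,
# where revenue = market size x market share.
def net_income(market_size, market_share, net_margin):
    revenue = market_size * market_share
    return revenue * net_margin

# Illustrative only: a $100B market, 10% share, 20% margin -> $2B net income.
print(net_income(100e9, 0.10, 0.20))  # 2000000000.0
```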

From this, the composition of the stock price can be summarized:

  • First is the company’s profitability, which depends on market size, market share, and net profit margin. Profit compared to the company’s market value is reflected as an interest rate, corresponding to the inverse of the price-to-earnings ratio.

  • Then there is the traders’ expectation for the company’s future, also reflected by the price-to-earnings ratio. Comparing this to the risk-free government bond rate can indicate the strength of belief.

Stock price decomposition formula

How fast should the stock price rise?

How fast should the stock price increase? It should cover the opportunity cost and risk premium; otherwise, traders would choose to sell the stock and buy risk-free government bonds. Besides price appreciation, another return to stockholders is dividends.

  • Opportunity cost corresponds to the risk-free rate, usually measured by short-term government bond yields.

  • Dividend yield is the dividend income per share as a proportion of the stock price.

  • Risk Premium: stocks carry higher risk than government bonds, so traders demand extra returns.

Stock price growth formula

First, let’s look at how the dividend yield is calculated. Dividends come from the company’s net profit, which is proportionally allocated and distributed per share.

  • For simplicity, the dividend yield can generally be looked up directly [51] on stock trading websites.

  • Dividend Yield can be broken down into the EPS payout ratio divided by the price-to-earnings ratio. The higher the P/E ratio, the lower the dividend yield.

  • EPS Payout Ratio refers to the proportion of a company’s net profit distributed as dividends; it is also equal to the proportion of earnings per share paid out as dividends. The EPS payout ratio is generally not affected by stock price fluctuations. It can also be found on stock trading websites.

Dividend yield formula
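
The decomposition can be cross-checked against the MSFT figures used in this article (payout ratio 26.9% from the stable-point example below, P/E 37.32, quoted dividend yield 0.72%); a small sketch:

```python
# Dividend yield = dividend per share / price
#                = (payout ratio x EPS) / price
#                = payout ratio / (price / EPS)
#                = payout ratio / P/E   -> a higher P/E means a lower yield.
payout_ratio = 0.269   # EPS payout ratio, from the stable-point example below
pe_ratio = 37.32
dividend_yield = payout_ratio / pe_ratio
print(f"{dividend_yield:.2%}")  # ~0.72%, matching the quoted figure
```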

Next, let’s look at the calculation of the risk premium, here using the common CAPM [55] asset pricing model.

  • Risk Premium can be obtained by multiplying the Beta coefficient by the equity risk premium (ERP).

  • Beta Coefficient reflects the volatility of a stock’s price relative to the market average [53], and can usually be directly found on stock trading websites [49].

  • Equity Risk Premium is obtained by subtracting the risk-free rate from the expected market return. The expected market return can be derived from an index fund, commonly using the S&P 500 [56]. The risk-free rate was discussed earlier.

  • In CAPM, the Cost of Equity is exactly the sum of the previously mentioned risk-free rate and the risk premium here. The cost of equity does not depend on the stock price but is determined by the market environment.

Risk premium formula

Applying the previous formula, we can now calculate how fast the stock price should rise. Taking MSFT stock as an example:

  • The risk-free rate is taken from the 30-year US Treasury bond, 4.6%. The dividend yield is directly queried, taken as 0.72%. The Beta coefficient is directly queried, taken as 0.89. The expected market return is taken as the average growth rate of the US S&P 500 over the past five years, 12.5% (remarkable).

  • Share price growth (annual) = 0.046 - 0.0072 + 0.89 * (0.125 - 0.046) = 10.9%. It can be seen that the share price needs to increase by 10.9% in one year to satisfy the trader’s cost-benefit balance.

  • The relatively high share price growth requirement comes partly from the higher US Treasury interest rate that year and partly from the strong upward trend of the US stock market.
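
The calculation above can be reproduced in a few lines, with the inputs exactly as quoted (30-year Treasury 4.6%, dividend yield 0.72%, Beta 0.89, expected market return 12.5%):

```python
# Required annual share price growth
#   = risk-free rate - dividend yield + Beta x (expected market return - risk-free rate)
risk_free = 0.046        # 30-year US Treasury yield
dividend_yield = 0.0072  # MSFT dividend yield
beta = 0.89              # MSFT Beta coefficient
market_return = 0.125    # S&P 500 average growth over the past five years

risk_premium = beta * (market_return - risk_free)       # CAPM risk premium
cost_of_equity = risk_free + risk_premium               # ~11.6%
required_growth = risk_free - dividend_yield + risk_premium
print(f"{cost_of_equity:.1%} {required_growth:.1%}")    # 11.6% 10.9%
```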

It can be seen that in a high-interest-rate bull market, traders have stringent profit expectations for companies. If the share price does not meet expectations, traders will incur losses due to opportunity costs or risk premiums, leading them to sell the stock, which lowers the P/E ratio and share price. Where is the final share price equilibrium point? Assuming the share price no longer changes:

  • Continuing from the previous values, the stock price is set at $420. Other assumptions remain the same. Assuming the dividend payout ratio remains unchanged (26.9%) and earnings per share remain unchanged ($11.25). A lower stock price and lower P/E ratio will increase the dividend yield, thereby balancing opportunity cost and risk premium.

  • Stable-point share price = (0.0072 * 420) / (0.046 + 0.89 * (0.125 - 0.046)) = $26.0. At this point, the P/E ratio is 2.31 and the dividend yield is 11.6%, which exactly equals the stock’s cost of equity of 11.6%.

  • Besides the low stock price, this appears to be a good stock. Indeed, there are many similar real stocks [57] with low stock prices, high dividend yields, and low P/E ratios. Note that this article is for theoretical analysis only, does not constitute any prediction of stock price movements, and does not constitute any investment advice.

Stable stock price
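
The stable point can be solved the same way, holding the dividend per share constant (a sketch using the assumptions above: price $420, dividend yield 0.72%, EPS $11.25, payout ratio unchanged):

```python
# At the stable point the dividend yield alone must cover the cost of equity,
# so: stable_price = dividend_per_share / cost_of_equity.
price = 420.0
dividend_per_share = 0.0072 * price   # payout ratio and EPS held constant
eps = 11.25
cost_of_equity = 0.046 + 0.89 * (0.125 - 0.046)

stable_price = dividend_per_share / cost_of_equity
print(round(stable_price, 1))                       # ~26.0
print(round(stable_price / eps, 2))                 # P/E ~2.31
print(f"{dividend_per_share / stable_price:.1%}")   # dividend yield ~11.6%
```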

There are some additional inferences:

  • Stock price growth expectations are unrelated to the company’s market value. Ignore the dividend yield for the moment; it is usually very low for tech companies. From the formula above, the stock price increase demanded by traders depends on the market’s background interest rate and risk. The company’s stock price and market capitalization do not appear in the formula (once the dividend yield is ignored).

  • A high stock price is a negative factor. Now consider the dividend yield. It depends on company factors, determined by earnings per share and the dividend payout ratio. The stock price appears in the denominator, so a high price lowers the dividend yield, making it harder to meet the expected stock price increase.

  • A high P/E ratio is a negative factor, similarly, because the P/E ratio appears in the denominator of the dividend yield formula. A high P/E ratio means the company’s profitability is insufficient relative to its stock price.

This section has shown, from the trader’s perspective, how fast the stock price is expected to rise to maintain stability in the P/E ratio and stock price. So, from the company’s perspective, how should it promote stock price increases to meet expectations?

What Drives Stock Price Growth

As mentioned earlier, the price-to-earnings ratio reflects traders’ expectations for the company’s future. To keep the company’s P/E ratio and stock price stable, how can the stock price be encouraged to rise as expected? From the formulas that make up the stock price:

  • First, the stock price needs to rise sufficiently to outperform the risk-free rate and risk premium, that is, the cost of equity. Technology companies usually have very low dividend yields.

  • The driving force behind stock price growth comes from the company’s net profit, and net profit needs to increase proportionally with the stock price to support the stock price.

P/E interest formula

  • Net profit is broken down into market size, market share, and net profit margin, which are the directions for seeking growth.

Stock price decomposition formula

  • Dream building. Even if the company’s profitability remains unchanged, by weaving traders’ optimistic expectations for the future, the price-to-earnings ratio can be driven up, thereby raising the stock price.

From the company’s perspective, the best strategy is to seek high-growth emerging markets:

  • For example, the global cloud storage market is growing at a rate of over 20% annually [46]. Simply entering this market is expected to meet the previously mentioned 11.6% cost of equity. It doesn’t even require outperforming peers, similar to free-riding.

  • Compared to mature large enterprises, small and medium-sized emerging companies (SMBs) do not have the burden of an existing market. Instead, they have stronger momentum for stock price growth.

  • Technology, innovation, and new markets are necessary to sustain stock prices.

Secondly, companies can seek to increase market share:

  • Increasing market share means competing with rivals, and the company’s performance must be better than its peers. This is a challenging path.

  • On the other hand, this means that in a low-growth mature market, it is more difficult for companies to meet stock price growth expectations. Being large and mature is not necessarily an advantage.

The next direction is to improve net profit margin:

  • A good approach is to sell high value-added products, leverage comparative advantages, enhance technological levels, and increase market recognition.

  • Another approach is to seek economies of scale. As scale expands, fixed costs decrease, and net profit margin improves.

  • A common approach is cost reduction and efficiency improvement, which is more effective when net profit margins are low. See the figure and the sketch below.

Net income growth by decrease cost
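
A small hypothetical example of this leverage (numbers are illustrative): cutting the same 2% of revenue out of costs lifts net income far more, in relative terms, when the starting margin is thin.

```python
# Relative net income uplift from cutting costs by a given share of revenue.
def uplift(revenue, net_margin, cut_as_share_of_revenue):
    net_income = revenue * net_margin
    new_net_income = net_income + revenue * cut_as_share_of_revenue
    return new_net_income / net_income - 1

print(f"{uplift(100, 0.05, 0.02):.0%}")  # 40% uplift at a 5% net margin
print(f"{uplift(100, 0.20, 0.02):.0%}")  # 10% uplift at a 20% net margin
```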

In addition, companies can increase their price-to-earnings ratio by creating dreams:

  • The price-to-earnings ratio reflects traders’ expectations for the company’s future. Creating dreams raises expectations and sells concepts without requiring improvements in the company’s profitability.

  • This method is suitable for businesses with large initial investments that have economies of scale or technological accumulation. However, once the dream is broken, the stock price can fall rapidly.

Finally, what does the stock price increase of a real company look like? Taking MSFT [58] as an example:

  • The company’s overall revenue grew by 17% year-over-year, while net profit grew even faster, reaching 20%. An outstanding performance.

  • Xbox revenue grew by 62%, Azure cloud services by 31%, Dynamics 365 by 23%, and Intelligent Cloud by 21%. They far exceeded the cost of equity at 11.6%.

  • In addition, Office, Windows, Search, and LinkedIn all showed good growth, ranging between 10% and 15%.

MSFT cloud revenue growth

Team Goals

Stock price analysis helps build a top-down framework, from the company’s top level to specific teams, clarifying what goals to work towards:

  • The company needs the stock price to rise enough to meet traders’ expectations and cover the cost of equity (see “How fast should the stock price rise?”).

  • Growth goals are decomposed into market size, market share, net profit margin, and dream-building (see “What Drives Stock Price Growth”).

  • For a specific team, a plan needs to be developed to achieve the above growth.

  • For a specific product under team management, its market size, market share, net profit margin, etc., need to reach the growth targets.

What exactly is the growth target?

  • Based on the previous analysis, taking MSFT as an example, the growth target is an annual 10.9%.

  • For other companies, calculate according to the previous formula: 4.6% - dividend yield + Beta coefficient * (12.5% - 4.6%). The dividend yield and Beta coefficient are specific to the stock and can be looked up directly [49]. The result is usually around 11% (for internet technology companies).

  • For products occupying emerging markets, it is even required that their growth exceeds the above growth targets to compensate for the company’s products in the market decline phase.

  • Ordinary products should reach the above targets as the company’s average. They constitute the majority of the company. However, the average requirement of 11% is not low.

  • Products below the average are possible. This means they are in the market decline phase, employees face the risk of layoffs, and career development opportunities are limited.

  • Essentially, the growth target is to outperform the stock market index and government bond yields.

Final question:

  • For individual employees, how to achieve at least an 11% average annual growth? Note that this is required every year. (Today, are you holding the company back?)

  • For teams, how to develop a 3-5 year plan to ensure 11% or more growth each year? This is where Vision and Strategy come into play.

Note

This article is a personal, non-professional analysis. All content expresses the author’s personal views and does not constitute any investment advice regarding the assets mentioned.

The Market of Storage Systems

Business strategy analysis can typically be broken down into the levels of customers, products, company, and competitors, with further depth (see the diagram below). Customers, products, and competitors can be summarized as the “market” landscape. This chapter will provide an overview of the storage system market, listing the main market segments, product functions, and participants. Subsequent chapters will delve deeper.

In an ever-changing market landscape, where do we stand? What will the market map look like in 3-5 years or 10 years, and where should we be? Understanding the market is the foundation of Vision and Strategy. Focusing on the market, we can gradually reveal its structure and growth potential, what constitutes value, demand, evolution cycles, and the driving factors behind them.

Business Situation

(See details below)

Storage market size compare

Classification

The first question is how to classify the storage market? This chapter uses the following classification to organize the content. The letters before the subsection titles correspond to the classification groups.

  • A. The classic classification is cloud storage and primary storage. Cloud storage comes from the public cloud. The term primary storage [49] is mostly used by Gartner, referring to storage systems deployed on the customer’s premises that serve critical data, usually from traditional storage vendors. Primary storage is also called “enterprise storage.” Additionally, another major category of storage used locally by enterprises is backup and archive systems.

  • B. According to the interface used, storage can be classified as object, block, and file systems. Object storage services consist of immutable BLOBs queried by Key, typically images, videos, virtual machine images, backup images, etc. Block storage is usually used by virtual machines as their disk mounts. File storage has a long history, storing directories and files that users can directly use, commonly including HDFS, NFS, SMB, and others. Additionally, databases can also be considered storage.

  • C. According to storage media, storage can be classified as SSD, HDD, and tape systems. SSD storage is expensive and high-performance, often used for file systems and block storage. HDD storage is cheap and general-purpose, often used for object storage or storing cold data. Tape storage is generally used for archival storage. Additionally, there is fully in-memory storage, typically used as cache or analytical databases.

The above classification of the storage market is classic and commonly used, also convenient for explaining in this chapter. However, in reality, products in the storage market are more organic and intertwined to penetrate each other’s markets and gain competitive advantages. For example:

  • A. Cloud storage also sells edge storage deployed near customers, such as AWS S3 Express. Primary storage also offers cloud-deployed and cloud-offloading versions, such as NetApp ONTAP. Backup and archival are especially cost-effective in cloud storage, such as AWS Glacier.

  • B. Object storage is becoming increasingly like a file system, such as the AWS S3 Mountpoint that simulates a file system, supporting metadata and search on objects, and hierarchical object paths. Databases include products with Key-Value interfaces like RocksDB, while SQL databases often support unstructured data, similar to object storage. Block storage is not only used for virtual machine disks but can also provide page storage for databases. Additionally, the underlying layers of various storage systems can be unified into shared log storage, such as Azure Storage, Apple FoundationDB, and the Log is Database design.

  • C. SSD storage often offloads cold data to HDD storage to save the expensive cost of SSDs. HDD storage often uses SSDs as cache or write staging. Memory is used as cache and index for various storage media, and in-memory storage systems often support writing cold data or logs to SSD.

In addition, for simplicity, this chapter omits some minor classifications. For example,

  • Based on the size of user enterprises, the market can be classified into SMB, large enterprises, and special fields. This classification is based on the customer side.

  • Enterprise storage is also often classified as DAS, SAN, and NAS. This classification partially overlaps with object, block, and file storage.

  • Besides tape, archival storage can also use DNA technology, which is currently developing rapidly.

  • Cyberstorage is an emerging storage category in the context of ransomware, but it is more often integrated as a security feature within existing products.

  • Vector databases are an emerging type of database in the context of AI, while traditional databases often integrate vector support as well.

Storage Market Categorization

A. Cloud Storage

Regarding predicting the future direction of the market, consulting firms’ analysis reports are good sources of information (Gartner, IDC, etc.). Although the reports are paid, there are usually additional sources:

  • Leading companies are usually willing to provide free public versions as a form of self-promotion.

  • Blogs and reports, although not first-hand information, can also reflect the main content. Some bloggers have specialized channels.

  • Adding filetype:pdf before a Google search can effectively find materials.

  • Adding "Licensed for Distribution" after a Google search can find publicly available Gartner documents.

  • Switching between English and Chinese search engines, as well as Scribd, can find different content. Chinese communities may have some documents saved.

  • In addition, reading the user manuals of leading products can also help understand the main features and evaluation metrics in the field.

Fortune predicts the global cloud storage market size to be around $161B, with an annual growth rate of approximately 21.7% [46]. In comparison, the global data storage market size is around $218B, with an annual growth rate of about 17.1% [60]. It can be seen that:

  • The cloud storage market has an excellent growth rate. Combined with the Understanding Stock Prices section, it is clear that this growth rate is very favorable for supporting stock prices, without needing to rely heavily on squeezing competitors or cutting costs.

  • In the long term, data storage is trending towards being largely replaced by cloud storage. This is because the proportion of cloud storage is already high and its growth rate exceeds that of overall data storage. At least, this is what the forecasts indicate.

Fortune storage market size forecast

From Gartner’s Magic Quadrant for Cloud Infrastructure [61] (2024), the leading market participants can be identified:

  • Amazon AWS: A persistent leader. AWS has large-scale infrastructure globally, strong reliability, and an extensive ecosystem. AWS is the preferred choice for enterprises seeking scalability and security. However, its complex services can be challenging for new users.

  • Microsoft Azure: A leader. Azure benefits from hybrid cloud capabilities, deep integration with Microsoft products, and collaboration with AI leader OpenAI. Azure’s industry-specific solutions and collaborative strategy are attractive to enterprises. However, Azure faces scaling challenges and has received criticism regarding security.

  • Google GCP: A leader. Leading in AI/ML innovation, the Vertex AI platform is highly praised, and its cloud-native technologies are distinctive. In environmental sustainability and AI services, GCP is appealing to data-centric organizations. However, GCP falls short in enterprise support and traditional workload migration.

  • Oracle OCI: Leader. OCI excels in providing flexible multi-cloud and sovereign cloud solutions, attracting enterprises that require robust integration capabilities. Its investments in AI infrastructure and partnership with NVIDIA have solidified its market position. However, OCI’s generative AI services and resilient architecture remain insufficient.

  • Alibaba Cloud: Challenger. As a major player in the Asia-Pacific region, Alibaba Cloud leads in domestic e-commerce and AI services. Despite having an excellent partner ecosystem, Alibaba Cloud’s global expansion is constrained by geopolitical issues and infrastructure limitations.

  • IBM Cloud: Niche player. IBM leverages its strengths in hybrid cloud and enterprise-focused solutions, seamlessly integrating with Red Hat OpenShift. Its solutions are attractive to regulated industries. However, its product portfolio is fragmented, and its Edge strategy is underdeveloped.

  • Huawei Cloud: Niche player. Huawei is a key player in emerging markets, with strengths in integrated cloud solutions for the telecommunications sector. It excels in AI/ML research and has achieved success in high-demand enterprise environments. However, geopolitical tensions and sanctions limit its global expansion.

  • Tencent Cloud: Niche player. Optimized for scalable and distributed applications, with unique advantages in social network integration. However, its global partner ecosystem is limited, and it lags behind global peers in maturity.

Gartner Magic Quadrant for Cloud Platforms 2024

What key features should cloud storage provide? Gartner’s Cloud Infrastructure Scorecard [62] (2021) compares major public cloud providers, showing the list of categories as seen in the figure below. AWS’s strong capabilities are evident.

Gartner Cloud Platforms Storage Scorecard 2021

On the other hand, cloud storage can be viewed as gradually moving traditional storage functions to the cloud, benchmarking cloud storage against primary storage. From this perspective, what features should cloud storage have? Which features are already present in primary storage, and which might cloud storage develop in the future? What are the key metrics for measuring storage? See the next section on primary storage.

A. Primary Storage

This article equates primary storage with enterprise storage deployed on-premises (outside the public cloud) that serves critical data, a long-established traditional domain of storage. Its growth rate roughly corresponds to the overall storage market, as seen from [60] and its accompanying chart (previous section), with an annual growth rate of about 17.1%, and it is gradually being eroded by cloud storage. Of course, in reality, primary storage has already integrated deeply with the cloud.

From Gartner’s Magic Quadrant for Primary Storage [59] (2024), the leading participants in this market can be identified:

  • Pure Storage: A persistent leader. Through Pure1, it provides users with proactive SLAs, benefiting IT operations and maintenance. The integrated control plane requires no external cloud communication or reliance on AIOps. The DirectFlash Module operates directly on raw flash memory, driving innovation in hardware, SLAs, and data management. However, Pure Storage lags in user diversification outside the US, lifecycle management plans increase array asset and support costs, and it does not support compute-storage separation.

  • NetApp: A leader. NetApp offers ransomware recovery guarantees and immutable snapshots. It simplifies IT operations through Keystone policies and Equinix Metal services. The BlueXP control plane provides sustainability monitoring to manage energy consumption and carbon emissions. However, NetApp does not offer competitive ransomware detection guarantees for block storage, its product line does not support larger 60TB/75TB SSD drives, and it does not support compute-storage separation.

  • IBM: Leader. IBM’s consumption plan offers unified pricing for product lifecycle and upgrades, providing guarantees for energy efficiency. Flash Grid supports partitioning, migration, continuous load optimization, and cross-platform functionality. However, IBM does not offer capacity-optimized QLC arrays, does not provide file services on block storage, and local flash deployments do not support performance and capacity separation.

  • HPE: Leader. HPE’s Alletra servers allow users to independently scale capacity and performance to save costs. GreenLake can be deployed identically on-premises and on AWS, enabling hybrid management. Load simulation provides users with comprehensive global recommendations for performance and capacity load placement. However, HPE lags in Sustainability and Ransomware aspects, does not support larger 60TB/75TB SSD drives, and there is confusion in product-load combinations.

  • Dell Technologies: Leader. After acquiring EMC, Dell offers a flexible full line of storage products, with APEX providing multi-cloud management and orchestration across on-premises and cloud environments. PowerMax and PowerStore deliver industry-leading 5:1 data reduction and SLA, integrated with Data Domain data backup. However, Dell does not provide a unified storage operating system suitable for mid-range and high-end, which increases management complexity.

  • Huawei: Challenger. Huawei’s multi-layer Ransomware protection is excellent, using network collaboration. Flash arrays offer a three-year 100% reliability and 5:1 capacity reduction guarantee. NVMe SSD FlashLink supports high disk capacity, accelerated by an ASIC engine. However, Huawei is limited in the North American region, does not offer multi-cloud expansion solutions for AWS, Azure, or GCP, customers are concentrated in a few vertical sectors increasing risk, and multiple storage product licenses are overly complex.

  • Infinidat: Challenger. Infinidat enjoys a good reputation in the high-end global enterprise market, offering high-quality services. SSA Express can consolidate multiple smaller flash arrays into a more cost-effective single InfiniBox hybrid array. Data can be recovered from immutable snapshots after a cyberattack. However, Infinidat lacks mid-range products, the cloud version of InfuzeOS is limited to a single-node architecture, and SSDs only support 15TB drives.

  • Hitachi Vantara: Challenger. Hitachi allows users to upgrade to the next-generation solution within five years of installation to reduce carbon emissions. EverFlex simplifies the subscription process for users, charging based on actual usage. EverFlex Control modularizes features, allowing users to customize according to platform needs. However, Hitachi lags in ransomware detection, does not offer disaggregated scaling of compute and storage, and falls behind in QLC SSDs used for backup.

  • IEIT SYSTEMS: Niche player. IEIT features a unique backplane and four-controller design with autonomous load balancing, scalable up to 48 controllers. It offers online anti-ransomware capabilities through snapshot rollback. The Infinistor AIOps tool provides performance workload planning and simulation. However, IEIT is unknown outside the Chinese market, lags in global multi-cloud expansion, and trails in the independent software vendor (ISV) ecosystem.

  • Zadara: Niche player. Zadara offers global, highly skilled managed services based on low-cost object storage and a disaggregated key-value architecture, using flexible lifecycle management to reduce hardware waste. Hardware in multi-tenant environments can be dynamically reconfigured. However, Zadara’s SLAs are limited, such as in ransomware protection, with smaller commercial scale and coverage, and third-party integrations and ISVs depend on managed service providers.

Gartner Magic Quadrant for Primary Storage 2024

What functions should primary storage have? Combining the Magic Quadrant report [59], Gartner Primary Storage Critical Capabilities report [64] (2023), and the Enterprise Storage Mainstream Trends report [66] (2023), we can see:

  • Consumption-based Sales Model: Unlike the traditional purchase of complete storage hardware and software, this model is similar to cloud services, charging based on actual consumption. Accordingly, SLAs are redefined according to user-end metrics, such as 99.99% availability. Gartner predicts that by 2028, 33% of enterprises will invest in adopting the Consumption-based model, rapidly growing from 15% in 2024. Related concept: Storage as a Service (STaaS).

  • Cyberstorage: Detection and protection against ransomware are becoming standard for enterprises, including features such as file locking, immutable storage, network monitoring, proactive behavior analysis, and Zero Trust [65]. Gartner predicts that by 2028, two-thirds of enterprises will adopt Cyber liability, rapidly increasing from 5% in 2024.

  • Software-defined Storage (SDS): SDS frees users from vendor-proprietary hardware, providing cross-platform, more flexible management solutions that utilize third-party infrastructure to reduce operational costs. On the other hand, SDS allows for the decoupled deployment of compute and storage resources, independently and elastically scalable, improving economic efficiency. AIOps capabilities become important and are often combined with SDS. The use of public cloud hybrid cloud features becomes common, which are also often categorized under SDS.

  • Advanced AIOps: For example, real-time event streams, proactive capacity management and load balancing, continuous optimization of costs and productivity, responding to critical operational scenarios such as Cyber Resiliency combined with global monitoring, alerting, reporting, and support.

  • SSD / Flash Arrays are growing rapidly. Gartner predicts that by 2028, 85% of primary storage will be flash arrays, gradually increasing from 63% in 2023, while flash prices may drop by 40%. QLC Flash is becoming widespread, bringing ultra-large SSD drives of 60TB/75TB with better power consumption, space, and cooling efficiency.

  • Single Platform for File and Object. For unstructured data, a Unified Storage platform supports both file and object simultaneously. Integrated systems save costs, and multiprotocol simplifies management. Files and objects themselves have similarities; images, videos, and AI corpus files are used similarly to objects, while objects with added metadata and hierarchical paths resemble files.

  • Hybrid Cloud File Data Services. Hybrid cloud provides enterprises with unified access and management across Edge, cloud, and data centers, with consistent namespaces and no need for copying. Enterprises can perform low-latency access and large-scale ingestion at the Edge, complex processing in data centers, and store cold data and backups in the public cloud. It is evident that traditional storage products are moving to the cloud, and public clouds are developing Edge deployments.

  • Data Storage Management Services. Similar to data lakes, data management services read metadata or file content to classify, gain insights, and optimize data. They span multiple protocols, including file, object, NFS, SMB, S3, and different data services such as Box, Dropbox, Google, and Microsoft 365. Security, permissions, data governance, data protection, and retention are also topics of discussion. Against the backdrop of rapid growth in unstructured data, enterprises need to extract value from data and manage it according to importance.

  • Other common features include: Multiprotocol support for multiple access protocols. Carbon Emissions continuous measurement, reporting, and energy consumption control. Non-disruptive Migration Service, ensuring 100% data availability during migration from the current array to the next. NVMe-oF (NVMe over Fabrics), a native NVMe SAN network. Container Native Storage provides native storage mounting for containers and Kubernetes. Captive NVMe SSD, similar to Direct-Attached drives, customized for dedicated scenarios to enhance performance and endurance.

Gartner Top Enterprise Storage Trends for 2023

Additionally,

  • Key user scenarios that primary storage needs to support include OLTP online transaction processing, virtualization, containers, application consolidation, hybrid cloud, and virtual desktop infrastructure (VDI).

  • Key capability metrics for primary storage: performance, storage efficiency, RAS (Reliability, Availability and Serviceability), Scalability, Ecosystem, Multitenancy and Security, and Operations Management.

Gartner Critical Capabilities for Primary Storage 2023

Another way to understand the required functions of primary storage is through user feedback. How do users perceive our products? [67] lists feedback from user interviews about likes and dislikes regarding a certain storage product. [68] provides a common user tender document. From these, some easily overlooked aspects can be seen:

  • Usability. For example, simple configuration and convenient management hold an important place in users’ minds, comparable to performance and cost factors. For enterprise users, permission management and integration with other commonly used systems and protocols are also crucial, such as file sharing and Active Directory. Customer service and support can translate into real monetary value.

  • Resource Efficiency. Storage deployed in users’ local data centers often faces issues of idle resources or some resources being exhausted while others remain unused. Expansion is a common demand and needs to be compatible and integrated with legacy systems. Cloud-like load migration, balancing, and continuous optimization are very useful. Disaggregated scaling and purchasing resources separately, avoiding bundling, can bring economic benefits to users.

  • The screenshot includes only part of the user feedback; the full text can be found in [67][68] original sources.

Customer interview Like and Dislike FlashBlade

Customer bidding storage example

What is the future development direction of primary storage technology? This can be learned from Gartner’s Hype Cycle. The following figure comes from [69][70], with different classifications. It can be seen that:

  • Object Storage, Distributed File Systems, and Hyperconvergence have been validated. DNA Storage, Edge Storage, Cyberstorage, Computational Storage, and Container Storage and Backup are emerging.

  • Distributed Hybrid Infrastructure (DHI) and Software-Defined Storage (SDS) are technologies poised to bring transformation. DHI provides cloud-level solutions for users’ on-premises data centers, such as consumption-based models, elasticity, resource efficiency, and seamless connectivity with external public clouds and Edge clouds. Its related concept is Hybrid Cloud.

  • The Hype Cycle of Storage and Data Protection chart is similar. Hybrid Cloud Storage is similar to DHI. Immutable Data Value falls under Cyberstorage. Enterprise Information Archiving belongs to archival storage, which is also a validated technology and will be discussed in the next section.

Gartner hype cycle storage technologies 2024

Gartner hype cycle storage technologies priority matrix 2024

Gartner hype cycle storage and data protection technologies 2022

A. Backup and Archival Storage

The first question is, how large is the market size and growth rate for backup and archival storage?

  • Market Research Future predicts [72] enterprise backup storage will have a market size of about $27.6B in 2024, growing thereafter at an annual rate of approximately 11.2%. Market growth is mainly driven by data volume growth, data protection, and the demand for ransomware protection.

MarketResearchFuture Data Backup And Recovery Market Size

  • As part of enterprise backup storage, archival storage holds a smaller share but grows faster. Grand View Research forecasts [73] a market size of about $8.6B in 2024, with an annual growth rate of approximately 14.1% thereafter. Market growth is primarily driven by data volume increase, stricter compliance requirements, data management, and security.

GrandViewResearch Enterprise Information Archiving Market Size
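As a quick sanity check on these figures, the following sketch projects both markets forward under the (strong) assumption that the cited CAGR stays constant; the 2024 base sizes and growth rates are the ones quoted above, and the 5-year horizon is only an illustration.

```python
def project(size_now: float, cagr: float, years: int) -> float:
    """Project a market size forward, assuming the CAGR stays constant."""
    return size_now * (1 + cagr) ** years

# 2024 base figures cited above (USD billions).
print(f"Backup & recovery in 2029: ~${project(27.6, 0.112, 5):.1f}B")  # ~= $47.0B
print(f"Archival (EIA)    in 2029: ~${project(8.6, 0.141, 5):.1f}B")   # ~= $16.6B
```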

The next question is, who are the main market players in backup and archival storage? From Gartner’s Magic Quadrant for Enterprise Backup and Recovery Software Solutions [74] (2023) [75] (2024), the leading players in this market can be identified:

  • Commvault: Leader. BaaS coverage is extensive, including SaaS applications, multi-cloud, and on-premises deployment, supporting Oracle OCI. Backup & Recovery interoperability is good. Commvault brings enterprise-level features at a competitive price. However, Commvault’s innovation in on-premises deployment lags behind the cloud, with some users reporting poor experience, and the HTML5 user interface lacks features compared to the on-premises application.

  • Rubrik: Leader. Rubrik innovates in product pricing portfolios, such as offering capacity-based user tiers for Microsoft 365. Rubrik excels in ransomware protection, including machine learning and anomaly detection in backup data. Rubrik’s scalability and excellent customer service continue to attract large enterprises. However, Rubrik needs to balance investments in security and backup, with limited SaaS application coverage and optional cloud storage mainly on Azure Storage.

  • Veeam: Leader. Veeam has a loyal user base and the Veeam Community. Veeam supports hybrid cloud and all major public clouds. Veeam has a large number of partners worldwide. However, Veeam is slow to respond to market demands for BaaS, SaaS, and ransomware protection; the software is overly complex, and a secure platform deployment requires careful design and configuration.

  • Cohesity: Leader. Helios is a SaaS-based centralized control plane that provides a unified, intuitive management experience for all backup products. DataProtect and FortKnox allow users to choose multiple public cloud storage locations. Cohesity actively forms Data Security Alliances with vendors from different fields. However, Cohesity’s new investments introduce third-party technology dependencies, its Backup as a Service (BaaS) capabilities are insufficient, and its geographic coverage is limited.

  • Veritas: Leader. Veritas offers comprehensive backup products, such as cloud and scale-out & scale-up solutions. NetBackup and Alta services support cloud-native operations and run Kubernetes on public clouds. Services and partners cover the globe. However, some of Veritas’s cloud products are still in early stages, it focuses on large enterprises and is less friendly to small and medium businesses, and it lacks SaaS application support (Microsoft Azure AD, Azure DevOps, Microsoft Dynamics 365, and GitHub).

  • Dell Technologies: Leader. PowerProtect provides data protection and Ransomware protection, supporting both on-premises and cloud deployments. It allows users to balance capacity across multiple appliances. It offers consistent management across multiple public clouds and is available on the Marketplace. However, Dell lacks a SaaS control plane, does not support alternative backup storage options, and advanced Ransomware analysis requires a dedicated environment.

  • Others: challengers, visionaries, and niche players. Briefly mentioned, see the original report [74] for details.

Gartner Magic Quadrant for Enterprise Backup and Recovery 2024

The next question is, what are the key features required for backup and archiving products? From [74], a series of Core Capabilities and Focus Areas can be seen:

  • Backup and Data Recovery: the foundational capability. Support for on-premises data centers and public cloud. Support for point-in-time backups, business continuity, disaster recovery, and other scenarios. Configure multiple backup and retention policies aligned with company policies. Tier cold and hot backup data to different locations, such as public cloud, third-party vendors, and object storage. Global deduplication and data reduction.

  • Cyberstorage: Backup data to immutable storage, Immutable Data Vault. Detect and defend against ransomware. Support disaster and attack recovery testing and drills. Provide protection for different targets such as containers, object storage, and Edge, covering on-premises, cloud, and hybrid cloud environments. Fast and reliable recovery, restoring archives, virtual machines, file systems, bare-metal machines, and different points-in-time.

  • Control Plane: A centralized control plane, unified across different products and across local and hybrid cloud environments. Manages distributed backup and recovery tasks, as well as testing and drills. Manages company compliance, data protection, and retention policies. Integrates with other common SaaS and BaaS products. The control plane should be SaaS-based and cloud-like, rather than requiring users to manage installation and upgrades themselves.

  • Cloud-native: Backup software itself can be deployed cloud-natively, for example on Kubernetes. Data protection covers cloud-native workloads, such as DBaaS, IaaS, and PaaS. Integrates with public cloud services, supports storing data in the cloud, and supports scheduling tasks in the cloud. Backup products provide services in a BaaS manner close to the cloud. Payment is based on actual usage (consumption-based), rather than forcing users to purchase entire appliances.

  • GenAI & ML: Supports generative AI, for example in task management, troubleshooting, and customer support. Supports machine learning, for example for ransomware detection and automatic data classification.

The final question is, what is the future development direction of backup and archiving technology? This can be learned from Gartner’s Hype Cycle [69] (2024), as shown in the figure below. It can be seen that:

  • Data archiving, archive-dedicated appliances, and data classification have been validated. Cyberstorage, generative AI, cloud recovery (CIRAS) [76], backup data reuse analysis, and others are emerging.

Gartner hype cycle backup and data protection technologies 2024

B. File Storage

File storage holds an important position in enterprises and cloud storage. First, how large is the market size of file storage, and how fast is its growth rate? The VMR report [78] points out,

  • Distributed file systems & object storage had a market size of about $26.6B in 2023, with a compound annual growth rate of approximately 16%. This growth rate is roughly comparable to primary storage, slightly slower than overall cloud storage.

  • In many reports, file systems and object storage are combined in statistics. Indeed, the user scenarios for these two types of storage overlap, and in recent years their development has also absorbed each other’s characteristics. See the “Intertwined” section of this article.

  • Additionally, the Market Research Future report [79] provides the market size for (cloud) object storage alone (object storage is mainly cloud-based). By comparison, the object storage market size in 2024 is only $7.6B, with an annual growth rate lower than that of file storage, about 11.7%.

  • Another report from VMR [81] gives the market size for block storage, which can be used for comparison. In 2023, it was about $12.8B, with an annual growth rate of approximately 16.5%. The growth rate of the block storage market size is faster than that of object storage and similar to file storage.

VMR Global Distributed File Systems and Object Storage Solutions Market By Type

In Gartner’s Magic Quadrant for file and object storage platforms [77] (2024), the main players in this market can be seen. Note that file systems and object storage are still combined in the statistics. Also note, this mainly targets storage vendors, similar to primary storage, rather than public cloud (public cloud is covered in the “Cloud Storage” section).

  • Dell Technologies: Leader. After acquiring EMC, Dell has the broadest portfolio of software and hardware products, including unstructured data and purpose-built products. Dell has a global supply chain and suppliers. Dell works closely with Nvidia and invests in AI projects. However, PowerScale lacks a global namespace and Edge caching, faces intensified competition from modern flash storage with different architectures, and relies on ISVs to address critical needs.

  • Pure Storage: Leader. FlashBlade uses NVMe QLC SSDs, offering the industry’s highest density and lowest TB power consumption, with pricing competitive compared to HDD hybrid arrays. Evergreen//One and Pure1’s AIOps capabilities and monitoring ensure user SLAs. FlashBlade partners with Equinix Metal to extend on-premises infrastructure globally. However, the Evergreen//Forever program significantly increases capital expenditure, ransomware detection capabilities are limited, and hybrid cloud support is limited, such as deployments on AWS, Azure, and GCP using VMs and containers.

  • VAST Data: Leader. VAST’s strategic partnerships and marketing have greatly increased large customers. VAST uses QLC flash, advanced data reduction algorithms, and high-density racks. End users recognize its excellent customer service, including knowledge, pre-sales, architecture, ordering, and deployment. However, VAST lacks brand-integrated appliances, making it difficult to attract conservative global enterprises; frequent software updates cause instability; it lacks enterprise features such as synchronous replication, Stretched Cluster, Geodistributed Erasure Coding, Active Cyber Defense; and has limited hybrid cloud appeal.

  • IBM: Leader. IBM leads in the HPC market, combined with AI. File and object storage provide a global namespace across data centers, cloud or Edge, and non-IBM storage. IBM continuously enhances Ceph storage, favored by open-source users, unifying file, block, and object storage. However, IBM’s product portfolio is complex, cloud support is insufficient, and file storage tends to focus on HPC rather than general scenarios.

  • Qumulo: Leader. Qumulo offers the simplicity of SaaS and cloud elasticity on Azure. Its software provides consistent functionality and performance both on-premises and in the cloud. Qumulo’s global namespace enables access across on-premises and multiple clouds. However, Qumulo lacks ransomware detection capabilities, does not provide its own hardware and relies on third parties, and has limited global coverage.

  • Huawei: Challenger. OceanStor Pacific offers a unified platform for file, block, and object storage. From AI performance to data management, Huawei possesses proprietary hardware technologies, including chips and flash memory. Customer support and service are highly rated. However, U.S. sanctions and geopolitical issues limit global expansion, support for other public clouds like AWS, Azure, and GCP is limited, and flexible SDS solutions are not offered.

  • Nutanix: Visionary, with foresight surpassing all leaders. The NUS platform can consolidate various user storage workloads and centrally manage them under hybrid cloud. NUS simplifies implementation, operations, monitoring, and scaling management. Customer support services are recognized for reliability and responsiveness. However, the hyperconverged platform is not favored by users who only want to purchase storage, file and object storage have limited acceptance in hybrid cloud deployments, and it does not support RDMA access to NFS, making it unsuitable for low-latency scenarios.

  • WEKA: Visionary. The parallel file system is suitable for the most demanding large-scale HPC and AI workloads. The converged mode allows the file system and applications to run on the same server, improving GPU utilization. Hybrid cloud is widely available across public clouds and Oracle OCI. However, backup and archiving solutions are not cost-effective, S3 and object support are limited, and it lacks ransomware protection, AIOps, synchronous replication, data efficiency guarantees, and geographically distributed object storage.

  • Scality: Visionary. Scality’s RING architecture supports EB-level deployments, with independent scalability of performance and capacity. Scality pursues a pure software solution that can run on a wide range of standard hardware, whether at the Edge or in data centers. RING data protection supports geographic distribution across multiple availability zones, with zero RPO/RTO, and extremely high availability and durability. However, as an SDS solution, it relies on external vendors and lacks the capability to deliver turnkey appliances; files are implemented via POSIX integrated with object storage, making it unsuitable for HPC.

  • Others: Participants in specific fields. Briefly mentioned, see the original [77] report for details.

Gartner Magic Quadrant for File and Object Storage Platforms 2024

What are the main features that file and object storage systems should have? A series of Core Capabilities and Top Priorities can be seen from the Gartner Magic Quadrant report, as listed below. On the other hand, it can be observed that these are largely consistent and similar to the main features of primary storage, backup, and archival storage.

  • Global Namespace: Unified management and access of files across local data centers, Edge, and multiple public clouds. Supports geographic distribution and replication protection. Supports hybrid cloud, S3, and multiple file access protocols. Unified Storage: Files, blocks, and objects are served by a unified platform. A single platform handles high performance and data lakes.

  • AIOps: Supports AIOps, simplified and unified management configuration, and automation. Provides excellent customer service in knowledge and architectural solutions. Data management, such as metadata classification, cost optimization, data migration, analysis, and security. Data lifecycle management. Metadata indexing, file and object labeling/tagging. Software-defined storage (SDS).

  • Cyberstorage: Provides ransomware detection and protection, maintaining business continuity during attacks. Response and data recovery. Of course, traditional security features such as data encryption and authentication are essential.

  • Cost and Performance: Uses QLC flash with advantages in capacity and power consumption. Increases rack storage density. Data reduction technologies such as deduplication, compression, and erasure coding, as well as data efficiency guarantees. Uses flash or SSD to accelerate file access, provides caching, and performs data reduction on flash. RDMA access reduces latency, Edge storage reduces latency. Supports linear scaling, supports separate scaling of performance and capacity, and properly handles performance and capacity bursts. STaaS model with consumption-based payment. Manages power consumption and carbon emissions.

  • Different User Scenarios: General file systems, databases, objects (or using files in an object manner), HPC, and AI represent different user scenarios, each with trade-offs in functionality and performance. See the “Enterprise Files - Data Volume” chart from Nasuni below [80].

Nasuni Types & Volume of Files in the Enterprise

File serving is also one of the functions of primary storage; its future development trends and Hype Cycle will not be repeated here. Refer to the “Primary Storage” section.

B. Object Storage

Object storage and file storage are often combined in statistics due to their similar functions, such as in Gartner’s Magic Quadrant for File and Object Storage Platforms [77]. The previous section “File Storage” already included object storage, so it will not be repeated here.

On the other hand, the classic scenario for object storage is cloud storage, as mentioned in the previous section “Cloud Storage,” which will not be repeated here. The functionalities required for cloud storage can also be benchmarked against storage vendors, as seen in the “File Storage” section.

B. Block Storage

The VMR report [81] provides the market size and growth rate for block storage, which has already been included in the chart in the “File Storage” section.

Block storage is one of the core functions of primary storage and is usually combined into primary storage statistics, which have already been covered in the “Primary Storage” section and will not be repeated here. Modern platforms are often unified storage systems, simultaneously providing file, block, and object services.

On the other hand, block storage is one of the classic scenarios of cloud storage, as mentioned in the previous section “Cloud Storage”, and will not be repeated here.

B. Database

What is the market size and growth rate of databases? According to the forecast from Grand View Research:

  • Databases had a market size of approximately $100.8B in 2023, with an annual growth rate of 13.1% [82]. Among them, the global cloud database market was about $15.05B in 2022, with an annual growth rate of 16.3% [83].

  • Compared with the previous section “Cloud Storage”, it can be found that: 1) The main market for databases is non-cloud. 2) Cloud databases grow faster than non-cloud databases, but still far more slowly than cloud storage (21.7%). 3) The cloud storage market is much larger than the database market.

GrandViewResearch Database Market Size

Based on the market size and growth rates from the previous sections, various storage types can be plotted and compared, showing:

  • Cloud storage has the largest market size and the fastest growth rate (21.7%), offering good investment value. Next is the non-cloud database market, which is large in size but has a lower growth rate (13.1%).

  • The markets for file storage, block storage, and cloud databases are smaller but have decent growth rates (16%~17%). Meanwhile, object storage is weaker, with a small market size and a lower growth rate (11.7%).

  • In backup and archival storage, archival storage (14.1%) is growing faster than backup storage (11.2%). The former is growing quickly, while the latter has a larger existing volume.

Storage market size compare

Zoomed-in view of the smaller market segments:

Storage market size compare
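For readers who want to reproduce a comparison chart like the ones above, here is a minimal sketch that plots size against growth rate using only the figures cited in the preceding sections (note the base years range from 2022 to 2025, and cloud storage is omitted because only its growth rate of 21.7% is quoted in this article).

```python
import matplotlib.pyplot as plt

# Market size (USD billions) and CAGR (%) as cited in the sections above.
markets = {
    "Databases (total)":        (100.8, 13.1),
    "Enterprise flash":         (67.17, 9.89),
    "HDD":                      (62.43, 6.1),
    "Backup & recovery":        (27.6, 11.2),
    "File & object (combined)": (26.6, 16.0),
    "Cloud databases":          (15.05, 16.3),
    "Block storage":            (12.8, 16.5),
    "Archival (EIA)":           (8.6, 14.1),
    "Object storage (cloud)":   (7.6, 11.7),
    "Tape":                     (3.5, 5.82),
}

fig, ax = plt.subplots(figsize=(8, 5))
for name, (size, cagr) in markets.items():
    ax.scatter(size, cagr)
    ax.annotate(name, (size, cagr), textcoords="offset points", xytext=(5, 3), fontsize=8)
ax.set_xlabel("Market size (USD billions)")
ax.set_ylabel("CAGR (%)")
ax.set_title("Storage market segments: size vs. growth")
plt.tight_layout()
plt.show()
```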

Although databases store data, in market segmentation they are generally not classified as part of the “storage” market. Storage usually refers to files, blocks, and objects, while databases operate on top of file and block storage. Databases are a large and complex topic with a persistently active market, worthy of a separate article, whereas data lakes span both database and storage attributes (structured and unstructured data).

This article focuses on storage and therefore does not delve further into databases. The following is listed only:

  • Gartner Cloud Database Magic Quadrant (2024)[84].

Gartner Magic Quadrant for Cloud Database Management Systems

  • Gartner Data Management Technology Hype Cycle (2023)[71].

Gartner hype cycle backup and data management 2023

C. SSD Storage

Market Research Future predicts[85] that enterprise flash storage will have a market size of approximately $67.17B in 2025, with an annual growth rate of about 9.89%.

MarketResearchFuture Enterprise Flash Storage Market Size

Flash storage is commonly used for primary storage, file storage, and block storage, which are the more common classifications in analytical reports and have been discussed in previous sections, so they will not be repeated here. Strictly speaking, flash is the storage medium, while SSD usually refers to the drive that packages flash with a controller.

C. HDD Storage

Market Research Future predicts [87] that the HDD market size will be about $62.43B in 2024, with an annual growth rate of approximately 6.1%. Note, this is the market for disks, not storage. The HDD market is facing decline, mainly due to being replaced by SSDs.

MarketResearchFuture Global Hard Disk Market Size

HDD storage is commonly used for primary storage, hybrid (flash) arrays, object storage, and backup systems. Analytical reports more commonly classify by these categories than by SSD/HDD, and storage systems often do not rely on a single medium. These have been discussed in previous sections and will not be repeated here.

C. Tape Storage

Market Research Future predicts [86] that tape storage will have a market size of approximately $3.5B in 2024, with an annual growth rate of about 5.82%. Compared to SSD and HDD, the market size of tape is small and its growth rate is low.

MarketResearchFuture Tape Storage Market Size

Tape is commonly used for archival storage, which has been discussed in previous sections and will not be repeated here.

C. Memory Storage

Memory storage is generally used for databases or caches. Storage is usually not purely memory-based because it is difficult to ensure data persistence, especially during power outages in data centers. Memory in storage systems is typically used for serving metadata or indexes and is not independent of other storage categories. Therefore, this section will skip memory storage.

Market Analysis

The previous chapter covered the main segments of the storage market, key participants, their products, core product requirements, and possible future directions. This chapter will continue to delve deeper. Focusing on the market, it can reveal its structure and growth potential, driving factors, and core value.

Where do we stand in the constantly changing market landscape, and where will we be 3~5 years, or even 10 years from now? By understanding the underlying patterns, we can support Vision and Strategy analysis in planning where we should be over that horizon.

Market Structure

Basic market analysis includes market segmentation, market size, user scenarios, competitive landscape, products, and features, as discussed in the previous chapter. For the storage market, there are more dimensions to consider. For example, what is the market’s “natural structure”? It determines the product’s ceiling and growth model.

Market overview structure

Customer Composition

When considering the development of new products and features, identify which customers make up the corresponding market.

A typical classification is SMB, large enterprises, and specialized fields. As customer types, SMBs (small and medium-sized businesses) and large enterprises have significantly different product demands and marketing strategies. Although large enterprises can provide substantial sales profits, SMBs have weaker buyer power and fewer customization demands; heavy customization can even turn the vendor into an operations and maintenance provider.

Today, government procurement should be added as a new customer type. Additionally, individual consumers should also be included, as they often purchase cloud storage (see below Empower Everyone). On top of this, the degree of buyer monopoly should be considered a key factor in market structure. On the seller side, open source should be added as a competitor.

This topic further leads to Porter’s Five Forces Analysis[88]: competitors, supplier power, buyer power, threat of substitutes, and threat of new entrants.

A similar classification is low-end, mid-range, and high-end, covering customers with different preferences and scales. The low-end focuses on volume and standardization. The high-end serves large enterprises, customized needs, or specialized professional fields.

Another dimension regarding customers is stickiness, such as social networks. For details, see the section What is Value below.

The natural structure of the market

Some markets naturally have scale effects, such as hydropower and cloud computing. Competition ultimately leads to mergers among participants, leaving only a few companies, while the survivors enjoy a dual increase in revenue and profit margins.

Some other markets exhibit anti-scale effects, such as education and training, consulting, headhunting, and investment. These markets allow new small participants to continuously join, large participants tend to fragment, and mature individuals or teams tend to operate independently.

The corresponding market dimension is growth pattern. Under scale effects, the number of users of an internet product can grow exponentially. When COGS and labor costs occupy a fixed proportion, such as in manufacturing, operation and maintenance services, and outsourcing customization, the product tends to grow linearly. Under anti-scale effects, product growth may even decline; another form of decline is market recession.

Market ceiling

In a market lifecycle (see next section), how high can the market size ultimately grow? This is related to the inherent structure of the market. One reference measure is O(P): across the whole population, each individual uses the product with probability P.

Industries with O(1) scale are rare and valuable, such as social apps and payment applications, which everyone uses. Although Hollywood movies are well-known, not everyone has necessarily seen them. O(1) industries have extremely high ceilings and strong penetration. Conversely, industries with lower ceilings often need to pursue a high-end route to increase unit prices.

In a sense, the economic benefit of Enable Everyone / Empower Everyone is to increase the number of O(1) scale industries and expand the coverage of O(P<1) industries.
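To make the O(P) idea concrete, a back-of-the-envelope ceiling can be written as population × adoption probability × revenue per user; the numbers below are purely illustrative assumptions, not figures from this article.

```python
# Rough market ceiling: population x adoption probability P x revenue per user per year.
def market_ceiling(population: int, p_adoption: float, revenue_per_user: float) -> float:
    return population * p_adoption * revenue_per_user

# An O(1)-style product (nearly everyone uses it) vs. an O(P<1) niche product.
o1 = market_ceiling(5_000_000_000, 1.00, 10)     # e.g. a ubiquitous consumer app
niche = market_ceiling(5_000_000_000, 0.02, 10)  # e.g. a specialized tool
print(f"O(1) ceiling:   ~${o1 / 1e9:.0f}B / year")
print(f"O(P<1) ceiling: ~${niche / 1e9:.0f}B / year")
```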

Penetration into adjacent markets

Emerging and rapidly developing technologies not only revolutionize their own industries but often penetrate adjacent industries, further expanding market and sales scope.

For example, cloud computing started by selling computing and storage resources but gradually replaced local operations and maintenance for enterprises. Object storage was originally used for storing images and videos, but unified storage platforms have the capability to manage files and block services as well. The penetration of internet platforms into various industries is evident.

Besides growth models, penetration capability is another dimension for measuring market potential. In other words, 1+1>2, multiple products form a progressively strengthening feedback loop (closed loop).

Conversely, a market vulnerable to penetration is unfavorable and often requires complementary investment in adjacent markets, using a product portfolio to build a moat.

Inference

From the perspective of personal career development, the market segment one joins plays an important role. When products have scale effects, companies tend to retain a small number of top-tier talents and are willing to pay high salaries, because labor is not where the cost lies. Are human resources a cost or a multiplier?

When a product has a linear growth model, compensation is often not high, but fortunately there is a larger volume of work. The market’s natural profit margin determines the expected wage level. See the Understanding Stock Prices chapter.

Large-scale layoffs often indicate that the market is in a declining or recession phase, which is very unfavorable for career development. Anticipating this before being “forced out” at least helps one avoid actively staying in a disadvantageous market.

Market Life Cycle

Continuing from market structure, the next key dimension is the market life cycle. Where will our team and products be positioned in a few years? Market structure explains growth and limits, while the market life cycle predicts its stage. Strategies are formulated based on this to pave the way for the next cycle.

Market growth stages

Market Stages

The market stages can be divided into the Introduction Stage, Growth Stage, Maturity Stage, and Decline Stage.

New technologies lie dormant among niche enthusiasts during the Introduction Stage, being advanced but growing slowly. They rapidly explode and grow exponentially in the Growth Stage. In the Maturity Stage, intense competition and mergers occur, focusing on quality and customer retention. In the Decline Stage, they are gradually replaced, with both revenue and profit margins declining.

More importantly, through market analysis, it is possible to predict when the market will enter stages such as the Growth Stage or Decline Stage, thereby planning strategic shifts accordingly.

Sources of new markets

New markets often arise from scale growth, new technologies, maturity, business model changes, and policy compliance. These are explained in detail in the following section Driving Factors. Disruptive innovation is the driving force behind market renewal.

Disruptive Innovation

This section still belongs to the market lifecycle chapter but is separated due to its importance. It can be said to be the most important concept. In the technology industry, disruptive innovation is the driving force of market renewal. Disruptive innovation marks both the beginning and the end of the market cycle.

(For more on innovation, see Methodologies for Skilled Innovation).

Incremental Innovation

Before breaking out of a single market lifecycle, business growth generally relies on incremental innovation. However, as complexity accumulates, marginal gains decrease and resistance increases. Market growth slows and competition intensifies, leading to “involution” or stagnation.

On the other hand, it can be seen that whether it is incremental innovation or disruptive innovation, daily work in enterprises cannot do without innovation. Incremental innovation itself is not simple; it requires experience and insight to find effective ways to “make further progress on a hundred-foot pole” and lead the team to successful implementation.

Disruptive Innovation

Disruptive innovation brings new technologies and new paradigms; a new market cycle thus begins, replacing and ending the previous one.

New technologies lurk in the low-end market during the introduction phase and are often unnoticed by mature participants in the original market. After entering the growth phase, new technologies rapidly capture a large number of users, forcing the original market into decline. For the displaced participants in the original market, large scale becomes a negative factor (see the Understanding Stock Prices chapter), making it difficult to respond freely. Ultimately, new technologies claim the crown of the high-end market, completing market replacement. The cycle of old and new alternates and repeats, with the industry developing through successive overlapping waves [90].

Disruptive Innovation Growth

There are many examples of disruptive innovation. For instance, cloud computing penetrates enterprise storage, databases, and operations markets; NewSQL brings scale-out distributed architecture to databases; unified storage introduces SDS and implements distributed file systems; containers and Kubernetes revolutionize cluster management. More examples are shown in the figure below [90].

Disruptive Innovation Growth

Characteristics of Disruptive Innovation

The “development and progress” of disruptive innovation is reflected in multiple aspects. New technologies have higher productivity and efficiency than old technologies, and after complete replacement the market ceiling is raised further. New technologies are more dynamic; besides replacing the original market, they penetrate adjacent markets, further expanding market size. New technologies require existing products and their upstream and downstream support to be renovated and code to be rewritten, bringing a new wave of labor demand and an escape from “involution.”

“Rewrite” means that disruptive innovation does not simply discard the products of the previous market cycle. Knowledge, experience, and old paths are brought into the next cycle for reuse, spiraling upward. For example, the DPU is a recent innovation in the storage field, but ASICs have long been used in switches [89], and storage before SDS was already “specialized hardware.” In the long run, software and hardware swing back and forth, and the experience from the previous cycle, and even the one before it, has high reuse value.

In recent years, disruptive innovation has accelerated, and market cycles have shortened. In traditional industries, old technologies could last a lifetime. Backend technologies like storage and servers could be reused for about ten to twenty years. However, the rapid iteration of the internet and frontend technologies may completely change within five years. The rapid development of generative AI is even more astonishing, with breakthrough results published monthly. This acceleration trend benefits from improved production efficiency, global collaboration convenience, mature open-source infrastructure, visionary financial investment, and support for rapid enterprise expansion.

Inference

The new market cycles under disruptive innovation often “rewrite” the products of the last cycle and repeat the paths of the cycle before that, “spiraling upward.” This means experienced employees are especially important, because they have gone through the last cycle and even the one before that; their experience and witnessed history can be replicated in the next market cycle. (This contrasts with the current workplace trend of eliminating employees at age 35.)

On the other hand, newcomers have special value. Disruptive innovation requires thinking outside the box, and newcomers offer a rare chance to escape fixed mindsets and shift perspectives; consult them before they are “contaminated” by the team. Mature employees are often more or less “contaminated” already, following the “we have always done it this way” approach, accustomed to “mature” experiences and perspectives, and they in turn “contaminate” the newcomers.

Disruptive innovation means that the current work is bound to “fail”. Modern disruptive innovation is accelerating, and market cycles are shortening. This means that individual career spans will be shorter, possibly facing technological upgrades and large-scale layoffs within five to ten years. Meanwhile, new graduates are more competitive, having received systematic and comprehensive training for new technologies.

Those displaced by technological upgrades are also former beneficiaries. In a rapidly evolving market, newcomers always have many opportunities to enter, surpass veterans, and earn high salaries. The market is less likely to form “seniority” and barriers, making it more attractive for newcomers to join.

Driving factors

What drives the continuous emergence of new demands, allowing various market participants to survive and grow, and periodically initiating new market cycles? For the storage market, the driving factors come from multiple aspects: scale growth, new technologies, maturity, business model changes, and policy compliance. Understanding these driving factors helps determine future development space and direction.

Market demand driving factors

Scale Growth

Compared to other markets, a major characteristic of the storage market is natural scale growth, continuous and at a considerable pace, supporting about 10%~20% market size growth.

To cope with scale growth, various innovations are spawned. For example, on the software side, distributed file systems support linear expansion of data scale, as well as the entire big data ecosystem. On the hardware side, most hardware capabilities grow exponentially year by year, see the Hardware in Storage Systems section, accompanied by generational technology upgrades such as PCIe and QLC flash. In operations and management, SDS allows more convenient and flexible management of large-scale, heterogeneous storage devices, as well as cloud computing.

Data growth is accompanied by improvements in management efficiency. How much data can one person manage? From the old era where one person managed one machine, to operations where one person manages 1PB of data, and now to the cloud era, where a small team manages hundreds of data centers worldwide.

New technology

The driving force behind market updates comes from disruptive innovation. Technology upgrades and paradigm shifts bring about the “rewriting”, rebuilding, and reconfiguration of upstream and downstream demands for original products, creating numerous job opportunities. During the process of replacing the original market, new markets generate a large amount of purchasing demand. New technology often creates new scenarios that did not exist before, further stimulating demand.

Another source of growth for new technology comes from hardware development, whose capabilities improve exponentially. Stronger, faster, larger, and cheaper hardware makes previously impossible scenarios possible and turns expensive products into affordable ones; yesterday’s miracles are bottled, discounted, and sold at clearance.

What follows is the demand at the software level. Software needs to integrate heterogeneous hardware, adapt to the next generation, optimize SKU combinations, and even co-design. Software needs to incorporate different enterprise systems and unify management of a large number of devices. Software needs to improve resource efficiency while preserving the native performance of the hardware as much as possible. To cope with complexity and variability, more technologies have also emerged. See the Hardware in Storage Systems section - “The Value of Software.”

Maturity

As the market matures, customers expect products to offer richer and more refined features, a process accompanied by incremental innovation. These provide participants with daily work.

For storage systems, specific expectations are: larger scale data, better performance, higher reliability, lower cost, more convenient management, greater security, richer features, stronger integration, customer service, and support.

Market driving factor of maturity

First, larger scale data has given rise to a series of capacity and performance technologies related to Scale-out and Scale-up. On the capacity side, distributed consistency (such as Paxos), distributed transactions, data organization (such as columnar storage), and indexing technologies (such as Masstree) are required. To manage large-scale data, cluster management (such as K8S), deployment orchestration, operation automation, monitoring, and alerting technologies (such as time-series databases) have emerged.

Second, better performance has driven optimizations across multiple dimensions, simplifying call paths (such as DPDK, SPDK), high-speed networks (such as RDMA), load balancing (such as Hedged Request), and dynamic migration. On the other hand, integration with hardware development (such as DPU, ZNS SSD) is also advancing.
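As a small illustration of the load-balancing idea mentioned above, here is a minimal sketch of a hedged read in Python asyncio: the read goes to one replica first, and if it has not answered within a deadline, a second request is issued and whichever finishes first wins. The replica latencies and the 10 ms hedge deadline are simulated assumptions, not taken from any real system.

```python
import asyncio
import random

async def fetch_replica(replica_id: int, delay: float) -> str:
    # Simulated storage read from one replica; 'delay' stands in for network + disk latency.
    await asyncio.sleep(delay)
    return f"data-from-replica-{replica_id}"

async def hedged_read(hedge_after: float = 0.010) -> str:
    """Read from a primary replica; if it is slow, hedge to a backup and race the two."""
    delays = [random.uniform(0.001, 0.030) for _ in range(2)]
    primary = asyncio.create_task(fetch_replica(0, delays[0]))
    done, _ = await asyncio.wait({primary}, timeout=hedge_after)
    if done:
        return primary.result()
    # Primary has not answered within the deadline: issue the hedged request.
    backup = asyncio.create_task(fetch_replica(1, delays[1]))
    done, pending = await asyncio.wait({primary, backup}, return_when=asyncio.FIRST_COMPLETED)
    for task in pending:
        task.cancel()
    return done.pop().result()

if __name__ == "__main__":
    print(asyncio.run(hedged_read()))
```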

Driven by the demand for lower costs, data storage costs have been continuously reduced through methods such as hot and cold tiering, erasure coding, data compression, and global deduplication technologies. Meanwhile, the service costs of data writing and reading have been lowered through Foreground EC, chip acceleration, and distributed caching technologies.
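The economics behind erasure coding are easy to see with a little arithmetic; the sketch below compares the raw-capacity overhead of 3-way replication with two common Reed-Solomon layouts (the specific shard counts are illustrative examples, not recommendations from this article).

```python
def raw_overhead(data_shards: int, parity_shards: int) -> float:
    """Physical bytes stored per logical byte of user data."""
    return (data_shards + parity_shards) / data_shards

print(f"3-way replication: {raw_overhead(1, 2):.2f}x raw capacity")   # 3.00x
print(f"RS(10, 4) coding:  {raw_overhead(10, 4):.2f}x raw capacity")  # 1.40x
print(f"RS(6, 3) coding:   {raw_overhead(6, 3):.2f}x raw capacity")   # 1.50x
```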

Next, more convenient management encompasses multiple aspects. Customers expect simple and user-friendly graphical interfaces with automatic upgrades. Management needs to be unified, for example in hybrid clouds that span local, edge, and cloud boundaries. Namespace unification is necessary, commonly used in multi-cloud file systems, as well as global deployment and access. Resource responsiveness and elasticity should be fast, such as with containers.

A major category is greater safety (data protection). Typical requirements include data replication, verification, QoS, backup, snapshots, and disaster recovery. Under global deployment, geographic replication and availability-zone disaster recovery are becoming increasingly common. Protection levels are gradually improving, from a 5-minute RPO to zero RPO, and from infrequent manual snapshots to Point-in-time recovery and Time Travel. On the other hand, formal verification methods like TLA+ are also becoming widely applied in storage.

Another major category is stronger security. Some requirements come from traditional storage: data encryption, transmission encryption, authentication, permissions, firewalls, key management, patch upgrades, etc. Other requirements arise from emerging scenarios such as Zero Trust, ransomware protection, immutable storage, and privacy protection.

Next is Richer Functionality, with multiple directions. A single function can extend upstream and downstream, for example: data format -> visual tables -> automated ETL -> BI statistical reports -> complex data queries -> large-scale storage -> data lakes -> dedicated servers, forming a product portfolio and building a moat. A single function can be more complete and refined, such as file systems supporting more formats, access protocols, and providing various tools. It can even support functions beyond the definition, such as data warehouses supporting data modification and transactions, “open file” windows supporting convenient editing of unrelated files, databases integrating BLOBs, time series, and vectors. The richness of functionality is like fractal tentacles, delving deeper and becoming more detailed layer by layer under conflicting demands.

Finally, Stronger Integration is a common requirement for enterprise applications. In the previous chapter Storage System Market, integration was repeatedly mentioned as a necessary competitive advantage. For example, office software integrates databases and AI, storage platforms integrate third-party ISVs, and data management integrates general file sharing, Active Directory, etc. Integration also provides a window for products to cross boundaries and penetrate adjacent markets. Integration involves numerous complex tasks, such as incompatible APIs, diverse data formats, volatile business processes, diverse human demands, reporting and visualization, management portals, and ongoing maintenance costs.

In addition, customer service and support are also among the requirements for storage products to mature. Customer service not only means resolving product faults but also includes providing architectural solutions for customer scenarios, selecting cost-effective purchase combinations, and handling deployment and implementation: a series of tasks that require extensive knowledge and professional communication. There is also the need for professional and comprehensive documentation writing.

Changes in Business Models

The drivers of demand come from technology on one side and from business models and customers on the other. Compared to e-commerce, social networks, and the internet, changes in storage business models have been relatively slow, but there have still been some changes in recent years.

The pandemic brought about the popularization of working from home and remote work. Enterprises took this opportunity to reduce office rental costs, benefit from global recruitment, and operate cross-state teams daily. Remote work has generated demand for Zero Trust, as well as growth in office software (such as Office 365) and remote meetings (such as Zoom). Office documents, file sharing, and meeting videos have driven storage demand.

Another frequently mentioned trend is STaaS (Storage as a Service). Providing storage services via web services is not only convenient and relieves customers of upgrade management burdens, but also makes consumption-based payment possible. Compared to upfront machine purchases, enterprises reduce costs. Cloud computing is also considered part of this trend.
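A toy calculation shows why consumption-based payment can be attractive when purchased capacity would otherwise sit idle; every price and utilization figure below is a hypothetical assumption used purely for illustration.

```python
# Illustrative 5-year comparison: upfront appliance purchase vs. consumption-based STaaS.
YEARS = 5
upfront_capex = 500_000              # appliances sized for peak capacity (hypothetical)
avg_utilization = 0.45               # share of purchased capacity actually used (hypothetical)
staas_annual_at_full_use = 140_000   # STaaS price per year at peak consumption (hypothetical)

purchase_total = upfront_capex
staas_total = staas_annual_at_full_use * avg_utilization * YEARS

print(f"Upfront purchase over {YEARS} years: ${purchase_total:,.0f}")
print(f"STaaS (pay for actual use):          ${staas_total:,.0f}")
```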

There are more trends, such as the integration of generative AI in user interfaces and customer support, customers favoring SDS storage, containers becoming the fundamental pattern for cluster management, and the popularization of archival storage under GDPR.

The database market is more active, for example, the transition among SQL, NoSQL, and NewSQL, the hybrid of OLAP, OLTP, and HTAP in data warehouses, as well as data lakes.

Policy Compliance

Another source of change in market demand is policy compliance. Recent changes include GDPR, data sovereignty, and corresponding geo-cross-region storage, among others.

GDPR forces an increase in business costs, which also means increased customer spending, thereby expanding the market size. New policy regulations give rise to new technologies to meet the needs of privacy management and data archiving, further enlarging the market and driving growth.

Specifically, policies can be proactively driven by leading market participants to accelerate the relatively slow changes in business models. On the other hand, stringent compliance requirements raise industry entry barriers, excluding smaller competitors and widening the moat of existing players.

What is value

In market analysis, the most core question is what constitutes enduring true value. The “value” here demands a high standard:

  • Premium: Even if competitors sell products with the same feature set, we can still sell at a higher price.

  • Non-replicable: Even if the technology is fully leaked (or open-sourced), our products can still be sold at a higher price.

  • Stable: Does not degrade over time, withstands market cycles, remains stable amid risks.

Technology can provide a good competitive advantage, but the requirements for “value” are higher. Technology is easily copied, and patents can be circumvented. Disruptive innovation is destined to undermine the advantages of existing technology, and market cycles ensure this will happen.

Core Technology suffers from the same weakness. “Core” means it can be “reused” in a “large number” of places. Therefore, it is easy to migrate, and once copied by competitors, it can be used by them. Competitors are likely to encounter similar scenarios and spontaneously develop similar technologies. When one person successfully develops a technology, it can be reused everywhere, putting everyone else doing the same work out of a job (for example, the natural monopoly of open source). By contrast, complexity and monopoly are better at building moats (see below).

Technology brings progress in efficiency, such as the decisive advantage of online shopping platforms over physical stores. But this does not prevent competitors from entering the market and building new online shopping platforms. Instead, moats are often built around trust relationships, monopoly, and iteration speed (see below).

Below, this article breaks down real value into multiple aspects and explains them one by one: relationships, complexity, speed, culture, assets, data, stickiness, monopoly.

Overview of what composes true value

Relationships

The relationship here refers to connection, closer to graph theory, rather than just interpersonal relationships. Naturally, entities with extensive and rich connections are more robust and stable. Connections form chains, creating a network.

What are some examples of relationships? Good government relations help large enterprises secure high-value procurement orders and even influence policies to set industry entry barriers. Social networks are built on interpersonal relationships, which are broad, sticky, and have huge value potential. E-commerce platforms connect with a large number of customers, build trust, and gain traffic. The relationship between enterprises and financial trust (Credit) helps them obtain financing loans with lower costs and more flexibility. Enterprises cultivate their supply chains and work closely with many suppliers to improve product quality and stabilize risks. Enterprises that own the entire industry chain can focus investment on a single product to achieve breakthroughs in volume, giving them an advantage over enterprises with only a single product.

Extensive connections turn intermediaries into brokers, who often have advantages over the individual producers they connect, forming a 1-to-N monopoly advantage. In favorable times, intermediaries use the flexibility of relationships to rapidly scale up. When facing risks, intermediaries can easily transfer risk through their many connections, whereas single producers must respond rigidly. The advantage of intermediaries does not depend on technology; broad connections make it easy to obtain an information advantage, and they can even share profits through connections with the technology side. Those adorned in fine silk are not the ones who raise silkworms; those who herd livestock produce meat and milk, but the harvest belongs to others.

For people, when free to choose, the essence of relationships is trust. Trust is an asset, a value that requires time to accumulate and maintain, and appreciates over time.

Complexity

As mentioned above, a single core technology is difficult to match with true value. Lasting competitive advantage often comes from the complexity of a technology system composed of numerous technologies. It stems from first-mover advantages and years of accumulation. It is extremely difficult for an individual to fully explain, clarify the entire system, or even replicate it.

Moreover, the technology system is alive: it has a real active market, competitors, customers, and sales, improving continuously through a feedback loop. It involves people and experience. Replicating a living entity is even more difficult.

How can the accumulation of the technology system be accelerated? Sustained high speed is one of the true values, explained in the next section, Velocity. It requires maintaining high-quality organizational relationships and culture; sustaining large-scale, complex, mature products and product portfolios; maintaining efficient execution, talent composition, and engineering practices; sustaining relationship networks among departments, customers, and supply chains; and striving to reduce communication and transaction costs. These relate to various chapters of this article. The complexity embodied by these intangible assets requires long-term construction and is extremely difficult to replicate.

There are more examples of complexity. A typical one is experience, which is difficult to learn directly through storytelling or books, as it involves a mix of numerous fragmented pieces, chaotic interweaving, and intuition. It must be personally experienced and accumulated over time, making it “hard to replicate.” On the other hand, it is often difficult to transfer and reuse, leading to poor dissemination and scarcity, which in turn enhances its value within local enterprises.

Another example is data, which will be explained below. Data is one of the core competencies of modern enterprises—large, complex, yet to be fully mined, and continuously injected.

Velocity

The English term Velocity is more fitting, referring to an enterprise’s ability for rapid iteration of products or technology systems. If the product’s feature set is likened to city buildings, then velocity is similar to highways, which, although not directly corresponding to product features, are crucial to the speed of construction.

The ability of a company to continuously and rapidly iterate products is one of its true values. Even if competitors completely replicate products and technologies (such as open source), the company can still win customers through future speed. For example, SpaceX’s decisive advantage over traditional space agencies, or the frequent updates of the Chrome browser versus the former “annual” updates of the IE browser. Moreover, beyond speed, acceleration is even more powerful.

How do companies maintain a speed advantage? The previous section on “Complexity” has already listed this. Companies need a good management system to control cost and benefit, deliver high-quality products and services, and continuously advance in maturity (see the earlier subsection), all underpinned by the company’s culture. On the other hand, companies must focus on investment and innovation incubation to cope with cyclical disruptive innovation and original market decline.

From a system development perspective, mature Continuous Integration (CI) systems, global collaboration, and GitHub are highly valuable. This explains why GitHub was acquired by Microsoft. On the other hand, 2C products naturally have an advantage in iteration speed over 2B products, especially data storage products in the latter, which are particularly slow (data cannot be corrupted). A common strategy is to use 2C products to drive 2B, for example, public cloud providers Amazon, Google, Microsoft, and Alibaba all have both types of products.

From the perspective of speed as a “value,” it is understandable why internet companies generally pursue the 996 work schedule and overtime culture; this “value” even surpasses core technology. However, the value to the company and the value to the individual often do not align; see the following section on “Inference.”

Culture

Corporate culture has been mentioned many times earlier. Rooted in the intangible assets of human-to-human relationships, it is often the hardest to replicate. Corporate culture involves complex organizational development, accumulated over many years. Speed requires the support of corporate culture to maintain strong execution and innovative spirit.

In customer relationships, brand culture is extremely valuable in the consumer goods market, such as with many luxury brands. Even if competitors copy the same products, culturally enhanced products can be sold at higher prices.

Within the company, a good employment culture helps maintain employee stability and attract the same talent at more favorable cost. Forced-ranking elimination (routinely cutting the lowest-ranked employees) often harms culture, sacrificing real value to save expenses. Managers’ authority then comes from layoffs; rehiring is like drawing lots again, without fixing any underlying problems.

Assets

Simply put, a large amount of money is “value.” Assets appreciate through interest, which is multiplicative: they grow exponentially, earn money for you while you sleep, and keep compounding. Labor, on the other hand, is additive: trading time for income is linear, requires your presence, disappears once paid, and depreciates with age.
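The multiplicative-versus-additive distinction is simply compound versus linear growth; the sketch below uses illustrative numbers (a 7% annual return and a flat yearly wage are assumptions, not claims from this article) to show the difference in shape.

```python
# Compound (asset) growth vs. linear (labor) accumulation; all figures are illustrative.
initial_asset = 100_000
annual_return = 0.07
wage_per_year = 50_000

for t in range(0, 51, 10):
    asset = initial_asset * (1 + annual_return) ** t   # multiplicative: compounds on itself
    labor = wage_per_year * t                          # additive: stops growing when you stop working
    print(f"year {t:2d}: asset ~= {asset:>12,.0f}   cumulative wages = {labor:>12,.0f}")
```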

The difference between individual developers and manual laborers is that knowledge is an asset. Knowledge can provide additional wage returns, appreciating with accumulated experience and more knowledge, reusable after investment, and protected by complexity. The capacity of knowledge can grow tenfold, whereas it is hard to imagine physical ability improving tenfold, such as running 100 meters in 12 seconds becoming 1.2 seconds. Moreover, sports competitions have only one champion, and their “involution” is more severe than knowledge work. Similarly, the 996 style of pure physical output work is unwise; humans cannot work ten times more hours every day.

Accordingly, technology is like knowledge assets for companies. Companies can also own a large amount of other types of assets, such as cash, securities, deposits, brands, users, real estate, machinery, property rights, and patents. Numerous assets require effective management (see below “Managers”) to ensure that each asset’s output at least reaches the market interest rate level. A counterexample is a project canceled after development costs have been invested. On the other hand, asset management must consider maintenance and depreciation, tax costs, and policy risks. Depreciation is a major cost of assets.

Like depreciation, knowledge assets depreciate over time (Expiring Asset), which has a greater impact on individual developers. Ordinary assets, on the other hand, often generate stable interest, appreciate over time, and prevent depreciation through monetization. More importantly, ordinary assets are protected by private property laws, which do not directly apply to knowledge in the mind.

Rapidly depreciating assets require more management. One of the principles of investment management is diversification. The expert path of individual developers specializing in a single technology can be said to be the opposite. Although specialized technical experts are favored in recruitment, for companies, as mentioned earlier, selling a product portfolio is more competitive. (Similarly, is a personal skill portfolio more advantageous for recruitment?)

Diversification resembles having many connections, as discussed in the earlier section on “relationships,” and this partly applies to companies as well. For individual developers, one option is to accumulate experience and build a broad network within a company (networks are reciprocal, not exclusive assets); another is to establish connections outside the company to ensure the transferability of knowledge and work (e.g., open-source technology); the third option is to become a manager.

An important part of company operations is managers, whose work is essentially investment (as mentioned earlier regarding the “market interest rate” of asset management). For example, choosing project directions and hiring personnel correspond to investments such as salaries. Managers use company funds for investment and pay their own salaries out of the investment returns. Compared to individual developers, managers’ work is inherently diversified (and often “multithreaded”): simultaneously running multiple projects, incubating multiple innovations, and hiring multiple employees. Managers and employees form a 1-to-N relationship; the former is more stable amid risk and can decide where layoffs fall. Managers do not directly rely on technology to gain competitive advantage, yet they are extremely valuable.

Moreover, many jobs do not have the characteristics of assets, yet they are essential for the functioning of society, and those who dedicate themselves to these roles are true warriors.

Data

Data is one of the core competencies of modern enterprises. Data usually has a massive volume, a multidimensional complex structure, untapped value, is protected by property rights, and is “alive.” It is easy to replicate product features, but difficult to replicate the underlying data.

“Alive” data means it is used by a large number of real customers, maintained, and continuously updated. More importantly, it is verified, especially for storage products. In the era of user-generated content, alive data accumulates a large amount of customer creations, social relationships, usage habits, and history. Users, data, and products form an ever-evolving feedback loop.

On the other hand, data often has stickiness (see the subsection below). The cost for customers to migrate data is high and proportional to the amount of data accumulated. Data migration often brings issues such as format incompatibility, feature mismatches, and high transmission costs.

The company’s speed (see the previous subsection) depends on data. Modern project management, product evaluation, and innovation are built on data-driven approaches. For example, assessing the cost-benefit of a new storage feature and monitoring real-time feedback after deployment.

Stickiness

Stickiness is related to the product’s scenarios and characteristics. It increases the cost for users to migrate away from or abandon the product, such as the large amount of data accumulated and the user habits developed.

Another situation is external stickiness, such as social networks, where one user adopting the product leads other users in their social circle to prefer the same product. Another example is the bundling of enterprise product portfolios, commonly known as the “family bucket” (an all-in-one suite), which optimizes interconnection among products from the same company while degrading compatibility with competing products.

Beyond technology and features, stickiness increases the product’s price and even helps companies move toward monopoly. Even if competitors completely replicate the product, it is difficult to win over customers.

Monopoly

Monopoly provides companies with a decisive competitive advantage and is highly valuable. A monopoly can sell products at high prices and influence policies to maintain its position. Monopoly does not necessarily rely on technology and often even slows down technological development.

Common monopolies arise from high economies of scale, such as hydroelectric infrastructure, cloud computing, and internet platforms. Participants often invest heavily in expanding scale early on and obtain excess profits after achieving a monopoly position. Ultimately, under antitrust policy restrictions, the market usually ends up with two companies, forming an oligopoly.

Another type of monopoly comes from stickiness (see previous section), such as social networks. The third type comes from policy monopolies, commonly found in public utilities and national security, such as the oil industry.

In addition, a similar corporate behavior is rapid market expansion: increasing revenue, lowering profit margins, and accelerating product iteration. Such a company leverages its huge scale to obtain convenient bank financing, pressures inventory and payment terms up and down the supply chain to earn financial spreads, and secures policy benefits by providing a large number of jobs. The new energy vehicle manufacturing industry is an example.

About: Valuable Technology

What kind of technology can approach the requirements of “value”? One answer is the technology system mentioned earlier, characterized by complexity and speed. Another is cutting-edge technology that is protected well enough to resist replication.

Generative AI is a recent cutting-edge technology and also possesses complexity, but it is still difficult to prevent many competitors from entering the field, replicating the technology, and spreading it across countries. Holding this technology alone is not enough for a company to survive; it must either move fast enough (see the section on speed) or combine it with other products, meaning the company also holds other forms of “value.”

About: Value and Market Cycles

Although “value” helps enterprises gain a competitive advantage, in the long run, many pursuits of “value” reduce market vitality and accelerate market aging. For example, the pursuit of monopoly and setting policy barriers, deliberately creating product stickiness to bind customers, and weaving special political-business relationships.

Ultimately, value becomes exclusively locked within the leading participants, and newcomers entering the market are at a disadvantage. They must either “compete intensely,” engaging in fierce competition and 996-style labor output, or “lie flat,” drifting along as the market ages.

Alternatively, revolutionize within the old market and open up a new world. Thus, disruptive innovation arrives, initiating a new market cycle. Another name for disruptive innovation is destructive innovation.

Inference

There are many discussions about personal career development, as seen in the previous section “Assets”. On the other hand, in the fan economy, fans are closer to assets, and their value far exceeds mere labor output. This explains why “people would rather be streamers than work in factories,” even though fans are not legally protected like ordinary assets.

Further thoughts on relationships. The technology chain is also a network of relationships; a single technology is just the language defined by the upstream and downstream of the industrial chain (non-transferability), partially reflecting the physical world (transferability). Compensation depends on the upstream and downstream markets; technical difficulty does not guarantee it. On the contrary, the virtual is often more valuable because it satisfies one-to-many relationships, such as software managing hardware, bosses managing employees, and finance investing in industry. The virtual part controls resource allocation, communication channels, information flow, and other more valuable aspects, including laws and policies.

More in-depth thoughts on assets. A well-trained analyst sees everything marked by interest rates, and money is silently screaming, more trustworthy than human words. The same applies to laws and policies. The noisy information on social networks can be ranked by the money flowing behind it. In modern society, being unable to hear the voice of money is tantamount to being blind, deaf, and mute.

Modeling money can also restore the true preferences of individuals and even society. The desire to kill to defend life far exceeds the demand for ordinary consumer goods. Social hatred has good asset characteristics, and the profit space of geopolitical conflicts is huge, making sustainable warfare unfortunately brilliant.

Summary

Based on the storage market surveyed earlier, this article has analyzed many characteristics of that market, such as its natural structure, life cycle, and disruptive innovation. It then analyzed the driving factors behind them, which guide the prediction of the future and the search for demand. Finally, the article examined what true value is: beyond technology, it brings irreplaceable competitive advantages to enterprises and individuals.

Additionally, other articles worth reading:

Hardware in storage systems

Compared to the manpower costs and lengthy cycles required for software-level optimization, hardware performance often grows exponentially. The rapid development of hardware and shifts in paradigms are among the enduring driving forces behind the evolution of storage systems, reshaping users and markets from the top down. What hardware can offer is the cornerstone for considering system architecture and future strategy.

This chapter discusses hardware related to storage systems, evaluating them from a data perspective, focusing on their performance, cost, and future growth. The article then considers their impact on storage systems.

(For the software side of storage, see A Holistic View of Distributed Storage Architecture and Design Space.)

Data Table

The table below shows capacity, bandwidth, and power consumption data for common storage, network, and computing hardware, along with a comparison of unit prices. The data are sourced from the internet and aim to reflect approximate scales. Accurate figures often require professional market teams; prices are influenced by brand and procurement bundles, and large-volume buyers can even obtain discounts.

  • HDD throughput growth is relatively slow, while capacity cost keeps falling with new technologies such as SMR and HAMR, so performance per unit capacity is slowly declining. Roughly, over the past decade, HDD capacity has increased tenfold, but throughput has only doubled. The table below uses throughput (bandwidth) and ignores IOPS, as the former better reflects overall performance.

  • SSD under NVMe technology has seen exponential throughput growth in recent years, with latency approaching that of native flash memory, making PCIe the bottleneck. Technologies like ZNS further improve capacity and cost, and large-capacity flash memory enhances performance through concurrent channels. Under high throughput, flash wear remains an issue. Recently, the unit capacity cost of SSDs has rapidly declined, even showing a trend of being lower than HDDs. Counterintuitively, despite the high unit price, the cost per unit bandwidth and energy consumption of SSDs are lower than those of HDDs.

  • DRAM bandwidth grows exponentially with DDR technology upgrades, while energy consumption gradually decreases with lower voltage. Capacity and price samples are taken from Amazon; the table below uses DDR4 as a baseline. Modern servers often feature dual-channel or quad-channel configurations to further increase bandwidth. DRAM energy consumption is often divided into static refresh and read/write transfer parts, and overall it is proportional to the square of the voltage.

  • HBM (High-bandwidth memory) is commonly used in GPUs or on-chip memory. Compared to DRAM, it uses a wide interface and stacking to increase bandwidth to extremely high levels, with very low read/write transfer energy consumption. HBM is one of the new technologies that has become widely known with GPUs; its current drawback is its high cost, with unit capacity prices about fifteen times that of DRAM.

Hardware prices - Storage

  • Ethernet is one of the fastest-growing technologies in recent years, with bandwidth increasing exponentially, nearly doubling every two years, while costs decline exponentially. Today, 100Gbps network cards are common, 200Gbps cards are being deployed, and 400Gbps is gradually emerging. Storage system architectures are being reshaped by rapidly increasing network bandwidth. In contrast to DPUs in servers, ASICs are already a widely used technology in switches.

  • PCIe bandwidth grows exponentially with each PCIe generation, nearly doubling every few years. Its latency depends on the clock frequency (100MHz). The table below uses PCIe Gen5.0 x16 as a baseline. PCIe pricing is tied to the motherboard; motherboard prices sampled from Amazon have remained basically stable in recent years. Although PCIe bandwidth is high, it is far less than GPU/NVLink bandwidth and only barely keeps pace with SSDs (multiple drives per machine) and Ethernet.

  • CXL is promoted mainly by Intel, returning the memory bridge originally integrated into the CPU back to motherboard control, integrating PCIe, and enabling cache coherence for remote access. Although it is widely discussed, there are few clear products. CXL 1.1 and 2.0 use PCIe Gen5.0 as the physical layer; this article infers its performance and cost based on PCIe.

  • NVLink can replace PCIe for GPU interconnection, offering extremely high bandwidth and latency similar to PCIe. In recent years, NVLink bandwidth has been increasing exponentially; given the revolutionary impact of and investment enthusiasm for generative AI, its development may even accelerate. Combined with AI’s demand for high-bandwidth, all-to-all communication, NVLink is reshaping AI cluster architectures. The table below uses NVLink Gen4.0 as a baseline (the A100 uses Gen3.0). In terms of price, NVLink is bundled with Nvidia GPUs.

Hardware prices - Networking

  • CPU performance can be broken down into number of cores × frequency × IPC. IPC is reflected in single-thread performance and improves slowly year by year. Core counts have increased rapidly in recent years, especially for server CPUs, with manycore designs even becoming a research direction. Frequency growth is limited by chip heat dissipation and server power consumption. Overall CPU performance is often measured by Linux kernel compile time or network communication throughput. The table below uses the AMD EPYC 7702P 64-Core as a baseline.

  • GPU provides extremely high FLOPS computing performance, with core counts and concurrency far surpassing CPUs. Compared to CPUs, which are limited by DRAM bandwidth memory walls, GPUs have very high NVLink and HBM bandwidth. Today, GPUs are expensive and hard to obtain, but future expectations are for exponential increases in computing power per unit cost and reductions in power consumption. The table below uses the Nvidia A100 as a baseline.

Hardware prices - Computation

Note that even a 10% difference in the Projection column of the above table can lead to significant differences in doubling/halving years due to the exponential nature. This can be compared with the table below:

Hardware prices - Projection scale
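To make the sensitivity concrete, here is a minimal sketch of the doubling-time arithmetic (the annual improvement rates below are illustrative, not taken from the Projection column):

```python
import math

def doubling_years(annual_rate: float) -> float:
    """Years required to double under a compound annual improvement rate."""
    return math.log(2) / math.log(1 + annual_rate)

# Illustrative rates only; the real inputs are the Projection column above.
for rate in (0.30, 0.40):  # a 10-point difference in the annual rate
    print(f"{rate:.0%}/year doubles in {doubling_years(rate):.2f} years")
# 30%/year -> ~2.6 years, 40%/year -> ~2.1 years
```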

Data sources and references:

[1] Goughlui.com: HDD throughput by year

[3] BlocksAndFiles.com: Enterprise SSD prices vs HDD by year

[6] DRAM capacity, bandwidth and latency

[8] AIImpacts.org: Trends in DRAM price

[12] NextPlatform.com: Trends in Ethernet price

[19] AIImpacts.org: Geekbench score per CPU price

[26] KarlRupp.net: CPU, GPU Performance Per Watt

[31] FLOPs/clock-cycle CPU vs GPU

[32] Google Data Centers PUE

[33] Uptime Global Data Centers PUE

[34] Server component-wise energy

The above data can be used to calculate storage costs under different architectures, as well as future competitiveness. Specifically, capacity and bandwidth prices reflect purchase costs, while energy consumption reflects operating costs (TCO).

Additional Costs

First are energy consumption and cooling costs. Energy consumption simultaneously incurs corresponding cooling costs, and the sum of the two constitutes the data center’s electricity expenses, which after a few years can even exceed the purchase costs.

  • PUE: data center energy efficiency. If a server consumes 500W and the data center’s PUE is 1.5x, then the server including cooling requires 500W * 1.5 = 750W (0.75 kW) of power. Assuming electricity costs $0.1 per kWh, the annual electricity cost is about $657, and over five years the electricity cost can even exceed the server purchase cost (see the sketch at the end of this subsection).

  • Common data center PUE is around 1.5x and remains relatively stable [33]. Google data centers can even compress it to 1.1x [32].

Next is network bandwidth. The bandwidth of a server requires corresponding bandwidth for T0 (TOR), T1, T2, and other switches.

  • Assuming a 100Gbps server NIC, T0 and T1 layers use 100% bandwidth full-provisioning, and the T2 layer uses 50% bandwidth provisioning, ignoring higher-level T*. The total additional bandwidth ratio is 2.5x.

Besides storage, network, and computing components, servers consume a considerable amount of energy on additional components. About 30% of extra energy consumption is spent on power transmission (15%), motherboard (10%), and cooling fans (4%) [34].
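Putting the overhead factors of this subsection together, a minimal sketch of the arithmetic (server power, electricity price, provisioning ratios, and component shares are the example values above):

```python
# Electricity cost of one server including cooling (PUE).
server_watts = 500         # example server power draw
pue = 1.5                  # common data center PUE [33]
price_per_kwh = 0.1        # $/kWh, example value from the text

kw_with_cooling = server_watts / 1000 * pue                  # 0.75 kW
yearly_electricity = kw_with_cooling * 24 * 365 * price_per_kwh
print(f"electricity per year: ${yearly_electricity:.0f}")    # ~$657
print(f"electricity over 5 years: ${5 * yearly_electricity:.0f}")

# Switch bandwidth needed to back a 100Gbps server NIC.
t0, t1, t2 = 1.0, 1.0, 0.5          # provisioning ratios; higher tiers ignored
print(f"extra switch bandwidth: {t0 + t1 + t2}x")            # 2.5x

# Server energy spent outside storage/network/compute components [34].
extra = 0.15 + 0.10 + 0.04          # power transmission, motherboard, fans
print(f"extra server energy share: ~{extra:.0%}")            # ~30%
```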

Comparison with public cloud prices

Today’s public cloud prices serve as a good reference for cost evaluation, unless one needs to rethink the storage system architecture and the possibilities offered by hardware from the ground up. This section uses the public cloud storage service Azure Storage [35] as a benchmark to compare against the storage costs calculated above.

Azure Blob Storage pricing

In addition to the extra costs mentioned earlier, it is necessary to calculate the DRAM, Ethernet, PCIe, etc., required to support 1GB of storage. DRAM involves the proportion of cached data. HDD and SSD often have significantly different prices. CPU needs to be configured according to bandwidth processing capacity. Data replication, compression, and erasure coding can significantly affect the physical data volume. Cold data can be stored at extremely low bandwidth costs.

  • The units are unified throughout the text: capacity in GB, bandwidth in GBps, currency in $.

Storage cost parameters

The table below calculates the cost required for 1GB of HDD storage, considering major server components, additional network switch support, data center cooling, as well as data replication, compression, and some cold data. Both purchase cost and energy cost are included, with purchase cost amortized over 60 months. It can be observed that:

  • Compared to Azure Storage Reserved Capacity, the calculated data cost is about 1/22 of the Cool tier and about 1/4.5 of the Archive tier.

  • Of course, the calculations in the text aim to be simple and clear, ignoring many overheads compared to reality. For example, internet bandwidth costs, data center construction, R&D expenses, sales expenses, read/write IO amplification, SSD wear, cross-region replication, backup, and disaster recovery, etc.

  • Besides storage, Azure Storage charges extra for data read and write. Sustained high IOPS or large bandwidth access to cold data is expensive. The calculated data cost already includes the bandwidth corresponding to 1GB.

Storage cost HDD

  • In the calculated HDD storage, the significant purchase costs from highest to lowest are HDD and DRAM. The significant energy consumption costs from highest to lowest are data center cooling, HDD, additional server costs, and DRAM. Total energy consumption costs account for about 40% of the purchase cost.

Storage cost HDD ratio

  • Similarly, the data calculated for 5 years later shows little change in the cost components. This is mainly because the costs of all components are decreasing. However, the total energy consumption cost as a percentage of purchase cost rises to about 60%. (Although the HDD bandwidth corresponding to 1GB capacity is decreasing, the calculation assumes the supporting bandwidth remains unchanged.)

Storage cost HDD ratio 5 years
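Before moving on to the SSD case, here is a minimal sketch of the per-GB cost model used in these tables: amortize the purchase cost over 60 months, add electricity (including PUE), and scale by the data layout factors. All component numbers below are placeholders, not the values behind the tables.

```python
AMORTIZATION_MONTHS = 60   # purchase cost spread over 5 years, as in the text
PUE = 1.5
PRICE_PER_KWH = 0.1

def monthly_cost_per_gb(purchase_per_gb: float, watts_per_gb: float,
                        replication: float = 1.0, erasure_overhead: float = 1.3,
                        compression: float = 0.7) -> float:
    """Rough monthly cost of keeping 1 GB of logical data online."""
    physical_gb = replication * erasure_overhead * compression
    amortized_purchase = purchase_per_gb / AMORTIZATION_MONTHS
    energy = watts_per_gb / 1000 * PUE * 24 * 30 * PRICE_PER_KWH
    return (amortized_purchase + energy) * physical_gb

# Placeholder bundle: the HDD plus its share of DRAM, CPU, NIC, and switches
# supporting 1 GB; real inputs come from the parameter table above.
print(f"~${monthly_cost_per_gb(purchase_per_gb=0.03, watts_per_gb=0.005):.5f} per GB-month")
```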

The table below shows the cost required for 1GB of SSD storage. Compared to the HDD version, it can be observed:

  • Compared to Azure Storage Pay-as-you-go, the calculated data cost is about 1/34th of Premium. Of course, the calculations in the text ignore many actual costs.

  • The calculated SSD storage cost is about 12 times that of HDD, similar to Azure Storage, where Premium is about 15 times that of Cool.

  • The calculated cost per unit bandwidth for SSD is lower than HDD, which aligns with Azure Storage, where Premium’s read/write prices are lower than Hot, and the colder the data, the higher the read/write cost.

Storage cost SSD

  • In the calculated SSD storage, the significant purchase costs from highest to lowest are SSD and CPU. The significant energy consumption costs from highest to lowest are data center cooling, additional server costs, SSD, and CPU. The total energy consumption cost accounts for about 50% of the purchase cost.

Storage cost SSD ratio

  • Similarly, the data for 5 years later is calculated, with little change in each cost component. This is mainly because the cost of each component is decreasing. However, the total energy cost as a percentage of purchase cost rises to about 58%. (Although the SSD bandwidth per 1GB capacity is increasing, the calculation assumes the supporting bandwidth remains unchanged.)

Storage cost SSD ratio 5 years

It can be inferred that for SSD storage, if the data is hot, operations are frequent, transmission bandwidth is large, and functions are simple, using a local data center may have a cost advantage. For cold data, the public cloud is an ideal storage location, and using Reserved Capacity can further reduce costs.

Choosing HDD and SSD

Based on the required capacity and bandwidth, the cost of purchasing HDD or SSD from the data table above can be plotted to compare different choices. The results are similar after adding energy consumption overhead. It can be seen that:

HDD/SSD price selection

  • Area 1 and Area 2 correspond respectively to 1) low bandwidth, high capacity demand, using HDD; 2) high bandwidth, low capacity demand, using SSD. Compared to SSD, HDD has very low bandwidth, so Area 1 is smaller in size.

  • Area 2 uses only HDD, but with higher bandwidth demand, requiring additional HDD capacity to meet bandwidth. Area 3 uses only SSD, but with lower bandwidth demand, resulting in bandwidth underutilization for the given SSD capacity.

  • As an improvement, Area 2 and Area 3 are suitable for hybrid storage using both HDD and SSD, or using SSD as a cache for HDD.
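A minimal sketch of the selection logic behind this plot: for a given capacity and bandwidth demand, buy enough drives of each type to cover whichever dimension is binding, then compare the totals. Device specs and prices are illustrative placeholders, not the values from the data table.

```python
import math

def drives_needed(capacity_gb: float, bandwidth_gbps: float,
                  drive_capacity_gb: float, drive_bandwidth_gbps: float) -> int:
    """Drives required so both capacity and bandwidth demands are satisfied."""
    return max(math.ceil(capacity_gb / drive_capacity_gb),
               math.ceil(bandwidth_gbps / drive_bandwidth_gbps))

def compare(capacity_gb: float, bandwidth_gbps: float) -> str:
    # Placeholder device specs: 16TB/0.25GBps HDD at $300, 4TB/3GBps SSD at $400.
    hdd_cost = drives_needed(capacity_gb, bandwidth_gbps, 16_000, 0.25) * 300
    ssd_cost = drives_needed(capacity_gb, bandwidth_gbps, 4_000, 3.0) * 400
    return f"HDD ${hdd_cost:,} vs SSD ${ssd_cost:,}"

print(compare(capacity_gb=1_000_000, bandwidth_gbps=5))   # capacity-bound: HDD wins
print(compare(capacity_gb=50_000, bandwidth_gbps=50))     # bandwidth-bound: SSD wins
```

In the intermediate regions the binding dimension forces over-buying one resource, which is where an HDD/SSD mix or SSD caching becomes attractive.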

Driving factors of hardware development

With hardware performance advancing so rapidly, what drives it? The driving factors can be analyzed from both the technological and market perspectives.

On the technical level, smaller process nodes, higher integration, custom-designed chips, higher frequencies, new physical media, and Moore’s Law together drive the exponential improvement of hardware performance. Across the various components, there has been a series of technological innovations in recent years:

  • HDD: Storage density keeps increasing, with SMR gradually adopted in recent years and more complex but higher-density recording technologies such as MAMR and HAMR expected in the future [36].

  • SSD: A series of technologies improve SSD performance from different aspects. Interface protocols like NVMe, NVMoF. Simplified FTL layers such as ZNS, FDP [37]. Flash architectures like 3D NAND. Increasing flash density and using new physical media such as TLC, QLC, PLC.

  • DRAM: Each generation of DDR continuously increases clock frequency, improves architecture, and reduces voltage. DRAM density and packaging are also improving, such as 3D Stacking technology.

  • HBM: Alongside DRAM, HBM stacking technology is continuously improving, allowing for more layers and higher cross-layer transmission speeds. The signal transmission rate of the lines and the interface width are also increasing.

  • Ethernet: The Ethernet protocol is constantly evolving, significantly increasing transmission bandwidth. In recent years, RDMA RoCEv2 has been widely adopted, with servers using ASIC chips to replace CPUs for high-speed networking. Fiber optic switches are also used in data centers.

  • PCIe: The speed increase of each PCIe generation comes from improvements in encoding and synchronization efficiency, allowing more data to be transmitted per clock cycle, and from better transmission media that support higher rates. On top of this, multiple lanes (x4, x8, x16) multiply bandwidth through parallel transmission.

  • CPU: Performance is improved on multiple levels: more transistors, higher integration, and smaller process nodes; multi-core and manycore designs integrated into a single processor; slight frequency increases; microarchitecture improvements and higher IPC; new SIMD vector instructions; integrated accelerators for dedicated tasks; and power optimizations such as DVFS.

  • NVLink: Similar to PCIe, it is itself evolving rapidly. NVLink’s speed improvements additionally benefit from tight integration with GPUs, wider link bandwidth, and investment driven by the AI boom.

  • GPU: Thanks to the revolutionary investment surge driven by generative AI, the GPU field is developing rapidly. Each new generation of Nvidia GPUs updates the architecture with more cores and execution units, integrates larger and more components, and reduces the manufacturing process node. Tensor Core and RT Core are optimized for specialized tasks. Compared to CPUs, GPUs integrate memory and buses directly on their own boards. HBM, PCIe, and NVLink themselves are also rapidly evolving. Although GPUs have lower clock frequencies than CPUs, these frequencies have been steadily increasing in recent years.

Another driving force behind hardware development comes from market demand. This can be understood from common metrics used to measure storage performance:

  • Capacity: The classic big data 3V [38] theory—volume, velocity, and variety. Compared to other industries such as TVs, refrigerators, and automobiles, few markets grow as greedily as data. The amount of hot data is always limited, proportional to the business activity cycle multiplied by transaction frequency, which means there is vast optimization space for cold data; meanwhile, policy compliance further drives demand. Users are willing to pay for additional capacity-based needs such as security encryption, backup and disaster recovery, and analytics mining.

  • Throughput: A richer media experience, including images, videos, streaming media, AI training and services, has continued to drive demand growth in recent years.

  • IOPS: Transaction processing leans towards IOPS demand. Databases are a rare market that is both mature and enduring, with startups still continuously emerging in recent years. On the other hand, the web, mobile applications, and the internet reach and penetrate everyone. Businesses at O(P) scale are extremely rare; even Hollywood blockbusters struggle to reach this (O(P) here means covering every person, each with probability P of using the product).

  • Latency: Compared to human perception, today’s hardware speeds are extremely fast, and latency below a certain threshold becomes unimportant. However, quantitative trading relentlessly pursues lower latency; latency requirements for complex AI computations and autonomous driving remain unmet; as well as fields connected to the physical world such as IoT and robotics. On the other hand, software is becoming increasingly complex, meaning latency optimization is continuously needed.

Observations and key points

Careful observation of the hardware data table above reveals many points worth considering:

  • Latency cannot be purchased. From the data table above, it can be seen that bandwidth and capacity both have prices; more money can buy more, and technology allows for horizontal scaling. But latency is an exception, and unlike bandwidth and capacity, it does not show significant annual improvement. Improving latency often requires a technology upgrade (unpredictable) or replacing the storage medium from scratch (with huge cost and migration). Latency is the most expensive.

  • The cost corresponding to data capacity remains high. From the data table above, whether it is purchase cost or energy consumption cost, whether HDD or SSD storage, hard drives occupy a significant position. It is imaginable that any Data Reduction technology, such as compression, deduplication, or erasure coding, has significant potential to improve the cost-effectiveness of storage systems.

  • DRAM accounts for significant purchase and energy costs. Compared to SSD, DRAM has a high purchase cost. Even if bandwidth is not actually used, DRAM’s static refresh continuously consumes power. Especially for HDD storage, DRAM costs are more significant compared to cheap hard drives. Some new technologies are cost-effective, such as using SSD instead of DRAM to manage metadata, and offloading cold (meta) data from DRAM to SSD.

  • DRAM bandwidth may become a bottleneck in the future. A server does not have many DRAM channels, but it can be equipped with dozens of SSDs. Ethernet bandwidth growth also far exceeds that of DRAM. GPU/NVLink bandwidth is much higher than DRAM. Meanwhile, DRAM capacity is expensive and power-hungry. As a bridge between CPU and IO, DRAM bandwidth is consumed multiple times by the same data. The CPU memory wall problem is already significant today. There are some makeshift solutions, such as inserting additional small-capacity DRAM into servers, and using DDIO [39] technology to let short-lived data bypass DRAM.

  • Although SSDs are expensive, their price per unit bandwidth is much better than HDDs. This means using SSDs for Write Staging and using inexpensive SSDs to provide universal acceleration for cloud storage is a natural trend. On the other hand, after SSDs become widespread compared to HDD storage, storage systems need to support a moderate mix of SSD/HDD to accommodate various levels of “bandwidth/capacity ratio” requirements. Similarly, NVDIMM-N uses DRAM for Write Staging and flash memory for power-loss storage.

  • In SSD storage, the purchase and energy costs of CPUs are significant. This stems from the high bandwidth requirements. This highlights the great potential of DPUs and dedicated network chips in improving costs. In recent years, ARM CPUs have been increasingly adopted, AWS Nitro chips have achieved great success, and dedicated cards for compression and encryption are no longer uncommon.

  • CPU performance improvements are slow and cannot keep up with SSDs and Ethernet, and CPU energy consumption overhead is significant. This has given rise to a series of technical approaches in recent years: 1) using DPDK and SPDK to bypass the operating system kernel; 2) using DPUs and accelerator cards to replace CPU processing loads; 3) using ARM to replace Intel CPUs; 4) bypassing the CPU-DRAM-PCIe ecosystem, such as using GPU-HBM-NVLink instead of PCIe.

  • The bandwidth and price of Ethernet and PCIe have reached similar levels, and both are improving exponentially. A reasonable speculation is, can Ethernet replace PCIe to simplify computer architecture? Ethernet is easier to scale horizontally, interconnect multiple machines, and pool extra bandwidth. But compared to PCIe, Ethernet has higher latency and it is difficult to solve lossless transmission and consistency issues. CXL is on this path.

  • Conversely, can PCIe replace Ethernet? Cluster architecture largely depends on how machines are interconnected, such as Hyper-converged, Disaggregated, Geo-replicated, etc. Given the demand of generative AI for TB-level interconnect bandwidth, future cluster architectures may diverge into different paths: 1) large-scale storage systems with GB-level interconnect bandwidth; 2) small-scale HPC-GPU clusters with TB-level interconnect bandwidth. (1) separates compute and storage, while (2) integrates compute and storage by co-locating. Public clouds need to sell new products targeting (1) and (2).

  • Ethernet is developing rapidly, surpassing all other hardware. This aligns with the evolution of storage systems and databases towards Disaggregated architectures, Shared-nothing architectures, compute-storage separation, Shared-logging, Log is database, and other directions. On the other hand, only the bandwidth per unit capacity of HDDs continues to decline, posing challenges for future storage design [40]. The main reason is that mechanical hard drive technology is already very mature, and performance improvements are limited by mechanical physics, while technologies like SMR and HAMR that increase storage density still have room for development.

  • High-performance hardware does not necessarily mean expensive prices; in fact, the cost per unit bandwidth can be lower, for example DRAM < SSD < HDD. Ethernet shows a similar trend, with cost per unit bandwidth 100 Gbps < 40 Gbps < 10 Gbps. This suggests that sharing and pooling are profitable and have economies of scale, allowing cloud storage to adopt high-end hardware first.

  • The energy cost of data center cooling is significant. If the PUE is reduced from the average level of 1.5x to the Google data center level of 1.1x, the benefits are huge; alternatively, directly using cloud computing services. Energy consumption is not the only problem with data center cooling; cooling system failures (such as thunderstorms, overheating) are not uncommon. Delivering sufficient power to high-density racks and providing adequate cooling also pose challenges. Server energy consumption is a huge issue, which may even exceed the performance concerns of storage systems, especially in cloud storage. Conversely, highly scaled cloud storage is easier to optimize energy consumption than private data centers, for example through site selection and Free Cooling. On the other hand, public cloud providers can leverage scale advantages to demand customized server designs to further save energy.

Where is the value of software?

From the data table above, it can be seen that hardware generally experiences exponential growth in performance or exponential decline in cost. Improving performance through software not only incurs high R&D costs but may only achieve about a 30% improvement per year. So, where does the value of software lie?

  • Exposing bare hardware latency to users: As mentioned above, “latency cannot be bought” is one of the values of software, which is to provide users with the native latency level of bare hardware (rather than bandwidth) as much as possible. At the same time, the software layer needs to combat system complexity, physical component distances, and dynamic load variations. Many architectural technologies originate from this.

  • Managing large amounts of hardware: Only software can do this, as hardware itself cannot require one SSD to manage another SSD. This gives rise to directions such as managing distributed systems, managing complexity, and managing resource efficiency. The related demand is system integration, where enterprises commonly require integration and unified management of different brand systems, as discussed in the Storage Systems Market chapter.

  • Distributed systems: The software layer unites large amounts of hardware into distributed systems, operating complex technologies in between. Virtualization, scheduling, fault recovery, disaster recovery, replication, and so on. Only the software layer can provide hardware with horizontal scaling, high reliability, load balancing, geographic replication, and other functions, leading up to cloud computing.

  • Complexity: Within the system, the software layer manages complex user requirements and complex system requirements. Across systems, the software layer provides interoperability, compatibility, and cross-hardware protocol interfaces. At the product and market level, software maintains an ecosystem involving multiple participants. Unified namespaces, file systems, databases, access protocols, and more are designed, enabling local hardware capabilities to become globalized. Software benefits from differentiated competition brought by complex, non-standard functions, while hardware interfaces tend toward standardization, with intense performance and cost competition reducing profit margins. Another way to put it is that software provides extensive functionality, uniformity, and simplified management.

  • Resource Efficiency: The software layer improves hardware resource utilization through load balancing, congestion control, pooling and sharing, parallel processing, and other methods. Software makes it possible to collect monitoring data from a large amount of hardware. Software can coordinate migration between high- and low-performance hardware to achieve the best cost ratio. Software can predict the future, scheduling loads and managing hot and cold data. More importantly, software can manage energy consumption, which is a primary cost in data centers. Additionally, it is necessary to reduce the extra management costs brought by the software itself.

On the other hand, this means that when choosing a storage system architecture or considering career development as a programmer, it is necessary to carefully think about what constitutes a high-value direction. For example, imagining development salaries as investment input, is the return on performance optimization at the software layer high enough?

Case Study: EBOX

This chapter uses a case study to demonstrate how to apply the previously discussed framework for analysis. It helps the team identify forward-looking investment directions, map technological innovations to financial metrics, and plan development strategies for the next 3 to 5 years. EBOX is an interesting technological innovation.

This chapter first introduces what EBOX is, its innovations, potential benefits, and risks. Next, it analyzes costs and benefits, as well as future expectations, from the perspective of storage systems. Then, it examines how R&D costs can be amortized. Finally, it analyzes from the supplier’s perspective whether selling EBOX is profitable.

What is EBOX

EBOX is an interesting potential innovation direction for storage systems. It further breaks down the traditional storage server into hard drive enclosure server EBOX and a storage server with only computing functions left. Both can be independently optimized, and a series of technological innovations are based on this. Several sources mention EBOX technology and explain how EBOX works:

  • zStorage[42]: The lower-level storage uses dual-controller EBOF all-flash drive enclosures, while upper-level services run on standard server nodes. All service nodes share access to the EBOF storage nodes. Vast Data does not manufacture EBOF drive enclosures themselves but commissions other manufacturers to produce them, aiming to make EBOF drive enclosures as cheap as standard servers and to develop an ecosystem.

  • Vast Data[43]: No direct mention of EBOF or EBOX by Vast Data (same name but different meaning; Vast Data’s refers to Everything Box) was found. However, as summarized by zStorage, Vast Data uses DBox (NVMe JBOF) to store data and CNode to compute and manage the cluster, both connected via NVMoF. Any CNode can access any DBox, and this shared architecture greatly improves the availability (not durability) of data nodes. Ultra-long erasure codes are allowed, reducing data replica overhead to 1.0x~1.1x.

Vast Data DBox

  • NVMoF for Disaggregated Storage[44]: As shown in the figure below, NVMoF brings many innovations to storage architecture. If the data server is simple and standardized enough, SSDs can be accessed through Direct Access methods without the need for a CPU. Even PCIe can be eliminated and integrated with Ethernet.

NVMoF E-BOF Disaggregated Storage

  • HammerSpace NFS-eSSD[45]: HammerSpace’s network file system first utilizes the NFS4.2 protocol, allowing clients to bypass the metadata server and directly access storage nodes. Furthermore, storage nodes no longer require CPU, DRAM, or PCIe, connecting SSDs directly to Ethernet, controlled by custom chips.

HammerSpace NFS-eSSD

It can be seen that EBOX has a series of advantages, and this article analyzes cost-benefit based on them:

  • Full Interconnection of Storage Servers and Data Nodes (Shared Everything). Unlike traditional data disks being exclusively owned by a single storage server, the full interconnection architecture can improve the reliability (Availability) of data disks by several orders of magnitude. On this basis, ultra-long erasure codes further reduce data replica overhead. Fan-out connections help with load balancing. DSR (Direct Server Return) from EBOX to the client helps reduce latency.

  • Standardized EBOX can replace the CPU with custom chips or DPUs. As seen in the Hardware in Storage Systems section, CPUs account for significant purchase and power consumption costs. Replacing the CPU with custom chips helps greatly reduce these costs. On the other hand, compared to traditional storage servers, custom chips handle massive data traffic processing, allowing the storage server CPU to focus on metadata tasks and use cheaper CPUs instead.

  • Ethernet SSD replaces PCIe. If any access to the EBOX comes remotely from a server via Ethernet, then PCIe can be eliminated and integrated into Ethernet. Besides simplifying the EBOX architecture, replacing PCIe with Ethernet also benefits from the rapid advancements in Ethernet bandwidth and cost in recent years.

  • Disaggregation of storage servers and data drives. Disaggregation designs often help improve resource efficiency and enable independent horizontal scaling. Imagine an HDD-based cluster where data gradually becomes cold; the associated storage servers can be gradually shut down while keeping the HDD EBOX online, saving power consumption. Traditional servers, however, cannot decouple storage servers and hard drives, so shutting down servers while keeping drives online is not possible.

  • Direct Access between EBOXes. EBOXes can communicate directly with one or more other EBOXes to migrate data. Storage servers do not need to participate in the data transfer process except during the initiation phase. This is very beneficial for implementing common storage system functions such as data repair and data migration, similar to RDMA at the EBOX level, saving bandwidth and CPU resources on the storage servers themselves.

On the other hand, EBOX also has a series of additional costs and risks:

  • EBOX lacks mature solutions, suppliers, and ecosystem. Immature manufacturing means high early-stage costs. Although public cloud can provide large-volume orders, suppliers need to consider why they should participate. Of course, a low starting point also means a high growth rate and high return on investment for the stock.

  • Full connectivity improving data reliability is based on assumptions: EBOX has much higher reliability than storage servers. This is justified, as complete storage servers are far more complex than EBOX, requiring frequent software, operating system upgrades, and reboots. EBOX is simple enough to standardize operations. It is known that the reliability of individual hard drives is often much higher than that of an entire server. Additionally, dual controllers require extra hardware costs.

  • Replacing PCIe with Ethernet is based on assumptions: Ethernet has lower costs, higher bandwidth, and faster future growth than PCIe. This is not necessarily true; PCIe is specialized for intra-server transmission, and specialization is likely superior to Ethernet, which must accommodate both near and far transmissions. More importantly, there are additional Ethernet infrastructure costs.

  • Additional Ethernet Construction Costs. Storage servers and hard drives are decoupled, requiring new Ethernet connections between them, new switches, and new ports. However, there are ways to avoid this, such as connecting the EBOX to an existing network without building a new one. Using DSR to return customer data, the storage server and EBOX only exchange metadata, sometimes not even requiring bandwidth expansion.

  • R&D Costs and Data Migration Costs. Developing a new system based on an entirely new hardware architecture is not easy, but there are ways to avoid this. For example, designing a hardware-software isolation layer and striving to replace only low-level components, or leveraging a large user base to dilute costs. Similarly, data migration costs can be diluted or pricing strategies can be designed to encourage users to migrate data themselves.

Which advantages of new technologies have great potential, which advantages are less important than they appear, and how advantages and disadvantages map to cost-benefit for systematic comparison require further analysis.

Cost-benefit of storage systems

First, a qualitative analysis can be conducted to determine whether EBOX is suitable for various current storage scenarios:

  • High Capacity, Low Throughput: Typically HDD storage systems. Applicable. EBOX brings many advantageous features: ultra-long erasure codes reduce the replica overhead of cold data, dedicated chips replace CPUs to lower costs, some storage servers can be shut down to save power, and EBOX-to-EBOX direct access benefits data migration and balancing.

  • Low Capacity, High Throughput: Typically SSD storage systems. Applicable. In addition to the above advantages, EBOX eliminates the CPU in the data path, and DSR can improve throughput and reduce latency.

  • High Capacity, High Throughput: Can be merged into “Low Capacity, High Throughput.”

  • Low Capacity, Low Throughput: This scenario is unrealistic and can be merged into “High Capacity, Low Throughput.”

Regarding latency, the most expensive attribute of storage systems (see the Hardware in Storage Systems section): can EBOX provide an advantage?

  • Favorable Factors: Replacing the CPU with dedicated chips and eliminating complex software such as operating systems. Although the frequency of dedicated chips is often lower than that of CPUs (due to power consumption and stability constraints), latency benefits from higher parallel processing capabilities, reducing the length of waiting queues.

  • Unfavorable Factors: Decoupling storage servers from hard drives, where the original PCIe connection is replaced by Ethernet. PCIe latency is at the 100ns level, while Ethernet latency is at the 10us level, and network packets also pass through additional switches.

Next, for the main metrics of storage systems—capacity, bandwidth, power consumption, and cost—an analysis can be conducted based on the cost data table in the Hardware in Storage Systems section. The following shows SSD storage:

  • Parameter Settings: To evaluate EBOX, parameters are set based on the potential advantages and additional costs mentioned earlier. The reliability improvement brought by “full connectivity” reduces the erasure coding redundancy from 1.3x to 1.1x. PCIe is replaced by Ethernet. CPU costs decrease due to replacement with dedicated chips. The separate EBOX incurs additional network card overhead. Due to early immaturity, supplier production costs carry an additional penalty, which shrinks by 5% annually; by the 5th year, with maturity and standardization, the penalty disappears and costs instead decrease by 5%. (A sketch applying these adjustments follows the tables below.)

EBOX cost parameters

  • Other parameters reuse the Hardware in Storage Systems section. Units are unified throughout the text: capacity in GB, bandwidth in GBps, currency in $.

Storage cost parameters

  • Purchase and Energy Consumption Cost Table: The table below shows the purchase and energy consumption costs of each component corresponding to 1GB SSD storage in year 0 after adopting EBOX. Compared with the SSD storage costs in the Hardware in Storage Systems section, the costs are similar. CPU purchase and energy consumption expenses decrease, SSD purchase costs decrease, but the savings are offset by supplier manufacturing penalties.

EBOX SSD storage cost

  • Purchase and Energy Consumption Cost Ratios: The table below shows the proportions of each component in the purchase and energy consumption costs in year 0 after adopting EBOX. Compared with the Hardware in Storage Systems section, the proportion of energy consumption relative to purchase cost decreases. The proportions of each component are roughly similar, but the CPU purchase and energy consumption proportions decrease, corresponding to an increase in SSD proportion and also an increase in DRAM proportion.

EBOX SSD Storage cost ratio
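As referenced above, a minimal sketch of how these adjustments can be applied to a baseline component breakdown. The baseline costs, the dedicated-chip factor, the extra-NIC factor, and the starting size of the manufacturing penalty are all assumptions; only the 1.3x-to-1.1x erasure coding change, the removal of PCIe, and the penalty fading by 5 percentage points per year follow the text.

```python
# Placeholder baseline: monthly cost per GB of SSD storage, by component.
baseline = {"ssd": 0.0040, "dram": 0.0008, "cpu": 0.0012,
            "pcie": 0.0001, "nic": 0.0002, "other": 0.0005}

def ebox_cost(year: int) -> float:
    c = dict(baseline)
    c["ssd"] *= 1.1 / 1.3      # ultra-long erasure codes: 1.3x -> 1.1x redundancy
    c["pcie"] = 0.0            # PCIe folded into Ethernet
    c["cpu"] *= 0.5            # dedicated chip replaces CPU (assumed factor)
    c["nic"] *= 2.0            # extra NIC on the separated EBOX (assumed factor)
    penalty = max(0.0, 0.25 - 0.05 * year)   # assumed 25% immaturity penalty, fading 5 pts/year
    return sum(c.values()) * (1 + penalty)

for year in (0, 3, 5):
    saving = 1 - ebox_cost(year) / sum(baseline.values())
    print(f"year {year}: cost saving {saving:+.0%}")
```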

Next, consider the performance and cost changes of hardware over the next 5 years, using the storage cost calculation from the hardware in storage systems chapter as a baseline to compare the benefits of EBOX. First, we show SSD storage. The following charts display the impact of different features on cost, with the vertical axis representing the proportion of cost savings (the higher, the better).

  • Each legend stacks more features from left to right. For example, “++++ NIC cost extra” indicates that four features are enabled: reduced erasure coding redundancy, reduced PCIe cost, reduced CPU cost, and additional NIC cost (4 plus signs).

  • The main cost savings come from erasure coding and CPU. Erasure coding reduces the purchase cost of SSDs, which account for a large portion of the total cost, resulting in significant benefits. On the other hand, the high bandwidth of SSD storage leads to high CPU costs, so improvements on the CPU side have a noticeable effect.

  • The benefit of replacing PCIe with Ethernet is negative but not significant. Ethernet costs are still higher than PCIe. PCIe originally accounted for a very small proportion of the cost. The additional NIC overhead is relatively small, which is also because the Ethernet overhead originally accounted for a relatively small proportion of the cost. This also indicates that the Disaggregated architecture does not introduce excessively high costs due to network disaggregation.

  • Total cost savings are around 20%, which requires suppliers to reach manufacturing maturity within 5 years. A 20% reduction over 5 years, i.e., an average annual cost decrease of about 4%, can support how much stock price growth? Combining the calculations from the Understanding Stock Price section, and assuming the company’s revenue remains unchanged and the initial profit margin is 20%, it can support approximately 16% stock price growth in the first year (see the arithmetic sketch after the chart below).

  • The overall cost savings ratio slightly decreases year by year, even if the impact of manufacturing penalties is excluded, the decrease is very slight. The main reason is the rising proportion of energy consumption in the purchase cost, while the reduction in erasure coding redundancy is not counted as energy savings. Cost components that do not decrease annually will gradually increase their share, thereby dragging down the total savings, such as SSD energy consumption and DRAM purchase costs.

EBOX SSD Storage cost compare 5 years
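The arithmetic behind these stock-price estimates, as a minimal sketch (it assumes flat revenue, a 20% initial profit margin, and a constant price-to-earnings multiple so that profit growth maps one-to-one to stock price growth):

```python
def first_year_profit_growth(annual_cost_decrease: float,
                             initial_margin: float = 0.20) -> float:
    """Profit growth when revenue stays flat and costs shrink at the given rate."""
    cost_share = 1 - initial_margin                       # costs are 80% of revenue
    new_margin = initial_margin + cost_share * annual_cost_decrease
    return new_margin / initial_margin - 1

print(f"SSD storage, ~4%/year cost decrease: {first_year_profit_growth(0.04):.0%}")  # ~16%
print(f"HDD storage, ~2%/year cost decrease: {first_year_profit_growth(0.02):.0%}")  # ~8%
```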

The HDD version is similar, so the similar charts are skipped. Below shows the cost changes over the next five years, using the HDD storage calculation table from the Hardware in Storage Systems section as a baseline for comparison.

  • The main cost savings come from erasure coding. The reason is similar to SSD storage: even though HDDs are cheap, their cost still accounts for a significant portion of storage. Savings from CPU improvements are not significant because their original share is small. Similarly, the benefit of replacing PCIe with Ethernet is not significant, and the additional network card overhead is also negligible.

  • The benefit of replacing PCIe with Ethernet is not significant, and the trend is similar to SSD storage. However, compared with HDD storage, the PCIe or Ethernet overhead is larger in SSD storage because of the high bandwidth that SSDs bring.

  • The proportion of costs for each component changes little after adopting EBOX. It is worth noting that the purchase and energy consumption of DRAM are significant costs for HDD storage, but EBOX does not provide improvements in this regard.

  • The overall cost savings are around 10%, which requires suppliers to mature manufacturing over 5 years. Compared to SSD storage, the savings ratio is lower because CPU costs in HDD storage are not high. Calculated in the same way, 10% over 5 years corresponds to an average annual cost reduction of 2%. This can support roughly an 8% stock price increase in the first year.

EBOX HDD Storage cost compare 5 years

It can be seen that for SSD storage, EBOX yields good returns. The most effective improvements come from erasure coding and the CPU. Unexpectedly, replacing PCIe with Ethernet does not bring much benefit, and the additional network costs introduced by EBOX separation are also low.

Dilution of R&D costs

Continuing from the previous text, the next question is, how much R&D cost does EBOX require? How many PB of storage does EBOX need to sell to dilute its cost? First, we can reasonably infer the cost-related parameters:

  • The storage cost per GB data comes from the calculations in the Hardware in Storage Systems chapter. This cost is relatively low; referring to the public cloud storage price comparison in that chapter, the sales price below is set at 10x (markup).

  • The approximate monthly salary data for developers hired in different countries comes from the internet, with the United States being the highest. It is assumed that developing EBOX while maintaining the original product requires 200 people, and that the company’s total employment cost is 2x the monthly salary.

Storage dev cost parameter

How many PB of storage does a development team of this scale need to sell to sufficiently cover its own monthly salary? This can be calculated, as shown in the figure below:

  • Taking the most expensive US employment as an example, selling HDD storage requires reaching about 1.8K PB. In contrast, selling SSD storage is more profitable, requiring only about one tenth as many PB to cover salaries.

  • If hiring from other countries, there is hope to immediately reduce the required storage PB sales by half, with the other half turning into profit. This shows that cross-border employment has huge potential benefits.

Storage dev sell PB to pay salary
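A minimal sketch of this break-even calculation. The monthly salary and the per-GB storage cost are placeholders chosen only to land in a plausible range; the 200-person team, the 2x employment multiplier, and the 10x markup follow the text.

```python
def pb_to_cover_payroll(monthly_salary: float, cost_per_gb_month: float,
                        headcount: int = 200, employment_multiplier: float = 2.0,
                        markup: float = 10.0) -> float:
    """PB that must be sold per month so the gross margin covers payroll."""
    payroll = monthly_salary * headcount * employment_multiplier
    margin_per_gb = cost_per_gb_month * (markup - 1)   # sale price minus cost
    return payroll / margin_per_gb / 1e6               # GB -> PB

# Placeholder inputs: $15k/month US salary, $0.0004/GB-month HDD storage cost.
print(f"{pb_to_cover_payroll(15_000, 0.0004):,.0f} PB of HDD storage")   # ~1.7K PB
```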

An interesting finding is that the 996 work schedule can significantly increase employee output, thereby reducing the number of employees and cutting expenses. See the figure below. Of course, the following analysis is still based on a normal working hours system (40 hours per week).

  • Under a 40-hour workweek, actual development output is only 31% of the 40 hours. This is because the fixed costs of development work are very high, for example, 20% of the time is spent in meetings, 20% on operations and troubleshooting, and 20% on learning. Additionally, public holidays and paid annual leave take up about 9% of the time.

  • Compared to the 40-hour workweek, the 996 work schedule can quickly raise output to 2.6 times, corresponding to 60 hours of work per week. This is because the additional working hours do not change the fixed costs and translate directly into development output, producing a significant marginal effect. This does not take into account the fatigue caused by long-term 996.

  • More aggressively, the 7 am to 10 pm, 7 days a week work schedule can further increase output to 5 times, corresponding to 90 hours of work per week. This allows cash-strapped startups to significantly reduce costs while waiting for returns from later scaling. Note that working 90 hours per week is a common level at SpaceX[63].

Storage dev efficiency
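The output multipliers above follow directly from the stated overhead shares; a minimal sketch:

```python
BASE_WEEK = 40
# Fixed weekly overhead: meetings (20%), ops/troubleshooting (20%), learning (20%),
# plus public holidays and paid leave (~9%), all measured against a 40-hour week.
OVERHEAD_HOURS = BASE_WEEK * (0.20 + 0.20 + 0.20 + 0.09)   # ~27.6 h

def output_hours(week_hours: float) -> float:
    """Real development output; hours beyond 40 are assumed to be pure output."""
    return week_hours - OVERHEAD_HOURS

base = output_hours(BASE_WEEK)                 # ~12.4 h, i.e. ~31% of 40 h
for label, hours in (("40 h", 40), ("996 (~60 h)", 60), ("~90 h", 90)):
    print(f"{label}: {output_hours(hours) / base:.1f}x output")
# 40 h: 1.0x, 60 h: ~2.6x, 90 h: ~5.0x
```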

As the scale of storage sales grows, how can R&D costs be diluted? The chart below shows the revenue changes as storage PB sales increase. The unit is $M, and the period is annual.

  • As storage PB sales increase, revenue grows linearly: you get what you pay for. For the same storage PB sales, SSD revenue is about ten times that of HDD.

  • If revenue exceeds $1B, HDD needs to sell about 20K PB of storage, while SSD needs to sell about 2K PB of storage. If revenue exceeds $10B, HDD needs to sell about 200K PB of storage, and SSD needs to sell about 20K PB of storage.

  • For the global cloud storage market size of about $160B [46], it can be inferred that $1B corresponds to the level of a small public cloud, about 1% of the global share. $10B corresponds to the level of a top public cloud, about 10% of the global share.

Storage revenue yearly by sold PB
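
Since annual revenue is simply price per GB-month times capacity sold times twelve, the curve in the chart is a straight line. A small sketch with assumed prices (roughly a 10x markup over the cost table, SSD priced about 10x HDD per GB) reproduces the orders of magnitude quoted above:

```python
# Revenue scaling sketch: revenue grows linearly with PB sold.
# Prices are illustrative assumptions, not the article's exact table values.

SALE_PRICE_PER_GB_MONTH = {"HDD": 0.004, "SSD": 0.04}   # assumed USD per GB-month
GB_PER_PB = 1_000_000

def annual_revenue_usd(media: str, pb_sold: float) -> float:
    """Yearly revenue from a given number of PB under management."""
    return SALE_PRICE_PER_GB_MONTH[media] * GB_PER_PB * pb_sold * 12

for media in ("HDD", "SSD"):
    for pb in (2_000, 20_000, 200_000):
        print(f"{media} {pb:>7,} PB -> ${annual_revenue_usd(media, pb) / 1e9:6.1f}B / year")
# Roughly: HDD needs ~20K PB for $1B and ~200K PB for $10B; SSD needs ~1/10 of that.
```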

Here is the key question: how does the net profit margin change with storage PB sales? The net profit margin here is revenue minus storage costs and R&D costs, divided by revenue. The calculation below uses US employment, where labor costs are the highest. In the chart below, a dramatic scale effect can be observed.

  • When does profitability begin? HDD storage becomes profitable once sales exceed a critical point of about 2K PB, while SSD only needs about 200 PB. Below the critical point, however, losses are severe because of the fixed R&D costs: HDD losses can reach about -700%, and SSD about -60%.

  • After surpassing the critical point, not only does revenue increase, but the net profit margin also rises rapidly, which is an excellent scale effect. In ordinary businesses, net profit margins often decline as revenue increases.

  • The net profit margin quickly rises to 90%, an extremely profitable level. The usual experience is that manufacturing runs a net margin of around 5%, advanced manufacturing 15%~20%, and excellent software businesses about 30% [47].

  • A net profit margin of 80%~90% requires only about $1B in revenue. Combined with the previous paragraph, $1B in revenue corresponds to about 1% of the global market share, i.e. a small public cloud provider. This means the scale effect of ultra-high profits does not actually require a very large scale. Meanwhile, a top public cloud provider with 10% global market share can reliably earn huge revenue and extremely high net profit margins.

  • Net profit margin is insensitive to costs. The chart below additionally shows net profit margins after doubling and quadrupling R&D costs. It can be seen that after reaching a certain scale, doubling R&D costs hardly affects the net profit margin, which still remains at 80%~90%. The business is very robust.

Storage net income % by sold PB
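
The scale effect can be reproduced with the same assumed prices plus a fixed annual R&D cost. The sketch below is illustrative, not the article's exact spreadsheet; the R&D figure (~$72M/year) corresponds to the earlier assumption of 200 US hires at 2x salary cost.

```python
# Net margin sketch: margin = (revenue - storage cost - R&D cost) / revenue.
# Storage cost is revenue / markup (10x markup); R&D cost is a fixed payroll.
# All inputs are illustrative assumptions consistent with the earlier sketches.

SALE_PRICE_PER_GB_MONTH = {"HDD": 0.004, "SSD": 0.04}   # assumed USD per GB-month
MARKUP = 10.0
GB_PER_PB = 1_000_000
RND_COST_PER_YEAR = 72e6        # assumed: 200 people x $15K/month x 2 x 12 months

def net_margin(media: str, pb_sold: float, rnd_cost: float = RND_COST_PER_YEAR) -> float:
    """Annual net profit margin at a given sales volume."""
    revenue = SALE_PRICE_PER_GB_MONTH[media] * GB_PER_PB * pb_sold * 12
    storage_cost = revenue / MARKUP
    return (revenue - storage_cost - rnd_cost) / revenue

for media, volumes in (("HDD", (200, 2_000, 20_000)), ("SSD", (100, 200, 2_000))):
    for pb in volumes:
        print(f"{media} {pb:>6,} PB -> net margin {net_margin(media, pb):6.0%}")
# HDD turns profitable near ~2K PB and SSD near ~200 PB; at ~$1B revenue the
# margin approaches 80%~90%, and doubling RND_COST_PER_YEAR barely moves it.
```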

It can be seen that for storage businesses of a certain scale, bearing the R&D costs of EBOX is sufficient, even more than enough. Of course, the analysis in this article is much simplified compared to reality, aiming for simplicity and clarity to illustrate the thought process.

Suppliers and Market

Strategic thinking means considering not only the storage side itself but also the other side, which here is the suppliers selling EBOX. A successful cloud storage strategy requires supplier cooperation, especially for new hardware. Assuming suppliers originally sell hard drives to public clouds, the question in this section is: from the supplier’s perspective, should they launch a new EBOX product for sale?

As a basis for thinking, the Issue Tree Framework (see the earlier Analysis Methods section) can be used to break down the problem:

  • Market Demand

    • Public Cloud Demand
    • Competitive Products
  • Product Feasibility

    • Technical Feasibility
    • Manufacturing Feasibility
  • Financial Capability

    • Potential Revenue
    • R&D Costs
  • Risk

    • Customer Adoption
    • Supply Chain

First, look at the market demand aspect:

  • From the previous analysis, it is possible to collaborate with the public cloud on joint design and production to secure demand. The public cloud side intends to gain cost advantages through EBOX and is actively seeking suppliers.

  • Launching the product ahead of others is beneficial for surpassing competitors and expanding the existing market share. The public cloud is also more likely to make bulk purchases.

  • Expanding from selling hard drives to selling the complete EBOX machine broadens the sales scope, which is conducive to increasing revenue and additional profits.

Next, looking at the product feasibility aspect:

  • In terms of technical feasibility, EBOX is similar to a simplified customized server and is not a completely new technology. The key lies in integration and cost control, making it feasible.

  • In manufacturing, additional costs are incurred initially due to immaturity, but these can be passed on through pricing. The previous calculations show that public cloud providers can accept this manufacturing penalty in price.

Next, let's look at financial capability (a worked sketch of the arithmetic follows this list):

  • The premise mentions a cloud storage market size of $160B, with hard drives accounting for about 80% of the cost according to the data table. Assuming the supplier currently holds a 10% market share, the supplier’s current revenue is approximately $13B.

  • Launching the EBOX product ahead of competitors can expand market share. Assuming market share grows from 10% to 20%, revenue increases to approximately $26B.

  • Expanding sales from hard drives to EBOX broadens the sales scope and further increases revenue. Based on estimates from previous data tables, revenue increases by about 13%, with revenue further growing to approximately $29B.

  • Compared to selling hard drives, EBOX is more complex and can provide higher added profit. Assuming the net profit margin improves from the original 10% to 12%, the supplier’s net profit grows from about $1.3B to approximately $3.5B.

  • From $1.3B to $3.5B, net profit grows to roughly 2.7 times. If this growth occurs over 10 years, then combined with calculations from the Understanding Stock Prices section, it can support an average annual stock price growth of about 10%. The returns are favorable.

  • In terms of R&D costs, the large-scale procurement brought by public cloud can dilute costs. Moreover, EBOX is not a brand-new technology; its major cost component, hard drives, is an area where suppliers have mature experience. Even with R&D costs for 200 people, using the data table mentioned earlier, it accounts for less than 1% of $13B.
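
The supplier-side arithmetic in this list can be written out directly. Only the $160B market size [46] and the ~80% hard-drive cost share come from the text; the margins, the 13% scope expansion, and the 10-year horizon are the same rough assumptions used above.

```python
# Worked version of the supplier-side estimate: revenue, profit, and the implied
# annual growth rate. Figures are rough assumptions from the text, not forecasts.

CLOUD_STORAGE_MARKET = 160e9      # global cloud storage market, USD [46]
HDD_COST_SHARE = 0.80             # hard drives ~80% of storage cost

current_share, future_share = 0.10, 0.20
current_revenue = CLOUD_STORAGE_MARKET * HDD_COST_SHARE * current_share    # ~$13B
future_revenue = CLOUD_STORAGE_MARKET * HDD_COST_SHARE * future_share      # ~$26B
future_revenue *= 1.13            # EBOX widens the sales scope by ~13% -> ~$29B

current_profit = current_revenue * 0.10   # ~10% net margin selling bare hard drives
future_profit = future_revenue * 0.12     # assumed ~12% net margin selling EBOX

years = 10
cagr = (future_profit / current_profit) ** (1 / years) - 1   # implied annual growth

print(f"revenue : ${current_revenue/1e9:.1f}B -> ${future_revenue/1e9:.1f}B")
print(f"profit  : ${current_profit/1e9:.1f}B -> ${future_profit/1e9:.1f}B "
      f"(~{future_profit/current_profit:.1f}x, ~{cagr:.0%}/year over {years} years)")
```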

Finally, regarding risks:

  • Whether public clouds are willing to continuously purchase EBOX products in large quantities is a major risk. From the supplier’s perspective, it is best to avoid being tied to large customers, while also selling EBOX to private clouds and gradually increasing investment based on actual revenue.

  • Compared to simply selling hard drives, EBOX includes many additional components, with the CPU being the second largest hardware cost. Products can start with more general-purpose CPUs such as ARM, and only consider more customized DPUs or specialized chips in later generations. Under the guise of “allowing customization,” software costs can be passed on to public cloud customers.

From the above simple analysis, it can be seen that suppliers can also profit from launching EBOX products, and even achieve good returns.

Summary

This concludes the article. Set against the technical and industry background of (cloud) storage, it walks in turn through methodology, understanding stock prices, the market, market analysis, hardware, and the EBOX case study. The methodology section builds the thinking framework for Vision and Strategy. The stock price section analyzes its principles, understands the company's goals, and maps them to the team. The market section gives an overview of the competitive landscape of storage systems and analyzes key market characteristics, disruptive innovation, and value. The hardware section models its capabilities and development speed, with an in-depth look at key points. Finally, the case study section applies the article's analytical methods to the EBOX example, yielding many interesting conclusions.

References

[1] Hard Drive Performance Over the Years : https://goughlui.com/the-hard-disk-corner/hard-drive-performance-over-the-years/

[2] Disk Prices: https://diskprices.com/

[3] Enterprise SSDs cost ten times more than nearline disk drives : https://blocksandfiles.com/2020/08/24/10x-enterprise-ssd-price-premium-over-nearline-disk-drives/

[4] Next-generation Information Technology Systems for Fast Detectors in Electron Microscopy : https://arxiv.org/ftp/arxiv/papers/2003/2003.11332.pdf

[5] SSDs Have Become Ridiculously Fast, Except in the Cloud : https://databasearchitects.blogspot.com/2024/02/ssds-have-become-ridiculously-fast.html

[6] A Modern Primer on Processing in Memory : https://arxiv.org/pdf/2012.03112

[7] Wikipedia DDR SDRAM: https://en.wikipedia.org/wiki/DDR_SDRAM

[8] Trends in DRAM price per gigabyte : https://aiimpacts.org/trends-in-dram-price-per-gigabyte/

[9] Bandwidth and latency of DRAM and HBM : https://www.researchgate.net/figure/Bandwidth-and-latency-of-DRAM-and-HBM-and-the-impact-of-latency-on-application_fig2_329551516

[10] High-bandwidth memory (HBM) options for demanding compute : https://www.embedded.com/high-bandwidth-memory-hbm-options-for-demanding-compute/

[11] Wikipedia High Bandwidth Memory : https://en.wikipedia.org/wiki/High_Bandwidth_Memory

[12] More Than Anything Else, Cost Per Bit Drives Datacenter Ethernet : https://www.nextplatform.com/2021/08/30/more-than-anything-else-cost-per-bit-drives-datacenter-ethernet/

[13] Wikipedia PCI Express : https://en.wikipedia.org/wiki/PCI_Express

[14] Quora Motherboard price changed over time : https://www.quora.com/How-has-the-cost-of-CPUs-on-motherboards-changed-over-time-Is-there-a-significant-difference-in-their-usage

[15] Quora What is the latency of a PCIe connection : https://www.quora.com/What-is-the-latency-of-a-PCIe-connection

[16] Wikipedia NVLink : https://en.wikipedia.org/wiki/NVLink

[17] Doubling of Data Center Ethernet Switch Bandwidth Every Two Years : https://www.prnewswire.com/news-releases/doubling-of-data-center-ethernet-switch-bandwidth-every-two-years-continued-in-2022-reports-crehan-research-301793556.html

[18] Timed Linux Kernel Compilation: https://openbenchmarking.org/test/pts/build-linux-kernel-1.16.0

[19] 2019 recent trends in Geekbench score per CPU price : https://aiimpacts.org/2019-recent-trends-in-geekbench-score-per-cpu-price/

[20] NVIDIA A100 GPU Benchmarks for Deep Learning : https://lambdalabs.com/blog/nvidia-a100-gpu-deep-learning-benchmarks-and-architectural-overview?srsltid=AfmBOoqh1Spj-txULhl0GTfLiqVJ2A_G-Sv3mCNiPC5UC2fnpuWI9o9s

[21] Trends in GPU Price-Performance : https://epoch.ai/blog/trends-in-gpu-price-performance

[22] Scality claims disk drives can use less electricity than high-density SSDs : https://blocksandfiles.com/2023/08/08/scality-disk-drives-ssds-electricity/

[23] Wikipedia Solid-state drive : https://en.wikipedia.org/wiki/Solid-state_drive

[24] GreenDIMM: OS-assisted DRAM Power Management for DRAM with a Sub-array Granularity Power-Down State : https://dl.acm.org/doi/fullHtml/10.1145/3466752.3480089

[24] ChatGPT Datacenter Networking Device Consuming Power Watt: https://chatgpt.com/share/676e8a82-bbf4-800f-8859-b34f22f95fee

[25] Gigabyte Server “Power Consumption” Roadmap Points To 600W CPUs & 700W GPUs By 2025 : https://wccftech.com/gigabyte-server-power-consumption-roadmap-points-600w-cpus-700w-gpus-by-2025/

[26] CPU, GPU and MIC Hardware Characteristics over Time: https://www.karlrupp.net/2013/06/cpu-gpu-and-mic-hardware-characteristics-over-time/

[27] Understanding the Energy Consumption of Dynamic Random Access Memories : https://www.seas.upenn.edu/~leebcc/teachdir/ece299_fall10/Vogelsang10_dram.pdf

[28] AMD Dives Deep On High Bandwidth Memory - What Will HBM Bring AMD : https://www.anandtech.com/show/9266/amd-hbm-deep-dive/4

[29] Reddit HBM cost and CPU memory cost comparison : https://www.reddit.com/r/chipdesign/comments/166thgi/hbm_cost_and_cpu_memory_cost_comparison/

[30] Compute Express Link (CXL): All you need to know : https://www.rambus.com/blogs/compute-express-link/

[31] Instructions/clock-cycle for each core of Intel Xeon CPUs compared with FLOPs/clock-cycle of Nvidia high-performance GPUs : https://www.researchgate.net/figure/nstructions-clock-cycle-for-each-core-of-Intel-Xeon-CPUs-compared-with-FLOPs-clock-cycle_fig6_319072296

[32] Google Data Centers Efficiency : https://www.google.com/about/datacenters/efficiency/

[33] Global PUEs — are they going anywhere? : https://journal.uptimeinstitute.com/global-pues-are-they-going-anywhere/

[34] Component-wise energy consumption of a server : https://www.researchgate.net/figure/Component-wise-energy-consumption-of-a-server-23-24_fig5_355862079

[35] Azure Blob Storage pricing: https://azure.microsoft.com/en-us/pricing/details/storage/blobs/

[36] Seagate Plans To HAMR WD’s MAMR; 20TB HDDs With Lasers Inbound : https://www.tomshardware.com/news/seagate-wd-hamr-mamr-20tb,35821.html

[37] Using SSD data placement to lessen SSD write amplification : https://blocksandfiles.com/2023/08/14/using-ssd-data-placement-to-lessen-write-amplification/

[38] Big Data: The 3 Vs explained : https://www.bigdataldn.com/en-gb/blog/data-engineering-platforms-architecture/big-data-the-3-vs-explained.html

[39] Intel Data Direct I/O Technology : https://www.intel.com/content/www/us/en/io/data-direct-i-o-technology.html

[40] Declarative IO - Cluster Storage Systems Need Declarative I/O Interfaces : https://youtube.com/watch?v=TGWKZnJeNmA&si=AC6gaUtfnPjIt_vB

[41] CPU open IPC_benchmark : https://openbenchmarking.org/test/pts/ipc-benchmark&eval=a29a620e89e1cb4ff15d5d31d24eaae1cc059b0e

[42] zStorage Distributed Storage Technology: Summary of 2023, Outlook for 2024 : https://mp.weixin.qq.com/s/uXH8rkeJL_JMbKT3H9ZuCQ

[43] Vast Data white paper : https://www.vastdata.com/whitepaper/#TheDisaggregatedSharedEverythingArchitecture

[44] NVMe Over Fabrics Architectures for Disaggregated Storage : https://www.snia.org/sites/default/files/ESF/Security-of-Data-on-NVMe-over-Fabrics-Final.pdf#page=25

[45] Network Attached Storage and NFS-eSSD : https://msstconference.org/MSST-history/2023/FlynnPresentation2.pdf#page=7

[46] Fortune Cloud Storage Market Size, Share & Industry Analysis : https://www.fortunebusinessinsights.com/cloud-storage-market-102773

[47] Margins by Sector (US) : https://pages.stern.nyu.edu/~adamodar/New_Home_Page/datafile/margin.html

[48] MSFT (Microsoft) Preferred Stock : https://www.gurufocus.com/term/preferred-stock/MSFT

[49] Stock Analysis MSFT stock analysis : https://stockanalysis.com/stocks/msft/statistics/

[50] Trading Economics United States 30 Year Bond Yield : https://tradingeconomics.com/united-states/3-month-bill-yield

[51] Investing.com Microsoft corp historical dividends : https://in.investing.com/equities/microsoft-corp-dividends

[52] Jacob’s view on whether Microsoft is worth investing in now based on EPS three years from now : https://mp.weixin.qq.com/s/IU03qeV53bcK75U-sfRMGg

[53] Wikipedia Beta Coefficient : https://zh.wikipedia.org/wiki/Beta%E7%B3%BB%E6%95%B0

[54] Equity Risk Premium (ERP) : https://www.wallstreetprep.com/knowledge/equity-risk-premium/

[55] Capital Asset Pricing Model (CAPM) : https://www.wallstreetprep.com/knowledge/capm-capital-asset-pricing-model/

[56] Investing.com S&P 500: https://in.investing.com/indices/us-spx-500

[57] TradingView US companies with the highest dividend yields : https://www.tradingview.com/markets/stocks-usa/market-movers-high-dividend/

[58] Microsoft Cloud strength fuels third quarter results: https://news.microsoft.com/2024/04/25/microsoft-cloud-strength-fuels-third-quarter-results-3/

[59] Gartner Magic Quadrant for Primary Storage Platforms 2024: https://www.purestorage.com/resources/gartner-magic-quadrant-primary-storage.html

[60] Fortune Data Storage Market Size, Share & Industry Analysis: https://www.fortunebusinessinsights.com/data-storage-market-102991

[61] Key Insights for Gartner Magic Quadrant 2024 for Strategic Cloud Platforms: https://alnafitha.com/blog/key-insights-from-gartner-magic-quadrant-2024-for-cloud/

[62] Gartner Cloud Integrated IaaS and PaaS Solution Scorecard Comparison 2021: https://clouddecisions.gartner.com/a/scorecard/#/iaas-alibaba-vs-aws-vs-google-vs-ibm-vs-azure-vs-oracle

[63] Quora What is it like to work at SpaceX? : https://www.quora.com/What-is-it-like-to-work-at-SpaceX

[64] Gartner Critical Capabilities for Primary Storage 2023 : https://mp.weixin.qq.com/s/O5j1nNt3cqQT6RmG7wEy_g

[65] RackTop The Buyer’s Guide to Cyberstorage Features : https://www.racktopsystems.com/the-buyers-guide-to-cyberstorage-features/

[66] Gartner Top Trends in Enterprise Data Storage 2023: https://www.purestorage.com/resources/type-a/gartner-top-trends-enterprise-data-storage-2023.html

[67] Customer feedback from 6 storage system companies: https://mp.weixin.qq.com/s/Ri6pdeJ5-82pHBaGz-wlKw

[68] Hunan Province Provincial-level E-Government External Network Unified Cloud Platform Resource Supplement Project: https://mp.weixin.qq.com/s/S4-2XbFDp6qB-8S01qSEEw

[69] Xigua Brother Gartner Hype Cycle for Storage Technologies 2024: https://mp.weixin.qq.com/s/Ct5bq_QsF7Tu_r6bvqUSFg

[70] SmartX Hype Cycle for Storage and Data Protection Technologies, 2022 : https://www.smartx.com/blog/2022/08/gartner-hype-cycle-storage/

[71] Deep Data Cloud Gartner Hype Cycle for Data Management, 2023: https://zhuanlan.zhihu.com/p/656920047

[72] MarketResearchFuture Data Backup And Recovery Market Overview : https://www.marketresearchfuture.com/reports/data-backup-recovery-market-29073

[73] GrandViewResearch Enterprise Information Archiving Market Size: https://www.grandviewresearch.com/industry-analysis/enterprise-information-archiving-market-report

[74] Gartner Magic Quadrant for Enterprise Backup and Recovery Software Solutions: https://www.zen.com.my/wp-content/uploads/2024/01/Backup-Vendor-Magic-Quadrant-2023-1-1.pdf

[75] Veeam, Rubrik Lead in Enterprise Backup/Recovery Report : https://virtualizationreview.com/articles/2024/08/09/veeam-rubrik-lead-in-enterprise-backup-recovery-market-report.aspx

[76] Disaster Recovery with Cloud Recovery Assurance : https://www.appranix.com/resources/blogs/2023/07/disaster-recovery-with-cloud-recovery-assurance.html

[77] Gartner Magic Quadrant for File and Object Storage Platforms 2024: https://www.purestorage.com/resources/gartner-magic-quadrant-file-object-storage-platforms.html

[78] VMR Global Distributed File Systems and Object Storage Solutions Market By Type : https://www.verifiedmarketreports.com/product/distributed-file-systems-and-object-storage-solutions-market/

[79] MarketResearchFuture Global Cloud Object Storage Market Size: https://www.marketresearchfuture.com/reports/cloud-object-storage-market-4202

[80] Nasui Types & Volume of Files in the Enterprise : https://youtu.be/8FHihZvyFFM?si=KbiVDWqHfStLMiZU&t=330

[81] VMR Global Block Storage Software Market By Type : https://www.verifiedmarketreports.com/product/block-storage-software-market/

[82] GrandViewResearch Database Management System Market Size : https://www.grandviewresearch.com/industry-analysis/database-management-systems-dbms-market

[83] GrandViewResearch Cloud Database And DBaaS Market Size: https://www.grandviewresearch.com/industry-analysis/cloud-database-dbaas-market-report

[84] Gartner Magic Quadrant for Cloud Database Management Systems 2024: https://www.databricks.com/resources/analyst-paper/databricks-named-leader-by-gartner

[85] MarketResearchFuture Enterprise Flash Storage Market Overview: https://www.marketresearchfuture.com/reports/enterprise-flash-storage-market-31294

[86] MarketResearchFuture Tape Storage Market Overview : https://www.marketresearchfuture.com/reports/tape-storage-market-33976

[87] MarketResearchFuture Global Hard Disk Market Overview : https://www.marketresearchfuture.com/reports/hard-disk-market-8306

[88] MBA智库 Porter’s Five Forces Analysis Model: https://wiki.mbalib.com/wiki/%E6%B3%A2%E7%89%B9%E4%BA%94%E5%8A%9B%E5%88%86%E6%9E%90%E6%A8%A1%E5%9E%8B

[89] History of Zartbot DPU and Network Processors: https://mp.weixin.qq.com/s/BZOvVrg3GtTurMe2Q6ZIcg

[90] Andy730 Disruptive Innovation? Already Heard in the Storage Industry: https://mp.weixin.qq.com/s/NFQYEwrYCwKvTjpQdLkcQA

[91] LinkedIn Course Critical Thinking by Mike Figliuolo : https://www.linkedin.com/learning/critical-thinking

[92] Profitability Framework and Profit Trees The Complete Guide : https://www.craftingcases.com/profitability-tree-guide/

[93] LinkedIn Course Strategic Thinking by Dorie Clark : https://www.linkedin.com/learning/strategic-thinking

[94] LinkedIn Course Business Acumen by Mike Figliuolo : https://www.linkedin.com/learning/developing-business-acumen

[95] GetAbstract The Unspoken Truths for Career Success : https://www.getabstract.com/en/summary/the-unspoken-truths-for-career-success/46904

[96] LinkedIn Course Management Foundation by Kevin Eikenberry : https://www.linkedin.com/learning/management-foundations-2019


