What is your data worth? - Short & Todd (2017)

Comment: The previous articles are concerned with the economic value of data for statistical purposes. This article discusses internal business value. The gist of the approach suggested by Short & Todd is to use a triggering event to determine a company's need for valuation, which requires having systems in place to permit valuation on demand. Pre-positioning requires the allocation of resources to continuously improve data systems. Here, data is defined as intangible.

Short, J.E. and S. Todd (2017): What is your data worth?, MIT Sloan Management Review, Vol. 58, No. 3, https://sloanreview.mit.edu/article/whats-your-data-worth/ Reprinted here: https://oag.ca.gov/sites/all/files/agweb/pdfs/privacy/short-whats-your-data-worth.pdf [James Short, Ph.D.: Lead Scientist, San Diego Supercomputer Center (SDSC); Steve Todd: Fellow & VP Strategy and Innovation at Dell EMC/Dell Technologies]


    Part (1) Describes the market impact of data assets
    Part (2) Explores the methods of market data valuation
    Part (3) Constructs a framework for valuing data
    Part (4) Suggests a path forward, circa 2017

Part (1) (p.17): Introduction - Uses two instances where companies needed to determine data valuations to illuminate the impact of knowing the value of a company's data (Microsoft's purchase of LinkedIn; the Chapter 11 bankruptcy proceedings of Caesars Entertainment Corp.).

Part (2) (p.17-18): "Exploring Data Valuation" - Discusses project activities: interviews and research into 36 North American and European companies and nonprofit organizations. The interviewees spanned several sectors, and most of the companies earned US$1 billion+. The authors discovered that most were focused on managing big data, not valuing it, and used this discovery knowledge to determine the business impact of data assets by:

  • Interviewing "chief financial and marketing officers and, in the case of regulatory compliance, legal officers"; and
  • Identifying significant business events triggering the need for data valuation, such as mergers and acquisitions, bankruptcy filings, or acquisitions and sales of data assets.

As is now (2024) widely known: every company was overwhelmed with data; the volume of stored data "was growing on average by 40% per year"; teams were hard pressed to manage their data assets because doing so is "time-consuming and complex"; and this placed extreme pressure "on management to know which data was most valuable."

Part (3) (p.18-19): "A Framework for Valuing Data" - The authors used research results to classify business data as a composite of three sources of value:

  1. Data as Strategic Asset (asset value or stock value) - "Monetizing data assets means looking at the value of customer data"; i.e. using customer data to generate monetary value either directly (sell, trade, acquire) or indirectly (data not sold; availability of data used to create a new product or service).
  2. The Value of Data in Use (activity value) - the impact of the cost to access and use data, and the frequency of use; with the additional impact that "data has the potential...to increase in value the more that it is used";
  3. Expected Future Value (the determination of value for recording on balance sheets) - as intangible assets that are "co-mingled with other intangible assets, such as trademarks, patents, copyrights, and goodwill."
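The three sources of value lend themselves to a simple composite estimate. The sketch below is an illustration only: the class name, the component figures, and the weights are all invented assumptions, not Short & Todd's method.

```python
# Toy sketch combining the three sources of data value into one
# composite estimate. All names, numbers, and weights here are
# illustrative assumptions, not taken from the article.

from dataclasses import dataclass

@dataclass
class DataAssetValue:
    asset_value: float            # 1) strategic/stock value (e.g., licensing estimate)
    activity_value: float         # 2) value of data in use (usage-driven)
    expected_future_value: float  # 3) intangible, balance-sheet-style estimate

    def composite(self, weights=(0.4, 0.4, 0.2)) -> float:
        """Weighted sum of the three components (weights are arbitrary)."""
        w1, w2, w3 = weights
        return (w1 * self.asset_value
                + w2 * self.activity_value
                + w3 * self.expected_future_value)

customer_data = DataAssetValue(asset_value=1_000_000,
                               activity_value=250_000,
                               expected_future_value=500_000)
print(round(customer_data.composite()))  # 600000
```

In practice the weights themselves would be set by the triggering event: a bankruptcy proceeding would weight asset value heavily, while an operational review would emphasize activity value.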

Part (4) (p.19): "What Can Companies Do?" - Moving forward, the authors suggest "three practical steps" to improve company practices:

  1. Make valuation policies explicit and shareable
  2. Build in-house data valuation expertise
  3. Choose a top-down or bottom-up metadata control process

Top-down approach: 

  1. Identify critical applications
  2. Assign a value to the data used in critical applications
  3. Define the main system linkages (systems connecting systems' data flows)
  4. Use 1-2-3 to develop internal IT and business unit partnerships
  5. Use 1-2-3-4 to develop a prioritizing system

Bottom-up approach - Define value heuristically:

  1. Create a map of data usage across core data sets
  2. Assess data flows and linkages
  3. Produce a detailed usage patterns analysis
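The three bottom-up steps can be sketched heuristically: map which applications touch which core data sets, then score each data set by breadth of use. The application and data-set names below are invented for illustration.

```python
# Minimal sketch of the bottom-up approach: a usage map across core
# data sets, assessed by counting flows, then ranked by usage.
# Application and data-set names are hypothetical.

from collections import Counter

# Step 1: create a map of data usage (application -> data sets it reads)
usage_map = {
    "billing":   ["customers", "orders"],
    "marketing": ["customers", "clickstream"],
    "logistics": ["orders", "inventory"],
}

# Step 2: assess flows and linkages - how many applications use each data set
usage_counts = Counter(ds for deps in usage_map.values() for ds in deps)

# Step 3: produce a usage-pattern analysis - rank data sets by breadth of use
ranked = usage_counts.most_common()
print(ranked)  # e.g., [('customers', 2), ('orders', 2), ...]
```

A data set touched by many applications is, heuristically, more valuable to the business than one read by a single system, which is the intuition the bottom-up approach formalizes.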


Notes and analysis by blogger. Image: Pxhere. CC0: 114437 

Labels: Dimensional;Company;Valuation;Data;Metadata;Control



A Review of Data Valuation Approaches - Fleckenstein, Obaidi & Tryfona (2023)

Comment: A valuation approach guided by the application of professional best judgment, prompted by the absence of a repeatable, scalable standard to measure the value of data.

Fleckenstein, M., Obaidi, A., & Tryfona, N. (2023): A Review of Data Valuation Approaches and Building and Scoring a Data Valuation Model, Harvard Data Science Review, 5(1). https://doi.org/10.1162/99608f92.c18db966 & https://hdsr.mitpress.mit.edu/pub/1qxkrnig/release/1 MITRE Corporation: Approved for Public Release. Distribution Unlimited. Public Release Case Number: 21-3464.

The authors report that there is increasing desire to treat data as an asset "in both the private and public sectors...However, this remains a challenge, as data is an intangible asset. Today, there is no standard to measure the value of data." Much like Azcoitia, it is this team's view that there is no "repeatable approach to data valuation": the use case will define the selection of the methods used to determine value.


  • Part (1) Introduces current practice    
  • Part (2) Reports classification of data valuation models
  • Part (3) Reports an assessment of the model classes
  • Part (4) Reports the results of test case analysis
  • Part (5) Presents conclusions and references

Part (1): Discusses three overlapping approaches to valuation: Business (P&L), Public Goods (Government/Non-profit), Dimensional (Attributes of Value).

Part (2): Data Valuation Framework & Part (3): Model Details - The authors studied different methods spanning more than 40 years, then grouped the methods into three classes:

1) Market-based models (estimates of cost and revenue): "The market-based model values data based on income (e.g., selling data), cost (e.g., buying data), and/or stock value (e.g., value of data-intensive organizations). Organizations routinely buy and sell data and data-intensive companies."

2) Economic models (estimates of economic and public benefit): "The economic model values data in terms of its economic impact. This model is frequently used by governments to assess the value of publicizing data. For example, governments share weather data, which helps sustain an ecosystem of weather forecasting."

3) Dimensional models (using categories or dimensions): "The dimensional model values data by assessing attributes inherent to a data set (e.g., data volume, variety, and quality) as well as the context in which data is used (e.g., how the data will be used and integrated with other data). For example, organizations inherently decide to acquire, keep, or prioritize one of several similar but different data sets. To date, this is an informal process."


The researchers note that the models are not fit for purpose for all use cases, are speculative, can overlap, and can be influenced by factors other than the data itself. Figure 1 is a Venn diagram that helpfully illustrates the overlap of the three model classes in the taxonomy.

Part (3): The authors review the strengths and weaknesses of each class of model.

    1) Market-based: Sec.2.3 & 3.1
    2) Economic: Sec.2.4 & 3.2
    3) Dimensional: Sec.2.5 & 3.3

The authors note there is no single method to deliver a standard valuation. The type must be selected to fit the use case. The value-add of the approach is that many methods may have to be used in a "framework, with each use case leveraging one or more models" to build a multi-dimensional estimate.

Part (4): Building and Scoring a Dimensional Data Valuation Model - Details the building and scoring of a dimensional data valuation model, which was used to define dimensions to assess the value of two use cases. The goal "was to design an easy-to-use, customizable approach that helps organizations assess the value of specific data sets for specific use cases using a small, consistent set of dimensions." The method draws on "professional data management experience"; and in the use case of flight scheduling and navigation data, "we vetted the results with the data set owners."
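A dimensional scoring model of this kind can be sketched as a weighted average over a small set of dimensions. The dimensions, scales, weights, and data-set names below are illustrative assumptions in the spirit of the paper, not its actual model.

```python
# Hedged sketch of a dimensional data valuation model: score a data
# set on a small, consistent set of dimensions (0-5 scale) with
# use-case-specific weights. Dimensions, weights, and scores below
# are invented for illustration, not taken from Fleckenstein et al.

def score_dataset(scores: dict, weights: dict) -> float:
    """Weighted average of dimension scores (each on a 0-5 scale)."""
    total_weight = sum(weights.values())
    return sum(scores[d] * w for d, w in weights.items()) / total_weight

# Hypothetical weights for a flight-scheduling use case
weights = {"quality": 3, "timeliness": 2, "coverage": 2, "uniqueness": 1}

flight_schedule = {"quality": 4, "timeliness": 5, "coverage": 3, "uniqueness": 2}
alternate_feed  = {"quality": 3, "timeliness": 4, "coverage": 4, "uniqueness": 4}

# The model's stated strength: comparing two similar data sets for one
# use case - not producing a monetary value.
print(score_dataset(flight_schedule, weights))  # 3.75
print(score_dataset(alternate_feed, weights))   # 3.625
```

Note the output is a relative score, which matches the authors' conclusion below that the dimensional approach "falls short of being able to value data in monetary terms."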

Part (5): Conclusions and References - The authors report developing "an easy-to-use, repeatable model to value data for two use-cases," where the model combines (a) dimensional analysis; (b) "professional data management experience"; and (c) for one of the two use cases, review of results by the data set owners.

The authors conclude that the dimensional approach "can be used effectively to compare two similar data sets or to evaluate the addition of a data set to an existing data pool...(but) falls short of being able to value data in monetary terms;" and will likely require use of the other model types to fully develop a valuation.

References included.

Notes and analysis by blogger. Image: Pxhere. CC0.

Labels: Dimensional;Economic;Market-based;Valuation;Data quality


 

Navigating the Pricing Conundrum - Azcoitia, Iordanou, and Laoutaris (2021)

Comment: Recommended. An independent analysis of pricing schema for data products and services.

(2021) S. A. Azcoitia, C. Iordanou, and N. Laoutaris, "Measuring the Price of Data in Commercial Data Marketplaces," in DE '22: Proceedings of the 1st International Workshop on Data Economy, December 2022, 1-7, https://doi.org/10.1145/3565011.3569053 (Accessed Q1 2024).

The work was a precursor to: S. A. Azcoitia, C. Iordanou and N. Laoutaris, "Understanding the Price of Data in Commercial Data Marketplaces," 2023 IEEE 39th International Conference on Data Engineering (ICDE), Anaheim, CA, USA, 2023, pp. 3718-3728, doi: 10.1109/ICDE55515.2023.00300.

The Azcoitia, Iordanou, and Laoutaris study (2023), reported earlier, is a detailed exploration of the complexities of determining how much data is worth in the ever-changing world of commercial data marketplaces. The research follows data on its journey from creation to trade, and examines the pricing structures of data products, services, and marketplaces.
  • Part (1) introduces the data marketplace ("DM").
  • Parts (2) & (3) report the mechanisms of pricing of data products.
  • Part (4) deep dives into the marketplace for AWS data.
  • Part (5) compares data products across marketplaces.
  • Part (6) dives into the features driving pricing.
  • Parts (7) & (8) present related works, conclusions, and future work plans.

Part (1) introduces the problem of supporting "Data-driven decision making powered by ML algorithms"; in this study, two data marketplaces are used to develop a mechanism for studying pricing.

Part (2) reports the pricing mechanisms of 10,772 data products, ranging from one-off purchases to telecoms, manufacturing, and gaming data. The researchers note, interestingly, that the majority of data products are priced through "direct negotiation between the seller and interested buyers."

It is very much an age-old story: bartering to determine value on the spot.

Parts (3) & (4) report the details of the market for data products and services, using the AWS ecosystem to explore a marketplace. Section 2.1 dives into the current vendor trifecta: Data Providers, Data Marketplaces, and Personal Information Management Systems. The section discusses the entities uncovered by the team, drills down into the characteristics of a sample of each vendor class, and charts the market share of each class. The geographic spread demonstrates US dominance of available data. Of the products surveyed, "4,162 products from 443 distinct providers provided clear information about their prices," which led to the assessment that the median price is US$1,417 per month. Pricing ranges from free to $500,000; with "one-third of all data products, including targeted market data and reports for example,...sold for US$2,000-5,000 per month."

Parts (5) & (6) continue the exploration of data marketplaces by developing a pricing concordance and methodology "to build different classifiers to help us compare data products between the two DMs including more price references, namely DataRade (destination DM) and AWS (source DM)."

The authors used the classification schema to compare the pricing distribution of two categories, 'Financial' and 'Retail, Location and Marketing' data products, from this work concluding that "it is mostly 'what', as captured in product description and categories, and 'how much' data is being traded that determine the prices of data products."
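The kind of classifier the authors describe assigns a product to a category from its description (the 'what') so that price distributions can be compared across marketplaces. The keyword lists and product description below are invented for illustration; the paper's actual classifiers are more sophisticated.

```python
# Illustrative sketch of a description-based category classifier for
# data products. Keyword sets and the sample description are invented,
# not taken from the Azcoitia et al. methodology.

CATEGORY_KEYWORDS = {
    "Financial": {"stock", "ticker", "trading", "prices", "equity"},
    "Retail, Location and Marketing": {"footfall", "location", "consumer",
                                       "retail", "audience"},
}

def classify(description: str) -> str:
    """Pick the category whose keywords overlap most with the description."""
    tokens = set(description.lower().split())
    return max(CATEGORY_KEYWORDS,
               key=lambda cat: len(tokens & CATEGORY_KEYWORDS[cat]))

print(classify("Real-time equity trading prices and ticker data"))
# -> Financial
```

Once products in two marketplaces carry comparable category labels, their price distributions can be compared category by category, which is how the authors arrive at the 'what' and 'how much' finding quoted above.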

In other words, it is the buyer's evaluation of data quality against their intended use that determines the value, and therefore the price, of data products.

Part (7) & (8) concludes with the note that this analysis "is, to the best of our knowledge, the first empirical measurement study that deals with the prices of data products sold in commercial data marketplaces." Further, that "the lack of empirical data around dataset prices is considered as a key challenge in data pricing research."

References section: Includes discussion of the methodology created to construct the analysis. 

Notes and analysis re-written with the assistance of a paid ChatGPT account. Image: Pxhere. Public Domain.

Labels: Costs;Biological system modeling;Ecosystems;Pricing;Data engineering;Data models;Telecommunications;Data economy;data marketplaces;measurement;data pricing.
