Overview: This is a fascinating contribution to the literature. The discussion of content (the "Information Lens") driving utility is especially useful for determining where and how to apply cost/benefit analyses. The authors' proposed framework clearly expresses the need to incentivize private innovation that concurrently aids the public good: a partnership of private, non-profit, and public stakeholders.
Here, data is defined as intangible and is generally viewed as a homogeneous economic good (Note 1), a view under which valuation matters little because market pricing is assumed to set the value.
The authors note the absence of costing, because “available empirical studies use market valuations or transactions” to estimate value. They observe that “the value of different types…can be very different” and that value is tied to the use case (what is the utility?), which, they contend, leads to a need to determine utility (public, social, private) in which there is a public interest. Finally, the authors call for greater regulation to obtain greater “social welfare value” and to prevent private asymmetric-information advantage, while recognizing that regulation will affect ROI because of the need to collect and clean data and to invest in complementary skills and assets.
Note: The degree of non-sharing is often key to the profitable continuance of a private enterprise. - Ed.
Coyle, D., S. Diepeveen, J. Wdowin, J. Tennison, and L. Kay (2020). The Value of Data – Policy Implications – Main Report. Bennett Institute for Public Policy, University of Cambridge, and Open Data Institute, February 2020: https://www.bennettinstitute.cam.ac.uk/publications/value-data-policy-implications/ (Accessed Q1 2024)
Next month, we review Coyle et al. (2022).
____________________
- Part (1) Introduces the public policy interest in data value
- Part (2) Describes current valuation taxonomies and develops a two-lens framework, commencing with the Economic Lens
- Part (3) Deep dives into the second lens: the Information Lens
- Part (4) Provides an overview of the current UK legal framework
- Part (5) Discusses the features driving Market-based Valuation
- Parts (6,7,8) Delve into three related issues
- Part (9) Presents conclusions and recommends future stakeholder work plans
____________________
Part (1) Introduction: This paper discusses the public policy interest in data value and develops a schema to determine the value of data that is made available for public purposes encompassing all sectors of society.
(A) The policy interest has two dimensions:
(a) Governments need to use data to make policy decisions; and
(b) Governments need to understand the value of the transaction of sharing (or not sharing) data, in the context of:
(i) Understanding the value (and impact) of data transactions to government and society, and
(ii) the worrisome economic problem of private actors mining public data for private purposes at no cost (i.e. taking it for free to modify and resell, or restricting sale of the improved resource to a limited clientele) [Similar to other State resources that are licensed to extractors, who require a return on investment to bear the cost of bringing the refined resource to market - ed.].
(B) The schema to determine data value consists of two lenses: Economic and Information (Note 2).
Part (2) Taxonomies and a new framework: This section describes current taxonomies of valuation, and then discusses part one of the new framework: the Economic Lens (the distinctive economic characteristics of data).
This section introduces the reader to the economic problem of valuing data, a problem which arises from "the creation of value from data of different kinds, its capture by different entities, and its distribution". There is a review of the intangible nature of the resource, high-level use cases, and the impact of externalities, which "are often positive, such as additional data improving predictive accuracy, or enhancing the information content of other data" (p.5).
Data is described as a non-rival asset (the same asset can be used by many users) which can be “closed, shared, or open. If access to data is restricted, its uses are limited; i.e. it becomes a private good. If it is shared with a select group of people - it is a club good - its uses and analysis can be wider, perhaps creating more value. If data is shared openly - a public good - anyone can use it.” [The economic actors who participate in the life of the State can each individually control data as a private good, club good, or public good. This includes the State, where, for example, records that are restricted are variously a private good, club good, or public good. - ed.]
The authors note that the unique ("intangible") nature of data gives it low utility if not shared, so it must be shared or licensed to make good use of the resource. The economic perspective expressed here is that data "is not best thought of as owned or exchanged;...that personal ‘ownership’ is an inappropriate concept for data (and that characterising data as ‘the new oil’ is similarly misleading)." [Note: this is for national statistical accounts and public services; “ownership” and “exchange” are how value is expressed by private buyers and sellers - ed.]
The second part develops five new characteristics to build an improved valuation schema: (a) Marginal return; (b) Externalities; (c) Optionality; (d) Consequences; (e) Costs.
Part (3) discusses part two: the Information Lens (the determination of 'economic utility' value).
The Information Lens is defined as the determination of utility to express economic value. Use cases are organized (1) to fit five subject areas, and then (2) into five sub-frameworks:
- (1a) People
- (1b) Organization Type
- (1c) Natural environment
- (1d) Built environment; and
- (1e) the type of goods or services being offered.
(2) Use Cases are organized into five sub-frameworks, and there is an example table:
- (2a) Generality of the use case (as a non-rival good, data is repeatedly useful for many purposes; "Generality" is a determination of use case repeatability);
- (2b) Granularity (the degree to which data is "filtered, aggregated and combined in different ways to reveal different insights", p.8);
- (2c) Geo-spatial coverage (the area that data refers to limits or develops its utility);
- (2d) Temporal coverage (data utility is time-bound);
- (2e) Human User Groups: This sub-schema organizes human users into three groups: Planners (e.g. a city planner), Operators (e.g. commuters), and Historians (e.g. police investigating a crime).
The section continues with a review of technical characteristics that contribute to valuations. These include: Quality (p.10); Sensitivity and personal data (p.10); Interoperability and linkability (developing standards and common identifiers to improve aggregation) (p.10); Excludability (data that is not common and so must be actively shared) (p.11); and Accessibility (the degree to which data dissemination is organized and executed) (p.11).
Here, the Open Data Institute's Data Spectrum is charted to identify "some of the access conditions determining whether data is a private, shared, or public good. Access conditions can be determined by technology, licensing or terms and conditions, and regulation." (p.12).
Part (4) provides an overview of the current UK legal framework, which includes Intellectual property rights and licensing (p.14), Intellectual property rights in public sector information (p.15), and Data protection rights (p.16).
Part (5) dives into the features driving Market-based valuation of data. The primary approaches use market prices to estimate value:
(a) Stock market valuations;
(b) Income-based valuations;
(c) Cost-based valuations.
Method (a) is to compare data-driven companies and non-data-driven companies; analyses suggest that the former have become more valuable than the latter.
Method (b) is to estimate future cash flows (Free Cash Flow, or FCF) derived from the asset (Note 3); the "data value chain" has been developed to visualize this method. The FCF method can be successfully exploited by e-commerce companies such as Amazon that enjoy feedback loops, where insights drawn from data help improve the customer experience, which in turn improves per-customer unit sales. Note: Many companies in the space obtain customer data as a free good (not paid for).
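For readers who want the arithmetic behind method (b), a minimal sketch of the standard discounted-cash-flow calculation follows. The paper itself gives no formula, so the notation is this reviewer's illustration: V is the value of the data asset, FCF_t the free cash flow attributable to it in year t, r the discount rate, and T the forecast horizon:

    V = \sum_{t=1}^{T} \frac{FCF_t}{(1 + r)^t}

Under this reading, the "data value chain" serves to isolate which portion of each year's cash flow is attributable to the data asset rather than to the firm's other assets.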
Method (c): The public sector's cost approach is to estimate "the aggregate value of data to the economy in the national accounts, as there are relatively few market sales of datasets, with most being generated within the business in the process of providing other goods and services."
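As a companion sketch for method (c), the sum-of-costs logic used in national accounts values own-account data at its cost of production; again, the notation is illustrative rather than the paper's:

    V_{cost} \approx \sum_{i} \left( C_i^{collect} + C_i^{clean} + C_i^{store} \right)

where the sum runs over the business activities i that generate data while providing other goods and services. Cost-based figures are generally a lower bound, consistent with the authors' later observation that market valuations "do not capture the full social value of data".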
Parts (6,7,8) Delve into existing non-market estimates, Creating value through open and shared data, and Institutions for the data economy.
The authors note that: "Market valuations thus provide useful information but do not capture the full social value of data." Their contention is that, subject to an analysis of trade-offs, it would be economically better to create value through open data sharing. Here, citing O'Neill, the authors note that "the real or perceived crisis of trust in many societies reflects suspicion of authority", which (to this reader) suggests that the survival of a trusted system of government requires honesty and integrity in data management per "the social and legal ‘permissions’" of the society.
Trade-offs to consider are the need to:
- incentivize investment and innovation,
- maintain data security, and the related need to
- protect personally and commercially sensitive data.
Developing and adhering to this form of social navigation is hard. The authors suggest three frameworks:
- Elinor Ostrom’s framework for the management of shared resources: [chart: Ostrom’s principles]
- Paying new attention to creating proper Data Infrastructure (vs. leaving it all to private actors);
- Establishing data trusts, data pools, and brokerages (to create a level playing field of data access).
Part (9) presents conclusions and future work plans.
The authors note that “the quality of data is a key challenge” in and of itself, and that the “quality needed depends on what the data is being used for.” The authors also state that: “Asymmetries of information mean that contracts for data use are incomplete, and the regulatory framework should recognise this, particularly that schemes for sharing data in a regulated way change the returns on investment in collecting and cleaning data, and investing in complementary skills and assets.” In this context, the authors express the need to develop different use cases to determine utility (public, social, private) in which there is a public interest, whilst specifying the need to incentivize private investment to efficiently deliver data goods and services for every sector of society. The proposals here are to:
- Incentivise investment without disincentivising sharing
- Limit exclusive access to public sector data
- Use competition policy to distribute value
- Explore mandating access to private sector data
- Provide a trustworthy institutional and regulatory environment
- Simplify data regulation and licensing
- Monitor impacts and iterate
----------------
(1) The definition is derived from the UN System of National Accounts (SNA).
(2) "Lens analysis requires you to distill a concept, theory, method or claim from a text (i.e. the “lens”) and then use it to interpret, analyze, or explore something else" cf.: https://pressbooks.cuny.edu/qcenglish130writingguides/chapter/lens-analysis/