Summer Break

We're taking a summer break. The blog will resume in September 2024. Enjoy the summer!

Scars - A Sidebar Journey into LLM Philosophical Models


 


"Data is key to developing AI."(1)

OrbMB's mission is to help optimize compute and dataset production, so we watch progress in adjacent spaces. That is how I learned that "Alignment" thought leader Jan Leike (2) resigned from OpenAI.

Alignment is the structure of belief that: "We need scientific and technical breakthroughs to steer and control AI systems much smarter than us."(3)

Reviewing Leike's work led me to the work of Marcus Hutter (4) (who seeks to align maths, philosophy and particle physics). This got me pondering the social values that sit at the heart of the philosophy behind building and training a model.

Are LLM builders designing the philosophy first, to define the nature of the build, and only then starting the model build? If not, consider a familiar analogy: schooling. What mechanism of teaching did each of us experience as children, and what results from that mechanism?

A) Heartless: Corporal Punishment = What results?
B) Thoughtful (Disciplined without punishment): Hybrid Training = ?
C) Heartful: Imaginative Play = ?


Last week at the gym, a buddy and I were joking around. He is shredded; I am not, and I have scars. I joked that "One day, I will have six-pack scars." His response? "We are what our scars make us."

Every one of us has lived all three methods mixed together.

Every one of us carries those lessons along the course of our lives.

Could it be that a "philosophy of heartlessness" is the risk? Could it be that a "philosophy of heartfulness" is what will align self-aware human and self-aware AI values?


-----------------------------

(1) William “Bill” Streilein, 2023 Data, Analytics, and AI Adoption Strategy | 11 April 2024
https://www.youtube.com/watch?v=d4NcRQRwqIo  

(2) Jan Leike: https://jan.leike.name/

(3) https://openai.com/index/introducing-superalignment/ 

(4) Marcus Hutter: http://hutter1.net/

 

 

 



What is your data worth? - Short & Todd (2017)

Comment: The previous articles are concerned with the economic value of data for statistical purposes. This article discusses internal business value. The gist of the approach suggested by Short & Todd is to use a triggering event to determine when a company needs a valuation, which requires having systems in place to permit valuation on demand. Pre-positioning requires the allocation of resources to continuously improve data systems. Here, data is defined as intangible.

Short, J.E. and S. Todd (2017): What is your data worth?, MIT Sloan Management Review, Vol. 58, No. 3, https://sloanreview.mit.edu/article/whats-your-data-worth/ Reprinted here: https://oag.ca.gov/sites/all/files/agweb/pdfs/privacy/short-whats-your-data-worth.pdf [James Short, Ph.D.: Lead Scientist, San Diego Supercomputer Center (SDSC); Steve Todd: Fellow & VP Strategy and Innovation at Dell EMC/Dell Technologies]


    Part (1) Describes the market impact of data assets
    Part (2) Explores the methods of market data valuation
    Part (3) Constructs a framework for valuing data
    Part (4) Suggests a path forward, circa 2017

Part (1) (p.17): Introduction - Uses two cases in which companies needed to determine data valuations to illustrate the impact of knowing the value of a company's data (Microsoft purchasing LinkedIn; the Chapter 11 bankruptcy proceedings of Caesars Entertainment Corp).

Part (2) (p.17-18): "Exploring Data Valuation" - Discusses project activities: interviews and research into 36 North American and European companies and nonprofit organizations. The interviewees spanned several sectors, and most earned US$1 billion+. The authors discovered that most were focused on managing big data, not on valuing it, and used what they learned to determine the business impact of data assets by:

  • Interviewing "chief financial and marketing officers and, in the case of regulatory compliance, legal officers"; and
  • Identifying significant business events triggering the need for data valuation, such as mergers and acquisitions, bankruptcy filings, or acquisitions and sales of data assets.

As is now (2024) widely known, every company was overwhelmed with data: the volume of stored data "was growing on average by 40% per year"; teams were hard pressed to manage their data assets because doing so is "time-consuming and complex"; and this was placing extreme pressure "on management to know which data was most valuable."

Part (3) (p.18-19): "A Framework for Valuing Data" - The authors used research results to classify business data as a composite of three sources of value:

  1. Data as Strategic Asset (asset value or stock value) - "Monetizing data assets means looking at the value of customer data"; i.e. using customer data to generate monetary value either directly (sell, trade, acquire) or indirectly (data not sold; availability of data used to create a new product or service).
  2. The Value of Data in Use (activity value) - the impact of the cost of accessing data and of how frequently it is used, with the additional effect that "data has the potential...to increase in value the more that it is used".
  3. Expected Future Value (the determination of value for recording on balance sheets) - as intangible assets that are "co-mingled with other intangible assets, such as trademarks, patents, copyrights, and goodwill."

Part (4) (p.19): "What Can Companies Do?" - Moving forward, the authors suggest "three practical steps" to improve company practices:

  1. Make valuation policies explicit and shareable
  2. Build in-house data valuation expertise
  3. Choose a top-down or bottom-up metadata control process

Top-down approach: 

  1. Identify critical applications
  2. Assign a value to the data used in critical applications
  3. Define the main system linkages (systems connecting systems' data flows)
  4. Use 1-2-3 to develop internal IT and business unit partnerships
  5. Use 1-2-3-4 to develop a prioritizing system
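
To make the top-down steps concrete, here is a minimal sketch in Python. It is an illustration only, not the authors' method: the Application class, the sample systems, the value units, and the scoring rule (an application's own data value plus the value of every system it feeds) are all assumptions made for this example. Step 4, building partnerships between IT and the business units, is organizational and has no code equivalent.

```python
from dataclasses import dataclass, field


@dataclass
class Application:
    """Hypothetical record for one system in the top-down review."""
    name: str
    critical: bool                 # step 1: is this a critical application?
    data_value: float = 0.0        # step 2: value assigned to its data (made-up units)
    feeds: list = field(default_factory=list)  # step 3: names of systems it sends data to


def prioritize(apps):
    """Step 5: rank critical applications, highest score first.

    Assumed scoring rule (not from the article): an application's score is
    its own data value plus the data value of every system it feeds.
    """
    by_name = {a.name: a for a in apps}
    scores = {}
    for app in apps:
        if not app.critical:
            continue  # non-critical applications are left out of the ranking
        downstream = sum(by_name[n].data_value for n in app.feeds if n in by_name)
        scores[app.name] = app.data_value + downstream
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


# Made-up sample data for illustration.
apps = [
    Application("billing", critical=True, data_value=5.0, feeds=["reporting"]),
    Application("crm", critical=True, data_value=8.0, feeds=["billing", "reporting"]),
    Application("reporting", critical=True, data_value=2.0),
    Application("wiki", critical=False, data_value=1.0),
]
ranking = prioritize(apps)
for name, score in ranking:
    print(name, score)
```

In this sketch the linkages matter as much as the assigned values: a system that feeds other valuable systems moves up the list, which is one plausible reading of why the authors ask for linkages before prioritizing.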

Bottom-up approach - Define value heuristically:

  1. Create a map of data usage across core data sets
  2. Assess data flows and linkages
  3. Produce a detailed usage patterns analysis
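
The bottom-up steps can be sketched the same way. Again, this is a hedged illustration rather than anything taken from the article: the access log, the data set names, and the choice to rank by raw access count are all assumptions. The idea is simply that value is inferred from observed usage instead of being assigned up front.

```python
from collections import Counter, defaultdict

# Hypothetical access log: (data set, consuming system) pairs.
access_log = [
    ("customers", "crm"),
    ("customers", "billing"),
    ("customers", "crm"),
    ("orders", "billing"),
    ("orders", "reporting"),
    ("web_logs", "reporting"),
]


def usage_analysis(log):
    """Return (data set, access count, consuming systems), most used first."""
    # Step 1: map of data usage across core data sets.
    usage = Counter(data_set for data_set, _ in log)

    # Step 2: data flows and linkages (which systems consume each data set).
    linkages = defaultdict(set)
    for data_set, system in log:
        linkages[data_set].add(system)

    # Step 3: usage pattern report, sorted by access count, then by name.
    ordered = sorted(usage, key=lambda d: (-usage[d], d))
    return [(d, usage[d], sorted(linkages[d])) for d in ordered]


report = usage_analysis(access_log)
for data_set, count, systems in report:
    print(data_set, count, systems)
```

Access count is only one possible heuristic; a real analysis might weight by who uses the data or what decisions depend on it.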


Notes and analysis by blogger. Image: Pxhere. CC0: 114437 

Labels: Dimensional;Company;Valuation;Data;Metadata;Control


