Knowledge Graph Notes

One major advantage of the graph abstraction is that it allows us [21st Century Humans] to gain quick insights about the data flowing through our systems. Mainly, knowledge graphs provide us with higher-order abstractions. These abstractions provide an interface between our queries (and natural language questions) and our data.

Interesting pages:

– Königsberg (Ch 1. Page 3)

– The Property Graph Model (Ch 1. Page 4)

– Multiple Layers of Graphs (Ch 2. Page 19)

– Sophisticated Layering of Ontologies and Taxonomies (Ch 2. Page 22)

– Metadata and Data Layers (Ch 3. Page 29)

– The ‘Data Fabric’ Pattern (Ch 3. Page 31)

– Popular Use Cases for Actioning Knowledge Graphs (Ch 3. Pages 35-36)

– A Simple Data Ingest Pipeline Pattern (Ch 3. Page 34)

– Anatomy of Decisioning Knowledge Graphs (Ch 4. Page 43)

– ML Workflows for Graphs (Ch 4. Page 48)

– Decisioning Knowledge Graph Use Cases (Ch 4. Pages 50-51)

– Explainability (Ch 5. Page 57)

– Contextual AI Example Data Model of Simplified Smart Home (Ch 5. Page 58)

– Chatbot System Diagram (Ch 5. Pages 65-66)

– Advanced Patterns (Ch 6. Page 72)

– Digital Twin Examples (Ch 6. Pages 74-75)

– More Resources (Ch 7. Page 78)

Key Takeaways:

Building abstractions correctly solves use cases:

For example, create a custom ontology for the recommendation layer of a knowledge graph.

– Shift from ‘what items exist in my graph?’ to ‘how do I find similar items to the one that is missing?’

– Shift from descriptive system to assistive system.

“For example, if our ecommerce retailer ties its product hierarchy into stock control data through another ontological layer, it has a way of offering other good choices to the user when the current item is out of stock or to recommend products that have better margins. All of this comes at the modest cost of writing down how the business works as a machine-readable ontology.” (Ch 2. Page 22)
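The out-of-stock idea in the quote above can be sketched in a few lines. This is only an illustration under my own assumptions: the product names, categories, stock counts, margins, and the `alternatives` helper are all hypothetical and do not come from the book.

```python
# Hypothetical toy data: a product-hierarchy layer plus a stock-control layer.
CATEGORY_OF = {
    "trail-shoe-a": "trail shoes",
    "trail-shoe-b": "trail shoes",
    "trail-shoe-c": "trail shoes",
    "road-shoe-a": "road shoes",
}
STOCK = {"trail-shoe-a": 0, "trail-shoe-b": 12, "trail-shoe-c": 3, "road-shoe-a": 7}
MARGIN = {"trail-shoe-a": 0.20, "trail-shoe-b": 0.15, "trail-shoe-c": 0.30, "road-shoe-a": 0.25}


def alternatives(product):
    """Return in-stock products in the same category, best margin first."""
    category = CATEGORY_OF[product]
    candidates = [
        other
        for other, other_category in CATEGORY_OF.items()
        if other != product and other_category == category and STOCK[other] > 0
    ]
    return sorted(candidates, key=lambda p: MARGIN[p], reverse=True)


print(alternatives("trail-shoe-a"))
```

The question being answered is no longer 'what items exist?' but 'what else could I offer?', which is the descriptive-to-assistive shift mentioned above.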

Build a separate graph ‘layer’ that describes metadata of the primary graph data:

For example, describe the source of the data and the related authors, data curators, and context, etc.

“Importantly, data architects can implement this technique in a noninvasive manner with respect to the source systems containing customer data, by building it as a layer above those systems. A popular example of this is to build a knowledge graph of metadata that describes data residing separately in an otherwise murky data lake.” (Ch 3. Page 29)
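A rough sketch of what such a metadata layer might look like, assuming the metadata is stored as simple (subject, relationship, object) triples. The dataset names, relationship names, people, and both helper functions are hypothetical.

```python
# Hypothetical metadata triples: (subject, relationship, object).
# They describe datasets that live elsewhere; no source system is modified.
METADATA = [
    ("orders.parquet", "SOURCED_FROM", "billing-system"),
    ("orders.parquet", "CURATED_BY", "dana"),
    ("clicks.json", "SOURCED_FROM", "web-frontend"),
    ("clicks.json", "CURATED_BY", "lee"),
    ("customers.csv", "SOURCED_FROM", "billing-system"),
    ("customers.csv", "CURATED_BY", "dana"),
]


def describe(dataset):
    """Collect everything the metadata layer knows about one dataset."""
    return {rel: obj for subj, rel, obj in METADATA if subj == dataset}


def datasets_where(relationship, value):
    """Find datasets whose metadata has the given relationship and value."""
    return sorted(s for s, rel, obj in METADATA if rel == relationship and obj == value)


print(describe("clicks.json"))
print(datasets_where("CURATED_BY", "dana"))
```

The data lake itself is never touched here. Only the descriptions of its contents are queried, which is what makes the approach noninvasive.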

Examine the 2nd Degree connections when attempting to understand social data:

For example, understand subgraphs of 2nd Degree connections to create a “Single Encompassing View” or a total description of an object (likely a person, but sometimes something else: location, product, event, or general object, etc.)

“It’s worth noting at this point that beyond helping with discovery and exploration, [social] relationships are highly predictive … However, it is remarkable that a researcher can make this prediction even more accurately based on our friends-of-friends behavior … [and] the behavior of our friends-of-friends, whom we may not know that well or at all, is more predictive of our behavior than information that pertains only to us. For more information on the science underlying social graphs, see Connected by James Fowler and Nicholas Christakis (Little, Brown and Company, 2009).” (Ch 4. Page 42)
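For a concrete picture of 2nd Degree connections, here is a minimal sketch that assumes an undirected friendship graph stored as an adjacency mapping. The names and the `second_degree` helper are hypothetical.

```python
# Hypothetical undirected friendship graph as an adjacency mapping.
FRIENDS = {
    "ana": {"ben", "cy"},
    "ben": {"ana", "dee"},
    "cy": {"ana", "dee", "eli"},
    "dee": {"ben", "cy"},
    "eli": {"cy", "fay"},
    "fay": {"eli"},
}


def second_degree(graph, person):
    """Return people exactly two hops away: friends of friends who are
    neither the person nor already a direct friend."""
    direct = graph[person]
    two_hops = set()
    for friend in direct:
        two_hops |= graph[friend]
    return two_hops - direct - {person}


print(sorted(second_degree(FRIENDS, "ana")))
```

The resulting set is the friends-of-friends neighborhood whose behavior the quote describes as predictive.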

Graph Queries and/or Graph Algorithms:

Depending on use case:

– Graph queries are useful in determining a subgraph with a particular shape (see: SubGraph[‘the enemy of my enemy is my friend’])

– Graph algorithms are useful for applying a metric globally across the entire graph (see: ‘PageRank algorithm’ for influential nodes)

“The most well-known graph algorithms fall into five classic categories:

• Community detection for finding clusters or likely partitions
• Centrality for determining the importance of distinct nodes in a network
• Similarity for evaluating how alike nodes are
• Heuristic link prediction for estimating the likelihood of nodes forming a relationship
• Pathfinding for evaluating optimal paths and determining route quality and availability” (Ch 4. Page 46)
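To make the query-versus-algorithm distinction concrete, here is a small sketch under my own assumptions. The edge data is invented, `enemies_of_enemies` is a hypothetical helper, and `pagerank` is a plain power-iteration version that assumes every node has at least one outgoing link. It is not the implementation of any particular graph database.

```python
# Hypothetical directed edges: an ENEMY_OF relationship and a LINKS_TO relationship.
ENEMY_OF = {("rome", "carthage"), ("carthage", "numidia"), ("rome", "gaul")}
LINKS_TO = {"a": ["b", "c"], "b": ["c"], "c": ["a"], "d": ["c"]}


def enemies_of_enemies(edges, node):
    """Graph query: match the local shape node -> enemy -> enemy's enemy."""
    return sorted(
        far
        for near_src, near in edges if near_src == node
        for far_src, far in edges if far_src == near and far != node
    )


def pagerank(links, damping=0.85, iterations=50):
    """Graph algorithm: score every node using the structure of the whole graph.
    Assumes every node has at least one outgoing link."""
    nodes = list(links)
    rank = {n: 1.0 / len(nodes) for n in nodes}
    for _ in range(iterations):
        new_rank = {n: (1.0 - damping) / len(nodes) for n in nodes}
        for src, targets in links.items():
            share = damping * rank[src] / len(targets)
            for dst in targets:
                new_rank[dst] += share
        rank = new_rank
    return rank


print(enemies_of_enemies(ENEMY_OF, "rome"))
ranks = pagerank(LINKS_TO)
print(max(ranks, key=ranks.get))
```

The query only touches the neighborhood around one node, while the algorithm has to visit every edge on every iteration. That difference in cost is what drives the choice between the two.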

Streaming data and augmented imagination:

For example: interactivity requires different levels of abstraction where algorithms run at different speeds based on technological constraints

“It’s important to understand that interactive speeds prohibit [the] use of global algorithms and ML training … on a per-request basis. Instead, [they] operate on a different cadence to real-time queries. The more expensive processing runs in the background, continuously enriching the actioning knowledge graph, while real-time queries get better results over time as the underlying knowledge graph improves.” (Ch 4. Page 51)
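The two cadences in the quote can be sketched as follows. This is a simplified illustration: the `ScoredGraph` class is hypothetical, and a cheap in-degree count stands in for the expensive global algorithm or ML training run.

```python
class ScoredGraph:
    """Hypothetical store: a slow global job writes scores, fast queries read them."""

    def __init__(self, links):
        self.links = links
        self.score = {node: 0 for node in links}  # stale until enriched

    def enrich(self):
        """Background cadence: a global pass over every edge (in-degree here,
        standing in for a costlier algorithm or ML training run)."""
        fresh = {node: 0 for node in self.links}
        for targets in self.links.values():
            for dst in targets:
                fresh[dst] += 1
        self.score = fresh  # swap in the new results all at once

    def top_neighbor(self, node):
        """Real-time query: a local lookup that only reads precomputed scores."""
        return max(self.links[node], key=lambda n: self.score[n], default=None)


graph = ScoredGraph({"a": ["b", "c"], "b": ["c"], "c": ["a"], "d": ["c"]})
before = graph.top_neighbor("a")  # answered from stale scores
graph.enrich()                    # would run in the background in practice
after = graph.top_neighbor("a")   # same query, better answer
print(before, after)
```

The real-time query never pays for the global pass. It simply gets better answers once the background job has refreshed the scores.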

Simulation and Forecasting:

For example: The ideas surrounding the concept of a digital twin, where that twin is a graph (represented in, at least, business and/or game theory use cases)

“The enterprise digital twin provides key values:
• A map of the business or a smaller business unit (departmental) or layer (such as IT) of the business
• Real-time understanding of the business
• A way of exposing the model to pressures to see how it reacts

With a digital twin acting as a smart map, it is possible to create a rich virtual view of the business that closely matches reality. Our organizing principle follows accordingly: the elements in the real world and how they interrelate are captured in high fidelity as a property graph, and the constraints and rules that govern the real world become constraints and queries for that property graph.” (Ch 6. Page 70)
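As a small illustration of 'constraints and rules become constraints and queries', here is a sketch of an IT-layer twin. The node names, properties, the capacity rule, and the `overloaded_services` helper are all hypothetical assumptions of mine.

```python
# Hypothetical IT-layer twin: nodes carry properties, edges say what runs where.
NODES = {
    "web-1": {"kind": "server", "capacity": 100},
    "web-2": {"kind": "server", "capacity": 100},
    "shop": {"kind": "service", "load": 150},
}
RUNS_ON = [("shop", "web-1"), ("shop", "web-2")]


def overloaded_services(nodes, runs_on, failed=()):
    """Constraint as a query: a service's load must not exceed the total
    capacity of the servers it runs on that are still up."""
    capacity = {}
    for service, server in runs_on:
        if server not in failed:
            capacity[service] = capacity.get(service, 0) + nodes[server]["capacity"]
    return sorted(
        name
        for name, props in nodes.items()
        if props["kind"] == "service" and props["load"] > capacity.get(name, 0)
    )


print(overloaded_services(NODES, RUNS_ON))                    # normal operation
print(overloaded_services(NODES, RUNS_ON, failed={"web-1"}))  # simulated outage
```

Passing a `failed` set is the 'exposing the model to pressures' step: the twin is perturbed, and the same constraint query reports how it reacts.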

Thought-provoking Idea:

The reality of our technological context shows us how our advantages in the short run often amount to meager advantages in the long run. The potential good news, if we are the underdog, is that we only need to be more innovative in imagining better systems and techniques. Once we become King, though, our competitive technological supremacy only lasts until an opponent makes the next advancement. There’s a technological hyperinflation, and investing in our future appears to be the only way to stay relevant.

Ideals

Socially

Friends who demonstrate kindness, curiosity, motivation, intelligence, and open-mindedness

[People] whom one knows and with whom one has a bond of mutual affection, typically exclusive of sexual or family relations who demonstrate the quality of being friendly, generous, and considerate, a strong desire to know or learn something, the reason or reasons one has for acting or behaving in a particular way, the ability to acquire and apply knowledge and skills, and the willingness to search actively for evidence against one’s favored beliefs, plans, or goals, and to weigh such evidence fairly when it is available

Peers continue to progress in their goals

A person of the same age, status, or ability as an[y] other specified person continue(s) to progress in their ambition or effort; an aim or desired result

Society valuing each member and non-frivolous activities

The aggregate of people living together in a more or less ordered community valuing each person, animal, or plant belonging to [the aggregate] and [valuing] things, not lacking importance, seriousness, or a sound basis, that a person or group does or has done

Physically

Located in a comfortable, peaceful, and productive area

Located in an [area] which provid[es] physical ease and relaxation, free from disturbance, and br[ings] forth [a product]

Access to resources and facilities that enable fulfillment

Access to a stock or supply of money, materials, staff, and other assets that can be drawn on by a person or organization in order to function effectively and a place, amenity, or piece of equipment provided for a particular purpose that enable the achievement of something desired, promised, or predicted

Space that embodies the ethos of the people in it

The dimensions of height, depth, and width within which all things exist and move embod[ies] the nature, disposition, and customs, of those without special rank or position in society; the populace in [the dimension].

Technology

Progress-driven, yet measured, use of technology

A focus on forward or onward movement toward a destination, yet carefully considered; deliberate and restrained, use of ‘systematic treatment’

Communication systems striving for mutual knowledge and shared understanding

A collection of individual communications networks, transmission systems, relay stations, tributary stations, and data terminal equipment (DTE) usually capable of interconnection and interoperation to form an integrated whole striving for facts, information, and skills acquired by each of two or more parties toward the other or others through experience or education; the theoretical or practical understanding of a subject and perception or judgment of a situation distributed between members of a group.

Fundamental respect for consensual use and development of new techniques

Forming a necessary base or core; of central importance with regard for the feelings, wishes, rights, or traditions of others for deploying (something) as a means of accomplishing a[n] agreed and ‘felt together’ purpose and a specified state of growth or advancement of novel approaches of carrying out a particular task, especially the execution or performance of an artistic work or a scientific procedure

Personally

Working towards common goals of the society I belong with

To do things that make progress to an aim or desired result that is distributed between members of the aggregate of people living together in a more or less ordered community [that] I belong with

Ability to focus my attention on projects I find important

Possession of the means or skill to do something to pay particular attention to throw forth, plan, and cause to move forward something prominent [that] I find of great significance or value; [and] likely to have a profound effect on success, survival, or well-being.

Variety of subjects that further my understanding and contribute to frequent unique experiences

The quality or state of being different or diverse; the absence of uniformity, sameness, or monotony of a person or circumstance giving rise to a specified feeling, response, or action that helps the progress or development of [my] power of abstract thought; [or to] progress intellect; and [to] help to cause or bring about a particularly remarkable, special, or unusual event or occurrence that leaves an impression on someone (me) occurring or done on many occasions, in many cases, or in quick succession

Definitions

Preference on the systems’ agents’ qualities exhibited (qualities of the agents of the systems)

  • Kindness – the quality of being friendly, generous, and considerate
  • Curiosity – a strong desire to know or learn something.
  • Motivation – the reason or reasons one has for acting or behaving in a particular way
  • Intelligence – the ability to acquire and apply knowledge and skills.
  • Open-mindedness – the willingness to search actively for evidence against one’s favored beliefs, plans, or goals, and to weigh such evidence fairly when it is available
  • Non-frivolous – not lacking importance, seriousness, or a sound basis
  • Comfortable – providing physical ease and relaxation
  • Peaceful – free from disturbance
  • Productive – early 17th century: from French productif, -ive or late Latin productivus, from product- ‘brought forth’, from the verb producere
  • Fulfillment – the achievement of something desired, promised, or predicted
  • Progress-driven – a focus on forward or onward movement toward a destination
  • Measured – carefully considered; deliberate and restrained.
  • Consensual – mid 18th century: from Latin consensus ‘agreement’ (from consens- ‘felt together, agreed’, from the verb consentire ) + -al.
  • Important – of great significance or value; likely to have a profound effect on success, survival, or well-being.
  • Variety – the quality or state of being different or diverse; the absence of uniformity, sameness, or monotony.

Quantifiable entities

  • Friends – a person whom one knows and with whom one has a bond of mutual affection, typically exclusive of sexual or family relations.
  • Peers – a person of the same age, status, or ability as another specified person.
  • Society – the aggregate of people living together in a more or less ordered community.
  • Members – a person, animal, or plant belonging to a particular group.
  • Resources – a stock or supply of money, materials, staff, and other assets that can be drawn on by a person or organization in order to function effectively.
  • Facilities – a place, amenity, or piece of equipment provided for a particular purpose.
  • Space – the dimensions of height, depth, and width within which all things exist and move.
  • People – those without special rank or position in society; the populace.
  • Technology – early 17th century: from Greek tekhnologia ‘systematic treatment’, from tekhnē ‘art, craft’ + -logia (see -logy).
  • Communication systems – a collection of individual communications networks, transmission systems, relay stations, tributary stations, and data terminal equipment (DTE) usually capable of interconnection and interoperation to form an integrated whole
  • New Techniques – novel approaches of carrying out a particular task, especially the execution or performance of an artistic work or a scientific procedure.
  • Ability – possession of the means or skill to do something
  • Focus – to pay particular attention to
  • Attention – notice taken of someone or something; the regarding of someone or something as interesting or important.
  • Subjects – a person or circumstance giving rise to a specified feeling, response, or action.

Motivational aspects for an individual or group

  • Goals – the object(s) of a person’s ambition or effort; an aim or desired result
  • Activities – a thing that a person or group does or has done.
  • Projects – late Middle English (in the sense ‘preliminary design, tabulated statement’): from Latin projectum ‘something prominent’, neuter past participle of proicere ‘throw forth’, from pro- ‘forth’ + jacere ‘to throw’. Early senses of the verb were ‘plan’ and ‘cause to move forward’.

Fundamental ‘precepts’ emerge from these notions

  • Ethos – mid 19th century: from modern Latin, from Greek ēthos ‘nature, disposition’, (plural) ‘customs’.
  • Mutual Knowledge – facts, information, and skills acquired by each of two or more parties toward the other or others through experience or education; the theoretical or practical understanding of a subject.
  • Shared Understanding – perception or judgment of a situation distributed between members of a group
  • Fundamental respect – forming a necessary base or core; of central importance with regard for the feelings, wishes, rights, or traditions of others
  • use and development – take, hold, or deploy (something) as a means of accomplishing a purpose or achieving a result; employ; and a specified state of growth or advancement.
  • Working towards common goals – to do things that make progress to an aim or desired result that is distributed between members of a group
  • contribute to – help to cause or bring about
  • further my understanding – help the progress or development of my power of abstract thought; progress intellect;
  • frequent unique experiences – a particularly remarkable, special, or unusual event or occurrence that leaves an impression on someone occurring or done on many occasions, in many cases, or in quick succession

Ethical Social Forecasting

Part 1

I’m writing about the ethics (and maybe the legality, too) of social dynamics and forecasting. Also, behavioral dynamics is a related concept, but I’m not going to actively write about this. Some general guidelines that I paraphrased from Duke:

1) Be honest and humble in the presentation and organization of models and tools
2) Defend models that are honest and humble, as incomplete forecasts provide a better sense of the uncertain future than missing forecasts
3) Be kind

Part 2

I wrote this section about an hour after I went down the rabbit hole at the end of Part 2, but I thought it serves as a good introduction…

But there are some other concerns that I’m exploring, and for that I’m sharing my notes about this blog post. Instability Forecast Models (IFMs) are used to predict political and economic (human-caused) disasters, but more generally they predict social behavior. The ethics of treating them as ‘social weather’ forecasting instruments might be problematic. One reason is the reflexive side of social weather, which contrasts with a hurricane heading towards Florida: the natural weather forecast doesn’t change the path the hurricane takes. Schrodt appears to be thinking about how social and weather models relate to each other, and poses quite a few ethical, pragmatic, and social questions.

Something to read for later, or now if you (or I) have time: http://eventdata.parusanalytics.com/papers.dir/Schrodt.PRL.2.0.pdf

It’s absolutely a cross between Social and Computational Sciences and is titled PATTERNS, RULES AND LEARNING: COMPUTATIONAL MODELS OF INTERNATIONAL BEHAVIOR (Philip A. Schrodt, University of Kansas)

A fun line from the preface (page ix): “Computer scientists seem to find it more useful than political scientists…”

I started to write about the dilemmas presented on Schrodt’s blog, but then I went down a rabbit hole. I was curious about Schrodt, and so I researched his background. Definitely worth researching him on your (my) own (again if it’s me).

Part 3

The main part of this post presents my synthesis of this blog post – written by Philip Schrodt.

Dilemmas with treating IFMs as Weather Models:

  1. Reflexivity – The reflexive qualities of public models (again and for the last time, like a Weather Forecast) make social forecasting problematic (think how Asimovian predictive failures are described in the Foundation Series). Asimov sets a futuristic galaxy in the Foundation Series where social dynamics are predicted by a Mathematical Sociologist called Hari Seldon. Seldon and Daneel Olivaw, another important character, come to realize the reflexivity involved in predicting Social Dynamics. They decide to keep Psychohistory unknown to most people, mainly so that their models continue working.

    In our current time, it might be possible to overcome this dilemma if the models factored in new information, but I’d say that it would be much harder (though probably not impossible) to build a reflexively resilient (or self-modifying) computational model. BTW, this idea isn’t new. Another possible solution (inspired by Weather Models) is to retrain them more frequently. As computing technology scales toward carbon nanotube computation, we will likely be able to retrain ML models more often.


  2. Crying Wolf – There is a dilemma with creating a crisis when there is none. For example, a model incorrectly predicts the possibility of a crisis. People mobilize to stop a crisis that would never have happened. In the best cases, there might be no harm at all – simply being over-prepared and wasting resources. In the worst cases, humans risk life, limb, freedoms, etc.

    Mitigating this problem echoes point (2) I mentioned earlier from Duke: Defend models that are honest and humble, as incomplete forecasts provide a better sense of the uncertain future than missing forecasts

    Therefore, it might be better to consider the 2nd dilemma as an imperative to overcome – and an imperative in the Kantian sense. In fact, it might be more prudent to develop a Kantian imperative handling this dilemma directly.


  3. Iteration Velocity – Transparency allows for not only sharing of ideas, but new ideas at interfaces between them. Models are biased, so having multiple competing models provides multiple perspectives. It seems that having more models also causes a convergence. At what point does the outlier model make a better prediction?

    I think point ( 1 ) from above provides some nice advice: Be honest and humble in the presentation and organization of models and tools

    Additionally, this dilemma relates to the notion of Scientific Revolutions as they provide new paradigms for considering reality. Without the sharing of social models, it begins to seem that the rate of iteration on new paradigms slows.


  4. Finding Diamonds in piles of $#@! – Especially with the rise of ‘fake news’, the data mined from the Internet, and the data sources you use, must be approached with skepticism. Quality is more important than quantity, and the dilemma is knowing which sources of data actually maintain a sense of quality.

    Ethically, there is a responsibility to find the higher quality data – otherwise, the risk is that the model might not be as honest.


  5. Data Source Consistency – Different from quality (as in trustworthiness), consistency in this case refers to the availability of a sustained dataset. Sustained means data that continues to be present over time, and is not lost to history. This idea presented itself in the Foundation Series, too. Hari Seldon sought information about historical events of the galactic empire, and spoke to Dors Venabili about the lack of consistent data. The solution Asimov suggests through Seldon might seem obvious, or it might seem less consistent (depends on who you are). Seldon finds that the political and economic center of the empire, Trantor (which is also the capital), serves as a good model to represent the entire history of humanity.

    Similarly, I think the way to solve this dilemma mirrors the Asimovian solution. When we’re trying to build honest models, we need to acknowledge that we might base them on a microcosm – such as D.C., Moscow, Beijing, or another capital city of your choice – so long as that microcosm resembles the social and behavioral dynamics you are interested in. If you work for an organization that doesn’t want you to use data on a world capital, then you might need to find another analogous dataset.


  6. Thinking Hard is Important – False data might seem like an easy concept, until you realize that missing, incomplete, biased, and contradictory data carries information that reveals otherwise hidden qualities of reality. Statistical imputation is the process of filling in missing data, and improved computational methods make this easier to do. Another factor to consider: false positives indicate edge cases in statistical and computational approaches – finding better ways of understanding why the false positive happened allows for improvements in our techniques.

    A common example of this is the discovery of infrared. It’s important to notice a thermometer recording heat beyond the red part of the visible spectrum of light (A false positive?). It’s more important to not reject that observation, for example, because it allows for new discoveries (A new model!). William Herschel originally called them ‘Calorific Rays’ – here’s the history.

    Physical Scientists are really good at thinking hard, and have already learned their fair share of hard lessons. It’s a good idea to learn from them, too. (Again, quite a few techniques from weather forecasting are increasingly being applied to other domains.)


  7. Social means relating to Society (read: People) – Data included in IFMs (and other social models) is related directly to the lived experiences of people. It’s necessary to provide extra attention to how this data deals with people. In other words, those wizards who work with social data ought to regard people as people instead of as statistics.

    There is a great responsibility in understanding that social models that predict and forecast have the capacity to impact the everyday experiences of another person. In fact, they often do. The main goal of Hari Seldon and Daneel Olivaw was to minimize the loss and catastrophe that the galaxy would experience. If you’re thinking about using data for predictive analytics, you have an ethical responsibility to be thinking about the people you will be impacting.
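Going back to dilemma 6 for a moment, statistical imputation in its simplest form can be sketched as below. This is mean imputation only, chosen by me because it is the easiest variant to show. The `impute_mean` helper is hypothetical, and real IFMs would likely need something more careful.

```python
from statistics import mean


def impute_mean(values):
    """Replace missing entries (None) with the mean of the observed ones,
    and return a mask recording which entries were filled in."""
    observed = [v for v in values if v is not None]
    if not observed:
        raise ValueError("cannot impute: no observed values")
    fill = mean(observed)
    filled = [fill if v is None else v for v in values]
    was_missing = [v is None for v in values]
    return filled, was_missing


filled, was_missing = impute_mean([2.0, None, 4.0])
print(filled, was_missing)
```

Keeping the `was_missing` mask alongside the filled values is deliberate: as noted in dilemma 6, the fact that data is missing can itself carry information, so it seems more honest not to erase it.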

For the source material on these dilemmas, see Schrodt’s Article – it’s a very fascinating discussion. Now I’ve got a better sense of the ethics supporting this blend of social and computational sciences.

Sources: