Google’s Knowledge Graph: connecting structured knowledge from diverse sources
Stefano Mazzocchi and other former MetaWebbers now at Google have turned out another beautiful structure in the garden of human knowledge: the Knowledge Graph.
This helps visualize one key aspect of information meshes, though it still has many limitations. (It is only a graph, as the name suggests; as defined within Google it is only the part of the universal knowledge graph that they choose to bless as ‘clean’; it doesn’t include any data that they choose not to make publicly visible; and there is no higher level of structure to support a metric or a multi-dimensional space.)
For the past few years, I have been tracking patterns and ways to measure them. In some easily reproducible settings, like small-group social engagements, short-timeframe teamwork, and the like, patterns are much more useful than individual events at determining how things work out. Especially when the desired outcome is patterned, and real-life outcomes usually are (“make sure everyone leaves happy”, “come up with a solution that addresses everyone’s personal use case well enough”), focusing on natural patterns rather than linear ones* provides for better rules of thumb, and a clearer understanding of why things happen.
Indeed, most common wisdom about why things happen – how causality works, what comes first and what comes next – is simply a version of the post hoc fallacy: if two things happen near each other, one caused the other. You can see this most eloquently in the history of many sciences. We continue to make this class of mistakes most quantitatively in abuses of statistics today. But the more prominent arena for this sort of thinking is in everyday life – the way we talk and write, the words we use to explain important events to ourselves.
If you look at almost any significant and complex world problem, you will find that both laymen and experts enjoy breaking things down into linear patterns, and choosing a small number to claim as the “key” factors in making or unmaking some change. Climate change, economic collapses, political standoffs.
In my observation, it is rare for there to be much truth in ascribing impact to any small set of such factors. Yet most people I know will, in at least some areas where we lack solid repeatable data, suggest otherwise.
After running some experiments in this area, I am keen on writing something more formal about this, including some language, metrics, and toy examples for working with patterns. I have found a close attention to patterns to be of tremendous personal use, and expect it will come to be so in larger collaborations as well. If you have run across relevant work in this area, or writings on pattern of any sort – human, biological, artistic, mathematical, or other – I should like to hear about it.
* Linear or “single factor” patterns are the simplest kind; and in many if not all cases one could describe all more complex patterns in terms of the interaction of linear patterns. However, we can usually evaluate a set of natural, more complex patterns with reasonably low error. Forcing a guess at their decomposition into linear ones and at what those linear factors are, and composing those guesses together, is often far more incomplete or uncertain.
New photos York style, and mesh completionism
While still recovering from a Rein’s Deli hangover, I found myself the subject of the Ragesoss lens last weekend. Good energy, well captured.
@Ragesoss: It is a mathematical notion applied to ideas. A conceptual space around a theme is full of different concepts, each related to the theme in some way. Such a space can be described in terms of facets that can be used to describe a concept: for instance, you might describe ideas for laying out a garden in terms of their complexity, suitable climate, or total size… or many others. Complexity and size are sometimes linked. You can imagine the conceptual span of a set of facets, or their dependency on one another, as corollaries of the span and independence of vectors being used as the basis for an abstract space.
A mesh is a limited set of elements that can be used to effectively describe an infinite space of ideas. Human languages are full of concept meshes. The easiest to discuss are one-dimensional meshes (ideas that span the spectrum of a single facet):
- color words – the spectrum of visible colors is split into a set of common colors. This set of names is a casual mesh for the visible color spectrum. (Casual in that there is no explicit metric used to determine whether all parts of the visible spectrum are ‘equally’ represented by words.)
- shape words – shapes may be described as circular or oval, square or rectangular. There is a humorous ‘proof’ that the only skew triangle has angles (45, 60, 75) – that all others are roughly equilateral, isosceles, or right.
Higher-dimensional meshes include texture words (smooth, rough, bumpy, prickly, soft, firm, sticky… – covering facets of friction, give, tangible local structure, and more). Most higher-dimensional meshes in language are incomplete (we rarely form words for concepts whose realizations are not in common use).
If you define a metric for the distance between two points on a spectrum, you can construct an “equally-spaced” subdivision of the space, or a balanced mesh. This splits a space into a set of characteristic elements (here, concepts) or nodes which can be used to describe anything elsewhere in the space.
Choosing a metric is important and difficult. For instance, once we found a way to measure color by the wavelength of its light, we could ask for enough common color words such that every wavelength of visible light is no more than 50 nm from that of one of the characteristic colors. In practice, humans see different parts of the color spectrum with differing degrees of sensitivity, and we become familiar with certain constant colors in our environment. So while the rendered spectrum does not devote much space to yellow or orange (in contrast with green and red), we have many more characteristic words for yellows and blues than a straight “wavelength subdivision” would suggest.
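A minimal sketch of the straight “wavelength subdivision” may help here. It assumes the visible spectrum runs from roughly 380 nm to 750 nm (a common approximation, not a figure from this post), and the function name balanced_mesh is made up for illustration. It finds the fewest equally spaced nodes that leave every wavelength within a given distance of some node.

```python
import math

def balanced_mesh(lo, hi, max_dist):
    """Fewest equally spaced nodes on [lo, hi] such that every point
    lies within max_dist of its nearest node (plain |a - b| metric)."""
    width = hi - lo
    # Each node covers an interval of width 2 * max_dist at most.
    n = max(1, math.ceil(width / (2 * max_dist)))
    cell = width / n
    # Put each node at the centre of its cell.
    return [lo + cell * (i + 0.5) for i in range(n)]

# Assumed bounds of the visible spectrum, in nanometres.
nodes = balanced_mesh(380.0, 750.0, 50.0)
print(nodes)
```

Under these assumptions a handful of nodes is enough, which is far fewer than the color words people actually use; that gap is one sign that raw wavelength is not the metric everyday language follows.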
It is also difficult to define facets that are independent of one another; but this is not necessary. It is mainly important for each facet to be easy to observe and agree on.
For a given metric, you can describe the fineness of a mesh in terms of the maximum distance from any concept to the closest characteristic element. (Or sometimes twice that distance – as a description of the “largest” concept that could “slip through” the mesh without including any of the characteristic elements.) If you have different metrics for each facet, a synthetic combined metric must be created that is consistent with each.
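As a rough illustration of this definition, the sketch below estimates fineness over a finite sample of the space. The name fineness, the three node wavelengths, and the sampled range are all hypothetical choices made for the example.

```python
def fineness(nodes, samples, dist):
    """Largest distance from any sampled concept to its nearest node."""
    return max(min(dist(s, n) for n in nodes) for s in samples)

# Hypothetical one-facet mesh: three characteristic wavelengths in nm.
color_nodes = [450, 550, 650]
# Sample the (assumed) visible range in 1 nm steps.
spectrum = range(380, 751)

f = fineness(color_nodes, spectrum, lambda a, b: abs(a - b))
print(f)      # 100, reached at the 750 nm end of the range
print(2 * f)  # the "slip through" variant of the same measure
```

Because the space is sampled, this is only an approximation of the true maximum; a coarse sample can miss the worst-covered point.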
A balanced mesh is then one in which the fineness of the mesh is essentially the same for all subsets of the conceptual space — so, a set of color words that provides equal facility in describing perceived colors at all points on the color spectrum. (Again, a suitable metric here might be one that stretches out the spectrum in regions perceived very well by the human eye, or colors that come up frequently in human life — the latter a metric that changes with social context.)
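One way to picture such a stretched metric, sketched under assumptions: give each sampled point a weight saying how much the metric stretches the space there, then place nodes at equal steps of accumulated weight. The function warped_mesh and the weights below are invented for illustration and are not measured sensitivities.

```python
from bisect import bisect_left
from itertools import accumulate

def warped_mesh(xs, weights, n_nodes):
    """Pick n_nodes from sorted positions xs so that they are equally
    spaced in the stretched coordinate given by cumulative weight."""
    cum = list(accumulate(weights))
    total = cum[-1]
    nodes = []
    for i in range(n_nodes):
        # Centre of the i-th equal slice of the stretched space.
        target = total * (i + 0.5) / n_nodes
        nodes.append(xs[bisect_left(cum, target)])
    return nodes

# Toy space of ten positions; the upper half is stretched threefold.
positions = list(range(10))
stretch = [1] * 5 + [3] * 5
mesh_nodes = warped_mesh(positions, stretch, 4)
print(mesh_nodes)  # [2, 5, 7, 9]
```

The stretched half of the space ends up with three of the four nodes, which is the behavior described above: more characteristic elements where the metric says distinctions matter more.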
One can often have a clear definition of a mesh without having words for some of its characteristic elements. This happens often with a multifaceted space, where the intersection of well-known values of each facet is an unknown combination that has no word to describe it. One common way of constructing a balanced mesh involves creating a balanced mesh for each facet, and then defining a concept for every combination of those single-facet ideas. Building a “complete” set of characteristic concepts can be thought of as mesh completion. It is a way of thoroughly grokking a space of related concepts. And the fineness of the resulting mesh is a measure of how effectively one has used language, imagery, or other methods to illustrate the limitless variety possible within the constraints of that conceptual space.
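The combination step can be sketched with the garden example from earlier. The facet values and the single known word below are hypothetical placeholders; the point is only that a complete mesh is the Cartesian product of the single-facet meshes, and that most of its cells may have no name.

```python
from itertools import product

# Hypothetical single-facet meshes for garden layouts.
facet_meshes = {
    "complexity": ["simple", "intricate"],
    "climate": ["arid", "temperate", "tropical"],
    "size": ["small", "large"],
}

def complete_mesh(facets):
    """One characteristic concept per combination of facet values."""
    names = list(facets)
    return [dict(zip(names, combo)) for combo in product(*facets.values())]

# Hypothetical vocabulary: combinations that already have a word.
known_words = {("simple", "arid", "small"): "rock garden"}

mesh = complete_mesh(facet_meshes)
unnamed = [c for c in mesh if tuple(c.values()) not in known_words]
print(len(mesh), len(unnamed))  # 12 11
```

The mesh grows multiplicatively with each added facet, which is part of why multifaceted meshes in ordinary language are so rarely complete.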