Modul:Inference

Module documentation
Note: This module is for testing purposes, and will only partially work.

The module Inference is an attempt at an alternative way to query the relations between entities. It uses selectors to manipulate an internal set of selected claims, and selectors can be chained to form more complex propositions. In a final step, parts of the internal structure of the claims, or even the entities, can be returned. The final step involves an implicit filter operation, so nil values are left out of the result where possible. The implicit filtering does not change the internal selection, so the selection can still be processed further.

The entities are not selected directly, but they can be accessed through the selected claims. A call creates a list of currently selected claims from their containing entities. This can happen implicitly or be made explicit. Implicit selections are created from the connected entity (it could be better to use a dot keyword), and also after each fetch operation. Explicit selections are made by inserting parentheses; these copy the current selection unless they are given an explicit selection by listing the containing entities after the opening parenthesis.

The module is a work in progress, and is not ready for production.

Query path

A query uses a path-like string to select some claims from the explicitly or implicitly given entities. The syntax is fairly straightforward: it consists of parentheses, brackets, forward slashes, operators, a few keywords, and recognized names for entities and properties.

parentheses
Create a new selection, fill it with claims from the entities listed at the beginning, or if none are found, reuse the last current selection.
brackets
Create a new selector, select all claims that satisfy the property part, and then select the subset of claims that evaluate to truthy.
forward slash
Fetch the set of claims that reference other entities, and create a new selection.
intersection operator (default)
Takes two selections and creates a new selection containing those claims that exist in both.
union operator
Takes two selections and creates a new selection containing those claims that exist in either.
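
The two operators can be illustrated with a minimal sketch in plain Lua. It assumes that a selection is simply a list of claim identifiers. The identifiers and the helper names (toSet, intersection, union) are made up for the illustration and are not part of the module.

```lua
-- Sketch only: selections modelled as lists of claim identifiers.
-- The identifiers below are invented for illustration.
local function toSet( list )
	local set = {}
	for _, id in ipairs( list ) do
		set[id] = true
	end
	return set
end

-- Keep the claims of a that also exist in b, preserving the order of a
local function intersection( a, b )
	local inB, result = toSet( b ), {}
	for _, id in ipairs( a ) do
		if inB[id] then
			table.insert( result, id )
		end
	end
	return result
end

-- Keep every claim that exists in a or b, without duplicates
local function union( a, b )
	local seen, result = {}, {}
	for _, list in ipairs( { a, b } ) do
		for _, id in ipairs( list ) do
			if not seen[id] then
				seen[id] = true
				table.insert( result, id )
			end
		end
	end
	return result
end

local left = { 'Q20$A', 'Q20$B', 'Q20$C' }
local right = { 'Q20$B', 'Q20$C', 'Q20$D' }
local both = intersection( left, right )
local either = union( left, right )
print( #both, #either )
```

Keeping the results as ordered lists, rather than as sets, means the claims stay in the order in which they were first selected.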


A string between forward slashes can be viewed as if it were put between parentheses, with only an implicit selection.

Use claims from entity Q20, reduce the selection to those that have P31, fetch, then reduce the selection to preferred claims

Q20 [P31]/[rank preferred]/

Use claims from the current entity, reduce the selection to those that have P31, fetch, then reduce the selection to preferred claims

. [P31]/[* rank preferred]/
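
To make the structure of such a path visible, the following is a minimal sketch of a tokenizer in plain Lua. It is not the parser used by the module: the function tokenize, the pattern table, and the token names are assumptions chosen for the illustration, and operators and keywords are left out.

```lua
-- Sketch only: a hypothetical tokenizer for the path syntax.
-- Each entry is a token name and an anchored Lua pattern.
local patterns = {
	{ 'current',   '^%.' },           -- the connected entity
	{ 'selector',  '^%b[]' },         -- bracketed selector
	{ 'selection', '^%b()' },         -- parenthesized selection
	{ 'fetch',     '^/' },            -- fetch referenced entities
	{ 'entity',    '^[lLpPqQ]%d+' },  -- entity or property identifier
}

local function tokenize( path )
	local tokens = {}
	local index = 1
	while index <= #path do
		local first, last = string.find( path, '^%s+', index )
		if first then
			index = last + 1  -- skip whitespace
		else
			local matched = false
			for _, entry in ipairs( patterns ) do
				first, last = string.find( path, entry[2], index )
				if first then
					table.insert( tokens, {
						kind = entry[1],
						text = string.sub( path, first, last ),
					} )
					index = last + 1
					matched = true
					break
				end
			end
			if not matched then
				error( 'unexpected character at position ' .. index )
			end
		end
	end
	return tokens
end

local tokens = tokenize( 'Q20 [P31]/[rank preferred]/' )
for _, token in ipairs( tokens ) do
	print( token.kind, token.text )
end
```

For the first example path this gives an entity token, a selector, a fetch, a second selector, and a final fetch.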

Selectors

Property is to the left, compared value to the right.

exist (ex)
Selector is truthy if the property exists.
equal (eq)
Selector is truthy if the value for the property is equal to the given value (within the precision of the property value).
contains (co)
Selector is truthy if the value for the property contains the given value.
starts (st)
Selector is truthy if the value for the property starts with the given value.
ends (en)
Selector is truthy if the value for the property ends with the given value.

Selectors might include a modifier.
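
The selectors can be read as simple predicates. The sketch below expresses them in plain Lua under the assumption that the compared values are plain strings. How the module would compare structured data values, and how precision is handled for eq, is not specified here, so the table predicates is only an illustration.

```lua
-- Sketch only: selectors as string predicates.
-- Assumption: both the property value and the given value are strings.
local predicates = {}

-- exist: the property has a value at all
predicates.ex = function( value )
	return value ~= nil
end

-- equal: exact comparison (precision is ignored in this sketch)
predicates.eq = function( value, given )
	return value == given
end

-- contains: plain substring search, no pattern matching
predicates.co = function( value, given )
	return value ~= nil and string.find( value, given, 1, true ) ~= nil
end

-- starts: the value begins with the given string
predicates.st = function( value, given )
	return value ~= nil and string.sub( value, 1, #given ) == given
end

-- ends: the value finishes with the given string
predicates.en = function( value, given )
	return value ~= nil
		and ( given == '' or string.sub( value, -#given ) == given )
end

print( predicates.st( 'Mercury', 'Mer' ), predicates.en( 'Mercury', 'Mer' ) )
```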

Module

The module must be required, and it then exposes a number of methods. An object with a set of claims is created as follows

local inference = require 'Module:Inference'
local set = inference.create('Q1') -- Universe

The set of claims holds the ids of the object's selected entries. The ids refer to an internal cache of all known claims, and there is another internal cache of all known entities. Because the claims and the entities are expected to remain the same, they are assumed to be immutable and should not be changed in any way. If they are changed, information will leak between accesses, and later accesses might even fail. (Perhaps block changing of the entities?)
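
One possible way to block changes is a read-only proxy. The following is a minimal sketch in plain Lua, not something the module does. The helper readOnly and the sample entity table are invented for the illustration.

```lua
-- Sketch only: a hypothetical read-only wrapper for cached tables.
local function readOnly( tbl )
	return setmetatable( {}, {
		-- reads are forwarded, nested tables are wrapped as well
		__index = function( _, key )
			local value = tbl[key]
			if type( value ) == 'table' then
				return readOnly( value )
			end
			return value
		end,
		-- writes always fail
		__newindex = function( _, key )
			error( 'attempt to modify read-only field ' .. tostring( key ), 2 )
		end,
	} )
end

local entity = readOnly( { id = 'Q1', type = 'item', claims = {} } )
print( entity.id )

local ok = pcall( function()
	entity.id = 'Q2'  -- raises an error instead of changing the cache
end )
print( ok )
```

A proxy like this has a cost: in Lua 5.1, pairs, ipairs and the length operator look at the empty proxy table rather than the wrapped one, so code that iterates over claims would need additional support.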

It is possible to create a set consisting of several entities

local inference = require 'Module:Inference'
local set = inference.create(
  'Q308', -- Mercury
  'Q313', -- Venus
  'Q2', -- Earth
  'Q111' -- Mars
)

The individual claims (more than 400) are not jumbled together; they still belong to each individual entity.

This set can be filtered down into a smaller set, given the various fields in each claim

local inference = require 'Module:Inference'
local set = inference.create( 'Q308', 'Q313', 'Q2', 'Q111' ):property( 'P156' )

This actually has a size of five, as P156 for Mars (Q111) has two entries. The size can be found with :size().

Other ways to filter down the set are :type(), :rank(), :snaktype(), :datatype(), and :valuetype(). Each one takes string arguments or functions, and keeps the claims that match any of them.
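
The filtering idea can be sketched in plain Lua without access to Wikibase. The claims below are mock data invented for the illustration, and the free function filter is a simplified stand-in for the filter methods, not the module's own code.

```lua
-- Sketch only: mock claims, invented for illustration.
local claims = {
	{ id = 'Q308$A1', rank = 'normal',     mainsnak = { property = 'P156' } },
	{ id = 'Q308$A2', rank = 'preferred',  mainsnak = { property = 'P31' } },
	{ id = 'Q111$B1', rank = 'normal',     mainsnak = { property = 'P156' } },
	{ id = 'Q111$B2', rank = 'deprecated', mainsnak = { property = 'P156' } },
}

-- Keep the claims whose extracted field equals one of the given strings
-- or satisfies one of the given functions.
local function filter( list, extract, ... )
	local wanted, funs = {}, {}
	for _, arg in ipairs( { ... } ) do
		if type( arg ) == 'string' then
			wanted[arg] = true
		elseif type( arg ) == 'function' then
			table.insert( funs, arg )
		end
	end
	local kept = {}
	for _, claim in ipairs( list ) do
		local field = extract( claim )
		local keep = wanted[field]
		for _, fun in ipairs( funs ) do
			keep = keep or fun( field )
		end
		if keep then
			table.insert( kept, claim )
		end
	end
	return kept
end

local function property( claim ) return claim.mainsnak.property end
local function rank( claim ) return claim.rank end

-- A string argument, then a function argument
local followedBy = filter( claims, property, 'P156' )
local usable = filter( followedBy, rank, function( r )
	return r ~= 'deprecated'
end )
print( #followedBy, #usable )
```

Strings and functions are combined with a logical or, so a claim is kept as soon as one of the arguments accepts it.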

The selection can be turned into tables at different depth in the model

local inference = require 'Module:Inference'
local set = inference.create( 'Q308', 'Q313', 'Q2', 'Q111' ):property( 'P156' ):getEntities()

This will return a list of the four entities, as all four of them have the followed by (P156) property. Other extracts are :getClaims(), :getProperties(), :getMainsnaks(), :getDatavalues(), and :getValues(). These methods do not compress the set and will give five entries.
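
The difference between five claims and four entities follows from how a claim identifier is built: the part before the dollar sign names the containing entity. The sketch below shows the deduplication in plain Lua. The claim identifiers are invented, and entityIds is a hypothetical helper.

```lua
-- Sketch only: claim identifiers invented for illustration.
-- Mars (Q111) contributes two claims, the other planets one each.
local selected = { 'Q308$A1', 'Q313$B1', 'Q2$C1', 'Q111$D1', 'Q111$D2' }

-- Collect the containing entity of each claim once, in first-seen order
local function entityIds( claimIds )
	local ids, seen = {}, {}
	for _, claimId in ipairs( claimIds ) do
		local entityId = string.match( claimId, '^([LPQ]%d+)%$' )
		if entityId and not seen[entityId] then
			seen[entityId] = true
			table.insert( ids, entityId )
		end
	end
	return ids
end

local ids = entityIds( selected )
print( #selected, #ids )
```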

There are also two methods that format the returned values so they can be readily used on a page. One is :getPlain(), which returns properly escaped text, and the other is :getRich(), which returns formatted wikitext.

All methods with names of the get* form return the selection without changing it in any way.

Missing

  • Handling of qualifiers and references is not implemented.
  • Parsing of path statements
  • Fetch method

Examples

{{#invoke:Inference|query|format=dump}}


{{#invoke:Inference|query|Q20|format=dump}}


{{#invoke:Inference|query|Q20|format=dump|qualifier|reference}}


--- Class for inferences about entities

local mt = {}
local Inference = setmetatable( {}, mt )

--- Fast lookup of all entities known to the lib
local entitiesById = {}

--- Fast lookup of all claims known to the lib
local claimsById = {}

--- Lookup of missing class members
function Inference:__index( key ) -- luacheck: no self
	return Inference[key]
end

local validEntityTypes = {}
validEntityTypes['item'] = true
validEntityTypes['property'] = true
validEntityTypes['lexeme'] = true

local validClaimTypes = {}
validClaimTypes['statement'] = true

local validSchemaVersion = {}
validSchemaVersion[2] = true

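--- Assert that the table looks like a Wikibase entity
local function assertEntity( entity )
	assert( entity )
	assert( entity.id )
	assert( entity.type )
	assert( entity.schemaVersion )
	assert( validSchemaVersion[entity.schemaVersion] )
	assert( validEntityTypes[entity.type] )
end

--- Assert that the string has the form of an entity identifier
local function assertEntityId( id )
	assert( string.match( id, '^[lLpPqQ]%d+$' ) )
end

--- Assert that the table looks like a Wikibase claim
local function assertClaim( claim )
	assert( claim )
	assert( claim.id )
	assert( claim.mainsnak )
	assert( validClaimTypes[claim.type] )
end

--- Assert that the string has the form of a claim identifier
local function assertClaimId( id )
	assert( string.match( id, '^[lLpPqQ]%d+%$[-a-fA-F0-9]+$' ) )
end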

--- Create a new instance
function Inference.create( ... )
	local self = setmetatable( {}, Inference )
	self:_init( ... )
	return self
end

--- Initialize a new instance
function Inference:_init( ... )
	self._selected = {}
	for _,arg in ipairs( {...} ) do
		local tpe = type( arg )
		local value = arg
		if tpe == 'function' then
			value = arg()
			tpe = type( value )
		end
		if tpe == 'string' then
			self:path( value )
			--assertEntityId( value )
			--local ucEntityId = string.upper( value )
			--local entity = Inference.loadEntity( ucEntityId )
			--if entity then
			--	self:selectEntity( entity )
			--end
		elseif tpe == 'table' then
			assertEntity( value )
			local entity = value
			Inference.registerEntity( entity )
			Inference.registerClaims( entity )
		elseif tpe == 'nil' then
			-- never mind
		else
			error( 'Inference: unsupported argument type ' .. tpe )
		end
	end
	return self
end

function mt.__call( tbl, ... )
	return tbl.create( ... )
end

--- Split a string on whitespace
-- @param str string to split
-- @return the non-empty parts as multiple return values
local function split( str )
	local ret = {}
	for s in string.gmatch( str, '%S+' ) do
		table.insert( ret, s )
	end
	return unpack( ret )
end

local selectors = {}
selectors['eq'] = function( obj, str ) -- luacheck: no unused args
	-- comparison of values is not implemented yet
end
selectors['snaktype'] = function( obj, str )
	obj:snaktype( split( str ) )
end
selectors['valuetype'] = function( obj, str )
	obj:valuetype( split( str ) )
end
selectors['type'] = function( obj, str )
	obj:type( split( str ) )
end
selectors['rank'] = function( obj, str )
	obj:rank( split( str ) )
end

function Inference:path( path )
	local str = path
	local len = string.len( str )
	local index = 1
	local first = nil
	local last = nil
	local keep = nil
	repeat
		keep = index
		first, last = string.find( str, '^%s+', index)
		if first then
			index = last+1
		end
		first, last = string.find( str, '^%.', index)
		if first then
			local entity = mw.wikibase.getEntity()
			if entity then
				Inference.registerEntity( entity, true )
				Inference.registerClaims( entity, true )
				self:selectEntity( entity )
			end
			index = last+1
		end
		first, last = string.find( str, '^[pP]%d+', index)
		if first then
			self:property( string.upper( string.sub( str, first, last ) ) )
			index = last+1
		end
		first, last = string.find( str, '^[lLqQpP]%d+', index)
		if first then
			local ucEntityId = string.upper( string.sub( str, first, last ) )
			local entity = Inference.loadEntity( ucEntityId )
			if entity then
				self:selectEntity( entity )
			end
			index = last+1
		end
		first, last = string.find( str, '^%b[]', index)
		if first then
			local substr = string.sub( str, first+1, last-1 )
			local id = string.match( substr, '^%s*([pP]%d+)%s*$')
			if id then
				self:property( string.upper( id ) )
			else
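				local pid, oper, arg = string.match( substr, '^%s*([pP]%d+)%s+(%a+)%s+(.*)$')
				if pid and selectors[oper] then
					-- narrow to the property first, then apply the selector
					self:property( string.upper( pid ) )
					selectors[oper]( self, arg )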
				else
					local name, arg = string.match( substr, '^%s*(%a+)%s+(.*)$')
					if name and selectors[name] then
						selectors[name]( self, arg )
					end
				end
			end
			index = last+1
		end
		first, last = string.find( str, '^%a*%b()', index)
		if first then
			--self:path( string.sub( str, first, last ) )
			index = last+1
		end
		first, last = string.find( str, '^/', index)
		if first then
			self:fetch()
			index = last+1
		end
	until (index == keep or index > len)
	return self
end

function Inference.registerEntity( entity, noAssert )
	assertEntity( entity, noAssert )
	local ucEntityId = string.upper( entity.id )
	entitiesById[ucEntityId] = entity
end

function Inference.registerClaims( entity, noAssert )
	assertEntity( entity, noAssert )
	local ucEntityId = string.upper( entity.id )
	for prop,list in pairs( entity.claims or {} ) do
		for _,claim in ipairs( list or {} ) do
			assertClaimId( claim.id, noAssert )
			local ucClaimId = string.upper( claim.id )
			claimsById[ucClaimId] = claim
		end
	end
end

function Inference:selectEntity( entity, noAssert )
	assertEntity( entity, noAssert )
	for _,list in pairs( entity.claims or {} ) do
		for _,claim in ipairs( list or {} ) do
			self:selectClaim( claim, noAssert )
		end
	end
end

function Inference:selectClaim( claim, noAssert )
	assertClaim( claim, noAssert )
	local ucClaimId = string.upper( claim.id )
	table.insert( self._selected, ucClaimId )
end

function Inference.loadEntity( entityId, noAssert )
	assertEntityId( entityId, noAssert )
	local ucEntityId = string.upper( entityId )
	local entity = entitiesById[ucEntityId]
	if not entity then
		entity = mw.wikibase.getEntity( ucEntityId )
		if not entity then
			return nil
		end
		Inference.registerEntity( entity, true )
	end
	if entity then
		Inference.registerClaims( entity, true )
	end
	return entity
end

function Inference:fetch()
	self:valuetype( 'wikibase-entityid' )
	local values = self:getValues()
	self:empty()
	for _,value in ipairs( values ) do
		local ucEntityId = string.upper( value.id )
		local entity = Inference.loadEntity( ucEntityId )
		if entity then
			self:selectEntity( entity )
		end
	end
	return self
end

--- Is the current selection empty
-- Note that the selection is non-empty even if a nil
-- is stored internally.
-- @return boolean saying whether the selection is empty
function Inference:isEmpty()
	return #self._selected == 0
end

function Inference:size()
	return #self._selected
end

function Inference:empty()
	self._selected = {}
end

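--- Is this an identifier for a claim
-- @param id string form of an identifier
-- @return boolean saying whether the string has the form of a claim identifier
function Inference:isClaimId( id ) -- luacheck: no self
	return type( id ) == 'string'
		and string.match( id, '^[lLpPqQ]%d+%$[-a-fA-F0-9]+$' ) ~= nil
end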

function Inference.filterRank( ... )
	local filter = {}
	for _,str in ipairs( {...} ) do
		if string.match( str, '^[nN]ormal$') then
			filter['normal'] = true
		elseif string.match( str, '^[pP]referred$') then
			filter['preferred'] = true
		elseif string.match( str, '^[dD]eprecated$') then
			filter['deprecated'] = true
		end
	end
	return filter
end

function Inference.filterType( ... )
	local filter = {}
	for _,str in ipairs( {...} ) do
		if string.match( str, '^[sS]tatements?$') then
			filter['statement'] = true
		elseif string.match( str, '^[pP]roperty$') or string.match( str, '^[pP]roperties$') then
			filter['property'] = true
		elseif string.match( str, '^[lL]exemes?$') then
			filter['lexeme'] = true
		end
	end
	return filter
end

function Inference.filterProperty( ... )
	local filter = {}
	for _,str in ipairs( {...} ) do
		if string.match( str, '^[pP]%d+$') then
			filter[string.upper(str)] = true
		end
	end
	return filter
end

function Inference.filterSnaktype( ... )
	local filter = {}
	for _,str in ipairs( {...} ) do
		if string.match( str, '^[vV]alues?$') then
			filter['value'] = true
		elseif string.match( str, '^[sS]ome%-?[vV]alues?$') then
			filter['somevalue'] = true
		elseif string.match( str, '^[nN]o%-?[vV]alues?$') then
			filter['novalue'] = true
		end
	end
	return filter
end

function Inference.filterDatatype( ... )
	local filter = {}
	for _,str in ipairs( {...} ) do
		if string.match( str, '^[wW]ikibase%-?[iI]tems?$') then
			filter['wikibase-item'] = true
		elseif string.match( str, '^[wW]ikibase%-?[pP]ropertys?$') then
			filter['wikibase-property'] = true
		elseif string.match( str, '^[qQ]uantity$') or string.match( str, '^[qQ]uantities$') then
			filter['quantity'] = true
		elseif string.match( str, '^[tT]imes?$') then
			filter['time'] = true
		elseif string.match( str, '^[uU][rR][lL]s?$') then
			filter['url'] = true
		elseif string.match( str, '^[sS]trings?$') then
			filter['string'] = true
		elseif string.match( str, '^[mM]ono%-?[lL]ingual%-?[tT]exts?$') then
			filter['monolingualtext'] = true
		elseif string.match( str, '^[gG]eo%-?[sS]hapes?$') then
			filter['geo-shape'] = true
		elseif string.match( str, '^[gG]lobe%-?[cC]oordinates?$') then
			filter['globe-coordinate'] = true
		elseif string.match( str, '^[tT]abular%-?[dD]atas?$') then
			filter['tabular-data'] = true
		elseif string.match( str, '^[mM]aths?$') then
			filter['math'] = true
		elseif string.match( str, '^[eE]xternal%-?[iI]ds?$') then
			filter['external-id'] = true
		elseif string.match( str, '^[cC]ommons%-?[mM]edias?$') then
			filter['commonsMedia'] = true
		end
	end
	return filter
end

function Inference.filterValuetype( ... )
	local filter = {}
	for _,str in ipairs( {...} ) do
		
		if string.match( str, '^[wW]ikibase%-?[eE]ntityids?$') then
			filter['wikibase-entityid'] = true
		elseif string.match( str, '^[qQ]uantity$') or string.match( str, '^[qQ]uantities$') then
			filter['quantity'] = true
		elseif string.match( str, '^[sS]trings?$') then
			filter['string'] = true
		elseif string.match( str, '^[tT]imes?$') then
			filter['time'] = true
		elseif string.match( str, '^[mM]ono%-?[lL]ingual%-?[tT]exts?$') then
			filter['monolingualtext'] = true
		elseif string.match( str, '^[gG]lobe%-?[cC]oordinates?$') then
			filter['globecoordinate'] = true
		end
	end
	return filter
end

function Inference.extractRank( claim )
	return claim.rank
end

function Inference.extractType( claim )
	return claim.type
end

function Inference.extractProperty( claim )
	return claim and claim.mainsnak and claim.mainsnak.property or nil
end

function Inference.extractSnaktype( claim )
	return claim and claim.mainsnak and claim.mainsnak.snaktype or nil
end

function Inference.extractMainsnak( claim )
	return claim and claim.mainsnak or nil
end

function Inference.extractDatatype( claim )
	return claim and claim.mainsnak and claim.mainsnak.datatype or nil
end

function Inference.extractDatavalue( claim )
	return claim and claim.mainsnak and claim.mainsnak.datavalue or nil
end

function Inference.extractValuetype( claim )
	return claim and claim.mainsnak and claim.mainsnak.datavalue and claim.mainsnak.datavalue.type or nil
end

function Inference.extractValue( claim )
	return claim and claim.mainsnak and claim.mainsnak.datavalue and claim.mainsnak.datavalue.value or nil
end

function Inference.extractIdentity( claim )
	return claim
end

function Inference:filter( extractFunc, filterFunc, ... )
	local funs = {}
	local strs = {}
	for _,arg in ipairs( {...} ) do
		local tpe = type( arg )
		if tpe == 'string' then
			table.insert( strs, arg )
		elseif tpe == 'function' then
			table.insert( funs, arg )
		end
	end
	local filter = filterFunc( unpack( strs ) )
	local selected = {}
	for i,claimId in ipairs( self._selected ) do
		if claimId then
			local ucClaimId = string.upper( claimId )
			local claim = claimsById[ucClaimId]
			local extract = extractFunc( claim )
			local keep = filter[extract]
			for _,fun in ipairs(funs) do
				keep = keep or fun( extract )
			end
			if keep then
				table.insert( selected, ucClaimId )
			end
		end
	end
	self._selected = selected
	return self
end

function Inference:rank( ... )
	self:filter( Inference.extractRank, Inference.filterRank, ... )
	return self
end

function Inference:type( ... )
	self:filter( Inference.extractType, Inference.filterType, ... )
	return self
end

--- Keep only the claims whose main snak uses one of the given properties
-- @param ... string forms of identifiers
-- @return self for chaining
function Inference:property( ... )
	self:filter( Inference.extractProperty, Inference.filterProperty, ... )
	return self
end

function Inference:snaktype( ... )
	self:filter( Inference.extractSnaktype, Inference.filterSnaktype, ... )
	return self
end

function Inference:datatype( ... )
	self:filter( Inference.extractDatatype, Inference.filterDatatype, ... )
	return self
end

function Inference:valuetype( ... )
	self:filter( Inference.extractValuetype, Inference.filterValuetype, ... )
	return self
end

function Inference:getSelectedIds()
	return self._selected
end

function Inference:getEntities()
	local entities = {}
	local entityIds = {}
	local seenIds = {}
	for _,claimId in ipairs( self._selected ) do
		local ucEntityId = string.upper( string.match( claimId, '^([lLpPqQ]%d+)' ) )
		if not seenIds[ucEntityId] then
			seenIds[ucEntityId] = true
			table.insert( entityIds, ucEntityId )
		end
	end
	for _,entityId in ipairs( entityIds ) do
		local entity = entitiesById[entityId]
		if entity then
			table.insert( entities, entity )
		end
	end
	return entities
end

function Inference:get( extractFunc )
	local extracts = {}
	for i,claimId in ipairs( self._selected ) do
		extracts[i] = extractFunc( claimsById[claimId] )
	end
	return extracts
end

function Inference:getClaims()
	return self:get( Inference.extractIdentity )
end

function Inference:getProperties()
	return self:get( Inference.extractProperty )
end

function Inference:getMainsnaks()
	return self:get( Inference.extractMainsnak )
end

function Inference:getDatavalues()
	return self:get( Inference.extractDatavalue )
end

function Inference:getValues()
	return self:get( Inference.extractValue )
end

--- Render the selected claims as text
-- @param ... string flags: 'rich', 'qualifiers', 'references'
-- @return string with one entry per selected claim
function Inference:render( ... )
	local flags = {}
	for _,v in ipairs( {...} ) do
		if type( v ) == 'string' then
			flags[v] = true
		end
	end
	local singularFormatter = (flags['rich'] and mw.wikibase.formatValue) or mw.wikibase.renderSnak
	local pluralFormatter = (flags['rich'] and mw.wikibase.formatValues) or mw.wikibase.renderSnaks
	local rendered = {}
	for _,claimId in ipairs( self._selected ) do
		local items = {}
		local claim = claimsById[claimId]
		if claim.mainsnak then
			table.insert( items, tostring( singularFormatter( claim.mainsnak ) ) )
		end
		if flags['qualifiers'] and claim.qualifiers then
			local qualifiers = tostring( pluralFormatter( claim.qualifiers ) )
			table.insert( items, string.format( "(%s)", qualifiers ) )
		end
		if flags['references'] and claim.references then
			for _,refs in ipairs( claim.references ) do
				-- each reference keeps its snaks in the snaks field, as in dump()
				local reference = tostring( pluralFormatter( refs.snaks ) )
				table.insert( items, string.format( "[%s]", reference ) )
			end
		end
		table.insert( rendered, table.concat( items, ' ' ) )
	end
	-- join the entries the same way dump() does by default
	return table.concat( rendered, ",\n" )
end

function Inference:dump( args )
	local rendered = {}
	for i,claimId in ipairs( self._selected ) do
		local items = {}
		local claim = claimsById[claimId]
		local main = mw.wikibase.renderSnak( claim.mainsnak )
		table.insert( items, main )
		if claim.qualifiers then
			local qual = args['qualifiers'] and mw.wikibase.renderSnaks( claim.qualifiers )
			if qual and #qual ~= 0 then
				table.insert( items, string.format( "(%s)", qual ) )
			end
		end
		for _,refs in ipairs( claim.references or {} ) do
			local ref = args['references'] and mw.wikibase.renderSnaks( refs.snaks )
			if ref and #ref ~= 0 then
				table.insert( items, string.format( "[%s]", ref ) )
			end
		end
		table.insert( rendered, table.concat( items, args['space'] or ' ' ) )
	end
	return table.concat( rendered, args['separator'] or ",\n" )
end

function Inference:rich( frame, args )
	local rendered = {}
	local outer = mw.html.create( 'div' )
	for i,claimId in ipairs( self._selected ) do
		local items = {}
		local claim = claimsById[claimId]
		local main = mw.wikibase.formatValue( claim.mainsnak )
		table.insert( items, main )
		if claim.qualifiers then
			local qual = args['qualifiers'] and mw.wikibase.formatValues( claim.qualifiers )
			if qual and #qual ~= 0 then
				table.insert( items, string.format( "(%s)", tostring( qual ) ) )
			end
		end
		for _,refs in ipairs( claim.references or {} ) do
			local ref = args['references'] and mw.wikibase.formatValues( refs.snaks )
			if ref and #ref ~= 0 then
				table.insert( items, string.format( "[%s]", tostring( ref ) ) )
			end
		end
		local inner = mw.html.create( 'span' )
		inner:wikitext( table.concat( items, args['li'] or ' ' ) )
		outer:node( inner )
	end
	return tostring( outer )
end

function Inference.query( ... )
	local list = {...}
	local frame = nil
	local args = {}

	if #list == 1 and type( list[1] ) == 'table' and list[1].args then
		frame = list[1]
		list = frame.args
	end

	for k,v in pairs( list ) do
		if not v.args then
			if tonumber(k) and type( v ) == 'string' then
				args[v] = true
			end
			args[k] = v
		end
	end

	args['qualifiers'] = args['qual'] or args['qualifier'] or args['qualifiers']
	args['references'] = args['ref'] or args['reference'] or args['references']
	--args['format'] = args['format'] or 'rich'
	
	local result = Inference( args[1] or '' )

	if args['format'] and string.match( args['format'], '^[dD]ump$' ) then
		return result:dump( args )
	end

	if args['format'] and string.match( args['format'], '^[rR]ich$' ) then
		if not frame then
			frame = mw.getCurrentFrame()
		end
		return result:rich( frame, args )
	end

	return result:size()
end

-- Return the final class
return Inference