Project:OpenRefine/GREL
Wikidata coordinates
A SPARQL query on Wikidata returns coordinates in the format "Point(longitude latitude)" (yes, longitude first, and separated by a space rather than a comma), but this is not a supported format and we want "latitude,longitude" instead.
if(value==null, null, replace( replace( value, "Point(", "" ), ")", "" ) )
Followed by
if( isBlank(value), value, value.split(" ")[1] + "," + value.split(" ")[0] )
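To check the two steps outside OpenRefine, the same transformation can be sketched in Python. The function name wkt_point_to_lat_lon is hypothetical, and the sketch assumes the input looks exactly like "Point(longitude latitude)" with a single space between the two numbers.

```python
def wkt_point_to_lat_lon(value):
    """Turn 'Point(lon lat)' into 'lat,lon'; pass empty values through unchanged."""
    if value is None or value == "":
        return value
    # Step 1: strip the "Point(" wrapper and the closing parenthesis.
    inner = value.replace("Point(", "").replace(")", "")
    # Step 2: swap the two space-separated numbers and join them with a comma.
    lon, lat = inner.split(" ")
    return lat + "," + lon
```

For example, wkt_point_to_lat_lon("Point(-6.26 53.35)") gives "53.35,-6.26".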
Check first that a cell is (not) empty
Cells may be empty, so always add a check by using an if statement with 'value!=null' or, conversely, 'value==null':
if(value!=null, [value ], null )
if value contains
Used this to split a hybrid column in the Monasticon Hibernicum.
if(value.contains('Tuam'), 'Tuam', if(value.contains('Cashel'), 'Cashel', if(value.contains('Dublin'), 'Dublin', if(value.contains('Armagh'),'Armagh', null))))
if(value.contains('Ulster'), 'Ulster', if(value.contains('Munster'), 'Munster', if(value.contains('Leinster'), 'Leinster', if(value.contains('Connacht'),'Connacht', if(value.contains('Mide'),'Mide', null)))))
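The nested ifs return the first name, in the order written, that occurs in the cell. A Python sketch of that logic, handy for testing a list of names before typing out the nesting; the names PROVINCES and first_match are hypothetical, and unlike the GREL above the sketch also guards against empty cells.

```python
PROVINCES = ["Ulster", "Munster", "Leinster", "Connacht", "Mide"]

def first_match(value, names):
    """Return the first name (in list order) contained in value, else None."""
    if value is None:
        return None
    for name in names:
        if name in value:
            return name
    return None
```

As with the nested ifs, the order of the names matters when a cell mentions more than one of them.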
Confidence levels
Some databases use question marks to indicate that information is tentative, uncertain, etc. That is fine for text properties, but if we want to qualify statements linking to Items, we can use ... qualifiers. We can create an additional column based on the absence or presence of question marks, e.g.
if(value!=null, if(value.contains('?'),"uncertain","reasonable"), null )
Qualifiers are then added in our schema based on the value, "uncertain" or "reasonable", in our new column.
Split and trim
Given a messy array like: " value1; value2; "
if(value==null, null, forEach( trim(value).split(";"), i, i.trim() ))
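A Python sketch of the intended result, for comparison. The function name split_and_trim is hypothetical. The sketch drops empty pieces explicitly, because Python's split keeps the empty string after a trailing separator; how GREL's split treats that trailing piece is not verified here.

```python
def split_and_trim(value, sep=";"):
    """Split on sep, trim each piece, and drop pieces that end up empty."""
    if value is None:
        return None
    parts = [part.strip() for part in value.strip().split(sep)]
    return [part for part in parts if part]
```

For the messy input " value1; value2; " this yields the two clean items value1 and value2.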
Remove final period
if( value.substring(-1) == ".", value.substring(0,-1), value )
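The same check sketched in Python, with a guard for empty cells that the GREL above does not have. The function name remove_final_period is hypothetical.

```python
def remove_final_period(value):
    """Drop one trailing period, if present; leave everything else untouched."""
    if value is not None and value.endswith("."):
        return value[:-1]
    return value
```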
Arrays
Example - split the string, drop the first item in the array (slice(1) keeps everything from index 1 onwards, since indexes start at 0) and pseudo-implode the rest by 'joining' it. The last step may not be so obvious.
if( value.contains(';'), value.split(";").slice(1).join(";"), null )
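A Python sketch of what the expression computes: the list slice [1:] mirrors slice(1), keeping everything from index 1 onwards, and join puts the remaining items back into one string. The function name drop_first_item is hypothetical, and the guard for empty cells is an addition.

```python
def drop_first_item(value, sep=";"):
    """Split on sep, discard the first item, and re-join the remainder."""
    if value is None or sep not in value:
        return None
    return sep.join(value.split(sep)[1:])
```

For example, "a;b;c" becomes "b;c", while a cell without any separator gives no value.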
Get numbers from string
A convoluted way to get the numbers beginning with 1 or 2 from a hybrid string.
if(value==null, value, forEach( value.splitByCharType(), v, if( v.startsWith("1"), v, if( v.startsWith("2"), v, null ) ) ).join("") )
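A Python sketch of the same idea, assuming the goal is to keep only the runs of digits that start with 1 or 2 and to concatenate them. It uses a regular expression instead of splitting by character type, so it is an approximation of the GREL above rather than a translation; the function name numbers_starting_with_1_or_2 is hypothetical.

```python
import re

def numbers_starting_with_1_or_2(value):
    """Concatenate the digit runs in value whose first digit is 1 or 2."""
    if value is None:
        return None
    # Each match is one unbroken run of digits, e.g. "1142" in "c.1142".
    digit_runs = re.findall(r"\d+", value)
    return "".join(run for run in digit_runs if run[0] in "12")
```

Note that several matching numbers in one cell are glued together without a separator, just as join("") does in the GREL version.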