Project:OpenRefine/GREL

From CODECS Wikibase

Wikidata coordinates

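A SPARQL query on Wikidata returns coordinates in the format "Point(longitude latitude)" (yes, longitude first, and separated by a space rather than a comma), but this is not a supported format and we want "latitude,longitude" instead. First strip the "Point(" wrapper and the closing bracket: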

if(value==null,
null,
replace( 
 replace( value, "Point(", "" ), 
")", "" )
)

Followed by a second transform that splits on the space and swaps the two parts:

if( isBlank(value),
value,
value.split(" ")[1] + "," + value.split(" ")[0]
)
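
To check the logic of the two steps on sample data outside OpenRefine, it can be sketched in Python. This is an illustration only, not GREL: the function name wkt_point_to_lat_lon and the sample coordinates are made up for this example, and the sketch assumes every non-empty input has exactly the shape "Point(longitude latitude)".

```python
def wkt_point_to_lat_lon(value):
    """Turn 'Point(lon lat)' into 'lat,lon'; pass empty cells through."""
    if value is None or value == "":
        return value
    # Step 1: strip the wrapper, as the two nested replace() calls do.
    inner = value.replace("Point(", "").replace(")", "")
    # Step 2: split on the space and swap the two parts.
    lon, lat = inner.split(" ")
    return lat + "," + lon


# Invented sample value in the shape returned by the query.
print(wkt_point_to_lat_lon("Point(-6.26 53.35)"))  # 53.35,-6.26
```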

Check first that a cell is (not) empty

Cells may be empty, so always add a check by using an if statement with 'value!=null' or, conversely, 'value==null':

if(value!=null,
[value ],
null
)

if value contains

Used this to split a hybrid column in the Monasticon Hibernicum.

if(value.contains('Tuam'), 'Tuam',
  if(value.contains('Cashel'), 'Cashel',
  if(value.contains('Dublin'), 'Dublin',
  if(value.contains('Armagh'),'Armagh',
null))))

if(value.contains('Ulster'), 'Ulster',
  if(value.contains('Munster'), 'Munster',
  if(value.contains('Leinster'), 'Leinster',
  if(value.contains('Connacht'),'Connacht',
  if(value.contains('Mide'),'Mide',
null)))))
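
The nested if() chains return the first name in the list that occurs in the cell. The same idea can be sketched in Python as a loop over an ordered list, which may be easier to extend than deeper nesting. This is not GREL; the names first_match, GROUP_A and GROUP_B and the sample cell are invented for the illustration.

```python
def first_match(value, names):
    """Return the first name from `names` found in value, else None."""
    if value is None:
        return None
    for name in names:  # order matters, as in the nested if() chain
        if name in value:
            return name
    return None


GROUP_A = ["Tuam", "Cashel", "Dublin", "Armagh"]
GROUP_B = ["Ulster", "Munster", "Leinster", "Connacht", "Mide"]

cell = "Cashel, Munster"  # invented sample value
print(first_match(cell, GROUP_A))  # Cashel
print(first_match(cell, GROUP_B))  # Munster
```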

Confidence levels

Some databases use question marks to indicate that the information is tentative, uncertain, etc. That is fine for text properties, but if we want to qualify statements linking to Items, we can use ... qualifiers. We can create an additional column based on the absence or presence of question marks, e.g.

if(value!=null,
 if(value.contains('?'),"uncertain","reasonable"),
 null
)

Qualifiers are then added in our schema based on the value, "uncertain" or "reasonable", in our new column.

Split and trim

Given a messy delimited string like: " value1; value2; "

if(value==null,
null,
forEach(
    trim(value).split(";"),
    i,
    i.trim()
))
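
A Python sketch of the same trim, split, trim sequence shows what to expect from the sample string. It is an illustration rather than GREL, and split_and_trim is a made-up name. In Python the trailing separator leaves an empty last item; whether OpenRefine produces the same item is best checked in the expression preview.

```python
def split_and_trim(value, sep=";"):
    """Trim the whole string, split on sep, then trim every part."""
    if value is None:
        return None
    return [part.strip() for part in value.strip().split(sep)]


print(split_and_trim(" value1; value2; "))  # ['value1', 'value2', '']
```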

Remove final period

if( value.substring(-1) == ".",
value.substring(0,-1),
value
)

Arrays

Example - split the string, take the items from index 1 onward (1 not 0, so everything after the first item) and pseudo-implode them by 'joining' them. The last step may not be so obvious.

if( value.contains(';'),
value.split(";").slice(1).join(";"),
null
)
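
The effect of split, slice(1) and join can be checked with a small Python sketch. It is not GREL, and drop_first_item is a hypothetical name: the string is split on the separator, the item at index 0 is dropped, and the rest is glued back together with the same separator.

```python
def drop_first_item(value, sep=";"):
    """Return everything after the first sep-delimited item, else None."""
    if value is None or sep not in value:
        return None
    parts = value.split(sep)
    return sep.join(parts[1:])  # parts[1:] keeps index 1 onward


print(drop_first_item("a;b;c"))  # b;c
```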

Get numbers from string

A convoluted way to get numbers beginning with 1 or 2 from a hybrid string.

if(value==null,
value,
forEach(
  value.splitByCharType(), v,
  if( v.startsWith("1"), v, 
    if( v.startsWith("2"), v, null )
  )
).join("")
)
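
Since splitByCharType() cuts the string into runs of the same character type, any piece that starts with 1 or 2 is a whole run of digits. Under that reading, and assuming the pieces that do not match are simply dropped when joining, the logic can be sketched in Python with a regular expression. This is an illustration, not GREL; the function name and the sample string are invented.

```python
import re


def numbers_starting_with_1_or_2(value):
    """Concatenate the digit runs in value that start with 1 or 2."""
    if value is None:
        return None
    digit_runs = re.findall(r"\d+", value)  # maximal runs of digits
    kept = [run for run in digit_runs if run[0] in "12"]
    return "".join(kept)


print(numbers_starting_with_1_or_2("founded c. 1152 (30 monks)"))  # 1152
```

Note that several matching numbers in one cell are concatenated without a separator, just as join("") does in the expression above.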