Skip to Main Content

OpenRefine: Common Transformations

Common Transformations

Through the column drop down menus, OpenRefine provides options for common transformations to facilitate data cleaning such as:

  • removing proceeding and trailing white spaces; collapsing extra spaces
  • standardizing letter case
  • removing html
  • changing cell type

Facets & Clustering

Facets

Facet options group together identical cells across rows for a particular column, and indicates the number of rows within each group. Facets can be used to:

  • identify and resolve inconsistencies
  • conduct batch edits similar to find and replace
  • gather count data

Access Facet options by clicking on the down arrow next to each column title. 

Cluster

Cluster options attempt to group together different cell values that may be alternative representations of the same thing, for example, FFP, Firm Fixed Price and firmed fixed price. 

Column Transformations

Reorder and/or Remove Columns

  • Using the All dropdown menu (above the star column), select Edit columns--> Re-order/remove columns. Drag and drop columns to reorder; move columns to right pane to remove them from the grid.  

Splitting Cells into Columns 

  • Cells can be split into additional columns using a separator (often a comma, semi-colon, dash, space, or other character) or field length.
  • Using the dropdown menu for the column you wish split, select Edit column-->Split into several columns. 

Combining Cell Values/ Concatenate

  • Combining cells from the same  row into a new column requires a GREL expression. From the drop down of one of the columns you wish to alter, select Edit cells --> Transform. This will bring up a window for entering your GREL expression. 
  • Enter the expression: 

value + cells[‘Column’].value

where "value" is the current column, and  'column' is the name of the column whose values you would like to combine. Note, nothing will happen if the column name is not exact. 

You can also add additional syntax using a plus sign, such as, 

value + "-" + cells[‘Column’].value 

to add, in this example, a dash between the content being combined. 

OpenRefine also can support splitting rows into columns, and splitting cells into columns or rows using the dropdown menu for the column.