Class WorkSheet

java.lang.Object
org.biojava.nbio.survival.data.WorkSheet

public class WorkSheet extends Object
Handles very large spreadsheets of expression data, so the memory footprint is kept low.
Author:
Scooter Willis
  • Constructor Details

  • Method Details

    • clear

      public void clear()
      Attempts to free up memory by clearing the worksheet's data
    • toString

      public String toString()
      Overrides:
      toString in class Object
    • randomlyDivideSave

      public void randomlyDivideSave(double percentage, String fileName1, String fileName2) throws Exception
      Split a worksheet randomly. Used for creating a discovery/validation data set. The first file will match the percentage and the second file will hold the remainder.
      Parameters:
      percentage -
      fileName1 -
      fileName2 -
      Throws:
      Exception
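The random split described above can be illustrated with a small stand-alone sketch. It is not the library's implementation: the class `RandomSplitSketch`, the `seed` argument, and the reading of `percentage` as a fraction between 0 and 1 are all assumptions made for illustration, and the sketch returns two lists of row names instead of writing files.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Random;

class RandomSplitSketch {
    // Shuffles a copy of the row names, then cuts the list at round(size * fraction).
    // Treating the percentage as a fraction between 0 and 1 is an assumption.
    static List<List<String>> split(List<String> rows, double fraction, long seed) {
        List<String> shuffled = new ArrayList<>(rows);
        Collections.shuffle(shuffled, new Random(seed));
        int cut = (int) Math.round(shuffled.size() * fraction);
        cut = Math.max(0, Math.min(shuffled.size(), cut)); // keep the cut in range
        List<List<String>> parts = new ArrayList<>();
        parts.add(new ArrayList<>(shuffled.subList(0, cut)));               // e.g. discovery set
        parts.add(new ArrayList<>(shuffled.subList(cut, shuffled.size()))); // e.g. validation set
        return parts;
    }

    public static void main(String[] args) {
        List<String> rows = Arrays.asList("s1", "s2", "s3", "s4", "s5", "s6", "s7", "s8", "s9", "s10");
        List<List<String>> parts = split(rows, 0.7, 42L);
        System.out.println("first part:  " + parts.get(0));
        System.out.println("second part: " + parts.get(1));
    }
}
```

Every row lands in exactly one of the two parts, which is the property a discovery/validation split needs.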
    • getCopyWorkSheetSelectedRows

      public static WorkSheet getCopyWorkSheetSelectedRows(WorkSheet copyWorkSheet, ArrayList<String> rows) throws Exception
      Create a copy of a worksheet restricted to the selected rows. Useful for duplicating the original worksheet before columns or rows are shuffled for testing.
      Parameters:
      copyWorkSheet -
      rows -
      Returns:
      Throws:
      Exception
    • getCopyWorkSheet

      public static WorkSheet getCopyWorkSheet(WorkSheet copyWorkSheet) throws Exception
      Create a copy of a worksheet. Useful for duplicating the original worksheet before columns or rows are shuffled for testing.
      Parameters:
      copyWorkSheet -
      Returns:
      Throws:
      Exception
    • getMetaDataColumns

      public ArrayList<String> getMetaDataColumns()
      Returns:
    • getMetaDataRows

      public ArrayList<String> getMetaDataRows()
      Returns:
    • getDataColumns

      public ArrayList<String> getDataColumns()
      Returns:
    • shuffleColumnsAndThenRows

      public void shuffleColumnsAndThenRows(ArrayList<String> columns, ArrayList<String> rows) throws Exception
      Randomly shuffle the columns and rows. The shuffled cells should be constrained to the same data type; otherwise the result probably doesn't make sense.
      Parameters:
      columns -
      rows -
      Throws:
      Exception
    • shuffleColumnValues

      public void shuffleColumnValues(ArrayList<String> columns) throws Exception
      Shuffle column values to allow for randomized testing. The columns in the list will be shuffled together.
      Parameters:
      columns -
      Throws:
      Exception
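One plausible reading of "shuffled together" is that a single random row permutation is applied to every listed column, so values that shared a row before the shuffle still share a row afterwards. The sketch below illustrates that reading on a plain map of column name to values; the class `ShuffleTogetherSketch` and its table layout are hypothetical and do not reflect how `WorkSheet` stores its data.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Random;

class ShuffleTogetherSketch {
    // Applies one random row permutation to every listed column.
    // Columns not in the list are left untouched.
    static void shuffleTogether(Map<String, List<String>> table, List<String> columns, Random rng) {
        if (columns.isEmpty()) {
            return;
        }
        int n = table.get(columns.get(0)).size();
        List<Integer> perm = new ArrayList<>();
        for (int i = 0; i < n; i++) {
            perm.add(i);
        }
        Collections.shuffle(perm, rng);
        for (String column : columns) {
            List<String> values = table.get(column);
            List<String> old = new ArrayList<>(values); // snapshot before overwriting
            for (int i = 0; i < n; i++) {
                values.set(i, old.get(perm.get(i)));
            }
        }
    }

    public static void main(String[] args) {
        Map<String, List<String>> table = new LinkedHashMap<>();
        table.put("time", new ArrayList<>(Arrays.asList("10", "20", "30")));
        table.put("status", new ArrayList<>(Arrays.asList("a", "b", "c")));
        table.put("gene1", new ArrayList<>(Arrays.asList("x", "y", "z")));
        shuffleTogether(table, Arrays.asList("time", "status"), new Random(7));
        System.out.println(table);
    }
}
```

Shuffling a time column together with its status column, for example, breaks their link to the remaining columns while keeping each time/status pair intact.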
    • shuffleRowValues

      public void shuffleRowValues(ArrayList<String> rows) throws Exception
      Shuffle row values to allow for randomized testing. The rows in the list will be shuffled together.
      Parameters:
      rows -
      Throws:
      Exception
    • hideMetaDataColumns

      public void hideMetaDataColumns(boolean value)
      Parameters:
      value -
    • hideMetaDataRows

      public void hideMetaDataRows(boolean value)
      Parameters:
      value -
    • setMetaDataRowsAfterRow

      public void setMetaDataRowsAfterRow()
    • setMetaDataColumnsAfterColumn

      public void setMetaDataColumnsAfterColumn()
    • setMetaDataRowsAfterRow

      public void setMetaDataRowsAfterRow(String row)
      Parameters:
      row -
    • setMetaDataColumnsAfterColumn

      public void setMetaDataColumnsAfterColumn(String column)
      Parameters:
      column -
    • setMetaDataColumns

      public void setMetaDataColumns(ArrayList<String> metaDataColumns)
      Clears existing meta data columns and sets new ones
      Parameters:
      metaDataColumns -
    • markMetaDataColumns

      public void markMetaDataColumns(ArrayList<String> metaDataColumns)
      Marks columns as containing meta data
      Parameters:
      metaDataColumns -
    • markMetaDataColumn

      public void markMetaDataColumn(String column)
      Parameters:
      column -
    • isMetaDataColumn

      public boolean isMetaDataColumn(String column)
      Parameters:
      column -
      Returns:
    • isMetaDataRow

      public boolean isMetaDataRow(String row)
      Parameters:
      row -
      Returns:
    • markMetaDataRow

      public void markMetaDataRow(String row)
      Parameters:
      row -
    • setMetaDataRows

      public void setMetaDataRows(ArrayList<String> metaDataRows)
      Parameters:
      metaDataRows -
    • hideEmptyRows

      public void hideEmptyRows() throws Exception
      Throws:
      Exception
    • hideEmptyColumns

      public void hideEmptyColumns() throws Exception
      Throws:
      Exception
    • hideRow

      public void hideRow(String row, boolean hide)
      Parameters:
      row -
      hide -
    • hideColumn

      public void hideColumn(String column, boolean hide)
      Parameters:
      column -
      hide -
    • replaceColumnValues

      public void replaceColumnValues(String column, HashMap<String,String> values) throws Exception
      Replace values in a column using a lookup map, e.g. where 0 maps to one value and 1 maps to a different value
      Parameters:
      column -
      values -
      Throws:
      Exception
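The map-based replacement can be pictured with the following stand-alone sketch. The class `ReplaceValuesSketch` is hypothetical, and leaving values that have no entry in the map unchanged is an assumption; the source does not say what the real method does in that case.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class ReplaceValuesSketch {
    // Returns a copy of the column with each cell looked up in the map.
    // Cells without a mapping are kept as they are (an assumption of this sketch).
    static List<String> replace(List<String> column, Map<String, String> values) {
        List<String> out = new ArrayList<>();
        for (String cell : column) {
            out.add(values.getOrDefault(cell, cell));
        }
        return out;
    }

    public static void main(String[] args) {
        Map<String, String> labels = new HashMap<>();
        labels.put("0", "censored");
        labels.put("1", "event");
        System.out.println(replace(Arrays.asList("0", "1", "1", "NA"), labels));
    }
}
```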
    • applyColumnFilter

      public void applyColumnFilter(String column, ChangeValue changeValue) throws Exception
      Apply a filter to a column to change its values, say from numeric to nominal based on some range
      Parameters:
      column -
      changeValue -
      Throws:
      Exception
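A numeric-to-nominal filter of the kind described above might look like the sketch below. It uses `java.util.function.UnaryOperator<String>` as a stand-in for the filter object; the class `ColumnFilterSketch`, the `threshold` helper, and the "HIGH"/"LOW" labels are hypothetical and are not part of the documented API.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.UnaryOperator;

class ColumnFilterSketch {
    // Applies a per-cell transformation to every value in the column.
    static List<String> apply(List<String> column, UnaryOperator<String> filter) {
        List<String> out = new ArrayList<>();
        for (String cell : column) {
            out.add(filter.apply(cell));
        }
        return out;
    }

    // Hypothetical filter: maps a numeric string to "HIGH" or "LOW" around a cutoff.
    static UnaryOperator<String> threshold(double cutoff) {
        return cell -> Double.parseDouble(cell) >= cutoff ? "HIGH" : "LOW";
    }

    public static void main(String[] args) {
        List<String> expression = Arrays.asList("1.5", "2.0", "3.2");
        System.out.println(apply(expression, threshold(2.0)));
    }
}
```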
    • addColumn

      public void addColumn(String column, String defaultValue)
      Parameters:
      column -
      defaultValue -
    • addColumns

      public void addColumns(ArrayList<String> columns, String defaultValue)
      Add columns to worksheet and set default value
      Parameters:
      columns -
      defaultValue -
    • addRow

      public void addRow(String row, String defaultValue)
      Parameters:
      row -
      defaultValue -
    • addRows

      public void addRows(ArrayList<String> rows, String defaultValue)
      Add rows to the worksheet and fill in default value
      Parameters:
      rows -
      defaultValue -
    • addCell

      public void addCell(String row, String col, String value) throws Exception
      Add data to a cell
      Parameters:
      row -
      col -
      value -
      Throws:
      Exception
    • isValidRow

      public boolean isValidRow(String row)
      Parameters:
      row -
      Returns:
    • isValidColumn

      public boolean isValidColumn(String col)
      Parameters:
      col -
      Returns:
    • setCacheDoubleValues

      public void setCacheDoubleValues(boolean value)
      Parameters:
      value -
    • getCellDouble

      public Double getCellDouble(String row, String col) throws Exception
      Parameters:
      row -
      col -
      Returns:
      Throws:
      Exception
    • getCell

      public String getCell(String row, String col) throws Exception
      Get cell value
      Parameters:
      row -
      col -
      Returns:
      Throws:
      Exception
    • changeRowHeader

      public void changeRowHeader(ChangeValue changeValue)
      Parameters:
      changeValue -
    • changeColumnHeader

      public void changeColumnHeader(ChangeValue changeValue)
      Parameters:
      changeValue -
    • changeRowHeader

      public void changeRowHeader(String row, String newRow) throws Exception
      Parameters:
      row -
      newRow -
      Throws:
      Exception
    • changeColumnsHeaders

      public void changeColumnsHeaders(LinkedHashMap<String,String> newColumnValues) throws Exception
      Rename columns: each column named by a key in the map is renamed to the corresponding value
      Parameters:
      newColumnValues -
      Throws:
      Exception
    • changeColumnHeader

      public void changeColumnHeader(String col, String newCol) throws Exception
      Parameters:
      col -
      newCol -
      Throws:
      Exception
    • getColumnIndex

      public Integer getColumnIndex(String column) throws Exception
      Parameters:
      column -
      Returns:
      Throws:
      Exception
    • getRowIndex

      public Integer getRowIndex(String row) throws Exception
      Parameters:
      row -
      Returns:
      Throws:
      Exception
    • getRandomDataColumns

      public ArrayList<String> getRandomDataColumns(int number)
      Parameters:
      number -
      Returns:
    • getRandomDataColumns

      public ArrayList<String> getRandomDataColumns(int number, ArrayList<String> columns)
      Parameters:
      number -
      columns -
      Returns:
    • getAllColumns

      public ArrayList<String> getAllColumns()
      Get the list of column names including those that may be hidden
      Returns:
    • getColumns

      public ArrayList<String> getColumns()
      Get the list of column names. Does not include hidden columns
      Returns:
    • getDiscreteColumnValues

      public ArrayList<String> getDiscreteColumnValues(String column) throws Exception
      Get back a list of unique values in the column
      Parameters:
      column -
      Returns:
      Throws:
      Exception
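Collecting the unique values of a column can be sketched with a `LinkedHashSet`, as below. The class `DiscreteValuesSketch` is hypothetical, and returning the values in first-seen order is an assumption; the source does not state the ordering of the real method's result.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;

class DiscreteValuesSketch {
    // Collects each distinct cell value once, in first-seen order.
    static List<String> discreteValues(List<String> column) {
        return new ArrayList<>(new LinkedHashSet<>(column));
    }

    public static void main(String[] args) {
        System.out.println(discreteValues(Arrays.asList("M", "F", "F", "M", "NA")));
    }
}
```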
    • getDiscreteRowValues

      public ArrayList<String> getDiscreteRowValues(String row) throws Exception
      Get back a list of unique values in the row
      Parameters:
      row -
      Returns:
      Throws:
      Exception
    • getAllRows

      public ArrayList<String> getAllRows()
      Get all rows including those that may be hidden
      Returns:
    • getRows

      public ArrayList<String> getRows()
      Get the list of row names. Will exclude hidden values
      Returns:
    • getDataRows

      public ArrayList<String> getDataRows()
      Get the list of row names
      Returns:
    • getLogScale

      public WorkSheet getLogScale(double base) throws Exception
      Get the log scale of this worksheet. Zero values are set to 0.1 because log(0) is undefined.
      Parameters:
      base -
      Returns:
      Throws:
      Exception
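The per-cell arithmetic behind a log-scale transform can be sketched as follows, using the change-of-base identity log_b(x) = ln(x) / ln(b) and substituting a small positive value for zero. The class `LogScaleSketch` and its method are hypothetical; how the real method treats negative or non-numeric cells is not covered by the source and is not modelled here.

```java
class LogScaleSketch {
    // Returns log of value in the given base, substituting zeroValue when value is 0
    // because log(0) is undefined.
    static double logScale(double value, double base, double zeroValue) {
        double v = (value == 0.0) ? zeroValue : value;
        return Math.log(v) / Math.log(base); // change of base
    }

    public static void main(String[] args) {
        System.out.println(logScale(8.0, 2.0, 0.1));
        System.out.println(logScale(0.0, 10.0, 0.1));
    }
}
```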
    • getLogScale

      public WorkSheet getLogScale(double base, double zeroValue) throws Exception
      Get the log scale of this worksheet
      Parameters:
      base -
      zeroValue -
      Returns:
      Throws:
      Exception
    • swapRowAndColumns

      public WorkSheet swapRowAndColumns() throws Exception
      Swap the row and columns returning a new worksheet
      Returns:
      Throws:
      Exception
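Swapping rows and columns is a matrix transpose that produces a new table and leaves the input unchanged. The sketch below shows the idea on a rectangular `String[][]`; the class `TransposeSketch` is hypothetical and ignores row and column headers, which the real worksheet would also have to swap.

```java
import java.util.Arrays;

class TransposeSketch {
    // Builds a new array in which cell [r][c] of the input becomes cell [c][r].
    // Assumes every input row has the same length.
    static String[][] transpose(String[][] cells) {
        int rows = cells.length;
        int cols = rows == 0 ? 0 : cells[0].length;
        String[][] out = new String[cols][rows];
        for (int r = 0; r < rows; r++) {
            for (int c = 0; c < cols; c++) {
                out[c][r] = cells[r][c];
            }
        }
        return out;
    }

    public static void main(String[] args) {
        String[][] cells = { { "a", "b", "c" }, { "d", "e", "f" } };
        System.out.println(Arrays.deepToString(transpose(cells)));
    }
}
```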
    • unionWorkSheetsRowJoin

      public static WorkSheet unionWorkSheetsRowJoin(String w1FileName, String w2FileName, char delimitter, boolean secondSheetMetaData) throws Exception
      Combine two work sheets where you join based on rows. Rows that are found in one but not the other are removed. If the second sheet is meta data then a meta data column will be added between the two joined columns
      Parameters:
      w1FileName -
      w2FileName -
      delimitter -
      secondSheetMetaData -
      Returns:
      Throws:
      Exception
    • unionWorkSheetsRowJoin

      public static WorkSheet unionWorkSheetsRowJoin(WorkSheet w1, WorkSheet w2, boolean secondSheetMetaData) throws Exception
      Combine two work sheets where you join based on rows. Rows that are found in one but not the other are removed. If the second sheet is meta data then a meta data column will be added between the two joined columns
      Parameters:
      w1 -
      w2 -
      secondSheetMetaData -
      Returns:
      Throws:
      Exception
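The row join described for both overloads behaves like an inner join on row names: only rows present in both sheets survive, and their cells are placed side by side. The sketch below illustrates this on maps of row name to cell values. The class `RowJoinSketch` is hypothetical, and the meta data handling controlled by `secondSheetMetaData` is deliberately left out.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class RowJoinSketch {
    // Keeps only rows found in both inputs and concatenates their cells,
    // first sheet's cells followed by the second sheet's cells.
    static Map<String, List<String>> rowJoin(Map<String, List<String>> w1, Map<String, List<String>> w2) {
        Map<String, List<String>> joined = new LinkedHashMap<>();
        for (Map.Entry<String, List<String>> entry : w1.entrySet()) {
            List<String> other = w2.get(entry.getKey());
            if (other == null) {
                continue; // row missing from the second sheet, so it is dropped
            }
            List<String> row = new ArrayList<>(entry.getValue());
            row.addAll(other);
            joined.put(entry.getKey(), row);
        }
        return joined;
    }

    public static void main(String[] args) {
        Map<String, List<String>> expression = new LinkedHashMap<>();
        expression.put("p1", Arrays.asList("2.1", "0.4"));
        expression.put("p2", Arrays.asList("1.7", "0.9"));
        Map<String, List<String>> clinical = new LinkedHashMap<>();
        clinical.put("p2", Arrays.asList("36", "1"));
        clinical.put("p3", Arrays.asList("12", "0"));
        System.out.println(rowJoin(expression, clinical));
    }
}
```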
    • readCSV

      public static WorkSheet readCSV(String fileName, char delimiter) throws Exception
      Read a CSV/Tab delimited file where you pass in the delimiter
      Parameters:
      fileName -
      delimiter -
      Returns:
      Throws:
      Exception
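As a rough illustration of delimiter-driven parsing, the sketch below turns delimited text into a nested map of row name to column name to value. The class `DelimitedParseSketch` is hypothetical. It assumes that the first line holds the column names and that the first field of every later line is the row name; it does not handle quoted fields, and it works on an in-memory string instead of a file.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Pattern;

class DelimitedParseSketch {
    // Parses text whose first line holds column names and whose first field on
    // every later line is the row name. Quoted fields are not handled.
    static Map<String, Map<String, String>> parse(String text, char delimiter) {
        Map<String, Map<String, String>> table = new LinkedHashMap<>();
        String splitter = Pattern.quote(String.valueOf(delimiter)); // treat the delimiter literally
        String[] lines = text.split("\r?\n");
        if (lines.length == 0 || lines[0].isEmpty()) {
            return table;
        }
        String[] header = lines[0].split(splitter, -1);
        for (int n = 1; n < lines.length; n++) {
            if (lines[n].isEmpty()) {
                continue; // skip blank lines
            }
            String[] fields = lines[n].split(splitter, -1);
            Map<String, String> row = new LinkedHashMap<>();
            for (int i = 1; i < header.length; i++) {
                row.put(header[i], i < fields.length ? fields[i] : "");
            }
            table.put(fields[0], row);
        }
        return table;
    }

    public static void main(String[] args) {
        String text = "id\ttime\tstatus\np1\t12\t1\np2\t30\t0\n";
        System.out.println(parse(text, '\t'));
    }
}
```

Passing the delimiter as a parameter lets the same routine read comma-separated and tab-separated input.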
    • readCSV

      public static WorkSheet readCSV(File f, char delimiter) throws Exception
      Throws:
      Exception
    • readCSV

      public static WorkSheet readCSV(InputStream is, char delimiter) throws Exception
      Read a CSV/Tab delimited file where you pass in the delimiter
      Parameters:
      is -
      delimiter -
      Returns:
      Throws:
      Exception
    • saveCSV

      public void saveCSV(String fileName) throws Exception
      Save the worksheet as a CSV file
      Parameters:
      fileName -
      Throws:
      Exception
    • saveTXT

      public void saveTXT(String fileName) throws Exception
      Parameters:
      fileName -
      Throws:
      Exception
    • setRowHeader

      public void setRowHeader(String value)
      Parameters:
      value -
    • appendWorkSheetColumns

      public void appendWorkSheetColumns(WorkSheet worksheet) throws Exception
      Add columns from a second worksheet to be joined by common row. If the appended worksheet doesn't contain a row in the master worksheet then default value of "" is used. Rows in the appended worksheet not found in the master worksheet are not added.
      Parameters:
      worksheet -
      Throws:
      Exception
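The append rule above amounts to a left join on row names: every master row is kept, missing cells are filled with "", and rows that exist only in the appended sheet are ignored. The sketch below shows this on maps of row name to cell values; the class `AppendColumnsSketch` and the explicit `appendedWidth` argument are hypothetical, and the sketch returns a new table instead of modifying the master in place.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class AppendColumnsSketch {
    // Keeps every master row; appends the matching row's cells when present,
    // otherwise appends appendedWidth empty strings. Rows found only in the
    // appended sheet are not added.
    static Map<String, List<String>> appendColumns(Map<String, List<String>> master,
            Map<String, List<String>> appended, int appendedWidth) {
        Map<String, List<String>> result = new LinkedHashMap<>();
        for (Map.Entry<String, List<String>> entry : master.entrySet()) {
            List<String> row = new ArrayList<>(entry.getValue());
            List<String> extra = appended.get(entry.getKey());
            if (extra != null) {
                row.addAll(extra);
            } else {
                row.addAll(Collections.nCopies(appendedWidth, ""));
            }
            result.put(entry.getKey(), row);
        }
        return result;
    }

    public static void main(String[] args) {
        Map<String, List<String>> master = new LinkedHashMap<>();
        master.put("p1", Arrays.asList("2.1"));
        master.put("p2", Arrays.asList("1.7"));
        Map<String, List<String>> extra = new LinkedHashMap<>();
        extra.put("p2", Arrays.asList("36", "1"));
        extra.put("p3", Arrays.asList("12", "0"));
        System.out.println(appendColumns(master, extra, 2));
    }
}
```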
    • appendWorkSheetRows

      public void appendWorkSheetRows(WorkSheet worksheet) throws Exception
      Add rows from a second worksheet to be joined by common column. If the appended worksheet doesn't contain a column in the master worksheet then default value of "" is used. Columns in the appended worksheet not found in the master worksheet are not added.
      Parameters:
      worksheet -
      Throws:
      Exception
    • save

      public void save(OutputStream outputStream, char delimitter, boolean quoteit) throws Exception
      Parameters:
      outputStream -
      delimitter -
      quoteit -
      Throws:
      Exception
    • getIndexColumnName

      public String getIndexColumnName()
      Returns:
      the indexColumnName
    • setIndexColumnName

      public void setIndexColumnName(String indexColumnName)
      Parameters:
      indexColumnName - the indexColumnName to set
    • getColumnLookup

      public LinkedHashMap<String,HeaderInfo> getColumnLookup()
      Returns:
      the columnLookup
    • getRowLookup

      public LinkedHashMap<String,HeaderInfo> getRowLookup()
      Returns:
      the rowLookup
    • getMetaDataColumnsHashMap

      public LinkedHashMap<String,String> getMetaDataColumnsHashMap()
      Returns:
      the metaDataColumnsHashMap
    • getMetaDataRowsHashMap

      public LinkedHashMap<String,String> getMetaDataRowsHashMap()
      Returns:
      the metaDataRowsHashMap
    • getRowHeader

      public String getRowHeader()
      Returns:
      the rowHeader