Package org.biojava.nbio.survival.data
Class WorkSheet
java.lang.Object
org.biojava.nbio.survival.data.WorkSheet
Handles very large spreadsheets of expression data, so the memory footprint is kept low.
Author: Scooter Willis
Constructor Summary
WorkSheet()
WorkSheet(Collection<String> rows, Collection<String> columns)
WorkSheet(CompactCharSequence[][] values)
WorkSheet(values)
Method Summary
void addCell(row, col, value): Add data to a cell
void addColumn(column, defaultValue)
void addColumns(ArrayList<String> columns, String defaultValue): Add columns to worksheet and set default value
void addRow(row, defaultValue)
void addRows(rows, defaultValue): Add rows to the worksheet and fill in default value
void appendWorkSheetColumns(WorkSheet worksheet): Add columns from a second worksheet to be joined by common row.
void appendWorkSheetRows(WorkSheet worksheet): Add rows from a second worksheet to be joined by common column.
void applyColumnFilter(String column, ChangeValue changeValue): Apply filter to a column to change values from, say, numeric to nominal based on some range
void changeColumnHeader(String col, String newCol)
void changeColumnHeader(ChangeValue changeValue)
void changeColumnsHeaders(LinkedHashMap<String, String> newColumnValues): Rename each column named by a HashMap key to the corresponding value
void changeRowHeader(String row, String newRow)
void changeRowHeader(ChangeValue changeValue)
void clear(): See if we can free up memory
getAllColumns(): Get the list of column names including those that may be hidden
getAllRows(): Get all rows including those that may be hidden
getCell(row, col): Get cell value
getCellDouble(String row, String col)
getColumnIndex(String column)
getColumnLookup()
getColumns(): Get the list of column names.
static WorkSheet getCopyWorkSheet(WorkSheet copyWorkSheet): Create a copy of a worksheet.
static WorkSheet getCopyWorkSheetSelectedRows(WorkSheet copyWorkSheet, ArrayList<String> rows): Create a copy of a worksheet.
getDataColumns()
getDataRows(): Get the list of row names
getDiscreteColumnValues(String column): Get back a list of unique values in the column
getDiscreteRowValues(row): Get back a list of unique values in the row
getIndexColumnName()
getLogScale(double base): Get the log scale of this worksheet, where a zero value will be set to 0.1 because log(0) is undefined
getLogScale(double base, double zeroValue): Get the log scale of this worksheet
getMetaDataColumns()
getMetaDataColumnsHashMap()
getMetaDataRows()
getMetaDataRowsHashMap()
getRandomDataColumns(int number)
getRandomDataColumns(int number, ArrayList<String> columns)
getRowHeader()
getRowIndex(String row)
getRowLookup()
getRows(): Get the list of row names.
void hideColumn(String column, boolean hide)
void hideEmptyColumns()
void hideEmptyRows()
void hideMetaDataColumns(boolean value)
void hideMetaDataRows(boolean value)
void hideRow(row, hide)
boolean isMetaDataColumn(String column)
boolean isMetaDataRow(String row)
boolean isValidColumn(String col)
boolean isValidRow(String row)
void markMetaDataColumn(String column)
void markMetaDataColumns(ArrayList<String> metaDataColumns): Marks columns as containing meta data
void markMetaDataRow(String row)
void randomlyDivideSave(double percentage, String fileName1, String fileName2): Split a worksheet randomly.
static WorkSheet readCSV(f, delimiter)
static WorkSheet readCSV(InputStream is, char delimiter): Read a CSV/tab-delimited file where you pass in the delimiter
static WorkSheet readCSV(fileName, delimiter): Read a CSV/tab-delimited file where you pass in the delimiter
void replaceColumnValues(String column, HashMap<String, String> values): Change values in a column where 0 = something and 1 = something different
void save(OutputStream outputStream, char delimitter, boolean quoteit)
void saveCSV(fileName): Save the worksheet as a csv file
void saveTXT(fileName)
void setCacheDoubleValues(boolean value)
void setIndexColumnName(String indexColumnName)
void setMetaDataColumns(ArrayList<String> metaDataColumns): Clears existing meta data columns and sets new ones
void setMetaDataColumnsAfterColumn()
void setMetaDataColumnsAfterColumn(String column)
void setMetaDataRows(ArrayList<String> metaDataRows)
void setMetaDataRowsAfterRow()
void setMetaDataRowsAfterRow(row)
void setRowHeader(String value)
void shuffleColumnsAndThenRows(ArrayList<String> columns, ArrayList<String> rows): Randomly shuffle the columns and rows.
void shuffleColumnValues(ArrayList<String> columns): Shuffle column values to allow for randomized testing.
void shuffleRowValues(ArrayList<String> rows): Shuffle row values to allow for randomized testing.
swapRowAndColumns(): Swap the row and columns returning a new worksheet
String toString()
static WorkSheet unionWorkSheetsRowJoin(String w1FileName, String w2FileName, char delimitter, boolean secondSheetMetaData): Combine two work sheets where you join based on rows.
static WorkSheet unionWorkSheetsRowJoin(WorkSheet w1, WorkSheet w2, boolean secondSheetMetaData): Combine two work sheets where you join based on rows.
Constructor Details
WorkSheet
public WorkSheet()

WorkSheet
Parameters: rows, columns
Throws: Exception

WorkSheet
Parameters: values

WorkSheet
Parameters: values
Method Details
clear
public void clear()
See if we can free up memory

toString

randomlyDivideSave
public void randomlyDivideSave(double percentage, String fileName1, String fileName2) throws Exception
Split a worksheet randomly. Used for creating a discovery/validation data set. The first file receives the given percentage and the second file the remainder.
Parameters: percentage, fileName1, fileName2
Throws: Exception
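
The random split described for randomlyDivideSave can be sketched without the library. This is a minimal illustration, not the WorkSheet implementation: the class RandomSplitSketch, the seed parameter, and the reading of percentage as a fraction between 0 and 1 are all assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Random;

// Hypothetical helper, not part of BioJava.
final class RandomSplitSketch {

    // Shuffles a copy of the names, then cuts it in two: the first part holds
    // roughly `percentage` (assumed to be a fraction in [0, 1]) of the names,
    // the second part holds the remainder.
    static List<List<String>> split(List<String> names, double percentage, long seed) {
        List<String> shuffled = new ArrayList<>(names);
        Collections.shuffle(shuffled, new Random(seed));
        int cut = (int) Math.round(shuffled.size() * percentage);
        cut = Math.max(0, Math.min(shuffled.size(), cut)); // keep the cut in range
        List<List<String>> parts = new ArrayList<>();
        parts.add(new ArrayList<>(shuffled.subList(0, cut)));
        parts.add(new ArrayList<>(shuffled.subList(cut, shuffled.size())));
        return parts;
    }

    public static void main(String[] args) {
        List<String> rows = List.of("s1", "s2", "s3", "s4", "s5", "s6", "s7", "s8", "s9", "s10");
        List<List<String>> parts = split(rows, 0.7, 42L);
        System.out.println("discovery:  " + parts.get(0));
        System.out.println("validation: " + parts.get(1));
    }
}
```

A fixed seed makes the discovery/validation split reproducible between runs, which is usually what a validation workflow needs.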
getCopyWorkSheetSelectedRows
public static WorkSheet getCopyWorkSheetSelectedRows(WorkSheet copyWorkSheet, ArrayList<String> rows) throws Exception
Create a copy of a worksheet. Useful for duplicating the original worksheet before shuffling columns or rows for testing.
Parameters: copyWorkSheet, rows
Throws: Exception

getCopyWorkSheet
Create a copy of a worksheet. Useful for duplicating the original worksheet before shuffling columns or rows for testing.
Parameters: copyWorkSheet
Throws: Exception

getMetaDataColumns

getMetaDataRows

getDataColumns

shuffleColumnsAndThenRows
public void shuffleColumnsAndThenRows(ArrayList<String> columns, ArrayList<String> rows) throws Exception
Randomly shuffle the columns and rows. The shuffled values should be constrained to the same data type; otherwise the result probably doesn't make sense.
Parameters: columns, rows
Throws: Exception

shuffleColumnValues
Shuffle column values to allow for randomized testing. The columns in the list are shuffled together.
Parameters: columns
Throws: Exception
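
One reading of "shuffled together" is that a single random row order is drawn and applied to every listed column, so values that shared a row before the shuffle still share a row afterwards. The sketch below illustrates that reading on a plain String[][]; it is an assumption about the behavior, and ShuffleTogetherSketch is a hypothetical class, not the WorkSheet code.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Random;

// Hypothetical helper, not part of BioJava.
final class ShuffleTogetherSketch {

    // Draws one random permutation of the row indexes and applies it to each
    // listed column. Columns that are not listed keep their original order.
    static void shuffleColumnsTogether(String[][] data, int[] columns, Random rng) {
        int n = data.length;
        List<Integer> order = new ArrayList<>();
        for (int i = 0; i < n; i++) {
            order.add(i);
        }
        Collections.shuffle(order, rng);
        for (int c : columns) {
            String[] old = new String[n];
            for (int r = 0; r < n; r++) {
                old[r] = data[r][c];
            }
            for (int r = 0; r < n; r++) {
                data[r][c] = old[order.get(r)];
            }
        }
    }

    public static void main(String[] args) {
        String[][] data = {
            {"r1", "a", "1"}, {"r2", "b", "2"}, {"r3", "c", "3"}, {"r4", "d", "4"}
        };
        shuffleColumnsTogether(data, new int[] {1, 2}, new Random());
        for (String[] row : data) {
            System.out.println(String.join("\t", row));
        }
    }
}
```

Shuffling the listed columns with one shared permutation breaks their association with the rest of the sheet while preserving the association among themselves, which is the usual setup for a permutation test.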
shuffleRowValues
Shuffle row values to allow for randomized testing. The rows in the list are shuffled together.
Parameters: rows
Throws: Exception

hideMetaDataColumns
public void hideMetaDataColumns(boolean value)
Parameters: value

hideMetaDataRows
public void hideMetaDataRows(boolean value)
Parameters: value

setMetaDataRowsAfterRow
public void setMetaDataRowsAfterRow()

setMetaDataColumnsAfterColumn
public void setMetaDataColumnsAfterColumn()

setMetaDataRowsAfterRow
Parameters: row

setMetaDataColumnsAfterColumn
Parameters: column

setMetaDataColumns
Clears existing meta data columns and sets new ones
Parameters: metaDataColumns

markMetaDataColumns
Marks columns as containing meta data
Parameters: metaDataColumns

markMetaDataColumn
Parameters: column

isMetaDataColumn
Parameters: column

isMetaDataRow
Parameters: row

markMetaDataRow
Parameters: row

setMetaDataRows
Parameters: metaDataRows

hideEmptyRows
Throws: Exception

hideEmptyColumns
Throws: Exception

hideRow
Parameters: row, hide

hideColumn
Parameters: column, hide

replaceColumnValues
Change values in a column, for example where 0 = something and 1 = something different
Parameters: column, values
Throws: Exception
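
The recoding that replaceColumnValues describes amounts to a lookup from old cell values to new ones. The following is a rough sketch on a plain list: ReplaceValuesSketch is a hypothetical class, and leaving unmapped values unchanged is a choice made for the example, since the documentation above does not say what happens to them.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not part of BioJava.
final class ReplaceValuesSketch {

    // Returns a new column in which every value found as a key in `values`
    // is replaced by the mapped value. Unmapped values are kept as they are
    // (an assumption of this sketch).
    static List<String> replace(List<String> column, Map<String, String> values) {
        List<String> out = new ArrayList<>();
        for (String cell : column) {
            out.add(values.getOrDefault(cell, cell));
        }
        return out;
    }

    public static void main(String[] args) {
        Map<String, String> coding = new HashMap<>();
        coding.put("0", "wild type");
        coding.put("1", "mutant");
        System.out.println(replace(List.of("0", "1", "0", "NA"), coding));
    }
}
```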
applyColumnFilter
Apply filter to a column to change values from, say, numeric to nominal based on some range
Parameters: column, changeValue
Throws: Exception
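
Turning numeric values into nominal ones based on a range can be pictured as applying a string-to-string function to every cell of the column. The sketch below uses java.util.function.UnaryOperator as a stand-in for the ChangeValue callback, whose exact interface is not shown on this page; ColumnFilterSketch, the cutoff, and the labels are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.UnaryOperator;

// Hypothetical helper, not part of BioJava.
final class ColumnFilterSketch {

    // Builds a filter that maps numeric text below `cutoff` to one label and
    // everything else numeric to another. Cells that do not parse as a number
    // are returned unchanged.
    static UnaryOperator<String> threshold(double cutoff, String below, String atOrAbove) {
        return value -> {
            try {
                return Double.parseDouble(value) < cutoff ? below : atOrAbove;
            } catch (NumberFormatException e) {
                return value;
            }
        };
    }

    // Applies the filter to every cell of a column and returns the new column.
    static List<String> apply(List<String> column, UnaryOperator<String> change) {
        List<String> out = new ArrayList<>();
        for (String cell : column) {
            out.add(change.apply(cell));
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> age = List.of("12.5", "70", "n/a");
        System.out.println(apply(age, threshold(50.0, "low", "high")));
    }
}
```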
addColumn
Parameters: column, defaultValue

addColumns
Add columns to worksheet and set default value
Parameters: columns, defaultValue

addRow
Parameters: row, defaultValue

addRows
Add rows to the worksheet and fill in default value
Parameters: rows, defaultValue

addCell
Add data to a cell
Parameters: row, col, value
Throws: Exception

isValidRow
Parameters: row

isValidColumn
Parameters: col

setCacheDoubleValues
public void setCacheDoubleValues(boolean value)
Parameters: value

getCellDouble
Parameters: row, col
Throws: Exception

getCell
Get cell value
Parameters: row, col
Throws: Exception

changeRowHeader
Parameters: changeValue

changeColumnHeader
Parameters: changeValue

changeRowHeader
Parameters: row, newRow
Throws: Exception

changeColumnsHeaders
Rename each column named by a HashMap key to the corresponding value
Parameters: newColumnValues
Throws: Exception

changeColumnHeader
Parameters: col, newCol
Throws: Exception

getColumnIndex
Parameters: column
Throws: Exception

getRowIndex
Parameters: row
Throws: Exception

getRandomDataColumns
Parameters: number

getRandomDataColumns
Parameters: number, columns

getAllColumns
Get the list of column names including those that may be hidden

getColumns
Get the list of column names. Does not include hidden columns

getDiscreteColumnValues
Get back a list of unique values in the column
Parameters: column
Throws: Exception

getDiscreteRowValues
Get back a list of unique values in the row
Parameters: row
Throws: Exception

getAllRows
Get all rows including those that may be hidden

getRows
Get the list of row names. Will exclude hidden values

getDataRows
Get the list of row names

getLogScale
Get the log scale of this worksheet, where a zero value will be set to 0.1 because log(0) is undefined
Parameters: base
Throws: Exception

getLogScale
Get the log scale of this worksheet
Parameters: base
Throws: Exception
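
The zero handling described for getLogScale can be illustrated on a single value. This is a sketch of the arithmetic only, under the assumption that the logarithm in an arbitrary base is taken by change of base; LogScaleSketch is a hypothetical class, and how WorkSheet treats negative or non-numeric cells is not covered here.

```java
// Hypothetical helper, not part of BioJava.
final class LogScaleSketch {

    // Computes log_base(value), substituting `zeroValue` when value is exactly
    // zero because log(0) is undefined. Uses the change-of-base identity
    // log_b(x) = ln(x) / ln(b).
    static double logScale(double value, double base, double zeroValue) {
        double v = (value == 0.0) ? zeroValue : value;
        return Math.log(v) / Math.log(base);
    }

    public static void main(String[] args) {
        double[] expression = {0.0, 1.0, 8.0, 1024.0};
        for (double e : expression) {
            System.out.println(e + " -> " + logScale(e, 2.0, 0.1));
        }
    }
}
```

With zeroValue set to 0.1, a zero cell becomes a finite negative number rather than negative infinity, so it can still be written out and plotted alongside the other values.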
swapRowAndColumns
Swap the row and columns returning a new worksheet
Throws: Exception

unionWorkSheetsRowJoin
public static WorkSheet unionWorkSheetsRowJoin(String w1FileName, String w2FileName, char delimitter, boolean secondSheetMetaData) throws Exception
Combine two work sheets where you join based on rows. Rows that are found in one but not the other are removed. If the second sheet is meta data then a meta data column will be added between the two joined columns
Parameters: w1FileName, w2FileName, delimitter, secondSheetMetaData
Throws: Exception

unionWorkSheetsRowJoin
public static WorkSheet unionWorkSheetsRowJoin(WorkSheet w1, WorkSheet w2, boolean secondSheetMetaData) throws Exception
Combine two work sheets where you join based on rows. Rows that are found in one but not the other are removed. If the second sheet is meta data then a meta data column will be added between the two joined columns
Parameters: w1, w2, secondSheetMetaData
Throws: Exception
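
Removing rows that are found in only one sheet corresponds to an inner join on the row name. The sketch below shows that join on two maps from row name to cell values. RowJoinSketch is a hypothetical class, and the sketch does not model the secondSheetMetaData flag or the extra meta data column.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not part of BioJava.
final class RowJoinSketch {

    // Keeps only rows present in both sheets and concatenates their values,
    // first sheet's columns followed by the second sheet's columns.
    static Map<String, List<String>> rowJoin(Map<String, List<String>> w1,
                                             Map<String, List<String>> w2) {
        Map<String, List<String>> joined = new LinkedHashMap<>();
        for (Map.Entry<String, List<String>> e : w1.entrySet()) {
            List<String> other = w2.get(e.getKey());
            if (other == null) {
                continue; // row missing from the second sheet: drop it
            }
            List<String> row = new ArrayList<>(e.getValue());
            row.addAll(other);
            joined.put(e.getKey(), row);
        }
        return joined;
    }

    public static void main(String[] args) {
        Map<String, List<String>> expression = new LinkedHashMap<>();
        expression.put("g1", List.of("1.0", "2.0"));
        expression.put("g2", List.of("3.0", "4.0"));
        Map<String, List<String>> annotation = new LinkedHashMap<>();
        annotation.put("g2", List.of("kinase"));
        annotation.put("g3", List.of("receptor"));
        System.out.println(rowJoin(expression, annotation));
    }
}
```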
readCSV
Read a CSV/tab-delimited file where you pass in the delimiter
Parameters: fileName, delimiter
Throws: Exception

readCSV
Throws: Exception

readCSV
Read a CSV/tab-delimited file where you pass in the delimiter
Parameters: f, delimiter
Throws: Exception

saveCSV
Save the worksheet as a csv file
Parameters: fileName
Throws: Exception

saveTXT
Parameters: fileName
Throws: Exception

setRowHeader
Parameters: value

appendWorkSheetColumns
Add columns from a second worksheet to be joined by common row. If the appended worksheet doesn't contain a row in the master worksheet then the default value of "" is used. Rows in the appended worksheet not found in the master worksheet are not added.
Parameters: worksheet
Throws: Exception
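
The behavior described for appendWorkSheetColumns resembles a left join on the row name: every master row is kept, missing cells are filled with "", and rows that exist only in the appended sheet are ignored. The sketch below shows this on plain maps. AppendColumnsSketch and the appendedWidth parameter are hypothetical and exist only for this example.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not part of BioJava.
final class AppendColumnsSketch {

    // For each master row, appends the matching row of the second sheet, or
    // `appendedWidth` empty strings when the second sheet lacks that row.
    // Rows found only in the appended sheet are never added.
    static Map<String, List<String>> appendColumns(Map<String, List<String>> master,
                                                   Map<String, List<String>> appended,
                                                   int appendedWidth) {
        Map<String, List<String>> result = new LinkedHashMap<>();
        for (Map.Entry<String, List<String>> e : master.entrySet()) {
            List<String> row = new ArrayList<>(e.getValue());
            List<String> extra = appended.get(e.getKey());
            row.addAll(extra != null ? extra : Collections.nCopies(appendedWidth, ""));
            result.put(e.getKey(), row);
        }
        return result;
    }

    public static void main(String[] args) {
        Map<String, List<String>> master = new LinkedHashMap<>();
        master.put("a", List.of("1"));
        master.put("b", List.of("2"));
        Map<String, List<String>> extra = new LinkedHashMap<>();
        extra.put("a", List.of("x", "y"));
        extra.put("c", List.of("z", "w"));
        System.out.println(appendColumns(master, extra, 2));
    }
}
```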
appendWorkSheetRows
Add rows from a second worksheet to be joined by common column. If the appended worksheet doesn't contain a column in the master worksheet then the default value of "" is used. Columns in the appended worksheet not found in the master worksheet are not added.
Parameters: worksheet
Throws: Exception

save
Parameters: outputStream, delimitter, quoteit
Throws: Exception

getIndexColumnName
Returns: the indexColumnName

setIndexColumnName
Parameters: indexColumnName - the indexColumnName to set

getColumnLookup
Returns: the columnLookup

getRowLookup
Returns: the rowLookup

getMetaDataColumnsHashMap
Returns: the metaDataColumnsHashMap

getMetaDataRowsHashMap
Returns: the metaDataRowsHashMap

getRowHeader
Returns: the rowHeader