Database
Create and manage the PhysiCellModelManager.jl database.
Public API
PhysiCellModelManager.getAllParameterValues — Method
getAllParameterValues(simulation_id::Int)
getAllParameterValues(S::AbstractSampling)
Get all parameter values for the given simulation, monad, or sampling as a DataFrame. Simulation ID can also be passed directly as an integer.
Identifying attributes
If sibling elements have identical tags, attributes are programmatically searched to find one that can be used to identify them. Priority is given to "name", "ID", and "id" attributes. If sibling elements cannot be uniquely identified by an attribute, artificial IDs will be added to the XML paths to ensure uniqueness for the column names. These will show up as <tag>:temp_id:<index> in the column names. Search for them with contains(col_name, ":temp_id:"). Note: these are not added to the XML files themselves. Users must manually insert such artificial IDs into their XML files to use PCMM to vary those parameters.
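As a minimal sketch of the search suggested above, the snippet below filters a vector of column names for the artificial ID marker. The column names are hypothetical stand-ins for names(getAllParameterValues(simulation_id)); only Base Julia is used.

```julia
# Hypothetical column names; in practice these would come from
# names(getAllParameterValues(simulation_id)).
col_names = [
    "overall/max_time",
    "user_parameters/item:temp_id:1",
    "user_parameters/item:temp_id:2",
]

# Keep only the columns whose XML path relies on an artificial ID.
temp_id_cols = filter(name -> contains(name, ":temp_id:"), col_names)

# Report them so the corresponding XML elements can be given real identifying attributes.
foreach(println, temp_id_cols)
```

Any column reported this way points to sibling elements that would need a distinguishing attribute in the XML file before PCMM could vary them.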
Converting column names into XML paths
To convert the column names in the returned DataFrame back into XML paths, split the column names by '/':
df = getAllParameterValues(simulation_id)
col1 = names(df)[1]
xml_path = split(col1, '/')
Alternatively, the internal columnNameToXMLPath function can be used.
xml_path = PhysiCellModelManager.columnNameToXMLPath(col1)
Conversion back can be done either with join or the columnName function:
xml_path = ["overall", "max_time"]
col_name = join(xml_path, '/')
# or
col_name = PhysiCellModelManager.columnName(xml_path)
PhysiCellModelManager.getParameterValue — Method
getParameterValue(M::AbstractMonad, xp::XMLPath)
getParameterValue(M::AbstractMonad, xml_path::AbstractVector{<:AbstractString})
getParameterValue(simulation_id::Int, xp)
Get the parameter value for the given XML path from the simulation/monad's variations database if it exists, otherwise from the base XML file.
- If the value is a boolean string ("true" or "false"), it will be returned as a Bool.
- Otherwise, if it can be parsed as a Float64, it will be returned as such; otherwise, it will be returned as-is.
PhysiCellModelManager.printSimulationsTable — Method
printSimulationsTable(args...; sink=println, kwargs...)
Print a table of simulations and their varied values. See simulationsTable for details on the arguments and keyword arguments.
First, create a DataFrame by calling simulationsTable using args... and kwargs.... Then, pass the DataFrame to the sink function.
Arguments
See simulationsTable for details on the arguments.
Keyword Arguments
sink: A function to print the table. Defaults to println. Note, the table is a DataFrame, so you can also use CSV.write to write the table to a CSV file.
See simulationsTable for details on the remaining keyword arguments.
Examples
printSimulationsTable([simulation_1, monad_3, sampling_2, trial_1])

sim_ids = [1, 2, 3] # vector of simulation IDs
printSimulationsTable(sim_ids; remove_constants=false) # include constant columns

using CSV
printSimulationsTable(; sink=CSV.write("temp.csv")) # write data for all simulations into temp.csv
PhysiCellModelManager.simulationsTable — Method
simulationsTable(args...; kwargs...)
Return a DataFrame with the simulation data by calling simulationsTableFromQuery with those keyword arguments.
There are three options for args...:
- Simulation, Monad, Sampling, Trial, any array (or vector) of such, or any number of such objects.
- A vector of simulation IDs.
- If omitted, creates a DataFrame for all the simulations.
Private API
PhysiCellModelManager.abstractSamplingForeignReferenceSubSchema — Method
abstractSamplingForeignReferenceSubSchema()
Create the part of the schema containing foreign key references for the simulations, monads, and samplings tables.
PhysiCellModelManager.addFolderNameColumns! — Method
addFolderNameColumns!(df::DataFrame)
Add the folder names to the DataFrame for each location in the DataFrame.
PhysiCellModelManager.appendVariations — Method
appendVariations(location::Symbol, df::DataFrame)
Add the varied parameters associated with the location to df.
PhysiCellModelManager.columnsExist — Method
columnsExist(column_names::AbstractVector{<:AbstractString}, table_name::String; kwargs...)
columnsExist(column_names::AbstractVector{<:AbstractString}, valid_column_names::AbstractVector{<:AbstractString})
Check if all columns in column_names exist in the specified table in the database.
Alternatively, if the valid_column_names needs to be reused in the caller, it can be passed directly. Keyword arguments (such as db) are forwarded to tableColumns.
PhysiCellModelManager.constructSelectQuery — Function
constructSelectQuery(table_name::String, condition_stmt::String=""; selection::String="*")
Construct a SELECT query for the given table name, condition statement, and selection.
PhysiCellModelManager.createDefaultStatusCodesTable — Method
createDefaultStatusCodesTable()
Create the default status codes table in the database.
PhysiCellModelManager.createPCMMTable — Method
createPCMMTable(table_name::String, schema::String; db::SQLite.DB=centralDB())
Create a table in the database with the given name and schema. The table will be created if it does not already exist.
The table name must end in "s" to help normalize the ID names for these entries. The schema must declare a PRIMARY KEY whose name is the table name without the trailing "s", followed by "_id".
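The naming convention above can be illustrated with a small sketch. Both helper functions and the schema string below are hypothetical and are not part of PhysiCellModelManager.jl; they only show how a table name maps to its expected primary key name and how a schema string could be checked against it.

```julia
# Hypothetical helper (not part of PhysiCellModelManager.jl): derive the expected
# primary key name from a table name, following the convention described above.
function expectedPrimaryKeyName(table_name::AbstractString)
    endswith(table_name, "s") || throw(ArgumentError("table name must end in \"s\": $table_name"))
    return chop(table_name) * "_id"  # drop the trailing "s", then append "_id"
end

# Hypothetical check that a single-line schema string declares that primary key.
function declaresExpectedPrimaryKey(table_name::AbstractString, schema::AbstractString)
    id_name = expectedPrimaryKeyName(table_name)
    return occursin(Regex("\\b$(id_name)\\b.*PRIMARY KEY", "i"), schema)
end

# Assumed schema string, for illustration only.
schema = "config_id INTEGER PRIMARY KEY, folder_name TEXT UNIQUE, description TEXT"
has_key = declaresExpectedPrimaryKey("configs", schema)
```

Under these assumptions, a table named "configs" is expected to use "config_id" as its primary key, which agrees with the tableIDName example in this section.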
PhysiCellModelManager.createSchema — Method
createSchema()
Create the schema for the database. This includes creating the tables and populating them with data.
PhysiCellModelManager.databaseDiagnostics — Method
databaseDiagnostics()
Run diagnostics on the central database to check for consistency and integrity.
PhysiCellModelManager.initializeDatabase — Method
initializeDatabase()
Initialize the central database. If the database does not exist, it will be created.
PhysiCellModelManager.inputFolderID — Method
inputFolderID(location::Symbol, folder::String)
Retrieve the ID of the folder associated with the given location and folder name.
PhysiCellModelManager.inputFolderName — Method
inputFolderName(location::Symbol, id::Int)
Retrieve the folder name associated with the given location and ID.
PhysiCellModelManager.inputIDsSubSchema — Method
inputIDsSubSchema()
Create the part of the schema corresponding to the input IDs.
PhysiCellModelManager.inputVariationIDsSubSchema — Method
inputVariationIDsSubSchema()
Create the part of the schema corresponding to the varied inputs and their IDs.
PhysiCellModelManager.insertFolder — Function
insertFolder(location::Symbol, folder::String, description::String="")
Insert a folder into the database. If the folder already exists, it will be ignored.
If the folder already has a description from the metadata.xml file, that description will be used instead of the one provided.
PhysiCellModelManager.isStarted — Method
isStarted(simulation_id::Int[; new_status_code::Union{Missing,String}=missing])
Check if a simulation has been started. Can also pass in a Simulation object in place of the simulation ID.
If new_status_code is provided, update the status of the simulation to this value. The check and status update are done in a transaction to ensure that the status is not changed by another process.
PhysiCellModelManager.locationVariationsDatabase — Method
locationVariationsDatabase(location::Symbol, folder::String)
Return the variations database for the location and folder.
The second argument can alternatively be the ID of the folder or an AbstractSampling object (simulation, monad, or sampling) using that folder.
PhysiCellModelManager.locationVariationsTable — Method
locationVariationsTable(query::String, db::SQLite.DB; remove_constants::Bool=false)
Return a DataFrame containing the variations table for the given query and database.
Remove constant columns if remove_constants is true and the DataFrame has more than one row.
PhysiCellModelManager.locationVariationsTable — Method
locationVariationsTable(location::Symbol, ::Missing, variation_ids::AbstractVector{<:Integer}; kwargs...)
If the location folder does not contain a variations database, return a DataFrame with all variation IDs set to 0.
PhysiCellModelManager.locationVariationsTable — Method
locationVariationsTable(location::Symbol, ::Nothing, variation_ids::AbstractVector{<:Integer}; kwargs...)
If the location is not being used, return a DataFrame with all variation IDs set to -1.
PhysiCellModelManager.locationVariationsTable — Method
locationVariationsTable(location::Symbol, S::AbstractSampling; remove_constants::Bool=false)
Return a DataFrame containing the variations table for the given location and sampling.
PhysiCellModelManager.locationVariationsTable — Method
locationVariationsTable(location::Symbol, variations_database::SQLite.DB, variation_ids::AbstractVector{<:Integer}; remove_constants::Bool=false)
Return a DataFrame containing the variations table for the given location, variations database, and variation IDs.
PhysiCellModelManager.metadataDescription — Method
metadataDescription(path_to_folder::AbstractString)
Get the description from the metadata.xml file in the given folder using the description element as a child element of the root element.
PhysiCellModelManager.monadsSchema — Method
monadsSchema()
Create the schema for the monads table. This includes the columns and their types.
PhysiCellModelManager.necessaryInputsPresent — Method
necessaryInputsPresent()
Check if all necessary input folders are present in the database.
PhysiCellModelManager.parseValueFromString — Method
parseValueFromString(v::String)
Helper function to parse a value from a string. If the string is "true" or "false", it returns a Bool. If the string can be parsed as a Float64, it returns a Float64. Otherwise, it returns the original string.
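The parsing rule described above can be sketched in a few lines of Base Julia. The function parseValueSketch is a hypothetical re-implementation for illustration; the package's own parseValueFromString may differ in detail.

```julia
# Hypothetical re-implementation of the rule described above; not the package's code.
function parseValueSketch(v::AbstractString)
    v == "true" && return true    # boolean strings become Bool
    v == "false" && return false
    x = tryparse(Float64, v)      # `nothing` if the string is not a number
    return isnothing(x) ? v : x   # otherwise return the string unchanged
end

b = parseValueSketch("true")    # a Bool
t = parseValueSketch("1440.0")  # a Float64
u = parseValueSketch("um")      # the original string
```

Checking for the boolean strings first keeps "true" and "false" from ever reaching the numeric parser, which matches the order of the rules stated above.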
PhysiCellModelManager.physicellVersionsSchema — Method
physicellVersionsSchema()
Create the schema for the physicell_versions table. This includes the columns and their types.
PhysiCellModelManager.queryToDataFrame — Method
queryToDataFrame(query::String; db::SQLite.DB=centralDB(), is_row::Bool=false)
Execute a query against the database and return the result as a DataFrame.
If is_row is true, the function will assert that the result has exactly one row, i.e., a unique result.
PhysiCellModelManager.recognizedStatusCodes — Method
recognizedStatusCodes()
Return the recognized status codes for simulations.
PhysiCellModelManager.recurseToGetParameterValues! — Method
recurseToGetParameterValues!(D::Dict{String,Any}, current_path::Vector{String}, element::XMLElement)
Recursively traverse the XML element tree to extract parameter values and populate the dictionary D with XML paths as keys and their corresponding values. Used by getAllParameterValues.
PhysiCellModelManager.reinitializeDatabase — Method
reinitializeDatabase()
Reinitialize the database by searching through the data/inputs directory to make sure all input folders are present in the database.
PhysiCellModelManager.samplingsSchema — Method
samplingsSchema()
Create the schema for the samplings table. This includes the columns and their types.
PhysiCellModelManager.simulationsTableFromQuery — Method
simulationsTableFromQuery(query::String; remove_constants::Bool=true, sort_by=String[], sort_ignore=[:SimID; shortLocationVariationID.(projectLocations().varied)])
Return a DataFrame containing the simulations table for the given query.
By default, will ignore the simulation ID and the variation IDs for the varied locations when sorting. The sort order can be controlled by the sort_by and sort_ignore keyword arguments.
By default, constant columns (columns with the same value for all simulations) will be removed (unless there is only one simulation). Set remove_constants to false to keep these columns.
Arguments
query::String: The SQL query to execute.
Keyword Arguments
remove_constants::Bool: If true, removes columns that have the same value for all simulations. Defaults to true.
sort_by::Vector{String}: A vector of column names to sort the table by. Defaults to all columns. To populate this argument, it is recommended to first print the table to see the column names.
sort_ignore::Vector{String}: A vector of column names to ignore when sorting. Defaults to the simulation ID and the variation IDs associated with the simulations.
PhysiCellModelManager.statusCodeID — Method
statusCodeID(status_code::String)
Get the ID of a status code from the database.
PhysiCellModelManager.stmtToDataFrame — Method
stmtToDataFrame(stmt::SQLite.Stmt, params; is_row::Bool=false)
stmtToDataFrame(stmt_str::AbstractString, params; db::SQLite.DB=centralDB(), is_row::Bool=false)
Execute a prepared statement with the given parameters and return the result as a DataFrame. Compare with queryToDataFrame.
The params argument must be a type that can be used with DBInterface.execute(::SQLite.Stmt, params). See the SQLite.jl documentation for details.
If is_row is true, the function will assert that the result has exactly one row, i.e., a unique result.
Arguments
stmt::SQLite.Stmt: A prepared statement object. This includes the database connection and the SQL statement.
stmt_str::AbstractString: A string containing the SQL statement to prepare.
params: The parameters to bind to the prepared statement. Must be either
- Vector or Tuple and match the order of the placeholders in the SQL statement.
- NamedTuple or Dict with keys matching the named placeholders in the SQL statement.
Keyword Arguments
db::SQLite.DB: The database connection to use. Defaults to the central database. Unnecessary if using a prepared statement.
is_row::Bool: If true, asserts that the result has exactly one row. Defaults to false.
PhysiCellModelManager.tableColumns — Method
tableColumns(table_name::String; db::SQLite.DB=centralDB())
Return the names of the columns in the specified table in the database.
PhysiCellModelManager.tableExists — Method
tableExists(table_name::String; db::SQLite.DB=centralDB())
Check if a table with the given name exists in the database.
PhysiCellModelManager.tableIDName — Method
tableIDName(table::String; strip_s::Bool=true)
Return the name of the ID column for the table as a String. If strip_s is true, it removes the trailing "s" from the table name.
Examples
julia> PhysiCellModelManager.tableIDName("configs")
"config_id"
PhysiCellModelManager.validateParsBytes — Method
validateParsBytes(db::SQLite.DB, table_name::String)
Validate that the par_key bytes in the given table match the expected values based on the other columns.
PhysiCellModelManager.variationIDs — Method
variationIDs(location::Symbol, S::AbstractSampling)
Return a vector of the variation IDs for the given location associated with S.