Database

Create and manage the PhysiCellModelManager.jl database.

Public API

PhysiCellModelManager.getAllParameterValues - Method
getAllParameterValues(simulation_id::Int)
getAllParameterValues(S::AbstractSampling)

Get all parameter values for the given simulation, monad, or sampling as a DataFrame. Simulation ID can also be passed directly as an integer.

Identifying attributes

If sibling elements have identical tags, attributes are programmatically searched to find one that can be used to identify them. Priority is given to "name", "ID", and "id" attributes. If sibling elements cannot be uniquely identified by an attribute, artificial IDs will be added to the XML paths to ensure uniqueness for the column names. These will show up as <tag>:temp_id:<index> in the column names. Search for them with contains(col_name, ":temp_id:"). Note: these are not added to the XML files themselves. Users must manually insert such artificial IDs into their XML files to use PCMM to vary those parameters.
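As a rough illustration of the search described above, the following sketch filters a list of column names for the ":temp_id:" marker. The column names are hypothetical stand-ins written for this example, not output from a real project; in practice they would come from names(df).

```julia
# Hypothetical column names, standing in for names(getAllParameterValues(simulation_id)).
col_names = [
    "overall/max_time",
    "user_parameters/rule:temp_id:1/value",
    "user_parameters/rule:temp_id:2/value",
]

# Columns whose XML path relies on an artificial ID.
temp_id_cols = filter(name -> contains(name, ":temp_id:"), col_names)

# Report the path component that carries the artificial ID.
for name in temp_id_cols
    component = first(filter(c -> contains(c, ":temp_id:"), split(name, '/')))
    println(name, " -> ", component)
end
```

Any column reported this way points at sibling elements that have no identifying attribute in the XML file.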

Converting column names into XML paths

To convert the column names in the returned DataFrame back into XML paths, split the column names by '/':

df = getAllParameterValues(simulation_id)
col1 = names(df)[1]
xml_path = split(col1, '/')

Alternatively, the internal columnNameToXMLPath function can be used.

xml_path = PhysiCellModelManager.columnNameToXMLPath(col1) 

Conversion back can be done either with join or the columnName function:

xml_path = ["overall", "max_time"]
col_name = join(xml_path, '/')
# or
col_name = PhysiCellModelManager.columnName(xml_path)

PhysiCellModelManager.getParameterValue - Method
getParameterValue(M::AbstractMonad, xp::XMLPath)
getParameterValue(M::AbstractMonad, xml_path::AbstractVector{<:AbstractString})
getParameterValue(simulation_id::Int, xp)

Get the parameter value for the given XML path from the simulation/monad's variations database if it exists, otherwise from the base XML file.

  • If the value is a boolean string ("true" or "false"), it will be returned as a Bool.
  • Otherwise, if it can be parsed as a Float64, it will be returned as such; otherwise, it will be returned as-is.
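The conversion rules above can be sketched in a few lines of plain Julia. The function name parse_value_sketch is hypothetical and this is not the package's implementation; in particular, it assumes the boolean check is an exact, case-sensitive match.

```julia
# Hypothetical sketch of the conversion rules; not the package's own code.
function parse_value_sketch(v::AbstractString)
    # Assumption: only the exact strings "true" and "false" count as booleans.
    v == "true" && return true
    v == "false" && return false
    # tryparse returns nothing when the string is not a valid Float64.
    x = tryparse(Float64, v)
    return x === nothing ? v : x
end

parse_value_sketch("true")       # Bool
parse_value_sketch("1440")       # Float64
parse_value_sketch("cell_type")  # returned unchanged
```

Note that under these rules integer-looking strings such as "1440" come back as Float64, not Int.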

PhysiCellModelManager.printSimulationsTable - Method
printSimulationsTable(args...; sink=println, kwargs...)

Print a table of simulations and their varied values. See simulationsTable for details on the arguments and keyword arguments.

First, create a DataFrame by calling simulationsTable using args... and kwargs.... Then, pass the DataFrame to the sink function.

Arguments

See simulationsTable for details on the arguments.

Keyword Arguments

  • sink: A function to print the table. Defaults to println. Note, the table is a DataFrame, so you can also use CSV.write to write the table to a CSV file.

See simulationsTable for details on the remaining keyword arguments.

Examples

printSimulationsTable([simulation_1, monad_3, sampling_2, trial_1])
sim_ids = [1, 2, 3] # vector of simulation IDs
printSimulationsTable(sim_ids; remove_constants=false) # include constant columns
using CSV
printSimulationsTable(; sink=CSV.write("temp.csv")) # write data for all simulations into temp.csv
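The sink pattern itself can be illustrated without the package. In the sketch below, simulations_table_sketch and print_simulations_table_sketch are hypothetical stand-ins, and the table is a plain vector of named tuples instead of a DataFrame. It assumes the wrapper simply returns whatever the sink returns, which the docstring does not state.

```julia
# Hypothetical stand-in for simulationsTable: a small table as a vector of rows.
simulations_table_sketch() = [(SimID = 1, max_time = 1440.0), (SimID = 2, max_time = 2880.0)]

# Hypothetical stand-in for printSimulationsTable: build the table, then hand it to `sink`.
print_simulations_table_sketch(; sink = println) = sink(simulations_table_sketch())

print_simulations_table_sketch()                          # default sink prints the table
n_rows = print_simulations_table_sketch(; sink = length)  # any one-argument function works

# A closure can capture extra state, similar in spirit to passing CSV.write("temp.csv").
buffer = IOBuffer()
print_simulations_table_sketch(; sink = table -> foreach(row -> println(buffer, row), table))
logged = String(take!(buffer))
```

Any function that accepts the table as its single argument can serve as a sink.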

PhysiCellModelManager.simulationsTable - Method
simulationsTable(args...; kwargs...)

Return a DataFrame with the simulation data by calling simulationsTableFromQuery; the keyword arguments are forwarded to that function.

There are three options for args...:

  • Simulation, Monad, Sampling, Trial, any array (or vector) of such, or any number of such objects.
  • A vector of simulation IDs.
  • If omitted, creates a DataFrame for all the simulations.

Private API

PhysiCellModelManager.columnsExist - Method
columnsExist(column_names::AbstractVector{<:AbstractString}, table_name::String; kwargs...)
columnsExist(column_names::AbstractVector{<:AbstractString}, valid_column_names::AbstractVector{<:AbstractString})

Check if all columns in column_names exist in the specified table in the database.

Alternatively, if valid_column_names needs to be reused in the caller, it can be passed directly. Keyword arguments (such as db) are forwarded to tableColumns.
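The second method amounts to a membership check against a list the caller already holds. A minimal sketch, using the hypothetical name columns_exist_sketch and made-up column names:

```julia
# Hypothetical stand-in for the two-vector method of columnsExist.
function columns_exist_sketch(column_names, valid_column_names)
    valid = Set(valid_column_names)      # build the lookup once
    return all(in(valid), column_names)  # true only if every requested column is valid
end

# Assumed example columns; a real list would come from the database table.
valid_column_names = ["simulation_id", "status_code_id", "config_id"]

all_present = columns_exist_sketch(["simulation_id", "config_id"], valid_column_names)
some_missing = columns_exist_sketch(["simulation_id", "not_a_column"], valid_column_names)
```

Passing the list in lets the caller query the table's columns once and reuse the result across several checks.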

PhysiCellModelManager.createPCMMTable - Method
createPCMMTable(table_name::String, schema::String; db::SQLite.DB=centralDB())

Create a table in the database with the given name and schema. The table will be created if it does not already exist.

The table name must end in "s" to help normalize the ID names for these entries. The schema must have a PRIMARY KEY named after the table name with the trailing "s" removed and "_id" appended (for example, "config_id" for a table named "configs").
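To make the naming rule concrete, here is a sketch that derives the expected ID column name and checks a schema string for it. Both helper names are hypothetical, the schema string is an invented example, and the regular-expression check is only one plausible way to test the rule; the package's own validation may differ.

```julia
# Hypothetical helpers illustrating the naming rule; not the package's own checks.
function id_name_sketch(table_name::AbstractString)
    endswith(table_name, "s") || throw(ArgumentError("table name must end in \"s\": $table_name"))
    return chop(table_name) * "_id"   # "configs" -> "config_id"
end

function has_expected_primary_key_sketch(table_name::AbstractString, schema::AbstractString)
    id_name = id_name_sketch(table_name)
    # Look for "<id_name> ... PRIMARY KEY" within a single column definition.
    pattern = Regex("\\b" * id_name * "\\b[^,]*PRIMARY\\s+KEY", "i")
    return occursin(pattern, schema)
end

schema = "config_id INTEGER PRIMARY KEY, folder_name TEXT UNIQUE"  # assumed example schema
matches = has_expected_primary_key_sketch("configs", schema)
mismatch = has_expected_primary_key_sketch("rulesets", schema)
```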

PhysiCellModelManager.insertFolder - Function
insertFolder(location::Symbol, folder::String, description::String="")

Insert a folder into the database. If the folder already exists, it will be ignored.

If the folder already has a description from the metadata.xml file, that description will be used instead of the one provided.

PhysiCellModelManager.isStarted - Method
isStarted(simulation_id::Int[; new_status_code::Union{Missing,String}=missing])

Check if a simulation has been started. Can also pass in a Simulation object in place of the simulation ID.

If new_status_code is provided, update the status of the simulation to this value. The check and status update are done in a transaction to ensure that the status is not changed by another process.

PhysiCellModelManager.locationVariationsDatabase - Method
locationVariationsDatabase(location::Symbol, folder::String)

Return the variations database for the location and folder.

The second argument can alternatively be the ID of the folder or an AbstractSampling object (simulation, monad, or sampling) using that folder.

PhysiCellModelManager.locationVariationsTable - Method
locationVariationsTable(query::String, db::SQLite.DB; remove_constants::Bool=false)

Return a DataFrame containing the variations table for the given query and database.

Remove constant columns if remove_constants is true and the DataFrame has more than one row.
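The constant-column rule can be sketched with plain Julia containers. Here the table is modeled as a Dict from column name to column vector instead of a DataFrame, and remove_constants_sketch is a hypothetical name; the column names and values are invented for the example.

```julia
# Hypothetical sketch: drop columns whose values are all identical,
# but only when the table has more than one row.
function remove_constants_sketch(columns::AbstractDict)
    isempty(columns) && return columns
    n_rows = length(first(values(columns)))
    n_rows > 1 || return columns                     # a single row is left untouched
    return filter(kv -> length(unique(kv.second)) > 1, columns)
end

table = Dict(
    "SimID" => [1, 2, 3],
    "max_time" => [1440.0, 1440.0, 1440.0],  # constant, so it is dropped
    "speed" => [0.5, 1.0, 0.5],
)
reduced = remove_constants_sketch(table)

single_row = Dict("SimID" => [1], "max_time" => [1440.0])
unchanged = remove_constants_sketch(single_row)
```

The single-row exception matters because every column is trivially constant when there is only one row, so removing them would leave an empty table.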

PhysiCellModelManager.locationVariationsTable - Method
locationVariationsTable(location::Symbol, ::Missing, variation_ids::AbstractVector{<:Integer}; kwargs...)

If the location folder does not contain a variations database, return a DataFrame with all variation IDs set to 0.

PhysiCellModelManager.locationVariationsTable - Method
locationVariationsTable(location::Symbol, variations_database::SQLite.DB, variation_ids::AbstractVector{<:Integer}; remove_constants::Bool=false)

Return a DataFrame containing the variations table for the given location, variations database, and variation IDs.

PhysiCellModelManager.metadataDescription - Method
metadataDescription(path_to_folder::AbstractString)

Get the description from the metadata.xml file in the given folder. The description is read from the description element that is a child of the root element.

PhysiCellModelManager.parseValueFromString - Method
parseValueFromString(v::String)

Helper function to parse a value from a string. If the string is "true" or "false", it returns a Bool. If the string can be parsed as a Float64, it returns a Float64. Otherwise, it returns the original string.

PhysiCellModelManager.queryToDataFrame - Method
queryToDataFrame(query::String; db::SQLite.DB=centralDB(), is_row::Bool=false)

Execute a query against the database and return the result as a DataFrame.

If is_row is true, the function will assert that the result has exactly one row, i.e., a unique result.

PhysiCellModelManager.simulationsTableFromQuery - Method
simulationsTableFromQuery(query::String; remove_constants::Bool=true, sort_by=String[], sort_ignore=[:SimID; shortLocationVariationID.(projectLocations().varied)])

Return a DataFrame containing the simulations table for the given query.

By default, sorting ignores the simulation ID and the variation IDs for the varied locations. The sort order can be controlled with the sort_by and sort_ignore keyword arguments.

By default, constant columns (columns with the same value for all simulations) will be removed (unless there is only one simulation). Set remove_constants to false to keep these columns.

Arguments

  • query::String: The SQL query to execute.

Keyword Arguments

  • remove_constants::Bool: If true, removes columns that have the same value for all simulations. Defaults to true.
  • sort_by::Vector{String}: A vector of column names to sort the table by. Defaults to all columns. To populate this argument, it is recommended to first print the table to see the column names.
  • sort_ignore::Vector{String}: A vector of column names to ignore when sorting. Defaults to the simulation ID and the variation IDs associated with the simulations.
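One plausible reading of how sort_by and sort_ignore interact is sketched below. The function name and the column names are hypothetical. The sketch assumes that an empty sort_by means "all columns" and that sort_ignore is applied even to explicitly requested columns; the package may resolve these cases differently.

```julia
# Hypothetical sketch of choosing the columns to sort by; not the package's own logic.
function sort_columns_sketch(all_columns, sort_by, sort_ignore)
    # Assumption: an empty sort_by falls back to every column in the table.
    candidates = isempty(sort_by) ? all_columns : sort_by
    # Keep only columns that exist and are not explicitly ignored.
    return [c for c in candidates if c in all_columns && !(c in sort_ignore)]
end

all_columns = ["SimID", "ConfigVarID", "max_time", "speed"]  # invented column names
ignored = ["SimID", "ConfigVarID"]

default_order = sort_columns_sketch(all_columns, String[], ignored)
custom_order = sort_columns_sketch(all_columns, ["speed"], ignored)
```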

PhysiCellModelManager.stmtToDataFrame - Method
stmtToDataFrame(stmt::SQLite.Stmt, params; is_row::Bool=false)
stmtToDataFrame(stmt_str::AbstractString, params; db::SQLite.DB=centralDB(), is_row::Bool=false)

Execute a prepared statement with the given parameters and return the result as a DataFrame. Compare with queryToDataFrame.

The params argument must be a type that can be used with DBInterface.execute(::SQLite.Stmt, params). See the SQLite.jl documentation for details.

If is_row is true, the function will assert that the result has exactly one row, i.e., a unique result.

Arguments

  • stmt::SQLite.Stmt: A prepared statement object. This includes the database connection and the SQL statement.
  • stmt_str::AbstractString: A string containing the SQL statement to prepare.
  • params: The parameters to bind to the prepared statement. Must be either
    • a Vector or Tuple whose elements match the order of the placeholders in the SQL statement, or
    • a NamedTuple or Dict whose keys match the named placeholders in the SQL statement.

Keyword Arguments

  • db::SQLite.DB: The database connection to use. Defaults to the central database. Unnecessary if using a prepared statement.
  • is_row::Bool: If true, asserts that the result has exactly one row. Defaults to false.

PhysiCellModelManager.tableIDName - Method
tableIDName(table::String; strip_s::Bool=true)

Return the name of the ID column for the table as a String. If strip_s is true, it removes the trailing "s" from the table name.

Examples

julia> PhysiCellModelManager.tableIDName("configs")
"config_id"