Analysis
Analyze output from a PCMM project. It is anticipated that this will eventually be split off into its own module or even package. Possibly with loader.jl.
Public API
PhysiCellModelManager.connectedComponents — Function
connectedComponents(snapshot::PhysiCellSnapshot, graph=:neighbors; include_cell_type_names=:all_in_one, exclude_cell_type_names::String[], include_dead::Bool=false)
connectedComponents(simulation::Simulation, index, graph; kwargs...)
connectedComponents(pcmm_output::PCMMOutput{Simulation}, args...; kwargs...)Find the connected components of a graph in a PhysiCell snapshot.
The computation can be done on subsets of cells based on their cell types.
Arguments
snapshot::PhysiCellSnapshot: The snapshot to analyzegraph: The graph data to use (default is:neighbors); must be one of:neighbors,:attachments, or:spring_attachments; can also be a stringsimulation::Simulation: The simulation object to analyzeindex: The index of the snapshot to analyze (can be an integer or a symbol)pcmm_output::PCMMOutput{Simulation}: The output of a PCMM simulation run to analyze
Keyword Arguments
include_cell_type_names: The cell types to include in the analysis (default is:all_in_one). Full list of options::all- compute connected components for all cell types individually:all_in_one- compute connected components for all cell types together"cell_type_1"- compute connected components only for the cells of typecell_type_1["cell_type_1", "cell_type_2"]- compute connected components for the cells of typecell_type_1andcell_type_2separately[["cell_type_1", "cell_type_2"]]- compute connected components for the cells of typecell_type_1andcell_type_2together[["cell_type_1", "cell_type_2"], "cell_type_3"]- compute connected components for the cells of typecell_type_1andcell_type_2together, and for the cells of typecell_type_3separately
exclude_cell_type_names: The cell types to exclude from the analysis (default isString[]); can be a single string or a vector of stringsinclude_dead: Whether to include dead cells in the analysis (default isfalse)
Returns
A dictionary in which each key is one of the following:
- cell type name (String)
- list of cell type names (Vector{String})
- list of cell type names followed by the symbol :include_dead (Vector{Any})
For each key, the value is a list of connected components in the graph. Each component is represented as a vector of vertex labels. As of this writing, the vertex labels are the simple AgentID class that wraps the cell ID.
PhysiCellModelManager.motilityStatistics — Method
motilityStatistics(simulation_id::Integer[; direction=:any])
motilityStatistics(simulation::Simulation[; direction=:any])
motilityStatistics(pcmm_output::PCMMOutput{Simulation}[; direction=:any])Return the mean speed, distance traveled, and time alive for each cell in the simulation, broken down by cell type in the case of cell type transitions.
The time is counted from when the cell first appears in simulation output until it dies or the simulation ends, whichever comes first. If the cell transitions to a new cell type during the simulation, the time is counted for each cell type separately. Each cell type taken on by a given cell will be a key in the dictionary returned at that entry.
Arguments
simulation_id::Integer: The ID of the PhysiCell simulation. ASimulationobject can also be passed in.simulation::Simulation: The simulation object.pcmm_output::PCMMOutput{Simulation}: The output of a PCMM simulation run.
Keyword Arguments
direction::Symbol: The direction to compute the mean speed. Can be:x,:y,:z, or:any(default). If:x, for example, the mean speed is calculated using only the x component of the cell's movement.
Returns
AgentDict{Dict{String, NamedTuple}}: AnAgentDict, i.e., one entry per cell in the simulation. Each dictionary has keys for each cell type taken on by the cell. The values are NamedTuples with fields:time,:distance, and:speed.
Example
ms = motilityStatistics(1) # an AgentDict{Dict{String, NamedTuple}}, one per cell in the simulation
ms[1]["epithelial"] # NamedTuple with fields :time, :distance, :speed for the cell with ID 1 in the simulation corresponding to its time as an `epithelial` cell
ms[1]["mesenchymal"].time # time spent as a `mesenchymal` cell for the cell with ID 1 in the simulation
ms[1]["mesenchymal"].distance # distance traveled as a `mesenchymal` cell for the cell with ID 1 in the simulation
ms[1]["mesenchymal"].speed # mean speed as a `mesenchymal` cell for the cell with ID 1 in the simulationPairCorrelationFunction.pcf — Function
pcf(S::AbstractPhysiCellSequence, center_cell_types, target_cell_types=center_cell_types; include_dead::Union{Bool,Tuple{Bool,Bool}}=false, dr::Float64=20.0)
pcf(simulation::Simulation[, index], args...; kwargs...)
pcf(simulation_id::Integer[, index], args...; kwargs...)
pcf(pcmm_output::PCMMOutput{Simulation}[, index], args...; kwargs...)Calculate the pair correlation function (PCF) between two sets of cell types in a PhysiCell simulation snapshot or sequence.
The center_cell_types and target_cell_types can be strings or vectors of strings. This will compute one PCF rather than one for each pair of (center, target) cell types, i.e., all centers are compared to all targets. If omitted, the targetcelltypes will be the same as the centercelltypes, i.e., not a cross-PCF. The include_dead argument can be a boolean or a tuple of booleans to indicate whether to include the dead centers and/or targets, respectively. The dr argument specifies the bin size (thickness of each annulus) for the PCF calculation.
In the signatures with an index, if it is provided then a single snapshot is analyzed, otherwise the entire sequence is analyzed. The index must be an integer or either the symbol :initial or :final.
Arguments
S::AbstractPhysiCellSequence: APhysiCellSnapshotorPhysiCellSequenceobject.center_cell_types: The cell type name(s) to use as the center of the PCF.target_cell_types: The cell type name(s) to use as the target of the PCF.simulation::Simulation: The simulation object.simulation_id::Integer: The ID of the PhysiCell simulation.pcmm_output::PCMMOutput{Simulation}: The output of a PCMM simulation run.index::Union{Integer, Symbol}: The index of the snapshot to analyze (can be an integer or the symbol:initialor:final).
Keyword Arguments
include_dead::Union{Bool,Tuple{Bool,Bool}}: Whether to include dead cells in the PCF calculation. If a tuple, the first element indicates whether to include dead centers and the second element indicates whether to include dead targets.dr::Float64: The bin size for the PCF calculation.
Alternate methods
pcf(simulation::Simulation, index::Union{Integer, Symbol}, center_cell_types, target_cell_types=center_cell_types; kwargs...): Calculate the PCF for a specific snapshot in a simulation.pcf(simulation_id::Integer, index::Union{Integer, Symbol}, center_cell_types, target_cell_types=center_cell_types; kwargs...): Calculate the PCF for a specific snapshot in a simulation by ID.pcf(simulation_id::Integer, center_cell_types, target_cell_types=center_cell_types; kwargs...): Calculate the PCF for all snapshots in a simulation by ID.pcf(simulation::Simulation, center_cell_types, target_cell_types=center_cell_types; kwargs...): Calculate the PCF for all snapshots in a simulation.
Returns
A PCMMPCFResult object containing the time, radii, and g values of the PCF. Regardless of the type of S, the time and radii will always be vectors. If S is a snapshot, the g values will be a vector of the PCF. If S is a sequence, the g values will be a (length(radii)-1 x length(time)) matrix of the PCF.
PhysiCellModelManager.finalPopulationCount — Method
finalPopulationCount(simulation::Simulation[; include_dead::Bool=false])Return the final population count of a simulation as a dictionary with cell type names as keys and their counts as values.
Also works with the simulation ID:
fpc = finalPopulationCount(1)Example
fpc = finalPopulationCount(simulation)
final_default_count = fpc["default"]PhysiCellModelManager.plotbycelltype — Function
plotbycelltype(T::AbstractTrial; include_dead::Bool=false, include_cell_type_names=:all, exclude_cell_type_names=String[], time_unit=:min)Plot the population time series of a trial by cell type.
Each cell type gets its own subplot. Each monad gets its own series within each subplot.
Arguments
T::AbstractTrial: The trial to plot.
Keyword Arguments
include_dead::Bool: Whether to include dead cells in the count. Default isfalse.include_cell_type_names::Vector{String}: A vector of cell type names to include in the plot. Default is:all, which includes all cell types in separate plots.exclude_cell_type_names::Vector{String}: A vector of cell type names to exclude from the plot. Default is an empty vector, i.e., no cell types are excluded.time_unit::Symbol: The time unit to use for the x-axis. Default is:min, which uses minutes.
PhysiCellModelManager.populationCount — Function
populationCount(snapshot, cell_type_to_name_dict::Dict{Int,String}=Dict{Int,String}(), labels::Vector{String}=String[]; include_dead::Bool=false)Return the population count of a snapshot as a dictionary with cell type names as keys and their counts as values.
If the snapshot is missing, it will return missing. This helps in cases where the files have been deleted, for example by pruning.
Arguments
snapshot::PhysiCellSnapshot: The snapshot to count the cells in.cell_type_to_name_dict::Dict{Int,String}: A dictionary mapping cell type IDs to names (default is an empty dictionary). If not provided, it is read from the snapshot files.labels::Vector{String}: The labels to identify the cell data. If not provided, it is read from the snapshot files.
Keyword Arguments
include_dead::Bool: Whether to include dead cells in the count. Default isfalse.
PhysiCellModelManager.populationTimeSeries — Method
populationTimeSeries(M::AbstractMonad[; include_dead::Bool=false])
populationTimeSeries(pcmm_output::PCMMOutput{<:AbstractMonad}[; include_dead::Bool=false])Return the population time series of a simulation or a monad.
See SimulationPopulationTimeSeries and MonadPopulationTimeSeries for more details.
PhysiCellModelManager.simulationRuntimeIntervals — Method
simulationRuntimeIntervals(sequence::PhysiCellSequence)
simulationRuntimeIntervals(simulation::Simulation)
simulationRuntimeIntervals(pcmm_output::PCMMOutput{Simulation})
simulationRuntimeIntervals(simulation_id::Integer)Get the runtime intervals of a simulation.
Returns
A named tuple with the time and runtime vectors. Each entry corresponds to the amount of runtime to simulate from the previous snapshot to the current snapshot.
Examples
seq = simulationRuntimeIntervals(1)
using Plots
plot(seq.time, seq.runtime) # plot time from previous save time vs time
plot(seq.time, cumsum(seq.runtime)) # plot time vs cumulative runtime (same as plotting against the runtime values of each snapshot)PhysiCellModelManager.AverageSubstrateTimeSeries — Type
AverageSubstrateTimeSeriesA struct to hold the average substrate concentrations over time for a PhysiCell simulation.
Constructed using AverageSubstrateTimeSeries(x) where x is any of the following: Integer (simulation ID), PhysiCellSequence, or Simulation.
Fields
simulation_id::Int: The ID of the PhysiCell simulation.time::Vector{Real}: The time points at which the snapshots were taken.substrate_concentrations::Dict{String, Vector{Real}}: A dictionary mapping substrate names to vectors of their average concentrations over time.
Example
asts = PhysiCellModelManager.AverageSubstrateTimeSeries(1) # Load average substrate time series for Simulation 1
asts.time # Get the time points
asts["time"] # alternative way to get the time points
asts["oxygen"] # Get the oxygen concentration over timePhysiCellModelManager.ExtracellularSubstrateTimeSeries — Type
ExtracellularSubstrateTimeSeriesA struct to hold the mean extracellular substrate concentrations per cell type over time for a PhysiCell simulation.
Fields
simulation_id::Int: The ID of the PhysiCell simulation.time::Vector{Real}: The time points at which the snapshots were taken.data::Dict{String, Dict{String, Vector{Real}}}: A dictionary mapping cell type names to dictionaries mapping substrate names to vectors of their average concentrations over time.
Example
ests = PhysiCellModelManager.ExtracellularSubstrateTimeSeries(1) # Load extracellular substrate time series for Simulation 1
ests.time # Get the time points
ests["cancer"]["oxygen"] # Get the oxygen concentration over time for the cancer cell type
ests = PhysiCellModelManager.ExtracellularSubstrateTimeSeries(simulation; include_dead=true) # Load extracellular substrate time series for a Simulation object, including dead cells
ests["time"] # Alternate way to get the time points
ests["cd8"]["IFNg"] # Get the interferon gamma concentration over time for the CD8 cell type
ests = PhysiCellModelManager.ExtracellularSubstrateTimeSeries(sequence) # Load extracellular substrate time series for a PhysiCellSequence objectPrivate API
PhysiCellModelManager._connectedComponents — Method
_connectedComponents(G::MetaGraph)Find the connected components of a graph.
First compute the transitive closure of the underlying Graphs.Graph object. Then, update the edge data of the MetaGraph object to reflect the new edges. Finally, loop over vertex labels and find the vertices they are connected to until all vertices are accounted for. Returns a list of connected components, where each component is represented as a vector of vertex labels.
PhysiCellModelManager._motilityStatistics — Method
_motilityStatistics(p[; direction=:any])Compute the motility statistics for a single cell in the PhysiCell simulation.
Accounts for cell type transitions and computes the distance traveled, time spent, and mean speed for each cell type the given cell has taken on during the simulation. The speed can be restricted to a specific direction (x, y, z) or calculated in any direction. In either case, the distance is unsigned.
This function is used internally by motilityStatistics.
Returns
A Dict{String, NamedTuple} where each key is a cell type name visited by the cell and the value is a NamedTuple with fields :time, :distance, and :speed. The values in this named tuple are the time, distance traveled, and mean speed for the cell in that cell type, i.e., all scalars.
PhysiCellModelManager.PCMMPCFResult — Type
PCMMPCFResultA struct to hold the results of the pair correlation function (PCF) calculation.
The start and end radii for each annulus are stored in the radii field. Thus, there is one more radius than there are annuli, i.e. length(radii) == size(g, 1) + 1. Each column of g corresponds to a time point in the time field, hence size(g, 2) == length(time).
Fields
time::Vector{Float64}: The time points at which the PCF was calculated.pcf_result::PairCorrelationFunction.PCFResult: The result of the PCF calculation.
Example
using PairCorrelationFunction
time = 12.0
radii = [0.0, 1.0, 2.0]
g = [0.5, 1.2]
PhysiCellModelManager.PCMMPCFResult(time, PairCorrelationFunction.PCFResult(radii, g))
# output
PCMMPCFResult:
Time: 12.0
Radii: 0.0 - 2.0 with 2 annuli, Δr = 1.0
g: 0.5 - 1.2 (min - max)using PairCorrelationFunction
time = [12.0; 24.0; 36.0]
radii = [0.0, 1.0, 2.0]
g = [0.5 0.6 0.4; 1.2 1.15 1.4]
PhysiCellModelManager.PCMMPCFResult(time, PairCorrelationFunction.PCFResult(radii, g))
# output
PCMMPCFResult:
Time: 12.0 - 36.0 (n = 3)
Radii: 0.0 - 2.0 with 2 annuli, Δr = 1.0
g: 0.4 - 1.4 (min - max)PhysiCellModelManager.cellPositionsForPCF — Method
cellPositionsForPCF(cells::DataFrame, cell_types::Vector{String}, cell_name_to_type_dict::Dict{String,Int}, include_dead::Bool, is_3d::Bool)Get the positions of the cells for the PCF calculation.
PhysiCellModelManager.isCrossPCF — Method
isCrossPCF(center_cell_types::Vector{String}, target_cell_types::Vector{String})Check if the center and target cell types are the same (PCF) or disjoint (cross-PCF); if neither, throw an error.
PhysiCellModelManager.pcfConstants — Method
pcfConstants(S::AbstractPhysiCellSequence, dr::Float64)Create a Constants object for the PCF calculation based on the mesh of the snapshot or sequence.
PhysiCellModelManager.pcfSnapshotCalculation — Method
pcfSnapshotCalculation(snapshot::PhysiCellSnapshot, center_cell_types::Vector{String}, target_cell_types::Vector{String}, cell_name_to_type_dict::Dict{String,Int}, include_dead::Union{Bool,Tuple{Bool,Bool}}, is_cross_pcf::Bool, constants::Constants)Calculate the pair correlation function (PCF) for a given snapshot.
PhysiCellModelManager.preparePCF! — Method
preparePCF!(S::AbstractPhysiCellSequence, center_cell_types, target_cell_types, include_dead::Union{Bool,Tuple{Bool,Bool}}, dr::Float64)Prepare the arguments for the PCF calculation.
PhysiCellModelManager.preparePCFPlot — Method
preparePCFPlot(results::Vector{PCMMPCFResult}; time_unit=:min, distance_unit=:um)Prepare the time and radii for the PCF plot.
PhysiCellModelManager.processDistance — Method
processDistance(distance::Vector{Float64}, distance_unit::Symbol)Process the distance vector to convert it to the desired distance unit. Options are :um, :mm, and :cm.
PhysiCellModelManager.processPCFCellTypes — Method
processPCFCellTypes(cell_types)Process the cell types for the PCF calculation so that they are always a vector of strings.
PhysiCellModelManager.processTime — Method
processTime(time::Vector{Float64}, time_unit::Symbol)Process the time vector to convert it to the desired time unit. Options are :min, :s, :h, :d, :w, :mo, and :y.
PhysiCellModelManager.AbstractPopulationTimeSeries — Type
AbstractPopulationTimeSeriesAbstract type representing a population time series for either a simulation or a monad.
PhysiCellModelManager.MonadPopulationTimeSeries — Type
MonadPopulationTimeSeries <: AbstractPopulationTimeSeriesHolds the data for a monad's population time series.
Note: unlike SimulationPopulationTimeSeries, this type does not save the data to a file.
Examples
mpts = MonadPopulationTimeSeries(1)
mpts = MonadPopulationTimeSeries(Monad(1))Fields
monad_id::Int: The ID of the monad.monad_length::Int: The number of simulations in the monad.time::Vector{Real}: The time points of the population time series.cell_count::Dict{String, NamedTuple}: A dictionary where keys are cell type names and values are NamedTuples with fields:counts,:mean, and:std.
PhysiCellModelManager.SimulationPopulationTimeSeries — Type
SimulationPopulationTimeSeries <: AbstractPopulationTimeSeriesHolds the data for a simulation's population time series.
If constructed using a Simulation or an Integer (representing a simulation ID), it will save the time series inside the simulations/simulation_id/summary/ folder. It will also look for previously computed time series there to avoid recomputing them.
Examples
spts = SimulationPopulationTimeSeries(1) # first checks if the population time series is already computed and if not, computes it
spts = SimulationPopulationTimeSeries(Simulation(1)) # first checks if the population time series is already computed and if not, computes it
spts = SimulationPopulationTimeSeries(1; include_dead=true) # similar, but counts dead cells as well; the file name has "_include_dead" appendedFields
simulation_id::Int: The ID of the simulation.time::Vector{Real}: The time points of the population time series.cell_count::Dict{String, Vector{Integer}}: A dictionary where keys are cell type names and values are vectors of cell counts over time.
PhysiCellModelManager.formatTimeRange — Method
formatTimeRange(v::Vector{<:Real})Format a vector of time points into a string representation.
Used only for printing certain classes to the console.
PhysiCellModelManager.getMeanCounts — Method
getMeanCounts(apts::AbstractPopulationTimeSeries)Return the mean counts of a population time series.
PhysiCellModelManager.processExcludeCellTypes — Method
processExcludeCellTypes(exclude_cell_type_names)Process the exclude_cell_type_names argument to ensure it is in the correct format.
If exclude_cell_type_names is a string, it is converted to a single-element vector. If it is a vector, it is returned as is.
PhysiCellModelManager.processIncludeCellTypes — Method
processIncludeCellTypes(include_cell_type_names, all_cell_types::Vector{String})Process the include_cell_type_names argument to ensure it is in the correct format.
Uses the all_cell_types vector to determine the valid cell types.
Arguments
include_cell_type_names: the cell types to include in the analysis (default is:all_in_one). Full list of options::all- return the vector of all cell types:all_in_one- return a vector with a single element, which is a vector of all cell types"cell_type_1"- return ["celltype1"]["cell_type_1", "cell_type_2"]- return ["celltype1", "celltype2"][["cell_type_1", "cell_type_2"]]- return [["celltype1", "celltype2"]][["cell_type_1", "cell_type_2"], "cell_type_3"]- return [["celltype1", "celltype2"], "celltype3"]
all_cell_types: a vector of all cell types in the simulation
PhysiCellModelManager.VoxelWeights — Type
VoxelWeightsA struct to hold the voxel weights for a PhysiCell simulation.
Fields
use_weights::Bool: Whether to use weights for the voxel volumes. If all voxel volumes are the same, this is set tofalse.weights::Vector{Real}: The weights for the voxel volumes.weight_total::Real: The total weight of the voxel volumes.
PhysiCellModelManager.averageExtracellularSubstrate — Function
averageExtracellularSubstrate(snapshot::PhysiCellSnapshot, cell_type_to_name_dict::Dict{Int, String}=Dict{Int, String}(), substrate_names::Vector{String}=String[], labels::Vector{String}=String[]; include_dead::Bool=false)Compute the average extracellular substrate concentrations for each cell type in a PhysiCell snapshot.
Arguments
snapshot::PhysiCellSnapshot: The snapshot to analyze.cell_type_to_name_dict::Dict{Int, String}: A dictionary mapping cell type IDs to their names. If not provided, it is read from the snapshot files.substrate_names::Vector{String}: The names of the substrates in the simulation. If not provided, it is read from the snapshot files.labels::Vector{String}: The labels to use for the cells. If not provided, it is read from the snapshot files.
Keyword Arguments
include_dead::Bool: Whether to include dead cells in the analysis (default isfalse).
Returns
Dict{String, Dict{String, Real}}: A dictionary mapping cell type names to dictionaries mapping substrate names to their average concentrations.
That is, if aes is the output of this function, then aes["cell_type_name"]["substrate_name"] is the average concentration of substrate_name for cells of type cell_type_name in the snapshot.
PhysiCellModelManager.averageSubstrate — Function
averageSubstrate(snapshot::PhysiCellSnapshot, substrate_names::Vector{String}=String[], voxel_weights::VoxelWeights=VoxelWeights(snapshot))Compute the average substrate concentrations for every substrate in a PhysiCell snapshot.
The voxel volumes are used as weights if they are not all the same.
Arguments
snapshot::PhysiCellSnapshot: The snapshot to analyze.substrate_names::Vector{String}: The names of the substrates in the simulation. If not provided, it is read from the snapshot files.voxel_weights::VoxelWeights: The voxel weights to use. If not provided, it is computed from the snapshot.
Returns
data::Dict{String, Real}: A dictionary mapping substrate names to their average concentrations.
PhysiCellModelManager.computeVoxelIndices — Method
computeVoxelIndices(cells::DataFrame, mesh::Dict{String, Vector{Float64}})Compute the voxel (linear) indices (1-nxnynz) for a set of cells in a PhysiCell simulation.
PhysiCellModelManager.computeVoxelSubscripts — Method
computeVoxelSubscripts(cells::DataFrame, mesh::Dict{String, Vector{Float64}})Compute the voxel subscripts (1-nx, 1-ny, 1-nz) for a set of cells in a PhysiCell simulation.