
Population balance analysis (PBA) relates the observed states of a system to its steady-state dynamics by applying the law of population balance (ref 1).

The dynamics are represented as a potential field, V, over the space of gene expression. Knowledge of V allows one to calculate possible dynamical trajectories and their properties. PBA requires enough cells to be sampled to approximate a continuum of gene expression states, and some prior estimates of the relative rates of proliferation and loss in different gene expression states (R).

The potential V is the solution to a partial differential equation (see ref 1) in which c is the density of cells in gene expression space and R is the net rate of cell division minus cell loss. In practice, this equation cannot be solved numerically, but for a set of observed transcriptional states xi sampled from c, PBA calculates an approximate solution for V. We prove mathematically in (ref 1) that this approximation converges in the high sampling limit.

Inputs:

1. Expression matrix. A file (e.g. .csv) should contain a matrix of single-cell gene expression values; it is used to generate a k-nearest neighbor (knn) graph adjacency matrix. Rows represent cells and columns represent genes.

2. Edge list. Instead of an expression matrix (input 1.), users can upload a list of edges representing a (knn) graph over sampled gene expression states. The file should contain one edge per line in the format "i,j" (0-based numbering). Users can generate, visualize and then export knn graph edge lists in our companion software SPRING (ref 2), available as a webserver or a standalone program.

3. Source/sink vector (R). A file (e.g. .csv) should contain a vector of source/sink terms representing the relative rates of proliferation and loss at each sampled gene expression state. Note that uniformly changing R by a scalar factor f is equivalent to changing the diffusion rate (level of stochasticity) by 1/f.

4. Lineage-specific sink matrix (S). If provided, this matrix can be used to define terminal lineages and compute the fate probabilities of sampled gene expression states. A file (e.g. .csv) should contain a matrix with one column for each lineage and one row for each cell. The i,j entry represents the flux of cells from gene expression state i into lineage j.

5. Diffusion constant (D). This parameter controls the level of stochasticity in the model. It is redundant: setting D = x is equivalent to multiplying inputs 3. and 4. by 1/x.

We provide example datasets from (ref 1) in example_datasets/.

PBA applies a sequence of calculations (see below). Each one can be run as a separate script, which relies on the output from the previous script. To execute all the steps at once, run PBA_pipeline.py as follows:

Input: an expression matrix (Input 1. above) or an edge list (Input 2. above), a global source/sink vector (Input 3. above), and optionally a lineage-specific sink matrix (Input 4. above).

Output:

- If no edge list was inputted: an edge list (a file called "edge_list.csv" saved to the same directory as the expression matrix)
- The pseudoinverse of the knn graph Laplacian (a matrix called Linv.npy saved to the same directory as the edge list or expression matrix)
- The potential (an array called V.npy saved to the same directory as the edge list or expression matrix)
- If lineage-specific fates were inputted: a fate probability matrix where rows are cells and columns are fates (an array called B.npy saved to the same directory as the edge list or expression matrix)

Usage:

    python PBA_pipeline.py
        -X  (expression matrix; required if no edge list is supplied)
        -E  (default = 0.05; used to filter genes for knn graph)
        -V  (default = 2; used to filter genes for knn graph)
        -N  (default = False; used to normalize expression data for knn graph)
        -p  (default = 50; used to compute distance matrix for knn graph)
        -k  (default = 10; used to compute edge list for knn graph)
        -e  (edge list; required if no expression matrix is supplied)
        -S  (lineage-specific sink matrix; optional, needed to compute fate probabilities)
        -D  (default = 1.0; controls the level of stochasticity in the model)

Alternatively, it is possible to run each step separately.
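To make the edge list format concrete, here is a minimal sketch of how a knn edge list with one 0-based "i,j" pair per line could be produced from an expression matrix. It is not the code used by PBA_pipeline.py: the function names `knn_edge_list` and `format_edge_list` are hypothetical, plain Euclidean distance on the raw matrix is an assumption, and the gene filtering, normalization and dimensionality reduction steps controlled by the pipeline options are left out.

```python
import numpy as np


def knn_edge_list(X, k=10):
    """Return sorted, de-duplicated undirected edges (i, j), i < j, of a knn graph.

    Hypothetical helper: rows of X are cells, columns are genes.
    """
    X = np.asarray(X, dtype=float)
    n_cells = X.shape[0]
    # Squared Euclidean distance between every pair of cells (rows of X).
    sq_norms = np.sum(X ** 2, axis=1)
    dist = sq_norms[:, None] + sq_norms[None, :] - 2.0 * (X @ X.T)
    np.fill_diagonal(dist, np.inf)  # a cell is never its own neighbour
    k = min(k, n_cells - 1)
    # Indices of the k closest cells for each row.
    neighbours = np.argsort(dist, axis=1)[:, :k]
    edges = set()
    for i in range(n_cells):
        for j in neighbours[i]:
            j = int(j)
            edges.add((min(i, j), max(i, j)))
    return sorted(edges)


def format_edge_list(edges):
    """Render edges as text with one 0-based "i,j" pair per line."""
    return "".join(f"{i},{j}\n" for i, j in edges)


# Toy example: four cells measured on a single gene.
toy_X = [[0.0], [1.0], [2.5], [10.0]]
toy_edges = knn_edge_list(toy_X, k=1)
print(format_edge_list(toy_edges), end="")
```

Writing the returned string to a file gives an edge list in the format described for input 2. Storing each undirected edge only once is a choice made for this sketch.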

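The outputs include the pseudoinverse of the knn graph Laplacian (Linv.npy) and the potential (V.npy). The sketch below illustrates one way these two quantities can be related, under assumptions that are not stated in this document: an unnormalized Laplacian (degree matrix minus adjacency matrix), and a potential obtained by applying the pseudoinverse to the source/sink vector R. The function names are hypothetical and the actual PBA scripts may use a different Laplacian or sign convention.

```python
import numpy as np


def graph_laplacian(edges, n_cells):
    """Unnormalized Laplacian (degree minus adjacency) of an undirected graph."""
    A = np.zeros((n_cells, n_cells))
    for i, j in edges:
        A[i, j] = 1.0
        A[j, i] = 1.0
    return np.diag(A.sum(axis=1)) - A


def potential_from_sources(edges, R):
    """Assumed relation for illustration: V = pinv(L) @ R."""
    R = np.asarray(R, dtype=float)
    L = graph_laplacian(edges, R.shape[0])
    # The Laplacian is singular (constant vectors are in its null space),
    # so the Moore-Penrose pseudoinverse is used instead of a plain inverse.
    Linv = np.linalg.pinv(L)
    V = Linv @ R
    return V, Linv


# Toy chain of three states: a source at one end, a sink at the other.
toy_edges = [(0, 1), (1, 2)]
toy_R = [1.0, 0.0, -1.0]
toy_V, toy_Linv = potential_from_sources(toy_edges, toy_R)
print(toy_V)
```

In this toy example the entries of R sum to zero and the graph is connected, so the result satisfies L V = R exactly. The pseudoinverse also selects the solution whose entries sum to zero.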