Metacell Utilities
src.metacell_utils.MetaCell(original_df, params, x_col, y_col, cell_type_col, original_idx_col, metacell_idx_col, original_delaunay, metacell_df, metacell_delaunay)
dataclass
Container for metacell collapse results + reproducibility metadata.
Key conventions:
- original_delaunay_* triangles refer to vertices in the original input, identified by original ID (values from original_idx_col).
- metacell_delaunay triangles refer to vertices in the returned metacell_df and use metacell_idx_col (typically 0..n_metacells-1).
- metacell_df["members"] stores a list of original IDs (values from original_idx_col) that were merged into each metacell.
Functions
metacell_members(metacell_idx)
Return list of original IDs that form this metacell.
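The members convention can be illustrated with a minimal sketch. It uses plain dicts as a stand-in for the rows of metacell_df; the helper name members_of, the sample IDs, and the KeyError on an unknown ID are illustrative assumptions, not the library's implementation.

```python
# Hypothetical stand-in for metacell_df: one dict per metacell row.
metacell_rows = [
    {"metacell_id": 0, "size": 3, "members": [101, 102, 105]},
    {"metacell_id": 1, "size": 1, "members": [103]},
]

def members_of(rows, metacell_idx, idx_key="metacell_id"):
    """Return the original IDs merged into the metacell with the given ID."""
    for row in rows:
        if row[idx_key] == metacell_idx:
            return list(row["members"])  # copy, so callers cannot mutate the row
    # Assumption: an unknown metacell ID is reported as a KeyError.
    raise KeyError(f"unknown metacell id: {metacell_idx}")

print(members_of(metacell_rows, 0))  # [101, 102, 105]
```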
original_delaunay_to_row_indices(triangles=None, *, on_missing='drop')
Convert original-ID-space triangles to row indices (0..n_original-1).
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| triangles | ndarray | Triangle array in original-ID space. If None, uses self.original_delaunay. | None |
| on_missing | ('drop', 'error') | What to do if a triangle references an original ID not present in original_df. "drop": drop those triangles; "error": raise KeyError. | "drop" |
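The conversion from original-ID space to row indices, including both on_missing modes, might look roughly like the sketch below. It works on plain lists rather than a DataFrame, and the function name ids_to_row_indices is hypothetical; the actual method may differ in detail.

```python
def ids_to_row_indices(triangles, original_ids, on_missing="drop"):
    """Map triangles of original IDs to triangles of row positions.

    triangles    : iterable of 3-tuples of original IDs
    original_ids : IDs in row order (stand-in for original_df[original_idx_col])
    """
    if on_missing not in ("drop", "error"):
        raise ValueError("on_missing must be 'drop' or 'error'")
    # Position of each original ID in the row order.
    row_of = {cell_id: row for row, cell_id in enumerate(original_ids)}
    converted = []
    for tri in triangles:
        missing = [v for v in tri if v not in row_of]
        if missing:
            if on_missing == "error":
                raise KeyError(f"IDs not present in original_df: {missing}")
            continue  # "drop": skip triangles with an unknown vertex
        converted.append(tuple(row_of[v] for v in tri))
    return converted

ids = [10, 20, 30, 40]
tris = [(10, 20, 30), (20, 30, 99)]   # 99 is not a known ID
print(ids_to_row_indices(tris, ids))  # [(0, 1, 2)]
```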
original_delaunay_to_xy(triangles=None, *, on_missing='drop')
Convert original-ID-space triangles to their X/Y coordinates.
Returns:
| Type | Description |
|---|---|
| ndarray | Shape (n_triangles, 3, 2), where the last dimension is (x, y). |
metacell_delaunay_to_xy()
Convert metacell triangles (row-index space) to their X/Y coordinates.
Returns:
| Type | Description |
|---|---|
| ndarray | Shape (n_triangles, 3, 2), where the last dimension is (x, y). |
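To make the (n_triangles, 3, 2) layout concrete, here is a small sketch that builds the same shape as nested lists from row-index triangles. The helper triangles_to_xy and the coordinate lists are assumptions for illustration; the documented method returns an ndarray.

```python
def triangles_to_xy(triangles, xs, ys):
    """Return nested lists shaped (n_triangles, 3, 2); the last axis is (x, y)."""
    return [[[xs[v], ys[v]] for v in tri] for tri in triangles]

# Four points on a unit square, indexed by row position.
xs = [0.0, 1.0, 0.0, 1.0]
ys = [0.0, 0.0, 1.0, 1.0]
tris = [(0, 1, 2), (1, 3, 2)]

xy = triangles_to_xy(tris, xs, ys)
print(len(xy), len(xy[0]), len(xy[0][0]))  # 2 3 2
print(xy[1])  # [[1.0, 0.0], [1.0, 1.0], [0.0, 1.0]]
```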
to_summary_dict()
Return a compact, mostly JSON-serializable summary (the full dataframes are not embedded).
src.metacell_utils.greedy_triangle_collapse(aligned_df, max_metacell_size=3, max_iterations=1000, r_max=None, min_angle_deg=10, use_alpha_shape=False, alpha=0.05, *, original_idx_col='Cell_Num_Old', metacell_idx_col='metacell_id', x_col='X', y_col='Y', cell_type_col='cell_type', return_object=False)
Iteratively collapse same-type triangles into metacells.
This function simplifies a spatial graph by merging cells that form homogeneous triangles (all vertices same cell type). The result is a coarser graph with metacells that preserve boundary structure while reducing the number of nodes.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| aligned_df | DataFrame | Input dataframe with spatial cells. Required columns: x_col, y_col, cell_type_col, original_idx_col. Optional numeric columns are averaged (e.g., cell type proportions). Note: ID columns (Cell_Num, Cell_Num_Old, etc.) are NOT averaged; metacells get new sequential IDs. | required |
| max_metacell_size | int | Maximum number of original cells in a metacell. | 3 |
| max_iterations | int | Maximum number of collapse iterations. | 1000 |
| r_max | float | Maximum edge length; triangles with any edge > r_max are removed. | None |
| min_angle_deg | float | Minimum angle in degrees; triangles with a smaller angle are treated as degenerate. | 10 |
| use_alpha_shape | bool | If True, keep only triangles that lie within the alpha shape. | False |
| alpha | float | Alpha parameter for the alpha shape (smaller = tighter boundary). Only used if use_alpha_shape=True. | 0.05 |
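The r_max and min_angle_deg filters can be pictured with the following sketch, which checks one triangle at a time. The helper names (edge_lengths, min_angle_of, keep_triangle) are hypothetical, and treating an angle exactly equal to the threshold as acceptable is an assumption; the library's filtering may be implemented differently.

```python
import math

def edge_lengths(p0, p1, p2):
    """Lengths of the three edges of a triangle given as (x, y) points."""
    return (math.dist(p0, p1), math.dist(p1, p2), math.dist(p2, p0))

def min_angle_of(p0, p1, p2):
    """Smallest interior angle in degrees (0.0 if any edge has zero length)."""
    a, b, c = edge_lengths(p0, p1, p2)
    if min(a, b, c) == 0.0:
        return 0.0
    angles = []
    # Law of cosines: the angle opposite `opp` lies between sides s1 and s2.
    for opp, s1, s2 in ((a, b, c), (b, c, a), (c, a, b)):
        cos_val = (s1 * s1 + s2 * s2 - opp * opp) / (2.0 * s1 * s2)
        cos_val = max(-1.0, min(1.0, cos_val))  # guard against rounding
        angles.append(math.degrees(math.acos(cos_val)))
    return min(angles)

def keep_triangle(points, tri, r_max=None, min_angle_deg=10.0):
    """True if the triangle passes the edge-length and angle filters."""
    p0, p1, p2 = (points[v] for v in tri)
    if r_max is not None and max(edge_lengths(p0, p1, p2)) > r_max:
        return False  # some edge is longer than r_max
    # Assumption: an angle exactly at the threshold is still accepted.
    return min_angle_of(p0, p1, p2) >= min_angle_deg

points = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (10.0, 0.1)]
print(keep_triangle(points, (0, 1, 2), r_max=2.0))  # True
print(keep_triangle(points, (0, 1, 2), r_max=1.2))  # False (hypotenuse too long)
print(keep_triangle(points, (0, 1, 3)))             # False (thin sliver)
```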
Returns:
| Type | Description |
|---|---|
| tuple (pd.DataFrame, np.ndarray) | If return_object is False (default): metacell_df, a simplified graph where each row is a metacell, with columns x_col, y_col, cell_type_col, size, members (list of original IDs), metacell_idx_col (new sequential IDs), plus averaged numeric columns; and metacell_delaunay, the filtered Delaunay triangulation on metacells, shape (n_triangles, 3). |
| MetaCell | If return_object is True: object containing original_df, original_delaunay (filtered), metacell_df, metacell_delaunay, and all parameters used. |
Examples:
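As a rough illustration of a single collapse step (not the library's implementation), the sketch below finds the first triangle whose vertices all share a cell type and merges them into one metacell. The dict-based cell representation, the helper name collapse_one_triangle, and the size-weighted centroid used for the merged position are assumptions.

```python
def collapse_one_triangle(cells, triangles, max_metacell_size=3):
    """Merge the first homogeneous triangle; return (metacell, triangle) or None.

    cells     : dict id -> {"x", "y", "cell_type", "members"}
    triangles : iterable of 3-tuples of ids present in `cells`
    """
    for tri in triangles:
        parts = [cells[v] for v in tri]
        same_type = len({p["cell_type"] for p in parts}) == 1
        members = [m for p in parts for m in p["members"]]
        if same_type and len(members) <= max_metacell_size:
            n = len(members)
            # Assumption: merged position is the size-weighted centroid.
            x = sum(p["x"] * len(p["members"]) for p in parts) / n
            y = sum(p["y"] * len(p["members"]) for p in parts) / n
            merged = {"x": x, "y": y, "cell_type": parts[0]["cell_type"],
                      "members": members, "size": n}
            return merged, tri
    return None  # nothing left to collapse

cells = {
    0: {"x": 0.0, "y": 0.0, "cell_type": "A", "members": [0]},
    1: {"x": 3.0, "y": 0.0, "cell_type": "A", "members": [1]},
    2: {"x": 0.0, "y": 3.0, "cell_type": "A", "members": [2]},
    3: {"x": 3.0, "y": 3.0, "cell_type": "B", "members": [3]},
}
triangles = [(1, 3, 2), (0, 1, 2)]  # the first triangle mixes types A and B

merged, tri = collapse_one_triangle(cells, triangles)
print(tri, merged["members"], merged["x"], merged["y"])
# (0, 1, 2) [0, 1, 2] 1.0 1.0
```

Repeating such a step until no homogeneous triangle fits within max_metacell_size, or until max_iterations is reached, gives the general shape of a greedy collapse loop.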
src.metacell_utils.unpack_metacell_matches(metacell_matches, metacell_aligned_df, metacell_ref_df, aligned_df=None, ref_df=None, strategy='distribute', aligned_original_idx_col=None, ref_original_idx_col=None, x_col='X', y_col='Y')
Unpack metacell-level matches to individual cell matches.
Handles two cases:
1. Only aligned has metacells: ref_df is individual cells (simple unpacking).
2. Both have metacells: both sides need unpacking using nearest neighbor.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| metacell_matches | DataFrame | Matches at metacell level (output from run_same on metacells). Must have columns: aligned_idx, ref_idx. | required |
| metacell_aligned_df | DataFrame | Aligned metacell dataframe (output from greedy_triangle_collapse). Must have column: members (list of original cell indices). | required |
| metacell_ref_df | DataFrame | Reference dataframe. Either individual cells (no 'members' column): simple case; or metacells (has 'members' column): requires unpacking both sides. | required |
| aligned_df | DataFrame | Original aligned cells with X, Y coordinates. Required if strategy='nearest' or if ref has metacells. | None |
| ref_df | DataFrame | Original reference cells with X, Y coordinates. Required if metacell_ref_df has metacells. | None |
| aligned_original_idx_col | str | If provided, interpret members in metacell_aligned_df as values from this column and use aligned_df.set_index(aligned_original_idx_col) to look up coordinates. If not provided, members are assumed to be valid aligned_df index values (legacy behavior). | None |
| ref_original_idx_col | str | Analogous to aligned_original_idx_col, for ref_df lookups when ref has metacells. | None |
| x_col | str | X-coordinate column name in aligned_df/ref_df. | 'X' |
| y_col | str | Y-coordinate column name in aligned_df/ref_df. | 'Y' |
| strategy | str | How to distribute matches. 'distribute': all aligned members → same ref (only valid if ref is individual cells); 'nearest': each aligned member → nearest ref member (required if both are metacells). | 'distribute' |
Returns:
| Name | Type | Description |
|---|---|---|
| individual_matches | DataFrame | Matches at individual cell level. Columns: aligned_idx, ref_idx. |
Examples:
Case 1: Only aligned has metacells
>>> metacell_aligned, _ = greedy_triangle_collapse(aligned_df)
>>> metacell_matches, _ = run_same(metacell_aligned, ref_df, ...)
>>> individual_matches = unpack_metacell_matches(
... metacell_matches, metacell_aligned, ref_df
... )
Case 2: Both have metacells
>>> metacell_aligned, _ = greedy_triangle_collapse(aligned_df)
>>> metacell_ref, _ = greedy_triangle_collapse(ref_df)
>>> metacell_matches, _ = run_same(metacell_aligned, metacell_ref, ...)
>>> individual_matches = unpack_metacell_matches(
... metacell_matches, metacell_aligned, metacell_ref,
... aligned_df=aligned_df, ref_df=ref_df, strategy='nearest'
... )
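
The difference between the 'distribute' and 'nearest' strategies can be sketched without the library, using plain dicts and tuples. The helper names unpack_distribute and unpack_nearest, the data layout, and the use of Euclidean distance for 'nearest' are assumptions made for illustration only.

```python
import math

def unpack_distribute(metacell_matches, aligned_members):
    """'distribute': every member of an aligned metacell maps to the same ref cell."""
    return [
        (member, ref_idx)
        for aligned_idx, ref_idx in metacell_matches
        for member in aligned_members[aligned_idx]
    ]

def unpack_nearest(metacell_matches, aligned_members, ref_members,
                   aligned_xy, ref_xy):
    """'nearest': each aligned member maps to the closest member of the
    matched ref metacell (Euclidean distance, an assumption)."""
    pairs = []
    for aligned_idx, ref_idx in metacell_matches:
        candidates = ref_members[ref_idx]
        for member in aligned_members[aligned_idx]:
            nearest = min(
                candidates,
                key=lambda r: math.dist(aligned_xy[member], ref_xy[r]),
            )
            pairs.append((member, nearest))
    return pairs

# One aligned metacell (id 0) built from original cells 10 and 11.
aligned_members = {0: [10, 11]}
aligned_xy = {10: (0.0, 0.0), 11: (5.0, 0.0)}

# Case 1: ref is individual cells; metacell 0 matched ref cell 7.
print(unpack_distribute([(0, 7)], aligned_members))
# [(10, 7), (11, 7)]

# Case 2: ref is also a metacell (id 0) built from ref cells 20 and 21.
ref_members = {0: [20, 21]}
ref_xy = {20: (0.0, 1.0), 21: (5.0, 1.0)}
print(unpack_nearest([(0, 0)], aligned_members, ref_members, aligned_xy, ref_xy))
# [(10, 20), (11, 21)]
```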