Identify group splits and merges from multi-individual trajectory data
identify_splits_and_merges.Rd
Detects splits and merges (a.k.a. fissions and fusions) using the "sticky-DBSCAN" method from Libera et al. 2023.
Arguments
- xs
UTM eastings matrix (n_inds x n_times matrix where xs[i,t] gives the easting of individual i at time step t)
- ys
UTM northings matrix (n_inds x n_times matrix where ys[i,t] gives the northing of individual i at time step t)
- timestamps
vector of timestamps (POSIXct), must have the same length as the number of columns of the xs and ys matrices
- R_inner
inner distance threshold to identify periods of connectedness (numeric)
- R_outer
outer distance threshold to identify periods of connectedness (numeric)
- breaks
indexes to breaks in the data (default NULL treats data as a contiguous sequence). If specified, overrides break_by_day
- names
optional vector of names (if NULL, will be defined as as.character(1:n_inds))
- break_by_day
whether to break up data by date (T or F)
- verbose
whether to print out statements as code progresses
Value
a list containing:
events_detected: data frame with info on detected fissions and fusions, and limited info for shuffles
all_events_info: list of information about all fission-fusion (or shuffle) events
groups_list: list of subgroups in each timestep
together: n_inds x n_inds x n_times array of whether dyads are connected (1) or not (0) or unknown (NA)
R_inner: inner radius used in the computations (same as the R_inner argument above)
R_outer: outer radius used in the computations (same as the R_outer argument above)
Details
Start by defining an adjacency matrix (together in the code) of which dyads are "connected" at any moment in time. Dyads are considered to be connected if they come within a distance R_inner of one another, and they continue to be connected until they move more than R_outer apart on both ends (before and after) of the period where their distance dropped below R_inner. This double threshold makes periods of connectedness more stable by removing the "flicker" that would result from having a single threshold.
NAs are handled by ending a period of connectedness if an individual has an NA at the point immediately before / after (if they were connected after / before that NA). Individuals with NAs are not included in the together matrix and will not be included in the groups.
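The double-threshold rule and its NA handling can be sketched for a single dyad as follows. This is a minimal illustration, not the package's code: the helper name dyad_connected is hypothetical, the exact boundary handling (strictly below R_inner, at most R_outer) is an assumption, and time steps with NA are returned as FALSE here rather than NA.

```r
# Hypothetical helper (not part of the package): connectedness of ONE dyad
# from its distance time series d, using two thresholds.
dyad_connected <- function(d, R_inner, R_outer) {
  stopifnot(R_inner <= R_outer, length(d) >= 1)
  # Time steps where the dyad is known to be no farther apart than R_outer
  within_outer <- !is.na(d) & d <= R_outer
  # Time steps where the dyad is closer than R_inner (these seed a connection)
  seeded <- !is.na(d) & d < R_inner
  # Label maximal runs of consecutive equal values of within_outer
  run_id <- cumsum(c(TRUE, diff(within_outer) != 0))
  # A run spent within R_outer counts as connected only if it contains a seed
  within_outer & run_id %in% unique(run_id[seeded])
}

# Made-up distances for illustration (same units as R_inner and R_outer)
d <- c(12, 8, 4, 7, 9, 11, 8, 7, NA, 3)
connected <- dyad_connected(d, R_inner = 5, R_outer = 10)
```

In this made-up series the dyad is connected at steps 2 to 5 (the stretch within R_outer around the dip below R_inner at step 3) and at step 10. Steps 7 and 8 stay within R_outer but never drop below R_inner, so they are not connected, and the NA at step 9 cuts off any run that would otherwise span it.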
Once connectedness of dyads is determined, merge dyads together into groups by using DBSCAN on 1 - together as the distance matrix, with eps equal to something small (.01 in the code). Store these groups in groups_list, a list of lists whose length is equal to n_times.
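To see what this grouping step produces, here is a base-R stand-in rather than a DBSCAN call. It rests on an assumption the text does not state: that DBSCAN is run with a minimum cluster size of 1. Under that assumption, clustering a 0/1 distance matrix with eps below 1 is the same as taking the connected components of the together graph. The helper name groups_from_together is hypothetical, and the input is assumed to contain no NAs.

```r
# Hypothetical helper (not part of the package): groups at one time step
# from a symmetric 0/1 adjacency matrix, via breadth-first search.
groups_from_together <- function(adj) {
  n <- nrow(adj)
  label <- rep(NA_integer_, n)
  next_label <- 0L
  for (start in seq_len(n)) {
    if (!is.na(label[start])) next       # already assigned to a group
    next_label <- next_label + 1L
    label[start] <- next_label
    queue <- start
    while (length(queue) > 0) {
      i <- queue[1]
      queue <- queue[-1]
      # Unlabelled individuals directly connected to i join the same group
      nbrs <- which(adj[i, ] == 1 & is.na(label))
      label[nbrs] <- next_label
      queue <- c(queue, nbrs)
    }
  }
  unname(split(seq_len(n), label))
}

# Made-up example: dyads 1-2, 2-3 and 4-5 are connected
adj <- diag(5)
adj[1, 2] <- adj[2, 1] <- 1
adj[2, 3] <- adj[3, 2] <- 1
adj[4, 5] <- adj[5, 4] <- 1
groups <- groups_from_together(adj)
```

Individuals 1 and 3 are not directly connected in this example, but both are connected to 2, so all three end up in one group. This chaining is what the density-based merge of dyads gives.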
Stepping through the groups_list, identify all changes in group membership, i.e. consecutive time points when the groups do not match. The algorithm flags any change, including instances where individuals disappear or reappear in groups due to missing data (but these are later ignored). Store them in the changes data frame.
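One way to detect such changes, sketched under the assumption that group order and member order within a group do not matter, is to reduce each time step's partition to a canonical string and compare consecutive strings. The helper partition_key and the single-column layout of changes are illustrative only; the package's changes data frame may hold different columns.

```r
# Hypothetical helper (not part of the package): order-independent key
# for the list of groups at one time step.
partition_key <- function(groups) {
  keys <- vapply(groups,
                 function(g) paste(sort(g), collapse = ","),
                 character(1))
  paste(sort(keys), collapse = "|")
}

# Made-up groups_list with four time steps
groups_list <- list(
  list(c(1, 2, 3), c(4, 5)),
  list(c(4, 5), c(3, 1, 2)),      # same groups, listed in a different order
  list(c(1, 2), c(3), c(4, 5)),   # individual 3 leaves the first group
  list(c(1, 2), c(3, 4, 5))       # individual 3 joins the other group
)

keys <- vapply(groups_list, partition_key, character(1))
# tidx is the first time index of each pair of mismatching consecutive steps
changes <- data.frame(tidx = which(keys[-1] != keys[-length(keys)]))
```

Here changes$tidx contains 2 and 3: nothing changes between steps 1 and 2 despite the reordering, while the groups differ between steps 2 and 3 and again between steps 3 and 4.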
In a last step, identify all splits ('fission'), merges ('fusion'), and things that cannot be classified as either fissions or fusions because they contain elements of both ('shuffle'). This is done by constructing a bipartite network at each time step t, where groups at time t are connected to groups at time t + 1 if they share at least 1 member. Then, we identify the connected components of this bipartite network. Components where a single group (node) at time t is connected to multiple groups (nodes) at time t + 1 get identified and classified as event_type = 'fission'. Components where multiple nodes at time t are connected to a single node at time t + 1 are classified as event_type = 'fusion'. Components where a single node at time t is connected to a single node at time t + 1 are skipped (they are not fissions, fusions, or shuffles). All other events where more complex things happen are classified as event_type = 'shuffle'.
After events are identified, various event features are computed and saved in a data frame. See list of outputs for more details.
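The bipartite classification described above can be illustrated with a short base-R sketch for one pair of consecutive time steps. The function name classify_transition and its output columns are hypothetical, and one choice is an assumption of this sketch: components with no group on one side (a whole group appearing or disappearing, e.g. through missing data) are skipped.

```r
# Hypothetical helper (not part of the package): classify the events between
# the groups at time t (groups_before) and at time t + 1 (groups_after).
classify_transition <- function(groups_before, groups_after) {
  nb <- length(groups_before)
  n_aft <- length(groups_after)
  # share[i, j]: before-group i and after-group j have at least one common member
  share <- matrix(FALSE, nb, n_aft)
  for (i in seq_len(nb)) {
    for (j in seq_len(n_aft)) {
      share[i, j] <- length(intersect(groups_before[[i]], groups_after[[j]])) > 0
    }
  }
  # Connected components by propagating the minimum label along edges.
  # Nodes 1..nb are before-groups, nodes nb + 1..nb + n_aft are after-groups.
  comp <- seq_len(nb + n_aft)
  repeat {
    old <- comp
    for (i in seq_len(nb)) {
      for (j in seq_len(n_aft)) {
        if (share[i, j]) {
          m <- min(comp[i], comp[nb + j])
          comp[i] <- m
          comp[nb + j] <- m
        }
      }
    }
    if (identical(comp, old)) break
  }
  types <- character(0)
  n_before <- integer(0)
  n_after <- integer(0)
  for (cid in unique(comp)) {
    b <- which(comp[seq_len(nb)] == cid)
    a <- which(comp[nb + seq_len(n_aft)] == cid)
    # Assumption: a group that appears or disappears entirely is skipped
    if (length(b) == 0 || length(a) == 0) next
    # One-to-one components are unchanged groups, not events
    if (length(b) == 1 && length(a) == 1) next
    if (length(b) == 1) {
      type <- "fission"
    } else if (length(a) == 1) {
      type <- "fusion"
    } else {
      type <- "shuffle"
    }
    types <- c(types, type)
    n_before <- c(n_before, length(b))
    n_after <- c(n_after, length(a))
  }
  data.frame(event_type = types, n_groups_before = n_before,
             n_groups_after = n_after, stringsAsFactors = FALSE)
}

# Made-up example: one fission, one fusion, one unchanged group
before <- list(c(1, 2, 3), c(4, 5), c(6, 7), c(8, 9))
after  <- list(c(1, 2), c(3), c(4, 5, 6, 7), c(8, 9))
ev <- classify_transition(before, after)
```

In this example the group 1, 2, 3 splits into two groups (a fission), the groups 4, 5 and 6, 7 merge (a fusion), and the group 8, 9 is unchanged and therefore produces no row.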
Additional information about returned objects
events_detected data frame:
events_detected$event_idx: unique id number of the event
events_detected$tidx: (initial) time index of the event
events_detected$event_type: "fission" or "fusion" or "shuffle"
events_detected$n_groups_before: number of groups prior to the event
events_detected$n_groups_after: number of groups after the event
events_detected$big_group_idxs: indexes of all the individuals involved in the event
events_detected$big_group: names of all the individuals involved in the event
events_detected$group_A_idxs, $group_B_idxs, $group_C_idxs, etc.: individual idxs of subgroup members
events_detected$group_A, $group_B, $group_C, etc.: names of subgroup members
events_detected$n_A, $n_B, $n_C, etc.: number of individuals in each subgroup
events_detected$n_big_group: number of individuals in the big group (original group for fissions, subsequent group for fusions)
(NOTE: big_group_idxs, big_group, group_A_idxs etc., group_A etc., n_A etc. and n_big_group are set to NA for shuffles. You can get more detailed info for shuffles from the all_events_info object.)
all_events_info list:
all_events_info[[i]] contains the following info for event i:
all_events_info[[i]]$t: time index of the event
all_events_info[[i]]$groups_before: (list of lists) list of groups before the event (at time t)
all_events_info[[i]]$groups_after: (list of lists) list of groups after the event (at time t + 1)
all_events_info[[i]]$event_type: 'fission', 'fusion', or 'shuffle' (character string)
all_events_info[[i]]$n_groups_before: number of groups before the event
all_events_info[[i]]$n_groups_after: number of groups after the event
groups_list list:
groups_list[[t]] gives a list of the subgroups at time step t
groups_list[[t]][[1]] gives the vector of the first subgroup, etc.