Workload Layer: the user defines the target DNN models, the target parallelization strategies, and the training loop → essentially where the real work is described
System Layer: implements the collective communication algorithms, schedules compute/communication operations, and manages compute-communication overlap → essentially where the scheduling of operations is done
Network API: communication times are computed using analytical models or event-driven network simulators (e.g., NS-3).
Within this code, the following init() is called. EventType::StreamInit is passed to the run() method of the collective communication algorithm chosen for this simulation instance (e.g., HalvingDoubling).
void StreamBaseline::init() {
  initialized = true;
  last_init = Sys::boostedTick();
  if (!my_current_phase.enabled) {
    return;
  }
  // Kick off this phase's collective algorithm (e.g., HalvingDoubling).
  my_current_phase.algorithm->run(EventType::StreamInit, nullptr);
  if (steps_finished == 1) {
    // First phase: also record the wait between creation and the first phase change.
    queuing_delay.push_back(last_phase_change - creation_time);
  }
  // Record the wait since the last phase change.
  queuing_delay.push_back(Sys::boostedTick() - last_phase_change);
  total_packets_sent = 1;
}
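The queuing-delay bookkeeping in init() can be isolated into a small runnable sketch. StreamSketch below is a hypothetical stand-in for the stream's state (the real class carries much more); it only reproduces the two push_back calls above, with the current tick passed in explicitly instead of read from Sys::boostedTick().

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical, minimal model of the stream's delay bookkeeping.
struct StreamSketch {
    uint64_t creation_time = 0;
    uint64_t last_phase_change = 0;
    int steps_finished = 0;
    std::vector<uint64_t> queuing_delay;

    void init(uint64_t now) {
        if (steps_finished == 1)
            // First phase: wait between creation and the first phase change.
            queuing_delay.push_back(last_phase_change - creation_time);
        // Every phase: wait since the last phase change.
        queuing_delay.push_back(now - last_phase_change);
    }
};
```

For example, with creation_time = 0, last_phase_change = 100, and steps_finished = 1, calling init(250) records the delays {100, 150}: 100 ticks spent waiting before the first phase change and 150 ticks since it.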
After initialization with EventType::StreamInit, the scheduler calls run() again with EventType::General, which prepares the stream->owner->front_end_sim_send() and stream->owner->front_end_sim_recv() functions. The following code within HalvingDoubling starts the calls to sim_send and sim_recv:
void HalvingDoubling::run(EventType event, CallData* data) {
  if (event == EventType::General) {
    // A send slot has freed up; try to make progress.
    free_packets += 1;
    ready();
    iteratable();
  } else if (event == EventType::PacketReceived) {
    total_packets_received++;
    insert_packet(nullptr);
  } else if (event == EventType::StreamInit) {
    // Seed the first round with parallel_reduce packets.
    for (int i = 0; i < parallel_reduce; i++) {
      insert_packet(nullptr);
    }
  }
}
A good resource: https://docs.google.com/document/d/14T4fAQe4d9dPq7dZEoEQ_dF6kSq0FaGlFlZLZdSSfx0/