chakra.pdf
Questions:
- In Chakra, the simulators seem to be expected to understand the "Chakra Schema" in order to consume the ET properly as input (e.g., the ASTRA-v2.0 example). Instead of relying on simulators to adopt a particular schema, our project aims to build an automated system that outputs ETs compatible with all simulators. Beyond that, the general idea of extracting operator information for simulation/projection and ML modeling is similar. Would this difference be enough for a strong paper? The difference seems to lie mainly in the implementation of the system.
A Chakra converter exists (converts traces from PyTorch/TensorFlow/FlexFlow into Chakra ETs)
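As a rough illustration of what such a converter does, here is a minimal sketch of normalizing one framework-specific trace event into a common node format. The event fields and the output node layout are hypothetical, not the actual Chakra protobuf schema.

```python
# Minimal sketch of a framework-trace -> common-ET conversion step.
# The input event fields and the output node format are hypothetical,
# not the real Chakra protobuf schema.

def convert_event(event: dict, next_id: int) -> dict:
    """Normalize one framework-specific trace event into a common node."""
    return {
        "id": next_id,
        "name": event["name"],                  # operator name only, no weights
        "type": "COMM" if event.get("is_comm") else "COMP",
        "dims": event.get("shape", []),         # tensor dimensions
        "deps": event.get("parents", []),       # dependencies on earlier nodes
    }

# Example: a PyTorch-style GEMM event (field names are made up)
pt_event = {"name": "aten::mm", "shape": [1024, 1024], "parents": [0]}
node = convert_event(pt_event, next_id=1)
print(node["type"], node["dims"])  # COMP [1024, 1024]
```

The point is that only operator identity, dimensions, and dependencies survive the conversion; parameters and data never enter the trace.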
Problem:
- Companies cannot fully disclose AI model details due to intellectual property and proprietary technology
- Hard to derive/reproduce exact workload details
- HW companies tend to over-optimize for a few use cases based on parameters derived from MLPerf or other benchmarks
Goal: Provide infrastructure for HW-SW co-design for distributed ML systems
Chakra Overview:

Execution Traces: encode the critical information about compute and communication operators (dimensions and dependencies) without revealing model or dataset details
- Facilitate the exchange of ML Execution Traces without exposing model specifics
- Capture memory access patterns, computational load, network communication, and parallelization strategies
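To make the idea concrete, here is a toy execution trace as a dependency DAG of compute/communication ops. The field names are illustrative only; the real Chakra ET is a protobuf schema with richer node attributes.

```python
# Toy execution trace: a DAG of compute/communication ops with
# dimensions and dependencies, but no weights or dataset contents.
# Field names are illustrative, not the actual Chakra schema.

trace = [
    {"id": 0, "name": "embedding_lookup", "type": "COMP", "dims": [4096, 512], "deps": []},
    {"id": 1, "name": "gemm",             "type": "COMP", "dims": [512, 512],  "deps": [0]},
    {"id": 2, "name": "all_reduce",       "type": "COMM", "bytes": 512 * 512 * 2, "deps": [1]},
]

def topo_order(trace):
    """Return node ids in dependency order (Kahn's algorithm)."""
    remaining = {n["id"]: set(n["deps"]) for n in trace}
    order = []
    while remaining:
        ready = [i for i, deps in remaining.items() if not deps]
        for i in ready:
            order.append(i)
            del remaining[i]
        for deps in remaining.values():
            deps.difference_update(ready)
    return order

print(topo_order(trace))  # [0, 1, 2]
```

A simulator only needs this dependency structure plus per-op cost information (dims, bytes) to replay the workload; nothing in the trace reveals the model's parameters or training data.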