International Journal of Advanced Computer Science and Applications (IJACSA), Volume 17 Issue 5, 2026.
Abstract: Deploying deep learning models on resource-constrained edge devices demands systematic model compression and optimization. PyTorch supplies low-level quantization and pruning primitives, yet a typical quantization-aware training workflow still requires approximately 40 to 60 lines of observer wiring, calibration, and conversion boilerplate, which slows iteration for researchers and students. We present Malak, an open-source Python toolkit that wraps these primitives behind a small task-oriented application programming interface exposed through both a Python module and an edgeai command-line interface. The toolkit covers the full prototyping loop: model training; post-training quantization (dynamic and static) and quantization-aware training; magnitude and structured pruning; knowledge distillation; export to the Open Neural Network Exchange format; per-layer latency profiling; and a Kullback–Leibler-divergence drift check for deployed models. We empirically validate the compression subset (post-training quantization, quantization-aware training, pruning, and knowledge distillation) on the CIFAR-10 and Fashion-MNIST image-classification benchmarks with five architectures: MobileNetV2, ResNet18, ResNet50, EfficientNet-B0, and a custom convolutional network we refer to as SimpleCNN. Across three random seeds, quantization-aware training yields a 3.48× reduction in model size on MobileNetV2 with accuracy statistically indistinguishable from the 32-bit floating-point baseline (77.91±0.83 per cent versus 77.68±0.92 per cent). Dynamic post-training quantization preserves accuracy within 0.13 per cent across all tested architectures, and magnitude pruning at 50 per cent sparsity holds within roughly one percentage point of the baseline after three fine-tuning epochs.
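To make the idea of magnitude pruning at 50 per cent sparsity concrete, the following is a minimal sketch in plain Python. It is not Malak's API and not the authors' implementation; the function name `magnitude_prune` and the example weights are hypothetical, chosen only for illustration. It zeroes the fraction of weights with the smallest absolute values, which is the general principle behind unstructured magnitude pruning.

```python
def magnitude_prune(weights, sparsity):
    """Zero out the smallest-magnitude fraction of a flat list of weights.

    Hypothetical helper for illustration only; real toolkits operate on
    tensors and usually keep a binary mask alongside the weights.
    """
    if not 0.0 <= sparsity <= 1.0:
        raise ValueError("sparsity must be in [0, 1]")
    n_prune = int(len(weights) * sparsity)
    # Rank weight indices by absolute value, smallest first.
    order = sorted(range(len(weights)), key=lambda i: abs(weights[i]))
    to_zero = set(order[:n_prune])
    return [0.0 if i in to_zero else w for i, w in enumerate(weights)]


# Made-up example weights, not values from the paper.
weights = [0.8, -0.05, 0.3, -0.9, 0.01, 0.2, -0.4, 0.07]
pruned = magnitude_prune(weights, 0.5)
print(pruned)
```

In practice the pruned model is then fine-tuned for a few epochs so the remaining weights can compensate for the removed ones, which corresponds to the fine-tuning step the abstract mentions.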
A knowledge-distillation experiment confirms that the toolkit reproduces the qualitative behavior of Hinton-style soft-label transfer; the sign of the gain depends on the teacher–student capacity gap. On-device latency measured on a deployment-class server processor (single-input inference, 200 measurements) shows a 2.16× wall-clock speedup for the statically quantized 8-bit-integer SimpleCNN over its 32-bit floating-point counterpart, and the same 8-bit-integer binary builds, fits within 445 kilobytes of on-chip memory, and runs end-to-end on a simulated STM32H7 (Arm Cortex-M7) microcontroller target under the Renode hardware simulator. The toolkit is released under the MIT license.
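The Hinton-style soft-label transfer mentioned above can be sketched for a single example as follows. This is a minimal illustration in plain Python under stated assumptions, not the toolkit's code: the function names, the temperature of 4.0, and the weighting `alpha` of 0.5 are hypothetical defaults chosen for the example. The loss combines a temperature-softened Kullback–Leibler term between teacher and student distributions (scaled by the squared temperature, as is conventional) with the ordinary cross-entropy on the hard label.

```python
import math


def softmax(logits, temperature=1.0):
    """Numerically stable softmax with an optional temperature."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)
    exps = [math.exp(z - peak) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(student_logits, teacher_logits, label,
                      temperature=4.0, alpha=0.5):
    """Soft-label distillation loss for one example (illustrative sketch).

    alpha weights the soft (teacher) term; 1 - alpha weights the hard-label
    cross-entropy. Both defaults are assumptions, not values from the paper.
    """
    p_teacher = softmax(teacher_logits, temperature)
    p_student = softmax(student_logits, temperature)
    # KL(teacher || student) on the temperature-softened distributions.
    kl = sum(pt * math.log(pt / ps)
             for pt, ps in zip(p_teacher, p_student) if pt > 0)
    # Standard cross-entropy against the ground-truth class at temperature 1.
    hard = -math.log(softmax(student_logits)[label])
    return alpha * temperature ** 2 * kl + (1 - alpha) * hard


# Made-up logits: one student that agrees with the teacher, one that does not.
teacher = [2.0, 1.0, 0.1]
agree = distillation_loss([2.0, 1.0, 0.1], teacher, label=0)
disagree = distillation_loss([0.1, 1.0, 2.0], teacher, label=0)
print(agree < disagree)
```

The soft term vanishes when the student matches the teacher exactly, so any remaining loss in that case comes from the hard-label term alone.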
Mohammed Hassan Alnemari. “Malak: A Python Toolkit for Edge AI Model Optimization”. International Journal of Advanced Computer Science and Applications (IJACSA) 17.5 (2026). http://dx.doi.org/10.14569/IJACSA.2026.0170575
@article{Alnemari2026,
title = {Malak: A Python Toolkit for Edge AI Model Optimization},
journal = {International Journal of Advanced Computer Science and Applications},
doi = {10.14569/IJACSA.2026.0170575},
url = {http://dx.doi.org/10.14569/IJACSA.2026.0170575},
year = {2026},
publisher = {The Science and Information Organization},
volume = {17},
number = {5},
author = {Mohammed Hassan Alnemari}
}
Copyright Statement: This is an open access article licensed under a Creative Commons Attribution 4.0 International License, which permits unrestricted use, distribution, and reproduction in any medium, even commercially, as long as the original work is properly cited.