The Science and Information (SAI) Organization
DOI: 10.14569/IJACSA.2026.0170575

Malak: A Python Toolkit for Edge AI Model Optimization

Author 1: Mohammed Hassan Alnemari

International Journal of Advanced Computer Science and Applications (IJACSA), Volume 17 Issue 5, 2026.


Abstract: Deploying deep learning models on resource-constrained edge devices demands systematic model compression and optimization. PyTorch supplies low-level quantization and pruning primitives, yet a typical quantization-aware training workflow still requires approximately 40 to 60 lines of observer wiring, calibration, and conversion boilerplate, which slows iteration for researchers and students. We present Malak, an open-source Python toolkit that wraps these primitives behind a small task-oriented application programming interface exposed through both a Python module and an edgeai command-line interface. The toolkit covers the full prototyping loop: model training; post-training quantization (dynamic and static) and quantization-aware training; magnitude and structured pruning; knowledge distillation; export to the Open Neural Network Exchange format; per-layer latency profiling; and a Kullback–Leibler-divergence drift check for deployed models. We empirically validate the compression subset (post-training quantization, quantization-aware training, pruning, and knowledge distillation) on the CIFAR-10 and Fashion-MNIST image-classification benchmarks with five architectures: MobileNetV2, ResNet18, ResNet50, EfficientNet-B0, and a custom convolutional network we refer to as SimpleCNN. Across three random seeds, quantization-aware training yields a 3.48× reduction in model size on MobileNetV2 with accuracy statistically indistinguishable from the 32-bit floating-point baseline (77.91±0.83 per cent versus 77.68±0.92 per cent). Dynamic post-training quantization preserves accuracy within 0.13 per cent across all tested architectures, and magnitude pruning at 50 per cent sparsity holds within roughly one percentage point of the baseline after three fine-tuning epochs.
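To make the 8-bit quantization idea concrete, the following is a minimal sketch of affine (scale and zero-point) quantization written with the Python standard library only. It is not Malak's API and not PyTorch's implementation: the function names `quantize_int8` and `dequantize`, the sample weights, and the byte-size constants are all hypothetical and chosen for illustration. The sketch shows that each weight moves from four bytes to one, which gives an ideal 4× reduction for the weights alone; the abstract reports a measured 3.48× for MobileNetV2.

```python
FP32_BYTES = 4  # storage per 32-bit floating-point weight
INT8_BYTES = 1  # storage per 8-bit integer weight


def quantize_int8(values):
    """Affine quantization of floats to signed 8-bit integers (illustrative)."""
    qmin, qmax = -128, 127
    # Extend the range to include 0.0 so that zero is exactly representable.
    lo = min(min(values), 0.0)
    hi = max(max(values), 0.0)
    # Fall back to a scale of 1.0 when every value is zero.
    scale = (hi - lo) / (qmax - qmin) or 1.0
    zero_point = round(qmin - lo / scale)
    quantized = [
        max(qmin, min(qmax, round(v / scale) + zero_point)) for v in values
    ]
    return quantized, scale, zero_point


def dequantize(quantized, scale, zero_point):
    """Map 8-bit integers back to approximate float values."""
    return [(q - zero_point) * scale for q in quantized]


# Hypothetical weights, used only to exercise the two functions.
weights = [-1.0, -0.3, 0.0, 0.5, 1.5]
q, scale, zero_point = quantize_int8(weights)
restored = dequantize(q, scale, zero_point)
worst_error = max(abs(a - b) for a, b in zip(weights, restored))
ideal_ratio = FP32_BYTES / INT8_BYTES  # weights only, ignoring metadata
```

For these sample weights the round-trip error stays within half a quantization step (`scale / 2`). Real toolkits choose the range from calibration data or learn it during training, which this sketch does not attempt.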
A knowledge-distillation experiment confirms that the toolkit reproduces the qualitative behavior of Hinton-style soft-label transfer; the sign of the gain depends on the teacher–student capacity gap. On-device latency measured on a deployment-class server processor (single-input inference, 200 measurements) shows a 2.16× wall-clock speedup for the statically quantized 8-bit-integer SimpleCNN over its 32-bit floating-point counterpart, and the same 8-bit-integer binary builds, fits within 445 kilobytes of on-chip memory, and runs end-to-end on a simulated STM32H7 (Arm Cortex-M7) microcontroller target under the Renode hardware simulator. The toolkit is released under the MIT license.
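As a rough illustration of Hinton-style soft-label transfer, the sketch below computes a distillation loss for a single example using only the Python standard library. It is an assumption-laden sketch, not the toolkit's code: the function names, the temperature of 4.0, and the mixing weight `alpha` are hypothetical choices. The soft term is the Kullback–Leibler divergence between temperature-softened teacher and student distributions, scaled by the squared temperature, and the hard term is ordinary cross-entropy against the true label.

```python
import math


def softmax(logits, temperature=1.0):
    """Numerically stable softmax with an optional temperature."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two discrete distributions of equal length."""
    return sum(
        pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q) if pi > 0
    )


def distillation_loss(student_logits, teacher_logits, label,
                      temperature=4.0, alpha=0.5):
    """Weighted sum of a soft-label term and a hard-label term."""
    soft_teacher = softmax(teacher_logits, temperature)
    soft_student = softmax(student_logits, temperature)
    # Scaling by T^2 keeps the soft term's gradient magnitude comparable
    # across temperatures.
    soft_term = kl_divergence(soft_teacher, soft_student) * temperature ** 2
    # Cross-entropy of the student's unsoftened prediction on the true label.
    hard_term = -math.log(softmax(student_logits)[label])
    return alpha * soft_term + (1 - alpha) * hard_term


# Hypothetical logits for a three-class problem.
teacher = [4.0, 1.0, 0.5]
student = [2.0, 1.5, 0.2]
loss = distillation_loss(student, teacher, label=0)
```

When the student's logits equal the teacher's, the soft term is zero and only the hard-label term remains, so `alpha` controls how much the student is pulled toward the teacher rather than toward the labels.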

Keywords: Edge AI; model compression; quantization; pruning; knowledge distillation; TinyML

Mohammed Hassan Alnemari. “Malak: A Python Toolkit for Edge AI Model Optimization”. International Journal of Advanced Computer Science and Applications (IJACSA) 17.5 (2026). http://dx.doi.org/10.14569/IJACSA.2026.0170575

@article{Alnemari2026,
title = {Malak: A Python Toolkit for Edge AI Model Optimization},
journal = {International Journal of Advanced Computer Science and Applications},
doi = {10.14569/IJACSA.2026.0170575},
url = {http://dx.doi.org/10.14569/IJACSA.2026.0170575},
year = {2026},
publisher = {The Science and Information Organization},
volume = {17},
number = {5},
author = {Mohammed Hassan Alnemari}
}



Copyright Statement: This is an open access article licensed under a Creative Commons Attribution 4.0 International License, which permits unrestricted use, distribution, and reproduction in any medium, even commercially, as long as the original work is properly cited.

The Science and Information (SAI) Organization Limited is a company registered in England and Wales under Company Number 8933205.