The Science and Information (SAI) Organization

DOI: 10.14569/IJACSA.2023.01411142

Mukh-Oboyob: Stable Diffusion and BanglaBERT enhanced Bangla Text-to-Face Synthesis

Author 1: Aloke Kumar Saha
Author 2: Noor Mairukh Khan Arnob
Author 3: Nakiba Nuren Rahman
Author 4: Maria Haque
Author 5: Shah Murtaza Rashid Al Masud
Author 6: Rashik Rahman

International Journal of Advanced Computer Science and Applications (IJACSA), Volume 14, Issue 11, 2023.

Abstract: Facial image generation from textual descriptions is one of the most complicated tasks within the broader topic of Text-to-Image (TTI) synthesis. It is relevant to several fields, including scientific research, cartoon and animation development, online marketing, and game development. Text-to-Face (TTF) synthesis has been studied extensively for the English language. However, the existing work in Bangla is limited and not comprehensive. As the TTF field has not been widely explored for the Bangla language, this study sets out to explore the possibilities in the fields of Bangla Natural Language Processing and Computer Vision. In this paper, a novel system for generating highly detailed facial images from textual descriptions in the Bangla language is proposed. The proposed system, named Mukh-Oboyob, consists of two essential components: a pre-trained language model, BanglaBERT, and Stable Diffusion. BanglaBERT, a transformer-based pre-trained text encoder, transforms Bangla sentences into vector representations. Mukh-Oboyob then uses Stable Diffusion to generate facial images conditioned on the text embeddings of the Bangla sentences. Moreover, the work utilizes CelebA Bangla, a modified version of the CelebA dataset consisting of face images, Bangla facial attributes, and Bangla text descriptions, to develop and train the proposed system. A comprehensive analysis incorporating both qualitative and quantitative measures shows that the system synthesizes detailed images with strong performance, achieving an FID score of 34.6828 and an LPIPS score of 0.4541.

Keywords: Bangla text-to-face synthesis; Natural Language Processing (NLP); Bangla NLP; Computer Vision (CV); Generative Model; Stable Diffusion; BanglaBERT
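
For readers unfamiliar with the FID score reported in the abstract: the Fréchet Inception Distance compares the mean and covariance of feature vectors extracted from real and generated images, and lower values indicate closer distributions. The sketch below is illustrative only and is not the authors' evaluation code. It assumes diagonal covariances so that it runs with the Python standard library alone, whereas the standard metric uses full covariance matrices of Inception-v3 features. The function name `frechet_distance_diag` and the toy feature lists are hypothetical.

```python
import math
from statistics import fmean, variance


def frechet_distance_diag(real_feats, gen_feats):
    """Frechet distance between two sets of feature vectors.

    ASSUMPTION: covariances are treated as diagonal, so the trace term
    Tr(S_r + S_g - 2 * sqrt(S_r * S_g)) reduces to a per-dimension sum.
    The standard FID uses full covariance matrices instead.
    """
    dims = len(real_feats[0])
    distance = 0.0
    for d in range(dims):
        real_d = [f[d] for f in real_feats]
        gen_d = [f[d] for f in gen_feats]
        mu_r, mu_g = fmean(real_d), fmean(gen_d)
        # Sample variance (divides by N - 1), so each set needs >= 2 vectors.
        var_r, var_g = variance(real_d), variance(gen_d)
        distance += (mu_r - mu_g) ** 2                      # mean term
        distance += var_r + var_g - 2.0 * math.sqrt(var_r * var_g)  # trace term
    return distance


# Toy 2-D "features"; real evaluations use high-dimensional network features.
real = [[0.0, 1.0], [2.0, 3.0]]
generated = [[1.0, 1.0], [3.0, 3.0]]
print(frechet_distance_diag(real, generated))
```

Identical feature sets give a distance of zero; shifting the mean of one dimension, as in the toy data, increases the distance by the squared shift.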

Aloke Kumar Saha, Noor Mairukh Khan Arnob, Nakiba Nuren Rahman, Maria Haque, Shah Murtaza Rashid Al Masud and Rashik Rahman, “Mukh-Oboyob: Stable Diffusion and BanglaBERT enhanced Bangla Text-to-Face Synthesis,” International Journal of Advanced Computer Science and Applications (IJACSA), 14(11), 2023. http://dx.doi.org/10.14569/IJACSA.2023.01411142

@article{Saha2023,
title = {Mukh-Oboyob: Stable Diffusion and BanglaBERT enhanced Bangla Text-to-Face Synthesis},
journal = {International Journal of Advanced Computer Science and Applications},
doi = {10.14569/IJACSA.2023.01411142},
url = {http://dx.doi.org/10.14569/IJACSA.2023.01411142},
year = {2023},
publisher = {The Science and Information Organization},
volume = {14},
number = {11},
author = {Aloke Kumar Saha and Noor Mairukh Khan Arnob and Nakiba Nuren Rahman and Maria Haque and Shah Murtaza Rashid Al Masud and Rashik Rahman}
}



Copyright Statement: This is an open access article licensed under a Creative Commons Attribution 4.0 International License, which permits unrestricted use, distribution, and reproduction in any medium, even commercially as long as the original work is properly cited.

© The Science and Information (SAI) Organization Limited. All rights reserved. Registered in England and Wales. Company Number 8933205. thesai.org