Image Caption Generation Supported Information and Recommender System for Tourists, Including Supplements for Individuals With Disabilities

Muhammed Salih Tatar; Rabia Kök; Aybars Ugur

doi:10.53070/bbd.1349149

Research Article

Year 2023, Volume: IDAP-2023 : International Artificial Intelligence and Data Processing Symposium Issue: IDAP-2023, 180 - 190, 18.10.2023

Abstract

Tourism is one of the most important means by which individuals interact with different cultures. A web application has been developed that is intended to bring a significant technological and social innovation to the tourism sector; it consists of information and recommender system functions for tourists, supported by image captioning and including additions for individuals with disabilities. In this application, a data set covering popular places in Izmir was created, and the Apriori Algorithm, a statistical method, was used so that nearby popular places and historical and touristic areas can be delivered to tourists more effectively, based on location information obtained via GPS. To help visually impaired individuals explore the city, a Facilitator Module was designed that performs image caption generation and object recognition on the input image received from the user, using VGG16 and LSTM models, which are transfer learning models. With this module, visually impaired individuals can obtain textual and audio descriptions of instant images of the streets and objects in the city, and can also determine whether keywords they entered into the system beforehand (fire, metro, statue, etc.) are present in the image. The data set, built from images in the MS-COCO, Flickr8k, Flickr30k, and Tourism48 data sets, was split into 80% training and 20% test data. In the tests, scores of 55.41% for BLEU-1 and 30.15% for BLEU-2 were obtained.
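
The abstract names the Apriori Algorithm but does not say how its input is formed. As a minimal sketch only, assume each transaction is the set of places one tourist visited; the place labels, the `min_count` and `min_confidence` thresholds, and the `recommend` helper below are all hypothetical and are not taken from the paper.

```python
from itertools import combinations

# Hypothetical visit histories; illustrative data, not the paper's data set.
TRANSACTIONS = [
    {"clock_tower", "kemeralti", "asansor"},
    {"clock_tower", "kemeralti"},
    {"clock_tower", "kordon"},
    {"kemeralti", "asansor"},
    {"clock_tower", "kemeralti", "kordon"},
]


def apriori(transactions, min_count):
    """Return {itemset: count} for every itemset seen in >= min_count transactions."""
    transactions = [frozenset(t) for t in transactions]
    # Level 1 candidates: every distinct item on its own.
    candidates = {frozenset([item]) for t in transactions for item in t}
    frequent = {}
    k = 1
    while candidates:
        # Count how many transactions contain each candidate.
        level = {}
        for cand in candidates:
            count = sum(1 for t in transactions if cand <= t)
            if count >= min_count:
                level[cand] = count
        frequent.update(level)
        # Join frequent k-itemsets into (k+1)-itemsets, then prune any
        # candidate that has an infrequent k-subset (the Apriori property).
        k += 1
        candidates = set()
        for a, b in combinations(level, 2):
            union = a | b
            if len(union) == k and all(
                frozenset(sub) in level for sub in combinations(union, k - 1)
            ):
                candidates.add(union)
    return frequent


def recommend(visited, frequent, min_confidence):
    """Rank unvisited places by the best rule confidence: antecedent -> place."""
    visited = frozenset(visited)
    best = {}
    for itemset, count in frequent.items():
        for place in itemset - visited:
            antecedent = itemset - {place}
            if antecedent and antecedent <= visited:
                # confidence = count(antecedent and place) / count(antecedent)
                confidence = count / frequent[antecedent]
                if confidence >= min_confidence:
                    best[place] = max(best.get(place, 0.0), confidence)
    return sorted(best.items(), key=lambda kv: (-kv[1], kv[0]))


frequent_itemsets = apriori(TRANSACTIONS, min_count=2)
print(recommend({"clock_tower"}, frequent_itemsets, min_confidence=0.5))
```

Integer counts are used instead of fractional supports so that thresholds and confidences are not affected by floating-point rounding. In a location-aware setting, the candidate places could additionally be filtered by distance from the GPS position, which this sketch leaves out.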

Keywords

Tourism, Deep Learning, Object Recognition, Image Caption Generation, Recommender System

Supporting Institution

TÜBİTAK

Project Number

1919B012222734

Thanks

This work was carried out with the support of the TÜBİTAK 2209-A program. Project title: Image Caption Generation Supported Information and Recommender System for Tourists, Including Supplements for Individuals With Disabilities; project number: 1919B012222734.

References

Bayrak, A. T., Öner, S. C., Gencer, M., Cerit, O. S., Oymagil, A., & Dalva, D. (2022, May). Using word embedding methods for product recommendation. In 2022 30th Signal Processing and Communications Applications Conference (SIU) (pp. 1-4). IEEE.
Bounab, Y., Oussalah, M., & Ferdenache, A. (2020, November). Reconciling image captioning and user’s comments for urban tourism. In 2020 Tenth International Conference on Image Processing Theory, Tools and Applications (IPTA) (pp. 1-6). IEEE.
Burke, R. (2002). Hybrid recommender systems: Survey and experiments. User modeling and user-adapted interaction, 12, 331-370.
Çilingir İ., medium.com, https://medium.com/@iremcilingir/%C3%B6neri-sistemleri-recommendation-systems-28a3f341c0a9, (Accessed: 04.06.2023)
Dereli S. M., Veri Bilimi Okulu, https://www.veribilimiokulu.com/oneri-sistemleri-101/, (Accessed: 04.06.2023)
Ercan, F. (2020). Turizm pazarlamasında yapay zekâ teknolojilerinin kullanımı ve uygulama örnekleri. Ankara Hacı Bayram Veli Üniversitesi Turizm Fakültesi Dergisi, 23(2), 394-410.
Hodosh, M., Young, P., & Hockenmaier, J. (2013). Framing image description as a ranking task: Data, models and evaluation metrics. Journal of Artificial Intelligence Research, 47, 853-899.
Kuyu, M., Erdem, A., & Erdem, E. (2018). Altsözcük Ögeleri ile Türkçe Görüntü Altyazılama [Image Captioning in Turkish with Subword Units].
Küresel Amaçlar, Eşitsizliklerin Azaltılması – Madde:10.2, Madde:10.3, Madde:10.4, https://www.kureselamaclar.org/amaclar/esitsizliklerin-azaltilmasi/, (Accessed: 05.06.2023)
Küresel Amaçlar, İnsana Yakışır İş ve Ekonomik Büyüme – Madde:8.2, Madde:8.9, https://www.kureselamaclar.org/amaclar/esitsizliklerin-azaltilmasi/, (Accessed: 05.06.2023)
Lin, T. Y., Maire, M., Belongie, S., Hays, J., Perona, P., Ramanan, D., ... & Zitnick, C. L. (2014). Microsoft COCO: Common objects in context. In Computer Vision–ECCV 2014: 13th European Conference, Zurich, Switzerland, September 6-12, 2014, Proceedings, Part V 13 (pp. 740-755). Springer International Publishing.
Liu, X., Xu, Q., & Wang, N. (2019). A survey on deep neural network-based image captioning. The Visual Computer, 35(3), 445-470.
Plummer, B. A., Wang, L., Cervantes, C. M., Caicedo, J. C., Hockenmaier, J., & Lazebnik, S. (2015). Flickr30k entities: Collecting region-to-phrase correspondences for richer image-to-sentence models. In Proceedings of the IEEE international conference on computer vision (pp. 2641-2649).
Sertçelik, Ş., & Önder, E. (2023). Yönetim Bilişim Sistemleri Kapsamında Akademik Araştırma Alanlarının İncelenmesi: Apriori Algoritması ile Bir Analiz. Gümüşhane Üniversitesi Sosyal Bilimler Dergisi, 14(2), 680-690.
Shaikh, F. (2018). Automatic image captioning using deep learning (CNN and LSTM) in PyTorch. Analytics Vidhya.
Wang, C., Yang, H., Bartz, C., & Meinel, C. (2016, October). Image captioning with deep bidirectional LSTMs. In Proceedings of the 24th ACM international conference on Multimedia (pp. 988-997).
Zhang, X., Zou, J., He, K., & Sun, J. (2015). Accelerating very deep convolutional networks for classification and detection. IEEE transactions on pattern analysis and machine intelligence, 38(10), 1943-1955.

Image Caption Generation Supported Information and Recommender System for Tourists, Including Supplements for Individuals With Disabilities

Abstract

Tourism is one of the most important means for individuals to interact with different cultures. A web application has been developed that combines information and recommender system functions supported by image captioning and includes additions for individuals with disabilities; it is intended to bring significant technological and social innovation to the tourism sector. In this application, a data set covering popular places in Izmir was created, and the Apriori Algorithm, a statistical method, was used so that nearby popular, historical, and touristic places can be delivered to tourists more effectively, based on location information obtained via GPS. To help visually impaired individuals explore the city, a Facilitator Module was designed that performs image captioning and object recognition on the user's input image using VGG16 and LSTM models, which are transfer learning models. With this module, visually impaired individuals can obtain textual and audio descriptions of instant images of streets and objects in the city, and can also determine whether previously entered keywords (fire, metro, statue, etc.) are present in the image. The data set, created from images in the MS-COCO, Flickr8k, Flickr30k, and Tourism48 data sets, was divided into 80% training and 20% test data. In the tests conducted, scores of 55.41% for BLEU-1 and 30.15% for BLEU-2 were obtained.
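
The abstract reports BLEU-1 and BLEU-2 but does not state which implementation, tokenization, or smoothing was used. The sketch below is therefore only an illustration of the standard definition (clipped n-gram precision, uniform weights, brevity penalty, no smoothing) at sentence level; the function names and the example captions are hypothetical.

```python
import math
from collections import Counter


def ngram_counts(tokens, n):
    """Count the n-grams (as tuples) in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def bleu(candidate, references, max_n):
    """Sentence-level BLEU with uniform weights over 1..max_n and no smoothing."""
    log_precisions = []
    for n in range(1, max_n + 1):
        cand = ngram_counts(candidate, n)
        # For each n-gram, the largest count seen in any single reference.
        max_ref = Counter()
        for ref in references:
            for gram, count in ngram_counts(ref, n).items():
                max_ref[gram] = max(max_ref[gram], count)
        # Clip candidate counts so repeated words are not over-rewarded.
        clipped = sum(min(count, max_ref[gram]) for gram, count in cand.items())
        total = sum(cand.values())
        if total == 0 or clipped == 0:
            return 0.0  # Without smoothing, one zero precision zeroes the score.
        log_precisions.append(math.log(clipped / total))
    # Brevity penalty against the reference whose length is closest.
    c = len(candidate)
    r = min((abs(len(ref) - c), len(ref)) for ref in references)[1]
    brevity_penalty = 1.0 if c > r else math.exp(1 - r / c)
    return brevity_penalty * math.exp(sum(log_precisions) / max_n)


# Hypothetical generated caption and reference caption.
generated = "a man rides a bike on the street".split()
reference = "a man is riding a bike on the street".split()

print(f"BLEU-1: {bleu(generated, [reference], 1):.4f}")
print(f"BLEU-2: {bleu(generated, [reference], 2):.4f}")
```

BLEU-2 is never higher than BLEU-1 for the same caption pair under this definition when bigram precision is below unigram precision, which is consistent with the gap between the two reported scores. Corpus-level BLEU, which aggregates counts over all test captions before dividing, can differ from an average of sentence-level scores.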

Keywords

Tourism, Deep Learning, Object Recognition, Image Captioning, Recommender System

There are 14 citations in total.

Details

Primary Language	Turkish
Subjects	Computer Vision, Image Processing, Deep Learning, Big Data, Data Mining and Knowledge Discovery, Artificial Intelligence (Other)
Journal Section	PAPERS
Authors	Muhammed Salih Tatar 0009-0007-8244-8042 Rabia Kök 0000-0002-2467-0688 Aybars Ugur 0000-0003-3622-7672
Project Number	1919B012222734
Publication Date	October 18, 2023
Submission Date	August 24, 2023
Acceptance Date	August 26, 2023
Published in Issue	Year 2023 Volume: IDAP-2023 : International Artificial Intelligence and Data Processing Symposium Issue: IDAP-2023

Cite

APA	Tatar, M. S., Kök, R., & Ugur, A. (2023). Turistler İçin, Engelli Bireylere Yönelik Ekler de İçeren, Görüntü Altyazılama Destekli Bilgilendirme ve Öneri Sistemi. Computer Science, IDAP-2023 : International Artificial Intelligence and Data Processing Symposium(IDAP-2023), 180-190. https://doi.org/10.53070/bbd.1349149