The burden of diet-related diseases is high in Central Asia. In recent years, the field of food computing has gained prominence due to advancements in computer vision (CV) and the increasing use of smartphones and social media. These technologies show promise in many applications by facilitating real-time information retrieval from food images for efficient digital food journaling, smart restaurants, supermarkets, and similar settings. Yet developing a robust CV model for food information retrieval requires a large-scale, high-quality dataset. Several food datasets have been developed covering Western, Mediterranean, Chinese, and other cuisines. These datasets address the simpler problem of food classification with a single food item per image, which is not practical for real-life scenarios, where meals typically consist of multiple food items. To address this gap, we developed a large-scale, high-quality Central Asian Food Scenes Dataset for food localization and detection. The dataset contains 21,306 images across 239 food categories, with 69,856 instances. To evaluate the dataset, we performed parametric experiments with object detection models, with the best results achieved using YOLOv8xl (mAP50 score of 0.677).
The rapid growth of food-related big data, driven by social media, the Internet of Things (IoT), and Artificial Intelligence (AI), has given rise to an interdisciplinary research field known as food computing. Due to its significant implications for human health, diet, and disease, food computing has become a key area of focus in disciplines such as computer vision, multimedia, medicine, health informatics, agriculture, and bioengineering. First introduced in 2015, the term "food computing" encompasses several major tasks, including food recognition, retrieval, and recommendation. Among these, food recognition is foundational, involving the identification and classification of food items from visual data (e.g., images or videos), which is crucial for applications such as nutrition tracking, food authentication, smart restaurants and supermarkets, and waste management. Particularly in response to the rise in diet-related diseases, various tools are being developed to enable fast and accurate food logging for dietary monitoring. Increasing people's nutrition literacy and improving dietary habits start with increasing their awareness of current eating patterns. The importance of these solutions lies in their potential to foster healthy dietary patterns and serve as a preventive measure against chronic diseases including obesity, diabetes, and cardiovascular disease (CVD), which is mainly caused by the high intake of red meat and processed meat.
According to the Global Burden of Disease Studies, diet-related CVD deaths and Disability-Adjusted Life Years (DALYs) in 2019 amounted to 6.9 million and 153.2 million, respectively, representing substantial increases of 43.8% and 34.3% since 1990. Many countries with high diet-related CVD deaths and DALYs are situated in Central Asia, while the lowest death rates are in the high-income Asia Pacific region. The dietary patterns identified as major contributors to CVD deaths and DALYs include diets characterized by high salt intake, excessive empty calories, insufficient consumption of fruit, nuts, seeds, and vegetables, and low Omega-3 levels.
Over the last decade, Artificial Intelligence (AI) has found a wide range of applications in the food industry and agriculture. In combination with other components such as smart sensors, big data, and blockchain technologies, AI is being used at various stages of the food chain, such as food classification, production development, quality testing and improvement, supply chain management, and food safety monitoring. For example, in food safety, machine learning has greatly improved the detection of potential contamination sources during production. Convolutional Neural Networks (CNNs) can assist in automated quality control, preventing inferior products from entering the market by precisely identifying small defects or foreign objects on food items. Deep learning (DL) technologies such as recurrent neural networks (RNNs) and long short-term memory networks (LSTMs) have been used for real-time monitoring and detailed classification of product appearance on production lines, enhancing overall product quality. In agriculture, several AI-enabled surveillance systems have been developed to assist farmers with detecting pests and monitoring crop and soil issues to maximize the harvest yield. Precision agriculture, which uses AI systems at every step from planting and watering to the final crop harvesting, is attracting growing interest. To automate the manual crop collection process, several computer vision (CV) models have been developed for fruit and vegetable detection. Other work has addressed ripeness assessment of fruits and vegetables, which is essential not only in the field to automate harvesting but also in storage and across the food chain for checking product freshness.
Beyond these uses, advances in CV have significantly increased the functionality of innovative tools, such as mobile applications, that offer convenient ways for individuals to automatically monitor and manage their nutritional intake. In this work, we focus on the application of AI to facilitate nutrition literacy and dietary interventions. Empowering individuals to make well-informed decisions regarding their dietary selections can help reduce the burden of diet-related diseases and enhance overall health outcomes. This starts with raising awareness of their current behaviors. Capturing food images not only reduces the burden associated with maintaining traditional food diaries and benefits individuals lacking access to expert healthcare resources or consultation, but also boosts social support for achieving shared healthy eating objectives when the images are posted on social media platforms. Furthermore, food images contain precise contextual information that can be utilized by healthcare professionals to provide personalized diagnoses and treatment recommendations.
The development of automatic food logging using CV requires a high-quality dataset. A major challenge is associated with the nature of the food domain, such as intra-class and inter-class variability, visual characteristics, and high variability in shape. The problem of diet tracking using CV can be approached in two ways: as an image classification problem or as an object detection problem. In the first case, the entire image is assigned a single category or class, so only a single food item can be present in an image. Several web-crawled, publicly available food classification datasets have been released, such as the Food-101 dataset, which features 101 European food categories with 1,000 images per class and has established itself as a benchmark for numerous recognition models. Furthermore, the ISIA Food-500 dataset, which covers Asian, European, and African cuisines, contains 500 food categories with over 400,000 images. A DL-based food recognition system has also been developed and trained on a dataset of 400,000 internet-scraped images, identifying 756 food classes predominantly consumed in Singapore.
The second approach, food detection, is a more complex task since it combines object classification and localization sub-tasks. In this case, food scenes containing multiple food items are considered. For this approach, very few publicly available datasets have been created. For example, the BTBUFood-60 dataset contains 60,000 images with 78,000 labeled instances across 60 food categories from Japanese cuisine. Another publicly available dataset is UNIMIB2016, which contains 1,027 images with a total of 3,616 instances spanning 73 food categories from Western cuisine. Currently, most works on the application of CV for food recognition solve the classification problem. The main limitation of classification datasets is that only one food item per image can be predicted, which may not be suitable in real life, where food scenes comprising several food items are common. Therefore, for user convenience and fast food logging, food detection and localization datasets and models are required. Yet no publicly available dataset covers Central Asian cuisine. This study aims to develop a food scene dataset that contains commonly consumed food items in Central Asia and to train a computer vision model for automatic food detection. The dataset includes annotations for each food item, labeled with rectangular bounding boxes and corresponding food class names.
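To make the annotation scheme concrete, the following minimal Python sketch shows one way a food scene labeled with rectangular bounding boxes and class names could be represented, and how a predicted box could be compared against a ground-truth box using intersection over union (IoU), the overlap measure behind the mAP50 score (a threshold of 0.5). The `FoodBox` type, the helper functions, the class names, and the image size are hypothetical and chosen for illustration; the normalized `class_id cx cy w h` text format is the one commonly used with YOLO-family detectors and is assumed here rather than taken from the dataset's documentation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FoodBox:
    """One annotated food item: a class name plus a pixel-space rectangle."""
    label: str
    x_min: float
    y_min: float
    x_max: float
    y_max: float


def to_yolo_line(box, class_ids, img_w, img_h):
    """Format a box as 'class_id cx cy w h', normalized by the image size."""
    cx = (box.x_min + box.x_max) / 2 / img_w
    cy = (box.y_min + box.y_max) / 2 / img_h
    w = (box.x_max - box.x_min) / img_w
    h = (box.y_max - box.y_min) / img_h
    return f"{class_ids[box.label]} {cx:.6f} {cy:.6f} {w:.6f} {h:.6f}"


def iou(a, b):
    """Intersection over union of two axis-aligned boxes."""
    inter_w = max(0.0, min(a.x_max, b.x_max) - max(a.x_min, b.x_min))
    inter_h = max(0.0, min(a.y_max, b.y_max) - max(a.y_min, b.y_min))
    inter = inter_w * inter_h
    area_a = (a.x_max - a.x_min) * (a.y_max - a.y_min)
    area_b = (b.x_max - b.x_min) * (b.y_max - b.y_min)
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def is_match(pred, truth, threshold=0.5):
    """A prediction matches if the class agrees and IoU reaches the threshold."""
    return pred.label == truth.label and iou(pred, truth) >= threshold


# Hypothetical scene: two food items in one 640x480 image.
class_ids = {"plov": 0, "tea": 1}  # illustrative class names only
scene = [
    FoodBox("plov", 100, 120, 420, 400),
    FoodBox("tea", 450, 60, 600, 200),
]
for item in scene:
    print(to_yolo_line(item, class_ids, img_w=640, img_h=480))

# A slightly shifted prediction for the first item still counts as a match.
prediction = FoodBox("plov", 110, 130, 430, 410)
print(is_match(prediction, scene[0]))
```

Unlike a classification label, which assigns one class to the whole image, each entry here carries both a class and a location, so a single image can describe a full meal.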
In our previous work, we presented the first Central Asian Food Dataset (CAFD) for a classification task that contains 16,499 images of 42 Central Asian food items. This work presents the first Central Asian Food Scenes Dataset (CAFSD), containing 21,306 images with 69,856 instances across 239 food classes. The dataset encompasses a wide array of food categories, including local Central Asian cuisine as well as Western, Mediterranean, Chinese, and other cuisines commonly consumed in Central Asia. This dataset presents a significant contribution to CV-assisted food and dietary tracking applications, aiming to enhance their utility by encompassing the diverse culinary landscape of Central Asia.