Monday, June 3, 2019
Food Clustering For Diabetes Diet Health And Social Care Essay
A common way for diabetes educators to advise diabetes patients on their nutrition therapy is to introduce food substitution. The existing categorization mechanism does not classify foods efficiently for diabetic patients. Clustering, a data mining (DM) technique, can be a very useful tool for collecting food items with similar elements into groups. This paper looks at the use of K-means to cluster a food data set into groups based on food elements, using the RapidMiner tool. The output of the clustering algorithm can help recommendation system software provide patients with close recommendations for their diabetes diet.

Keywords: data mining, diabetes, data set, K-means.

1. Introduction
Food and nutrition are key to good health. A healthy diet is important for everyone, and especially for diabetic patients, who face several dietary limitations. Nutrition therapy is a major means of preventing, managing and controlling diabetes; it manages nutrition on the belief that food provides vital medicine and maintains good health. Typically, diabetic patients need to avoid additional sugar and fat by finding a substitute from the same food group [4]. Effective clustering based on the actual nutrient values is therefore needed. The clustering will encourage diabetics to eat the widest possible variety of permitted foods, so that they get the full range of trace elements and other nutrients.

This paper is set out as follows. Section 2 introduces some related work on data mining and diabetic diets. Section 3 describes the data set used and summarizes its main features. Data preparation is presented in Section 4. Section 5 describes the materials and methods used in this study. Section 6 gives the conclusion.

2. Literature Review
Li et al. [1] proposed an automatically constructed food ontology for diabetes diet care. The methods include generating an ontology skeleton with hierarchical clustering algorithms (HCA), naming classes by intersection naming, and ranking instances by granular ranking and positioning. The study is based on a data set from the food nutrition composition database of the Department of Health.

Phanich et al. [2] proposed a Food Recommendation System (FRS) that uses food clustering analysis for diabetic patients. The system recommends suitable substitute foods in the context of nutrition and food characteristics. They used a Self-Organizing Map (SOM) and K-means clustering for the food clustering analysis, which is based on the similarity of eight nutrients that are significant for diabetic patients. The study is based on the data set "Nutritive values for Thai food" provided by the Nutrition Division, Department of Health, Ministry of Public Health (Thailand).

3. Dataset Description
This study is based on the data set provided by the USDA National Nutrient Database for Standard Reference (SR) [3]. The values in the database are based on the results of laboratory analyses or are calculated using appropriate algorithms, factors, or recipes, as indicated by the source in the Nutrient Data file. Not every food item contains a complete nutrient profile. The data set used is the abbreviated file, which has fewer nutrients but includes all the food items. It contains 7540 records and 52 attributes. Tables 1, 2 and 3 show the data set attributes and their descriptions. To check for missing values I used the RapidMiner tool. Table 4 presents a sample of the data set.

4. Data Preparation
The quality of the results of the mining process is directly proportional to the quality of the data. I therefore first need to prepare the data set by applying data preprocessing strategies.
Data preprocessing is an important and critical step in the data mining process, and it has a huge impact on the success of a data mining project. The purpose of data preprocessing is to cleanse dirty or noisy data. Figure 1 shows the different strategies in the data preprocessing phase. In this study I focused on data cleaning and data reduction.

Figure 1: Strategies in data preprocessing

Table 1: Description of data set attributes 1-24

Table 2: Description of data set attributes 25-48

Table 3: Description of data set attributes 49-52

Table 4: Sample of the data set

Shrt_Desc                  Water   Energ_Kcal   Protein   Lipid_Tot   Ash    Carbohydrt   Sugar_Tot   others
BUTTER,WITH SALT           15.87   717          0.85      81.11       2.11   0.06         0.06
BUTTER,WHIPPED,WITH SALT   15.87   717          0.85      81.11       2.11   0.06         0.06
BUTTER OIL,ANHYDROUS       0.24    876          0.28      99.48       0      0            0
CHEESE,BLUE                42.41   353          21.4      28.74       5.11   2.34         0.5
CHEESE,BRICK               41.11   371          23.24     29.68       3.18   2.79         0.51

Data Cleaning
Data cleaning, also called data cleansing or scrubbing, deals with detecting and removing errors and inconsistencies from data in order to improve its quality [6]. The aim of data cleaning is to raise the data quality to a level suitable for the clustering analysis. The methods used for data cleaning here are filling in missing values and eliminating data redundancy.

Missing Values
It is common for a data set to have fields that contain unknown or missing values, and there are a variety of legitimate reasons why this can happen. There are a number of methods for treating records that contain missing values [7]:
1. Omit the incorrect field(s)
2. Omit the entire record that contains the incorrect field(s)
3. Automatically enter/correct the data with default values, e.g. select the mean from the range
4. Derive a model to enter/correct the data
5. Replace all values with a global constant
In this study, both missing and unknown data have been set to zero.

Duplicated Records
Duplicate records do not share a common key and/or they contain errors that make duplicate matching a difficult task.
Errors are introduced as the result of transcription errors, incomplete information, lack of standard formats, or any combination of these factors [7]. The data set used in this study includes duplicate data objects. I used RapidMiner to remove the duplicates; as a result, the 7540 records decreased to 7139 records.

Data Reduction
Data reduction can be achieved in many ways; one way is by selecting features [5]. The data set used contains many irrelevant features that carry almost no useful information for the data mining task. Following [2], I focus on only eight of the fifty-two attributes, as they are important for a diabetes diet. The eight nutrients are:
- Carbohydrate
- Energy
- Fat
- Protein
- Fiber
- Vitamin E
- Vitamin B1 (also known as thiamine)
- Vitamin C

Data Normalization
Data normalization is one of the preprocessing procedures in data mining, in which the attribute data are scaled linearly so as to fall within a small specified range such as -1.0 to 1.0 or 0.0 to 1.0. Normalization before clustering is especially needed for distance metrics such as Euclidean distance, which are sensitive to differences in the magnitude or scale of the attributes. K-means typically uses Euclidean distance to measure the distortion between a data object and its cluster centroid, so the clustering results can be greatly affected by differences in scale among the dimensions from which the distances are computed. It is therefore worthwhile to improve clustering quality by normalizing the dynamic range of the input data objects into a specific range [8]. In this study I normalize the data to the range [0, 1]. Figure 2 shows the result of the data preprocessing.

Figure 2: Result of preprocessing (data cleaning, data reduction, data normalization)

5. Data Analysis Methodology
After data preparation, the second step is to cluster the food data set using K-means.
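The data preparation steps of Section 4 can be sketched in Python as follows. This is only an illustration, not the RapidMiner process used in the study. It assumes that duplicates are rows whose selected nutrient values are identical, and it uses just the four column names visible in Table 4; the names of the other four selected columns do not appear in the excerpt and are left out. The first five rows are copied from Table 4, and the last row is hypothetical, added only to show a missing value.

```python
# Toy rows: five from Table 4, plus one hypothetical row with a missing value.
ROWS = [
    {"Shrt_Desc": "BUTTER,WITH SALT", "Energ_Kcal": 717, "Protein": 0.85, "Lipid_Tot": 81.11, "Carbohydrt": 0.06},
    {"Shrt_Desc": "BUTTER,WHIPPED,WITH SALT", "Energ_Kcal": 717, "Protein": 0.85, "Lipid_Tot": 81.11, "Carbohydrt": 0.06},
    {"Shrt_Desc": "BUTTER OIL,ANHYDROUS", "Energ_Kcal": 876, "Protein": 0.28, "Lipid_Tot": 99.48, "Carbohydrt": 0},
    {"Shrt_Desc": "CHEESE,BLUE", "Energ_Kcal": 353, "Protein": 21.4, "Lipid_Tot": 28.74, "Carbohydrt": 2.34},
    {"Shrt_Desc": "CHEESE,BRICK", "Energ_Kcal": 371, "Protein": 23.24, "Lipid_Tot": 29.68, "Carbohydrt": 2.79},
    {"Shrt_Desc": "EXAMPLE FOOD", "Energ_Kcal": 100, "Protein": None, "Lipid_Tot": 1.0, "Carbohydrt": 20.0},
]

# Subset of the eight selected attributes (only the names shown in Table 4).
FEATURES = ["Energ_Kcal", "Protein", "Lipid_Tot", "Carbohydrt"]


def fill_missing(rows, features, constant=0.0):
    """Replace missing (None) values with a global constant (zero in this study)."""
    return [
        {**r, **{f: constant if r.get(f) is None else r[f] for f in features}}
        for r in rows
    ]


def drop_duplicates(rows, features):
    """Keep the first row for each distinct combination of feature values.

    Assumption: two records count as duplicates when these values all match.
    """
    seen, kept = set(), []
    for r in rows:
        key = tuple(r[f] for f in features)
        if key not in seen:
            seen.add(key)
            kept.append(r)
    return kept


def min_max_normalize(rows, features):
    """Linearly rescale each feature to [0, 1]; a constant column maps to 0."""
    lo = {f: min(r[f] for r in rows) for f in features}
    hi = {f: max(r[f] for r in rows) for f in features}
    scaled_rows = []
    for r in rows:
        scaled = dict(r)
        for f in features:
            span = hi[f] - lo[f]
            scaled[f] = (r[f] - lo[f]) / span if span else 0.0
        scaled_rows.append(scaled)
    return scaled_rows


def prepare(rows, features=FEATURES):
    """Data cleaning, data reduction, then normalization."""
    rows = fill_missing(rows, features)
    rows = drop_duplicates(rows, features)
    # Data reduction: keep only the description and the selected features.
    rows = [{"Shrt_Desc": r["Shrt_Desc"], **{f: r[f] for f in features}} for r in rows]
    return min_max_normalize(rows, features)


prepared = prepare(ROWS)
for row in prepared:
    print(row)
```

On the toy rows, the two butter records with identical nutrient values collapse into one, which mirrors how the record count in the study dropped after duplicate removal.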
To work with the optimal number of clusters k, I followed [2] and used the Davies-Bouldin index [9] to evaluate candidate values. The value of k is optimal when the index is smallest. For this study I used k = 19, since it gives the smallest index value. The final result is a set of food clusters in which foods in the same group provide approximately the same amounts of the eight nutrients. The data analysis tool RapidMiner was used to analyze the data set and cluster the food items. The full process sequence is shown in Figure 3. Figures 4, 5 and 6 show the final result.

Figure 3: Data analysis process

Figure 4: Food items clustered into 19 clusters

Figure 5: Distribution of the 8 nutrients across clusters 0-12

Figure 6: Distribution of the 8 nutrients across clusters 13-18

5.1 K-means Evaluation
The clustering was evaluated with a performance measure based on the number of clusters. This operator builds a derived index from the number of clusters using the formula 1 - (k / n), where k is the number of clusters and n is the number of covered examples. It is used for optimizing the coverage of a cluster result with respect to the number of clusters. Applying the K-means model to this data set gives a cluster number index of 0.997, which indicates good coverage.

6. Conclusion
Data mining has been widely used in many health care fields, and diabetes diet care is one of the health problems in which it plays a role. This experiment was conducted on the USDA National Nutrient Database data set. The results show that K-means grouped the food items into 19 clusters with a cluster number index of 0.997, and that these food groups can support recommendation systems.
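The choice of k in Section 5 and the cluster number index in Section 5.1 can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the RapidMiner process used in the study. It runs plain K-means (Lloyd's algorithm) with a deterministic farthest-first initialization chosen here for reproducibility, uses hypothetical 2-D toy points already scaled to [0, 1], and assumes every cluster is non-empty with distinct centroids. All function names are illustrative.

```python
import math

# Hypothetical toy points in [0, 1]^2 forming three well-separated groups.
POINTS = [
    (0.0, 0.0), (0.0, 0.1), (0.1, 0.0),
    (0.5, 1.0), (0.5, 0.9), (0.6, 1.0),
    (1.0, 0.0), (1.0, 0.1), (0.9, 0.0),
]


def farthest_first(points, k):
    """Deterministic start: take the first point, then repeatedly add the
    point farthest from the centroids chosen so far (an assumption)."""
    centroids = [points[0]]
    while len(centroids) < k:
        centroids.append(
            max(points, key=lambda p: min(math.dist(p, c) for c in centroids))
        )
    return centroids


def assign(points, centroids):
    """Index of the nearest centroid (Euclidean distance) for every point."""
    return [
        min(range(len(centroids)), key=lambda c: math.dist(p, centroids[c]))
        for p in points
    ]


def kmeans(points, k, max_iter=100):
    """Lloyd's algorithm: alternate assignment and centroid update."""
    centroids = farthest_first(points, k)
    for _ in range(max_iter):
        labels = assign(points, centroids)
        updated = []
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            # Keep the old centroid if a cluster happens to be empty.
            updated.append(
                tuple(sum(x) / len(members) for x in zip(*members))
                if members else centroids[c]
            )
        if updated == centroids:
            break
        centroids = updated
    return assign(points, centroids), centroids


def davies_bouldin(points, labels, centroids):
    """Davies-Bouldin index; lower is better. Assumes k >= 2, no empty
    clusters and pairwise distinct centroids."""
    k = len(centroids)
    scatter = []
    for c in range(k):
        members = [p for p, label in zip(points, labels) if label == c]
        scatter.append(sum(math.dist(p, centroids[c]) for p in members) / len(members))
    worst = [
        max(
            (scatter[i] + scatter[j]) / math.dist(centroids[i], centroids[j])
            for j in range(k) if j != i
        )
        for i in range(k)
    ]
    return sum(worst) / k


def choose_k(points, candidates):
    """Return the candidate k with the smallest Davies-Bouldin index."""
    scores = {}
    for k in candidates:
        labels, centroids = kmeans(points, k)
        scores[k] = davies_bouldin(points, labels, centroids)
    return min(scores, key=scores.get), scores


def cluster_number_index(k, n):
    """Index from Section 5.1: 1 - (k / n)."""
    return 1 - k / n


best_k, scores = choose_k(POINTS, range(2, 5))
print(best_k, scores)
print(cluster_number_index(19, 7139))
```

If n is taken to be the 7139 records left after duplicate removal (an assumption; the text does not state which n was used), 1 - 19/7139 rounds to 0.997, which is consistent with the reported index.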