Improved probable clustering based on data dissemination for retrieval of web URLs


Sunita, Vijay Rana



Intelligent service classification, Natural Language Processing, Location-sensitive searching


The programmable paradigm in web technologies is evolving into a web service model in which services and information can be reused by distinct users. Diverse information is available on the web, and discovering relevant information based on location remains a major challenge for web information retrieval systems. The lack of intelligent classification of information compounds the problem. This paper presents an approach that extends information similarity analysis using a probable clustering procedure and returns results specific to the user's current location, obtained through Google location services. To capture the similarity of functional text, feature vector techniques are employed. Dissimilar words are classified as stop words and eliminated from the query string to reduce the complexity of the search space. A location-sensitive mechanism fetches only relevant information belonging to the user's current location. Experiments were performed to compare classification accuracy across the various models used for feature vector extraction, and the results emphasize the effectiveness of the semantic similarity extractor location-based web service model.
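The pipeline outlined in the abstract (stop-word elimination, feature vectors for text similarity, and a location filter) can be illustrated with a minimal sketch. This is not the authors' implementation: the stop-word list, the bag-of-words cosine similarity, the `(url, text, location)` document schema, and the function names are all assumptions chosen for illustration, and the user's location is passed in as a plain string rather than obtained from Google location services.

```python
import math
from collections import Counter

# Hypothetical stop-word list; the paper does not specify one.
STOP_WORDS = {"a", "an", "the", "in", "of", "for", "to", "and", "near", "me"}


def tokenize(text):
    """Lowercase the text, split on whitespace, and drop stop words."""
    return [w for w in text.lower().split() if w not in STOP_WORDS]


def cosine_similarity(a, b):
    """Cosine similarity between bag-of-words term-count vectors of two texts."""
    va, vb = Counter(tokenize(a)), Counter(tokenize(b))
    dot = sum(va[t] * vb[t] for t in va)
    norm = (math.sqrt(sum(c * c for c in va.values()))
            * math.sqrt(sum(c * c for c in vb.values())))
    return dot / norm if norm else 0.0


def rank(query, user_location, documents):
    """Return URLs of documents at the user's location, most similar first.

    documents: list of (url, text, location) tuples (assumed schema).
    """
    local = [d for d in documents if d[2] == user_location]
    scored = [(cosine_similarity(query, text), url) for url, text, _ in local]
    return [url for score, url in sorted(scored, reverse=True) if score > 0]
```

For example, with one pizza page and one car-repair page tagged "Delhi" and another pizza page tagged "Mumbai", calling `rank("the best pizza in town", "Delhi", documents)` keeps only the Delhi pizza page: the Mumbai page is removed by the location filter and the car-repair page scores zero similarity.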


