Hello everyone,
Which threshold method can be used for web page cleaning? By web page cleaning I mean removing boilerplate and extracting the main content from a web page. Can you suggest how I should do it?
You can read this paper: http://www.cs.uic.edu/~liub/publications/ijcai03-webClean.pdf
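For a concrete starting point, here is a minimal sketch of one simple threshold idea that works on raw HTML lines without building a DOM: compute a text-to-tag ratio per line and keep the lines whose ratio is above a threshold. This is not the method from the paper linked above. The function names, the regex-based tag matcher, and the choice of the mean ratio as the default threshold are all assumptions made for illustration, and the threshold would need tuning on real pages.

```python
import re
from statistics import mean
from typing import List, Optional

# Rough tag matcher (assumption): good enough for a sketch, not a full HTML parser.
TAG_RE = re.compile(r"<[^>]+>")


def text_to_tag_ratio(line: str) -> float:
    """Number of non-tag characters divided by the number of tags on the line."""
    tag_count = len(TAG_RE.findall(line))
    text = TAG_RE.sub("", line).strip()
    # Treat a line without tags as having one tag to avoid division by zero.
    return len(text) / max(tag_count, 1)


def extract_main_content(html: str, threshold: Optional[float] = None) -> List[str]:
    """Keep the text of lines whose ratio is strictly above the threshold."""
    lines = [ln for ln in html.splitlines() if ln.strip()]
    if not lines:
        return []
    ratios = [text_to_tag_ratio(ln) for ln in lines]
    if threshold is None:
        # Default threshold (assumption): the mean ratio over all lines.
        threshold = mean(ratios)
    return [
        TAG_RE.sub("", ln).strip()
        for ln, ratio in zip(lines, ratios)
        if ratio > threshold
    ]
```

Navigation bars and link lists tend to have many tags and little text, so their ratio is low; article paragraphs have few tags and a lot of text, so their ratio is high. A fixed threshold, or one based on the standard deviation, can be passed in through the `threshold` parameter instead of the default.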
Hello, could you please explain how to count the stop words and tokens in a text? I would appreciate clarification with examples. Thanks.
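As a small example of the kind of counting asked about here, the sketch below splits a text into lowercase word tokens with a regular expression and counts how many of them are in a stop word list. The stop word set is a tiny illustrative sample, not a standard list, and the function name and the tokenization rule are assumptions; a real project would likely use a fuller list and a proper tokenizer.

```python
import re

# Tiny illustrative stop word list (assumption); real lists are much longer.
STOP_WORDS = {"a", "an", "the", "is", "are", "of", "and", "to", "in", "on"}


def count_tokens_and_stop_words(text: str):
    """Return (number of tokens, number of tokens that are stop words)."""
    # Tokens are runs of letters, digits, or apostrophes, compared in lowercase.
    tokens = re.findall(r"[a-z0-9']+", text.lower())
    stop_word_count = sum(1 for token in tokens if token in STOP_WORDS)
    return len(tokens), stop_word_count


total, stops = count_tokens_and_stop_words("The cat is on the mat")
print(total, stops)  # 6 tokens, 4 of them stop words: the, is, on, the
```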
Hello everyone, I want some keywords for web page classification into categories such as news, sport, etc. I want these keywords for matching and for training the classifier. Could you help me with how to get...
I want to know how to find an HTML web page data set. Can you help me?
I currently use the CETR dataset, but most of its web pages do not have well-formed HTML. I do not want to use JTidy, because the approach I propose in my research does not use the DOM. Therefore, I can't use JTidy...
Hello everyone, I am interested in content extraction from HTML web pages. I currently use the HTML tags to divide a web page into blocks, and I use the tag-to-text ratio and anchor-text-to-text ratio...