What is the most efficient way to categorize an unseen website?

05 May 2012

I am building a system that automatically determines the category of any website, given only its URL as input. This is a classic classification problem, and I have two questions:

1. What input data should I use for classification? Which part of a website is most informative about its category: the home page content, the meta keywords, the domain name, or a mixture of these?

2. Which classification algorithm should I use so that the processing time for a new website is minimal? I am considering Bayesian filters (one per category), but that is not the most computationally efficient option, since each website would be evaluated once for every category. Other options I am considering are neural networks, or perhaps an SVM.
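To make both questions concrete, here is a minimal sketch of one possible setup, not a tested solution. It assumes the page HTML has already been downloaded, mixes domain-name tokens, the title, meta keywords/description, and visible text into a single bag of words, and scores every category with one multinomial naive Bayes model rather than a separate filter per category. The names `PageText`, `tokens`, and `NaiveBayes`, the three-letter token rule, and the toy training pages are all illustrative assumptions.

```python
import math
import re
from collections import Counter, defaultdict
from html.parser import HTMLParser
from urllib.parse import urlparse


class PageText(HTMLParser):
    """Collects meta keywords/description and visible text (title included)."""

    def __init__(self):
        super().__init__()
        self.parts = []
        self._skip = 0  # depth inside <script>/<style>

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip += 1
        elif tag == "meta":
            a = dict(attrs)
            if (a.get("name") or "").lower() in ("keywords", "description"):
                self.parts.append(a.get("content") or "")

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip:
            self._skip -= 1

    def handle_data(self, data):
        if not self._skip:
            self.parts.append(data)


def tokens(url, html):
    """Mix domain-name tokens and page tokens into one bag of words."""
    parser = PageText()
    parser.feed(html)
    text = urlparse(url).netloc + " " + " ".join(parser.parts)
    # Assumption: lowercase alphabetic runs of 3+ letters are good enough tokens.
    return re.findall(r"[a-z]{3,}", text.lower())


class NaiveBayes:
    """Multinomial naive Bayes with add-one (Laplace) smoothing."""

    def __init__(self):
        self.word_counts = defaultdict(Counter)  # category -> token counts
        self.doc_counts = Counter()              # category -> training pages
        self.vocab = set()

    def train(self, category, toks):
        self.word_counts[category].update(toks)
        self.doc_counts[category] += 1
        self.vocab.update(toks)

    def classify(self, toks):
        total_docs = sum(self.doc_counts.values())
        scores = {}
        for cat, n_docs in self.doc_counts.items():
            counts = self.word_counts[cat]
            denom = sum(counts.values()) + len(self.vocab)
            score = math.log(n_docs / total_docs)  # log prior
            for t in toks:
                score += math.log((counts[t] + 1) / denom)
            scores[cat] = score
        return max(scores, key=scores.get)


# Toy training data (hypothetical pages, for illustration only).
nb = NaiveBayes()
nb.train("travel", tokens("http://flights.example",
                          "<title>Cheap flights and hotels</title>"))
nb.train("sports", tokens("http://football.example",
                          "<title>Football scores and league tables</title>"))

page = ("<html><head><meta name='keywords' content='hotels, flights'></head>"
        "<body>Book hotels</body></html>")
print(nb.classify(tokens("http://deals.example", page)))  # prints: travel
```

In this sketch the page is parsed and tokenized only once; the per-category work is a sum of log-probabilities over the page's tokens, which is usually small next to the cost of downloading the page. Which inputs are actually most informative would have to be checked on labeled data, for example by training on each token source separately and comparing accuracy.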

Any suggestions are welcome!

Paul Nichol

Hi Steve, I have several tools that I use for web analytics; they evaluate various factors across a website. Give me a bit more information and I will send you a link. What do you want to investigate?

Speed, age, backlinks, traffic (people visiting the site), value? The more information you provide, the better the tool I can suggest. Most of them are free. (My website and contact details: http://www.apexbusinesssupport.com/)
